ascend-npu
1 article tagged with ascend-npu
April 24, 2026
model release · DeepSeek
DeepSeek V4 cuts inference costs with 1.6T-parameter model using up to 13.7x less memory than V3
DeepSeek released V4 in two versions: a 284-billion-parameter Flash model and a 1.6-trillion-parameter Pro model with 49 billion active parameters. According to DeepSeek, the models use 9.5x to 13.7x less memory than V3 through compressed attention mechanisms and FP4/FP8 mixed precision, while supporting a 1-million-token context window.