ascend-npu

1 article tagged with ascend-npu

April 24, 2026
model releaseDeepSeek

DeepSeek V4 cuts inference costs with 1.6T parameter model using 13.7x less memory than V3

DeepSeek released V4 in two versions: a 284 billion parameter Flash model and a 1.6 trillion parameter Pro model with 49 billion active parameters. According to DeepSeek, the models use 9.5x-13.7x less memory than V3 through compressed attention mechanisms and FP4/FP8 mixed precision, while supporting a 1 million token context window.