BlockBeats News, September 29th: the DeepSeek-V3.2-Exp model has been officially released and open-sourced. The model introduces a Sparse Attention architecture that reduces computational resource consumption and improves inference efficiency. The model is now deployed on Huawei Cloud's ModelArts MaaS platform. For DeepSeek-V3.2-Exp, Huawei Cloud continues to use its large-scale EP (expert parallelism) solution, adding a context-parallel strategy tailored to the Sparse Attention structure so that long sequences are served efficiently while balancing model latency and throughput. (Jin10)
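To see why sparse attention cuts compute, consider a generic top-k variant: each query attends to only its k highest-scoring keys instead of all L keys, shrinking the attention cost from O(L² · d) to roughly O(L · k · d). The sketch below is purely illustrative; DeepSeek's actual sparse-attention design and Huawei Cloud's deployment details are not described in this news item, and all function and variable names here are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(Q, K, V, k):
    """Illustrative top-k sparse attention: each query keeps only its
    k highest-scoring keys (not DeepSeek's actual mechanism)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])            # (L, L) raw scores
    # Per-row threshold = k-th largest score; mask the rest with -inf
    # so they receive zero attention weight after the softmax.
    kth = np.partition(scores, -k, axis=-1)[:, -k:].min(axis=-1, keepdims=True)
    masked = np.where(scores >= kth, scores, -np.inf)
    return softmax(masked, axis=-1) @ V                # (L, d) outputs

rng = np.random.default_rng(0)
L, d, k = 128, 16, 8                                   # seq length, head dim, key budget
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))
out = topk_sparse_attention(Q, K, V, k)
print(out.shape)  # (128, 16)
```

With k fixed (here 8) the per-query work no longer grows with the full sequence length, which is the property that makes long-context serving cheaper; production systems additionally need a fast way to select the candidate keys rather than computing all L² scores first, as this toy version does.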


