🟦 A move to fill the void left by NVIDIA: Alibaba pursues localization and cloud enhancement in parallel with inference-specialized AI chips

Company Analysis

Facing the risk of its dependence on NVIDIA under U.S. export restrictions, Alibaba aims to strengthen the competitiveness of its cloud business through localization and cost optimization with its own inference-specialized chips.

Source: reuters.com

🟦 A move to fill the void left by NVIDIA: Alibaba’s inference-specialized AI chip

Alibaba is developing a new chip for a variety of inference workloads, including LLMs and generative AI, and aims to increase supply-chain autonomy by switching to a domestic foundry. The strategy is to first prove the chip in large-scale operation within the company’s own cloud, then expand its adoption gradually.

  • Main content: Alibaba is testing an in-house AI chip optimized for inference; manufacturing has shifted from TSMC to a domestic foundry.
  • Technical features and benefits: Compatibility with high-level frameworks (PyTorch/TensorFlow) is emphasized to reduce the cost of migrating existing code. The chip aims to optimize performance, power, and latency for data centers.
  • Target markets and applications: Alibaba Cloud’s generative AI/API services, search, advertising, e-commerce recommendations, edge/on-premise inference.
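The framework-compatibility point above is the crux of low migration cost: if the chip exposes itself to PyTorch as just another device backend, existing model code needs only a device string change, not a rewrite. A minimal sketch of that idea (the `"npu"` device name is a hypothetical stand-in; PyTorch lets vendors register custom device types, but the actual backend name Alibaba's chip would use is not stated in the source):

```python
import torch

def run_inference(model: torch.nn.Module, inputs: torch.Tensor,
                  device_str: str = "cpu") -> torch.Tensor:
    # The device string is the only knob that changes per backend.
    # A vendor runtime (hypothetically registered as "npu") would make
    # torch.device("npu") valid without touching the model code below.
    device = torch.device(device_str)
    model = model.to(device).eval()
    with torch.no_grad():
        return model(inputs.to(device))

# Same call site works on CPU, CUDA, or a vendor backend:
model = torch.nn.Linear(4, 2)
out = run_inference(model, torch.randn(3, 4), device_str="cpu")
print(out.shape)  # torch.Size([3, 2])
```

Because the model definition never mentions the hardware, the migration cost the bullet refers to collapses to swapping the device string and installing the vendor's runtime plugin.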

🟦 “Regulation drives domestic production”: dual pressure from geopolitics and supply-demand

U.S. regulations make the H100 and Blackwell effectively unavailable in China, and the alternative H20 remains uncertain in both performance and policy terms. Domestic demand in China is expanding rapidly with generative AI, and since inference is a cost-sensitive workload, launching an in-house optimized chip is a rational move. Competitively, Huawei (Ascend) and other domestic chipmakers are running in parallel, and the domestic ecosystem has the advantage. In training, however, NVIDIA’s dominance is likely to continue thanks to the depth of the CUDA ecosystem, so Alibaba is expected to take the pragmatic hybrid approach of “in-house for inference, external for training”.

🟦 Summary

Alibaba’s inference-specialized chip is a pragmatic move aimed at establishing a domestic supply chain while reducing the cloud’s cost per unit of performance. The path runs from hybrid operation in the short term toward full independence in the long term, and complementarity and competition with other domestic players are likely to deepen the technology base.

For Alibaba, which runs a cloud business, fielding a chip specialized for inference rather than relying on NVIDIA’s general-purpose GPUs seems very reasonable. On the other hand, technological trends in AI change quickly, and it is hard to predict which models and algorithms will become mainstream.
