Jack Ma-backed Ant claims AI breakthrough built on Chinese chips



Jack Ma-backed Ant Group Co. used Chinese-made semiconductors to develop techniques for training AI models that would cut costs by 20%, according to people familiar with the matter.

Ant used domestic chips, including ones from affiliate Alibaba Group Holding Ltd. and Huawei Technologies Co., to train models using the so-called mixture-of-experts machine learning approach, the people said. It achieved results similar to those from Nvidia Corp. chips like the H800, they said, asking not to be named because the information isn't public.

Hangzhou-based Ant is still using Nvidia chips for AI development, but now relies mainly on alternatives, including chips from Advanced Micro Devices Inc. and Chinese suppliers, for its latest models, one of the people said.

The models mark Ant's entry into a race between Chinese and US companies that has accelerated since DeepSeek demonstrated how capable models can be trained for far less than the billions invested by OpenAI and Alphabet Inc. It also underscores how Chinese companies are trying to use local alternatives to the most advanced Nvidia semiconductors. While not the most advanced, the H800 is a relatively powerful processor and is currently barred by the US from export to China.

The company published a research paper this month claiming its models at times outperformed Meta Platforms Inc. on certain benchmarks, which Bloomberg News hasn't independently verified. But if they work as advertised, Ant's platforms could mark another step forward for Chinese artificial intelligence development by slashing the cost of inferencing or supporting AI services.

As companies pour significant money into AI, mixture-of-experts (MoE) models have emerged as a popular option, gaining recognition through their use by Google and the Hangzhou startup DeepSeek, among others. The technique divides tasks into smaller sets of data, much like having a team of specialists who each focus on a segment of a job, making the process more efficient. Ant declined to comment in an emailed statement.
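The routing idea described above can be illustrated in a few lines. This is a minimal, generic sketch of a mixture-of-experts layer, not Ant's implementation; the dimensions, the two-layer expert MLPs, and the top-k softmax router are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only; production MoE layers are vastly larger).
d_model, d_hidden, n_experts, top_k = 8, 16, 4, 2

# Each "expert" is a small two-layer MLP; the router is a linear layer
# that scores how well each expert matches a given token.
experts_w1 = rng.normal(size=(n_experts, d_model, d_hidden)) * 0.1
experts_w2 = rng.normal(size=(n_experts, d_hidden, d_model)) * 0.1
router_w = rng.normal(size=(d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route a single token vector to its top-k experts and mix their outputs."""
    scores = x @ router_w                      # one score per expert
    top = np.argsort(scores)[-top_k:]          # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                   # softmax over the chosen experts
    out = np.zeros_like(x)
    for w, e in zip(weights, top):
        h = np.maximum(x @ experts_w1[e], 0)   # ReLU MLP for expert e
        out += w * (h @ experts_w2[e])
    return out

token = rng.normal(size=d_model)
y = moe_forward(token)
print(y.shape)
```

The efficiency gain comes from sparsity: each token activates only `top_k` of the `n_experts` networks, so compute per token stays roughly constant even as total parameter count grows.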

Training MoE models, however, typically relies on high-performing chips like the graphics processing units Nvidia sells. The cost has to date been prohibitive for many small firms and has limited broader adoption. Ant has been working on ways to train LLMs more efficiently and eliminate that constraint. Its paper title makes that clear, as the company sets the goal of scaling a model "without premium GPUs."

That flies in the face of Nvidia's approach. Chief Executive Officer Jensen Huang has argued that computation demand will grow even with the advent of more efficient models like DeepSeek's R1, positing that companies will need better chips to generate more revenue, not cheaper ones to cut costs. He has stuck to a strategy of building big GPUs with more processing cores, transistors and increased memory capacity.

Ant said it costs about 6.35 million yuan ($880,000) to train 1 trillion tokens using high-performance hardware, but that its optimized approach would cut that to 5.1 million yuan by using lower-specification hardware. Tokens are the units of information a model ingests in order to learn about the world and deliver useful answers to user queries.
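The two figures Ant reported line up with the roughly 20% cost reduction cited earlier, as a quick check shows:

```python
# Figures from Ant's paper as reported: cost to train 1 trillion tokens.
baseline_yuan = 6_350_000   # high-performance (premium GPU) hardware
optimized_yuan = 5_100_000  # optimized approach on lower-spec hardware

savings = 1 - optimized_yuan / baseline_yuan
print(f"Cost reduction: {savings:.1%}")  # → Cost reduction: 19.7%
```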

The company plans to use the breakthrough behind its large language models Ling-Plus and Ling-Lite for industrial AI solutions it has developed in areas such as healthcare and finance, according to the people.

Ant bought the Chinese online platform Haodf.com this year to beef up its artificial intelligence services in healthcare. It has created an AI doctor assistant to support Haodf's 290,000 physicians with tasks such as managing medical records, the company said Monday in a separate statement.

The company also has an AI life assistant app called Zhixiaobao and a financial advisory AI service called Maxiaocai.

In English-language understanding, Ant said in its paper that the Ling-Lite model performed better on a key benchmark compared with one of Meta's Llama models. Both the Ling-Lite and Ling-Plus models outperformed DeepSeek's equivalents on Chinese-language benchmarks.

"If you find one point of attack to beat the world's best kung fu master, you can still say you beat them, which is why real-world application is important," said Robin Yu, chief technology officer of AI solution provider Shengshang Tech Co.

Ant has made the Ling models open source. Ling-Lite contains 16.8 billion parameters, which are the adjustable settings that work like knobs and dials to direct the model's performance. Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, experts estimate that ChatGPT's GPT-4.5 has 1.8 trillion parameters, according to MIT Technology Review. DeepSeek-R1 has 671 billion.

The company faced challenges in some areas of the training, including stability. Even small changes in the hardware or the model's structure led to problems, including jumps in the models' error rate.

Ant said Monday that its medically focused large language model is being used by seven hospitals and healthcare institutions in cities including Beijing and Shanghai. The model uses DeepSeek's R1, Alibaba's Qwen and Ant's own LLM, and can carry out medical consultations, it said.

The company also said it has introduced two medical AI agents: Angel, which has served more than 1,000 medical facilities, and Yibaoer, which supports health insurance services. It launched its AI Healthcare Manager service in Alipay, its payment app, last September.

This story was originally featured on Fortune.com


