Ant Group NextEvo fully open-sources its AI Infra technology


On February 1, NextEvo, the AI innovation research and development department of Ant Group, fully open-sourced its AI Infra technology. According to the announcement, the technology can raise the share of effective training time in thousand-GPU large-model training to more than 95% and lets training run on "autopilot", improving the efficiency of AI research and development. The framework, called DLRover, aims to make large-scale distributed training intelligent. The latest addition to DLRover is the Flash Checkpoint (FCP) scheme. During model training, checkpoints generally have to be saved so that an interrupted job can be restored to its most recent state. Conventional checkpointing is slow: saving checkpoints frequently cuts into the time available for training, while saving them infrequently means too much progress is lost on recovery. In measurements on thousand-GPU training of a model with on the order of a hundred billion parameters, the training time wasted on checkpointing fell by a factor of about 5, the persistence time fell by a factor of about 70, and the effective training time rose from 90% to 95%.
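The checkpoint frequency trade-off described above is commonly eased by blocking training only for a fast in-memory copy and writing to storage in the background. The following is a minimal sketch of that general idea using only the Python standard library. It is not DLRover's API and not necessarily how FCP is implemented; the class `AsyncCheckpointer`, the function `run_demo`, and the file name `checkpoint.pkl` are hypothetical names chosen for illustration.

```python
import copy
import os
import pickle
import tempfile
import threading


class AsyncCheckpointer:
    """Hypothetical helper: snapshot state in memory, persist in the background."""

    def __init__(self, directory):
        self.directory = directory
        self._worker = None

    def save(self, step, state):
        # Training blocks only for this in-memory copy; the caller may
        # keep mutating `state` as soon as save() returns.
        snapshot = copy.deepcopy(state)
        self.wait()  # allow at most one write in flight
        self._worker = threading.Thread(
            target=self._persist, args=(step, snapshot)
        )
        self._worker.start()

    def _persist(self, step, snapshot):
        final_path = os.path.join(self.directory, "checkpoint.pkl")
        tmp_path = final_path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump({"step": step, "state": snapshot}, f)
        # Rename only after the write completes, so a crash mid-write
        # never leaves a truncated file under the final name.
        os.replace(tmp_path, final_path)

    def wait(self):
        # Block until any in-flight background write has finished.
        if self._worker is not None:
            self._worker.join()
            self._worker = None

    def load(self):
        path = os.path.join(self.directory, "checkpoint.pkl")
        with open(path, "rb") as f:
            return pickle.load(f)


def run_demo():
    """Toy 'training' loop: update state each step, checkpoint every 3 steps."""
    with tempfile.TemporaryDirectory() as directory:
        checkpointer = AsyncCheckpointer(directory)
        state = {"weights": [0.0, 0.0]}
        for step in range(1, 7):
            state["weights"][0] += 1.0  # stand-in for a training step
            if step % 3 == 0:
                checkpointer.save(step, state)
        checkpointer.wait()
        return checkpointer.load()


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the cost seen by the training loop is the `deepcopy`, not the disk write, so checkpoints can be taken more often without stalling training for the full persistence time. A real distributed job would also need to coordinate shards across workers, which is omitted here.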


Source: DIYuan

