Title: Apple Unveils Autoregressive Image Models (AIM): Scaling Up ViT Model Capacity
Keywords: Apple, Autoregressive Image Models, ViT
Apple researchers recently introduced Autoregressive Image Models (AIM), a collection of vision models pre-trained with an autoregressive objective. In their paper “Scalable Pre-training of Large Autoregressive Image Models,” they report that Vision Transformer (ViT) models trained this way show scaling properties in learning image representations similar to those of Large Language Models (LLMs).
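To make the idea of an autoregressive objective on image patches concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function names (`patchify`, `causal_self_attention`, `next_patch_loss`), the single attention layer, the random weights, and the choice of a mean-squared-error loss on raw pixel values are all assumptions made for illustration. The image is cut into patches in raster order, a causal mask stops each position from seeing later patches, and the model is scored on how well it predicts the next patch.

```python
import numpy as np


def patchify(image, patch):
    """Split an (H, W, C) image into a raster-ordered sequence of flat patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    rows, cols = h // patch, w // patch
    x = image.reshape(rows, patch, cols, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (rows, cols, patch, patch, c)
    return x.reshape(rows * cols, patch * patch * c)


def causal_mask(n):
    """Boolean lower-triangular mask: position i may attend to positions <= i."""
    return np.tril(np.ones((n, n), dtype=bool))


def causal_self_attention(x, wq, wk, wv):
    """Single-head self-attention in which no position can see later ones."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(causal_mask(len(x)), scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def next_patch_loss(patches, params):
    """Mean squared error between predicted and actual next patches."""
    wq, wk, wv, w_out = params
    hidden = causal_self_attention(patches[:-1], wq, wk, wv)
    pred = hidden @ w_out      # prediction for patch i+1 from patches 0..i
    target = patches[1:]
    return float(np.mean((pred - target) ** 2))


# Toy usage with random data and untrained (hypothetical) weights.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
patches = patchify(image, patch=4)  # 4 patches, each of dimension 4*4*3 = 48
d = patches.shape[1]
params = [rng.normal(scale=0.1, size=(d, d)) for _ in range(4)]
print("next-patch loss:", next_patch_loss(patches, params))
```

In a real training setup this loss would be minimized by gradient descent over many images, with a deep Transformer in place of the single attention layer used here.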
The researchers found that, with suitable model architectures and training strategies, AIM's capacity can be scaled to billions of parameters without difficulty. This lets AIM make effective use of large amounts of uncurated image data, which in turn improves the quality of the learned representations and the model's ability to generalize.
The release of AIM is reported to mark significant progress for Apple in computer vision. The work is expected to enable new applications in areas such as image recognition, image generation, and AI assistants.
Source: https://www.jiqizhixin.com/articles/2024-01-18-7