Apple Develops Autoregressive Vision Model AIM, Pushing the Boundaries of Image Recognition

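Apple researchers recently published a paper on the autoregressive vision model AIM, titled "Scalable Pre-training of Large Autoregressive Image Models". The paper proposes a new autoregressive image model and asks whether ViT models trained with an autoregressive objective can scale in representation learning the way LLMs do. The results indicate that model capacity can be scaled readily to billions of parameters, and that AIM makes effective use of large amounts of uncurated image data.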

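AIM is presented as an important advance by Apple researchers in deep learning. It takes the autoregressive objective used in natural language processing and applies it to image recognition. With this approach, the model shows scaling behavior in learning image representations comparable to that of large language models (LLMs). The researchers also found that AIM can be trained effectively on uncurated image data, which makes it more practical.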
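To make the idea of an autoregressive objective on images concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the article does not describe the loss or the patch ordering, so the raster-order patch sequence and the mean-squared-error loss are assumptions, and the names `patchify`, `autoregressive_loss`, and `copy_last_patch` are hypothetical. A trivial placeholder predictor stands in for the ViT.

```python
from typing import Callable, List, Sequence

Patch = List[float]


def patchify(image: Sequence[Sequence[float]], patch: int) -> List[Patch]:
    """Split a 2D grayscale image into flattened, non-overlapping patches.

    Patches are returned in raster order (left to right, top to bottom),
    which is an assumed ordering for this sketch.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    patches: List[Patch] = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return patches


def autoregressive_loss(
    patches: List[Patch],
    predict_next: Callable[[List[Patch]], Patch],
) -> float:
    """Mean squared error of predicting each patch from the patches before it."""
    total, count = 0.0, 0
    for k in range(1, len(patches)):
        # The predictor only sees patches 0..k-1, never the target itself.
        prediction = predict_next(patches[:k])
        target = patches[k]
        total += sum((p - t) ** 2 for p, t in zip(prediction, target))
        count += len(target)
    return total / count


def copy_last_patch(context: List[Patch]) -> Patch:
    """Placeholder predictor (hypothetical): repeat the most recent patch."""
    return list(context[-1])


if __name__ == "__main__":
    image = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]
    sequence = patchify(image, patch=2)
    print(len(sequence), "patches")
    print("loss:", autoregressive_loss(sequence, copy_last_patch))
```

In an actual training setup, `predict_next` would be a learned model and the loss would be minimized over many images. The structure mirrors next-token prediction in language models, with image patches taking the place of tokens.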

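In the paper, the Apple researchers describe AIM's training procedure and optimization strategy in detail. They report that the autoregressive objective allows AIM to be pre-trained efficiently on large amounts of image data, which improves its performance across a range of image recognition tasks. The result is significant for the field of artificial intelligence and may drive further advances in computer vision.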

Keywords: Apple, Autoregressive Visual Model, Image Recognition, Artificial Intelligence

Source: https://www.jiqizhixin.com/articles/2024-01-18-7