Headline: “First Open-Source Model to Surpass GPT-4: Llama 3.1 Hits 405 Billion Parameters and Matches Proprietary Rivals”
Keywords: Open-source model, Llama 3.1, benchmark performance
In a development that has drawn wide attention in the AI field, Llama 3.1, the first open-source model to reach GPT-4-level performance, has leaked ahead of its official release. The event marks a new milestone for AI technology and has reignited the debate over how open-source models compare with their proprietary counterparts.
The Llama 3.1 release circulating on Reddit includes not only the base model but also benchmark results for the 8B, 70B, and largest 405B variants. Even the 70B version reportedly outperforms OpenAI’s GPT-4 on several benchmarks. This marks the first time an open-source model has surpassed proprietary models such as GPT-4 and Claude 3.5 Sonnet across a range of performance metrics, putting it at the current state of the art.
Llama 3.1’s performance gains stem from its sharply increased scale: the largest variant has 405 billion parameters. The model was pre-trained on more than 15 trillion tokens of publicly available data, with a pre-training data cutoff of December 2023. The fine-tuning stage combined public instruction-tuning datasets with 15 million synthetic examples, which helps the model excel at multilingual dialogue tasks.
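To put the 405-billion-parameter figure in perspective, here is a back-of-the-envelope calculation of the memory needed just to hold the weights at common numeric precisions. The precisions and the serving framing are our own illustration; the article itself does not discuss deployment.

```python
# Rough weight-memory estimate per Llama 3.1 variant (weights only;
# the KV cache and activations add further overhead at inference time).
SIZES = {"8B": 8e9, "70B": 70e9, "405B": 405e9}
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}

for name, params in SIZES.items():
    row = ", ".join(
        f"{fmt}: ~{params * b / 1e9:,.0f} GB"
        for fmt, b in BYTES_PER_PARAM.items()
    )
    print(f"{name}: {row}")
# 405B: bf16: ~810 GB, int8: ~405 GB, int4: ~203 GB
```

Even aggressively quantized, the 405B model remains a multi-GPU deployment, which is one reason the smaller 8B and 70B variants matter for practitioners.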
Llama 3.1 officially supports eight languages: English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai, which makes cross-lingual applications straightforward. Architecturally, it is an autoregressive language model built on an optimized Transformer, aligned for helpfulness and safety through SFT (supervised fine-tuning) and RLHF (reinforcement learning from human feedback).
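For readers who want to try the instruction-tuned checkpoints once they are officially published, a minimal inference sketch with the Hugging Face `transformers` pipeline might look like the following. The model ID is the expected gated repo name, an assumption on our part rather than something confirmed by the article.

```python
import torch
from transformers import pipeline

# Assumed checkpoint name on the Hugging Face Hub (a gated repo that
# requires accepting Meta's license); swap in the 70B/405B IDs as needed.
MODEL_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"

generator = pipeline(
    "text-generation",
    model=MODEL_ID,
    torch_dtype=torch.bfloat16,  # halves memory versus fp32
    device_map="auto",           # spread layers across available GPUs
)

# A multilingual chat turn: system prompt in English, user prompt in French.
messages = [
    {"role": "system", "content": "You are a concise multilingual assistant."},
    {"role": "user", "content": "Explique en une phrase ce qu'est un modèle autorégressif."},
]

result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```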
Beyond raw performance, the Llama 3.1 community license gives developers added flexibility, including room to build innovative multilingual applications. Although the model was trained on data in many languages, developers who fine-tune it for languages outside the officially supported set must still follow the license’s terms and usage policies to ensure safe and responsible use.
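One common, low-cost way to adapt the model to an additional language is parameter-efficient fine-tuning with LoRA adapters. The sketch below uses the `peft` library against an assumed base-model repo ID; the article does not prescribe any particular fine-tuning method, and the license obligations noted above still apply to any derivative.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical setup: wrap the 8B base model with LoRA adapters so that
# only a small set of added weights is trained on new-language data.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",  # assumed gated repo ID
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_cfg = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attach to attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # a fraction of a percent of the 8B total
# Then train on instruction data for the target language, for example with
# a standard `transformers` Trainer loop or the `trl` SFTTrainer.
```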
The episode not only showcases the potential of open-source models in AI but also suggests a direction for future development. Following the leak, many developers and researchers are awaiting the official release so that the technology can be applied more broadly. As AI continues to advance, there is good reason to expect more innovations with far-reaching impact on society.
Source: https://www.jiqizhixin.com/articles/2024-07-23-6