Llama 3.1 405B is the first openly available model that rivals the top AI models in state-of-the-art capabilities across general knowledge, steerability, math, tool use, and multilingual translation. With the release of the 405B model, Meta aims to supercharge innovation, opening new opportunities for growth and exploration. Meta believes this latest generation of Llama will ignite new applications and modeling paradigms, including synthetic data generation to improve and train smaller models, as well as model distillation at a scale never before achieved in open source.
It is free to use, and the weights are openly available for download under Meta's Llama community license.
As part of this latest release, Meta is introducing upgraded versions of the 8B and 70B models. These are multilingual and have a significantly longer context length of 128K, state-of-the-art tool use, and overall stronger reasoning capabilities. This enables the latest models to support advanced use cases, such as long-form text summarization, multilingual conversational agents, and coding assistants. Meta also made changes to the license, allowing developers to use the outputs from Llama models, including the 405B, to improve other models. True to their commitment to open source, starting today, they are making these models available to the community for download on llama.meta.com and Hugging Face, and available for immediate development on a broad ecosystem of partner platforms.
The experimental evaluation suggests that the flagship 405B model is competitive with leading foundation models, including GPT-4, GPT-4o, and Claude 3.5 Sonnet, across a range of tasks.
In post-training, they produce final chat models by doing several rounds of alignment on top of the pre-trained model. Each round involves Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO). They use synthetic data generation to produce the vast majority of their SFT examples, iterating multiple times to produce higher and higher quality synthetic data across all capabilities. Additionally, they invest in multiple data processing techniques to filter this synthetic data to the highest quality. This enables Meta to scale the amount of fine-tuning data across capabilities.
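To make the DPO step concrete, here is a minimal sketch of the standard Direct Preference Optimization loss for a single preference pair. The function and its inputs are illustrative, not Meta's implementation: it assumes you already have summed log-probabilities of the chosen and rejected responses under both the policy being trained and a frozen reference model (typically the SFT checkpoint).

```python
import math

def dpo_loss(chosen_logp, rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the trained policy and the frozen reference model.
    """
    # Implicit rewards: beta-scaled log-ratio of policy to reference
    chosen_reward = beta * (chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (rejected_logp - ref_rejected_logp)
    # Loss is the negative log-sigmoid of the reward margin
    margin = chosen_reward - rejected_reward
    return math.log(1.0 + math.exp(-margin))

# The loss shrinks as the policy prefers the chosen response more
# strongly than the reference model does.
easy = dpo_loss(-5.0, -20.0, -10.0, -10.0)   # policy strongly prefers chosen
hard = dpo_loss(-20.0, -5.0, -10.0, -10.0)   # policy prefers rejected
assert easy < hard
```

In practice this is computed batched over token-level log-probabilities in a training framework; the scalar version above just shows the shape of the objective.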
Meta trained Llama 3.1 405B on over 15 trillion tokens. They significantly optimized the full training stack and pushed model training to over 16,000 H100 GPUs, making the 405B the first Llama model trained at this scale.
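A back-of-envelope calculation puts that scale in perspective, using the common ~6·N·D FLOPs rule of thumb for dense transformer training. The sustained per-GPU throughput below is an assumption for illustration, not a figure Meta has published:

```python
# Rough training-compute estimate via the ~6 * params * tokens rule of thumb.
# The effective per-GPU throughput is an assumed figure, not from Meta.
params = 405e9           # model parameters
tokens = 15e12           # over 15 trillion training tokens, per Meta
gpus = 16_000            # H100 GPUs

total_flops = 6 * params * tokens            # roughly 3.6e25 FLOPs
assumed_flops_per_gpu = 400e12               # assumed sustained bf16 FLOP/s
gpu_seconds = total_flops / assumed_flops_per_gpu
days = gpu_seconds / gpus / 86_400           # roughly two months of wall clock

print(f"total compute: {total_flops:.2e} FLOPs")
print(f"wall clock on {gpus} GPUs: about {days:.0f} days")
```

Under these assumptions the run works out to tens of millions of H100-hours, which is consistent with the 405B being by far the largest open-weights training run to date.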
Building with Llama 3.1 405B
For the average developer, using a model at the scale of the 405B is challenging. While it’s an incredibly powerful model, Meta recognizes that it requires significant compute resources and expertise to work with.
Meta realizes there’s so much more to generative AI development than just prompting models. They want to enable everyone to get the most out of the 405B, including:
* Real-time and batch inference
* Supervised fine-tuning
* Evaluation of your model for your specific application
* Continual pre-training
* Retrieval-Augmented Generation (RAG)
* Function calling
* Synthetic data generation
This is where the Llama ecosystem can help. On day one, developers can take advantage of all the advanced capabilities of the 405B model and start building immediately. Developers can also explore advanced workflows like easy-to-use synthetic data generation, follow turnkey directions for model distillation, and enable seamless RAG with solutions from partners, including AWS, NVIDIA, and Databricks. Additionally, Groq has optimized low-latency inference for cloud deployments, with Dell achieving similar optimizations for on-prem systems.
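As a concrete starting point, most of the hosting partners expose an OpenAI-compatible chat-completions route, so a synthetic-data request can be sketched with nothing but the standard library. The URL and model id below are placeholders that vary by provider:

```python
# Sketch of a chat-completions request to a hosted Llama 3.1 405B endpoint.
# The URL, model id, and API key are placeholders; real values depend on
# the provider, most of which expose an OpenAI-compatible API.
import json
import urllib.request

def build_request(prompt: str,
                  model: str = "llama-3.1-405b-instruct",
                  url: str = "https://example-provider/v1/chat/completions",
                  api_key: str = "YOUR_KEY") -> urllib.request.Request:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.9,  # higher temperature diversifies synthetic data
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Write a math word problem with a worked solution.")
# urllib.request.urlopen(req) would send it (requires real credentials)
```

The same request shape covers the batch-inference and synthetic-data-generation workflows above; distillation then means fine-tuning a smaller model on the 405B's responses, which the updated license now explicitly permits.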
Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting-edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.
[ I’d like to raise a challenge:
What is the best question for our 21st-century societies, and those that follow, to ask an AGI/ASI, or at least a present-day LLM comparable to GPT-4/5?
What should be added to the United Nations development goals, and how can they be fulfilled safely? What are the next priorities for humanity? ]
[ FYI, for comparison, GPT-3 responded:
“From a statistical point of view and within sustained development of societies comparable to former centuries or cultures, some key priorities for humanity may include:
1. Preserving cultural heritage and traditions while embracing technological advancements.
2. Promoting sustainable economic growth that benefits all members of society.
3. Enhancing education and skills training to adapt to a rapidly changing world.
4. Addressing income inequality and ensuring equitable access to resources and opportunities.
5. Fostering social cohesion and community resilience in the face of global challenges.
6. Protecting the environment and natural resources for future generations.
By focusing on these priorities and leveraging statistical analysis to track progress and inform decision-making, societies can strive towards sustainable development that honors the achievements of past cultures while embracing the opportunities of the future.”
and GPT-4o:
“From a statistical and sustainable development perspective, the next priorities for humanity could include:
a) Climate Change Mitigation: Reducing greenhouse gas emissions, transitioning to renewable energy sources, and implementing sustainable practices to combat global warming.
b) Resource Management: Ensuring the sustainable use of natural resources, including water, minerals, and forests, to prevent depletion and maintain ecological balance.
c) Population Growth and Urbanization: Managing population growth and the rapid urbanization by creating sustainable cities with adequate infrastructure, housing, and services.
d) Health and Well-being: Improving global health through better access to healthcare, addressing pandemics, and tackling non-communicable diseases.
e) Education and Equality: Ensuring inclusive and equitable quality education and promoting lifelong learning opportunities for all, addressing gender and economic inequalities.
f) Technological Advancement: Promoting responsible innovation and ensuring that technological advancements benefit all of humanity while mitigating potential risks.
g) Economic Stability: Building resilient economies that can withstand shocks, reduce poverty, and ensure fair distribution of wealth.
h) Biodiversity Conservation: Protecting and restoring ecosystems and biodiversity to maintain the planet’s health and resilience.
These priorities align with the United Nations Sustainable Development Goals (SDGs) and reflect a comprehensive approach to fostering long-term, sustainable progress for societies worldwide.” ]
This model is so huge in memory that it will make relatively little difference for most users: it requires several Nvidia H100s, or better/equivalent GPUs from Intel or AMD.
It does make a difference for companies, though, which can purchase dedicated servers with that many expensive GPUs or lease them from a cloud provider.
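A back-of-envelope sketch makes the memory point concrete. These figures cover the weights only; the KV cache and activations add more on top:

```python
# Rough memory footprint of 405B parameters at common precisions.
# Weights only; KV cache and activations add further memory on top.
PARAMS = 405e9
H100_MEM_GB = 80  # HBM per 80 GB H100

for name, bytes_per_param in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gb = PARAMS * bytes_per_param / 1e9
    gpus = -(-gb // H100_MEM_GB)  # ceiling division
    print(f"{name}: ~{gb:.0f} GB of weights, needs >= {gpus:.0f} x 80GB H100s")
```

Even aggressively quantized to 4 bits, the weights alone exceed what any single consumer or workstation GPU holds, which is why the 405B is effectively a multi-GPU-server model.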
It is possible that an intermediate model like GPT-4.5 will be released.
Let’s not forget that the jump in abilities between GPT-3 and GPT-4 was not like the gap between Llama 2 and Llama 3, for example, but much bigger: on the order of 10 times or more in terms of intelligence and abilities.
That’s what I’m thinking, because there’s no way OpenAI will stay with GPT-4o for almost two years. GPT-4.5 in late 2024 or early 2025.
In the past, I felt a lot of negativity toward Mark Zuckerberg. The algorithms for Facebook and Instagram were horrible for kids and teens. Only TikTok is worse.
However, making this open source is very good. It reminds me of when Microsoft started opening up to developers and acquired GitHub. It totally rehabbed Microsoft’s image. Maybe this helps Mark with his image.
Two questions: (1) How large a team of developers is needed to truly take advantage of Llama 3.1? And (2) how soon will Llama become multimodal? Right now, it isn’t designed to handle images, audio, or video. That might be its biggest weakness.
It seems like everyone is catching up to OpenAI. But I know, once they launch GPT-5, they will pull ahead 6-12 months overnight.
I doubt it. GPT-5 will be out late 2025 or early 2026; OpenAI said so. In the meantime, Claude/Gemini/Grok will have easily surpassed GPT-4o. When GPT-5 comes out, it will only be slightly better than the best existing models, giving OpenAI only a few months of lead. I don’t know why they would wait a year after training completion; they would just hand the lead to competitors.
Where did OpenAI say this? Late 2024 or early 2025 is the announced release. I don’t think any major player has release projections that far out.
Mira Murati said a few weeks ago in an interview that it will be out in 1.5 to 2 years from now.