Llama 3.1 405 Billion Parameter Model Released

Llama 3.1 405B is the first openly available model that rivals the top AI models in state-of-the-art capabilities across general knowledge, steerability, math, tool use, and multilingual translation. With the release of the 405B model, Meta supercharges innovation, with unprecedented opportunities for growth and exploration. They believe the latest generation of Llama will …

Read more

Looking at Hardware for Running Local Large Language Models

ChatRTX is a demo app that lets you personalize a GPT large language model (LLM) connected to your own content—docs, notes, images, or other data. Leveraging retrieval-augmented generation (RAG), TensorRT-LLM, and RTX acceleration, you can query a custom chatbot to quickly get contextually relevant answers. It all runs locally on your Windows RTX PC or …

Read more

China Developing 1.57-Exaflop Supercomputer With Chinese-Made CPU-GPU Chip

There are reports that China has a new superchip, the MT-3000 processor, designed by the National University of Defense Technology (NUDT). The MT-3000 features a multi-zone structure that packs 16 general-purpose CPU cores with 96 control cores and 1,536 matrix accelerator cores. The MT-3000 …

Read more

Tensors Are Critical for AI Processing, But What Are Tensors and TPUs?

Dan Fleisch briefly explains some vector and tensor concepts from A Student’s Guide to Vectors and Tensors. In the field of machine learning, tensors are used as representations for many kinds of data, such as images or videos. They form the basis of the TensorFlow machine learning framework. It is useful to understand tensors, TensorFlow, and TPU (Tensor …
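As a concrete illustration of the rank hierarchy described above, here is a minimal sketch using NumPy (chosen for brevity; TensorFlow's `tf.Tensor` exposes the same rank/shape ideas). The shapes are illustrative examples, not anything from the video:

```python
import numpy as np

scalar = np.array(5.0)                # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])    # rank 1: an ordered list of components
matrix = np.eye(3)                    # rank 2: e.g. a linear transformation
images = np.zeros((32, 64, 64, 3))    # rank 4: a batch of 32 RGB images, 64x64 pixels

for t in (scalar, vector, matrix, images):
    print(t.ndim, t.shape)
```

In machine-learning frameworks "rank" is simply the number of axes (`ndim`), which is why a video, a batch of images, or a weight matrix can all be handled by the same tensor machinery.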

Read more

Desktop GPU Simulates 24-Billion-Synapse Mammalian Brain Cortex

Dr James Knight and Prof Thomas Nowotny from the University of Sussex’s School of Engineering and Informatics used the latest Graphics Processing Units (GPUs) to give a single desktop PC the capacity to simulate brain models of almost unlimited size. This work will make large brain simulations accessible to researchers with tiny budgets. The research …

Read more

Eight Next-Generation Nvidia A100 Tensor Chips Deliver 5 Petaflops at $200,000

The Nvidia A100 chip was presented at the Hot Chips conference. Sander Olson provided Nextbigfuture with the presentation. The Nvidia A100 is a third-generation Tensor Core chip. It is faster and more efficient than competing chips such as the prior-generation Nvidia V100, the Google Tensor Processing Unit (TPU) version 3, and the Huawei Ascend. The Nvidia …

Read more

World’s Fastest Supercomputer Triples Performance to 445 PetaFLOPS

Using HPL-AI, a new approach to benchmarking AI supercomputers, Oak Ridge National Laboratory’s Summit supercomputer system reached 445 petaflops, or nearly half an exaflop. The system’s official Linpack performance is 148 petaflops, as announced in the latest TOP500 list of the world’s fastest supercomputers. On Summit, the world’s fastest supercomputer, NVIDIA ran HPL-AI computations in less than …
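The "triples performance" headline can be sanity-checked with simple arithmetic on the two figures quoted above:

```python
hpl_ai_pflops = 445    # mixed-precision HPL-AI result
linpack_pflops = 148   # official double-precision Linpack score

speedup = hpl_ai_pflops / linpack_pflops
print(f"{speedup:.1f}x")  # roughly 3x, hence "triples performance"
```

The gap comes from HPL-AI's use of lower-precision arithmetic (with iterative refinement back to full accuracy), which runs much faster on GPU tensor cores than classic double-precision Linpack.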

Read more

AMD Makes First 7-Nanometer CPU and GPU, Performance Competitive With Nvidia

Advanced Micro Devices launched its first 7-nm CPU and GPU aimed at the lucrative data center market. The working chips deliver performance comparable to Intel’s 14-nm Xeon and Nvidia’s 12-nm Volta. AMD is back from the near dead: it is competitive with Nvidia and Intel and will likely gain significant market share. A single 7-nm …

Read more

Nvidia DGX-2 Is a 2-Petaflop AI Supercomputer for $399,000

NVIDIA launched the NVIDIA DGX-2, the first single server capable of delivering two petaflops of computational power. A DGX-2 has the deep-learning processing power of 300 servers occupying 15 racks of datacenter space, while being 60x smaller and 18x more power efficient. The NVIDIA DGX-2 is a server rack with 16 Volta GPUs and dual Xeon Platinums …
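A quick back-of-envelope check on the per-GPU numbers (assuming the 2-petaflop figure refers to Tensor Core mixed-precision throughput, as Nvidia's DGX-2 marketing did):

```python
total_tflops = 2000   # 2 petaflops, Tensor Core mixed precision
num_gpus = 16

per_gpu = total_tflops / num_gpus
print(per_gpu)  # 125.0 TFLOPS per GPU
```

That 125 TFLOPS per GPU matches Nvidia's published Tensor Core peak for a single Volta V100, so the 2-petaflop headline is simply sixteen V100s at peak, with the NVSwitch fabric letting them act as one large accelerator.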

Read more