Groq 30 Days to Starting With Large Customers

Groq said it will begin operating an AI inference cluster for large business customers within 30 days. Groq made the announcement in a presentation at the GenAI Summit 2024 in San Francisco.

They are currently processing 30,000 input inference tokens per second, and will assemble about 1,500 chips into an inference data center that will process 25 million inference tokens per second by the end of the year.
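As a quick sanity check on those cluster figures, the implied per-chip throughput can be worked out with simple arithmetic (the 1,500-chip and 25-million-tokens-per-second numbers come from the presentation; the per-chip breakdown below is my own back-of-envelope calculation, not a Groq specification):

```python
# Back-of-envelope: implied per-chip throughput of the planned Groq cluster.
# Inputs are the figures stated in the article; the division is an estimate only.
CHIPS = 1500
CLUSTER_TOKENS_PER_SEC = 25_000_000

per_chip = CLUSTER_TOKENS_PER_SEC / CHIPS
print(f"~{per_chip:,.0f} tokens/sec per chip")  # roughly 16,667
```

That works out to roughly 16,700 tokens per second per chip, assuming the load is spread evenly across the cluster.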

Groq uses fully synchronous on-chip SRAM memory. Nvidia uses HBM (high-bandwidth stacked memory).
Nvidia announced that its H200 chip will process 24,000 inferences per second.

Groq says that its ASIC chip processes inference with 3 to 10 times the energy efficiency of Nvidia's chips.

AI inference chips and AI models are both making huge leaps in progress. New stateless models will arrive soon.
