Topic

Inference

2 articles with this tag

Groq (inference platform)

Groq runs LLM inference on custom LPU chips at hundreds of tokens per second. Models, speed, and typical use cases at a glance.

LLM inference is the day-to-day running of a trained language model. How token costs add up, what drives speed, and which providers matter.