Groq (inference platform)
Groq runs LLM inference on custom LPU chips – at hundreds of tokens per second. Models, speed, and typical use cases at a glance.
3 articles with this tag
Groq runs LLM inference on custom LPU chips – at hundreds of tokens per second. Models, speed, and typical use cases at a glance.
LLM inference is the running of a trained language model in day-to-day operation. How token costs add up, what drives speed, and which providers matter.
Self-hosted AI means language models run on company hardware. Requirements, tools like Ollama and vLLM, benefits and limits at a glance.