taalas

1 article tagged with taalas

February 20, 2026
product update

Taalas serves Llama 3.1 8B at 17,000 tokens/second with custom silicon

Taalas, a new Canadian hardware startup, announced its first product: a custom silicon implementation of Meta's Llama 3.1 8B model running at 17,000 tokens/second. The startup uses aggressive quantization combining 3-bit and 6-bit parameters. The system is accessible via chatjimmy.ai.