GLM 5.2 Fast via Wafer now available on AI Gateway

Vercel News · Rohan Taneja

GLM 5.2 Fast via Wafer is now available on AI Gateway.

Based on our own benchmarking across small-context, large-context, and tool-call scenarios, Wafer delivers 2x higher throughput than other providers serving GLM 5.2 on serverless. It leads on decode and end-to-end speed for sustained generation in both the small- and large-context cases.

In our testing, GLM 5.2 Fast on Wafer measured:

  • Small context: 170+ tok/s

  • Large context: 200+ tok/s

To use GLM 5.2 Fast, set model to zai/glm-5.2-fast in the AI SDK.

AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, Zero Data Retention support, budgets for API keys, and more.

AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on Bring Your Own Key (BYOK) requests.

Try GLM 5.2 Fast in the model playground.
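As a rough sketch of the model setting mentioned above: in recent AI SDK versions, a text-generation call takes the model as a plain string ID. The helper askGlmFast and the GenerateFn type below are hypothetical names introduced only for illustration. The generate function is injected so the sketch can run without the ai package, network access, or an API key.

```typescript
// Model ID as given in this announcement.
const MODEL_ID = "zai/glm-5.2-fast";

// Hypothetical minimal shape of a text-generation function. It is assumed
// to be compatible with the AI SDK's generateText, which accepts
// { model, prompt } and resolves to an object with a `text` field.
type GenerateFn = (opts: {
  model: string;
  prompt: string;
}) => Promise<{ text: string }>;

// Hypothetical helper: sends a prompt to GLM 5.2 Fast through whatever
// generate function is supplied and returns the generated text.
async function askGlmFast(
  generate: GenerateFn,
  prompt: string,
): Promise<string> {
  const { text } = await generate({ model: MODEL_ID, prompt });
  return text;
}
```

In an application you would likely pass the SDK's own function, for example importing generateText from the ai package and calling askGlmFast(generateText, "Summarize this changelog entry."), with AI Gateway credentials configured in the environment. That wiring is an assumption and has not been checked against a specific SDK version.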