Zhipu AI (Z.ai)

GLM-4.6V Pricing & Cost Breakdown

Open-source multimodal vision-language model with native function calling, state-of-the-art visual understanding and reasoning at its scale, long-context multimodal processing, and support for interleaved image-text generation and agentic workflows.

Pricing

Input Price: $0.300 per 1M tokens
Output Price: $0.900 per 1M tokens

Specifications

Context Length: 128K tokens
Status: active
Supported Modalities: Text, Image, Video
Features: Streaming, Function Calling, Structured Output, Native Multimodal Tool Use, Long Context Visual Reasoning

Source & Verification

Last Verified: February 13, 2026

Pricing Methodology

Prices are listed in USD, pre-tax. Text model prices are per 1 million tokens (input and output). Image model prices are per generated image. Video model prices are per second of generated video. Prices are sourced from official provider documentation and verified by our team. Special pricing (batch, cached, fine-tuned) may not be reflected.

Found an Error?

If you notice incorrect or outdated pricing, please let us know. We typically verify and update within 24 hours.

Cost Calculator

Input: 750,000 words (about 1M tokens)
Output: 375,000 words (about 500K tokens)
Input Cost: $0.3000
Output Cost: $0.4500
Total Cost: $0.7500
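
The calculator figures above can be reproduced with a short sketch. It uses the listed per-token prices and assumes roughly 0.75 words per token. That ratio is not stated on this page; it is inferred from the calculator's own numbers (750,000 words costing $0.30 of input implies 1M tokens), and real token counts vary with language and content. The function and constant names are illustrative only.

```python
from decimal import Decimal

# Listed prices in USD per 1M tokens (from the Pricing section).
INPUT_PRICE_PER_M = Decimal("0.300")
OUTPUT_PRICE_PER_M = Decimal("0.900")

# Assumption: about 0.75 words per token, inferred from the calculator figures.
WORDS_PER_TOKEN = Decimal("0.75")


def words_to_tokens(words: int) -> Decimal:
    """Estimate a token count from a word count using the assumed ratio."""
    return Decimal(words) / WORDS_PER_TOKEN


def estimate_cost(input_words: int, output_words: int):
    """Return (input_cost, output_cost, total_cost) in USD, pre-tax."""
    input_cost = words_to_tokens(input_words) / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = words_to_tokens(output_words) / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost, output_cost, input_cost + output_cost


input_cost, output_cost, total_cost = estimate_cost(750_000, 375_000)
print(f"Input Cost:  ${input_cost:.4f}")   # $0.3000
print(f"Output Cost: ${output_cost:.4f}")  # $0.4500
print(f"Total Cost:  ${total_cost:.4f}")   # $0.7500
```

Decimal is used instead of float so that the small per-token amounts sum without binary rounding noise. Special pricing (batch, cached, fine-tuned) is not modeled here.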