OpenAI & other LLM API Pricing Calculator

Calculate the cost of using OpenAI and other Large Language Model (LLM) APIs

Calculator inputs: input tokens, output tokens, and API calls.

Provider	Model	Context	Input/1k Tokens	Output/1k Tokens	Per Call	Total
Chat/Completion
Kimi	Kimi K2	128k	$0.001	$0.003	$	$
Anthropic	Claude Opus 4	200k	$0.015	$0.075	$	$
Anthropic	Claude Sonnet 4	200k	$0.003	$0.015	$	$
OpenAI	o3-pro	200k	$0.002	$0.008	$	$
OpenAI	o3	200k	$0.0002	$0.0008	$	$
OpenAI	o4-mini	200k	$0.0004	$0.0001	$	$
OpenAI	GPT-4.1	1M	$0.002	$0.008	$	$
OpenAI	GPT-4.1 mini	1M	$0.0004	$0.0016	$	$
OpenAI	GPT-4.1 nano	1M	$0.0001	$0.0004	$	$
xAI	Grok 3	1M	$0.003	$0.015	$	$
xAI	Grok 2	200k	$0.002	$0.01	$	$
Meta	Llama 4 Maverick	1M	$0.00019	$0.00049	$	$
Meta	Llama 4 Scout	10M	$0.00017	$0.00017	$	$
Meta	Llama 3.2 90b	128k	$0.00204	$0.00204	$	$
Meta	Llama 3.3 70b	200k	$0.00071	$0.00071	$	$
Google	Gemini 2.5 Flash Lite	200k	$0.0005	$0.0004	$	$
Google	Gemini 2.5 Pro	200k	$0.00125	$0.01	$	$
Google	Gemini 2.0 Flash	200k	$0.0001	$0.0007	$	$
Anthropic	Claude 3.7 Sonnet	200k	$0.003	$0.015	$	$
OpenAI	o3-mini	128k	$0.001	$0.004	$	$
OpenAI	o1-preview	128k	$0.015	$0.06	$	$
OpenAI	o1	200k	$0.015	$0.06	$	$
OpenAI	o1-mini	128k	$0.001	$0.004	$	$
OpenAI	GPT-4	8k	$0.03	$0.06	$	$
OpenAI	GPT-4	32k	$0.06	$0.12	$	$
OpenAI	GPT-4 Turbo	128k	$0.01	$0.03	$	$
OpenAI	GPT-4o (omni)	128k	$0.005	$0.015	$	$
OpenAI	GPT-4o mini	128k	$0.00015	$0.0006	$	$
OpenAI	GPT-4o Realtime	128k	$0.002	$0.01	$	$
OpenAI	GPT-4o mini Realtime	128k	$0.0006	$0.0024	$	$
Anthropic	Claude 3.5 Haiku	200k	$0.0008	$0.004	$	$
DeepSeek	DeepSeek V3	128k	$0		$	$
OpenAI	GPT-3.5 Turbo	16k	$0.0015	$0.002	$	$
OpenAI	GPT-3.5 Turbo	4k	$0.0015	$0.002	$	$
Anthropic	Claude 3 Haiku	200k	$0.00025	$0.00125	$	$
Anthropic	Claude 3 Sonnet	200k	$0.003	$0.015	$	$
Cohere	Command-R	128k	$0.0005	$0.0015	$	$
DeepSeek	DeepSeek-R1	64k	$0.00055	$0.00219	$	$
Google	Gemini 2.0 Flash	1M	$0		$	$
Google	Gemini 1.5 Pro	2M	$0.0012	$0.005	$	$
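
The Per Call and Total columns can be approximated with a short calculation. The sketch below assumes that the token counts are entered per call and that the total is simply the per-call cost multiplied by the number of API calls; the function name `estimate_cost` is hypothetical and is not taken from the calculator itself.

```python
def estimate_cost(input_tokens, output_tokens, api_calls, input_per_1k, output_per_1k):
    """Return (per_call, total) in USD.

    Assumption: token counts are per call, and prices are quoted per 1k tokens
    as in the table above.
    """
    per_call = (input_tokens / 1000) * input_per_1k + (output_tokens / 1000) * output_per_1k
    total = per_call * api_calls
    return per_call, total


# Prices from the GPT-4o mini row: $0.00015 input and $0.0006 output per 1k tokens.
per_call, total = estimate_cost(
    input_tokens=1000,
    output_tokens=500,
    api_calls=100,
    input_per_1k=0.00015,
    output_per_1k=0.0006,
)
print(f"per call: ${per_call:.6f}, total: ${total:.4f}")
```

Swapping in the prices from any other row gives the estimate for that model.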

Frequently Asked Questions

What is an LLM?

An LLM (Large Language Model) is an AI system trained on vast amounts of textual data using deep learning and transformer architectures to understand and generate human-like language. It can perform various language tasks like translation, summarization, and text generation.

What is an embedding?

An embedding is a numerical representation of words or phrases as a vector in a continuous vector space. This allows the language model to capture and process the semantic meaning of the words. Embeddings are used in various applications such as search engines, recommendations, and natural language understanding tasks.

How can I calculate embedding pricing?

To calculate embedding pricing, first determine the number of tokens, characters, or words you expect to process using embeddings each month. Enter this figure into the pricing calculator. The calculator will provide an estimated monthly cost based on the provider's pricing structure.
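
As a rough illustration of that estimate, the sketch below converts a monthly volume given in tokens, characters, or words into an approximate token count and multiplies it by a per-1k price. The conversion ratios (about 4 characters or 0.75 words per token) are common rules of thumb for English text, not exact values, and the price used is a hypothetical placeholder rather than any provider's actual rate.

```python
import math

# Assumed rule-of-thumb ratios for English text: units per token.
UNITS_PER_TOKEN = {"tokens": 1.0, "characters": 4.0, "words": 0.75}


def to_tokens(amount, unit="tokens"):
    """Convert a monthly volume to an approximate token count."""
    return math.ceil(amount / UNITS_PER_TOKEN[unit])


def monthly_embedding_cost(amount, unit, price_per_1k_tokens):
    """Estimated monthly cost in USD for embedding the given volume."""
    return to_tokens(amount, unit) / 1000 * price_per_1k_tokens


HYPOTHETICAL_PRICE_PER_1K = 0.0001  # placeholder, not a real provider price

cost = monthly_embedding_cost(3_000_000, "words", HYPOTHETICAL_PRICE_PER_1K)
print(f"estimated monthly cost: ${cost:.2f}")
```

For a real estimate, replace the placeholder price with the provider's published embedding rate and, where possible, count tokens with the provider's own tokenizer instead of the heuristic.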

What is a token?

A token is a unit of text that the language model processes. It can be as short as one character or as long as one word, depending on the context and language. In most LLMs, text is broken down into tokens before processing, and pricing is based on the number of tokens used.
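
To make the idea concrete, here is a toy splitter that treats each word and each punctuation mark as one token. It is only an illustration: real LLM tokenizers use subword schemes and will split the same text differently, so counts from this sketch should not be used for billing. The function name `toy_tokenize` is made up for this example.

```python
import re


def toy_tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = toy_tokenize("Hello, world!")
print(tokens, len(tokens))
```

Here "Hello" and "world" are word-length tokens, while the comma and exclamation mark are single-character tokens.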

What is the LLM API Pricing Calculator?

The LLM API Pricing Calculator is a tool designed to help users estimate the cost of using various Large Language Model APIs, embeddings, and fine-tuning based on their specific usage needs.

Which Language Models does the calculator support?

The calculator supports models from multiple providers, including OpenAI, Google, Anthropic, Mistral, and Amazon (Amazon Bedrock). The list is regularly updated to include new models as they become available.

How accurate is the pricing estimate provided by the calculator?

The pricing estimates are based on the latest publicly available pricing information from the API providers. While the estimates are intended to be as accurate as possible, actual costs may vary depending on your specific usage patterns and any additional fees imposed by the providers.

Can I compare the costs of different Language Models using this calculator?

Yes, you can compare the costs of different Language Models. The calculator will provide a combined comparison of the estimated costs.
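
A comparison of this kind can be sketched as follows. The prices are copied from four rows of the table above; the function name `compare` and the choice of sorting from cheapest to most expensive are assumptions made for illustration, not a description of how the calculator is implemented.

```python
# (provider, model, input $/1k tokens, output $/1k tokens), from the table above
MODELS = [
    ("Anthropic", "Claude Sonnet 4", 0.003, 0.015),
    ("OpenAI", "GPT-4.1", 0.002, 0.008),
    ("OpenAI", "GPT-4o mini", 0.00015, 0.0006),
    ("Google", "Gemini 2.5 Pro", 0.00125, 0.01),
]


def compare(input_tokens, output_tokens, models=MODELS):
    """Return (model, cost in USD) pairs sorted from cheapest to most expensive."""
    rows = []
    for _provider, model, in_price, out_price in models:
        cost = input_tokens / 1000 * in_price + output_tokens / 1000 * out_price
        rows.append((model, cost))
    return sorted(rows, key=lambda row: row[1])


for model, cost in compare(input_tokens=10_000, output_tokens=2_000):
    print(f"{model:<16} ${cost:.4f}")
```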

How can I reduce my API usage costs?

To reduce costs, consider optimising your usage by reducing the number of tokens processed, choosing a less expensive model, or taking advantage of any discounts or usage tiers offered by the providers.

How do I choose the right LLM for my needs?

Choosing the right Large Language Model (LLM) depends on various factors including the complexity of your tasks, required performance, and budget.

Here are some examples of proprietary LLMs and their strengths to help you make an informed decision:

Claude 3.5 Sonnet: A fast and cost-effective model that balances power and affordability, Claude 3.5 Sonnet is suitable for a wide range of use cases.
GPT-4o: Known for its speed and power, GPT-4o is an excellent choice for applications requiring quick responses and robust performance.
GPT-4: A highly capable OpenAI model, GPT-4 offers strong capabilities in understanding and generating human-like text, making it well suited to complex and high-stakes applications.

What is an Input Token and Output Token? How does this affect the overall pricing?

Input Tokens: Units of text broken down from the input prompt or query. These are fed into the model for processing.
Output Tokens: Units of text generated by the model in response to the input. The model predicts these tokens one at a time to form the complete output.

Both input and output tokens count toward the cost of using a language model. Most providers price them separately, and output tokens usually cost more per 1k than input tokens, so the overall price depends on both counts and their respective rates. Higher token usage typically results in higher costs.

📈 Example: If you process 100,000 input tokens and generate 50,000 output tokens, you will be billed for 150,000 tokens in total: 100,000 at the input rate and 50,000 at the output rate.
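
To show how the split between input and output affects the bill, the sketch below prices the example above using the GPT-4o row of the table ($0.005 input and $0.015 output per 1k tokens), then reverses the split while keeping the same 150,000-token total. The helper name `token_cost` is hypothetical.

```python
def token_cost(input_tokens, output_tokens, input_per_1k, output_per_1k):
    """Cost in USD when input and output tokens are priced separately."""
    return input_tokens / 1000 * input_per_1k + output_tokens / 1000 * output_per_1k


GPT_4O_INPUT, GPT_4O_OUTPUT = 0.005, 0.015  # $/1k tokens, from the table above

# 100,000 input + 50,000 output tokens (the example above)
as_given = token_cost(100_000, 50_000, GPT_4O_INPUT, GPT_4O_OUTPUT)

# Same 150,000 tokens in total, but with the split reversed
reversed_split = token_cost(50_000, 100_000, GPT_4O_INPUT, GPT_4O_OUTPUT)

print(f"100k in / 50k out: ${as_given:.2f}")
print(f"50k in / 100k out: ${reversed_split:.2f}")
```

At these rates the first split comes to about $1.25 and the reversed split to about $1.75, even though the total token count is identical.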

What are AI credits and how are they different from API pricing?

Credits are consumed based on the cost of the model being used. Different models consume different amounts of credits. For example, GPT-3.5 takes 1x credit, GPT-4o takes 5x credits, GPT-4 Turbo takes 10x credits, while GPT-4 takes 20x credits. API pricing, on the other hand, is typically based on metrics like the number of tokens or characters processed. For a detailed cost estimation using AI credits, visit our Chatbot AI Credits Calculator.
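
The multipliers above can be turned into a simple credit estimate. This sketch assumes one base credit per message, scaled by the model's multiplier; the source gives only the multipliers, so the base unit and the function name `credits_used` are assumptions.

```python
# Credit multipliers as stated above
CREDIT_MULTIPLIER = {
    "GPT-3.5": 1,
    "GPT-4o": 5,
    "GPT-4 Turbo": 10,
    "GPT-4": 20,
}


def credits_used(messages_by_model):
    """Total credits, assuming one base credit per message times the multiplier."""
    return sum(CREDIT_MULTIPLIER[model] * count for model, count in messages_by_model.items())


usage = {"GPT-3.5": 100, "GPT-4o": 20, "GPT-4": 5}
print(credits_used(usage))
```

Under this assumption, 100 GPT-3.5 messages, 20 GPT-4o messages, and 5 GPT-4 messages each consume 100 credits, for 300 in total.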