OpenAI & other LLM API Pricing Calculator

Calculate the cost of using OpenAI and other Large Language Model (LLM) APIs

Provider                  | Model                | Context | Input/1k Tokens | Output/1k Tokens
--------------------------|----------------------|---------|-----------------|-----------------
OpenAI                    | GPT-3.5 Turbo        | 16k     | $0.0005         | $0.0015
OpenAI                    | GPT-4 Turbo          | 128k    | $0.01           | $0.03
OpenAI                    | GPT-4o (omni)        | 128k    | $0.005          | $0.015
OpenAI                    | GPT-4o mini          | 128k    | $0.00015        | $0.0006
OpenAI                    | GPT-3.5 Turbo        | 4k      | $0.0015         | $0.002
OpenAI                    | Ada v2               | –       | $0.0001         | –
Amazon                    | Titan Text - Lite    | 4k      | $0.00015        | $0.0002
Amazon                    | Titan Text - Express | 8k      | $0.0002         | $0.0006
Anthropic                 | Claude Instant       | 100k    | $0.0008         | $0.0024
Anthropic                 | Claude 2.1           | 200k    | $0.008          | $0.024
Anthropic                 | Claude 3 Haiku       | 200k    | $0.00025        | $0.00125
Anthropic                 | Claude 3 Sonnet      | 200k    | $0.003          | $0.015
Anthropic                 | Claude 3 Opus        | 200k    | $0.015          | $0.075
Meta                      | Llama 2 70b          | 4k      | $0.001          | $0.001
Google                    | PaLM 2               | 8k      | $0.002          | $0.002
Google                    | PaLM 2               | –       | $0.0004         | –
Google                    | Gemini 1.5 Flash     | 1M      | $0.0007         | $0.0021
Google                    | Gemini 1.0 Pro       | 32k     | $0.0005         | $0.0015
Google                    | Gemini 1.5 Pro       | 1M      | $0.007          | $0.021
Amazon                    | Titan Embeddings     | –       | $0.0001         | –
Mistral AI (via Anyscale) | Mixtral 8x7B         | 32k     | $0.0007         | $0.0007
Mistral AI                | Mistral Small        | 32k     | $0.002          | $0.006
Mistral AI                | Mistral Large        | 32k     | $0.008          | $0.024
Mistral AI                | embed                | –       | $0.0001         | –

All prices are in USD per 1,000 tokens. Rows with a single price (such as Ada v2, Titan Embeddings, and the Mistral and PaLM 2 embedding models) bill input tokens only.

Frequently Asked Questions

What is an LLM?

An LLM (Large Language Model) is an AI system trained on vast amounts of textual data using deep learning and transformer architectures to understand and generate human-like language. It can perform various language tasks like translation, summarization, and text generation.

What is Embedding?

Embedding is a numerical representation of words or phrases in a continuous vector space. This allows the language model to understand and process the semantic meaning of the words. Embeddings are used in various applications such as search engines, recommendations, and natural language understanding tasks.

How can I calculate embedding pricing?

To calculate embedding pricing, determine the number of tokens, characters, or words you expect to process with embeddings each month and enter that figure into the pricing calculator. The calculator will provide an estimated monthly cost based on the provider's pricing structure.
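The arithmetic can be sketched in a few lines of Python. The $0.0001 per-1k-token rate below matches the Ada v2 row in the table above; the monthly volume is a made-up assumption for illustration:

```python
def embedding_cost(tokens: int, price_per_1k: float) -> float:
    """Estimate embedding cost: providers bill input tokens only."""
    return tokens / 1000 * price_per_1k

# Hypothetical workload: 5 million tokens per month at Ada v2's $0.0001/1k rate.
monthly = embedding_cost(5_000_000, 0.0001)
print(f"${monthly:.2f}")  # $0.50
```

The same function works for any single-rate embedding model; only the per-1k price changes.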

What is a token?

A token is a unit of text that the language model processes. It can be as short as one character or as long as one word, depending on the context and language. In most LLMs, text is broken down into tokens before processing, and pricing is based on the number of tokens used.
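For quick back-of-the-envelope estimates, a common rule of thumb is that one token is roughly four characters of English text. This sketch uses that heuristic; real tokenizers (such as OpenAI's tiktoken) give exact, model-specific counts:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token rule of
    thumb for English text. This is a heuristic, not an exact count."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("Large Language Models process text as tokens."))
```

Actual token counts vary by model and language, so treat the result as an estimate when budgeting.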

What is the LLM API Pricing Calculator?

The LLM API Pricing Calculator is a tool designed to help users estimate the cost of using various Large Language Model APIs, embeddings and Fine-tuning based on their specific usage needs.

Which Language Models does the calculator support?

The calculator supports models from multiple providers, including OpenAI, Google, Anthropic, Mistral AI, Meta, and Amazon (Bedrock). The list is regularly updated to include new models as they become available.

How accurate is the pricing estimate provided by the calculator?

The pricing estimates are based on the latest publicly available pricing information from the API providers. While the estimates are intended to be as accurate as possible, actual costs may vary depending on your specific usage patterns and any additional fees imposed by the providers.

Can I compare the costs of different Language Models using this calculator?

Yes, you can compare the costs of different Language Models. The calculator will provide a combined comparison of the estimated costs.

How can I reduce my API usage costs?

To reduce costs, consider optimising your usage by reducing the number of tokens processed, choosing a less expensive model, or taking advantage of any discounts or usage tiers offered by the providers.

How do I choose the right LLM for my needs?

Choosing the right Large Language Model (LLM) depends on various factors including the complexity of your tasks, required performance, and budget.

Here are some examples of proprietary LLMs and their strengths to help you make an informed decision:

  • Claude 3.5 Sonnet: A fast, capable, and cost-effective model that balances power and affordability, suitable for a wide range of use cases.
  • GPT-4o: Known for its speed and power, GPT-4o is an excellent choice for applications requiring quick responses and robust performance.
  • GPT-4: OpenAI's most advanced LLM, offering superior capabilities in understanding and generating human-like text, making it ideal for complex and high-stakes applications.

What is an Input Token and Output Token? How does this affect the overall pricing?

  • Input Tokens: Units of text broken down from the input prompt or query. These are fed into the model for processing.
  • Output Tokens: Units of text generated by the model in response to the input. The model predicts these tokens one at a time to form the complete output.

Both input and output tokens are used to calculate the cost of using a language model. The overall pricing depends on the total number of tokens processed. Higher token usage typically results in higher costs.

📈 Example: If you process 100,000 input tokens and generate 50,000 output tokens, you will be billed for 150,000 tokens.
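Because input and output tokens are billed at different rates, the dollar cost depends on the model. Worked through at GPT-3.5 Turbo 16k's rates from the table above ($0.0005 in / $0.0015 out per 1k tokens), the example looks like this:

```python
input_tokens, output_tokens = 100_000, 50_000
in_rate, out_rate = 0.0005, 0.0015  # GPT-3.5 Turbo 16k, USD per 1k tokens

total_tokens = input_tokens + output_tokens
bill = input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate
print(total_tokens, f"${bill:.4f}")  # 150000 $0.1250
```

Note that although output tokens are only a third of the total here, they contribute more than half of the bill because the output rate is 3x the input rate.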

What are AI credits and how are they different from API pricing?

Credits are consumed based on the cost of the model being used. Different models consume different amounts of credits. For example, GPT-3.5 takes 1x credit, GPT-4o takes 5x credits, GPT-4 Turbo takes 10x credits, while GPT-4 takes 20x credits. API pricing, on the other hand, is typically based on metrics like the number of tokens or characters processed. For a detailed cost estimation using AI credits, visit our Chatbot AI Credits Calculator.
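The credit multipliers quoted above could be applied like this (a sketch: the multiplier table comes from the text, while the one-credit-per-message base is an assumption for illustration):

```python
# Credit multipliers per model, as quoted in the text above.
CREDIT_MULTIPLIER = {
    "GPT-3.5": 1,
    "GPT-4o": 5,
    "GPT-4 Turbo": 10,
    "GPT-4": 20,
}

def credits_used(model: str, messages: int, base_credits: int = 1) -> int:
    """Credits consumed for a number of messages on a given model.
    base_credits per message is a hypothetical figure for illustration."""
    return messages * base_credits * CREDIT_MULTIPLIER[model]

print(credits_used("GPT-4", 10))  # 200 — 10 GPT-4 messages at 20x
```

The same ten messages would consume only 10 credits on GPT-3.5, which is why model choice dominates credit-based billing.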