AI Gateway API

Access Claude, DeepSeek, Llama, Qwen, Mistral and 100+ more AI models through a single OpenAI-compatible API. Automatic prompt caching, cost optimization, and unified billing.

Use this API from your AI agent via MCP

Works with OpenClaw, Claude Code/Desktop, Cursor, Windsurf, Cline and any MCP-compatible AI client.

Create a skill by wrapping this MCP: https://mcp.zylalabs.com/mcp?apikey=YOUR_ZYLA_API_KEY

AI Gateway API provides developers with a single API endpoint to access 113+ AI models from 17 providers, fully compatible with the OpenAI API format.

Key Features:

Drop-in replacement for OpenAI API (same SDK, same format)
Models include Claude Opus, Sonnet, Haiku, DeepSeek R1 and V3, Llama 3.3 and 4, Qwen 3, Mistral Large, Amazon Nova, and more
Automatic prompt caching with up to 90% cost reduction on cache hits
Streaming support, function calling, and structured outputs
Competitive pricing starting at $0.035 per million tokens

Authentication via API Key (Bearer token or api-key header). Full documentation available on our website.

API Documentation

Endpoints

Chat Completions

Send a chat completion request to any of 113+ AI models. Supports streaming, function calling, structured outputs, and vision. Compatible with the OpenAI chat completions format. Pass a model ID and messages array to get AI-generated responses.

                                                                            
POST https://www.zylalabs.com/api/11938/ai+gateway+api/22690/chat+completions

Chat Completions - Endpoint Features

Object	Description
`Request Body`	[Required] JSON

API EXAMPLE RESPONSE

{
  "id": "chatcmpl-77c6cbc0bb0849298eefe610",
  "object": "chat.completion",
  "created": 1772227604,
  "model": "nova-micro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello there! How are you doing today?",
        "tool_calls": null,
        "reasoning_content": null
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 10,
    "total_tokens": 17,
    "prompt_tokens_details": {"cached_tokens": 0},
    "completion_tokens_details": {"reasoning_tokens": 0}
  },
  "system_fingerprint": null
}

Chat Completions - CODE SNIPPETS


# Content-Type is set explicitly because the request body is JSON.
curl --location --request POST 'https://zylalabs.com/api/11938/ai+gateway+api/22690/chat+completions' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{"model":"nova-micro","messages":[{"role":"user","content":"Say hello in one sentence."}],"max_tokens":50}'
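The same request can be assembled in Python using only the standard library. This is a minimal sketch, not official client code: `build_chat_request` and `send` are hypothetical helper names, the endpoint URL is copied from the curl snippet above, and the `Content-Type: application/json` header is an assumption based on the JSON body.

```python
import json
import urllib.request

# Endpoint URL copied from the curl snippet above.
ENDPOINT = "https://zylalabs.com/api/11938/ai+gateway+api/22690/chat+completions"


def build_chat_request(api_key, model, messages, max_tokens=50):
    """Build (but do not send) a POST request for the Chat Completions endpoint."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the gateway expects a JSON content type for JSON bodies.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request(
    "YOUR_API_KEY",
    "nova-micro",
    [{"role": "user", "content": "Say hello in one sentence."}],
)
# reply = send(request)  # uncomment once a real access key is in place
```

Building the request separately from sending it makes it easy to inspect the headers and body before any quota is spent.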

API Access Key & Authentication

After signing up, every developer is assigned a personal API access key, a unique combination of letters and digits that grants access to our API endpoint. To authenticate with the AI Gateway API, simply include your bearer token in the Authorization header.

Headers

Header	Description
`Authorization`	[Required] Should be `Bearer access_key`. See "Your API Access Key" above when you are subscribed.
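As a rough illustration, the header can be built by a small helper that also covers the `api-key` alternative mentioned in the overview. The helper name `auth_headers`, the environment variable `ZYLA_API_KEY`, and the exact spelling of the `api-key` header are assumptions made for this sketch; only the `Authorization: Bearer access_key` form is documented in the table above.

```python
import os


def auth_headers(access_key, scheme="bearer"):
    """Return the authentication header for the chosen scheme."""
    if scheme == "bearer":
        # Documented form: Authorization: Bearer access_key
        return {"Authorization": f"Bearer {access_key}"}
    if scheme == "api-key":
        # Assumption: the alternative header is literally named "api-key".
        return {"api-key": access_key}
    raise ValueError(f"unknown scheme: {scheme!r}")


# Hypothetical environment variable; falls back to a placeholder when unset.
ACCESS_KEY = os.environ.get("ZYLA_API_KEY", "YOUR_API_KEY")
headers = auth_headers(ACCESS_KEY)
```

Reading the key from the environment keeps it out of source control.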

Simple Transparent Pricing

No long-term commitment. Upgrade, downgrade, or cancel anytime. Free Trial includes up to 50 requests.

Plan	Monthly billing	Annual billing (per-month equivalent)	Requests / Month	Rate Limit
💫Basic	$24.99/Month	$20.83/Month	50,000	100 reqs per minute
⚡Pro (Popular)	$49.99/Month	$41.66/Month	200,000	150 reqs per minute
🔥Pro Plus	$99.99/Month	$83.33/Month	500,000	180 reqs per minute
⚜️Premium	$199.99/Month	$166.66/Month	1,000,000	200 reqs per minute
🌟Elite	$499.99/Month	$416.66/Month	5,000,000	220 reqs per minute
💎Ultimate	$999.99/Month	$833.33/Month	10,000,000	240 reqs per minute

(Save 2 months with annual billing 🎉)

Every plan above includes:

Then $0.0006497 per request if limit exceeded.
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included
Free 7-day trial
No commitment. Cancel anytime

🚀 Enterprise

Starts at
$ 10,000/Year

Custom Volume
Custom Rate Limit
Specialized Customer Support
Real-Time API Monitoring
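
To compare plans for an expected request volume, the monthly prices and the overage rate listed in this section can be combined into a rough estimate. The sketch below is illustrative only: `Plan`, `monthly_cost`, and `cheapest_plan` are hypothetical names, the figures are the monthly-billing prices from this page, and the estimate ignores rate limits, annual billing, and the Enterprise tier.

```python
from dataclasses import dataclass

OVERAGE_PER_REQUEST = 0.0006497  # USD, the same for every listed plan


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_price: float  # USD, monthly billing
    included_requests: int


PLANS = [
    Plan("Basic", 24.99, 50_000),
    Plan("Pro", 49.99, 200_000),
    Plan("Pro Plus", 99.99, 500_000),
    Plan("Premium", 199.99, 1_000_000),
    Plan("Elite", 499.99, 5_000_000),
    Plan("Ultimate", 999.99, 10_000_000),
]


def monthly_cost(plan, requests):
    """Base price plus overage for requests beyond the included quota."""
    extra = max(0, requests - plan.included_requests)
    return round(plan.monthly_price + extra * OVERAGE_PER_REQUEST, 2)


def cheapest_plan(requests):
    """Plan with the lowest estimated cost for a given monthly volume."""
    return min(PLANS, key=lambda plan: monthly_cost(plan, requests))


print(cheapest_plan(60_000).name)   # Basic
print(cheapest_plan(100_000).name)  # Pro
```

At 60,000 requests the Basic plan with overage is still cheaper than Pro; at 100,000 requests the overage on Basic exceeds the price difference, so Pro comes out ahead.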

Customer favorite features

✔︎ Only Pay for Successful Requests
✔︎ Free 7-Day Trial
✔︎ Multi-Language Support
✔︎ One API Key, All APIs.
✔︎ Intuitive Dashboard

✔︎ Comprehensive Error Handling
✔︎ Developer-Friendly Docs
✔︎ Postman Integration
✔︎ Secure HTTPS Connections
✔︎ Reliable Uptime

AI Gateway API FAQs

What type of data does the Chat Completions endpoint return?

The Chat Completions endpoint returns AI-generated responses based on user input. The response includes an ID, model used, generated message content, and token usage details, allowing developers to understand the interaction and resource consumption.

What are the key fields in the response data?

Key fields in the response include "id" (unique identifier), "model" (AI model used), "choices" (array of generated messages), and "usage" (token counts for prompt and completion). These fields help track the request and output effectively.
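
The fields named above can be read with a few dictionary lookups. This is a small sketch using a trimmed copy of the example response from this page; `summarize` is a hypothetical helper name, not part of the API.

```python
import json

# Trimmed copy of the example response shown on this page.
raw = """
{
  "id": "chatcmpl-77c6cbc0bb0849298eefe610",
  "object": "chat.completion",
  "model": "nova-micro",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello there! How are you doing today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 7, "completion_tokens": 10, "total_tokens": 17}
}
"""


def summarize(response):
    """Pull the commonly used fields out of a chat completion response."""
    first = response["choices"][0]
    return {
        "id": response["id"],
        "model": response["model"],
        "reply": first["message"]["content"],
        "total_tokens": response["usage"]["total_tokens"],
    }


summary = summarize(json.loads(raw))
print(summary["reply"])  # prints: Hello there! How are you doing today?
```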

How is the response data organized?

The response data is structured in JSON format, with a main object containing fields like "id," "object," "created," "model," "choices," and "usage." Each entry in "choices" includes a message object detailing the assistant's response, while token usage is reported once in the top-level "usage" object.

What parameters can be used with the Chat Completions endpoint?

Users can customize requests using parameters such as "model" (to specify the AI model), "messages" (to provide conversation context), and optional parameters for streaming and structured outputs. This flexibility allows tailored interactions.

What types of information are available through the Chat Completions endpoint?

The endpoint provides conversational responses, including text replies, structured outputs, and function calls. It supports various AI models, enabling diverse applications from casual chats to complex queries.

How is data accuracy maintained?

Data accuracy is maintained through rigorous model training and evaluation processes. Each AI model undergoes quality checks to ensure reliable outputs, leveraging feedback and continuous improvements to enhance performance.

What are typical use cases for this data?

Typical use cases include customer support automation, content generation, interactive chatbots, and educational tools. Developers can leverage the API for various applications requiring natural language understanding and generation.

How can users effectively utilize the returned data?

Users can extract meaningful insights from the "choices" array, focusing on the "content" field for responses. By analyzing token usage in the "usage" field, developers can optimize interactions and manage resource consumption effectively.

What types of AI models can be accessed through the AI Gateway API?

The AI Gateway API provides access to over 113 AI models from 17 providers, including Claude Opus, Llama 3.3 and 4, DeepSeek R1 and V3, Qwen 3, and Mistral Large. This diverse range allows developers to choose models suited for various applications, from conversational agents to complex data analysis.

How can users customize their requests to the Chat Completions endpoint?

Users can customize requests by specifying the "model" parameter to select a specific AI model and the "messages" parameter to provide context for the conversation. Additional options for streaming and structured outputs can also be included to tailor the interaction further.
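
A request body with the optional settings might be assembled as sketched below. The `stream` flag and the `response_format` structure follow the OpenAI chat completions convention and are assumptions here: this page states that the endpoint is OpenAI-compatible but does not list the exact field names. `build_payload` and the schema name `reply` are hypothetical.

```python
def build_payload(model, messages, stream=False, json_schema=None, max_tokens=None):
    """Assemble a request body; optional fields are added only when set."""
    payload = {"model": model, "messages": messages}
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if stream:
        # Assumption: OpenAI-style flag that requests a streamed response.
        payload["stream"] = True
    if json_schema is not None:
        # Assumption: OpenAI-style structured-output request.
        payload["response_format"] = {
            "type": "json_schema",
            "json_schema": {"name": "reply", "schema": json_schema},
        }
    return payload


messages = [{"role": "user", "content": "Say hello in one sentence."}]
basic = build_payload("nova-micro", messages, max_tokens=50)
structured = build_payload(
    "nova-micro",
    messages,
    json_schema={"type": "object", "properties": {"greeting": {"type": "string"}}},
)
```

Leaving unset options out of the body, rather than sending nulls, keeps the request identical to the minimal example when no extras are needed.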

What is the significance of the "choices" array in the response?

The "choices" array contains the generated responses from the AI model. Each entry includes a "message" object with the assistant's reply, allowing users to access multiple response options and select the most relevant one for their needs.

How does the API handle different data formats?

The AI Gateway API primarily returns data in JSON format, which is structured for easy parsing and integration. This format ensures compatibility with various programming languages and frameworks, facilitating seamless integration into applications.

What should users do if they receive partial or empty results?

If users receive partial or empty results, they should check the input parameters for accuracy and completeness. Additionally, reviewing the "usage" field can provide insights into token consumption, helping to identify potential issues with the request.
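
One way to triage such responses is to inspect `finish_reason` and the message content before anything else. The sketch below assumes the OpenAI convention that `finish_reason` is `"length"` when output is cut off by the token limit; the gateway's exact values are not listed on this page, and `diagnose` is a hypothetical helper.

```python
def diagnose(response):
    """Return a short label describing why a reply may be empty or cut off."""
    choices = response.get("choices") or []
    if not choices:
        return "no_choices"
    choice = choices[0]
    message = choice.get("message") or {}
    if choice.get("finish_reason") == "length":
        # Assumption: OpenAI convention for output that hit the max_tokens cap.
        return "truncated"
    if message.get("tool_calls"):
        # Tool-call replies normally carry no text content.
        return "tool_call"
    if not (message.get("content") or "").strip():
        return "empty"
    return "ok"


example = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "Hello there!", "tool_calls": None},
            "finish_reason": "stop",
        }
    ]
}
print(diagnose(example))  # ok
```

A `truncated` result usually calls for a higher `max_tokens` value, while `empty` or `no_choices` points back at the request parameters.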

What quality assurance measures are in place for the AI models?

The AI models undergo rigorous training and evaluation processes, including continuous feedback loops and performance assessments. This ensures that the models maintain high accuracy and reliability in generating responses, enhancing user trust in the outputs.

How can users effectively utilize the "usage" field in the response?

The "usage" field provides detailed token counts for prompt and completion, allowing users to monitor resource consumption. By analyzing this data, developers can optimize their requests, manage costs, and improve the efficiency of their interactions with the API.
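
As an illustration, the token counts can be condensed into a small summary, including the share of prompt tokens served from the cache. The field names match the example response on this page; `usage_summary` is a hypothetical helper, and the cached share is only a usage indicator, not a billing calculation.

```python
def usage_summary(usage):
    """Condense the "usage" object into a few headline numbers."""
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    cached = (usage.get("prompt_tokens_details") or {}).get("cached_tokens", 0)
    reasoning = (usage.get("completion_tokens_details") or {}).get("reasoning_tokens", 0)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": usage.get("total_tokens", prompt + completion),
        # Fraction of the prompt that was read from the cache (0.0 to 1.0).
        "cached_share": cached / prompt if prompt else 0.0,
        "reasoning_tokens": reasoning,
    }


# "usage" object copied from the example response on this page.
example_usage = {
    "prompt_tokens": 7,
    "completion_tokens": 10,
    "total_tokens": 17,
    "prompt_tokens_details": {"cached_tokens": 0},
    "completion_tokens_details": {"reasoning_tokens": 0},
}
print(usage_summary(example_usage)["total_tokens"])  # 17
```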

What are some common use cases for the AI Gateway API?

Common use cases include building chatbots for customer support, generating creative content, automating data analysis, and developing educational tools. The API's versatility enables developers to create applications that leverage natural language processing for various industries.

General FAQs

What is Zyla API Hub?

Zyla API Hub is like a big store for APIs, where you can find thousands of them all in one place. We also offer dedicated support and real-time monitoring of all APIs. Once you sign up, you can pick and choose which APIs you want to use. Just remember, each API needs its own subscription. But if you subscribe to multiple ones, you'll use the same key for all of them, making things easier for you.

What currencies and payment methods are allowed?

Prices are listed in USD (United States Dollar), EUR (Euro), CAD (Canadian Dollar), AUD (Australian Dollar), and GBP (British Pound). We accept all major debit and credit cards. Our payment system uses the latest security technology and is powered by Stripe, one of the world's most reliable payment companies. If you have any trouble paying by card, just contact us at [email protected]

Additionally, if you already have an active subscription in any of these currencies (USD, EUR, CAD, AUD, GBP), that currency will remain for subsequent subscriptions. You can change the currency at any time as long as you don't have any active subscriptions.

Why can't I pay with my local currency even though I see it on the pricing page?

The local currency shown on the pricing page is based on the country of your IP address and is provided for reference only. The actual prices are in USD (United States Dollar). When you make a payment, the charge will appear on your card statement in USD, even if you see the equivalent amount in your local currency on our website. This means you cannot pay directly with your local currency.

My payment was declined, what should I do?

Occasionally, a bank may decline the charge due to its fraud protection settings. We suggest contacting your bank first to check whether they are blocking our charges. You can also open the Billing Portal and change the card associated with the payment. If this does not work and you need further assistance, please contact our team at [email protected]

How will I be charged for my API subscription?

Prices are determined by a recurring monthly or yearly subscription, depending on the chosen plan.

How will my API calls be deducted from my plan?

API calls are deducted from your plan based on successful requests. Each plan comes with a specific number of calls that you can make per month. Only successful calls, indicated by a Status 200 response, will be counted against your total. This ensures that failed or incomplete requests do not impact your monthly quota.

How does your billing cycle work?

Zyla API Hub works on a recurring monthly subscription system. Your billing cycle will start the day you purchase one of the paid plans, and it will renew on the same day of the next month. Be sure to cancel your subscription beforehand if you want to avoid future charges.

How do I upgrade my current subscription plan with an API?

To upgrade your current subscription plan, simply go to the pricing page of the API and select the plan you want to upgrade to. The upgrade will be instant, allowing you to immediately enjoy the features of the new plan. Please note that any remaining calls from your previous plan will not be carried over to the new plan, so be aware of this when upgrading. You will be charged the full amount of the new plan.

How can I see the remaining number of API calls I can make this month?

To check how many API calls you have left for the current month, refer to the 'X-Zyla-API-Calls-Monthly-Remaining' field in the response header. For example, if your plan allows 1,000 requests per month and you've used 100, this field in the response header will indicate 900 remaining calls.

How do I find out the maximum number of API requests allowed in my subscription plan?

To see the maximum number of API requests your plan allows, check the 'X-Zyla-RateLimit-Limit' response header. For instance, if your plan includes 1,000 requests per month, this header will display 1,000.

How do I know when my rate limit will reset?

The 'X-Zyla-RateLimit-Reset' header shows the number of seconds until your rate limit resets. This tells you when your request count will start fresh. For example, if it displays 3,600, it means 3,600 seconds are left until the limit resets.
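
The three headers described in the answers above can be read from any response in one pass. This is a sketch under assumptions: `quota_from_headers` is a hypothetical helper, the sample values are invented for illustration, and the handling of thousands separators is a precaution in case values arrive formatted like "1,000".

```python
def quota_from_headers(headers):
    """Read the Zyla quota headers from a response-header mapping."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        value = lowered.get(name.lower())
        if value is None:
            return None
        # Precaution: strip thousands separators before converting.
        return int(str(value).replace(",", ""))

    return {
        "monthly_remaining": as_int("X-Zyla-API-Calls-Monthly-Remaining"),
        "limit": as_int("X-Zyla-RateLimit-Limit"),
        "reset_seconds": as_int("X-Zyla-RateLimit-Reset"),
    }


# Invented sample values, mirroring the examples in the answers above.
sample = {
    "X-Zyla-API-Calls-Monthly-Remaining": "900",
    "X-Zyla-RateLimit-Limit": "1,000",
    "X-Zyla-RateLimit-Reset": "3600",
}
quota = quota_from_headers(sample)
print(quota["monthly_remaining"])  # 900
```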

Can I cancel anytime?

Yes, you can cancel your plan anytime by going to your account and selecting the cancellation option on the Billing page. Please note that upgrades, downgrades, and cancellations take effect immediately. Additionally, upon cancellation, you will no longer have access to the service, even if you have remaining calls left in your quota.

If I have any problems, who should I contact?

You can contact us through our chat channel to receive immediate assistance. We are always online from 8 am to 5 pm (EST). If you reach us after that time, we will get back to you as soon as possible. Additionally, you can contact us via email at [email protected]

How does the 7-day free trial work?

To give you the opportunity to experience our APIs without any commitment, we offer a 7-day free trial that allows you to make up to 50 API calls at no cost. This trial can be used only once, so we recommend applying it to the API that interests you the most. While most of our APIs offer a free trial, some may not. The trial concludes after 7 days or once you've made 50 requests, whichever occurs first. If you reach the 50 request limit during the trial, you will need to "Start Your Paid Plan" to continue making requests. You can find the "Start Your Paid Plan" button in your profile under Subscription -> Choose the API you are subscribed to -> Pricing tab. Alternatively, if you don't cancel your subscription before the 7th day, your free trial will end, and your plan will automatically be billed, granting you access to all the API calls specified in your plan. Please keep this in mind to avoid unwanted charges.

What happens if I forget to cancel my free trial?

After 7 days, you will be charged the full amount for the plan you were subscribed to during the trial. Therefore, it's important to cancel before the trial period ends. Refund requests for forgetting to cancel on time are not accepted.

How many calls can I make during the free trial?

When you subscribe to an API free trial, you can make up to 50 API calls. If you wish to make additional API calls beyond this limit, the API will prompt you to "Start Your Paid Plan." You can find the "Start Your Paid Plan" button in your profile under Subscription -> Choose the API you are subscribed to -> Pricing tab.

When are Payout Orders processed?

Payout Orders are processed between the 20th and the 30th of each month. If you submit your request before the 20th, your payment will be processed within this timeframe.

Service Level: 100%
Response Time: 2,178ms
Category: AI & Machine Learning
Tags: #AI Model Access, #Unified Billing, #Prompt Caching, #Cost Optimization, #Multi-Model Support, #OpenAI Compatibility
