AI Gateway


AI Gateway is an API for working directly with language models without going through AI agents.

Unlike the OpenAI-compatible API and the native API, AI Gateway doesn't provide agent-level capabilities such as:

  • RAG
  • MCP
  • System-level configuration (system prompt, temperature, response length limits)

AI Gateway is the right choice when you need to work with models directly, without any agent logic. For example:

  • You're building your own logic on top of models (custom RAG, MCP integrations)
  • You need a single access point for multiple models
  • You're already using the OpenAI SDK and want to switch between models without changing your code
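As a rough illustration of switching models without changing code, the sketch below sends one prompt to several models through the same client. The helper name ask_each is hypothetical, and the client is assumed to be an OpenAI SDK client configured with the AI Gateway base URL, as in the usage examples later on this page.

```python
def ask_each(client, model_ids, prompt):
    """Send the same prompt to several models and collect the replies."""
    replies = {}
    for model_id in model_ids:
        # Only the model name changes between calls; the rest is identical.
        response = client.chat.completions.create(
            model=model_id,
            messages=[{"role": "user", "content": prompt}],
        )
        replies[model_id] = response.choices[0].message.content
    return replies
```

Calling ask_each(client, ["MODEL_A", "MODEL_B"], "Explain what Kubernetes is") would return a dictionary keyed by model name. MODEL_A and MODEL_B are placeholders, not real model IDs.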

AI Gateway vs the OpenAI Compatible API

|                       | OpenAI Compatible API               | AI Gateway               |
|-----------------------|-------------------------------------|--------------------------|
| Operates on           | An agent                            | A model                  |
| RAG / MCP             | Supported                           | Not supported            |
| System-level settings | Configured in the agent's dashboard | Passed with each request |
| Model selection       | The agent's assigned model          | Any available model      |
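Because AI Gateway has no agent to hold system-level settings, they travel with every request. The sketch below is one way to bundle them; build_chat_request is a hypothetical helper, and temperature and max_tokens are the standard Chat Completions parameters, which a given model may or may not accept.

```python
def build_chat_request(model, system_prompt, user_text,
                       temperature=0.2, max_tokens=300):
    """Bundle the settings an agent would normally hold into one request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        # Sampling temperature and a cap on reply length, sent per request.
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


request = build_chat_request(
    model="MODEL_NAME",
    system_prompt="Answer briefly and to the point.",
    user_text="Explain what Kubernetes is",
)
```

The resulting dictionary can be passed to the SDK as client.chat.completions.create(**request).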

We recommend using the OpenAI SDK to work with the API. It abstracts away the differences and simplifies integration.

Supported methods and capabilities depend on the model you're using. For example, the responses endpoint is only available for models that support reasoning.

AI Gateway does not support:

  • Files API for uploading and managing files
  • Fine-tuning for training and customizing models
  • Video API for working with video

Connecting to AI Gateway

To connect to AI Gateway, go to the AI Gateway tab in the AI Agents section.

Creating an API Key

AI Gateway uses its own keys, which aren't linked to your account-level API keys.

To create a key:

  1. Go to AI Agents → AI Gateway.
  2. Click Create key.
  3. Enter a name for the key and select the validity period.
  4. (Optional) Add a description that will show in the dashboard.
  5. (Optional) Select the project.
  6. Click Create.
  7. Copy the generated key and store it locally.
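One common way to keep a stored key out of source code is to read it from an environment variable. The sketch below assumes a variable named AI_GATEWAY_API_KEY; that name and the load_api_key helper are illustrative choices, not something AI Gateway requires.

```python
import os


def load_api_key(env=os.environ, var_name="AI_GATEWAY_API_KEY"):
    """Return the key from the environment, failing loudly if it is missing."""
    # var_name is a hypothetical variable name chosen for this example.
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before connecting to AI Gateway")
    return key
```

The client can then be created with OpenAI(api_key=load_api_key(), base_url="https://ai-api.hostman.com/v1") instead of a hardcoded string.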

Deleting an API Key

To delete an API key:

  1. Go to AI Agents → AI Gateway.
  2. Open the API Keys tab.
  3. Click the three-dot menu next to the key you want to remove.
  4. Click Delete.
  5. Confirm the action.

Getting Connection Details

You can find the connection details in the Connection tab.

  1. Go to AI Agents → AI Gateway.
  2. Open the Connection tab.
  3. Select a model and a programming language. The interface will show the input and output token pricing for the selected model.
  4. Copy the code sample for connecting and the command for installing the OpenAI library.

Usage Example

We recommend using the OpenAI SDK to work with AI Gateway. It removes the need to send raw HTTP requests and makes integration much simpler.

The full list of available SDKs is in the OpenAI repository.

The examples below use Python and the openai library. Install it with pip:

pip install openai

Sending a Request

Use the Chat Completions method to send messages. Messages are passed in the messages array.

Each message contains:

  • role: the sender's role (user, assistant, system)
  • content: the message text

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai-api.hostman.com/v1"
)

response = client.chat.completions.create(
    model="MODEL_NAME",
    messages=[
        {
            "role": "system",
            "content": "Answer briefly and to the point.",
        },
        {
            "role": "user",
            "content": "Explain what Kubernetes is",
        },
    ],
)

print(response.choices[0].message.content)

Parameters:

  • api_key: your AI Gateway API key. Replace this with your own key.
  • base_url: the base URL for connecting to AI Gateway.
  • model: the name of the model you want to use.
  • messages: an array of messages with roles and text.

Sending a Request With Message History

To preserve conversation context, pass previous messages in the messages array:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai-api.hostman.com/v1"
)

response = client.chat.completions.create(
    model="MODEL_NAME",
    messages=[
        {
            "role": "system",
            "content": "Reply only in short phrases.",
        },
        {
            "role": "user",
            "content": "What's 2 + 5?",
        },
        {
            "role": "assistant",
            "content": "7",
        },
        {
            "role": "user",
            "content": "Now multiply the result by 2",
        },
    ],
)

print(response.choices[0].message.content)

In this example, the earlier user and assistant messages are included so that the model can resolve "the result" in the final message.
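For a longer dialogue, the history can be accumulated rather than written out by hand. The following is a minimal sketch, assuming a client built as in the example above; the Conversation class is a hypothetical helper, not part of the OpenAI SDK or AI Gateway.

```python
class Conversation:
    """Keeps the running message list for one Chat Completions dialogue."""

    def __init__(self, client, model, system_prompt=None):
        self.client = client
        self.model = model
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def ask(self, text):
        # Add the new user turn, then send a snapshot of the whole history.
        self.messages.append({"role": "user", "content": text})
        response = self.client.chat.completions.create(
            model=self.model,
            messages=list(self.messages),
        )
        reply = response.choices[0].message.content
        # Store the reply so the next request includes it as context.
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

With chat = Conversation(client, "MODEL_NAME", "Reply only in short phrases."), calling chat.ask("What's 2 + 5?") and then chat.ask("Now multiply the result by 2") sends the same four messages as the example above. Since the full history is resent each time, long conversations use more input tokens per request.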

Sending a Request (Responses API)

The Responses API is a newer way to work with models. It simplifies the request structure and doesn't require building a messages array explicitly.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai-api.hostman.com/v1"
)

response = client.responses.create(
    model="MODEL_NAME",
    instructions="Answer briefly and to the point.",
    input="Explain what Kubernetes is"
)

print(response.output_text)

Parameters:

  • model: the name of the model you want to use
  • instructions: instructions for the model (similar to a system prompt)
  • input: the request text

Sending a Request With Message History (Responses API)

To preserve conversation context with the Responses API, pass previous_response_id, which is the ID of the previous response.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai-api.hostman.com/v1"
)

response = client.responses.create(
    model="MODEL_NAME",
    instructions="Reply only in short phrases.",
    input="What's 2 + 5?"
)

next_response = client.responses.create(
    model="MODEL_NAME",
    instructions="Reply only in short phrases.",
    previous_response_id=response.id,
    input="Now multiply the result by 2"
)

print(next_response.output_text)

In this example, the first request returns a response object containing a unique id. This id is passed as previous_response_id in the next request, letting you continue the conversation without sending the full message history again.

The model parameter must be specified in every request, including follow-up calls that use previous_response_id.
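The chaining pattern can be wrapped so that each turn passes the previous response ID automatically. This is a sketch under the same assumptions as the example above (a client pointed at AI Gateway and a model that supports the Responses API); ResponseThread is a hypothetical helper name.

```python
class ResponseThread:
    """Chains Responses API calls through previous_response_id."""

    def __init__(self, client, model, instructions=None):
        self.client = client
        self.model = model
        self.instructions = instructions
        self.last_id = None  # ID of the most recent response, if any

    def ask(self, text):
        # model is sent on every call, including follow-ups.
        kwargs = {"model": self.model, "input": text}
        if self.instructions:
            kwargs["instructions"] = self.instructions
        if self.last_id is not None:
            kwargs["previous_response_id"] = self.last_id
        response = self.client.responses.create(**kwargs)
        self.last_id = response.id
        return response.output_text
```

The first call to ask() omits previous_response_id; every later call includes the ID returned by the call before it.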

Listing Available Models

AI Gateway lets you retrieve a list of available models:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai-api.hostman.com/v1"
)

models = client.models.list()

for model in models.data:
    print(model.id)

The models.list() method returns a list of models you can use in the model parameter.
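The list can also be used to check a model name before sending a request. The sketch below assumes the same client as above; available_model_ids and pick_model are hypothetical helpers written for this example.

```python
def available_model_ids(client):
    """Return the IDs reported by models.list() as a sorted list."""
    return sorted(model.id for model in client.models.list().data)


def pick_model(client, preferred):
    """Return the first preferred model that the gateway actually lists."""
    available = set(available_model_ids(client))
    for model_id in preferred:
        if model_id in available:
            return model_id
    raise LookupError(f"None of {preferred} are available")
```

Passing an ordered list of candidates to pick_model gives a simple fallback: the first name that appears in the gateway's list is used, and a LookupError is raised if none match.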

Using Embeddings

Embeddings convert text into a vector representation. This is useful for semantic search, clustering, or RAG.

AI Gateway provides the openai/text-embedding-3-large model for creating embeddings.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai-api.hostman.com/v1"
)

response = client.embeddings.create(
    model="openai/text-embedding-3-large",
    input="Text to vectorize",
)

print(response.data[0].embedding)

The method returns a vector representation of the input text.
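For semantic search, embedding vectors are typically compared with cosine similarity. The following sketch uses plain Python and works on any lists of floats, such as the embedding values returned above; the function names are illustrative, and a numeric library would be a better fit for large collections.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indexes ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

To rank documents against a query, embed each text with client.embeddings.create, collect the embedding lists, and pass them to rank_by_similarity.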