Updated on 20 February 2026

The OpenAI-compatible API provides a unified request and response structure for different AI providers. Instead of dealing with dozens of different formats, you work with a single standard that is compatible with SDKs, libraries, and UIs.

Why Unification Matters

Consider two native APIs: agent.hostman.com and a hypothetical api.somerandomai.com. Let’s see how message sending is implemented.

Even for the same task, request structures can differ significantly. For example, one API uses parent_message_id to maintain context, while another uses a nested object:

"context": {
 "thread_id": "abc123",
 "reply_to": "msg789"
}

Additionally, with our API, settings are configured via a separate request, whereas in the third-party API, they are included in the settings object with every message.

As a result, you would need to:

Write a separate client for each API
Learn each API’s field names and behavior
Test everything from scratch

The OpenAI-compatible API solves this problem. The structure is always the same: a messages array with role and content fields, plus standard parameters (model, temperature, max_tokens). Differences remain only in the URL and model name.
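
To make this concrete, here is a minimal sketch that builds the same request body for two endpoints. The helper name build_chat_request, the AGENT_ID placeholder, the path on api.somerandomai.com, and the model name "some-model" are assumptions for illustration only. The body fields (model, messages, temperature, max_tokens) are the ones named above.

```python
import json


def build_chat_request(url, model, user_text, temperature=1, max_tokens=100):
    """Return (url, JSON body) for an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return url, json.dumps(body)


# Only the URL and the model name differ; both values are placeholders.
hostman = build_chat_request(
    "https://agent.hostman.com/api/v1/cloud-ai/agents/AGENT_ID/v1/chat/completions",
    "gpt-4.1",
    "Hello!",
)
other = build_chat_request(
    "https://api.somerandomai.com/v1/chat/completions",  # hypothetical path
    "some-model",                                        # hypothetical model
    "Hello!",
)
```

The same client code can then serve both providers, with the URL and model name kept in configuration.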

Benefits:

Fast integration with any software supporting the OpenAI API
Support for popular libraries (e.g., LangChain)
Works with ready-made UIs (e.g., Open WebUI)
Minimal effort when migrating from OpenAI

Limitations

The current implementation is not fully OpenAI-compatible.

Not supported:

Embeddings: the /v1/embeddings endpoint is unavailable
Fine-tuning: training and customization of models
Images API: image generation
Audio API: speech-to-text and text-to-speech conversion (except for audio messages in chat)
Files API: file upload and management
Assistants API: creation and management of assistants
Web search: searching and retrieving data from the internet

Implementation notes:

The model parameter in requests is ignored; the model defined in the agent settings is used instead.
Some parameters may be ignored depending on the agent’s selected model.
The prompt defined in the agent’s settings is applied to requests without explicit prompts.

Usage

Let’s look at how to use the OpenAI-compatible API.

You can find all available API methods in the documentation.

Authentication

Regardless of the API type (private or public), an API token is required for requests.

Include the token in the request header:

Authorization: Bearer $TOKEN

In cURL examples, you can:

Provide the token manually by replacing $TOKEN with your actual token in each request, or
Use an environment variable to avoid inserting it every time:

export TOKEN=your_access_token

The $TOKEN variable will then be automatically substituted.

In Python and Node.js examples, the token is represented as {{token}}. We recommend storing it in environment variables or configuration files instead of in code to prevent leaks.
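
One way to follow this recommendation in Python is to read the token from the environment when building the headers. This is a sketch, not part of the API: the helper name auth_headers is hypothetical, and it assumes the token was exported as TOKEN, as in the cURL instructions above.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from the TOKEN environment variable."""
    token = env.get("TOKEN")
    if not token:
        # Fail early instead of sending a request with an empty token.
        raise RuntimeError("TOKEN environment variable is not set")
    return {
        "authorization": f"Bearer {token}",
        "content-type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of requests.post in place of the hard-coded {{token}} value.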

Base URL

The API also requires a base URL, which can be found in the Dashboard tab of the agent’s control panel.

Sending Messages to the Agent

Two methods are supported: Chat Completions and Text Completions. The Text Completions method is deprecated and only maintained for backward compatibility; it is not recommended.

Chat Completions Usage Example

POST /api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions

cURL:

curl --request POST \
  --url https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions \
  --header "authorization: Bearer $TOKEN" \
  --header 'content-type: application/json' \
  --data '{ "model": "gpt-4.1", "messages": [ { "role": "user", "content": "Hello!" } ], "temperature": 1, "max_tokens": 100, "stream": false }'

Python:

import requests

url = "https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions"

payload = {
    "model": "gpt-4.1",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        }
    ],
    "temperature": 1,
    "max_tokens": 100,
    "stream": False
}
headers = {
    "authorization": "Bearer {{token}}",
    "content-type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())

Node.js:

const request = require('request');

const options = {
  method: 'POST',
  url: 'https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions',
  headers: {authorization: 'Bearer {{token}}', 'content-type': 'application/json'},
  body: {
    model: 'gpt-4.1',
    messages: [{role: 'user', content: 'Hello!'}],
    temperature: 1,
    max_tokens: 100,
    stream: false
  },
  json: true
};

request(options, function (error, response, body) {
  if (error) throw new Error(error);

  console.log(body);
});

Parameters:

model: optional, ignored (for compatibility)
messages: array of messages:
- role: sender role (user, assistant, system)
- content: message text
temperature: sampling temperature; higher values make responses more varied
max_tokens: response length limit
stream: stream output (true/false)

For GPT-5 models, max_tokens is replaced with max_completion_tokens, and using temperature will trigger an error.

Additional parameters may vary depending on the model. When constructing a request, refer to the parameters available in the control panel for the selected model: if a parameter is present in the panel, it is supported when accessed through the API.
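
The GPT-5 note above can be handled on the client side before the request is sent. The sketch below uses a hypothetical helper, adapt_for_gpt5. It assumes the caller already knows that the agent is configured with a GPT-5 model, since the model field in the request is ignored.

```python
def adapt_for_gpt5(payload):
    """Return a copy of the payload adjusted for an agent using a GPT-5 model."""
    adapted = dict(payload)  # shallow copy; the caller's dict is not modified
    if "max_tokens" in adapted:
        # GPT-5 models expect max_completion_tokens instead of max_tokens.
        adapted["max_completion_tokens"] = adapted.pop("max_tokens")
    # temperature triggers an error with GPT-5 models, so drop it.
    adapted.pop("temperature", None)
    return adapted


payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 1,
    "max_tokens": 100,
    "stream": False,
}
gpt5_payload = adapt_for_gpt5(payload)
```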

Example Response:

{
  "id": "fc8cd652-af12-4a89-8ef7-490a0526e8d3",
  "object": "chat.completion",
  "created": 1757601532,
  "model": "gpt-4.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you? 😊"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}
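
A response of this shape can be unpacked with a few dictionary lookups. The sketch below works on a trimmed copy of the example response; the helper name extract_reply is hypothetical, and the code assumes the choices array contains at least one element.

```python
def extract_reply(response):
    """Return (text, finish_reason, total_tokens) from a chat completion response."""
    choice = response["choices"][0]  # the examples here return a single choice
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


# Trimmed copy of the example response above.
sample = {
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 10, "completion_tokens": 8, "total_tokens": 18},
}

text, finish_reason, total_tokens = extract_reply(sample)
```

With the requests example above, the input would be response.json().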

Text Completions Usage Example

POST /api/v1/cloud-ai/agents/{{agent_id}}/v1/completions

cURL:

curl --request POST \
  --url https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/completions \
  --header "authorization: Bearer $TOKEN" \
  --header 'content-type: application/json' \
  --data '{ "prompt": "Hello!", "model": "gpt-4.1", "max_tokens": 100, "temperature": 0.7, "top_p": 0.9, "n": 1, "stream": false, "logprobs": null, "echo": false, "stop": [ "\n" ], "presence_penalty": 0, "frequency_penalty": 0, "best_of": 1, "user": "hostman" }'

Python:

import requests

url = "https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/completions"

payload = {
    "prompt": "Hello!",
    "model": "gpt-4.1",
    "max_tokens": 100,
    "temperature": 0.7,
    "top_p": 0.9,
    "n": 1,
    "stream": False,
    "logprobs": None,
    "echo": False,
    "stop": ["\n"],
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "best_of": 1,
    "user": "hostman"
}
headers = {
    "authorization": "Bearer {{token}}",
    "content-type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())

Node.js:

const request = require('request');

const options = {
  method: 'POST',
  url: 'https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/completions',
  headers: {authorization: 'Bearer {{token}}', 'content-type': 'application/json'},
  body: {
    prompt: 'Hello!',
    model: 'gpt-4.1',
    max_tokens: 100,
    temperature: 0.7,
    top_p: 0.9,
    n: 1,
    stream: false,
    logprobs: null,
    echo: false,
    stop: ['\n'],
    presence_penalty: 0,
    frequency_penalty: 0,
    best_of: 1,
    user: 'hostman'
  },
  json: true
};

request(options, function (error, response, body) {
  if (error) throw new Error(error);

  console.log(body);
});

Parameters:

prompt: text of the request
model: ignored, included for compatibility
max_tokens: response length limit
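temperature: sampling temperature; higher values make responses more varied
top_p: nucleus sampling threshold; the model samples only from the most likely tokens whose cumulative probability reaches top_p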
n: number of responses (ignored)
stream: stream output

Other parameters (logprobs, echo, stop, presence_penalty, frequency_penalty, best_of, user) are partially supported, mainly for compatibility.

Example Response:

{
  "id": "7f967459-428c-46ad-87f0-213ff951024d",
  "object": "text_completion",
  "created": 1757601892,
  "model": "gpt-4.1",
  "choices": [
    {
      "text": "Hello! How can I help you? 😊",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  },
  "response_id": "ea9aa124-0c51-467c-9ab6-218ab4ec65e7"
}

Roles

In Chat Completions, each message includes a role. There are three roles:

user: user request
assistant: AI response
system: system prompt

user

The user role is used to send regular user requests to the AI.

Example:

{
  "role": "user",
  "content": "What is 2+5?"
}

assistant

This role indicates the AI’s reply to a previous message.

Important: In the OpenAI-compatible API, the dialogue history must be included in every request. Include both user requests and assistant responses to maintain context.

Example:

curl --request POST \
  --url https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions \
  --header "authorization: Bearer $TOKEN" \
  --header 'content-type: application/json' \
  --data '{
    "model": "gpt-4",
    "messages": [
      { "role": "user", "content": "What is 2+5? Provide only the answer, no formatting." },
      { "role": "assistant", "content": "7" },
      { "role": "user", "content": "Now multiply the result by 2. Provide only the answer, no formatting." }
    ]
  }'

The request passes the previous question and answer, marking messages as "role": "user" or "role": "assistant". The last user message is left without a response; this is what the model will generate a new answer for.

Example Response (fragment: the first element of the choices array):

{
  "index": 0,
  "message": {
    "role": "assistant",
    "content": "14",
    "refusal": null,
    "annotations": []
  },
  "finish_reason": "stop"
}
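
Because the history must be resent with every request, a client usually keeps it in memory and appends each reply. The sketch below shows one way to do that. The Conversation class and fake_send are hypothetical names, and fake_send stands in for the real HTTP call, so the example runs without network access.

```python
class Conversation:
    """Keeps the full dialogue history and resends it with every request."""

    def __init__(self, send):
        # send: callable that takes the messages list and returns the reply text.
        self.send = send
        self.messages = []

    def ask(self, text):
        self.messages.append({"role": "user", "content": text})
        reply = self.send(list(self.messages))  # pass a copy of the history
        self.messages.append({"role": "assistant", "content": reply})
        return reply


seen = []  # records how many messages each request carried


def fake_send(messages):
    """Stand-in for the HTTP call; a real version would POST to the agent."""
    seen.append(len(messages))
    return "stub reply"


chat = Conversation(fake_send)
chat.ask("What is 2+5?")
chat.ask("Now multiply the result by 2.")
print(seen)                                # [1, 3]
print([m["role"] for m in chat.messages])  # ['user', 'assistant', 'user', 'assistant']
```

The second request carries three messages: the first question, the stored reply, and the new question.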

system

The system role defines the agent’s behavior: style, tone, constraints, and goals. It is usually the first message in the messages array. See the article on system prompts for details.

In the assistant role example above, the same instruction is repeated in each user request:

Provide only the answer, no formatting

This duplication can be avoided by specifying the instruction once in the system prompt.

Example:

curl --request POST \
  --url https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions \
  --header "authorization: Bearer $TOKEN" \
  --header 'content-type: application/json' \
  --data '{
    "model": "gpt-4",
    "messages": [
      { "role": "system", "content": "When answering questions, provide only the calculation result, without any formatting." },
      { "role": "user", "content": "What is 2+5?" },
      { "role": "assistant", "content": "7" },
      { "role": "user", "content": "Now multiply the result by 2" }
    ]
  }'

Now the model will follow the instruction even if it’s not repeated in every user message.
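
In client code, the same idea can be expressed as a small helper that places the instruction at the start of the messages array. This is a sketch with a hypothetical function name, with_system_prompt. Dropping any earlier system messages is a design choice made here for simplicity, not a requirement of the API.

```python
def with_system_prompt(messages, instruction):
    """Return a new messages list with a single system message first."""
    # Drop earlier system messages so the instruction appears exactly once.
    rest = [m for m in messages if m["role"] != "system"]
    return [{"role": "system", "content": instruction}] + rest


history = [
    {"role": "user", "content": "What is 2+5?"},
    {"role": "assistant", "content": "7"},
    {"role": "user", "content": "Now multiply the result by 2"},
]
messages = with_system_prompt(
    history,
    "When answering questions, provide only the calculation result, without any formatting.",
)
```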
