OpenAI-compatible API
The OpenAI-compatible API provides a unified request and response structure for different AI providers. Instead of dealing with dozens of different formats, you work with a single standard that is compatible with SDKs, libraries, and UIs.
Why Unification Matters
Consider two native APIs: agent.hostman.com and a hypothetical api.somerandomai.com. Let’s see how message sending is implemented.

Even for the same task, request structures can differ significantly. For example, one API uses parent_message_id to maintain context, while another uses a nested object:
"context": {
 "thread_id": "abc123",
 "reply_to": "msg789"
}
Additionally, with our API, settings are configured via a separate request, whereas in the third-party API they are included in a settings object with every message.
As a result, you would need to:
- Write a separate client for each API
- Learn each API’s field names and behavior
- Test everything from scratch
The OpenAI-compatible API solves this problem. The structure is always the same: a messages array with role and content fields, plus standard parameters (model, temperature, max_tokens). Differences remain only in the URL and model name.
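As a rough illustration of that point, the sketch below builds the same payload for two providers. The provider table is hypothetical: api.somerandomai.com is the made-up service from the text above, and its path and model name are placeholders, not real endpoints.

```python
from typing import Any

# Hypothetical provider table: only the URL and the model name differ.
# The second entry is the made-up provider from the text; its path and
# model name are placeholders for illustration only.
PROVIDERS = {
    "hostman": {
        "url": "https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions",
        "model": "gpt-4.1",
    },
    "somerandomai": {
        "url": "https://api.somerandomai.com/v1/chat/completions",
        "model": "some-model",
    },
}


def build_chat_request(provider: str, user_text: str) -> tuple[str, dict[str, Any]]:
    """Return (url, payload); the payload shape is the same for every provider."""
    cfg = PROVIDERS[provider]
    payload = {
        "model": cfg["model"],
        "messages": [{"role": "user", "content": user_text}],
        "temperature": 1,
        "max_tokens": 100,
    }
    return cfg["url"], payload


url_a, payload_a = build_chat_request("hostman", "Hello!")
url_b, payload_b = build_chat_request("somerandomai", "Hello!")
# Same keys and same messages structure; only the URL and model differ.
print(payload_a.keys() == payload_b.keys())
```

One client function covers both services, so no per-provider client has to be written or tested.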

Benefits:
- Fast integration with any software supporting the OpenAI API
- Support for popular libraries (e.g., LangChain)
- Works with ready-made UIs (e.g., Open WebUI)
- Minimal effort when migrating from OpenAI
Limitations
The current implementation is not fully OpenAI-compatible.
Not supported:
- Embeddings: the /v1/embeddings endpoint is unavailable
- Fine-tuning: training and customization of models
- Images API: image generation
- Audio API: speech-to-text and text-to-speech conversion (except for audio messages in chat)
- Files API: file upload and management
- Assistants API: creation and management of assistants
- Web search: searching and retrieving data from the internet
Implementation notes:
- The model parameter in requests is ignored; the model defined in the agent settings is used.
- Some parameters may be ignored depending on the agent’s selected model.
- The prompt defined in the agent’s settings is applied to requests without explicit prompts.
Usage
Let’s look at how to use the OpenAI-compatible API.
Authentication
An API token is required for requests.
Include the token in the request header:
Authorization: Bearer $TOKEN
In cURL examples, you can:
- Provide the token manually by replacing $TOKEN with your actual token in each request, or
- Use an environment variable to avoid inserting it every time:
export TOKEN=your_access_token
The shell then substitutes $TOKEN automatically, provided the header is enclosed in double quotes; inside single quotes, $TOKEN is sent literally.
In Python and Node.js examples, the token is represented as {{token}}. We recommend storing it in environment variables or configuration files instead of in code to prevent leaks.
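A minimal sketch of that recommendation in Python, assuming the token is stored in an environment variable named TOKEN (the same name the cURL examples use). The helper name auth_headers is illustrative, not part of the API.

```python
import os


def auth_headers(env_var: str = "TOKEN") -> dict[str, str]:
    """Build request headers from a token stored in the environment."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early instead of sending a request with an empty token.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {
        "authorization": f"Bearer {token}",
        "content-type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument in the Python examples, in place of the hard-coded {{token}} placeholder.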
Base URL
The API also requires a base URL, which can be found in the Dashboard tab of the agent’s control panel.
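The sketch below shows one way to combine a base URL with a method path. It assumes the base URL has the shape seen in the request examples in this article (ending in /v1); the actual value should be copied from the Dashboard tab. The helper name endpoint is hypothetical.

```python
def endpoint(base_url: str, path: str) -> str:
    """Join the agent's base URL and an API path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Placeholder in the shape used by the examples in this article;
# replace it with the base URL shown in the agent's control panel.
BASE_URL = "https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1"

print(endpoint(BASE_URL, "chat/completions"))
```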

Sending Messages to the Agent
Two methods are supported: Chat Completions and Text Completions. The Text Completions method is deprecated and only maintained for backward compatibility; it is not recommended.
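For code that still builds Text Completions payloads, moving to Chat Completions mostly means wrapping the prompt in a messages array. The converter below is a hedged sketch: the function name and the list of copied parameters are assumptions based on the parameters both methods use in this article's examples.

```python
from typing import Any

# Parameters that appear in both the chat and text examples in this article.
SHARED_PARAMS = ("model", "max_tokens", "temperature", "stream")


def to_chat_payload(text_payload: dict[str, Any]) -> dict[str, Any]:
    """Convert a Text Completions payload into a Chat Completions payload."""
    chat: dict[str, Any] = {
        # The legacy prompt becomes a single user message.
        "messages": [{"role": "user", "content": text_payload["prompt"]}]
    }
    for key in SHARED_PARAMS:
        if key in text_payload:
            chat[key] = text_payload[key]
    return chat
```

Parameters outside the shared list are dropped rather than forwarded, which keeps the sketch conservative about what the chat method accepts.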
Chat Completions Usage Example
POST /api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions
cURL:
curl --request POST \
  --url https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions \
  --header "authorization: Bearer $TOKEN" \
  --header 'content-type: application/json' \
  --data '{ "model": "gpt-4.1", "messages": [ { "role": "user", "content": "Hello!" } ], "temperature": 1, "max_tokens": 100, "stream": false }'
Python:
import requests

url = "https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions"

# "model" is ignored by the agent and included only for compatibility
payload = {
    "model": "gpt-4.1",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        }
    ],
    "temperature": 1,
    "max_tokens": 100,
    "stream": False
}
headers = {
    "authorization": "Bearer {{token}}",
    "content-type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
Node.js:
const request = require('request');

const options = {
  method: 'POST',
  url: 'https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions',
  headers: {authorization: 'Bearer {{token}}', 'content-type': 'application/json'},
  body: {
    model: 'gpt-4.1', // ignored by the agent, included for compatibility
    messages: [{role: 'user', content: 'Hello!'}],
    temperature: 1,
    max_tokens: 100,
    stream: false
  },
  json: true // serialize the body as JSON and parse the JSON response
};

request(options, function (error, response, body) {
  if (error) throw error;
  console.log(body);
});
Parameters:
- model: optional, ignored (for compatibility)
- messages: array of messages:
  - role: sender role (user, assistant, system)
  - content: message text
- temperature: creativity of the response
- max_tokens: response length limit
- stream: stream output (true/false)
Additional parameters may vary depending on the model. When constructing a request, refer to the parameters available in the control panel for the selected model: if a parameter is present in the panel, it is supported when accessed through the API.
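When stream is set to true, the reply arrives in pieces rather than as one JSON object. The sketch below assumes the agent follows the OpenAI streaming format (server-sent event lines prefixed with "data:", text in choices[].delta.content, and a final "data: [DONE]" line); this article does not confirm that format, so treat it as an assumption. The sample lines are hand-written, not captured output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from server-sent-event lines of a streamed reply."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker in the OpenAI format
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample in the assumed OpenAI chunk shape.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

With the requests library, the lines could come from a call such as requests.post(url, json=payload, headers=headers, stream=True).iter_lines(decode_unicode=True).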
Example Response:
{
  "id": "fc8cd652-af12-4a89-8ef7-490a0526e8d3",
  "object": "chat.completion",
  "created": 1757601532,
  "model": "gpt-4.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you? 😊"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}
Text Completions Usage Example
POST /api/v1/cloud-ai/agents/{{agent_id}}/v1/completions
cURL:
curl --request POST \
  --url https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/completions \
  --header "authorization: Bearer $TOKEN" \
  --header 'content-type: application/json' \
  --data '{ "prompt": "Hello!", "model": "gpt-4.1", "max_tokens": 100, "temperature": 0.7, "top_p": 0.9, "n": 1, "stream": false, "logprobs": null, "echo": false, "stop": [ "\n" ], "presence_penalty": 0, "frequency_penalty": 0, "best_of": 1, "user": "hostman" }'
Python:
import requests

url = "https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/completions"

# "model" and "n" are ignored by the agent and included only for compatibility
payload = {
    "prompt": "Hello!",
    "model": "gpt-4.1",
    "max_tokens": 100,
    "temperature": 0.7,
    "top_p": 0.9,
    "n": 1,
    "stream": False,
    "logprobs": None,
    "echo": False,
    "stop": ["\n"],
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "best_of": 1,
    "user": "hostman"
}
headers = {
    "authorization": "Bearer {{token}}",
    "content-type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
Node.js:
const request = require('request');

const options = {
  method: 'POST',
  url: 'https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/completions',
  headers: {authorization: 'Bearer {{token}}', 'content-type': 'application/json'},
  body: {
    prompt: 'Hello!',
    model: 'gpt-4.1',
    max_tokens: 100,
    temperature: 0.7,
    top_p: 0.9,
    n: 1,
    stream: false,
    logprobs: null,
    echo: false,
    stop: ['\n'],
    presence_penalty: 0,
    frequency_penalty: 0,
    best_of: 1,
    user: 'hostman'
  },
  json: true
};

request(options, function (error, response, body) {
  if (error) throw new Error(error);
  console.log(body);
});
Parameters:
- prompt: text of the request
- model: ignored, included for compatibility
- max_tokens: response length limit
- temperature: creativity level
- top_p: nucleus sampling threshold (cumulative probability cutoff)
- n: number of responses (ignored)
- stream: stream output (true/false)
Other parameters (logprobs, echo, stop, presence_penalty, frequency_penalty, best_of, user) are partially supported, mainly for compatibility.
Example Response:
{
  "id": "7f967459-428c-46ad-87f0-213ff951024d",
  "object": "text_completion",
  "created": 1757601892,
  "model": "gpt-4.1",
  "choices": [
    {
      "text": "Hello! How can I help you? 😊",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  },
  "response_id": "ea9aa124-0c51-467c-9ab6-218ab4ec65e7"
}
Roles
In Chat Completions, each message includes a role. There are three roles:
- user: user request
- assistant: AI response
- system: system prompt
user
The user role is used to send regular user requests to the AI.
Example:
{
  "role": "user",
  "content": "What is 2+5?"
}
assistant
This role indicates the AI’s reply to a previous message.
Example:
curl --request POST \
--url https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions \
--header "authorization: Bearer $TOKEN" \
--header 'content-type: application/json' \
--data '{
"model": "gpt-4",
"messages": [
{ "role": "user", "content": "What is 2+5? Provide only the answer, no formatting." },
{ "role": "assistant", "content": "7" },
{ "role": "user", "content": "Now multiply the result by 2. Provide only the answer, no formatting." }
]
}'
The request passes the previous question and answer, marking messages as "role": "user" or "role": "assistant". The last user message is left without a response; this is what the model will generate a new answer for.
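The history-passing pattern just described can be sketched in Python as follows. The helper name append_reply is hypothetical, and fake_response is a hand-written stand-in for a parsed API response in the shape of this article's example responses; no request is sent.

```python
from typing import Any


def append_reply(messages: list[dict[str, str]], response: dict[str, Any]) -> str:
    """Append the assistant's reply from a chat response to the history."""
    reply = response["choices"][0]["message"]
    messages.append({"role": reply["role"], "content": reply["content"]})
    return reply["content"]


history = [
    {"role": "user", "content": "What is 2+5? Provide only the answer, no formatting."}
]

# Stand-in for the parsed JSON of a real response (not captured output).
fake_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "7"},
            "finish_reason": "stop",
        }
    ]
}

append_reply(history, fake_response)
history.append(
    {"role": "user", "content": "Now multiply the result by 2. Provide only the answer, no formatting."}
)
# history now holds the same three messages as the request above and can be
# sent as the "messages" field of the next request.
```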
Example Response:
{
  "index": 0,
  "message": {
    "role": "assistant",
    "content": "14",
    "refusal": null,
    "annotations": []
  },
  "finish_reason": "stop"
}
system
The system role defines the agent’s behavior: style, tone, constraints, and goals. It is usually the first message in the messages array. See the article on system prompts for details.
In the assistant role example above, the same instruction is repeated in each user request:
Provide only the answer, no formatting
This duplication can be avoided by specifying the instruction once in the system prompt.
Example:
curl --request POST \
--url https://agent.hostman.com/api/v1/cloud-ai/agents/{{agent_id}}/v1/chat/completions \
--header "authorization: Bearer $TOKEN" \
--header 'content-type: application/json' \
--data '{
"model": "gpt-4",
"messages": [
{ "role": "system", "content": "When answering questions, provide only the calculation result, without any formatting." },
{ "role": "user", "content": "What is 2+5?" },
{ "role": "assistant", "content": "7" },
{ "role": "user", "content": "Now multiply the result by 2" }
]
}'
Now the model will follow the instruction even if it’s not repeated in every user message.
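In client code, the same idea can be captured by a small helper that places one system message at the start of the messages array. This is a sketch only; the function name with_system_prompt is hypothetical.

```python
def with_system_prompt(
    system_text: str, messages: list[dict[str, str]]
) -> list[dict[str, str]]:
    """Return a new messages list that starts with exactly one system message."""
    # Drop any existing system messages so the instruction is not duplicated.
    rest = [m for m in messages if m["role"] != "system"]
    return [{"role": "system", "content": system_text}] + rest


conversation = [
    {"role": "user", "content": "What is 2+5?"},
    {"role": "assistant", "content": "7"},
    {"role": "user", "content": "Now multiply the result by 2"},
]
messages = with_system_prompt(
    "When answering questions, provide only the calculation result, without any formatting.",
    conversation,
)
```

The helper returns a new list and leaves the original conversation unchanged, so the same history can be reused with a different system prompt.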