gpt-4o-mini-2024-07-18
Image to Text - Response
POST
Authorization
- Auth Type:
Bearer Auth(In:header) - Format:
Authorization: Bearer <YOUR_API_KEY> - Description: Use
Bearer <YOUR_API_KEY>. Format:Authorization: Bearer sk-xxxxxx. - API Key: where API Key is your AGCloud API KEY
Request Body
Core Parameters
| Field | Type | Required | Range | Description |
|---|---|---|---|---|
model | string | ✅ | - | Model ID used to generate the response. |
input | array<object> | ✅ | - | The input content. |
>input.role | enum | ✅ | user assistant system developer | The role of the message sender. Can be user model. |
>input.content | string|array<object> | ✅ | - | A text input to the model when string; a list of one or many input items to the model, containing different content types when array. See Multimodal Input for details. |
Content Structure
| Field | Type | Required | Range | Description |
|---|---|---|---|---|
type | string | ✅ | input_text input_image input_file | Identifies the content block type for multimodal input. |
text | string | - | - | The text input content. |
file_id | string | - | - | The ID of the file to be sent to the model. |
detail | string | - | low high auto | The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto. Only required when type=input_image. Default auto. |
image_url | string | - | - | The URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL. Only required when type=input_image. |
file_url | string | - | - | The URL of the file to be sent to the model. Only required when type=input_file. |
file_data | string | - | - | The content of the file to be sent to the model. Only required when type=input_file. |
filename | string | - | - | The name of the file to be sent to the model. Only required when type=input_file. |
Advanced Parameters
| Field | Type | Required | Range | Description |
|---|---|---|---|---|
stream | boolean | - | - | Whether to stream the response back incrementally. Defaults false. |
max_output_tokens | internet | - | - | An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens. |
reasoning | object | - | - | Configuration options for reasoning models (gpt-5 and o-series models only). |
>reasoning.effort | enum | - | none minimal low medium high xhigh | Constrains effort on reasoning for reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning. See Model-Specific Reasoning Configurations for details. |
>reasoning.summary | enum | - | auto concise detailed | A summary of the reasoning performed by the model. Useful for debugging and understanding the model’s reasoning process. |
tools | array<object> | - | - | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. See Tools Parameters for details. |
Tools Parameters
| Field | Type | Required | Range | Description |
|---|---|---|---|---|
type | enum | - | web_search web_search_2025_08_26 | The type of the web search tool. |
filter | object | - | - | Filters for the search. |
>filter.allowed_domains | array<string> | - | - | Allowed domains for the search. If not provided, all domains are allowed. Subdomains of the provided domains are allowed as well. |
search_context_size | enum | - | low medium high | High level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default. |
user_location | object | - | - | The approximate location of the user. |
>user_location.city | string | - | - | Free text input for the city of the user, e.g. San Francisco. |
>user_location.country | string | - | - | The two-letter ISO country code of the user, e.g. US. |
>user_location.region | string | - | - | Free text input for the region of the user, e.g. California. |
>user_location.timezone | string | - | - | The IANA timezone of the user, e.g. America/Los_Angeles. |
>user_location.type | string | - | approximate | The type of location approximation. Always approximate. |
Model-Specific Reasoning.effort Configurations
Constrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.- gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
- All models before gpt-5.1 default to medium reasoning effort, and do not support none.
- The gpt-5-pro model defaults to (and only supports) high reasoning effort.
- xhigh is supported for all models after gpt-5.1-codex-max.