Vision (Image Understanding)

Models with vision capabilities can accept image inputs for image-text understanding and analysis.

Models Supporting Vision

Filter by the "Vision" tag on the models page to see all supported models.

Request Format

Change the `content` field of the message from a string to an array of parts: one part of type "text" and one part of type "image_url".

JSON

{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/image.jpg"
          }
        }
      ]
    }
  ]
}
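The same request body can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name `build_vision_request` is hypothetical, and the code only builds and prints the body, because the endpoint URL and authentication scheme are not covered in this section.

```python
import json


def build_vision_request(model, prompt, image_url):
    """Return a request body whose user message mixes text and one image.

    Hypothetical helper for illustration; it mirrors the JSON sample above.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                # content is an array of parts instead of a plain string
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


payload = build_vision_request(
    "gpt-4o", "What is this image?", "https://example.com/image.jpg"
)
print(json.dumps(payload, indent=2))
```

Serialize the resulting dictionary as the JSON body of your chat request.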

Image Format

Parameter	Type	Required	Description
`type`	string	Yes	Fixed value "image_url"
`image_url.url`	string	Yes	Image URL (supports http(s) and base64 data URI)
`image_url.detail`	string	No	Resolution level: auto, low, or high. Defaults to auto.
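The parameters in the table can be combined into a single image part. The sketch below is one possible way to do it in Python: the helper `image_part` and its client-side checks are assumptions added for illustration, and the service may validate these fields differently.

```python
ALLOWED_DETAIL = ("auto", "low", "high")  # values listed in the table above


def image_part(url, detail=None):
    """Build one image_url content part (hypothetical helper).

    The checks here are a client-side convenience, not documented API behavior.
    """
    if not url.startswith(("http://", "https://", "data:")):
        raise ValueError("url must be an http(s) URL or a base64 data URI")
    part = {"type": "image_url", "image_url": {"url": url}}
    if detail is not None:
        if detail not in ALLOWED_DETAIL:
            raise ValueError(f"detail must be one of {ALLOWED_DETAIL}")
        # detail is optional; when omitted, the documented default is auto
        part["image_url"]["detail"] = detail
    return part


low_res = image_part("https://example.com/image.jpg", detail="low")
print(low_res)
```

Omitting `detail` leaves the field out of the part entirely, so the documented default applies.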

Base64 Images

Instead of an http(s) URL, you can pass a base64-encoded image as a data URI in `image_url.url`, for example: "data:image/jpeg;base64,/9j/4AAQ..."
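A data URI like this can be produced from raw image bytes with the Python standard library. This is a minimal sketch: the helper name `to_data_uri` is hypothetical, and the MIME type is passed in by the caller instead of being detected from the bytes.

```python
import base64


def to_data_uri(image_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The first six bytes of a typical JPEG (JFIF) file, used here as a tiny sample.
jpeg_header = b"\xff\xd8\xff\xe0\x00\x10"
print(to_data_uri(jpeg_header))  # data:image/jpeg;base64,/9j/4AAQ
```

For a real image, read the file in binary mode (for example `open(path, "rb").read()`), pass the bytes to the helper, and use the returned string as the value of `image_url.url`.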