Vision (Image Understanding)

Models with vision capabilities can accept image inputs for image-text understanding and analysis.

Models Supporting Vision

Filter by the "👁️ Vision" tag on the models page to see all supported models.

→ View models with vision

Request Format

Change the content field from a string to an array of parts: one text part and one image_url part.

JSON
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/image.jpg"
          }
        }
      ]
    }
  ]
}
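As a minimal sketch of the same structure in Python, the helper below assembles the request body shown above. The function name build_vision_request is hypothetical and only for illustration. The sketch stops at building and printing the body, because this section does not state the endpoint URL or the authentication scheme.

```python
import json


def build_vision_request(prompt, image_url, model="gpt-4o"):
    """Build a request body whose content is an array of text and image parts.

    Hypothetical helper: it mirrors the JSON example above and nothing more.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                # content is a list of parts instead of a plain string
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


payload = build_vision_request(
    "What is this image?", "https://example.com/image.jpg"
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized and sent as the body of a chat request by whatever HTTP client you already use.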

Image Format

Parameter         Type    Required  Description
type              string  Yes       Fixed value "image_url"
image_url.url     string  Yes       Image URL (supports http(s) and base64 data URI)
image_url.detail  string  No        Resolution: auto / low / high, default auto
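The parameters above can be combined into a single image part. The sketch below is an assumption-laden illustration: image_part and ALLOWED_DETAIL are hypothetical names, and the client-side check only reflects the three detail values listed in the table. How the server treats other values is not described here.

```python
# Detail values listed in the parameter table; the server may validate differently.
ALLOWED_DETAIL = {"auto", "low", "high"}


def image_part(url, detail=None):
    """Return one image_url content part, optionally with a detail setting.

    Hypothetical helper: when detail is omitted, the field is left out so the
    documented default (auto) applies.
    """
    if detail is not None and detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {sorted(ALLOWED_DETAIL)}")
    image_url = {"url": url}
    if detail is not None:
        image_url["detail"] = detail
    return {"type": "image_url", "image_url": image_url}


low_res_part = image_part("https://example.com/image.jpg", detail="low")
default_part = image_part("https://example.com/image.jpg")
```

Leaving detail out, rather than sending "auto" explicitly, keeps the request minimal and relies on the documented default.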

💡 Base64 Images

You can also pass a base64-encoded image as a data URI in image_url.url, for example: "data:image/jpeg;base64,/9j/4AAQ..."
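A data URI of this form can be produced with Python's standard base64 module. In the sketch below, to_data_uri is a hypothetical helper, and the four sample bytes are only the start of a JPEG file header, not a complete image. In practice you would read the bytes from a real file.

```python
import base64


def to_data_uri(image_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a base64 data URI for image_url.url.

    Hypothetical helper: the caller is responsible for passing the MIME type
    that matches the image data.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# First bytes of a JPEG header, used here only to show the output shape.
sample_bytes = b"\xff\xd8\xff\xe0"
data_uri = to_data_uri(sample_bytes)

# The data URI goes in the same place as an http(s) URL.
part = {"type": "image_url", "image_url": {"url": data_uri}}
```

For a file on disk, the bytes would typically come from open(path, "rb").read(), with mime_type set to match the file format.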
