Vector Queries in Azure AI Search

Vector queries find semantically similar content using numeric embeddings and nearest neighbor algorithms.

Prerequisites

  • Vector index with vector fields
  • Embedding model (Azure OpenAI, etc.)
  • Optional: Vectorizer for query-time conversion

Basic Vector Query

{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [
        -0.009154141,
        0.018708462,
        // ... 1536 dimensions
        -0.00086512347
      ],
      "fields": "contentVector",
      "k": 50
    }
  ],
  "select": "title, content, category"
}
Parameters:
  • kind: “vector” for embedding arrays
  • vector: Query embedding (same dimensions as field)
  • fields: Vector field(s) to search
  • k: Number of nearest neighbors to return
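
A minimal Python sketch of how the request body above might be assembled and wrapped in a REST call. The service name, index name, and the 2024-07-01 API version are placeholder assumptions, not values from this page, and build_vector_query and build_search_request are hypothetical helpers. Only the standard library is used.

```python
import json
import urllib.request


def build_vector_query(vector, fields, k, expected_dimensions=None):
    """Return a search request body containing one raw-vector query."""
    if expected_dimensions is not None and len(vector) != expected_dimensions:
        raise ValueError(
            f"query vector has {len(vector)} dimensions, "
            f"field expects {expected_dimensions}"
        )
    return {
        "vectorQueries": [
            {"kind": "vector", "vector": list(vector), "fields": fields, "k": k}
        ]
    }


def build_search_request(service, index, api_key, body, api_version="2024-07-01"):
    """Build (but do not send) the POST request for a document search."""
    url = (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/docs/search?api-version={api_version}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# A 3-dimensional vector keeps the example short; a real query vector
# must match the dimensions of the target field.
body = build_vector_query([0.1, 0.2, 0.3], "contentVector", k=50, expected_dimensions=3)
body["select"] = "title, content, category"
request = build_search_request("my-service", "my-index", "<api-key>", body)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Checking the vector length before sending catches a dimension mismatch locally instead of as a service error.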

Generate Query Embeddings

Azure OpenAI

POST https://{openai}.openai.azure.com/openai/deployments/{model}/embeddings?api-version=2024-02-01
Content-Type: application/json
api-key: {key}

{
  "input": "luxury hotel with ocean view"
}
Response:
{
  "data": [
    {
      "embedding": [
        -0.009154141,
        0.018708462,
        // ... 1536 values
      ]
    }
  ]
}
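
The same call can be sketched in Python with the standard library. The resource and deployment names passed in are placeholders, and build_embedding_request, extract_embedding, and embed are hypothetical helpers; the URL shape and headers mirror the request shown above.

```python
import json
import urllib.request


def build_embedding_request(resource, deployment, api_key, text,
                            api_version="2024-02-01"):
    """Build the POST request for the embeddings endpoint shown above."""
    url = (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/embeddings?api-version={api_version}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def extract_embedding(response_json):
    """Pull the first embedding out of a parsed embeddings response."""
    return response_json["data"][0]["embedding"]


def embed(resource, deployment, api_key, text):
    """Send the request and return the embedding (needs network access)."""
    request = build_embedding_request(resource, deployment, api_key, text)
    with urllib.request.urlopen(request) as response:
        return extract_embedding(json.load(response))


# Offline check against a response shaped like the one above (values shortened).
sample = {"data": [{"embedding": [-0.009154141, 0.018708462]}]}
query_vector = extract_embedding(sample)
```

The list returned by extract_embedding is what goes into the "vector" property of a vector query.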

Use Same Model

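Always use the same embedding model for indexing and querying. Vectors produced by different models are not comparable, so mixing models produces poor results even when the dimensions happen to match.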

Integrated Vectorization

Let Azure AI Search handle vectorization:

Configure Vectorizer

{
  "vectorizers": [
    {
      "name": "my-openai-vectorizer",
      "kind": "azureOpenAI",
      "azureOpenAIParameters": {
        "resourceUri": "https://my-openai.openai.azure.com",
        "deploymentId": "text-embedding-ada-002",
        "apiKey": "..."
      }
    }
  ]
}

Query with Text

{
  "vectorQueries": [
    {
      "kind": "text",
      "text": "luxury hotel with ocean view",
      "fields": "descriptionVector",
      "k": 50
    }
  ]
}
Benefits:
  • No manual embedding generation
  • Consistent model usage
  • Simplified queries
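
A text query only works when the target field can reach a vectorizer. The sketch below assumes the index definition layout used by recent REST API versions, where a field names a vectorSearchProfile and the profile names a vectorizer. Those property names and the sample index are assumptions not shown on this page, and vectorizer_for_field is a hypothetical helper.

```python
def vectorizer_for_field(index_definition, field_name):
    """Return the vectorizer name a vector field resolves to, or None."""
    field = next(
        (f for f in index_definition.get("fields", []) if f["name"] == field_name),
        None,
    )
    if field is None:
        raise KeyError(f"no field named {field_name!r}")
    profile_name = field.get("vectorSearchProfile")
    profiles = index_definition.get("vectorSearch", {}).get("profiles", [])
    profile = next((p for p in profiles if p["name"] == profile_name), None)
    return profile.get("vectorizer") if profile else None


# Assumed index layout: field -> profile -> vectorizer.
index_definition = {
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {
            "name": "descriptionVector",
            "type": "Collection(Edm.Single)",
            "dimensions": 1536,
            "vectorSearchProfile": "my-profile",
        },
    ],
    "vectorSearch": {
        "algorithms": [{"name": "my-hnsw", "kind": "hnsw"}],
        "profiles": [
            {
                "name": "my-profile",
                "algorithm": "my-hnsw",
                "vectorizer": "my-openai-vectorizer",
            }
        ],
        "vectorizers": [{"name": "my-openai-vectorizer", "kind": "azureOpenAI"}],
    },
}

resolved = vectorizer_for_field(index_definition, "descriptionVector")
```

If the helper returns None for a field, a "kind": "text" query against that field has nothing to convert the text with.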

Multiple Vector Fields

Search across multiple vector fields:
{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],
      "fields": "titleVector,contentVector,synopsisVector",
      "k": 50
    }
  ]
}
All fields must use embeddings from the same model and have the same dimensions.

Multiple Vector Queries

Execute multiple vector queries in parallel:
{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],  // text embedding
      "fields": "textVector",
      "k": 50,
      "weight": 1.0
    },
    {
      "kind": "vector",
      "vector": [...],  // image embedding
      "fields": "imageVector",
      "k": 50,
      "weight": 2.0
    }
  ]
}
Use case: Multimodal search with CLIP embeddings.
Results are merged using Reciprocal Rank Fusion (RRF).
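
To make the fusion step concrete, here is an illustrative Python sketch of Reciprocal Rank Fusion with per-query weights. The constant 60 is the value commonly used with RRF, and multiplying each query's contribution by its weight is an assumption about how weighting could be applied; this is not the service's internal code.

```python
def reciprocal_rank_fusion(rankings, weights=None, c=60):
    """Fuse ranked lists of document ids into one list, best first.

    Each document earns weight / (c + rank) for every list it appears in,
    with rank starting at 1.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (c + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


text_hits = ["doc-a", "doc-b", "doc-c"]   # ranking from the textVector query
image_hits = ["doc-c", "doc-a", "doc-b"]  # ranking from the imageVector query

fused_equal = reciprocal_rank_fusion([text_hits, image_hits])
fused_weighted = reciprocal_rank_fusion([text_hits, image_hits], weights=[1.0, 2.0])
```

With equal weights doc-a comes first because it ranks well in both lists. Doubling the image query's weight moves doc-c, the top image match, to the front.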

Vector Weighting

Adjust the relative importance of each vector query:
{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],
      "fields": "titleVector",
      "k": 50,
      "weight": 2.0  // 2x importance
    },
    {
      "kind": "vector",
      "vector": [...],
      "fields": "contentVector",
      "k": 50,
      "weight": 1.0  // baseline
    }
  ]
}
Default weight: 1.0

Filtering Vector Results

Apply filters to vector queries:
{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],
      "fields": "contentVector",
      "k": 50
    }
  ],
  "filter": "category eq 'Hotels' and rating ge 4.5",
  "vectorFilterMode": "postFilter"
}

Filter Modes

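  • preFilter: Applies the filter before the vector search, so the search runs only over matching documents. Favors recall (k results are returned when enough documents match) but is generally slower.
  • postFilter: Applies the filter after the vector search. Generally faster, but can return fewer than k results because nearest neighbors that fail the filter are dropped.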
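
A toy Python sketch of how the order of filtering and searching changes the result count. It uses one-dimensional "vectors" and exact search purely for illustration; the documents and helper names are hypothetical.

```python
def nearest(docs, query, k):
    """Exact k nearest neighbors by absolute distance on a 1-D toy vector."""
    return sorted(docs, key=lambda d: abs(d["vector"] - query))[:k]


def pre_filter(docs, query, k, predicate):
    # Filter first, then search only the documents that passed.
    return nearest([d for d in docs if predicate(d)], query, k)


def post_filter(docs, query, k, predicate):
    # Search first, then drop neighbors that fail the filter.
    return [d for d in nearest(docs, query, k) if predicate(d)]


def is_hotel(doc):
    return doc["category"] == "Hotels"


docs = [
    {"id": "1", "vector": 0.1, "category": "Hotels"},
    {"id": "2", "vector": 0.2, "category": "Hostels"},
    {"id": "3", "vector": 0.3, "category": "Hostels"},
    {"id": "4", "vector": 0.4, "category": "Hotels"},
    {"id": "5", "vector": 0.5, "category": "Hostels"},
    {"id": "6", "vector": 0.6, "category": "Hotels"},
]

pre_ids = [d["id"] for d in pre_filter(docs, 0.0, 3, is_hotel)]
post_ids = [d["id"] for d in post_filter(docs, 0.0, 3, is_hotel)]
```

Here pre-filtering returns three hotels (ids 1, 4, 6), while post-filtering returns only id 1, because two of the three nearest neighbors are not hotels.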

Exhaustive KNN

Force exact search instead of approximate:
{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],
      "fields": "contentVector",
      "k": 50,
      "exhaustive": true
    }
  ]
}
Use when:
  • Maximum accuracy required
  • Small dataset
  • Willing to accept slower queries
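
For intuition, exact search amounts to scoring every stored vector against the query. A minimal brute-force sketch in Python, assuming cosine similarity and a small in-memory dictionary of hypothetical vectors (not how the service stores or scans data):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exhaustive_knn(query, vectors, k):
    """Score every stored vector against the query and keep the k best."""
    scored = [(cosine_similarity(query, v), doc_id) for doc_id, v in vectors.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


vectors = {
    "north": [0.0, 1.0],
    "north-east": [1.0, 1.0],
    "east": [1.0, 0.0],
    "south": [0.0, -1.0],
}
top_two = exhaustive_knn([0.2, 1.0], vectors, k=2)
```

The work grows linearly with the number of vectors, which is why this mode suits small datasets or cases where accuracy matters more than latency.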

Threshold Filtering (Preview)

Exclude low-similarity results:
{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],
      "fields": "contentVector",
      "k": 50,
      "threshold": {
        "kind": "vectorSimilarity",
        "value": 0.8
      }
    }
  ]
}
Effect: Returns fewer than k results when some of the nearest neighbors fall below the 0.8 similarity threshold.

Query Response

{
  "@odata.count": 3,
  "value": [
    {
      "@search.score": 0.89,
      "id": "1",
      "title": "Azure AI Search",
      "content": "Fully managed search service..."
    },
    {
      "@search.score": 0.85,
      "id": "2",
      "title": "Vector Search",
      "content": "Semantic similarity matching..."
    }
  ]
}
Score interpretation:
  • Higher score = more similar
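  • Range depends on the similarity metric
  • Cosine: @search.score ranges from 0.333 to 1.00 (1.00 = identical direction); the raw cosine similarity of -1 to 1 is not returned directly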
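
For the cosine metric, the score of a single vector query relates to cosine similarity through score = 1 / (1 + cosine distance). The Python sketch below converts in both directions. That relationship is an assumption stated here rather than on this page, and it does not apply to fused scores from multiple or hybrid queries.

```python
def cosine_similarity_to_score(similarity):
    """Map cosine similarity in [-1, 1] to a score via 1 / (1 + cosine distance)."""
    cosine_distance = 1.0 - similarity
    return 1.0 / (1.0 + cosine_distance)


def score_to_cosine_similarity(score):
    """Invert the mapping: recover cosine similarity from a score."""
    cosine_distance = (1.0 - score) / score
    return 1.0 - cosine_distance


best = cosine_similarity_to_score(1.0)         # identical direction
orthogonal = cosine_similarity_to_score(0.0)   # unrelated vectors
worst = cosine_similarity_to_score(-1.0)       # opposite direction
```

Under this mapping, identical vectors score 1.0, orthogonal vectors 0.5, and opposite vectors about 0.333.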

Oversampling

Request more candidates for reranking:
{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],
      "fields": "contentVector",
      "k": 10,
      "oversampling": 20.0
    }
  ]
}
Effect: Retrieves k × oversampling candidates, reranks with uncompressed vectors, returns top k
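
A toy Python sketch of the shortlist-then-rescore idea. Rounding each component to the nearest integer stands in for compression, and the vectors are hypothetical; real quantization and the service's rescoring are more involved.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector):
    # Stand-in for compression: round each component to the nearest integer.
    return [round(x) for x in vector]


def search_with_oversampling(query, vectors, k, oversampling):
    """Shortlist k * oversampling ids on compressed vectors, rescore on originals."""
    compressed_query = quantize(query)
    shortlist_size = int(k * oversampling)
    shortlist = sorted(
        vectors,
        key=lambda doc_id: squared_distance(compressed_query, quantize(vectors[doc_id])),
    )[:shortlist_size]
    # Rescoring uses the full-precision vectors, so coarse ties get resolved.
    shortlist.sort(key=lambda doc_id: squared_distance(query, vectors[doc_id]))
    return shortlist[:k]


vectors = {"a": [1.4, 0.0], "b": [0.6, 0.0], "c": [1.6, 0.0]}
query = [1.45, 0.0]

without = search_with_oversampling(query, vectors, k=2, oversampling=1.0)
with_oversampling = search_with_oversampling(query, vectors, k=2, oversampling=2.0)
```

Without oversampling the coarse shortlist keeps "b" and misses "c", the true second-nearest vector. A larger shortlist lets rescoring recover "c".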

Hybrid Vector + Text

Combine a keyword query and a vector query in one request:
{
  "search": "luxury hotel ocean view",
  "vectorQueries": [
    {
      "kind": "text",
      "text": "luxury hotel ocean view",
      "fields": "descriptionVector",
      "k": 50
    }
  ],
  "top": 10
}
Benefits:
  • Keyword precision plus semantic recall
  • Results from both queries merged with RRF
  • Often more relevant than either method alone

Performance Optimization

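  • Limit k: request only the results you need. Typical values are k=10-50; larger k means slower queries.
  • Use compression: enable scalar or binary quantization for a 75-96% size reduction, with minimal accuracy loss when rescoring is enabled.
  • Tune HNSW: adjust efSearch to trade accuracy for speed. Higher efSearch is more accurate but slower; the default of 500 works for most cases.
  • Choose the filter mode: preFilter reduces the search space to matching documents and suits selective filters; postFilter is generally faster but can return fewer than k results.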
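
The 75-96% size reduction follows from component width. A short Python sketch of the arithmetic, assuming uncompressed vectors store 32-bit floats, scalar quantization stores 8 bits per component, and binary quantization stores 1 bit. Index overhead is ignored.

```python
def vector_bytes(dimensions, bits_per_component):
    """Storage for one vector, ignoring index overhead."""
    return dimensions * bits_per_component // 8


def size_reduction(bits_per_component, baseline_bits=32):
    """Fraction of storage saved relative to the baseline component width."""
    return 1 - bits_per_component / baseline_bits


float32_bytes = vector_bytes(1536, 32)  # uncompressed
int8_bytes = vector_bytes(1536, 8)      # scalar quantization
binary_bytes = vector_bytes(1536, 1)    # binary quantization

scalar_saving = size_reduction(8)
binary_saving = size_reduction(1)
```

For a 1536-dimension vector this gives 6144, 1536, and 192 bytes, which is a saving of 75% for scalar and just under 97% for binary quantization.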

Common Patterns

Filtered Semantic Search

{
  "vectorQueries": [
    {
      "kind": "text",
      "text": "comfortable running shoes for marathons",
      "fields": "descriptionVector",
      "k": 50
    }
  ],
  "filter": "inStock eq true and price le 200",
  "select": "name, description, price, rating"
}

Multimodal Search

{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],  // CLIP image embedding
      "fields": "imageVector",
      "k": 20
    },
    {
      "kind": "text",
      "text": "red sports car",
      "fields": "textVector",
      "k": 20,
      "weight": 0.5
    }
  ]
}

Document Similarity

{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [...],  // embedding of reference document
      "fields": "contentVector",
      "k": 10
    }
  ],
  "filter": "documentId ne '{reference-doc-id}'",  // exclude self
  "select": "documentId, title, summary"
}

Troubleshooting

Low Quality Results

  • Verify same embedding model for index and query
  • Check vector dimensions match
  • Ensure sufficient k value
  • Consider hybrid search instead

Slow Queries

  • Reduce k value
  • Enable compression
  • Try postFilter instead of preFilter when returning fewer than k results is acceptable
  • Tune HNSW efSearch parameter

No Results

  • Check filter conditions
  • Verify vector field name
  • Ensure index has vector data
  • Remove threshold if set too high

Next Steps

  • Create Vector Index: Build a vector-enabled index
  • Hybrid Search: Combine with keyword search
  • Generate Embeddings: Create embeddings from content