Embeddings

Model-Agnostic Approach

By default, MIF stores embedding metadata rather than raw vectors:

embedding:
  model: text-embedding-3-small
  model_version: "2024-01"
  dimensions: 1536
  source_text: "The text that was embedded"
  normalized: true
  quantization: null  # or "float16", "int8"

Because the source text is kept alongside the model details, this allows:

Re-embedding on import with different models
Smaller file sizes (a 1536-dimension float32 vector alone takes 6,144 bytes per record)
Model migration without data loss
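
The re-embedding path can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of MIF itself: resolve_vector, embed_fn, and toy_embed are hypothetical names, and the embedding block is assumed to have already been parsed into a plain dict.

```python
from typing import Callable, Dict, List, Optional


def resolve_vector(
    embedding: Dict[str, object],
    target_model: str,
    embed_fn: Callable[[str], List[float]],
    stored_vector: Optional[List[float]] = None,
) -> List[float]:
    """Return a vector for target_model, re-embedding from source_text if needed."""
    same_model = embedding.get("model") == target_model
    if same_model and stored_vector is not None:
        # The importer uses the same model and a vector is available: reuse it.
        return stored_vector
    # Otherwise fall back to the preserved source text and embed it again.
    source_text = embedding.get("source_text")
    if not isinstance(source_text, str):
        raise ValueError("cannot re-embed: record has no source_text")
    return embed_fn(source_text)


def toy_embed(text: str) -> List[float]:
    # Hypothetical stand-in for a real embedding model; returns a 2-element vector.
    return [float(len(text)), float(text.count(" "))]


record = {
    "model": "text-embedding-3-small",
    "dimensions": 1536,
    "source_text": "The text that was embedded",
}
# The importer uses a different model, so the text is embedded again.
vector = resolve_vector(record, target_model="some-other-model", embed_fn=toy_embed)
```

In a real importer, embed_fn would wrap whatever embedding model the target system uses; the point is only that source_text is enough to rebuild the vector.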

Optional Vector Storage

For providers that need vector portability:

External Reference:

embedding:
  model: text-embedding-3-small
  source_text: "..."
  vector_uri: "vectors/550e8400.bin"
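
As a rough sketch, a reader for an externally referenced vector might look like the code below. The binary layout is not specified above, so this assumes a headerless file of little-endian float32 values and a vector_uri resolved relative to the directory containing the MIF file; load_external_vector is a hypothetical helper name.

```python
import struct
import tempfile
from pathlib import Path
from typing import List


def load_external_vector(base_dir: Path, vector_uri: str) -> List[float]:
    """Read a vector file, ASSUMING raw little-endian float32 with no header."""
    raw = (base_dir / vector_uri).read_bytes()
    if len(raw) % 4 != 0:
        raise ValueError("file size is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Demonstration with a temporary file standing in for the referenced vector.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "vectors").mkdir()
    (base / "vectors" / "550e8400.bin").write_bytes(struct.pack("<3f", 0.5, -1.0, 2.0))
    loaded = load_external_vector(base, "vectors/550e8400.bin")
```

If the record also carries dimensions, comparing it against the number of values read is a cheap integrity check.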

Inline (JSON-LD only):

"embedding": {
  "model": "text-embedding-3-small",
  "sourceText": "...",
  "vector": {
    "@type": "Vector",
    "encoding": "base64-float32",
    "data": "SGVsbG8gV29ybGQh..."
  }
}
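
Decoding an inline vector could be sketched as below. The "base64-float32" encoding is taken here to mean standard base64 over packed little-endian IEEE 754 float32 values; the byte order is an assumption, and encode_vector and decode_vector are hypothetical helper names.

```python
import base64
import struct
from typing import Dict, List


def encode_vector(values: List[float]) -> Dict[str, str]:
    """Build a Vector object, ASSUMING little-endian float32 packing."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return {
        "@type": "Vector",
        "encoding": "base64-float32",
        "data": base64.b64encode(raw).decode("ascii"),
    }


def decode_vector(vector_obj: Dict[str, str]) -> List[float]:
    """Decode a Vector object back into a list of floats."""
    if vector_obj.get("encoding") != "base64-float32":
        raise ValueError(f"unsupported encoding: {vector_obj.get('encoding')!r}")
    raw = base64.b64decode(vector_obj["data"])
    if len(raw) % 4 != 0:
        raise ValueError("decoded payload is not a whole number of float32 values")
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Round trip with values that float32 represents exactly.
encoded = encode_vector([0.25, -0.5, 1.0])
decoded = decode_vector(encoded)
```

Values that float32 cannot represent exactly will come back slightly rounded, so comparisons after a round trip should generally use a tolerance.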