mxbai-embed-large-v1

Free.ai (self-hosted) · embeddings · ~100 tokens per call

mxbai-embed-large-v1 is an embedding model built by mixedbread.ai. It is strongest at semantic search, clustering, and similarity. It is self-hosted on Free.ai GPUs and runs free against your daily token pool (about 100 tokens per call). Released under Apache 2.0; commercial use is permitted on Free.ai.

Use via API

OpenAI-compatible REST API. Generate a key and call this model in seconds.

curl -X POST https://api.free.ai/v1/embeddings/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"mxbai-embed-large-v1","input":"your text here"}'

Frequently Asked Questions

mxbai-embed-large-v1 converts text into a dense vector (a list of floats) that captures meaning. Use it for semantic search, clustering, recommendation, retrieval-augmented generation (RAG), and any task where "is this text similar to that text" matters.

Typical dimensions are 384, 768, 1024, or 1536 depending on the model. BGE-M3 emits 1024-dim; OpenAI Ada emits 1536. The API response includes the dimension so your vector DB picks the right index.

Modern embedding models (including most options on Free.ai) are trained on 100+ languages. Cross-language retrieval works — search in English, match documents in Spanish.

512 to 8,192 tokens depending on the model. Longer inputs are truncated — chunk long documents into paragraphs before embedding.
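A minimal sketch of the paragraph chunking suggested above. It assumes a whitespace word count as a rough stand-in for the model's tokenizer, so real token counts will differ. The function name `chunk_paragraphs` and the `max_words` budget are illustrative and not part of the Free.ai API.

```python
def chunk_paragraphs(text: str, max_words: int = 200) -> list[str]:
    """Split text on blank lines, then pack paragraphs into chunks.

    Word count is only a rough proxy for tokens (assumption); choose
    max_words conservatively below the model's real token limit.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    current_words = 0
    for para in paragraphs:
        n = len(para.split())
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and current_words + n > max_words:
            chunks.append("\n\n".join(current))
            current, current_words = [], 0
        current.append(para)
        current_words += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

A single paragraph longer than `max_words` is kept as its own chunk here, so it may still be truncated by the model; splitting such paragraphs by sentence would be a natural extension.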

mxbai-embed-large-v1 runs on our own GPUs and is among the cheapest tools: about 100 tokens per call, drawn from your daily free pool. $5 buys 200K tokens.
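As a worked example of these figures, taking the quoted rates at face value (roughly 100 tokens per call, $5 for 200K tokens), the arithmetic looks like this. The constant names are illustrative.

```python
TOKENS_PER_CALL = 100        # approximate per-call figure quoted above
TOKENS_PER_PACK = 200_000    # "$5 = 200K tokens"
PACK_PRICE_USD = 5.0

# How many embedding calls one $5 pack covers, and the cost of each call.
calls_per_pack = TOKENS_PER_PACK // TOKENS_PER_CALL
cost_per_call_usd = PACK_PRICE_USD / calls_per_pack

print(calls_per_pack)       # 2000
print(cost_per_call_usd)    # 0.0025
```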

Yes — POST a list of strings to /v1/embeddings/ and mxbai-embed-large-v1 returns a list of vectors in the same order. Batch size up to 2,048 per request.
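For corpora larger than the per-request limit, one possible approach is to slice the input into consecutive batches on the client side. The helper below is a sketch; `batched` and `MAX_BATCH` are hypothetical names, and only the 2,048 limit comes from the text above.

```python
from typing import Iterator

MAX_BATCH = 2048  # per-request limit stated above

def batched(texts: list[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` strings, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]
```

Because vectors come back in the same order as the inputs, appending each batch's results in sequence keeps vector `i` aligned with text `i`.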

L2-normalized by default — cosine similarity = dot product. Pass `normalize=false` if you want raw vectors for a different distance metric.
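To illustrate why normalization makes the two metrics interchangeable, here is a small self-contained check in plain Python. The vectors are made up for the example and are not model output.

```python
import math

def l2_normalize(v: list[float]) -> list[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity for arbitrary non-zero vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

# For unit-length vectors the denominator in cosine() is 1,
# so the dot product alone gives the same score.
same = math.isclose(dot(a, b), cosine(a, b))
```

With raw (unnormalized) vectors, the dot product also reflects vector length, which is why the distance metric should be chosen to match the setting.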

Any — Pinecone, Weaviate, Qdrant, Chroma, pgvector, FAISS, LanceDB. mxbai-embed-large-v1 returns plain JSON floats; the DB never sees the model.

Yes — POST to /v1/embeddings/ with model="mxbai-embed-large-v1". OpenAI-compatible response shape, so existing client libraries work unchanged. /api/ has the full reference.
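A sketch of the same call from Python using only the standard library. It assumes the request takes an `input` field and the response carries a `data` list of `{"index", "embedding"}` objects, as in the OpenAI embeddings convention; neither shape is spelled out on this page, so check the API reference before relying on it. `build_request` and `parse_embeddings` are hypothetical helper names.

```python
import json
import urllib.request

# Host taken from the curl example, path from the text above.
API_URL = "https://api.free.ai/v1/embeddings/"

def build_request(texts: list[str], api_key: str,
                  model: str = "mxbai-embed-large-v1") -> urllib.request.Request:
    """Build (but do not send) a POST request for a batch of texts."""
    # "input" is assumed from the OpenAI convention.
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def parse_embeddings(payload: dict) -> list[list[float]]:
    """Extract vectors from an OpenAI-style response, ordered by index."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

To send the request, pass the result of `build_request(...)` to `urllib.request.urlopen` and feed the decoded JSON body to `parse_embeddings`.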

Self-hosted models keep your text on our GPUs and discard it after the call returns. Premium models are passed through under a DPA (data processing agreement). We do not train on your inputs.

Sub-100ms for short text on self-hosted, 100–500ms on premium. Batch calls scale roughly linearly — 1,000 chunks complete in 2–10 seconds.

Yes — Free.ai grants commercial use of embeddings. Build production search, RAG pipelines, recommendation systems with no per-vector royalty.
