Nomic Embed v2

Free.ai (self-hosted) · embeddings · ~100 tokens per call

Nomic Embed v2 is an embedding model built by Nomic AI. It is strongest at retrieval-augmented generation (RAG) and supports flexible vector sizes. It is self-hosted on Free.ai GPUs and runs free against your daily token pool (about 100 tokens per call). It is released under Apache 2.0, so commercial use is permitted on Free.ai.

Use via API

OpenAI-compatible REST API. Generate a key and call this model in seconds.

curl -X POST https://api.free.ai/v1/embeddings/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"nomic-embed-v2","input":"your text here"}'


Frequently Asked Questions

What does Nomic Embed v2 do?

Nomic Embed v2 converts text into a dense vector (a list of floats) that captures meaning. Use it for semantic search, clustering, recommendation, retrieval-augmented generation (RAG), and any task where "is this text similar to that text" matters.
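As a rough illustration of the "is this text similar to that text" idea, the sketch below ranks a few documents against a query by cosine similarity. The three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, not part of the Free.ai API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs, most similar first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (assumption for illustration).
docs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "careers": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend this embeds "how do I get my money back"

ranking = rank_by_similarity(query, docs)
print(ranking[0][0])  # refund-policy
```

With real embeddings the same ranking step applies unchanged; only the vectors get longer.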

What dimension are Nomic Embed v2 embeddings?

Typical dimensions are 384, 768, 1024, or 1536 depending on the model. For example, BGE-M3 emits 1024-dimensional vectors and OpenAI Ada emits 1536-dimensional vectors. The API response includes the dimension so your vector DB picks the right index.
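Because the vector dimension has to match the vector DB index, one defensive pattern is to check the length of each returned vector before inserting it. The sketch below assumes the vectors are already parsed into Python lists. `check_dimension` is a hypothetical helper, and 768 is only an example value, not a documented dimension for this model.

```python
def check_dimension(vectors, expected_dim):
    """Raise ValueError if any vector does not have expected_dim entries."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"vector {i} has {len(vec)} dimensions, expected {expected_dim}"
            )
    return expected_dim


# Example value only; read the real dimension from your own API response.
index_dim = 768
vectors = [[0.0] * 768, [0.1] * 768]
check_dimension(vectors, index_dim)
```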

Is Nomic Embed v2 multilingual?

Modern embedding models (including most options on Free.ai) are trained on 100+ languages. Cross-language retrieval works: you can search in English and match documents in Spanish.

What is the max input length for Nomic Embed v2?

Between 512 and 8,192 tokens, depending on the model. Longer inputs are truncated, so chunk long documents into paragraphs before embedding.
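One simple way to chunk by paragraph is sketched below. It splits on blank lines and packs paragraphs into chunks under a budget. The whitespace word count is only a crude proxy for tokens (real tokenizers count differently), and `chunk_paragraphs` and its 300-word default are illustrative assumptions.

```python
def chunk_paragraphs(text, max_words=300):
    """Group blank-line-separated paragraphs into chunks of at most max_words.

    A single paragraph longer than max_words is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, current_len = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and current_len + n > max_words:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


doc = "First paragraph here.\n\nSecond paragraph is here too.\n\nThird one."
chunks = chunk_paragraphs(doc, max_words=8)
print(len(chunks))  # 2
```

Oversized paragraphs are deliberately left whole here; a production chunker would likely split them further by sentence.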

How much does Nomic Embed v2 cost?

Nomic Embed v2 runs on our own GPUs and is among the cheapest tools: about 100 tokens per call, drawn from your daily free pool. $5 buys 200K tokens.
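Taking the figures above at face value (about 100 tokens per call, $5 for 200K tokens), the arithmetic works out as follows. This is only a back-of-the-envelope sketch: it assumes every call costs exactly 100 tokens, which the page states only approximately.

```python
TOKENS_PER_CALL = 100          # approximate per-call figure from the text above
TOKENS_PER_PURCHASE = 200_000  # tokens bought for $5, per the text above
PRICE_USD = 5.0

calls_per_purchase = TOKENS_PER_PURCHASE // TOKENS_PER_CALL
usd_per_call = PRICE_USD / calls_per_purchase

print(calls_per_purchase)  # 2000
print(usd_per_call)        # 0.0025
```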

Can I batch embed with Nomic Embed v2?

Yes. POST a list of strings to /v1/embeddings/ and Nomic Embed v2 returns a list of vectors in the same order. The batch size can be up to 2,048 strings per request.
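If you have more strings than the 2,048-per-request limit mentioned above, one approach is to slice the list into consecutive batches and concatenate the results in order. In this sketch `embed_batch` is a hypothetical callable standing in for whatever function actually calls /v1/embeddings/; the fake version below just returns one-element vectors so the example runs offline.

```python
MAX_BATCH = 2048  # per-request limit stated above


def split_batches(texts, batch_size=MAX_BATCH):
    """Yield consecutive slices of texts, each at most batch_size long."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_all(texts, embed_batch, batch_size=MAX_BATCH):
    """Embed any number of texts, preserving input order."""
    vectors = []
    for batch in split_batches(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed_batch(batch):
    """Offline stand-in for a real API call (assumption for illustration)."""
    return [[float(len(text))] for text in batch]


texts = [f"chunk {i}" for i in range(5000)]
vectors = embed_all(texts, fake_embed_batch)
batch_sizes = [len(b) for b in split_batches(texts)]
print(batch_sizes)  # [2048, 2048, 904]
```

Because each batch is appended in sequence, `vectors[i]` still corresponds to `texts[i]`.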

Does Nomic Embed v2 normalize the vectors?

Yes. Vectors are L2-normalized by default, so cosine similarity equals the dot product. Pass `normalize=false` if you want raw vectors for a different distance metric.
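The identity behind this (for unit-length vectors, cosine similarity and dot product coincide) can be checked numerically. The sketch below uses made-up vectors and a hypothetical `l2_normalize` helper; it also shows how one might normalize raw vectors client-side.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed from raw (unnormalized) vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


raw_a = [3.0, 4.0, 0.0]  # made-up raw vectors
raw_b = [1.0, 2.0, 2.0]

unit_a = l2_normalize(raw_a)
unit_b = l2_normalize(raw_b)

# On unit vectors the plain dot product already equals the cosine similarity.
same = math.isclose(dot(unit_a, unit_b), cosine(raw_a, raw_b))
print(same)
```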

Which vector DBs work with Nomic Embed v2?

Any of them: Pinecone, Weaviate, Qdrant, Chroma, pgvector, FAISS, LanceDB. Nomic Embed v2 returns plain JSON floats, so the DB never sees the model.

Is there an API for Nomic Embed v2?

Yes. POST to /v1/embeddings/ with model="nomic-embed-v2" (the identifier used in the curl example above). The response shape is OpenAI-compatible, so existing client libraries work unchanged. /api/ has the full reference.
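For readers who prefer Python to curl, a minimal standard-library sketch follows. It takes the host and model id from the curl example on this page and the /v1/embeddings/ path from this FAQ. It also assumes the request and response follow OpenAI's embeddings format (an `input` field in the request, a `data` list of objects with `index` and `embedding` in the response), which this page does not confirm beyond the "OpenAI-compatible" claim. `build_request`, `parse_embeddings`, and `embed` are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://api.free.ai/v1/embeddings/"  # host from the curl example, path from the FAQ
MODEL_ID = "nomic-embed-v2"                      # model id as written in the curl example


def build_request(texts, api_key):
    """Build a POST request carrying a list of strings to embed."""
    body = json.dumps({"model": MODEL_ID, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_embeddings(payload):
    """Extract vectors in input order, assuming an OpenAI-style response."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts, api_key):
    """Send the request and return one vector per input string."""
    with urllib.request.urlopen(build_request(texts, api_key)) as resp:
        return parse_embeddings(json.load(resp))


# Offline check of the parsing step with a hand-written sample payload.
sample = {"data": [{"index": 1, "embedding": [0.3, 0.4]},
                   {"index": 0, "embedding": [0.1, 0.2]}]}
print(parse_embeddings(sample))  # [[0.1, 0.2], [0.3, 0.4]]
```

Only `embed` touches the network; call it with your own key once you have confirmed the field names against the full reference.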

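What about privacy when I embed sensitive text with Nomic Embed v2?

Self-hosted models keep your text on our GPUs and discard it after the call returns. Premium models pass your text through to the provider under a data processing agreement (DPA). We do not train on your inputs.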

How long does Nomic Embed v2 take per call?

Under 100 ms for short text on self-hosted models, and 100–500 ms on premium models. Batch calls scale roughly linearly: 1,000 chunks complete in 2–10 seconds.

Can I use Nomic Embed v2 output commercially?

Yes. Free.ai grants commercial use of embeddings. You can build production search, RAG pipelines, and recommendation systems with no per-vector royalty.

