
feat(ollama): return real token counts from embedding endpoint #11

Merged: pyramation merged 1 commit into main from feat/embedding-token-counts on May 21, 2026


@pyramation
Contributor

Summary

Ollama's /api/embed endpoint returns prompt_eval_count (the input token count), but agentic-kit was discarding it. This PR:

  1. Switches from deprecated /api/embeddings to /api/embed — the newer endpoint supports batch input and returns prompt_eval_count
  2. generateEmbedding() now returns EmbeddingResult instead of number[]:
    interface EmbeddingResult {
      embedding: number[];
      promptTokens: number;
    }
  3. Adds OllamaAdapter.embed() convenience method that delegates to the client
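
The delegation in item 3 can be pictured with a minimal sketch. `SketchAdapter` and `EmbeddingClient` are hypothetical names used only for illustration, and the single-argument `generateEmbedding(text)` signature is an assumption; the real `OllamaAdapter` and client may take more parameters.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Assumed minimal client surface; the real client may accept more arguments.
interface EmbeddingClient {
  generateEmbedding(text: string): Promise<EmbeddingResult>;
}

// Hypothetical stand-in for the adapter, reduced to the embed() method.
class SketchAdapter {
  private readonly client: EmbeddingClient;

  constructor(client: EmbeddingClient) {
    this.client = client;
  }

  // Convenience method: forwards the call to the client unchanged.
  embed(text: string): Promise<EmbeddingResult> {
    return this.client.generateEmbedding(text);
  }
}
```

Because the adapter only forwards the call, callers of `embed()` would see the same `EmbeddingResult` shape as callers of `generateEmbedding()`.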

BREAKING CHANGE: generateEmbedding() return type changed from Promise<number[]> to Promise<EmbeddingResult>. Callers need to use result.embedding instead of the raw array.

Before:

const embedding = await client.generateEmbedding('hello');
// embedding is number[] — no token info

After:

const result = await client.generateEmbedding('hello');
// result.embedding is number[], result.promptTokens is the real count

This unblocks constructive's metering layer from using real token counts for embeddings; it currently relies on a placeholder estimate of ~4 characters per token.
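
As a rough illustration of what this enables, the sketch below prefers the real count and falls back to the ~4 characters per token estimate. `estimateTokens` and `tokensToBill` are hypothetical helpers, not part of constructive or agentic-kit, and the fallback rule (use the estimate when `promptTokens` is missing or zero) is an assumption.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Placeholder heuristic: roughly 4 characters per token, rounded up.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical metering helper: use the real count when one is available,
// otherwise fall back to the character-based estimate.
function tokensToBill(text: string, result?: EmbeddingResult): number {
  if (result !== undefined && result.promptTokens > 0) {
    return result.promptTokens;
  }
  return estimateTokens(text);
}
```

Keeping the estimate as a fallback is one possible design choice; a stricter metering layer might instead reject results that carry no token count.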

Review & Testing Checklist for Human

  • Run live tests against a real Ollama instance with nomic-embed-text installed: OLLAMA_LIVE_MODEL=qwen3.5:4b pnpm --filter @agentic-kit/ollama test:live:extended — verify promptTokens > 0 in the new embedding tests
  • Verify the /api/embed endpoint works with your Ollama version (requires Ollama ≥ 0.4.0 for /api/embed; older versions only have /api/embeddings)
  • Check downstream consumers that call generateEmbedding() — they need to update from result (array) to result.embedding (array) and can now access result.promptTokens
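
For the downstream update in the last item, a small compatibility shim is one way to migrate call sites gradually. This is only a sketch: `toVector` and `LegacyOrResult` are hypothetical names and do not exist in agentic-kit.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Either the old return value (plain array) or the new result object.
type LegacyOrResult = number[] | EmbeddingResult;

// Hypothetical shim: always hand back the raw vector, whichever shape arrives.
function toVector(value: LegacyOrResult): number[] {
  return Array.isArray(value) ? value : value.embedding;
}
```

A consumer could wrap existing uses in `toVector(...)` first, then switch to `result.embedding` and `result.promptTokens` once every caller is on the new return type.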

Notes

  • The pre-existing agentic-kit package test failure (reasoning field mismatch with ProviderAdapter type) is unrelated — same failure on main
  • The old /api/embeddings endpoint (singular) returned { embedding: number[] }. The new /api/embed returns { embeddings: number[][], prompt_eval_count: number } — we take embeddings[0] for single-text input
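
The response mapping described in the last note can be sketched as follows. `toEmbeddingResult` is a hypothetical helper, not necessarily how the PR implements it; treating `prompt_eval_count` as optional and defaulting it to 0 is an assumption made for the sketch.

```typescript
// Response shape of /api/embed as described in the note above.
// prompt_eval_count is marked optional here as a defensive assumption.
interface OllamaEmbedResponse {
  embeddings: number[][];
  prompt_eval_count?: number;
}

interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Hypothetical helper: take the first vector for single-text input and
// surface the token count under the promptTokens name.
function toEmbeddingResult(response: OllamaEmbedResponse): EmbeddingResult {
  const [embedding] = response.embeddings;
  if (embedding === undefined) {
    throw new Error("Ollama /api/embed returned no embeddings");
  }
  return {
    embedding,
    promptTokens: response.prompt_eval_count ?? 0,
  };
}
```

Throwing on an empty `embeddings` array keeps a malformed response from silently producing an `undefined` vector further down the pipeline.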

Link to Devin session: https://app.devin.ai/sessions/2b5a29d83d3f478e8d3d972653b4879c
Requested by: @pyramation

- Switch from deprecated /api/embeddings to /api/embed
- generateEmbedding() now returns EmbeddingResult { embedding, promptTokens }
  instead of plain number[] — promptTokens comes from prompt_eval_count
- Add OllamaAdapter.embed() convenience method
- Update live tests to verify promptTokens > 0
- Update README with new return type and adapter example

BREAKING CHANGE: generateEmbedding() return type changed from
Promise<number[]> to Promise<EmbeddingResult>
@pyramation merged commit 2256d7d into main on May 21, 2026
12 checks passed