feat(ollama): return real token counts from embedding endpoint by pyramation · Pull Request #11 · constructive-io/agentic-kit

pyramation · 2026-05-21T22:24:10Z

Summary

Ollama's /api/embed endpoint returns prompt_eval_count (input token count) but agentic-kit was discarding it. This PR:

Switches from deprecated /api/embeddings to /api/embed — the newer endpoint supports batch input and returns prompt_eval_count

generateEmbedding() now returns EmbeddingResult instead of number[]:

interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

Adds OllamaAdapter.embed() convenience method that delegates to the client
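
As a rough illustration of the delegation described above, the adapter method can be a one-line pass-through. The names `EmbeddingClient` and `OllamaAdapterSketch` and the single-argument signature are assumptions made for this sketch; the real classes in @agentic-kit/ollama may differ.

```typescript
// Result shape introduced by this PR.
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Hypothetical minimal view of the client; the real client has more methods.
interface EmbeddingClient {
  generateEmbedding(text: string): Promise<EmbeddingResult>;
}

// Hypothetical stand-in for OllamaAdapter, showing only the embed() delegation.
class OllamaAdapterSketch {
  private readonly client: EmbeddingClient;

  constructor(client: EmbeddingClient) {
    this.client = client;
  }

  // Pure delegation: the adapter adds no logic of its own.
  embed(text: string): Promise<EmbeddingResult> {
    return this.client.generateEmbedding(text);
  }
}
```

Keeping the adapter method this thin means the token count reaches callers unchanged, whichever entry point they use.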

BREAKING CHANGE: generateEmbedding() return type changed from Promise<number[]> to Promise<EmbeddingResult>. Callers need to use result.embedding instead of the raw array.

Before:

const embedding = await client.generateEmbedding('hello');
// embedding is number[] — no token info

After:

const result = await client.generateEmbedding('hello');
// result.embedding is number[], result.promptTokens is the real count

This unblocks constructive's metering layer from using real token counts for embeddings (currently using ~4 chars/token placeholder).
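
To make the metering difference concrete, here is a minimal sketch of how a consumer might prefer the real count and keep the ~4 chars/token heuristic only as a fallback. `estimateTokens` and `billableTokens` are hypothetical names, and the fallback policy is an assumption, not something this PR implements.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Placeholder heuristic mentioned above: roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Prefer the count reported by Ollama; fall back to the heuristic
// when no result is available or the reported count is not positive.
function billableTokens(text: string, result?: EmbeddingResult): number {
  if (result && Number.isFinite(result.promptTokens) && result.promptTokens > 0) {
    return result.promptTokens;
  }
  return estimateTokens(text);
}
```

The heuristic and the real count can disagree noticeably for short or non-English input, which is the gap the new promptTokens field closes.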

Review & Testing Checklist for Human

Run live tests against a real Ollama instance with nomic-embed-text installed: OLLAMA_LIVE_MODEL=qwen3.5:4b pnpm --filter @agentic-kit/ollama test:live:extended — verify promptTokens > 0 in the new embedding tests
Verify the /api/embed endpoint works with your Ollama version (requires Ollama ≥ 0.4.0 for /api/embed; older versions only have /api/embeddings)
Check downstream consumers that call generateEmbedding(): the return value is no longer the array itself, so they must read result.embedding for the vector, and they can now also read result.promptTokens
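
For downstream code that has to work across both versions during the migration, a small normalizing helper is one option. This is a sketch only; `toVector` is a hypothetical helper and not part of agentic-kit.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Accepts either the old return value (a bare number[]) or the new
// EmbeddingResult and always hands back the vector.
function toVector(value: number[] | EmbeddingResult): number[] {
  return Array.isArray(value) ? value : value.embedding;
}
```

Once every caller is on the new version, the helper can be dropped in favor of reading result.embedding directly.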

Notes

The pre-existing agentic-kit package test failure (reasoning field mismatch with ProviderAdapter type) is unrelated — same failure on main
The old /api/embeddings endpoint (singular) returned { embedding: number[] }. The new /api/embed returns { embeddings: number[][], prompt_eval_count: number } — we take embeddings[0] for single-text input
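
The response mapping described in the note above can be sketched as follows. The helper names, the error on an empty embeddings array, and the default of 0 when prompt_eval_count is absent are assumptions for illustration; the actual client code in this PR may handle those cases differently.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Response shape of /api/embed as described in the note above.
interface OllamaEmbedResponse {
  embeddings: number[][];
  prompt_eval_count?: number;
}

// /api/embed takes the text under "input"; the older /api/embeddings used "prompt".
function buildEmbedRequest(model: string, text: string): { model: string; input: string } {
  return { model, input: text };
}

// Map the batch-shaped response onto the single-text result.
function toEmbeddingResult(body: OllamaEmbedResponse): EmbeddingResult {
  const first = body.embeddings[0];
  if (!first) {
    throw new Error('Ollama returned no embeddings');
  }
  // Assumption: treat a missing count as 0 rather than failing.
  return { embedding: first, promptTokens: body.prompt_eval_count ?? 0 };
}
```

Keeping the mapping in a pure function makes it easy to unit test without a running Ollama instance.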

Link to Devin session: https://app.devin.ai/sessions/2b5a29d83d3f478e8d3d972653b4879c
Requested by: @pyramation

- Switch from deprecated /api/embeddings to /api/embed
- generateEmbedding() now returns EmbeddingResult { embedding, promptTokens } instead of plain number[]; promptTokens comes from prompt_eval_count
- Add OllamaAdapter.embed() convenience method
- Update live tests to verify promptTokens > 0
- Update README with new return type and adapter example

BREAKING CHANGE: generateEmbedding() return type changed from Promise<number[]> to Promise<EmbeddingResult>

devin-ai-integration · 2026-05-21T22:24:14Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


devin-ai-integration Bot assigned pyramation May 21, 2026

pyramation merged commit 2256d7d into main May 21, 2026
12 checks passed

pyramation merged 1 commit into main from feat/embedding-token-counts
