Add Vector Search plugin #200

Open
adamgurary wants to merge 3 commits into databricks:main from adamgurary:feat/vector-search-plugin

Summary

  • Adds @databricks/appkit-vector-search — a plugin that gives Databricks Apps built with AppKit instant vector search query capabilities
  • Ships backend (Express routes, VS REST API client, service principal + OBO auth) and frontend (React hook, styled components with Radix UI)
  • Developer experience target: ~45 lines for a full search page with search box, results, filters, and keyword highlighting
  • 82 tests included; validated against real VS index on dogfood

What's included

Backend plugin (src/plugin/):

  • VectorSearchPlugin.ts — plugin class with lifecycle, config, route injection
  • VectorSearchClient.ts — REST API client for VS endpoints
  • routes.ts — Express route handlers
  • auth.ts — service principal default, OBO opt-in per index
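
The "service principal default, OBO opt-in per index" rule from auth.ts can be pictured with a minimal sketch. Everything here is hypothetical: the `IndexConfig` shape, the `resolveAuthMode` and `pickToken` helpers, and the header name are illustrative assumptions, not the plugin's actual code (a later commit in this PR replaces custom auth with AppKit's built-in `asUser(req)`).

```typescript
// Hypothetical types: the PR does not show the plugin's real config shape.
type AuthMode = "service-principal" | "obo";

interface IndexConfig {
  indexName: string;
  auth?: AuthMode; // omitted means the service principal default applies
}

// An index uses OBO only when it explicitly opts in.
function resolveAuthMode(index: IndexConfig): AuthMode {
  return index.auth ?? "service-principal";
}

// Assumption: the user token arrives in this request header.
const USER_TOKEN_HEADER = "x-forwarded-access-token";

function pickToken(
  index: IndexConfig,
  headers: Record<string, string | undefined>,
  servicePrincipalToken: string,
): string {
  if (resolveAuthMode(index) === "obo") {
    const userToken = headers[USER_TOKEN_HEADER];
    if (!userToken) {
      // Fail loudly rather than silently falling back to broader credentials.
      throw new Error(`OBO is enabled for ${index.indexName} but no user token was provided`);
    }
    return userToken;
  }
  return servicePrincipalToken;
}
```

Refusing to fall back to the service principal when an OBO index has no user token keeps the opt-in meaningful: a misconfigured request fails instead of running with wider permissions.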

Frontend UI (src/ui/):

  • useVectorSearch React hook
  • SearchBox, SearchResults, SearchResultCard, SearchLoadMore components

Design decisions

| Decision | Choice | Rationale |
|---|---|---|
| Package structure | Single package (backend + UI) | Shared types, single dependency, matches Lakebase plugin pattern |
| Default search mode | Hybrid (ANN + keyword) | Best out-of-the-box quality |
| Reranking | Off by default, opt-in per index | Adds latency; too slow for interactive search by default |
| Auth | Service principal default, OBO opt-in | Simple default, secure option when needed |
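
As a rough illustration of how the search-mode and reranking defaults above could be applied per index, here is a small sketch. The option names (`queryType`, `rerank`) and the `resolveIndexOptions` helper are assumptions made for this example and are not taken from the plugin's API.

```typescript
// Hypothetical option names, chosen only to mirror the decisions in the table.
type QueryType = "hybrid" | "ann";

interface IndexOptions {
  queryType?: QueryType;
  rerank?: boolean;
}

interface ResolvedIndexOptions {
  queryType: QueryType;
  rerank: boolean;
}

// Defaults from the table: hybrid search on, reranking off.
const DEFAULT_INDEX_OPTIONS: ResolvedIndexOptions = {
  queryType: "hybrid",
  rerank: false,
};

// Use ?? per field so an explicit `undefined` never overrides a default.
function resolveIndexOptions(options: IndexOptions = {}): ResolvedIndexOptions {
  return {
    queryType: options.queryType ?? DEFAULT_INDEX_OPTIONS.queryType,
    rerank: options.rerank ?? DEFAULT_INDEX_OPTIONS.rerank,
  };
}
```

An index that can tolerate the extra latency would opt in with `resolveIndexOptions({ rerank: true })`, while every other index keeps the faster default.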

Test plan

  • Review plugin structure against existing AppKit plugin patterns (Lakebase)
  • Run test suite (vitest run in packages/vector-search/)
  • Validate against live VS index on dogfood
  • Review API surface and types

Adam Gurary added 3 commits March 20, 2026 13:06
Adds @databricks/appkit-vector-search — a plugin that gives Databricks Apps
built with AppKit instant vector search query capabilities. Ships backend
(Express routes, VS REST API client, auth) and frontend (React hook, styled
components with Radix UI).

Developer experience target: ~45 lines for a full search page with search box,
results, filters, and keyword highlighting.

82 tests included. Validated against real VS index on dogfood.

- Move from packages/vector-search/ into packages/appkit/src/plugins/vector-search/
- Replace custom auth (ServicePrincipalTokenProvider, OboTokenExtractor) with
  AppKit's built-in asUser(req) and getWorkspaceClient() context
- Add VectorSearchConnector using workspaceClient.apiClient.request()
  instead of raw fetch with manual token management
- Plugin now extends Plugin base class with proper manifest.json,
  defaults.ts, this.route(), this.execute(), and toPlugin() factory
- Remove standalone package.json, tsconfig.json, and vitest config
- Register plugin and connector in index barrel exports

Addresses review feedback:
- Plugin lives under plugins/ folder alongside analytics, genie, files
- No custom auth handling — uses AppKit's built-in mechanisms
- Follows create-core-plugin patterns (manifest, defaults, connector)

Signed-off-by: Adam Gurary <[email protected]>

- Connector: wrap VS API calls in telemetry spans with index name,
  query type, result count, and latency attributes
- Connector: check AbortSignal before executing requests
- Connector: add WideEvent context logging with query metadata
- Plugin: replace this.execute() in route handlers with direct
  try/catch — preserves actual error details (code, message, status)
  instead of swallowing them into undefined
- Remove unused SearchFilters import

Signed-off-by: Adam Gurary <[email protected]>
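
The two behavioural changes in this commit (checking for abort before executing, and a direct try/catch that keeps the error's code, message, and status) might look roughly like the sketch below. All names here (`AbortLike`, `ErrorBody`, `runQuery`, the `"ABORTED"` and `"INTERNAL_ERROR"` codes, the 500 fallback) are assumptions for illustration, not the connector's real API.

```typescript
// Structural stand-in for AbortSignal so the sketch has no runtime dependencies.
interface AbortLike {
  readonly aborted: boolean;
}

interface ErrorBody {
  status: number;
  code: string;
  message: string;
}

// Refuse to start work for a request that was already cancelled.
function assertNotAborted(signal?: AbortLike): void {
  if (signal?.aborted) {
    throw Object.assign(new Error("Request aborted before execution"), { code: "ABORTED" });
  }
}

// Keep whatever details the thrown value carries; fall back only for missing fields.
function toErrorBody(err: unknown): ErrorBody {
  const e = (typeof err === "object" && err !== null ? err : {}) as {
    status?: unknown;
    code?: unknown;
    message?: unknown;
  };
  return {
    status: typeof e.status === "number" ? e.status : 500,
    code: typeof e.code === "string" ? e.code : "INTERNAL_ERROR",
    message: typeof e.message === "string" ? e.message : String(err),
  };
}

type QueryResult<T> = { ok: true; data: T } | { ok: false; error: ErrorBody };

// Direct try/catch: failures become a structured error instead of undefined.
function runQuery<T>(query: () => T, signal?: AbortLike): QueryResult<T> {
  try {
    assertNotAborted(signal);
    return { ok: true, data: query() };
  } catch (err) {
    return { ok: false, error: toErrorBody(err) };
  }
}
```

A route handler built this way can send `error.status` and `error.code` straight back to the caller, which is the detail the commit message says was being lost.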
pkosiec added a commit that referenced this pull request Mar 25, 2026
Brainstorm: add PR #166 (Agent plugin) and PR #200 (Vector Search
plugin) as future extension references. Rename future enhancement
section to cover both Vector Search and Lakebase pgvector options.

Plan: address findings from multi-agent code review (architecture,
security, performance, spec flow, pattern recognition):
- Fix cache infrastructure: use shared CacheManager pool, not
  fictional maxEntries config
- Clarify error contract: programmatic API errors propagate,
  HTTP handlers use execute() for interceptors
- Separate _chatCollect()/_embed() from HTTP handlers
- Add SSE buffer max size (1MB) to prevent OOM
- Restrict response_format to text/json_object (no json_schema v1)
- Add runtime role validation against known set
- Add model to parameter allowlist for Foundation Model API
- Add stop parameter bounds (4 entries, 256 chars)
- Standardize connection pool at 100 (was contradictory 50/100)
- Add retry on 503 for chatCollect() (cold-start resilience)
- Specify setup() throws on missing endpoint, shutdown() cleanup
- Extract SSE parser to stream/sse-parser.ts in Phase 2
- Add per-route body-parser middleware (not global)
- Update acceptance criteria and security checklist

Signed-off-by: Pawel Kosiec <[email protected]>
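
One of the plan items above, the stop parameter bounds (4 entries, 256 chars), lends itself to a short sketch. It assumes that the 256-character limit applies to each entry and that `stop` may be a single string or an array; the `validateStop` helper is hypothetical and not taken from the referenced plan.

```typescript
const MAX_STOP_ENTRIES = 4;
const MAX_STOP_LENGTH = 256; // assumed to be a per-entry limit

// Normalizes `stop` to an array and rejects values outside the bounds.
function validateStop(stop: string | string[] | undefined): string[] {
  if (stop === undefined) return [];
  const entries = Array.isArray(stop) ? stop : [stop];
  if (entries.length > MAX_STOP_ENTRIES) {
    throw new RangeError(`stop accepts at most ${MAX_STOP_ENTRIES} entries, got ${entries.length}`);
  }
  for (const entry of entries) {
    if (entry.length > MAX_STOP_LENGTH) {
      throw new RangeError(`stop entries are limited to ${MAX_STOP_LENGTH} characters`);
    }
  }
  return entries;
}
```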
pkosiec added a commit that referenced this pull request Mar 25, 2026
Brainstorm:
- Added chatCollect() for non-streaming programmatic API
- Scoped out vision/multimodal, thinking/budget_tokens, tools/tool_choice
  as v2 items with specific rationale
- Added reasoning_effort to v1 scope
- Referenced PRs #166 (agent plugin) and #200 (vector search)
- Updated references with query/vision/reasoning/function-calling docs

Plan:
- Cross-referenced Databricks Query API spec vs OpenAI conventions
- Documented type sourcing decision (hand-write for v1, sourced from
  OpenAI API reference)
- Added SDK comparison table (OpenAI vs Anthropic vs AppKit)
- Fixed id: string | null in response types
- Noted served-model-name header for telemetry
- Documented extra_params vs top-level field convention

Signed-off-by: Pawel Kosiec <[email protected]>