AI Tax Assistant Platform

One governed AI assistant per department

Every department gets its own workspace and a document-grounded assistant for officers, to ask questions, draft replies for review, and triage cases. Answers are cited, and every model call is routed, logged, and costed under one platform-wide governance standard.

Browse workspaces

Two capabilities

Document-grounded answers

Each workspace uploads its own guidance. The platform chunks, embeds, and indexes it per workspace (OpenAI embeddings over pgvector or a local store), and the assistant answers from that corpus with inline citations you can click back to the source.

Open Documents

Governed to one standard

One policy as code: PII detect, redact, and audit, grounding, an eval-score gate, and a per-call cost ceiling, applied uniformly to every workspace. Every model call is routed by rule, logged with its cost, and recorded in the audit trail.

Open the dashboard

In each workspace

Scoped to the department you select. Documents, chat history, tools, and instructions stay per workspace.

Assistant

A multi-step agent that retrieves cited guidance, drafts replies for review, and triages cases, with a step trace and source inspector.

Documents

Upload guidance, then search the index and see chunks, similarity scores, and citations.

AI Tools

Build tools without code: lookup tables, message templates, or sandboxed calculators the assistant can call.

AI Instructions

The assistant's system prompt for this workspace, versioned with line diffs and an activation pointer.

Usage analytics

Training needs, documentation gaps, and process hotspots, mined from usage with Python embeddings and clustering.

AI Gateway

Every model call with its latency, tokens, cost, and any provider fallback.

Platform-wide governance

One standard across every workspace, with live aggregates and a full audit trail.

AI Dashboard

Platform-wide health: usage, eval pass rate, cost, and reliability across all workspaces.

AI Policy

The governance policy as code and the deterministic model routing rules, editable in place.

AI Audit Trail

Every model call, eval run, and instruction change across the platform, newest first.

AI Evaluation

Graded test cases with a keyword grader or an LLM judge, behind an eval pass-rate gate.

Under the hood

Three pieces across two runtimes: a Next.js app (the agent, gateway, routing, evals, and governance), a Python RAG service, and a Python insights pipeline. Built so every model decision is deterministic, logged, and reversible.

Bounded agent loop

A capped loop (up to 5 steps, temperature 0) on the Vercel AI SDK, with cited retrieval and tool calls.

Deterministic routing

Keyword rules pick one of six models per query, no extra model call, with cross-provider fallback between Anthropic and OpenAI.

One model gateway

Every call is wrapped to record latency, tokens, and USD cost, logged per workspace.

RAG service

Python FastAPI and LlamaIndex with OpenAI embeddings over pgvector or a local store, one index per workspace.

Eval harness

Keyword and LLM-as-judge graders with a pass-rate gate and a trend across runs.

Versioned prompts

Immutable prompt versions behind an activation pointer, with line diffs.

Sandboxed code tools

Custom calculators run in a QuickJS WASM sandbox: a 1s deadline, 32MB, and no host access.

Usage insights

Python embeddings and KMeans clustering turn usage into training needs and documentation gaps.

General information only. Demo documents are self-authored, open, or synthetic.