Scaffold server query pipeline & streaming API #353
Code Review
This pull request introduces the 'Atlas' scope component, integrating tldraw to render interactive text and chart cards on a canvas, alongside supporting elements like buttons, skeletons, and custom icons. On the backend, it implements a streaming query API powered by Gemini and Data Commons MCP tools, complete with query analysis, safety filtering, and metadata fetching. The review feedback focuses on aligning with repository conventions—such as renaming files to snake_case and applying category-first naming to components—and ensuring strict TypeScript compliance. Additionally, the reviewer recommends enhancing robustness by adding unmount cleanup hooks to prevent memory leaks, caching compiled regular expressions for better performance, and wrapping JSON parsing in try-catch blocks to prevent runtime TypeErrors.
beets
left a comment
Thanks for the PR! The scaffolding and overall flow for the data discovery query pipeline are well-designed. Since this is an initial commit, most of these comments can be added as TODOs in the code for a future refactor.
Should we consolidate all our agent instructions into the skills folder?
Sure thing! I went back and forth about this a few times and couldn't decide where they should live. Now that I'm looking at it again, I think this whole server directory got a bit messy :-D. I reorganized it a bit, let me know what you think!
I like having all the prompts centralized -- thanks!
  .replace(/```json/g, '')
  .replace(/```/g, '')
  .trim();
Similar to earlier comment on handling possible LLM preamble.
const jsonMatch = responseText.match(/\{[\s\S]*\}/);
const cleaned = jsonMatch ? jsonMatch[0] : responseText;
parsedResponse = JSON.parse(cleaned);
beets
left a comment
Thank you for the updates, Scott! I added some minor comments that you can address now or in a follow-up PR.
const cleaned = text
  .replace(/```json/g, '')
  .replace(/```/g, '')
  .trim();
This could also use the update to .match(/\{[\s\S]*\}/)
We can also lift that up into a utility since it's repeated in a few places.
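A minimal sketch of what such a shared utility could look like, assuming the goal is to tolerate LLM preamble and Markdown fences around a single JSON object. The name `extractJsonObject` and the choice to return `undefined` on failure are illustrative assumptions, not something taken from the PR.

```typescript
// Hypothetical shared helper: the name and the failure behaviour are assumptions.
export function extractJsonObject(text: string): unknown {
  // Take everything from the first "{" to the last "}" so that any preamble,
  // trailing commentary, or Markdown code fence around the object is ignored.
  const match = text.match(/\{[\s\S]*\}/);
  const candidate = match ? match[0] : text.trim();
  try {
    return JSON.parse(candidate);
  } catch {
    // Report failure to the caller instead of throwing inside the pipeline.
    return undefined;
  }
}
```

Because the pattern is greedy, a response containing two separate objects would be captured as one span and fail to parse, so callers would still need to handle the `undefined` case.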
const responseText = response.text || '';
const jsonMatch = responseText.match(/\{[\s\S]*\}/);
const cleaned = jsonMatch ? jsonMatch[0] : responseText;
Same block that can be lifted out.
  followUps?: string[];
}

export async function POST(request: NextRequest) {
Could add a TODO for this in the interim, and we can find the right time for this as we continue developing.
Overview
Adds the streaming query API route and supporting server-side pipeline that powers Data Weaver's natural-language-to-chart workflow. A user's query flows through safety checks, analysis, Gemini tool-calling with the Data Commons MCP server, and streams structured chart specifications back to the client via SSE.
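As a rough illustration of the streaming side of this flow, the sketch below shows one way a route could frame pipeline results as SSE messages on a web `ReadableStream`. The event names come from this PR's description; `formatSseEvent`, `createEventStream`, and the payload shapes are hypothetical and may not match the actual implementation.

```typescript
// Event names as listed in the PR description.
type QueryEventName =
  | 'status' | 'analysis' | 'tool_call' | 'chart_spec'
  | 'metadata' | 'error' | 'complete';

type Emit = (event: QueryEventName, data: unknown) => void;

// Frame one SSE message. JSON.stringify escapes newlines inside strings,
// so the payload always fits on a single "data:" line.
function formatSseEvent(event: QueryEventName, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical wrapper: runs a pipeline callback and turns its emitted
// events into a byte stream suitable for a streaming HTTP response.
function createEventStream(run: (emit: Emit) => Promise<void>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      const emit: Emit = (event, data) => {
        controller.enqueue(encoder.encode(formatSseEvent(event, data)));
      };
      try {
        await run(emit);
        emit('complete', {}); // assumed payload shape
      } catch (err) {
        // Surface failures as an SSE event rather than a broken stream.
        emit('error', { message: err instanceof Error ? err.message : String(err) });
      } finally {
        controller.close();
      }
    },
  });
}
```

A route handler could then return something like `new Response(createEventStream(run), { headers: { 'Content-Type': 'text/event-stream' } })`, where `run` performs the safety, analysis, and tool-calling steps.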
Changes Made
- /api/query: POST endpoint that accepts a natural-language query and streams SSE events (status, analysis, tool_call, chart_spec, metadata, error, complete)
- src/server/:
  - safety.ts: layered prompt safety gate (regex patterns + LLM classification)
  - analyze/: extracts places, topic, titles, and date ranges from the query via Gemini
  - query/: Gemini tool loop that iteratively calls MCP tools to resolve variables
  - observations/: fetches variable metadata/facets from Data Commons
  - mcp.ts: JSON-RPC client for the Data Commons MCP server
  - gemini.ts: shared Gemini client factory
  - dc-api.ts: Data Commons REST API helpers
  - config.ts + config/: service config and skill prompt loading
  - types.ts: shared types for the query pipeline and SSE events
- use-streaming-query.ts: SSE parser with abort support and typed event dispatch
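To make the client-side piece concrete, here is a hedged sketch of an incremental SSE parser with abort support. It is not the code in use-streaming-query.ts: `createSseParser`, `streamQuery`, and the `{ query }` request body are assumptions for illustration, and the parser handles only LF line endings and the `event:` and `data:` fields.

```typescript
interface SseMessage {
  event: string;
  data: string;
}

// Returns a feed function that buffers partial chunks and calls onMessage
// once per complete SSE message (messages are separated by a blank line).
function createSseParser(onMessage: (msg: SseMessage) => void): (chunk: string) => void {
  let buffer = '';
  return (chunk: string): void => {
    buffer += chunk;
    let boundary = buffer.indexOf('\n\n');
    while (boundary !== -1) {
      const block = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      let event = 'message'; // default event type when no "event:" field is sent
      const dataLines: string[] = [];
      for (const line of block.split('\n')) {
        if (line.startsWith('event:')) {
          event = line.slice(6).trim();
        } else if (line.startsWith('data:')) {
          // Drop the single optional space after the colon.
          dataLines.push(line.slice(5).replace(/^ /, ''));
        }
      }
      if (dataLines.length > 0) {
        onMessage({ event, data: dataLines.join('\n') });
      }
      boundary = buffer.indexOf('\n\n');
    }
  };
}

// Hypothetical caller: the request body shape is an assumption.
async function streamQuery(
  query: string,
  onMessage: (msg: SseMessage) => void,
  signal?: AbortSignal,
): Promise<void> {
  const response = await fetch('/api/query', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }),
    signal, // aborting the signal cancels the request and the read loop
  });
  if (!response.ok || !response.body) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const feed = createSseParser(onMessage);
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters split across chunks intact.
    feed(decoder.decode(value, { stream: true }));
  }
}
```

A caller could create an `AbortController`, pass `controller.signal` to `streamQuery`, and call `controller.abort()` when the component unmounts or a new query replaces the old one.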