NL To Mongo Pipeline + TUI: Ultra-Compact UIDSL
Hey guys, let's dive into a cool project that pairs natural language (NL) to MongoDB pipeline translation with a neat, responsive Terminal UI (TUI) described by something called Ultra-Compact UIDSL. The goal is to make ad-hoc analytics over JSONL files and unbounded streams a breeze in a single-process CLI built on Node/V8. Forget about persistence; we're all about real-time insights and a slick user experience.
The Core Idea
The main goal here is to take natural language, transform it into a MongoDB aggregation pipeline, and then visualize the results in a rich, interactive TUI. We're talking about tables, lists, trees, cards, stats, and even charts, all updating in real time. The core idea revolves around a model that spits out two key pieces of information:
- A MongoDB aggregation pipeline in a JSON string format (q).
- A Terminal UI DSL (UIDSL) string that describes the layout (ui).
We then take these two strings, parse them, execute the pipeline against the data, and render the TUI. It's a pretty slick process that aims to give you powerful analytical capabilities without a lot of fuss.
What Aggo-AI Already Does
Before we jump into the new features, let's recap what the existing aggo-ai system can do. It already rocks at:
- NL to Mongo: Converting natural language queries into valid MongoDB aggregation pipelines. This is the engine that translates your questions into database operations.
- Execution: Running these pipelines against local data. Whether your data is a bounded set or an unbounded stream, it can handle it.
- Streaming Results: Providing incremental updates as the data streams in. This means you get real-time insights without waiting for the entire dataset to process.
- Basic Rendering: Displaying results using JSON lines or simple tables. While functional, it's not the most visually appealing.
What's New
This new feature adds significant enhancements to the existing capabilities. Instead of just getting raw results, you now get:
- Structured Output: The LLM (large language model) returns a structured response containing two main elements: the MongoDB pipeline (q) and the UIDSL (ui).
- Rich TUI Rendering: The system parses the UIDSL and renders a rich TUI with a variety of components (tables, lists, charts, etc.), all updating in real time.
This update is designed to make the whole process more user-friendly and visually appealing.
Contract: Structured Output for Efficiency
To keep things efficient and lean, the model's output is designed to be compact: two strings (q and ui) plus an optional windowing hint (w). The shape is validated using Zod, which ensures that the output is in the correct format before the system uses it.
The Code Behind the Scenes
Here's a glimpse into the code structure using TypeScript:
import OpenAI from "openai";
import { z } from "zod";
import { zodTextFormat } from "openai/helpers/zod";

// Schema for the model's structured output.
export const Plan = z.object({
  v: z.literal("v1").default("v1"),
  q: z.string().max(100_000), // Mongo pipeline JSON string: `[{"$match":...}, ...]`
  ui: z.string().max(8_000), // Ultra-compact UIDSL v1 string
  w: z.object({
    mode: z.enum(["b", "u"]).default("b"), // bounded | unbounded
    emitMs: z.number().int().min(10).max(5000).optional(),
    maxDocs: z.number().int().positive().optional()
  }).optional()
});

// SYSTEM_PROMPT and userRequest are defined elsewhere.
const plan = (await new OpenAI().responses.parse({
  model: "gpt-5-nano",
  input: [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: userRequest }
  ],
  text: { format: zodTextFormat(Plan, "plan") }
})).output_parsed;
This code uses Zod to define the expected structure of the output. This ensures that the system receives the correct data format, making it easier to process the information.
UIDSL v1: The Language of the TUI
UIDSL (Ultra-Compact TUI DSL) is a language that allows you to define the layout of the TUI in a concise way. This helps keep the output small while still providing a lot of flexibility in terms of UI design. Think of it as a mini-language for describing how the TUI should look and behave.
Type Codes
The type codes are the building blocks of the UIDSL and they consist of:
- Containers: g (grid), tb (tabs).
- Leaves: t (table), li (list), tr (tree), st (stat), sk (sparkline), br (bar), js (raw JSON).
These codes specify the different types of UI elements that can be used in the TUI.
Props (Short for Speed)
UIDSL uses short props to keep the strings compact. These props define the attributes of each UI element. For example:
- i for id
- f for from (JSONPath-lite)
- dr for direction
- c for columns
- s for sort
- lb for label
These props allow you to customize the UI elements.
Mini-Syntax for Columns
For tables, the column syntax is c=Header:path[:l|r|c[:width]], where:
- Header is the column header.
- path is the JSONPath-lite expression.
- l|r|c specifies alignment (left, right, center).
- width specifies the column width.
This syntax allows for precise control over the table's appearance.
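To make the column mini-syntax concrete, here is a minimal sketch of how a c= value could be split into column specs. The names ColumnSpec, isAlign, and parseColumns are hypothetical, and two details are assumptions inferred from the examples in this post rather than from the real parser: columns are separated by |, and a header may be wrapped in single quotes. Left alignment as the default is also an assumption.

```typescript
// Hypothetical types and helpers; not taken from the actual aggo-ai source.
type Align = "l" | "r" | "c";

interface ColumnSpec {
  header: string;
  path: string;
  align: Align;   // assumption: left-aligned when omitted
  width?: number; // undefined means "let the renderer decide"
}

function isAlign(s: string): s is Align {
  return s === "l" || s === "r" || s === "c";
}

// Parses e.g. "'Endpoint':$.endpoint|p95:$.lat.p95:r:8".
function parseColumns(spec: string): ColumnSpec[] {
  return spec.split("|").map((raw) => {
    const parts = raw.trim().split(":");
    if (parts.length < 2 || parts.length > 4) {
      throw new Error(`Bad column spec: ${raw}`);
    }
    const header = parts[0].replace(/^'(.*)'$/, "$1"); // strip optional quotes
    const path = parts[1];
    const align = parts.length > 2 ? parts[2] : "l";
    if (!isAlign(align)) {
      throw new Error(`Bad alignment "${align}" in: ${raw}`);
    }
    const col: ColumnSpec = { header, path, align };
    if (parts.length > 3) {
      if (!/^[1-9]\d*$/.test(parts[3])) {
        throw new Error(`Bad width "${parts[3]}" in: ${raw}`);
      }
      col.width = Number(parts[3]);
    }
    return col;
  });
}

const cols = parseColumns("'Endpoint':$.endpoint|p95:$.lat.p95:r:8");
```

This sketch does not handle a : or | inside a quoted header; a real scanner would need to track quotes while splitting.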
Grammar in Simple Terms
The informal grammar for UIDSL is:
UI := 'ui:v1;' COMP
COMP := g(...) [ ... ] | tb(...) [ ... ] | t(...) | li(...) | tr(...) | st(...) | sk(...) | br(...) | js(...)
This grammar defines how you can structure your UI components.
Example Code
Here are a few examples to give you an idea of how UIDSL works:
ui:v1;g(dr=R,gp=1)[
t(i=top,f=$.items,c='Endpoint':$.endpoint|p95:$.lat.p95:r, s=$.lat.p95:desc,pg=20),
g(dr=C)[ st(lb='Req/min',v=$.meta.rpm), sk(i=trend,f=$.meta.rpmSeries) ]
]
ui:v1;tr(i=err,f=$.groups,lb=$.service,ch=$.errors)
ui:v1;tb(ti='Top','Trend')[ t(f=$.items,c=Svc:$.svc|p95:$.p95:r,pg=15), sk(f=$.series) ]
These examples show how you can create complex layouts with just a few lines of code. This makes it easy to create custom and interactive TUIs.
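For a feel of what parsing these strings involves, here is a rough sketch of a scanner that walks the string once with a forward-only cursor and builds a small tree. The UiNode shape and the parseUidsl name are hypothetical, and the quoting rules (single quotes protect commas and parentheses; a comma-separated piece without key= continues the previous value, as in ti='Top','Trend') are assumptions read off the examples above, not the actual implementation.

```typescript
// Hypothetical AST shape; the real uidsl/parser.ts may look different.
interface UiNode {
  type: string;
  props: Record<string, string>;
  children: UiNode[];
}

const TYPES = ["g", "tb", "t", "li", "tr", "st", "sk", "br", "js"];
const PREFIX = "ui:v1;";

function parseUidsl(src: string): UiNode {
  if (src.indexOf(PREFIX) !== 0) throw new Error("Expected 'ui:v1;' prefix");
  let pos = PREFIX.length; // single cursor that only moves forward

  function skipWs(): void {
    while (pos < src.length && /\s/.test(src.charAt(pos))) pos++;
  }

  function expect(ch: string): void {
    skipWs();
    if (src.charAt(pos) !== ch) throw new Error(`Expected '${ch}' at offset ${pos}`);
    pos++;
  }

  // Reads "k=v,k=v" up to the closing ")". Text inside single quotes is literal.
  function parseProps(): Record<string, string> {
    const props: Record<string, string> = {};
    let segment = "";
    let lastKey = "";
    let quoted = false;
    const flush = (): void => {
      const text = segment.trim();
      segment = "";
      if (text === "") return;
      const m = /^([a-z]+)=([\s\S]*)$/.exec(text);
      if (m) {
        lastKey = m[1];
        props[lastKey] = m[2];
      } else if (lastKey !== "") {
        props[lastKey] += "," + text; // continuation of the previous value
      } else {
        throw new Error(`Bad prop '${text}'`);
      }
    };
    for (; pos < src.length; pos++) {
      const ch = src.charAt(pos);
      if (ch === "'") quoted = !quoted;
      if (!quoted && ch === ")") {
        flush();
        pos++;
        return props;
      }
      if (!quoted && ch === ",") flush();
      else segment += ch;
    }
    throw new Error("Unterminated props: missing ')'");
  }

  function parseComp(): UiNode {
    skipWs();
    const start = pos;
    while (pos < src.length && /[a-z]/.test(src.charAt(pos))) pos++;
    const type = src.slice(start, pos);
    if (TYPES.indexOf(type) === -1) throw new Error(`Unknown type '${type}' at offset ${start}`);
    expect("(");
    const node: UiNode = { type, props: parseProps(), children: [] };
    skipWs();
    if (src.charAt(pos) === "[") {
      pos++;
      for (;;) {
        node.children.push(parseComp());
        skipWs();
        if (src.charAt(pos) !== ",") break;
        pos++;
      }
      expect("]");
    }
    return node;
  }

  const root = parseComp();
  skipWs();
  if (pos !== src.length) throw new Error(`Unexpected input at offset ${pos}`);
  return root;
}

const ast = parseUidsl(
  "ui:v1;tb(ti='Top','Trend')[ t(f=$.items,c=Svc:$.svc|p95:$.p95:r,pg=15), sk(f=$.series) ]"
);
```

Here ast is a tb node with two children (a t and an sk), and every prop value is kept as a raw string for a later compile step to interpret.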
Flow: From LLM to Interactive TUI
Let's break down the flow of how this all comes together, step by step:
- LLM Output: The LLM produces the { q, ui, w } structure. q is the Mongo pipeline, ui is the UIDSL, and w contains windowing hints.
- Pipeline Parsing: The q string (the Mongo pipeline) is parsed into a JSON array. This ensures that the pipeline is well-formed JSON and ready to be executed. No allowlist is used; the parsed pipeline is passed directly to the aggo executor. This direct approach ensures flexibility.
- Execution: The existing aggo engine executes the pipeline, either in bounded or streaming mode, based on w.mode.
- UIDSL Parsing: The ui string (UIDSL) is parsed to create an Abstract Syntax Tree (AST). A custom scanner is used for this, performing a single pass. This step translates the UIDSL into a format that can be rendered.
- Compilation to Ink: The AST is then compiled into Ink components. Ink is a library for building interactive command-line applications.
  - g (grid) becomes <Box flexDirection> + gap.
  - tb (tabs) creates a title header and active pane.
  - t (table) uses fast pre-render strings for efficient rendering.
  - li (list) uses template rows with cached JSONPath lookups.
  - tr (tree) renders pre-rendered unicode trees.
  - st, sk, br, and js are also compiled into Ink components.
- Rendering: The UI is rendered with specific considerations for different modes:
  - Bounded: Skeleton → final fill.
  - Streaming: Paint throttling (50–100ms), stable row keys, and pagination/tab input. This ensures a smooth and responsive user experience.
- Responsiveness: The system uses stdout.columns. The grid (g) flips R↔C when the terminal is narrow, and tables drop low-priority columns first. This ensures that the UI adapts to different screen sizes.
This process is designed to create a smooth and responsive user experience, no matter the size of the data or the complexity of the analysis.
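The responsiveness step is easy to picture with a bit of code. The sketch below is only an illustration under stated assumptions: the 80-column breakpoint is made up, the flip is shown in one direction only (row to column when narrow), and "low priority" is taken to mean "declared later", which the spec does not actually say. The names effectiveDirection and fitColumns are hypothetical.

```typescript
// All names and thresholds here are illustrative assumptions.
type Direction = "R" | "C";

interface Col {
  header: string;
  width: number;
}

const NARROW_COLS = 80; // hypothetical breakpoint

// Flip a row-oriented grid to a column layout when the terminal is narrow.
function effectiveDirection(requested: Direction, termCols: number): Direction {
  return requested === "R" && termCols < NARROW_COLS ? "C" : requested;
}

// Keep columns in declaration order until the width budget is used up.
// Assumption: columns declared later are lower priority.
function fitColumns(cols: Col[], termCols: number, gap: number = 1): Col[] {
  const kept: Col[] = [];
  let used = 0;
  for (const col of cols) {
    const next = used + col.width + (kept.length > 0 ? gap : 0);
    if (next > termCols) break;
    kept.push(col);
    used = next;
  }
  // Always show at least the first column, even if it overflows.
  return kept.length > 0 ? kept : cols.slice(0, 1);
}

const tableCols: Col[] = [
  { header: "Endpoint", width: 40 },
  { header: "p95", width: 8 },
  { header: "p99", width: 8 },
];
const visible = fitColumns(tableCols, 50); // keeps Endpoint and p95
const dir = effectiveDirection("R", 60);   // narrow terminal, so "C"
```

In the CLI itself, termCols would come from stdout.columns and the functions would be re-run on resize.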
Prompting: Guiding the LLM
The system relies on careful prompting to guide the LLM to generate the correct outputs. The SYSTEM_PROMPT is crucial for this. The core of the prompt includes these elements:
- Output Format: The LLM must output the exact structure: q (Mongo pipeline JSON string), ui (UIDSL string), and an optional w object.
- q Requirements: The q string must be a valid MongoDB aggregation pipeline JSON array. This ensures that the pipeline can be executed without errors.
- ui Requirements: The ui string must follow UIDSL v1 and be concise. This ensures that the UI layout is well-defined and efficient.
- Allowed UIDSL Types and Props: Specifies the allowed UIDSL types (like g, tb, t, li, tr, st, sk, br, js) and their properties (like i, f, dr, gp, ti, c, s, pg, lb, v, u, x, y, st). This restricts the options and ensures consistency.
- JSONPath-lite: Paths use JSONPath-lite ($.a.b) for accessing data within the JSON documents.
- No Extra Content: No extra prose, HTML, or JavaScript is allowed. This keeps the output clean and easy to parse.
- Few-shot Examples: Includes 3–5 few-shot pairs to guide the LLM. These examples demonstrate the desired output format and style.
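Since JSONPath-lite paths show up in almost every prop, here is a small sketch of how a path like $.a.b might be resolved against a document. It assumes that "lite" means just $ for the root plus dot-separated keys (no filters, wildcards, or array slices), which this post does not spell out. The compilePath and getPath names and the cache are hypothetical; the cache mirrors the "cached JSONPath lookups" idea mentioned for lists.

```typescript
// Hypothetical helpers; assume JSONPath-lite is "$" plus dot-separated keys only.
type PathFn = (doc: unknown) => unknown;

function compilePath(path: string): PathFn {
  if (path !== "$" && path.indexOf("$.") !== 0) {
    throw new Error(`Unsupported path: ${path}`);
  }
  const keys = path === "$" ? [] : path.slice(2).split(".");
  return (doc: unknown): unknown => {
    let cur: unknown = doc;
    for (const key of keys) {
      // Missing or non-object intermediate values resolve to undefined.
      if (cur === null || typeof cur !== "object") return undefined;
      cur = (cur as Record<string, unknown>)[key];
    }
    return cur;
  };
}

// Cache compiled paths so each path string is split only once.
const pathCache = new Map<string, PathFn>();

function getPath(doc: unknown, path: string): unknown {
  let fn = pathCache.get(path);
  if (fn === undefined) {
    fn = compilePath(path);
    pathCache.set(path, fn);
  }
  return fn(doc);
}

const sample = { endpoint: "/api/users", lat: { p95: 182 } };
const p95 = getPath(sample, "$.lat.p95");      // 182
const missing = getPath(sample, "$.meta.rpm"); // undefined
```

Returning undefined for missing keys (rather than throwing) lets a renderer show an empty cell instead of failing the whole row.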
Guardrails and Fallbacks: Handling Errors Gracefully
To make sure the system is robust, there are several guardrails and fallbacks in place to handle potential errors. This ensures the user experience remains positive even when something goes wrong.
- q Parsing Failure: If JSON.parse fails on q, the system attempts a light fix (e.g., missing brackets or quotes). If that fails, it displays an inline error panel and halts execution. The raw JSON string is available behind a toggle for debugging.
- Aggo Executor Rejection: If the aggo executor rejects the pipeline, the error is captured and displayed in an inline error panel, showing the title, stage index, and message. There is no need for an allowlist.
- ui Parsing Failure: If the ui parsing fails, the system falls back to either ui:v1;js(f=$) or t(f=$,c=_id:$._id|val:$.value,pg=20). This provides a default UI to prevent the system from crashing.
- Other Considerations: The system respects NO_COLOR, caps the page size, truncates cells, and never crashes on unknown UIDSL keys, ignoring and warning inline. These precautions ensure that the system functions smoothly in various environments.
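The q and ui fallbacks above could be wired up roughly like this. Everything here is a sketch: the PipelineResult shape, the function names, and especially the "light fix" (which only adds a missing outer bracket) are assumptions, since the spec only says "missing brackets or quotes". The UIDSL parser is passed in as a parameter so the sketch does not depend on any particular implementation.

```typescript
// Hypothetical result type; the inline error panel would render the failure branch.
type PipelineResult =
  | { ok: true; stages: object[] }
  | { ok: false; title: string; message: string; raw: string };

// Returns the stages if text is a JSON array of plain objects, else null.
function tryParseStages(text: string): object[] | null {
  try {
    const value: unknown = JSON.parse(text);
    const isStages =
      Array.isArray(value) &&
      value.every((s) => typeof s === "object" && s !== null && !Array.isArray(s));
    return isStages ? (value as object[]) : null;
  } catch {
    return null;
  }
}

// Assumed "light fix": only add a missing outer "[" or "]".
function lightFix(text: string): string {
  let fixed = text.trim();
  if (fixed.charAt(0) !== "[") fixed = "[" + fixed;
  if (fixed.charAt(fixed.length - 1) !== "]") fixed = fixed + "]";
  return fixed;
}

function parsePipeline(q: string): PipelineResult {
  const stages = tryParseStages(q) ?? tryParseStages(lightFix(q));
  if (stages !== null) return { ok: true, stages };
  return {
    ok: false,
    title: "Invalid pipeline JSON",
    message: "q is not a JSON array of stage objects",
    raw: q, // kept so the raw string can be shown behind a toggle
  };
}

const FALLBACK_UI = "ui:v1;js(f=$)";

// parse is whatever UIDSL parser is in use; it is expected to throw on bad input.
function uiOrFallback<T>(ui: string, parse: (s: string) => T): { ast: T; usedFallback: boolean } {
  try {
    return { ast: parse(ui), usedFallback: false };
  } catch {
    return { ast: parse(FALLBACK_UI), usedFallback: true };
  }
}

const good = parsePipeline('[{"$match":{"status":500}}]');
const repaired = parsePipeline('{"$match":{"status":500}}, {"$limit":10}');
const broken = parsePipeline("top errors please");
```

In this sketch good and repaired come back with ok set to true, while broken carries the title, message, and raw string an error panel would need.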
Streaming Semantics: Real-Time Data Updates
The system is designed to work seamlessly with streaming data. Here's how it hooks into the existing engine:
- w.mode='u': Incremental updates are enabled using emitMs (default: 100ms). This means the TUI updates in real time as new data arrives.
- Stable Row Keys: The system uses stable row keys from _id or the primary key. If neither is available, it uses an index. This ensures that the UI elements remain consistent during updates.
- Memory Bounding: Memory is bounded by a ring size, and aggregates are kept separately for st (stats). This helps manage memory usage effectively during streaming.
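Here is one way the stable-key and ring-size ideas could fit together. This is a sketch under assumptions, not the engine's actual data structure: the RowRing class, its capacity and pk parameters, and the seen counter (standing in for an aggregate kept outside the ring) are all hypothetical.

```typescript
// Hypothetical bounded row store for streaming mode.
type Row = Record<string, unknown>;

class RowRing {
  private rows = new Map<string, Row>(); // Map preserves insertion order
  private nextIndex = 0;
  private capacity: number;
  private pk: string | undefined;
  seen = 0; // running aggregate kept outside the ring, so it survives eviction

  constructor(capacity: number, pk?: string) {
    this.capacity = capacity;
    this.pk = pk;
  }

  // Prefer _id, then the configured primary key, then a running index.
  private keyOf(row: Row): string {
    const id = row["_id"];
    if (id !== undefined && id !== null) return `id:${String(id)}`;
    if (this.pk !== undefined && row[this.pk] !== undefined) {
      return `pk:${String(row[this.pk])}`;
    }
    return `ix:${this.nextIndex++}`;
  }

  upsert(row: Row): string {
    this.seen++;
    const key = this.keyOf(row);
    // A known key is updated in place; a new key may evict the oldest row.
    if (!this.rows.has(key) && this.rows.size >= this.capacity) {
      const oldest = this.rows.keys().next().value;
      if (oldest !== undefined) this.rows.delete(oldest);
    }
    this.rows.set(key, row);
    return key;
  }

  snapshot(): Array<{ key: string; row: Row }> {
    return Array.from(this.rows, ([key, row]) => ({ key, row }));
  }
}

const ring = new RowRing(2, "svc");
ring.upsert({ _id: "a", v: 1 });
ring.upsert({ svc: "api", v: 2 });
ring.upsert({ _id: "a", v: 3 }); // same key: updated in place, nothing evicted
ring.upsert({ v: 4 });           // new key at capacity: evicts the oldest row
const keys = ring.snapshot().map((entry) => entry.key);
```

Because an update reuses the existing key, a renderer keyed on it can repaint one row instead of re-mounting the whole table.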
Deliverables: The Building Blocks
The project deliverables include a variety of components:
- planner/plan.zod.ts: The schema defined earlier.
- uidsl/parser.ts and uidsl/compiler.ts: These are the components that parse and compile the UIDSL into Ink components.
- ui/renderers/*: Renderers for different UI elements (table, list, tree, stat, sparkline, bar, json).
- engine/run_pipeline.ts: This component accepts the MongoDB pipeline JSON string and calls the aggo executor.
- prompt/system.md: The file containing the strict rules and few-shot examples for prompting the LLM.
- tests/: Various tests including:
  - Pipeline string parsing and execution.
  - UIDSL parser and renderer snapshots.
  - Streaming performance and responsiveness tests.
  - Fallback tests for error handling.
Acceptance Criteria: Ensuring Quality
The project's acceptance criteria define the standards for quality. Here's what needs to be achieved:
- Token Usage: Typical responses should be less than or equal to 1k tokens total (both strings). This ensures that the output is compact.
- Execution and Rendering: Valid pipelines must execute, and the UIDSL must render with a stable FPS during streaming. This ensures that the system works correctly.
- Performance (Bounded): Bounded top-10 table renders in less than 50ms post-data. This ensures quick response times.
- Performance (Streaming): The system should achieve approximately 10–20 fps at 100k ev/s synthetic data with no major GC pauses. This ensures a smooth user experience.
- Error Handling: On malformed q or aggo rejection, the user sees a clear inline error panel. The UI remains interactive (tabs/pagination still work). This maintains usability even when errors occur.
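The 10–20 fps streaming target lines up with the 50–100ms paint throttling mentioned in the flow: one paint every 50–100ms works out to 10–20 paints per second. A minimal sketch of such a throttle follows. The PaintThrottle class is hypothetical, and the clock is passed in explicitly so the behavior is easy to test; a real renderer would likely use timers and the current time instead.

```typescript
// Hypothetical paint throttle: many data updates, at most one paint per interval.
class PaintThrottle<T> {
  private lastPaint = Number.NEGATIVE_INFINITY;
  private pending: T | undefined;
  private hasPending = false;
  private intervalMs: number;
  private paint: (state: T) => void;

  constructor(intervalMs: number, paint: (state: T) => void) {
    this.intervalMs = intervalMs;
    this.paint = paint;
  }

  // Call on every update. Paints right away if the interval has elapsed,
  // otherwise remembers only the latest state.
  push(state: T, nowMs: number): void {
    if (nowMs - this.lastPaint >= this.intervalMs) {
      this.lastPaint = nowMs;
      this.hasPending = false;
      this.paint(state);
    } else {
      this.pending = state;
      this.hasPending = true;
    }
  }

  // Call from a periodic tick to paint the latest deferred state, if any.
  flush(nowMs: number): void {
    if (this.hasPending && nowMs - this.lastPaint >= this.intervalMs) {
      this.lastPaint = nowMs;
      this.hasPending = false;
      this.paint(this.pending as T);
    }
  }
}

const painted: number[] = [];
const throttle = new PaintThrottle<number>(100, (n) => {
  painted.push(n);
});
throttle.push(1, 0);   // first update paints immediately
throttle.push(2, 30);  // deferred
throttle.push(3, 60);  // deferred, replaces 2
throttle.flush(100);   // interval elapsed: paints 3
throttle.push(4, 120); // deferred again (only 20ms since the last paint)
```

Intermediate states are dropped on purpose: at high event rates only the latest state is worth drawing.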
Notes: Key Takeaways
- Mongo Flexibility: The project retains the full flexibility of MongoDB through aggo, capitalizing on its strengths. Token savings are achieved through the UIDSL.
- UIDSL Versioning: UIDSL is versioned (ui:v1;…) so components can evolve without breaking prompts. This ensures the system can adapt to new requirements without disrupting existing functionality.
This project aims to deliver a powerful and flexible system for ad-hoc analytics, built for efficiency, responsiveness, and a great user experience. The combination of natural language input, MongoDB pipelines, and an interactive TUI makes it a handy tool for anyone who needs to analyze data in real time.