6 changes: 6 additions & 0 deletions CHANGELOG.md
Expand Up @@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added

- **Cost firewall — estimate a query's scan cost before it runs and confirm before going over budget.** A new opt-in `governance` config (`max_query_cost_usd`, `max_bytes_scanned`, `cost_per_tib_usd`) makes `sql_execute` consult a pre-execution estimate and prompt via the `sql_execute_cost` permission when a query would exceed the configured budget, with a hint to run `sql_optimize` first. Estimation uses a new optional `Connector.estimateCost()` capability — implemented for BigQuery (dry-run, exact bytes processed) and Snowflake (`EXPLAIN`, planner-estimated bytes; note Snowflake bills by credits so its USD figure is a proxy) — surfaced through the `sql.estimate_cost` dispatcher method and a standalone `sql_cost_estimate` tool. The configured budget is shown in the `/status` panel. Disabled by default; warehouses without estimation support skip the guard, so it never blocks work it can't price. (#906)

## [0.8.4] - 2026-06-05

A trace-durability patch. Open `/traces` mid-session and you'd see a rich waterfall — then the moment the agent finished its turn the view collapsed to a single "system-prompt" span, the Summary tab's *"What was asked"* showed *"No prompt recorded"*, and the Chat tab dropped every user turn but the last. The data was genuinely gone from disk, not just hidden in the viewer. This release stops the on-disk trace from being overwritten after each turn and makes the file authoritative across worker restarts. A five-persona pre-release review drove a follow-up wording fix so a reconstructed trace isn't misread as a failed run.
33 changes: 31 additions & 2 deletions docs/docs/configure/governance.md
Expand Up @@ -8,7 +8,7 @@ Task-scoped permissions aren't just about safety — they're about **focus**. Wh

There's an audit angle too. In regulated industries, prescribed tooling eliminates unnecessary audit cycles. When your tools generate SQL the same way every time, auditors can verify consistency. Change the SQL — even if the results are conceptually identical — and you trigger an investigation to prove equivalence. Deterministic tooling removes that overhead entirely.

Altimate Code enforces governance at the **harness level**, not via prompt instructions the model can ignore. Four mechanisms work together:
Altimate Code enforces governance at the **harness level**, not via prompt instructions the model can ignore. Five mechanisms work together:

## Rules

Expand All @@ -34,6 +34,35 @@ Every file edit is auto-formatted before it's written. This isn't optional consi

[:octicons-arrow-right-24: Formatters reference](formatters.md)

## Cost Firewall

An agent can write a `SELECT` that scans terabytes and runs up a real warehouse bill before anyone notices. The cost firewall estimates a query's scan cost **before** it runs and asks for confirmation when it exceeds a budget you set — turning a surprise bill into an approve-or-optimize decision.

It's **off by default**. Set a threshold under the top-level `governance` config to enable it:

```json
{
  "governance": {
    "max_query_cost_usd": 1.0,
    "max_bytes_scanned": 53687091200,
    "cost_per_tib_usd": 6.25
  }
}
```

- `max_query_cost_usd` — prompt before running a query whose estimated cost exceeds this many USD.
- `max_bytes_scanned` — prompt before running a query estimated to scan more than this many bytes.
- `cost_per_tib_usd` — price per TiB scanned, used to convert estimated bytes to cost (default `6.25`).

When a query is over budget, `sql_execute` prompts via the `sql_execute_cost` permission with the estimate and a hint to run `sql_optimize` first. The standalone `sql_cost_estimate` tool also reports an estimate on demand without running anything.

Estimates require a warehouse that supports cheap pre-flight estimation:

- **BigQuery** — via a dry-run (exact bytes processed, no execution and no charge).
- **Snowflake** — via `EXPLAIN`, which compiles the query and returns the planner's estimated bytes to scan without resuming a warehouse. Note that Snowflake bills by warehouse **credits** (compute time), not bytes, so the dollar figure is an approximate proxy — prefer `max_bytes_scanned` as the meaningful threshold for Snowflake.

Warehouses without estimation support skip the guard, so the firewall never blocks legitimate work it can't price.
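
As a rough illustration of how the three settings interact, the sketch below converts an estimated byte count to USD and compares it against both thresholds. The `checkBudget` helper and its types are hypothetical names for this example, not the project's actual implementation, and the strict "greater than" comparison is an assumption based on the word "exceeds" in the option descriptions.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the real harness code.
const TIB_BYTES = 1099511627776 // bytes in one TiB (2^40)

interface GovernanceBudget {
  max_query_cost_usd?: number
  max_bytes_scanned?: number
  cost_per_tib_usd?: number
}

interface BudgetDecision {
  overBudget: boolean
  estimatedCostUsd?: number
  reasons: string[]
}

function checkBudget(bytesScanned: number | undefined, budget: GovernanceBudget): BudgetDecision {
  // No usable estimate means there is nothing to compare, so the guard is skipped.
  if (bytesScanned === undefined || !Number.isFinite(bytesScanned) || bytesScanned < 0) {
    return { overBudget: false, reasons: [] }
  }
  const rate = budget.cost_per_tib_usd ?? 6.25 // documented default
  const estimatedCostUsd = (bytesScanned / TIB_BYTES) * rate
  const reasons: string[] = []
  if (budget.max_bytes_scanned !== undefined && bytesScanned > budget.max_bytes_scanned) {
    reasons.push("bytes")
  }
  if (budget.max_query_cost_usd !== undefined && estimatedCostUsd > budget.max_query_cost_usd) {
    reasons.push("cost")
  }
  return { overBudget: reasons.length > 0, estimatedCostUsd, reasons }
}

const sampleBudget: GovernanceBudget = {
  max_query_cost_usd: 1.0,
  max_bytes_scanned: 53687091200,
  cost_per_tib_usd: 6.25,
}
const fiftyGiB = checkBudget(53687091200, sampleBudget)
const oneTiB = checkBudget(TIB_BYTES, sampleBudget)
```

Under these assumptions, a 50 GiB scan (53687091200 bytes) prices at roughly $0.31 and stays within both limits, while a 1 TiB scan prices at $6.25 and trips both.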

---

Together, these four mechanisms mean governance is not an afterthought — it's built into every agent interaction. The harness enforces the rules so your team doesn't have to police the output.
Together, these five mechanisms mean governance is not an afterthought — it's built into every agent interaction. The harness enforces the rules so your team doesn't have to police the output.
27 changes: 26 additions & 1 deletion packages/drivers/src/bigquery.ts
Expand Up @@ -2,7 +2,7 @@
* BigQuery driver using the `@google-cloud/bigquery` package.
*/

import type { ConnectionConfig, Connector, ConnectorResult, ExecuteOptions, SchemaColumn } from "./types"
import type { ConnectionConfig, Connector, ConnectorResult, CostEstimate, ExecuteOptions, SchemaColumn } from "./types"

export async function connect(config: ConnectionConfig): Promise<Connector> {
let BigQueryModule: any
Expand Down Expand Up @@ -71,6 +71,31 @@ export async function connect(config: ConnectionConfig): Promise<Connector> {
}
},

// Estimate scan cost via a BigQuery dry-run. The dry-run validates and
// plans the query server-side and returns the exact bytes it would

[LOW · web-researcher] BigQuery dry-run implementation is aligned with official Google Cloud best practices for zero-cost query estimation.

💡 Suggestion: Add a comment in the code linking to the official Google Cloud dry-run documentation for maintainability.

Confidence: 95/100

// process, without running it or incurring charges. Of the estimators
// implemented so far, this is the most accurate pre-flight figure.
async estimateCost(sql: string): Promise<CostEstimate> {
const query = sql.replace(/;\s*$/, "")
const options: Record<string, unknown> = { query, dryRun: true }

[LOW · web-researcher] PR uses @google-cloud/bigquery v4.12.0+ compatible dry-run response parsing with null safety, matching the library's updated schema.

💡 Suggestion: Pin the @google-cloud/bigquery dependency version in package.json to v4.12.0 or higher to ensure compatibility.

Confidence: 88/100

if (config.dataset) {
options.defaultDataset = {
datasetId: config.dataset,
projectId: config.project,
}
}
const [job] = await client.createQueryJob(options)
const stats = job.metadata?.statistics ?? {}
// BigQuery reports total bytes processed at the statistics root and,
// redundantly, under statistics.query — prefer whichever is present.
const raw = stats.totalBytesProcessed ?? stats.query?.totalBytesProcessed
const bytesScanned = raw != null ? Number(raw) : undefined
Comment on lines +91 to +92

[🔵 LOW] According to the code quality guidelines, using == and != is prohibited. Please use strict equality checks (!== null && !== undefined) instead.

Suggested change:

const raw = stats.totalBytesProcessed ?? stats.query?.totalBytesProcessed
const bytesScanned = raw != null ? Number(raw) : undefined
const raw = stats.totalBytesProcessed ?? stats.query?.totalBytesProcessed
const bytesScanned = raw !== null && raw !== undefined ? Number(raw) : undefined

return {
bytesScanned: Number.isFinite(bytesScanned) ? bytesScanned : undefined,
note: "BigQuery dry-run (exact bytes processed)",
}
},

async listSchemas(): Promise<string[]> {
const [datasets] = await client.getDatasets()
return datasets.map((ds: any) => ds.id as string)
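
The null-check discussion in the review comments above can be sidestepped with a small normalizer. The sketch below is a hypothetical helper (`toBytes` is not part of the PR) that assumes the dry-run statistic may arrive either as a number or as a numeric string, and that anything else should be treated as "no estimate".

```typescript
// Hypothetical helper, not part of the PR: normalizes a raw byte statistic.
function toBytes(raw: unknown): number | undefined {
  // Strict checks replace the loose `raw != null` comparison.
  if (raw === null || raw === undefined) return undefined
  if (typeof raw !== "number" && typeof raw !== "string") return undefined
  // Number("") is 0, which would look like a free query, so reject blanks first.
  if (typeof raw === "string" && raw.trim() === "") return undefined
  const n = Number(raw)
  return Number.isFinite(n) && n >= 0 ? n : undefined
}

const fromString = toBytes("1048576")
const fromMissing = toBytes(undefined)
const fromGarbage = toBytes("not-a-number")
```

Returning `undefined` for every unusable input keeps the result aligned with the optional `bytesScanned` field of `CostEstimate`.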
2 changes: 1 addition & 1 deletion packages/drivers/src/index.ts
@@ -1,5 +1,5 @@
// Re-export types
export type { Connector, ConnectorResult, SchemaColumn, ConnectionConfig } from "./types"
export type { Connector, ConnectorResult, SchemaColumn, ConnectionConfig, CostEstimate } from "./types"

// Re-export config normalization
export { normalizeConfig, sanitizeConnectionString } from "./normalize"
34 changes: 33 additions & 1 deletion packages/drivers/src/snowflake.ts
Expand Up @@ -3,7 +3,7 @@
*/

import * as fs from "fs"
import type { ConnectionConfig, Connector, ConnectorResult, ExecuteOptions, SchemaColumn } from "./types"
import type { ConnectionConfig, Connector, ConnectorResult, CostEstimate, ExecuteOptions, SchemaColumn } from "./types"

export async function connect(config: ConnectionConfig): Promise<Connector> {
let snowflake: any
Expand Down Expand Up @@ -258,6 +258,38 @@ export async function connect(config: ConnectionConfig): Promise<Connector> {
}
},

// Estimate scan cost via `EXPLAIN USING JSON`, which compiles the query
// and returns the planner's estimated bytes/partitions to scan WITHOUT
// executing it or resuming a warehouse (compilation is metadata-only).
//
// Caveat surfaced in `note`: Snowflake bills by warehouse *credits*
// (compute time), not bytes scanned, so the bytes figure is a planner
// estimate of scan volume and the derived USD is only a rough proxy. The
// `max_bytes_scanned` guard is the meaningful threshold for Snowflake.
async estimateCost(sql: string): Promise<CostEstimate> {
const query = sql.replace(/;\s*$/, "")
const explain = await executeQuery(`EXPLAIN USING JSON ${query}`)
const raw = explain.rows?.[0]?.[0]
if (raw == null) {

[🔵 LOW] According to the review checklist, using == is prohibited. Please use strict equality === or check for both null and undefined.

Suggested change:

if (raw == null) {
if (raw === null || raw === undefined) {

return { note: "Snowflake EXPLAIN returned no plan; bytes estimate unavailable" }
}
// EXPLAIN USING JSON yields one VARIANT cell — a JSON string via the
// Node SDK, or an already-parsed object depending on the driver version.
let plan: any
try {
plan = typeof raw === "string" ? JSON.parse(raw) : raw
} catch {
return { note: "Snowflake EXPLAIN plan was not parseable JSON" }
}
const globalStats = plan?.GlobalStats ?? {}
const assigned = globalStats.bytesAssigned
const bytesScanned = assigned != null ? Number(assigned) : undefined

[🔵 LOW] According to the review checklist, using != is prohibited. Please use strict equality !== or check for both null and undefined.

Suggested change:

const bytesScanned = assigned != null ? Number(assigned) : undefined
const bytesScanned = assigned !== null && assigned !== undefined ? Number(assigned) : undefined

return {
bytesScanned: Number.isFinite(bytesScanned) ? bytesScanned : undefined,
note: "Snowflake EXPLAIN estimate — bytes scanned (Snowflake bills by warehouse credits, so USD is a rough proxy)",
}
},

async listSchemas(): Promise<string[]> {
const result = await executeQuery("SHOW SCHEMAS")
// SHOW SCHEMAS returns rows with a "name" column
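
The plan-parsing path in `estimateCost` above can be exercised in isolation. The following is a minimal sketch, assuming the plan cell is either a JSON string or an already-parsed object with a `GlobalStats.bytesAssigned` field as in the diff; the function name `bytesFromExplainCell` and the sample values are made up for illustration.

```typescript
// Hypothetical standalone version of the parsing step; sample plans are invented.
interface PlanEstimate {
  bytesScanned?: number
  note: string
}

function bytesFromExplainCell(raw: unknown): PlanEstimate {
  if (raw === null || raw === undefined) {
    return { note: "no plan returned" }
  }
  let plan: unknown
  try {
    // The cell may be a JSON string or an already-parsed object.
    plan = typeof raw === "string" ? JSON.parse(raw) : raw
  } catch {
    return { note: "plan was not parseable JSON" }
  }
  const stats = (plan as { GlobalStats?: { bytesAssigned?: unknown } } | null)?.GlobalStats
  const assigned = stats?.bytesAssigned
  if (assigned === null || assigned === undefined) {
    return { note: "plan had no bytesAssigned" }
  }
  const n = Number(assigned)
  return Number.isFinite(n) ? { bytesScanned: n, note: "planner estimate" } : { note: "bytesAssigned was not numeric" }
}

const fromJsonString = bytesFromExplainCell('{"GlobalStats":{"bytesAssigned":2048}}')
const fromObject = bytesFromExplainCell({ GlobalStats: { bytesAssigned: 4096 } })
const fromBadJson = bytesFromExplainCell("EXPLAIN failed")
```

Every failure path returns a note without `bytesScanned`, so a caller can tell "no estimate" apart from an estimate of zero bytes.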
19 changes: 19 additions & 0 deletions packages/drivers/src/types.ts
Expand Up @@ -26,11 +26,30 @@ export interface ExecuteOptions {
noLimit?: boolean
}

/**
* Pre-execution cost/scan estimate for a query, produced without running it
* (e.g. BigQuery dry-run, warehouse EXPLAIN). Powers the cost firewall in
* sql_execute. All fields are optional because estimation accuracy varies by
* warehouse — a connector returns only what it can determine cheaply.
*/
export interface CostEstimate {
/** Estimated bytes the query will scan/process. */
bytesScanned?: number
/** Free-form note about estimation method or caveats (e.g. "BigQuery dry-run"). */
note?: string
}

export interface Connector {
connect(): Promise<void>
execute(sql: string, limit?: number, binds?: any[], options?: ExecuteOptions): Promise<ConnectorResult>
listSchemas(): Promise<string[]>
listTables(schema: string): Promise<Array<{ name: string; type: string }>>
describeTable(schema: string, table: string): Promise<SchemaColumn[]>
close(): Promise<void>
/**
* Optionally estimate a query's scan cost without executing it. Connectors
* that cannot estimate cheaply omit this method; callers must treat it as
* "estimation unsupported" and proceed without a guard.
*/
estimateCost?(sql: string): Promise<CostEstimate>
}
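
Because `estimateCost` is optional, callers need a capability check before invoking it. The sketch below shows one way a caller might do that; `tryEstimate` and the trimmed-down `EstimatingConnector` interface are hypothetical, and treating a thrown error the same as "unsupported" is an assumption about fail-open behaviour rather than something the PR mandates.

```typescript
// Hypothetical caller-side sketch; only the optional-method shape comes from the diff.
interface CostEstimate {
  bytesScanned?: number
  note?: string
}

interface EstimatingConnector {
  estimateCost?(sql: string): Promise<CostEstimate>
}

async function tryEstimate(connector: EstimatingConnector, sql: string): Promise<CostEstimate | undefined> {
  // A connector that omits the method does not support cheap estimation.
  if (typeof connector.estimateCost !== "function") return undefined
  try {
    return await connector.estimateCost(sql)
  } catch {
    // Assumption: an estimation failure is treated like "unsupported".
    return undefined
  }
}

const plainConnector: EstimatingConnector = {}
const estimatingConnector: EstimatingConnector = {
  estimateCost: async () => ({ bytesScanned: 10, note: "stub" }),
}
```

A result of `undefined` then maps naturally onto "skip the guard", while a defined estimate can be passed on to a budget check.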
10 changes: 10 additions & 0 deletions packages/opencode/src/agent/agent.ts
Expand Up @@ -153,6 +153,11 @@ export namespace Agent {
question: "allow",
plan_enter: "allow",
sql_execute_write: "ask",
// altimate_change start — cost firewall: must be "ask" so the guard
// prompts; the inherited `"*": "allow"` default would otherwise
// silently approve over-budget queries.
sql_execute_cost: "ask",
// altimate_change end
}),
userWithSafety,
),
Expand All @@ -170,6 +175,7 @@ export namespace Agent {
"*": "deny",
// SQL read tools
sql_execute: "allow",
sql_cost_estimate: "allow",
altimate_core_validate: "allow",
sql_analyze: "allow",
sql_translate: "allow",
Expand All @@ -182,6 +188,10 @@ export namespace Agent {
sql_diff: "allow",
// SQL writes denied
sql_execute_write: "deny",
// altimate_change start — cost firewall: prompt (not hard-deny) when a
// query exceeds the configured cost budget, so the analyst can approve.
sql_execute_cost: "ask",
// altimate_change end
// Warehouse/schema/finops
warehouse_list: "allow",
warehouse_test: "allow",
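
The comments in this hunk explain why `sql_execute_cost` has to be set explicitly: a wildcard `"*": "allow"` would otherwise approve it. The toy resolver below illustrates that reasoning only. It assumes an "exact key first, then wildcard" lookup, which may not match how the real permission system merges or matches rules; `resolvePermission` is a hypothetical name.

```typescript
// Toy model of permission lookup; the real harness may resolve rules differently.
type Action = "allow" | "ask" | "deny"
type Ruleset = { [permission: string]: Action | undefined }

function resolvePermission(rules: Ruleset, permission: string): Action {
  // Assumed order: an exact entry wins, then the "*" wildcard, then a cautious default.
  return rules[permission] ?? rules["*"] ?? "ask"
}

const wildcardOnly: Ruleset = { "*": "allow" }
const withCostGuard: Ruleset = { "*": "allow", sql_execute_cost: "ask" }

const silentlyApproved = resolvePermission(wildcardOnly, "sql_execute_cost")
const prompts = resolvePermission(withCostGuard, "sql_execute_cost")
```

In this model the wildcard-only ruleset resolves the cost permission to `"allow"`, which is the silent approval the explicit `"ask"` entry is meant to prevent.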
1 change: 1 addition & 0 deletions packages/opencode/src/altimate/index.ts
Expand Up @@ -60,6 +60,7 @@ export * from "./tools/sql-analyze"
export * from "./tools/sql-autocomplete"
export * from "./tools/sql-diff"
export * from "./tools/sql-execute"
export * from "./tools/sql-cost-estimate"
export * from "./tools/sql-explain"
export * from "./tools/sql-fix"
export * from "./tools/sql-format"
50 changes: 50 additions & 0 deletions packages/opencode/src/altimate/native/connections/register.ts
Expand Up @@ -14,6 +14,8 @@ import { runDataDiff } from "./data-diff"
import type {
SqlExecuteParams,
SqlExecuteResult,
SqlEstimateCostParams,
SqlEstimateCostResult,
SqlExplainParams,
SqlExplainResult,
SqlAutocompleteParams,
Expand Down Expand Up @@ -438,6 +440,54 @@ register("sql.execute", async (params: SqlExecuteParams): Promise<SqlExecuteResu
}
})

// --- sql.estimate_cost (cost firewall) ---
// Bytes in one TiB (2^40). Cost = bytes / TIB_BYTES * cost_per_tib_usd.
const TIB_BYTES = 1_099_511_627_776
// On-demand scan pricing defaults to BigQuery's published $6.25/TiB. Callers
// can override per warehouse via cost_per_tib_usd.
const DEFAULT_COST_PER_TIB_USD = 6.25

register("sql.estimate_cost", async (params: SqlEstimateCostParams): Promise<SqlEstimateCostResult> => {
const warehouseType = getWarehouseType(params.warehouse)
try {
// Resolve the connector the same way sql.execute does (named warehouse, else first).
let connector
if (params.warehouse) {
connector = await Registry.get(params.warehouse)
} else {
const warehouses = Registry.list().warehouses
if (warehouses.length === 0) {
return { supported: false, warehouse_type: warehouseType, error: "No warehouse configured." }
}
connector = await Registry.get(warehouses[0].name)
}

if (typeof connector.estimateCost !== "function") {
return {
supported: false,
warehouse_type: warehouseType,
note: `Cost estimation is not supported for warehouse type ${JSON.stringify(warehouseType)}.`,
}
}

const estimate = await connector.estimateCost(params.sql)
const costPerTib = params.cost_per_tib_usd ?? DEFAULT_COST_PER_TIB_USD

P1: Missing validation of cost_per_tib_usd and estimated_cost_usd allows NaN/negative costs to silently bypass the cost firewall guard.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/opencode/src/altimate/native/connections/register.ts, line 474:

<comment>Missing validation of `cost_per_tib_usd` and `estimated_cost_usd` allows NaN/negative costs to silently bypass the cost firewall guard.</comment>

<file context>
@@ -438,6 +440,54 @@ register("sql.execute", async (params: SqlExecuteParams): Promise<SqlExecuteResu
+    }
+
+    const estimate = await connector.estimateCost(params.sql)
+    const costPerTib = params.cost_per_tib_usd ?? DEFAULT_COST_PER_TIB_USD
+    const estimatedCost =
+      estimate.bytesScanned != null ? (estimate.bytesScanned / TIB_BYTES) * costPerTib : undefined
</file context>

const estimatedCost =
estimate.bytesScanned != null ? (estimate.bytesScanned / TIB_BYTES) * costPerTib : undefined
Comment on lines +473 to +476

[🟠 MEDIUM] The generic cost estimation logic uses a BigQuery-specific default value (DEFAULT_COST_PER_TIB_USD = 6.25). If a caller runs this for a different warehouse type (e.g., Snowflake) without explicitly providing cost_per_tib_usd, it will silently use BigQuery's pricing. This could lead to misleading estimations since other warehouses use entirely different pricing models (e.g., Snowflake bills by compute credits, not data scanned). Consider making the default cost driver-specific or applying this default only when warehouseType === 'bigquery'.

Suggested change:

const estimate = await connector.estimateCost(params.sql)
const costPerTib = params.cost_per_tib_usd ?? DEFAULT_COST_PER_TIB_USD
const estimatedCost =
estimate.bytesScanned != null ? (estimate.bytesScanned / TIB_BYTES) * costPerTib : undefined
const estimate = await connector.estimateCost(params.sql)
// Only apply BigQuery's default pricing if the warehouse type is BigQuery.
// Other warehouses should explicitly provide a cost factor or use driver-specific estimations.
const defaultCostPerTib = warehouseType === "bigquery" ? DEFAULT_COST_PER_TIB_USD : undefined;
const costPerTib = params.cost_per_tib_usd ?? defaultCostPerTib
const estimatedCost =
estimate.bytesScanned != null && costPerTib != null ? (estimate.bytesScanned / TIB_BYTES) * costPerTib : undefined

Comment on lines +474 to +476

[🔵 LOW] Using != is prohibited according to code quality checks. Please use strict checks (estimate.bytesScanned !== null && estimate.bytesScanned !== undefined), or simply !== undefined, depending on the exact types expected.

Suggested change:

const costPerTib = params.cost_per_tib_usd ?? DEFAULT_COST_PER_TIB_USD
const estimatedCost =
estimate.bytesScanned != null ? (estimate.bytesScanned / TIB_BYTES) * costPerTib : undefined
const costPerTib = params.cost_per_tib_usd ?? DEFAULT_COST_PER_TIB_USD
const estimatedCost =
estimate.bytesScanned !== undefined && estimate.bytesScanned !== null ? (estimate.bytesScanned / TIB_BYTES) * costPerTib : undefined


Comment on lines +474 to +477

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate estimator numeric inputs before computing cost

At lines 474-477, invalid numeric values (NaN/Infinity/negative) from either cost_per_tib_usd or estimate.bytesScanned can produce an invalid estimated_cost_usd, which causes downstream firewall comparisons to fail open unintentionally.

Suggested fix
-    const costPerTib = params.cost_per_tib_usd ?? DEFAULT_COST_PER_TIB_USD
-    const estimatedCost =
-      estimate.bytesScanned != null ? (estimate.bytesScanned / TIB_BYTES) * costPerTib : undefined
+    const rawCostPerTib = params.cost_per_tib_usd ?? DEFAULT_COST_PER_TIB_USD
+    const costPerTib =
+      Number.isFinite(rawCostPerTib) && rawCostPerTib > 0 ? rawCostPerTib : DEFAULT_COST_PER_TIB_USD
+
+    const bytesScanned =
+      typeof estimate.bytesScanned === "number" &&
+      Number.isFinite(estimate.bytesScanned) &&
+      estimate.bytesScanned >= 0
+        ? estimate.bytesScanned
+        : undefined
+
+    const estimatedCost =
+      bytesScanned != null ? (bytesScanned / TIB_BYTES) * costPerTib : undefined
@@
-      bytes_scanned: estimate.bytesScanned,
+      bytes_scanned: bytesScanned,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opencode/src/altimate/native/connections/register.ts` around lines
474 - 477, Validate numeric inputs before computing estimatedCost: ensure
costPerTib (from params.cost_per_tib_usd or DEFAULT_COST_PER_TIB_USD) and
estimate.bytesScanned are finite numbers and non-negative using
Number.isFinite(...) and >= 0 checks; if costPerTib is invalid fall back to
DEFAULT_COST_PER_TIB_USD (or treat as undefined) and only compute estimatedCost
when both costPerTib and estimate.bytesScanned are valid, otherwise set
estimatedCost to undefined so downstream firewall comparisons don't receive
NaN/Infinity/negative values (update the logic around costPerTib,
estimate.bytesScanned, and estimatedCost; keep references to costPerTib,
estimate.bytesScanned, estimatedCost, DEFAULT_COST_PER_TIB_USD and TIB_BYTES).

return {
supported: true,
warehouse_type: warehouseType,
bytes_scanned: estimate.bytesScanned,
estimated_cost_usd: estimatedCost,
cost_per_tib_usd: costPerTib,
note: estimate.note,
}
} catch (e) {
return { supported: false, warehouse_type: warehouseType, error: String(e) }
}
})
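
The validation concern raised in the review comments above comes down to how JavaScript compares `NaN`: any comparison with it is `false`, so a `NaN` cost never registers as over budget. The sketch below shows one possible guard, under the assumption that an invalid rate should fall back to the default and invalid bytes should yield no estimate; `estimateCostUsd` and `isNonNegativeFinite` are hypothetical names, not the PR's code.

```typescript
// Hypothetical guard around the bytes-to-USD conversion; not the PR's implementation.
const TIB_BYTES = 1099511627776 // 2^40
const DEFAULT_COST_PER_TIB_USD = 6.25

function isNonNegativeFinite(v: unknown): v is number {
  return typeof v === "number" && Number.isFinite(v) && v >= 0
}

function estimateCostUsd(bytesScanned: unknown, costPerTibUsd?: unknown): number | undefined {
  // Invalid bytes produce no estimate rather than a NaN that compares as "under budget".
  if (!isNonNegativeFinite(bytesScanned)) return undefined
  // Assumption: an invalid or non-positive rate falls back to the default.
  const rate = isNonNegativeFinite(costPerTibUsd) && costPerTibUsd > 0 ? costPerTibUsd : DEFAULT_COST_PER_TIB_USD
  return (bytesScanned / TIB_BYTES) * rate
}

// The failure mode being guarded against: NaN never exceeds a budget.
const nanExceedsBudget = Number.NaN > 1.0

const oneTiBDefaultRate = estimateCostUsd(TIB_BYTES)
const oneTiBBadRate = estimateCostUsd(TIB_BYTES, Number.NaN)
const badBytes = estimateCostUsd(Number.NaN, 6.25)
```

Whether a bad rate should fall back to the default or suppress the estimate entirely is a design choice; the fallback keeps the guard active, while suppression avoids pricing with a rate the user never configured.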

// --- sql.explain ---
register("sql.explain", async (params: SqlExplainParams): Promise<SqlExplainResult> => {
let warehouseName: string | undefined
22 changes: 22 additions & 0 deletions packages/opencode/src/altimate/native/types.ts
Expand Up @@ -19,6 +19,27 @@ export interface SqlExecuteResult {
error?: string
}

// --- SQL Cost Estimate (cost firewall) ---

export interface SqlEstimateCostParams {
sql: string
warehouse?: string
/** USD price per TiB scanned, used to convert bytes → cost. */
cost_per_tib_usd?: number
}

export interface SqlEstimateCostResult {
/** True when the resolved warehouse can estimate cost without executing. */
supported: boolean
warehouse_type: string
bytes_scanned?: number
estimated_cost_usd?: number
cost_per_tib_usd?: number
/** Estimation method or caveat (e.g. "BigQuery dry-run"). */
note?: string
error?: string
}

// --- SQL Analyze ---

export interface SqlAnalyzeParams {
Expand Down Expand Up @@ -1165,6 +1186,7 @@ export interface DataDiffResult {

export const BridgeMethods = {
"sql.execute": {} as { params: SqlExecuteParams; result: SqlExecuteResult },
"sql.estimate_cost": {} as { params: SqlEstimateCostParams; result: SqlEstimateCostResult },
"sql.analyze": {} as { params: SqlAnalyzeParams; result: SqlAnalyzeResult },
"sql.optimize": {} as { params: SqlOptimizeParams; result: SqlOptimizeResult },
"sql.translate": {} as { params: SqlTranslateParams; result: SqlTranslateResult },
70 changes: 70 additions & 0 deletions packages/opencode/src/altimate/tools/sql-cost-estimate.ts
@@ -0,0 +1,70 @@
import z from "zod"
import { Tool } from "../../tool/tool"
import { Dispatcher } from "../native"
import { Config } from "@/config/config"

/** Format a byte count as a human-readable string using 1024-based units (e.g. "4.20 GB"). */
export function formatBytes(bytes: number): string {
if (!Number.isFinite(bytes) || bytes < 0) return "unknown"
if (bytes === 0) return "0 B"
const units = ["B", "KB", "MB", "GB", "TB", "PB"]
const i = Math.min(Math.floor(Math.log(bytes) / Math.log(1024)), units.length - 1)
const value = bytes / Math.pow(1024, i)
return `${value.toFixed(i === 0 ? 0 : 2)} ${units[i]}`
}

/** Format a USD cost, using more precision for small values. */
export function formatCost(usd: number): string {
if (!Number.isFinite(usd)) return "unknown"
if (usd < 0.01) return `$${usd.toFixed(4)}`
return `$${usd.toFixed(2)}`
}

export const SqlCostEstimateTool = Tool.define("sql_cost_estimate", {
description:
"Estimate how much data a SQL query will scan and what it will cost, WITHOUT running it. Uses a BigQuery dry-run (exact bytes processed) or Snowflake EXPLAIN (planner-estimated bytes) where supported. Use this before running large analytical queries to avoid surprise warehouse bills. Returns 'estimation unsupported' for warehouses that cannot estimate cheaply.",
parameters: z.object({
query: z.string().describe("SQL query to estimate. Inline all values — bind placeholders are not supported."),
warehouse: z
.string()
.optional()
.describe("Warehouse connection name. Omit to use the first configured warehouse."),
}),
async execute(args, _ctx) {
const cfg = await Config.get().catch(() => ({}) as Awaited<ReturnType<typeof Config.get>>)
const costPerTib = cfg.governance?.cost_per_tib_usd
Comment on lines +34 to +35

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Silent fallback masks configuration failures and can misprice estimates.

Catching all Config.get() errors and defaulting to {} hides malformed governance config, then cost estimation silently uses dispatcher defaults. That can return a plausible but incorrect USD estimate with no signal to the caller.

Suggested adjustment
-    const cfg = await Config.get().catch(() => ({}) as Awaited<ReturnType<typeof Config.get>>)
-    const costPerTib = cfg.governance?.cost_per_tib_usd
+    let configLoadError: string | undefined
+    const cfg = await Config.get().catch((err) => {
+      configLoadError = String(err)
+      return {} as Awaited<ReturnType<typeof Config.get>>
+    })
+    const costPerTib = cfg.governance?.cost_per_tib_usd

Then include configLoadError in returned metadata/output so callers can see pricing may be fallback-based.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opencode/src/altimate/tools/sql-cost-estimate.ts` around lines 34 -
35, The current catch on Config.get() swallows all errors and lets cfg become {}
so costPerTib is silently undefined; change the logic around Config.get() (the
call to Config.get(), the cfg variable and where costPerTib is read) to capture
any thrown error into a local variable (e.g. configLoadError) instead of
discarding it, use the real cfg if present, and return/include configLoadError
in the function's returned metadata/output so callers can detect that pricing
used a fallback; ensure you still compute the estimate using dispatcher defaults
only when cfg is missing but surface configLoadError alongside the estimate.


const result = await Dispatcher.call("sql.estimate_cost", {
sql: args.query,
warehouse: args.warehouse,
cost_per_tib_usd: costPerTib,
})

if (!result.supported) {
const reason = result.error ?? result.note ?? "Cost estimation is not supported for this warehouse."
return {

[MEDIUM · web-researcher] PR uses integer bytes and fixed-rate conversion to USD to avoid floating-point precision errors in billing calculations, aligning with financial system best practices and avoiding CVE-2026-12345.

💡 Suggestion: Ensure all cost calculations use a decimal library (e.g., decimal.js) for monetary precision, even when inputs are integers.

Confidence: 95/100

title: "Cost estimate: unsupported",
metadata: { supported: false, warehouse_type: result.warehouse_type, error: result.error },
output: `Cost estimation unavailable for ${result.warehouse_type}: ${reason}`,
}
}

const lines: string[] = []
if (result.bytes_scanned != null) lines.push(`Bytes scanned (est.): ${formatBytes(result.bytes_scanned)}`)
if (result.estimated_cost_usd != null) {
lines.push(`Estimated cost: ${formatCost(result.estimated_cost_usd)} (at ${formatCost(result.cost_per_tib_usd ?? 0)}/TiB)`)

P2: Falling back to 0 when cost_per_tib_usd is missing can misleadingly show $0.0000/TiB (free scanning) when the rate is actually unknown. A per-unit price of $0 has a specific semantic meaning; it should not be used as a default for optional/missing values in user-facing cost output.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/opencode/src/altimate/tools/sql-cost-estimate.ts, line 55:

<comment>Falling back to 0 when `cost_per_tib_usd` is missing can misleadingly show `$0.0000/TiB` (free scanning) when the rate is actually unknown. A per-unit price of $0 has a specific semantic meaning; it should not be used as a default for optional/missing values in user-facing cost output.</comment>

<file context>
@@ -0,0 +1,70 @@
+    const lines: string[] = []
+    if (result.bytes_scanned != null) lines.push(`Bytes scanned (est.): ${formatBytes(result.bytes_scanned)}`)
+    if (result.estimated_cost_usd != null) {
+      lines.push(`Estimated cost:       ${formatCost(result.estimated_cost_usd)} (at ${formatCost(result.cost_per_tib_usd ?? 0)}/TiB)`)
+    }
+    if (result.note) lines.push(`Method:               ${result.note}`)
</file context>

}
Comment on lines +54 to +56

[🟠 MEDIUM] If cost_per_tib_usd is undefined or missing, this evaluates to 0, which causes the UI output to display at $0.0000/TiB. This could be misleading to users as it implies the scan is free rather than the rate being unknown.

Since formatCost checks !Number.isFinite(usd), falling back to NaN instead of 0 will properly output at unknown/TiB. Alternatively, you could update formatCost to accept number | undefined.

Suggested change:

if (result.estimated_cost_usd != null) {
lines.push(`Estimated cost: ${formatCost(result.estimated_cost_usd)} (at ${formatCost(result.cost_per_tib_usd ?? 0)}/TiB)`)
}
if (result.estimated_cost_usd != null) {
lines.push(`Estimated cost: ${formatCost(result.estimated_cost_usd)} (at ${formatCost(result.cost_per_tib_usd ?? NaN)}/TiB)`)
}

if (result.note) lines.push(`Method: ${result.note}`)

return {
title: `Cost estimate: ${result.estimated_cost_usd != null ? formatCost(result.estimated_cost_usd) : "n/a"}`,
metadata: {
supported: true,
warehouse_type: result.warehouse_type,
bytes_scanned: result.bytes_scanned,
estimated_cost_usd: result.estimated_cost_usd,
},
output: lines.join("\n") || "No estimate available.",
}
},
})
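
The `?? 0` fallback flagged in the review comments above can be avoided by formatting the rate separately from the cost. The sketch below reuses the `formatCost` logic from this file and adds a hypothetical `formatRate` helper that reports a missing rate as unknown instead of free; the helper name and the exact wording of the fallback label are assumptions.

```typescript
// formatCost mirrors the function in the diff; formatRate is a hypothetical addition.
function formatCost(usd: number): string {
  if (!Number.isFinite(usd)) return "unknown"
  if (usd < 0.01) return `$${usd.toFixed(4)}`
  return `$${usd.toFixed(2)}`
}

function formatRate(costPerTibUsd: number | undefined): string {
  // A missing rate is reported as unknown, never as a price of zero.
  if (costPerTibUsd === undefined) return "rate unknown"
  return `${formatCost(costPerTibUsd)}/TiB`
}

const zeroFallback = `${formatCost(0)}/TiB` // what the `?? 0` fallback would print
const missingRate = formatRate(undefined)
const knownRate = formatRate(6.25)
```

Keeping the "unknown" decision in one helper also means the title and the body of the tool output cannot disagree about whether a rate was available.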