v0.11.0

@github-actions github-actions released this 28 Jun 12:27

SQL over Vortex grows a compute layer: a Calcite WHERE-filtered SUM/COUNT/MIN/MAX is now answered from the zone-map statistics, folding the chunks the predicate fully selects without decoding them and decoding only the one or two chunks its range cuts through (ADR 0013 §6 / ADR 0018 boundary tier). On a 100-chunk file, a SELECT SUM(x) WHERE id BETWEEN … answers ~12× faster than a full scan when the range is wide. This release also hardens the untrusted-input parse paths (ADR 0003) and bumps the vortex.zstd binding.
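The boundary-tier idea can be illustrated with a small, self-contained sketch. It is not the Vortex or Calcite API: `ZoneMapSumSketch`, `Chunk`, `Result`, and `sumBetween` are hypothetical names, the "file" is an in-memory list of chunks, and SQL NULL handling is left out. Each chunk carries the min/max of the filter column and the sum of the aggregated column, so a `BETWEEN` filter either prunes a chunk, folds its precomputed sum, or decodes it row by row.

```java
import java.util.List;

/** Hypothetical in-memory stand-in for a chunked column file; not the Vortex API. */
final class ZoneMapSumSketch {

    /** One chunk: filter column id, summed column x, and the zone stats derived from them. */
    record Chunk(long[] id, long[] x, long minId, long maxId, long sumX) {
        static Chunk of(long[] id, long[] x) {
            long min = Long.MAX_VALUE, max = Long.MIN_VALUE, sum = 0;
            for (int i = 0; i < id.length; i++) {
                min = Math.min(min, id[i]);
                max = Math.max(max, id[i]);
                sum = Math.addExact(sum, x[i]);
            }
            return new Chunk(id, x, min, max, sum);
        }
    }

    /** The aggregate plus how many chunks had to be decoded row by row. */
    record Result(long sum, int decodedChunks) {}

    /** SUM(x) WHERE id BETWEEN lo AND hi, touching rows only in boundary chunks. */
    static Result sumBetween(List<Chunk> chunks, long lo, long hi) {
        long total = 0;
        int decoded = 0;
        for (Chunk c : chunks) {
            if (c.maxId() < lo || c.minId() > hi) {
                continue; // zone map proves no row matches: skip the chunk
            }
            if (c.minId() >= lo && c.maxId() <= hi) {
                total = Math.addExact(total, c.sumX()); // every row matches: fold the stat
                continue;
            }
            decoded++; // the range cuts through this chunk: apply a row-level filter
            for (int i = 0; i < c.id().length; i++) {
                if (c.id()[i] >= lo && c.id()[i] <= hi) {
                    total = Math.addExact(total, c.x()[i]);
                }
            }
        }
        return new Result(total, decoded);
    }
}
```

With sorted ids, a wide range decodes at most the two chunks at its edges no matter how many interior chunks it covers, which is where the advantage over a full scan would come from in this model.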

Added

  • VortexCalcite.connect(schemaName, tables) opens a Calcite JDBC connection with the VortexSchema registered in one call, folding away the DriverManager / unwrap / getRootSchema().add(...) boilerplate. It wires the Babel SQL parser, so columns whose names are reserved words (close, open, value, year, …) are queryable unquoted (select close, open from vtx.ohlc), which columnar files routinely need; only the typed-literal keywords (date, time, timestamp, interval) still require back-tick quoting. (24b64b32)
  • The Calcite aggregate push-down rule now auto-registers over a bare jdbc:calcite: connection: a VortexTable translates to a VortexTableScan whose register() installs the rules when the planner first sees it, so SELECT MIN/MAX/COUNT/SUM over a VortexSchema is answered from zone-map statistics with no caller wiring (previously the rule had to be attached to the planner by hand). AVG joins them — AggregateReduceFunctionsRule reduces it to SUM/COUNT, both of which push down, so a whole-table AVG also decodes no data segment. Column projection and WHERE chunk-skip push-down are unchanged. (24b64b32)
  • The Calcite aggregate push-down rule now rewrites a whole-table SUM(col) to a single-row Values computed from the zone-map table (via VortexTable.zoneSum), so SELECT SUM(col) answers metadata-only with no data segment decoded — joining the existing MIN/MAX/COUNT push-down. SUM over an all-null (or empty) column answers the SQL NULL, and the rule abandons to a normal scan when a zone carries no usable sum (no zone map, or an overflowed zone). (24b64b32)
  • A Calcite WHERE-filtered SUM/COUNT/MIN/MAX is now answered from zone-map statistics when the predicate partitions the chunks cleanly — each chunk wholly matches or wholly fails it — folding only the kept chunks' stats into a scan-free result. A predicate that cuts through a chunk (the typical selective range) still falls back to the zone-map-pruned scan. (32cc4a29)
  • A WHERE-filtered aggregate whose range cuts through a chunk no longer falls back to a full scan: the interior chunks still fold from the zone-map stats, and each straddling boundary chunk is decoded on its own and reduced under a row-level filter, so SELECT SUM(volume) WHERE close BETWEEN 100 AND 200 decodes only the one or two boundary chunks instead of every surviving chunk. The push-down still abandons to the (correct) scan for an unsigned or floating-point filter column, a non-numeric SUM, or a missing zone map. (f89b5b69)
  • VortexReader.decodeChunk(chunkIndex, columns) and chunkCount() decode a single chunk for a chosen subset of columns in isolation, rather than streaming the whole file — the returned Chunk owns its memory and is valid until closed. (084a0133)
  • ScanIterator.columnZoneStats(column) surfaces per-zone min/max/sum/null-count from a column's vortex.stats zone-map table without decoding any data segment — the read side of aggregate push-down (ADR 0013 §6). ArrayStats gains a sum component, decoded from the zone-map table (where the Rust reference stores it too), so the Calcite adapter now answers SUM/AVG metadata-only when every zone carries a sum, falling back to a streaming scan only for columns without a zone map. (05dd9204)
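The whole-table SUM rules described in the list above (SQL NULL for an all-null or empty column, abandon when a zone has no usable sum) can be sketched as a fold over per-zone statistics. This is a minimal illustration under assumptions: `ZoneFoldSketch`, `ZoneStats`, `Answer`, and `fold` are hypothetical names and not the shapes returned by `columnZoneStats` or `zoneSum`; a `null` list models a column without a zone map, and a `null` sum models an overflowed zone.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical fold over zone statistics; not the Vortex or Calcite API. */
final class ZoneFoldSketch {

    /** Per-zone stats; sum is null when no usable sum was recorded for the zone. */
    record ZoneStats(long rowCount, long nullCount, Long sum) {}

    /** A metadata-only answer; sum == null stands for SQL NULL. */
    record Answer(Long sum, long count) {}

    /** Returns Optional.empty() to signal "abandon the push-down and run the scan". */
    static Optional<Answer> fold(List<ZoneStats> zones) {
        if (zones == null) {
            return Optional.empty(); // no zone map at all
        }
        long total = 0;
        long count = 0;
        for (ZoneStats z : zones) {
            long nonNull = z.rowCount() - z.nullCount();
            if (nonNull == 0) {
                continue; // an all-null zone contributes nothing to SUM or COUNT
            }
            if (z.sum() == null) {
                return Optional.empty(); // zone carries no usable sum
            }
            try {
                total = Math.addExact(total, z.sum());
            } catch (ArithmeticException overflow) {
                return Optional.empty(); // the running total itself overflowed
            }
            count += nonNull;
        }
        // No non-null rows anywhere: SUM is SQL NULL, COUNT(col) is 0.
        Long sum = (count == 0) ? null : Long.valueOf(total);
        return Optional.of(new Answer(sum, count));
    }
}
```

Keeping the non-null count next to the sum is what would let an AVG reduced to SUM/COUNT be answered from the same fold.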

Changed

  • Bumped io.github.dfa1.zstd (the vortex.zstd FFM bindings, pinned by the BOM) 0.3 → 0.6, which ships smaller jars (native debug symbols stripped). (677c2cf7, 6dcdbe94, fec0a0d3)
  • Bumped Apache Calcite (the SQL adapter's engine) 1.40 → 1.42. (2f9f02c6)

Security

  • DType-tree and array-node decoding are now depth-capped (64, matching the layout-tree guard): a crafted or self-referential FlatBuffer surfaces as a VortexException instead of a StackOverflowError — which, being an Error, previously escaped sanitization and leaked the reader's memory-mapped Arena. (93f8d5f4, 428026d3)
  • The HTTP reader validates footer segmentSpecs against the file size before any Range request is built from them, matching the local-file path. (1d8ddebc)
  • vortex.zstd decode bounds-checks each frame's declared uncompressed size and overflow-checks the total before allocating, and range-checks VarBin length prefixes — a crafted payload can no longer under-allocate or read out of bounds. (2df4e3a7, adc445e8)
  • The HTTP reader parses the server-controlled Content-Range header and slices the tail buffer defensively, so a malformed response yields a VortexException rather than a raw NumberFormatException/IndexOutOfBoundsException. (feac99b7)
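The hardening patterns in the list above follow a common shape: validate attacker-controlled values before acting on them, and raise one library exception type instead of letting a raw JDK error escape. The sketch below shows that shape under assumptions; it is not the project's code. `UntrustedInputSketch`, `UntrustedInputException` (a stand-in for VortexException), `Node`, `countNodes`, `checkSegment`, and `parseContentRange` are all hypothetical, and the Content-Range parser only accepts the `bytes first-last/total` form.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helpers for parsing untrusted input; not the Vortex API. */
final class UntrustedInputSketch {

    /** Stand-in for the library's own exception type. */
    static final class UntrustedInputException extends RuntimeException {
        UntrustedInputException(String message) { super(message); }
    }

    static final int MAX_DEPTH = 64;

    /** Minimal tree node standing in for a decoded DType or array node. */
    record Node(List<Node> children) {}

    /** Recursive walk that fails cleanly past MAX_DEPTH instead of overflowing the stack. */
    static int countNodes(Node node, int depth) {
        if (depth > MAX_DEPTH) {
            throw new UntrustedInputException("nesting deeper than " + MAX_DEPTH);
        }
        int n = 1;
        for (Node child : node.children()) {
            n += countNodes(child, depth + 1);
        }
        return n;
    }

    /** Checks offset + length <= fileSize without computing the overflow-prone sum. */
    static void checkSegment(long offset, long length, long fileSize) {
        if (offset < 0 || length < 0 || offset > fileSize || length > fileSize - offset) {
            throw new UntrustedInputException("segment outside file bounds");
        }
    }

    /** At most 18 digits per field, so Long.parseLong cannot throw on a match. */
    private static final Pattern CONTENT_RANGE =
            Pattern.compile("bytes (\\d{1,18})-(\\d{1,18})/(\\d{1,18})");

    record Range(long first, long last, long total) {}

    static Range parseContentRange(String header) {
        if (header == null) {
            throw new UntrustedInputException("missing Content-Range");
        }
        Matcher m = CONTENT_RANGE.matcher(header.trim());
        if (!m.matches()) {
            throw new UntrustedInputException("malformed Content-Range: " + header);
        }
        long first = Long.parseLong(m.group(1));
        long last = Long.parseLong(m.group(2));
        long total = Long.parseLong(m.group(3));
        if (first > last || last >= total) {
            throw new UntrustedInputException("inconsistent Content-Range: " + header);
        }
        return new Range(first, last, total);
    }
}
```

Writing the bounds check as `length > fileSize - offset` rather than `offset + length > fileSize` keeps a huge declared length from wrapping around and passing the test.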