Skip to content

Features

Aryeh Citron edited this page May 12, 2026 · 4 revisions

Features

Dataset CRUD

Full dataset lifecycle management:

  • Create: client.CreateDataset(datasetId)
  • Get: client.GetDataset(datasetId)
  • List: client.ListDatasets()
  • Update: client.UpdateDataset(datasetId, ...)
  • Delete: client.DeleteDataset(datasetId) (with deleteContents option)

Table CRUD

Full table lifecycle with schema enforcement:

  • Create: client.CreateTable(datasetId, tableId, schema)
  • Get: client.GetTable(datasetId, tableId)
  • List: client.ListTables(datasetId)
  • Update: client.UpdateTable(datasetId, tableId, ...) (add columns)
  • Delete: client.DeleteTable(datasetId, tableId)

Streaming Inserts

Insert rows with schema validation:

await client.InsertRowsAsync("dataset", "table", new[]
{
    new BigQueryInsertRow { ["id"] = 1, ["name"] = "Alice" },
});
  • Type validation against schema
  • insertId dedup (optional)
  • Per-row error reporting
  • Rows immediately queryable (no buffer delay)

SQL Queries

GoogleSQL dialect with 325+ built-in functions. See SQL Queries for full reference.

var results = await client.ExecuteQueryAsync(
    "SELECT name, COUNT(*) as cnt FROM dataset.table GROUP BY name",
    parameters: null);

DML

Data Manipulation Language via SQL:

  • INSERT INTO ... VALUES
  • INSERT INTO ... SELECT
  • UPDATE ... SET ... WHERE
  • DELETE FROM ... WHERE
  • MERGE INTO ... USING ... ON ... WHEN MATCHED/NOT MATCHED
  • TRUNCATE TABLE

DDL

Data Definition Language via SQL:

  • CREATE TABLE / CREATE TABLE IF NOT EXISTS
  • CREATE OR REPLACE TABLE
  • CREATE TABLE ... AS SELECT (CTAS)
  • CREATE TABLE LIKE / CREATE TABLE COPY
  • CREATE SNAPSHOT TABLE / CREATE TABLE CLONE
  • DROP TABLE / DROP TABLE IF EXISTS
  • ALTER TABLE ADD COLUMN / DROP COLUMN
  • ALTER TABLE RENAME TO / RENAME COLUMN
  • ALTER TABLE SET OPTIONS
  • ALTER COLUMN SET DATA TYPE / SET DEFAULT / DROP DEFAULT / DROP NOT NULL
  • CREATE SCHEMA / DROP SCHEMA
  • CREATE [OR REPLACE] MATERIALIZED VIEW (treated as regular view)
  • CREATE [OR REPLACE] EXTERNAL TABLE (creates regular table)
  • CREATE [OR REPLACE] PROCEDURE / DROP PROCEDURE (no-op stubs)
  • CREATE [OR REPLACE] TABLE FUNCTION / DROP TABLE FUNCTION (no-op stubs)
  • CREATE ROW ACCESS POLICY / DROP ROW ACCESS POLICY (no-op stubs)
  • CREATE SEARCH INDEX / DROP SEARCH INDEX (no-op stubs)

Views

  • CREATE VIEW ... AS SELECT
  • CREATE OR REPLACE VIEW
  • View queries re-execute the stored SQL against current data
  • Views appear in INFORMATION_SCHEMA.TABLES with table_type = 'VIEW'

Fault Injection

Simulate errors for testing retry logic and error handling:

bq.SetFaultInjector(request =>
    new HttpResponseMessage(HttpStatusCode.ServiceUnavailable));

Supported error codes: 403, 404, 429, 500, 503

Request Logging

Record all HTTP requests and SQL queries for assertions:

bq.Handler.RequestLog  // All HTTP requests
bq.Handler.QueryLog    // All SQL queries

Partitioning

  • CREATE TABLE ... PARTITION BY (time partitioning, range partitioning)
  • _PARTITIONTIME / _PARTITIONDATE pseudo-columns
  • Partition pruning in WHERE clauses

User-Defined Functions (UDFs)

SQL and JavaScript UDFs via CREATE FUNCTION:

// SQL UDF
await client.ExecuteQueryAsync(
    @"CREATE TEMP FUNCTION double_it(x INT64) AS (x * 2);
    SELECT double_it(21) AS result;", parameters: null);

// JavaScript UDF (requires JsUdfs package)
await client.ExecuteQueryAsync(
    @"CREATE TEMP FUNCTION plusOne(x FLOAT64) RETURNS FLOAT64
    LANGUAGE js AS ""return x+1;"";
    SELECT plusOne(4) AS result;", parameters: null);
  • CREATE [OR REPLACE] [TEMP] FUNCTION name(params) AS (body) — SQL UDFs
  • CREATE [OR REPLACE] [TEMP] FUNCTION name(params) RETURNS type LANGUAGE js AS "body" — JavaScript UDFs
  • DROP FUNCTION [IF EXISTS] name
  • Multi-parameter support, string/numeric return types
  • Case-insensitive routine names (matching real BigQuery)

JavaScript UDF Engine (Optional Package)

Install InMemoryEmulator.BigQuery.JsUdfs to enable LANGUAGE js support:

using InMemoryEmulator.BigQuery.JsUdfs;

var bq = InMemoryBigQuery.Create("project", "dataset");
bq.Store.UseJsUdfs(); // Register the Jint JavaScript engine

Supported JS features: return statements, arithmetic, string manipulation, null handling, multi-line bodies (triple-quoted r"""...""").

Production Extensions (Optional Package)

Install InMemoryEmulator.BigQuery.ProductionExtensions for IAsyncEnumerable and typed mapping:

using InMemoryEmulator.BigQuery.ProductionExtensions;

var results = await client.ExecuteQueryAsync("SELECT ...", parameters: null);

// IAsyncEnumerable iteration
await foreach (var row in results.AsAsyncEnumerable())
    Console.WriteLine(row["name"]);

// Typed mapping
var items = await results.ToListAsync(row => new { Id = (long)row["id"], Name = (string)row["name"] });

Range Functions

Range type support for date/numeric ranges:

  • RANGE(lower, upper) — construct a range
  • RANGE_START(range) / RANGE_END(range) — extract bounds
  • RANGE_CONTAINS(range, value) / RANGE_CONTAINS(range, range) — containment check
  • RANGE_OVERLAPS(range1, range2) — overlap check
  • GENERATE_RANGE_ARRAY(range) — generate array from range

Geography Functions

41 spatial functions via the ST_* namespace:

  • Constructors: ST_GEOGPOINT, ST_GEOGFROMTEXT, ST_GEOGFROMWKT, ST_GEOGFROMGEOJSON, ST_GEOGFROMWKB
  • Output: ST_ASTEXT, ST_ASGEOJSON, ST_ASBINARY
  • Accessors: ST_X, ST_Y, ST_NUMPOINTS, ST_NPOINTS, ST_DIMENSION, ST_ISEMPTY, ST_ISCOLLECTION, ST_GEOMETRYTYPE
  • Measurement: ST_DISTANCE, ST_AREA, ST_LENGTH, ST_PERIMETER
  • Predicates: ST_CONTAINS, ST_WITHIN, ST_INTERSECTS, ST_DISJOINT, ST_EQUALS, ST_DWITHIN, ST_COVEREDBY, ST_COVERS, ST_TOUCHES
  • Transformers: ST_MAKELINE, ST_CENTROID, ST_BOUNDARY, ST_CLOSESTPOINT, ST_CONVEXHULL, ST_DIFFERENCE, ST_INTERSECTION, ST_UNION, ST_BUFFER, ST_SIMPLIFY, ST_DUMP
  • Aggregates: ST_CENTROID_AGG, ST_UNION_AGG

Vector / Distance Functions

6 vector similarity and distance functions:

  • COSINE_DISTANCE(v1, v2) — cosine distance between two vectors
  • EUCLIDEAN_DISTANCE(v1, v2) — Euclidean distance between two vectors
  • DOT_PRODUCT(v1, v2) — dot product of two vectors
  • APPROX_COSINE_DISTANCE(v1, v2) — approximate cosine distance (exact in-memory)
  • APPROX_EUCLIDEAN_DISTANCE(v1, v2) — approximate Euclidean distance (exact in-memory)
  • APPROX_DOT_PRODUCT(v1, v2) — approximate dot product (exact in-memory)

Procedural Language

Full procedural language support for scripting:

  • DECLARE (with DEFAULT expressions) / SET
  • IF ... THEN ... ELSE ... END IF
  • WHILE ... DO ... END WHILE
  • LOOP ... END LOOP / BREAK / CONTINUE
  • FOR ... IN (query) DO ... END FOR
  • REPEAT ... UNTIL ... END REPEAT
  • CASE (procedural form)
  • BEGIN ... END (with EXCEPTION handling and @@error.message)
  • RETURN / RAISE (with USING MESSAGE) / ASSERT
  • EXECUTE IMMEDIATE
  • CALL (procedure invocation)
  • BEGIN TRANSACTION / COMMIT / ROLLBACK (no-op stubs)
  • Labels on BEGIN, LOOP, WHILE, FOR, REPEAT blocks

INFORMATION_SCHEMA

3 metadata views:

  • INFORMATION_SCHEMA.TABLES — table catalog, schema, name, type, creation time
  • INFORMATION_SCHEMA.COLUMNS — column metadata (name, position, type, nullability)
  • INFORMATION_SCHEMA.SCHEMATA — dataset listing

Advanced Query Clauses

  • QUALIFY — filter on window function results
  • PIVOT / UNPIVOT — row-to-column and column-to-row transformations
  • GROUP BY ROLLUP — hierarchical grouping with subtotals
  • WITH RECURSIVE — recursive CTEs
  • TABLESAMPLE — random row sampling
  • WINDOW clause — named window definitions
  • UNNEST WITH OFFSET — array flattening with position
  • Wildcard tables (table_*) with _TABLE_SUFFIX filtering

DCL (Data Control Language)

  • GRANT / REVOKE — parsed and accepted (no-op stubs)

Other Statement Stubs

  • EXPORT DATA / LOAD DATA — parsed and accepted (no-op stubs)

Load Jobs (CSV / JSON Upload)

Full support for the BigQuery SDK's UploadCsvAsync and UploadJsonAsync methods:

  • CSV: Parses comma-separated values, supports SkipLeadingRows for header rows
  • JSON: Parses newline-delimited JSON (NEWLINE_DELIMITED_JSON)
  • Creates the destination table automatically if it does not exist (schema required)
  • Data is inserted into the in-memory store and is immediately queryable
  • Uses the Google APIs resumable upload protocol under the hood

Extract Jobs (No-Op Stub)

CreateExtractJobAsync returns a completed job without writing data (no real GCS to export to):

  • Validates that the source table exists
  • Supports single and multiple destination URIs
  • Returns a DONE job — useful for integration tests that exercise the extract API path

BigQuery Storage Read API (gRPC)

Full support for the BigQuery Storage Read API via InMemoryBigQueryResult.CreateReadClient():

using var bq = InMemoryBigQuery.Create("test-project", "my_dataset", ds =>
{
    ds.AddTable("users", schema);
});

// Get a BigQueryReadClient backed by the same in-memory data
var readClient = bq.CreateReadClient();

// Use the standard Storage API
var session = readClient.CreateReadSession(new CreateReadSessionRequest
{
    Parent = "projects/test-project",
    ReadSession = new ReadSession
    {
        Table = "projects/test-project/datasets/my_dataset/tables/users",
        DataFormat = DataFormat.Avro,
    },
    MaxStreamCount = 1,
});

Supported RPCs:

  • CreateReadSession — create a read session with configurable stream count
  • ReadRows — stream Avro-serialized rows from a read stream
  • SplitReadStream — split a stream into two child streams

Clone this wiki locally