Query Generator for deterministic and non-deterministic TPC-H queries #22

Open
tobodner wants to merge 28 commits into compiler-pr-762-head from compiler-pr-798-head

Conversation

tobodner commented Nov 3, 2025

Member

Generates SQL queries for TPC-H benchmarking and testing purposes.

Relevant functions:

  • BuildQuery(BenchmarkItemId item_id)
  • BuildDeterministicQuery(BenchmarkItemId item_id)

Zero-based index:

  • item_id = 0 → TPC-H Q1,
  • item_id = 1 → TPC-H Q2, etc.
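The zero-based mapping above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the alias for `BenchmarkItemId`, the constant `kTpchQueryCount`, and the helper `TpchQueryNumber` are hypothetical names introduced here. The only facts relied on are the mapping stated above and that TPC-H defines 22 queries (Q1 through Q22).

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the project's BenchmarkItemId type (assumption).
using BenchmarkItemId = std::size_t;

// TPC-H defines 22 queries, Q1 through Q22.
constexpr std::size_t kTpchQueryCount = 22;

// Maps a zero-based item id to the one-based TPC-H query number.
// Throws for ids outside [0, 21] when evaluated at runtime.
constexpr std::size_t TpchQueryNumber(BenchmarkItemId item_id) {
  if (item_id >= kTpchQueryCount) {
    throw std::out_of_range("item_id must be in [0, 21]");
  }
  return item_id + 1;
}

// Compile-time checks of the mapping described in the PR text.
static_assert(TpchQueryNumber(0) == 1, "item_id 0 maps to Q1");
static_assert(TpchQueryNumber(1) == 2, "item_id 1 maps to Q2");
```

Under this sketch, a caller that wants TPC-H Q6 would pass `item_id = 5`.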

bweengener and others added 28 commits July 7, 2022 15:33
…roughput_parallel_benchmark (#752)

* add repetition metrics to LambdaBenchmarkOutput-builder

* use new metrics in network_throughput_parallel_benchmark + add bool repetition metric to LambdaBenchmarkOutput-Builder (#713)

* Fix clang format issues (#713)

* trigger pipeline

* Add throughput aggregates (percentiles) for invocations of a repetition to LambdaOutput of ThroughputParallelBenchmark (#713)

* Implement review findings (#713)

* fix readme

* ...

* Adjust includes in benchmark-result  (#713)

Co-authored-by: d-justen <33455240+d-justen@users.noreply.github.com>
This commit improves the project's compiler compatibility. No functional features are added. The changes add missing includes and refactor a function that obtains the full path of the executable. The project (i.e., the unit tests) now compiles on macOS.
…lation (#740)

The `CompilationContext` holds compilation state and configuration to support the translation of SQL queries. It includes, for instance, execution constraints and S3 key prefixes.

It is meant to be passed to various compiler components via a shared pointer.
Add basic implementation for ParquetFormatWriter with support for all data types, NULL values and compression.
* First attempt on implementing a Left Outer Join using a bitmap #NOTICKET

* Adjusts auto data-types in HashJoin #785

* Add documentation comments to HashJoin-operator #785

* Changed LeftOuter-HashJoin according to feedback (#785)

* Fixes minor bug + cleans up code for Left Outer HashJoin (#785)

* performance tweaks for hash-join (#785)

* extend HashJoinMicrobenchmark for left-outer-joins (#785)

* draft google-benchmark output transformer and plotter for HashJoin-microbenchmark (#785)

* fix clang-format style issue in HashJoin-operator (#785)

* fix format style issue in python utility scripts (#785)

* refactor scripts for microbenchmark-plots (#785)

* Update docs in src/lib/operator/hash_join_operator.cpp

Co-authored-by: d-justen <33455240+d-justen@users.noreply.github.com>

* Update docs in src/lib/operator/hash_join_operator.cpp

Co-authored-by: d-justen <33455240+d-justen@users.noreply.github.com>

* applied feedback for left-outer-join (#785)

Co-authored-by: d-justen <33455240+d-justen@users.noreply.github.com>
Basic parquet format reader.

Co-authored-by: Niklas Riekenbrauck <nnikriek@gmail.com>
Co-authored-by: Thomas Bodner <thomas.bodner@hpi.de>
Changes with this PR:

- Pull AL2 provided image from ECR registry and use latest version
- Use buildx for multi-arch builds
- Adapt Dockerfile to work for both ARM and x86 host architecture.

As ARM currently requires the AWS SDK to be built as a shared library, we link the AWS packages dynamically or statically depending on the host architecture. For x86 host processors, nothing changes.

Co-authored-by: Thomas Bodner <thomas.bodner@hpi.de>
Depending on the platform, `long` might not equal `int64_t` (https://en.cppreference.com/w/cpp/language/types). Let us be explicit and use the same type names here as in `value_segment.cpp`. The types need to match.
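A minimal sketch of why the type names need to match, assuming a template that is specialized only for fixed-width integer types. `IsSupportedSegmentType` is a hypothetical trait introduced for illustration and is not taken from `value_segment.cpp`. On LP64 Linux, `std::int64_t` is typically an alias for `long`, while on macOS it is typically `long long`, so code written against `long` can silently stop matching a specialization written against `std::int64_t`.

```cpp
#include <cstdint>
#include <type_traits>

// Platform-dependent: true where int64_t is an alias for long (e.g. LP64 Linux),
// false where it is an alias for long long (e.g. macOS) or long is 32 bits wide.
constexpr bool kLongIsInt64 = std::is_same<long, std::int64_t>::value;

// Hypothetical trait: only the fixed-width types are registered (assumption).
template <typename T>
struct IsSupportedSegmentType : std::false_type {};
template <>
struct IsSupportedSegmentType<std::int32_t> : std::true_type {};
template <>
struct IsSupportedSegmentType<std::int64_t> : std::true_type {};

// Using the fixed-width name matches the specialization on every platform.
static_assert(IsSupportedSegmentType<std::int64_t>::value,
              "int64_t is always registered");

// Whether `long` is covered depends on which fixed-width type, if any,
// it aliases on the target platform, so no portable assertion is possible.
constexpr bool kLongIsSupported = IsSupportedSegmentType<long>::value;
```

Spelling the type as `std::int64_t` on both the declaration and the instantiation side avoids depending on which built-in type the alias resolves to.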
* add workflow for automated builds of multi platform docker images
Co-authored-by: Thomas Bodner <thomas.bodner@hpi.de>
This separate PR introduces the mapping from several Skyrise expressions (binary predicate, logical, in, between, etc.) to Arrow expressions for predicate pushdown. This PR should be merged before #759 and the predicate pushdown implementation in the reader.

Co-authored-by: Niklas Riekenbrauck <nnikriek@gmail.com>
Co-authored-by: TheoRadig <radig.theo@gmail.com>
Co-authored-by: Thomas Bodner <thomas.bodner@hpi.de>
Co-authored-by: LukasBudach <38383837+LukasBudach@users.noreply.github.com>
Add predicate pushdown to the Parquet format reader. Predicates can be supplied via `ParquetFormatReaderOptions.arrow_expression` as `arrow::Expression`s (see #779).

Co-authored-by: Niklas Riekenbrauck <nnikriek@gmail.com>
Co-authored-by: TheoRadig <radig.theo@gmail.com>
Co-authored-by: Thomas Bodner <thomas.bodner@hpi.de>
Co-authored-by: Niklas Riekenbrauck <nikriek@users.noreply.github.com>
Co-authored-by: Theo <42896593+TheoRadig@users.noreply.github.com>
Fixes #776.

Co-authored-by: Thomas Bodner <thomas.bodner@hpi.de>
Co-authored-by: Thomas Bodner <thomas.bodner@hpi.de>
#792 and #793 introduced support for multi-platform images.

The following Docker scripts are not aware of this change yet, and thus are broken:
- run_tests.sh
- format.sh

This PR implements command substitution in image.conf to append a platform suffix to IMAGE_NAME at runtime, fixing the aforementioned scripts.

Moreover, IMAGE_DATE gets updated to 20220819.

7 participants