Pipeline mode has been on the TODO list for a while ("Support pipeline mode", https://www.postgresql.org/docs/14/libpq-pipeline-mode.html). We have a working implementation and would like feedback on the API design before going further.
Motivation
Pipeline mode (libpq, PG14+) batches multiple queries into fewer network round trips. Instead of the normal send-wait-receive cycle per query, the client queues N queries and collects N results after a sync point. This can dramatically reduce latency for workloads that issue many independent queries — batch inserts, bulk prepared statement execution, etc.
Every major PostgreSQL driver has added or is adding pipeline support: psycopg3 (conn.pipeline()), Ruby pg (enter_pipeline_mode), pgx (SendBatch), tokio-postgres (implicit pipelining).
Proposed API
Low-level wrappers matching the Ruby pg gem's approach — thin 1:1 mappings of the libpq pipeline functions. Users manage the lifecycle explicitly.
The methods map 1:1 onto libpq:

pg_enter_pipeline_mode  -> PQenterPipelineMode
pg_exit_pipeline_mode   -> PQexitPipelineMode
pg_pipeline_status      -> PQpipelineStatus
pg_pipeline_sync        -> PQpipelineSync
pg_getresult            -> PQgetResult
pg_send_query_params    -> PQsendQueryParams
pg_send_prepare         -> PQsendPrepare
pg_send_query_prepared  -> PQsendQueryPrepared
pg_send_flush_request   -> PQsendFlushRequest
pg_flush                -> PQflush

pg_getresult returns a hashref:

{
  status    => 2,            # ExecStatusType integer
  error     => undef,        # error message or undef
  ntuples   => 3,            # row count
  nfields   => 2,            # column count
  cmdtuples => '',           # affected rows (DML)
  rows      => [[...], ...], # only present for TUPLES_OK
}

Usage example:
$dbh->pg_enter_pipeline_mode();
$dbh->pg_send_query_params('INSERT INTO t VALUES (?)', [1]);
$dbh->pg_send_query_params('INSERT INTO t VALUES (?)', [2]);
$dbh->pg_send_query_params('SELECT * FROM t WHERE id = ?', [1]);
$dbh->pg_pipeline_sync();
# Results arrive: query -> undef -> query -> undef -> ... -> PIPELINE_SYNC
my $r1 = $dbh->pg_getresult();   # {status => 1}  INSERT
$dbh->pg_getresult();            # undef
my $r2 = $dbh->pg_getresult();   # {status => 1}  INSERT
$dbh->pg_getresult();            # undef
my $r3 = $dbh->pg_getresult();   # {status => 2, rows => [...]}
$dbh->pg_getresult();            # undef
$dbh->pg_getresult();            # {status => 10} PIPELINE_SYNC
$dbh->pg_exit_pipeline_mode();
Design decisions
Low-level, not automatic. We considered a high-level approach where execute() queues internally and results are distributed back to statement handles (like psycopg3). That would require fundamental changes to DBI's execute flow. The low-level approach is safer, simpler, and can be layered on later.
pg_getresult returns a hashref, not a statement handle. Pipeline results don't map naturally to DBI's one-query-one-handle model. The hashref gives users everything they need (including row data for SELECTs) without coupling to DBI internals.
Both ? and $1/$2 placeholders are supported. The pipeline send methods call libpq's PQsendQueryParams directly, which requires $1/$2 syntax. However, since DBI users (and especially DBIx::Class/SQL::Abstract) generate SQL with ? placeholders, the implementation includes a C-level placeholder converter (pg_convert_placeholders in dbdimp.c) that transforms ? to $1, $2, etc. while properly skipping single-quoted strings, double-quoted identifiers, dollar-quoted strings, and SQL comments — following the same skip patterns as the existing pg_st_split_statement parser. When the SQL already contains $N placeholders, conversion is skipped entirely.
COPY is explicitly rejected in pipeline mode. This is a PostgreSQL protocol constraint, not a driver choice. We added guards that croak with clear messages on pg_putcopydata, pg_putcopyend, and pg_getcopydata.
Compile-time gating. The implementation uses #if PGLIBVERSION >= 140000. Methods croak at runtime if compiled against older libpq. Tests skip entirely on PG < 14.
Open questions
Method naming — Are the pg_send_* / pg_pipeline_* / pg_getresult names consistent with the project's conventions? Any preferences?
Version gating approach — Methods croak at runtime on older libpq. Tests skip on PG < 14. The CI matrix already includes PG 15 and 17. Is this the right approach, or would you prefer the methods to not exist at all on older builds?
pg_getresult scope — Currently this is a general-purpose PQgetResult wrapper that works outside pipeline mode too. Should it be pipeline-only, or is general availability useful?
pg_flush overlap — pg_flush (wrapping PQflush) is also needed for async COPY FROM (PR #176, "Add non-blocking async COPY FROM support (pg_putcopydata_async, pg_putcopyend_async, pg_flush)"). The two PRs add it independently. If both land, we'd deduplicate.
Placeholder conversion approach — The C-level ? to $N converter reuses the skip patterns from pg_st_split_statement but is a separate ~180 line function. An alternative would be to refactor the existing parser to be callable standalone, but that would be significantly more invasive. Is a dedicated converter function acceptable, or would you prefer a different approach?
High-level API future — This PR is deliberately low-level. A future enhancement could add higher-level support (e.g., making execute() pipeline-aware, automatically distributing results to statement handles). Is that something you'd want to see as a follow-up, or is the low-level API sufficient?
Reference implementation
The PR linked below has a working implementation with 109 tests covering the full pipeline lifecycle, error handling, placeholder conversion edge cases, and more. It passes the full author test suite. The code is meant as a concrete starting point for discussion, not a final API.