Add VECTOR and BSON data types support #16

Open: okramarenko wants to merge 18 commits into add10support from add_vector

okramarenko (Collaborator) commented Apr 28, 2026

This PR introduces SingleStoreDbType.Vector and SingleStoreDbType.Bson, enables SingleStore extended protocol metadata by default on supported servers (adjustable via EnableExtendedDataTypes conn string parameter), and maps native VECTOR/BSON columns to appropriate .NET types.

Note

Medium Risk
Medium risk: introduces new protocol parsing and session initialization behavior (EnableExtendedDataTypes) that affects how column metadata/values and bulk copy parameter serialization work, with potential compatibility impact on older servers and existing binary/blob workflows.

Overview
Adds native VECTOR and BSON support via SingleStore extended protocol metadata, including new SingleStoreDbType.Bson/Vector mappings, typed VECTOR reads as ReadOnlyMemory<T>, and updated schema (SingleStoreDbColumn) metadata for dimensions/element type.

Introduces EnableExtendedDataTypes connection-string option (default true) that sets enable_extended_types_metadata on supported servers (8.5.28+) and falls back to legacy metadata unless explicitly required; ResetConnectionAsync now re-applies this session initialization.

Extends parameter inference/serialization and bulk copy handling for VECTOR/BSON (including correct casting expressions for SingleStoreBulkCopy), adds shared binary conversion utilities, and updates/expands docs and tests to cover round-trips, legacy-disable behavior, and bulk copy scenarios.

Reviewed by Cursor Bugbot for commit eb57e40. Bugbot is set up for automated code reviews on this repo.

Comment thread src/SingleStoreConnector/Protocol/Payloads/ColumnDefinitionPayload.cs Outdated
Comment thread src/SingleStoreConnector/Core/TypeMapper.cs Outdated
Comment thread src/SingleStoreConnector/SingleStoreParameter.cs
Comment thread src/SingleStoreConnector/Core/SingleStoreBinaryValueConverter.cs
Comment thread src/SingleStoreConnector/Core/ServerSession.cs Outdated
Comment thread src/SingleStoreConnector/SingleStoreDbColumn.cs
Comment thread src/SingleStoreConnector/Protocol/Payloads/ColumnDefinitionPayload.cs Outdated
Comment thread src/SingleStoreConnector/SingleStoreParameter.cs Outdated
Comment thread src/SingleStoreConnector/SingleStoreConnection.cs
Comment thread src/SingleStoreConnector/SingleStoreConnection.cs
Comment thread src/SingleStoreConnector/Core/ServerSession.cs Outdated
Comment thread tests/SideBySide/BulkLoaderSync.cs
@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default mode and found 1 potential issue.

((column.ColumnFlags & ColumnFlags.Blob) != 0 || column.ColumnType is ColumnType.TinyBlob or ColumnType.Blob or ColumnType.MediumBlob or ColumnType.LongBlob);
IsLong = mySqlDbType != SingleStoreDbType.Vector &&
    column.ColumnLength > 255 &&
    ((column.ColumnFlags & ColumnFlags.Blob) != 0 || column.ColumnType is ColumnType.TinyBlob or ColumnType.Blob or ColumnType.MediumBlob or ColumnType.LongBlob);
BSON column size uses uint.MaxValue causing wrong ColumnSize

Low Severity

The IsLong check excludes SingleStoreDbType.Vector but does not account for SingleStoreDbType.Bson. Since BSON uses blob transport with columnSize: uint.MaxValue in the type metadata, it relies on the existing IsLong logic path. However, the Bson case in the constructor switch explicitly sets type = typeof(byte[]) and dataTypeName = "BSON" — values that are already identical to what the typeBinary metadata mapping provides (typeof(byte[]) and simpleDataTypeName: "BSON"). This is redundant but not harmful. The actual concern here is minor and cosmetic — the BSON case block is unnecessary code.

UseCompression = csb.UseCompression;
UseXaTransactions = false;
EnableExtendedDataTypes = csb.EnableExtendedDataTypes;
EnableExtendedDataTypesWasExplicitlySet = csb.ContainsKey("Enable Extended Data Types");
Collaborator

What about the case enableextendeddatatypes=true, is it covered?
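
For reference, the cases implied by the PR description (default true, fall back to legacy metadata on older servers unless the option was explicitly required) can be laid out as a small decision function. This is only a sketch: `SessionInit`, `Decide`, and `ExtendedTypesAction` are hypothetical names and not the connector's code, and throwing on an explicit `true` against an unsupported server is an assumption drawn from the "unless explicitly required" wording.

```csharp
using System;

// Explicit "true" against a server that lacks extended metadata support.
Console.WriteLine(SessionInit.Decide(enabled: true, explicitlySet: true, serverSupported: false));
// Default "true" (not explicitly set) against the same server.
Console.WriteLine(SessionInit.Decide(enabled: true, explicitlySet: false, serverSupported: false));

// Hypothetical outcome of session initialization.
enum ExtendedTypesAction { UseLegacyMetadata, EnableExtendedMetadata, Fail }

static class SessionInit
{
    // Hypothetical helper: decides what to do with enable_extended_types_metadata.
    public static ExtendedTypesAction Decide(bool enabled, bool explicitlySet, bool serverSupported)
    {
        if (!enabled)
            return ExtendedTypesAction.UseLegacyMetadata;      // option turned off
        if (serverSupported)
            return ExtendedTypesAction.EnableExtendedMetadata; // SET SESSION ... = TRUE
        // Unsupported server: fail only when the user asked for it explicitly (assumption).
        return explicitlySet ? ExtendedTypesAction.Fail : ExtendedTypesAction.UseLegacyMetadata;
    }
}
```

A test per row of this table (explicit true, explicit false, default) on both supported and unsupported servers would answer the coverage question directly.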

if (schema[i] is not SingleStoreDbColumn singleStoreColumn)
{
    // fallback to existing behavior
    goto LegacyHandling;
Collaborator

can we avoid goto here? It makes sense when there are several places that lead to one label (e.g. the code encounters errors and goto errorLabel is added to each "bad" scenario), here I'd prefer if-else possibly with some refactoring.
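
A minimal sketch of the if-else shape suggested here, with the legacy path pulled into its own method. All names (`ExtendedColumn`, `Mapper`, `DescribeExtended`, `DescribeLegacy`) are hypothetical stand-ins; the actual bulk copy code is not shown in this thread.

```csharp
using System;

// One extended column and one value that is not an ExtendedColumn.
object[] schema = { new ExtendedColumn("VECTOR"), "legacy-column" };
foreach (var column in schema)
    Console.WriteLine(Mapper.Describe(column));

// Hypothetical stand-in for SingleStoreDbColumn.
sealed record ExtendedColumn(string TypeName);

static class Mapper
{
    public static string Describe(object column)
    {
        // The type test selects the branch directly; no label is needed.
        if (column is ExtendedColumn extended)
            return DescribeExtended(extended);
        return DescribeLegacy(column);
    }

    static string DescribeExtended(ExtendedColumn c) => $"extended:{c.TypeName}";

    // Existing behavior lives in one method that both call sites could share.
    static string DescribeLegacy(object c) => $"legacy:{c}";
}
```

Extracting the legacy branch into a method keeps a single exit point for the fallback without a jump target.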

Comment on lines +24 to +26
type = typeof(byte[]);
dataTypeName = "BSON";
break;
Collaborator

shouldn't this be already set for a column that has SingleStoreDbType.Bson?

{
    writer.WriteString(ulongValue);
}
else if (Value is byte[] or ReadOnlyMemory<byte> or Memory<byte> or ArraySegment<byte> or MemoryStream or float[] or ReadOnlyMemory<float> or Memory<float>)
Collaborator

should this be extended to double[], sbyte[], int[]?


var vectorDimensions = reader.ReadUInt32();
VectorDimensions = vectorDimensions == 0 ? null : vectorDimensions;
VectorElementType = (SingleStoreVectorElementType) reader.ReadByte();
Collaborator

should we validate here that we support this element type?
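
One possible form of that check, sketched with a stand-in enum. `ElementType` and its members are hypothetical (the members of `SingleStoreVectorElementType` are not shown in this thread), and `NotSupportedException` is just one plausible choice of exception.

```csharp
using System;

Console.WriteLine(VectorMetadata.ParseElementType(1));
try
{
    VectorMetadata.ParseElementType(200);
}
catch (NotSupportedException ex)
{
    Console.WriteLine(ex.Message);
}

// Hypothetical stand-in; the real enum's members and values may differ.
enum ElementType : byte { Float32 = 1, Float64 = 2 }

static class VectorMetadata
{
    public static ElementType ParseElementType(byte raw)
    {
        var value = (ElementType) raw;
        // A plain cast accepts any byte, so reject codes the enum does not define.
        if (!Enum.IsDefined(typeof(ElementType), value))
            throw new NotSupportedException($"Unsupported VECTOR element type code {raw}.");
        return value;
    }
}
```

Failing at metadata parse time would surface an unknown element type before a column reader is chosen for it.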

Comment on lines +534 to +536
await using var cmd = new SingleStoreCommand(
"SET SESSION enable_extended_types_metadata = TRUE;",
this);
Collaborator

awkward indentation here

Copilot AI left a comment

Pull request overview

This PR adds native SingleStore VECTOR and BSON support to the connector by enabling extended type metadata (via a new EnableExtendedDataTypes connection-string option) and updating type mapping, column metadata, value readers, parameter handling, and bulk copy casting to understand these types.

Changes:

  • Introduces SingleStoreDbType.Vector / SingleStoreDbType.Bson, parses extended protocol metadata for these types, and returns VECTOR values as typed ReadOnlyMemory<T> while treating BSON as byte[].
  • Adds session initialization (SET SESSION enable_extended_types_metadata = TRUE) controlled by EnableExtendedDataTypes (default true) with version-gated fallback/exception behavior.
  • Extends parameter inference/serialization and SingleStoreBulkCopy mapping to correctly serialize/cast VECTOR/BSON; adds docs and expands tests for round-trips and bulk copy.

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/SingleStoreConnector.Tests/SingleStoreConnectionStringBuilderTests.cs | Updates tests for the new EnableExtendedDataTypes option default/parse/round-trip. |
| tests/SideBySide/StoredProcedureTests.cs | Adds a stored procedure test validating BSON result typing and value retrieval. |
| tests/SideBySide/ServerFeatures.cs | Adds ServerFeatures.ExtendedDataTypes feature flag. |
| tests/SideBySide/QueryTests.cs | Adds round-trip tests for VECTOR/BSON parameters and supported memory formats. |
| tests/SideBySide/ParameterTests.cs | Adds parameter inference tests for vector-like numeric arrays and extended db types. |
| tests/SideBySide/ExtendedDataTypeTestUtilities.cs | New helper utilities for asserting vector values across multiple data representations. |
| tests/SideBySide/DataTypesFixture.cs | Creates/populates VECTOR and BSON datatype tables when supported. |
| tests/SideBySide/DataTypes.cs | Adds schema and query tests for VECTOR/BSON, plus a legacy-metadata test when disabled. |
| tests/SideBySide/ConnectionTests.cs | Adds reset-connection test ensuring extended types session state is preserved. |
| tests/SideBySide/BulkLoaderSync.cs | Adds bulk copy tests for VECTOR/BSON using DataTable and DataReader (sync). |
| tests/SideBySide/BulkLoaderAsync.cs | Adds bulk copy tests for VECTOR/BSON using DataTable (async). |
| src/SingleStoreConnector/SingleStoreParameter.cs | Adds inference/serialization paths for VECTOR/BSON; refactors binary literal writing and vector literal creation. |
| src/SingleStoreConnector/SingleStoreDbType.cs | Adds Bson and Vector provider/logical types. |
| src/SingleStoreConnector/SingleStoreDbColumn.cs | Surfaces vector dimension/element type schema metadata and adjusts DataType/DataTypeName for extended types. |
| src/SingleStoreConnector/SingleStoreConnectionStringBuilder.cs | Adds EnableExtendedDataTypes connection-string option. |
| src/SingleStoreConnector/SingleStoreConnection.cs | Adds session initialization to enable extended metadata on supported servers and re-applies it on reset. |
| src/SingleStoreConnector/SingleStoreBulkCopy.cs | Adds VECTOR/BSON column mappings using UNHEX(...):>VECTOR(...) and UNHEX(...):>BSON, plus broader vector input support. |
| src/SingleStoreConnector/Protocol/Payloads/ColumnDefinitionPayload.cs | Parses extended type metadata from column definitions (type code, vector dimensions/element type). |
| src/SingleStoreConnector/Protocol/ColumnType.cs | Clarifies that BSON/VECTOR are exposed via extended metadata, not new base column types. |
| src/SingleStoreConnector/Core/TypeMapper.cs | Adds type metadata for BSON/VECTOR and maps extended type codes to provider types. |
| src/SingleStoreConnector/Core/SingleStoreBinaryValueConverter.cs | New shared converter for BSON raw bytes and vector element-to-byte conversion plus inference helpers. |
| src/SingleStoreConnector/Core/ServerVersions.cs | Adds SupportsExtendedDataTypes server version gate (8.5.28+). |
| src/SingleStoreConnector/Core/Row.cs | Updates binary column checks to allow BSON but disallow VECTOR-to-bytes casts. |
| src/SingleStoreConnector/Core/ConnectionSettings.cs | Plumbs EnableExtendedDataTypes and “explicitly set” detection into settings. |
| src/SingleStoreConnector/ColumnReaders/VectorColumnReader.cs | New typed readers for vector element types with length/endianness validation. |
| src/SingleStoreConnector/ColumnReaders/ColumnReader.cs | Routes extended types to appropriate readers (BytesColumnReader for BSON; vector readers for VECTOR). |
| docs/content/tutorials/best-practices.md | Documents how to read/write VECTOR/BSON and the EnableExtendedDataTypes option. |
| docs/content/connection-options.md | Adds EnableExtendedDataTypes connection option documentation and behavior details. |

Comment thread tests/SideBySide/ParameterTests.cs Outdated
Comment on lines +1 to +2
using SingleStoreConnector.Core;

Comment on lines +387 to +401
[Fact]
public void ExplicitVectorCanUseByteArray()
{
    var bytes = new byte[] { 1, 2, 3 };

    var parameter = new SingleStoreParameter
    {
        SingleStoreDbType = SingleStoreDbType.Vector,
        Value = bytes,
    };

    Assert.Equal(DbType.Binary, parameter.DbType);
    Assert.Equal(SingleStoreDbType.Vector, parameter.SingleStoreDbType);
    Assert.Same(bytes, parameter.Value);
}
Comment on lines +830 to +836
/// <summary>
/// Enables extended data types, by enabling the enable_extended_types_metadata engine variable, that allows the connector to support extended data types, such as VECTOR and BSON
/// </summary>
[Category("Other")]
[DefaultValue(true)]
[Description("Enable extended data types engine variable for VECTOR and BSON support.")]
[DisplayName("Enable extended data types")]
3 participants