
Add DatasetSchema and ColumnSchema core infrastructure #247

Open
GaganNarula wants to merge 4 commits into main from gagan/schema-core


@GaganNarula
Collaborator

Summary

  • Adds esp_data/schema.py with DatasetSchema, ColumnSchema, and SchemaValidationError
  • Adds get_dtype() method to DataBackend protocol and its pandas/polars implementations
  • Adds optional schema class attribute and _validate_schema() method to base Dataset class
  • Exports schema classes from esp_data.__init__
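
For reviewers who want a picture of how these pieces might fit together, here is a minimal sketch. It is not the code in this PR: the field names on `ColumnSchema` (`name`, `dtype`, `required`), the string-based dtype comparison, the `columns` property on the backend, and the `DictBackend` and `ClipDataset` classes are all assumptions made up for illustration. Only the names `DatasetSchema`, `ColumnSchema`, `SchemaValidationError`, `DataBackend`, `get_dtype()`, `schema`, and `_validate_schema()` come from the summary above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class SchemaValidationError(ValueError):
    """Raised when the data does not match the declared schema."""


@dataclass(frozen=True)
class ColumnSchema:
    # Hypothetical fields; the real class may carry more or different ones.
    name: str
    dtype: str  # assumed to be a plain string such as "int64"
    required: bool = True


@dataclass(frozen=True)
class DatasetSchema:
    columns: tuple[ColumnSchema, ...]


class DataBackend(Protocol):
    """Only the parts of the protocol this sketch needs."""

    @property
    def columns(self) -> list[str]: ...

    def get_dtype(self, column: str) -> str: ...


class DictBackend:
    """Toy stand-in for the pandas/polars backends (illustration only)."""

    def __init__(self, dtypes: dict[str, str]) -> None:
        self._dtypes = dtypes

    @property
    def columns(self) -> list[str]:
        return list(self._dtypes)

    def get_dtype(self, column: str) -> str:
        return self._dtypes[column]


class Dataset:
    # Optional class attribute: subclasses without a schema skip validation.
    schema: DatasetSchema | None = None

    def __init__(self, backend: DataBackend) -> None:
        self._data = backend
        self._validate_schema()

    def _validate_schema(self) -> None:
        if self.schema is None:
            return
        problems: list[str] = []
        present = set(self._data.columns)
        for col in self.schema.columns:
            if col.name not in present:
                # Absent optional columns are fine; absent required ones are not.
                if col.required:
                    problems.append(f"missing column {col.name!r}")
                continue
            actual = self._data.get_dtype(col.name)
            if actual != col.dtype:
                problems.append(
                    f"column {col.name!r}: expected {col.dtype}, got {actual}"
                )
        if problems:
            # Report every mismatch at once instead of stopping at the first.
            raise SchemaValidationError("; ".join(problems))


class ClipDataset(Dataset):
    """Hypothetical dataset that opts in by declaring a schema."""

    schema = DatasetSchema(
        columns=(
            ColumnSchema("path", "string"),
            ColumnSchema("label", "int64"),
        )
    )
```

Under these assumptions, `ClipDataset(DictBackend({"path": "string", "label": "int64"}))` constructs normally, while a backend reporting `"float64"` for `label` raises `SchemaValidationError`. Keeping `schema` optional means existing datasets keep working until a later PR in the stack gives them a schema.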

Part of

Stacked PR 1/4 for issue #227. Subsequent PRs add tests, per-dataset schemas, and description fields.

Test plan

  • Unit tests will be added in the next PR in the stack
  • Verify that `from esp_data import DatasetSchema, ColumnSchema` works

@GaganNarula GaganNarula requested a review from a team as a code owner March 3, 2026 07:21
Comment thread docs/schema_spec.md
@@ -0,0 +1,259 @@
# RFC: Dataset Schema System

**Status:** Draft
Collaborator Author


@mil-ad I think we could discuss this first. With our answers, it'll be easier to guide schema development.
