Tabicl model implementation #21
Open
4xel-C wants to merge 8 commits into
classifier and embedder. pyproject.toml: Adding optional dependencies for tabicl
management on non-fitted models. test.unit.test_tabicl: Wrote unit tests for tabicl model
addition of tabicl regressor and classifier
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds Mother-compatible wrappers for TabICL (classification, regression, and embedding extraction) and wires the new models into the existing ML/CV test matrices, along with dependency metadata updates.
Changes:
- Introduces `TabICLClassifierMother`, `TabICLRegressorMother`, and `TabICLEmbeddingTransformer` with Optuna hyperparameter support and sklearn-style APIs.
- Adds unit tests covering parameter handling, prediction/uncertainty, and embedding transformer behavior.
- Registers `tabicl` as an optional dependency group and includes TabICL in algorithm-selection fixtures.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| test/unit/test_tabicl.py | New unit tests for TabICL wrappers and embedding transformer. |
| test/unit/test_mother_cv.py | Adds tabicl to classification/regression algorithm fixtures for CV tests. |
| test/unit/test_ml.py | Adds tabicl to ML algorithm fixtures. |
| src/mother/ml/models/m_tabicl.py | New Mother wrappers for TabICL + embedding transformer implementation. |
| pyproject.toml | Adds tabicl dependency group (TabICL + torch). |
default quantiles variable. Documentation and typo correction. Correction of Optuna hyperparameter logging that could raise, depending on the type of Trial / FixedTrial used through Optuna, because the trial object may lack the `.number` attribute.
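The `.number` fix described in this commit message could be sketched as below. This is a minimal illustration and not the code from the PR: `describe_trial`, `log_trial_params`, and the two stand-in trial classes are hypothetical names, and the only assumption about Optuna is the one stated in the commit message, namely that some trial objects may not expose `.number`.

```python
import logging

logger = logging.getLogger(__name__)


class TrialWithNumber:
    """Hypothetical stand-in for a trial object that exposes `.number`."""

    number = 7


class TrialWithoutNumber:
    """Hypothetical stand-in for a trial object that lacks `.number`."""


def describe_trial(trial) -> str:
    # getattr with a default avoids an AttributeError when `.number` is missing.
    number = getattr(trial, "number", None)
    return f"trial {number}" if number is not None else "trial (no number)"


def log_trial_params(trial, params: dict) -> str:
    # Build the message first so logging never touches `trial.number` directly.
    message = f"{describe_trial(trial)}: {params}"
    logger.info(message)
    return message
```

With this pattern, logging works the same way for both kinds of trial object instead of raising on the one without a number.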
(--extra tabicl) for running tests src.mother.ml.models.m_tabicl.py: solved mutability problems caused by directly returning the parameter dictionaries. Typo correction. Adding error handling on GroupKFoldCV if a user passes a column containing only 1 group. Ensure 2D for X on check_X_y methods.
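The mutability problem mentioned in this commit message is commonly solved by returning a copy of the stored parameters. The sketch below shows that idea under stated assumptions: `ParamsHolder` is a hypothetical stand-in class, `_init_params` is borrowed from the PR description, and the actual fix in `m_tabicl.py` may differ.

```python
import copy


class ParamsHolder:
    """Hypothetical stand-in for a wrapper that stores its parameters in a dict."""

    def __init__(self, **params):
        self._init_params = dict(params)

    def get_params(self, deep: bool = True) -> dict:
        # Return a copy so callers cannot mutate the internal state by accident.
        if deep:
            return copy.deepcopy(self._init_params)
        return dict(self._init_params)

    def set_params(self, **params) -> "ParamsHolder":
        # Mutation only happens through this explicit entry point.
        self._init_params.update(params)
        return self


holder = ParamsHolder(n_estimators=8)
leaked = holder.get_params()
leaked["n_estimators"] = 1  # modifies the copy only
```

After the last line, `holder.get_params()` still reports `n_estimators=8`, because the caller only ever held a copy.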
Add TabICL Support to MotherML
https://github.com/soda-inria/tabicl
Summary
This pull request introduces full MotherML integration for TabICL by adding new model wrappers, uncertainty support for regression, embedding extraction utilities, dedicated unit tests, fixture updates in the shared ML test suite, and optional dependency declarations for TabICL and PyTorch.
The main additions are:
- `TabICLClassifierMother` and `TabICLRegressorMother` in `m_tabicl.py`
- `TabICLEmbeddingTransformer` for extracting TabICL representations
- `test/unit/test_tabicl.py`
- `test/unit/test_ml.py` so shared algorithm-level tests include `tabicl`
- `test/unit/test_mother_cv.py` so shared algorithm-level tests include `tabicl`
- `pyproject.toml` for `tabicl` and `torch`
Motivation
MotherML already supports several tabular ML backends such as CatBoost, Random Forest and TabPFN. This change extends that ecosystem with TabICL so it can be used through the same MotherML abstractions for:
This keeps the API surface aligned across supported algorithms and allows TabICL to participate in both model-level and framework-level tests.
Main Changes
1. New TabICL wrappers in `src/mother/ml/models/m_tabicl.py`
This PR adds a new module implementing Mother-compatible wrappers around the upstream `tabicl` package.
`_TabICLHyperParams`
Introduces a shared hyperparameter mixin responsible for:
- `_init_params`
- `get_params()` and `set_params()` in a sklearn-compatible way
- `get_hyperparameter_space()`
- `_check_input_type()`
The Optuna search space is model-aware:
- `n_estimators` is always tunable
- `softmax_temperature` is only suggested for classifiers
- `average_logits` is only suggested for classifiers
- `average_logits` is forced to `False` when `n_estimators == 1`
- `outlier_threshold` is only suggested for regressors
`TabICLClassifierMother`
Adds a Mother wrapper for `TabICLClassifier` with:
- `list`, `numpy.ndarray`, and `pandas.DataFrame` inputs
- `get_params()` and `set_params()` integration
- `_is_fitted`
`TabICLRegressorMother`
Adds a Mother wrapper for `TabICLRegressor` with the same parameter-management behavior, plus:
- `predict_uncertainty()` returning a standardized MotherML uncertainty output
- `return_quantiles=True`
- `uncertainty_for_opt=True`
The returned regression uncertainty output follows the same MotherML structure:
- `mean_predictions`
- `knowledge_uncertainty`
- `data_uncertainty`
- `total_uncertainty`
`TabICLEmbeddingTransformer`
Adds a transformer that extracts TabICL row representations via a forward hook on `row_interactor`.
Key supported behaviors:
- `StratifiedGroupKFold` and `GroupKFold`
This makes TabICL usable as a learned feature extractor inside Mother pipelines.
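The model-aware search space described for `_TabICLHyperParams` could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `suggest_tabicl_params` and `StubTrial` are hypothetical names, the numeric ranges are invented for the example, and the stub only mimics the `suggest_int`, `suggest_float`, and `suggest_categorical` methods of an Optuna trial so the block runs without Optuna installed.

```python
from typing import Any, Dict, Sequence


class StubTrial:
    """Hypothetical stand-in mimicking the `suggest_*` methods of an Optuna trial.

    It always returns the lower bound or the first choice.
    """

    def suggest_int(self, name: str, low: int, high: int) -> int:
        return low

    def suggest_float(self, name: str, low: float, high: float) -> float:
        return low

    def suggest_categorical(self, name: str, choices: Sequence[Any]) -> Any:
        return choices[0]


def suggest_tabicl_params(trial, is_classifier: bool) -> Dict[str, Any]:
    # All ranges below are illustrative assumptions, not the PR's values.
    params: Dict[str, Any] = {
        "n_estimators": trial.suggest_int("n_estimators", 1, 32)  # always tunable
    }
    if is_classifier:
        params["softmax_temperature"] = trial.suggest_float(
            "softmax_temperature", 0.5, 1.5
        )
        if params["n_estimators"] == 1:
            # Forced to False with a single estimator, as the description states.
            params["average_logits"] = False
        else:
            params["average_logits"] = trial.suggest_categorical(
                "average_logits", [True, False]
            )
    else:
        # Regressor-only parameter.
        params["outlier_threshold"] = trial.suggest_float(
            "outlier_threshold", 2.0, 6.0
        )
    return params
```

Branching on the model type inside one function keeps classifier-only and regressor-only parameters out of each other's search space while sharing the common `n_estimators` suggestion.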
2. Dedicated TabICL unit tests in `test/unit/test_tabicl.py`
This PR adds a dedicated test file for the new TabICL module.
The test coverage includes:
- `get_params()` / `set_params()` behavior
The test suite currently reaches 91% coverage on `src/mother/ml/models/m_tabicl.py`.
3. Shared ML test integration in `test/unit/test_ml.py` and `test/unit/test_mother_cv.py`
The shared algorithm fixtures in `test/unit/test_ml.py` were updated so `tabicl` is now recognized as a valid algorithm in framework-level uncertainty tests.
Specifically:
- `all_classification_algorithms` now instantiates `TabICLClassifierMother` when `algorithm == "tabicl"`
- `all_regression_algorithms` now instantiates `TabICLRegressorMother` when `algorithm == "tabicl"`
This was necessary because `ml.get_available_algorithms()` already includes `tabicl`, but the fixtures previously only handled:
- `catboost`
- `randomforest`
- `tabpfn`
- `lasso`
4. Optional dependencies in `pyproject.toml`
This PR also updates optional dependencies so TabICL can be installed through extras.
The new optional dependency block is:
This is aligned with the existing pattern already used for optional model backends such as TabPFN.
Validation
The complete test suite has been run, including the new tests.
`uv run pytest`
Results:
Notes
- `TabICLClassifierMother` currently relies on the generic MotherML fallback implemented in `AbstractMotherPipeline.predict_uncertainty()`. This is consistent with the behavior already used for other classifiers without explicit uncertainty implementations.
- `TabICLRegressorMother` using quantile outputs.
- `Tabpfn` implementation.
Change Overview
In short, this PR:
- `m_tabicl.py`
- `tabicl` participates in framework-level tests
- `tabicl` and `torch` dependencies in `pyproject.toml`