MLO-70 multi partition support#59

Merged
MDobransky merged 4 commits into develop from MLO-70_multi_partition_support on Apr 7, 2026

Conversation

@MDobransky
Collaborator

No description provided.

@MDobransky MDobransky requested a review from vvancak as a code owner March 20, 2026 20:53
Comment thread rialto/runner/runner.py Outdated
  df = self._execute(feature_group, run_date, pipeline)
  self.writer.write(df, info_date, target)
- records = self._check_written(info_date, target)
+ records = self._check_written(info_date, target, df)

This df comes from self._execute, so running .collect() on it triggers the whole computation a second time. The whole point of using _check_written instead of df.count() was to avoid computing the same DataFrame twice, since some of our computations are heavy.
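
The concern above can be illustrated with a minimal sketch. It uses a toy stand-in for a lazily evaluated DataFrame, not the PySpark or Rialto API: `LazyFrame`, `Storage`, `heavy_job`, the target name, and the date are all hypothetical names chosen for illustration. The sketch assumes that every action on an uncached lazy frame re-runs its computation, and compares verifying a write by reading back what was stored with verifying it by running another action on the same frame.

```python
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (hypothetical, not PySpark)."""

    def __init__(self, compute):
        self._compute = compute  # zero-argument function that produces the rows
        self.runs = 0            # number of times the computation was triggered

    def collect(self):
        # Each action re-runs the full computation, as an uncached lazy plan would.
        self.runs += 1
        return self._compute()


class Storage:
    """Toy table store keyed by (target, info_date) (hypothetical)."""

    def __init__(self):
        self._tables = {}

    def write(self, df, info_date, target):
        # Writing is one action on the lazy frame.
        self._tables[(target, info_date)] = df.collect()

    def count(self, info_date, target):
        # Counts the rows already stored; never touches the lazy frame.
        return len(self._tables.get((target, info_date), []))


def heavy_job():
    # Placeholder for an expensive computation.
    return [{"id": i} for i in range(3)]


storage = Storage()

# Variant 1: verify the write by reading back the stored partition.
df = LazyFrame(heavy_job)
storage.write(df, "2026-03-20", "features")
records_readback = storage.count("2026-03-20", "features")
runs_readback = df.runs

# Variant 2: verify the write by running another action on the same lazy frame.
df2 = LazyFrame(heavy_job)
storage.write(df2, "2026-03-20", "features")
records_recompute = len(df2.collect())
runs_recompute = df2.runs
```

Both variants report the same record count, but the second one triggers the heavy computation twice. In a real Spark job, caching the frame before the write would be another way to avoid the recomputation, at the cost of memory.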

Comment thread rialto/runner/runner.py

df = self.reader.get_table(
table.get_table_path(), date_column=table.partition, date_from=date, date_to=date, filters=filters
)

date_from=info_date, date_to=info_date
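
As a rough illustration of what passing the same value for both bounds does, here is a small sketch. `select_partitions` is a hypothetical helper, not the Rialto reader; it assumes inclusive bounds and equality filters given as a dict, and the column names and rows are invented for the example.

```python
from datetime import date


def select_partitions(rows, date_column, date_from, date_to, filters=None):
    """Hypothetical helper: keep rows whose partition date lies in
    [date_from, date_to] (inclusive) and that match every equality filter."""
    filters = filters or {}
    return [
        row
        for row in rows
        if date_from <= row[date_column] <= date_to
        and all(row.get(key) == value for key, value in filters.items())
    ]


# Invented sample data: two partitions, one of them with two versions.
rows = [
    {"INFORMATION_DATE": date(2026, 3, 1), "VERSION": "a", "value": 1},
    {"INFORMATION_DATE": date(2026, 3, 1), "VERSION": "b", "value": 2},
    {"INFORMATION_DATE": date(2026, 3, 2), "VERSION": "a", "value": 3},
]

info_date = date(2026, 3, 1)

# Equal bounds select exactly one date partition.
single_day = select_partitions(
    rows, "INFORMATION_DATE", date_from=info_date, date_to=info_date
)

# Extra filters narrow the selection within that partition.
single_day_a = select_partitions(
    rows, "INFORMATION_DATE", info_date, info_date, filters={"VERSION": "a"}
)
```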

@MDobransky MDobransky merged commit 8d0469c into develop Apr 7, 2026
1 check passed
@MDobransky MDobransky deleted the MLO-70_multi_partition_support branch April 7, 2026 11:05