#746 Remove CHAR(n) support from data types supported when creting Hive tables. by yruslan · Pull Request #747 · AbsaOSS/pramen

yruslan · 2026-05-18T09:47:23Z

Closes #746

Summary by CodeRabbit

Bug Fixes
- Corrected CHAR data type handling when importing database schemas from JDBC connections—CHAR fields are now properly mapped as VARCHAR types to ensure consistent behavior throughout the system
- Enhanced string field length metadata detection and application with improved validation thresholds to more accurately capture database schema constraints
- Added dedicated support for UUID field types with proper length validation handling and metadata extraction

…ve tables.

coderabbitai · 2026-05-18T09:47:35Z

Warning

Rate limit exceeded

@yruslan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 53 minutes and 21 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1b190d0d-ad08-4982-88e8-5e44c7e090fc

📥 Commits

Reviewing files that changed from the base of the PR and between fd4d604 and c5efad1.

📒 Files selected for processing (1)

pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala

Walkthrough

This PR removes CHAR(n) support from JDBC-sourced Spark metadata and standardizes on VARCHAR(n). String-length constants are updated to increase the maximum from 255 to 8192 bytes, type resolution logic is modified to map CHAR metadata to VARCHAR, imports are adjusted, and tests are updated to validate the new mapping behavior.

Changes

CHAR to VARCHAR Migration

Layer / File(s)	Summary
String-length constant definitions `pramen/core/src/main/scala/za/co/absa/pramen/core/utils/JdbcSparkUtils.scala`	`MAXIMUM_CHAR_LENGTH` (255) is removed; `MAXIMUM_VARCHAR_LENGTH` (8192) and `MAXIMUM_UUID_LENGTH` (50) are introduced.
CHAR to VARCHAR type resolution `pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala`, `pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/SparkUtilsSuite.scala`	`SparkUtils` import is updated to reference `MAXIMUM_VARCHAR_LENGTH`. `getStringTypeFromMetadata` no longer maps `CHAR(len)` metadata to `CharType`; instead, it maps to `VarcharType(len)`. Test assertion for `CHAR(12)` metadata is updated to expect `VarcharType(12)`.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 No more CHAR constraints to fear,
VARCHAR opens spaces clear!
From 255 to 8192 we soar,
Hive's limits plague us nevermore!
A humble hop toward compatibility fair.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title accurately describes the main change: removing CHAR(n) support and replacing it with VARCHAR(n) for Hive table creation, which aligns with all file modifications.
Linked Issues check	✅ Passed	The pull request successfully implements the objective from issue `#746` by removing CHAR(n) support, updating constants to use VARCHAR(n) with 8192 limit, and modifying metadata handling to map CHAR to VARCHAR instead of CharType.
Out of Scope Changes check	✅ Passed	All changes are directly related to the objective of removing CHAR(n) support from Hive table creation. No unrelated modifications were introduced.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bugfix/746-remove-char-support-from-hive

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (2)

pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/SparkUtilsSuite.scala (1)

706-713: ⚡ Quick win

Update test name to reflect VARCHAR behavior.

The test name says "return char type from CHAR_VARCHAR_METADATA_KEY with char" but the test now expects VarcharType(12) instead of CharType(12). Update the test name to accurately describe that CHAR metadata is now mapped to VarcharType.

📝 Proposed test name fix

-    "return char type from CHAR_VARCHAR_METADATA_KEY with char" in {
+    "return varchar type from CHAR_VARCHAR_METADATA_KEY with char" in {
       val metadata = new MetadataBuilder
       metadata.putString(CHAR_VARCHAR_METADATA_KEY, "CHAR(12)")
 
       val stringType = SparkUtils.getStringTypeFromMetadata(metadata.build())
 
       assert(stringType == VarcharType(12))
     }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/SparkUtilsSuite.scala`
around lines 706 - 713, Rename the failing test's description in
SparkUtilsSuite: change the test name string that currently reads "return char
type from CHAR_VARCHAR_METADATA_KEY with char" to something that reflects
VARCHAR mapping (e.g., "return varchar type from CHAR_VARCHAR_METADATA_KEY with
char") so it matches the assertion that
SparkUtils.getStringTypeFromMetadata(metadata.build()) returns VarcharType(12);
reference the test in SparkUtilsSuite and the CHAR_VARCHAR_METADATA_KEY constant
to locate and update the description only.

pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala (1)

420-431: ⚡ Quick win

Update documentation to reflect removal of CharType support.

The method documentation still mentions CharType as a possible return type, but the implementation now only returns VarcharType or StringType (never CharType). Update the doc comment to remove references to CharType.

📝 Proposed documentation fix

   /**
-    * Extracts a string-based data type (CharType, VarcharType, or StringType) from field metadata.
+    * Extracts a string-based data type (VarcharType or StringType) from field metadata.
     *
     * First checks for the presence of character/varchar type metadata key. If found, parses the metadata
-    * value using a pattern to determine if it represents a CHAR or VARCHAR type with a specific length.
+    * value using a pattern to extract the length. Both CHAR and VARCHAR metadata are mapped to VarcharType.
     * If the metadata key is not present, attempts to extract a length value from metadata and creates
     * a VarcharType if successful. Falls back to StringType if no specific type information is found
     * or if the metadata cannot be parsed.
     *
     * `@param` metadata the metadata object containing type information for a field
-    * `@return` the resolved string-based DataType, which can be CharType, VarcharType, or StringType
+    * `@return` the resolved string-based DataType, which can be VarcharType or StringType
     */

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala`
around lines 420 - 431, Update the ScalaDoc for the string-type extractor in
SparkUtils.scala to remove references to CharType and reflect current behavior:
state that the method (e.g., extractStringBasedDataType) returns either
VarcharType or StringType only, describe how metadata keys are parsed to produce
a VarcharType with length or fall back to StringType, and remove any mention
that CharType may be returned.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala`:
- Around line 420-431: Update the ScalaDoc for the string-type extractor in
SparkUtils.scala to remove references to CharType and reflect current behavior:
state that the method (e.g., extractStringBasedDataType) returns either
VarcharType or StringType only, describe how metadata keys are parsed to produce
a VarcharType with length or fall back to StringType, and remove any mention
that CharType may be returned.

In
`@pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/SparkUtilsSuite.scala`:
- Around line 706-713: Rename the failing test's description in SparkUtilsSuite:
change the test name string that currently reads "return char type from
CHAR_VARCHAR_METADATA_KEY with char" to something that reflects VARCHAR mapping
(e.g., "return varchar type from CHAR_VARCHAR_METADATA_KEY with char") so it
matches the assertion that
SparkUtils.getStringTypeFromMetadata(metadata.build()) returns VarcharType(12);
reference the test in SparkUtilsSuite and the CHAR_VARCHAR_METADATA_KEY constant
to locate and update the description only.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 554d823d-8ad1-4017-8dec-05e5a93f3a5b

📥 Commits

Reviewing files that changed from the base of the PR and between 3e3f035 and fd4d604.

📒 Files selected for processing (3)

pramen/core/src/main/scala/za/co/absa/pramen/core/utils/JdbcSparkUtils.scala
pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala
pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/SparkUtilsSuite.scala

💤 Files with no reviewable changes (1)

pramen/core/src/main/scala/za/co/absa/pramen/core/utils/JdbcSparkUtils.scala

github-actions · 2026-05-18T10:00:50Z

Unit Test Coverage

Overall Project	77.26%	🍏

Module	Coverage
pramen:core Jacoco Report	78.3%	🍏

Files

Module	File	Coverage
pramen:core Jacoco Report	JdbcSparkUtils.scala	91.75%	🍏
	SparkUtils.scala	87.08%	🍏

#746 Remove CHAR(n) support from data types supported when creting Hi…

fd4d604

…ve tables.

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

#746 Fix the documentation of the changed method.

c5efad1

yruslan merged commit a4cc52d into main May 18, 2026
7 checks passed

yruslan deleted the bugfix/746-remove-char-support-from-hive branch May 18, 2026 10:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

#746 Remove CHAR(n) support from data types supported when creting Hive tables.#747

#746 Remove CHAR(n) support from data types supported when creting Hive tables.#747
yruslan merged 2 commits into
mainfrom
bugfix/746-remove-char-support-from-hive

yruslan commented May 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 18, 2026 •

edited

Loading

Rate limit exceeded

Uh oh!

coderabbitai Bot left a comment

Uh oh!

github-actions Bot commented May 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

yruslan commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Unit Test Coverage

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yruslan commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading

github-actions Bot commented May 18, 2026 •

edited

Loading