MLE-30013 MLE-30014 MLE-30015 MLE-30016 MLE-30017 MLE-30018 CVEs in dependencies#647
rjdew-progress wants to merge 4 commits into
rjdew-progress commented Jun 26, 2026
- MLE-30013 MLE-30014 MLE-30015 OpenNLP CVEs
- MLE-30016 GHSA-jjwr-xmw6-gf78 in Apache PDFBox v3.0.7
- MLE-30017 MLE-30018 Jetty CVEs
Pull request overview
Updates several Gradle-managed dependencies to remediate referenced CVEs (OpenNLP/PDFBox/Jetty-related) and adjusts S3/AWS SDK integration and tests to remain compatible with the upgraded AWS SDK/Hadoop stack.
Changes:
- Bump core dependency versions (Hadoop, AWS SDK, LangChain4j, Tika; plus Azure storage client).
- Add explicit AWS SDK Apache HTTP client dependency needed by `hadoop-aws` after excluding the large AWS SDK bundle.
- Adjust S3 credential handling usage in code/tests (AWS SDK `DefaultCredentialsProvider` instantiation + test arguments/fixtures).
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `gradle.properties` | Updates shared dependency versions (Hadoop/AWS SDK/LangChain4j/Tika). |
| `flux-embedding-model-minilm/build.gradle` | Bumps the minilm embeddings artifact version. |
| `flux-cli/build.gradle` | Adds `software.amazon.awssdk:apache-client` and bumps Azure Data Lake dependency. |
| `flux-cli/src/main/java/com/marklogic/flux/impl/S3Params.java` | Updates AWS SDK credentials provider construction for newer SDK versions. |
| `flux-cli/src/test/java/com/marklogic/flux/impl/S3ParamsTest.java` | Adds fake AWS credentials setup/teardown for deterministic `DefaultCredentialsProvider` behavior. |
| `flux-cli/src/test/java/com/marklogic/flux/impl/HandleErrorTest.java` | Uses explicit S3 access key/secret args in the S3 connectivity error test. |

`flux-cli/src/test/java/com/marklogic/flux/impl/S3ParamsTest.java`

    private S3Params params = new S3Params();

    @BeforeEach
    void setFakeAwsCredentials() {
        System.setProperty("aws.accessKeyId", "fakeAccessKeyId");
        System.setProperty("aws.secretAccessKey", "fakeSecretKey");
    }

    @AfterEach
    void clearFakeAwsCredentials() {
        System.clearProperty("aws.accessKeyId");
        System.clearProperty("aws.secretAccessKey");
    }
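
One caveat with the fixture above: calling `System.clearProperty` in teardown also discards any value of `aws.accessKeyId` or `aws.secretAccessKey` that was set before the test ran. A minimal sketch of a snapshot-and-restore alternative is shown here. `ScopedSystemProperties` is a hypothetical helper written for illustration; it is not part of this PR, of JUnit, or of the AWS SDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (an assumption, not code from this PR): sets system
 * properties and restores whatever values they had before when closed.
 */
final class ScopedSystemProperties implements AutoCloseable {

    // Original values keyed by property name; a null value means "was not set".
    private final Map<String, String> previous = new LinkedHashMap<>();

    ScopedSystemProperties set(String key, String value) {
        // Record only the first-seen original so repeated set() calls keep it intact.
        if (!previous.containsKey(key)) {
            previous.put(key, System.getProperty(key));
        }
        System.setProperty(key, value);
        return this;
    }

    @Override
    public void close() {
        for (Map.Entry<String, String> entry : previous.entrySet()) {
            if (entry.getValue() == null) {
                // The property did not exist before, so remove it again.
                System.clearProperty(entry.getKey());
            } else {
                // The property existed before, so put the original value back.
                System.setProperty(entry.getKey(), entry.getValue());
            }
        }
        previous.clear();
    }
}
```

In a JUnit 5 test this could be created in the `@BeforeEach` method and closed in the `@AfterEach` method, or wrapped around a single test body with try-with-resources. Whether the extra bookkeeping is worth it depends on whether the build ever runs these tests with real AWS system properties set.
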

`gradle.properties`

    hadoopVersion=3.5.0
    awssdkVersion=2.46.17
    langchain4jVersion=1.16.3
    tikaVersion=3.3.1

`flux-embedding-model-minilm/build.gradle`

      dependencies {
    -     implementation "dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2:1.11.0-beta19"
    +     implementation "dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2:1.11.8-beta19"

rjrudin left a comment
I would do all the bumps except forcing Hadoop to a version other than the one that Spark depends on.

`gradle.properties`

    - awssdkVersion=2.29.52
    - langchain4jVersion=1.11.0
    - tikaVersion=3.2.3
    + hadoopVersion=3.5.0

Spark can be bumped to 4.1.2 - https://spark.apache.org/releases/spark-release-4-1-2.html - but I strongly recommend not forcing Spark to use a version of Hadoop that it's not tested with. Spark and Hadoop have a tight coupling and once we force Spark to use a different version of Hadoop, we're no longer using the Apache-verified/tested Spark, we're using a custom version of Spark.
Any vulnerabilities associated with Hadoop thus have to be accepted, and I think accepting them can reasonably be justified as "There are far more users of Apache Spark that are living with these vulnerabilities and waiting for Apache Spark 4.2.0 to be released".
If necessary, you could try bumping to the latest preview release of Spark 4.2.0 to see how many vulnerabilities that addresses.
e45c19d to ba383ff
ba383ff to 390f5f7