chore(loader): improve the perf in loader-ci#724
Conversation
|
@imbajin could you please take a look when you are convenient❤️ |
| echo "PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin" >> "${GITHUB_ENV}" | ||
| fi | ||
|
|
||
| echo "export HADOOP_HOME=${HADOOP_HOME}" >> ~/.bashrc |
There was a problem hiding this comment.
~/.bashrc appends still make it non-idempotent: every invocation adds another Hadoop segment to PATH, so a reused runner or local repro environment keeps growing shell state. It would be safer to guard these writes (or rely on GITHUB_ENV/GITHUB_PATH in CI) so repeated runs stay stable.
| echo "export HADOOP_HOME=${HADOOP_HOME}" >> ~/.bashrc | |
| if ! grep -qxF "export HADOOP_HOME=${HADOOP_HOME}" ~/.bashrc; then | |
| echo "export HADOOP_HOME=${HADOOP_HOME}" >> ~/.bashrc | |
| fi | |
| if ! grep -qxF "export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native" ~/.bashrc; then | |
| echo "export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native" >> ~/.bashrc | |
| fi | |
| if ! grep -qxF "export PATH=\$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin" ~/.bashrc; then | |
| echo "export PATH=\$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin" >> ~/.bashrc | |
| fi |
| DB_PASS="${2:-root}" | ||
|
|
||
| # Skip if MySQL is already running (e.g., provided by GitHub Actions service) | ||
| if docker ps | grep -q "mysql:5.7"; then |
There was a problem hiding this comment.
mysql:5.7 container running, the script now exits without ensuring the requested test container name/password/port match what the loader tests expect.
That can turn a clean setup failure into a much harder-to-diagnose auth/connection failure later. Could we scope the idempotency check to the requested container instead?
| if docker ps | grep -q "mysql:5.7"; then | |
| if docker ps --format '{{.Names}}' | grep -qx "${DB_NAME}"; then | |
| echo "MySQL container ${DB_NAME} is already running, skipping setup." | |
| exit 0 | |
| fi |
Also consider upgrading the mysql version to 8.x (stable version) if possible
| set -ev | ||
|
|
||
| sudo wget https://archive.apache.org/dist/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz | ||
| HADOOP_VERSION="2.8.5" |
There was a problem hiding this comment.
Prefer to use 3.x (stable version)
| @@ -23,12 +27,20 @@ TRAVIS_DIR=$(dirname $0) | |||
| CONF=hugegraph-test/src/main/resources/hugegraph.properties | |||
| MYSQL_USERNAME=root | |||
|
|
|||
There was a problem hiding this comment.
mysql, so the guard will miss the case you want to avoid and the script will still try to start a second MySQL instance.
Consider checking whether port 3306 is already occupied or probing the database itself instead.
| .setParameter("rewriteBatchedStatements", "true") | ||
| .setParameter("useServerPrepStmts", "false") | ||
| .setParameter("autoReconnect", "true"); | ||
| .setParameter("autoReconnect", "true") |
There was a problem hiding this comment.
buildUrl() is shared by MYSQL and POSTGRESQL. Putting a MySQL-only auth workaround in the common path leaks this flag into the PostgreSQL URL too.
Please scope allowPublicKeyRetrieval to the MySQL branch instead.
| .setParameter("autoReconnect", "true") | |
| uriBuilder.setParameter("useSSL", "false") | |
| .setParameter("characterEncoding", Constants.CHARSET.name()) | |
| .setParameter("rewriteBatchedStatements", "true") | |
| .setParameter("useServerPrepStmts", "false") | |
| .setParameter("autoReconnect", "true"); |
Purpose of the PR
Main Changes
mysql:8.0container service with health check configured. This eliminates the need to executedocker pullanddocker runfor every workflow run.~/hadoop-3.3.6.tar.gzwith a fixed cache keyhadoop-3.3.6.~/hugegraph-cache-${{ env.COMMIT_ID }}; the cache key contains the commit ID to enable auto-invalidation when the version changes.install-mysql.sh: MySQL is now provisioned by GitHub Actions services./usr/local/hadoopalready exists; use the cached tar.gz file directly if it exists (while the local directory is missing) to avoid repeated downloads from archive.apache.org.GITHUB_ENV: Ensure subsequent steps can correctly obtainHADOOP_HOMEandPATHenvironment variables.core-site.xmlandhdfs-site.xmlif the configuration files do not exist or their content is mismatched.docker pullanddocker runif themysql:8.0container is already running (e.g., provided by GitHub Actions services).apache-hugegraph-*.tar.gzexists in~/hugegraph-cache-${COMMIT_ID}before building. Use the cached tarball directly and skip the time-consuminggit clone+mvn packageif it exists; copy the tarball to the cache directory for reuse in future runs after the build if it does not exist.result:

Verifying these changes
Does this PR potentially affect the following parts?
Documentation Status
Doc - TODODoc - DoneDoc - No Need