Feature: Add external database volume and external database connection#155
Feature: Add external database volume and external database connection#155nshandra wants to merge 9 commits intodevelopmentfrom
Conversation
|
@nshandra Sorry, I had this pending. Can you merge with development? (re-add me when done) |
|
@tokland done. |
| @@ -300,6 +318,9 @@ def run_docker_compose( | |||
| ("ROOT_PATH", ROOT_PATH), | |||
| ("PSQL_ENABLE_QUERY_LOGS", "") if not enable_postgres_queries_logging else None, | |||
| ("GLOWROOT_PORT", get_port_glowroot(glowroot_port)) | |||
There was a problem hiding this comment.
Missing trailing comma, the code does not run as-is.
There was a problem hiding this comment.
Yep, i messed the merge, sorry about that.
| "--external-db-volume", | ||
| metavar="DIRECTORY", | ||
| help="Directory for external database volume", | ||
| ) |
There was a problem hiding this comment.
This option worked for me. Only problem is that the final permissions of the folder are off:
$ ls -al | grep volume
drwx------ 19 70 70 4.0K Apr 14 10:21 volumeHowever, that's something d2-docker has been ignoring all along - we probably should have user: "${UID}:${GID}" or similar in our docker-compose to avoid this kind of problems. Not sure it's the moment to try that (it may have unintended consequences), but I'd be a step in the good direction.
There was a problem hiding this comment.
I did a test with data["services"]["db"]["user"] = "{}:{}".format(os.getuid(), os.getgid()) and as expected it failed with errors like:
initdb: could not look up effective user ID 1000: user does not exist
because "my" user (or more specifically, an user with the same UID) doesn't exist in the db container.
The next test was with data["services"]["db"]["user"] = "0:0". This one works because there should be a superuser (should we run postgresql as root instead of postgres? probably not). In practice i'm not sure how much of an improvement this is, you still need sudo to view the folder.
I guess a better solution is to have a DB container that creates a postgres user whose UID/GID matches the user launching d2-docker.
| type=str, | ||
| metavar="postgresql://user:pass@host:port/dbname", | ||
| help="Use external PostgreSQL database" | ||
| ) |
| "--load-dump-from-data", | ||
| action="store_true", | ||
| help="Load database dump from data container (only with --external-db-url)", | ||
| ) |
There was a problem hiding this comment.
It worked for me, only that I had to explicitly create the database with psql for it to work. Is that the intended workflow?
There was a problem hiding this comment.
In README it explain this option as:
Use option
--load-dump-from-datawith--external-db-urlto import the SQL dump to the external database. Equivalent to running without-k/--keep-containers. The receiving DB should have the appropriate config (DB owner, user permissions, postgis extension).
Given that setting up a new DB can be a complex process (installing packages, sudo access, remote access to the machine hosting the DB and setting IPs in pg_hba.conf) i decided to not include a DB setup in the script.
| else | ||
| echo "Normal mode: $path" | ||
| $psql_cmd < "$path" | ||
| $psql_cmd <"$path" |
There was a problem hiding this comment.
Formatting is being applied to code outside of the PR. It would be fine if the formatting were incorrect, but we don't have a defined prettifier, so there is no "incorrect" for now. Would the best approach be a .vscode/settings.json file where we set up the formatter and its options?
There was a problem hiding this comment.
If im not mistaken .vscode/settings.json would mean using the same shell format plugin or include the configs for the popular ones.
In my case the plugin uses shfmt and that has no typical config file like .pylintrc. Closest thing is using an .editorconfig file with the printer flags:
binary_next_line = false
switch_case_indent = false
space_redirects = false
keep_padding = false
function_next_line = false
Also a format should be decided (not sure how many of our devs have strong preferences in sh formatting). So maybe this could be a topic for the technical decisions meetings?
For this PR i will revert this formatting.
There was a problem hiding this comment.
I don't think anyone has a strong opinion about this, me neither, take the one your prefer.
| ("GLOWROOT_PORT", get_port_glowroot(glowroot_port)) | ||
| ("EXTERNAL_DB_VOLUME", external_db_volume) if external_db_volume else None, | ||
| ("EXTERNAL_DB_URL", external_db_url) if external_db_url else None, | ||
| ("LOAD_DUMP_FROM_DATA", "yes" if load_dump_from_data else "no") if external_db_url else "no", |
There was a problem hiding this comment.
The grouping/parentheses look off, right?
There was a problem hiding this comment.
Yeah, should be either (k, v) or None. With check_conflicting_external_params from src/d2_docker/commands/start.py there is no need for the if external_db_url check.
| logger.info("Validating external database connection...") | ||
| try: | ||
| psql_cmd = ["psql", "-d", db_url, "-c", "SELECT now();", "&> /dev/null"] | ||
| run(psql_cmd, capture_output=False) |
There was a problem hiding this comment.
it seems it worked, but I am curious about "&> /dev/null", since the default for run is shell=False.
There was a problem hiding this comment.
That output capture does nothing indeed. Changed code to hide the SELECT output via capture_output=True. The exception still shows the psql command error output due to raise_on_error = True in run default.
|
|
||
| if args.external_db_url: | ||
| utils.validate_external_db_connection(args.external_db_url) | ||
|
|
There was a problem hiding this comment.
Should we add a check to raise error if both options are passed, as they are mutually exclusive?
Also, a standalone --load-dump-from-data is not correct workflow.
There was a problem hiding this comment.
Added check_conflicting_external_params to raise exceptions with incorrect parameters.
Also added a line in README about the mutually exclusive parameters.
| psql_cmd="$psql_base_cmd -v ON_ERROR_STOP=0" | ||
| psql_strict_cmd="$psql_base_cmd -v ON_ERROR_STOP=1" | ||
| pgrestore_cmd="pg_restore -h db -U dhis -d dhis2" | ||
| pgrestore_cmd="pg_restore ${db_url:-"-h db -U dhis dhis2"}" |
There was a problem hiding this comment.
Before we had -d dhis2 but now it's directly dhis2. Is that correct?
There was a problem hiding this comment.
Checking pg_restore docs it should have -d. Restored.
| if parsed.hostname == "localhost": | ||
| hostname = "host.docker.internal" | ||
| else: | ||
| hostname = parsed.hostname |
There was a problem hiding this comment.
[minor] we can use hostname = ..
There was a problem hiding this comment.
True, changed to:
hostname = "host.docker.internal" if parsed.hostname == "localhost" else parsed.hostname
…t this. README spellcheck.
📌 References
📝 Implementation
Add parameters to d2-docker start to allow either:
Testing
External Persistent Volume Case
This case will store the DB volume at the specified absolute path.
External DB Case
This case needs a psql DB runnig in the machine hosting the dockers or in a accessible network.
This DB has to have the correct settings:
/etc/postgresql/{ver}/main/postgresql.confSet
listen_addressesto:'*'(easiest for testing)'localhost,172.17.0.1'(docker and DB in same system)'localhost,<reachable_ip>'/etc/postgresql/{ver}/main/pg_hba.confAdd an entry to allow the connection. Example:
Running d2-cloner:
#1ck80y8