TEEHR Cloud is a coordinated set of services built around the Tools for Exploratory Evaluation in Hydrologic Research (TEEHR) to support large-scale data analytics for evaluating hydrologic model performance.
TEEHR Cloud utilizes Kubernetes to orchestrate the services that make up the system. The primary components are:
- JupyterHub
- Prefect
- Apache Iceberg
- Apache Spark
- Trino
- FastAPI
- React
For local development, developers can run a local instance of the cluster using KinD. You will need to have the following installed. The versions shown are known to work for our developers. Subsequent minor and bug-fix releases should also work, but your mileage may vary.
KinD https://kind.sigs.k8s.io/
% kind version
kind v0.27.0 go1.24.0 darwin/arm64
Garden https://docs.garden.io/getting-started/quickstart
% garden version
garden version: 0.13.54
kubectl https://kubernetes.io/docs/tasks/tools/
% kubectl version
Client Version: v1.33.0
Kustomize Version: v5.5.0
Server Version: v1.33.0
Optional, but recommended, k9s https://k9scli.io/
% k9s version
____ __ ________
| |/ / __ \______
| /\____ / ___/
| \ \ / /\___ \
|____|\__ \/____//____ /
\/ \/
Version: v0.50.2
Commit: bc22b8705304b86c2f4c417a088accdfed13fdf8
Date: 2025-04-10T15:32:12Z
After you have the dependencies above installed, you should be able to create a kind cluster by running the following from the repo root.
./kind/create_kind_cluster.sh
If the kind cluster creation is successful, you can then run the following to deploy the application to the local cluster:
garden deploy
This should create all the services in the cluster. To test, open a browser and go to https://api.teehr.local.app.garden. Two notes:
- We use a self-signed certificate for local development, so you will have to accept it in your browser. Specifically, before the dashboards will work, you need to go to `api.teehr.local.app.garden` and accept the self-signed cert for the API.
- You may need to edit your `/etc/hosts` file so that this address points to localhost. You likely need the following entries in your `/etc/hosts` file.
# Add for TEEHR-HUB development
127.0.0.1 hub.teehr.local.app.garden
127.0.0.1 minio.teehr.local.app.garden
127.0.0.1 dashboards.teehr.local.app.garden
127.0.0.1 api.teehr.local.app.garden
127.0.0.1 panel.teehr.local.app.garden
Loading data is a little fractured, depending on what data you are loading. For development purposes, there are two types of data that can be loaded. Either way, you first need to create an Iceberg warehouse in the KinD cluster, then load some data.
- To create the Iceberg warehouse and load some historic simulation data, start by going to the JupyterHub environment at `hub.teehr.local.app.garden` and logging in with username `user` and password `password`.
- Copy the following notebooks to JupyterHub and run them in order. This will create an Iceberg data warehouse in the KinD cluster and populate it with historic observations and simulations for 10 sites.
  - examples/01_setup_minio_warehouse.ipynb
  - examples/02_create_joined_timeseries.ipynb
  - examples/03_generate_basic_metrics.ipynb
- To load some recent (but not too recent) forecasts, port-forward the Prefect service to localhost, open Prefect in the browser, and execute a few workflows.
kubectl config use-context kind-kind
kubectl port-forward -n teehr-hub svc/prefect-server 4200:4200
Open your browser and go to http://localhost:4200. Navigate to Deployments.
- Click on `ingest-usgs-streamflow-obs`. In the upper right corner select Run > Custom Run. Change `num_lookback_days` to 10, toggle `start_spark_cluster` off, and Submit. Monitor the run through the browser UI. When done, proceed to the next one.
- Click on `ingest-nwm-medium-range-streamflow-forecasts`. In the upper right corner select Run > Custom Run. Change `end_dt` to a date approximately 9 days prior to today, toggle `start_spark_cluster` off, and Submit. Monitor the run through the browser UI. When done, proceed to the next one.
- Click on `update-joined-forecast-table`. In the upper right corner select Run > Custom Run. Toggle `start_spark_cluster` off, and Submit.
- Click on `update-forecast-metrics-table`. In the upper right corner select Run > Custom Run. Toggle `start_spark_cluster` off, and Submit.
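For the `end_dt` value in the forecast ingest step above, a date roughly 9 days back can be computed in the shell rather than by hand. This is a minimal sketch: the `days_ago` helper is hypothetical, and the `YYYY-MM-DD` output format is an assumption, so adjust it to whatever the Custom Run form actually expects.

```shell
# Print the UTC date N days before today as YYYY-MM-DD.
# The output format is an assumption; change it if end_dt expects another one.
days_ago() {
  n=$1
  if date --version >/dev/null 2>&1; then
    # GNU coreutils date (most Linux systems)
    date -u -d "$n days ago" +%Y-%m-%d
  else
    # BSD date (macOS)
    date -u -v-"${n}"d +%Y-%m-%d
  fi
}

# Candidate value for end_dt
days_ago 9
```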
Now go to https://dashboards.teehr.local.app.garden. You should be able to go to both the retrospective and forecast dashboards and see some data.
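If the dashboards do not load, a missing `/etc/hosts` entry is one thing to rule out. The sketch below checks a hosts file for the local hostnames used in this guide and prints any that are absent. The `missing_hosts` helper is hypothetical and not part of the repository.

```shell
# Print each hostname that has no uncommented entry in the given hosts file.
missing_hosts() {
  hosts_file=$1
  shift
  for host in "$@"; do
    # Skip comment lines; compare every field after the IP address exactly.
    if ! awk -v h="$host" '
      !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
      END { exit !found }
    ' "$hosts_file"; then
      echo "$host"
    fi
  done
}

# Hostnames taken from the /etc/hosts entries listed earlier in this guide.
missing_hosts /etc/hosts \
  hub.teehr.local.app.garden \
  minio.teehr.local.app.garden \
  dashboards.teehr.local.app.garden \
  api.teehr.local.app.garden \
  panel.teehr.local.app.garden
```

No output means every hostname has an entry. The browser still needs the self-signed certificate accepted separately.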
Now the fun of adding new features and bug fixes starts.
When working on the API or the frontend it is convenient to have code syncing. Code syncing can be done in garden by running:
garden deploy --sync
The JupyterHub deployment supports two types of image configurations:
- Built-in Images (e.g., TEEHR Evaluation System) are automatically built and managed by Garden using build actions.
- External Images (e.g., HEFS-FEWS Evaluation System) are built outside this project and must be manually loaded into the Kind cluster for local testing.
When using external images in JupyterHub profiles, load them into the Kind cluster:
# Build your image locally (or pull from a registry)
docker build -t hefs-hub:0.3.0 /path/to/external-project
# Load the image into the Kind cluster
kind load docker-image hefs-hub:0.3.0 --name kind
# Verify the image is loaded
docker exec -it kind-control-plane crictl images | grep hefs-hub
Key Configuration Points for such images:
- Set `image_pull_policy: IfNotPresent` (or `Never`) in `kubespawner_override` to prevent Kubernetes from attempting to pull from Docker Hub
- Use static image names (e.g., `"hefs-hub:0.3.0"`) without registry prefix
- External images must include the `jupyterhub` package in their Python environment to provide the `jupyterhub-singleuser` command
When you rebuild an external image, reload it into Kind and restart any running pods using that image.
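The rebuild, reload, and restart cycle can be wrapped in one helper. This is a sketch under stated assumptions, not the project's own tooling: the `run` and `reload_external_image` names are hypothetical, the namespace `teehr-hub` is taken from the port-forward command earlier in this guide, and the pod label `component=singleuser-server` is an assumption about how JupyterHub user pods are labeled in this deployment. Note that it deletes all single-user pods, not only those using the rebuilt image.

```shell
# Echo the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Rebuild an external image, reload it into Kind, and remove user pods
# so the next server start picks up the reloaded image.
reload_external_image() {
  image=$1        # e.g. hefs-hub:0.3.0
  context_dir=$2  # Docker build context of the external project
  run docker build -t "$image" "$context_dir" &&
    run kind load docker-image "$image" --name kind &&
    # ASSUMPTION: user pods carry the label component=singleuser-server.
    run kubectl delete pod -n teehr-hub -l component=singleuser-server
}

# Usage: preview with DRY_RUN=1 first, then run for real.
#   DRY_RUN=1 reload_external_image hefs-hub:0.3.0 /path/to/external-project
```

Running with `DRY_RUN=1` prints the three commands without executing them, which is a cheap way to confirm the label selector matches your pods before anything is deleted.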
This section will walk you through standing up `teehr-hub` in an AWS account. The instructions should generally work with other providers, but some steps will certainly be different.
NOTE: You must completely clean up any resources from other approaches before running this.
Log in to AWS. You need to log in as a user that has sufficient permissions. We used Admin for testing; the minimum set of permissions needed has not yet been determined.
aws configure
Or set a profile
export AWS_PROFILE=ciroh_mdenno
Plan Terraform (can take ~15 mins)
cd terraform
terraform init
terraform plan -var-file=teehr-hub.tfvars
cd ..
Apply Terraform (can take ~15 mins)
cd terraform
terraform apply -var-file=teehr-hub.tfvars
cd ..
Connect to cluster
aws eks update-kubeconfig --name teehr-hub --region us-east-2 --role-arn arn:aws:iam::935462133478:role/teehr-hub-teehr-hub-admin
kubectl config set-context $(kubectl config current-context) --namespace teehr-hub
k9s
After the cluster is set up in an AWS account, you can deploy using GitHub Actions.