Describe the feature
By default, OpenCHAMI rotates certificates daily. Some services are restarted automatically, but not all. For example, after a certificate rotation, I see the following in the coresmd-coredns logs:
May 22 08:29:03 admin coresmd-coredns[869438]: time="2026-05-22T07:29:03Z" level=info msg="initiating cache refresh" prefix="plugins/coresmd"
May 22 08:29:03 admin coresmd-coredns[869438]: time="2026-05-22T07:29:03Z" level=error msg="failed to refresh cache: failed to fetch EthernetInterfaces from SMD: failed to execute HTTP request: Get "https://admin.cluster.hpcnexuslab.ie:8443/hsm/v2/Inventory/EthernetInterfaces": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2026-05-22T07:28:33Z is after 2026-05-22T07:28:21Z" prefix="plugins/coresmd"
I am running podman containers.
Ideally, when certificates rotate, all the services which serve certificates would be restarted to use the new certificates.
(ref: https://openchami.slack.com/archives/C066RMDS708/p1780222681743229?thread_ts=1779814910.245869&cid=C066RMDS708)
Why do you want this feature?
I am currently having to manually restart OpenCHAMI daily.
Alternatives you've considered
My current workaround is a cron job to perform the restart. However, this is sub-optimal in that:
- Certificates rotate daily. A daily cron job would leave a potential window in which the certificates have expired but the restart job has not yet fired. If the two schedules are misaligned, this window could be quite long.
- A cron job could perform unnecessary restarts if a user adjusts the certificate validity period to be longer.
- Not all services need a restart, and restarting those that do not could impair ongoing system operations.
Therefore, I think a better solution is to adjust the systemd unit files so that, after the certificate job fires, only the containers that require a restart are restarted.
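As a rough sketch of what I mean, and not a tested change: a drop-in on the unit that renews the certificates could restart the affected services once the renewal succeeds. The unit name `cert-rotate.service` is hypothetical (I have not checked what the certificate job is actually called), and I am assuming that it is a `Type=oneshot` service and that the container from the logs above runs as `coresmd-coredns.service`.

```ini
# /etc/systemd/system/cert-rotate.service.d/restart-consumers.conf
# NOTE: "cert-rotate.service" is a hypothetical name for the certificate
# renewal unit; substitute the real unit name.
[Service]
# For a Type=oneshot unit, ExecStartPost= runs only after the ExecStart=
# command has exited successfully, i.e. after the new certificate is in place.
# try-restart restarts a unit only if it is currently running.
# Assumption: the coresmd-coredns container is managed as coresmd-coredns.service.
ExecStartPost=/usr/bin/systemctl try-restart coresmd-coredns.service
```

After adding the drop-in, `systemctl daemon-reload` would be needed for it to take effect. This ties the restart to the rotation itself rather than to a fixed time, and it only touches the services listed, which addresses the three cron job problems above.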
Additional context
No response