Skip to content

fix: retry discovery when OIDC endpoint is unreachable at setup#76

Open
matiasjrossi wants to merge 1 commit into
cavefire:mainfrom
matiasjrossi:fix/discovery-retry-on-setup
Open

fix: retry discovery when OIDC endpoint is unreachable at setup#76
matiasjrossi wants to merge 1 commit into
cavefire:mainfrom
matiasjrossi:fix/discovery-retry-on-setup

Conversation

@matiasjrossi

@matiasjrossi matiasjrossi commented May 25, 2026

Copy link
Copy Markdown

Summary

Fixes a startup-time race where the integration permanently fails to load if the OIDC discovery endpoint (configure_url) is briefly unreachable when Home Assistant boots and cached endpoints are not present. At that point, only a manual HA restart recovers. Common on reboots where an HA container shares a host with its IdP and IdP container comes up just after HA. Existing entries with a healthy previous cache are not affected by the boot race because discovery isn't attempted.

Same user-facing symptom as #62 ("HA starts before the IdP is ready, login broken until manual restart"), via a different code path post-#70 refactor — _async_prepare_config raising in the YAML setup path rather than fetch_urls silently swallowing the exception. The shared fix is to give Home Assistant's ConfigEntryNotReady retry loop a chance to run.

Changes

custom_components/openid/__init__.py:

  • async_setup_entry — wrap the _async_prepare_config call and raise ConfigEntryNotReady on failure. HA then retries with its standard exponential backoff instead of marking the entry dead.
  • async_setup (YAML path) — no longer aborts the integration when initial discovery fails. The YAML data is still forwarded to the config entry import flow without the discovered fields, and async_setup_entry then retries discovery via the same ConfigEntryNotReady path. On a subsequent HA start where YAML discovery succeeds, the import flow updates the entry with the full data as before.

Symptom before vs after

Without the patch, this morning's log on my setup:

ERROR (MainThread) [custom_components.openid] Failed to prepare YAML OpenID configuration: Configuration endpoint returned 404
ERROR (MainThread) [homeassistant.setup] Setup failed for custom integration 'openid': Integration failed to initialize.

With the patch, this becomes a WARNING and the entry enters setup_retry until the IdP responds.

Manual verification

Reproduced on HA 2026.4.4 + Authentik:

  1. Stopped HA. In .storage/core.config_entries, stripped authorize_url, token_url, user_info_url, logout_url, use_pkce from the openid entry to force discovery.
  2. Pointed configure_url at a path Authentik returns 404.
  3. Started HA.
Result
Without patch state=not_loaded. Integration dead until manual HA restart.
With patch state=setup_retry
reason="Failed to prepare OpenID configuration: Configuration endpoint returned 404". HA's exponential-backoff retry loop engaged.

Scope notes

  • Behavior on a configuration error (bad client_id/client_secret, malformed YAML, etc.) is unchanged — those still propagate from the import flow / entry setup as before.
  • Behavior when discovery succeeds is unchanged — happy path takes the same branch it always did.
  • The Exception catches are intentionally broad to match the existing style in async_setup and because _async_prepare_config can raise anything aiohttp throws (ClientError, asyncio.TimeoutError, custom RuntimeError from the non-200 status check) — all of which deserve the same retry treatment.

Closes #62

Disclosure: Change was authored using Claude Code, but I understand what it does, directed the tests and the implementation.

When Home Assistant starts before the OIDC provider (or DNS) is fully
available, _async_prepare_config raises while fetching the discovery
document, which causes the integration to fail setup with no automatic
retry — only a Home Assistant restart recovers it. This is a common race
on reboots when HA is on the same host as the IdP (Authentik, Authelia,
Pocket-ID, etc.).

Handle the transient failure in both setup paths:

- async_setup_entry now wraps the discovery call and raises
  ConfigEntryNotReady on failure, so Home Assistant retries with its
  standard exponential backoff instead of marking the entry as dead.
- async_setup (YAML path) no longer aborts the integration when initial
  discovery fails. The YAML data is still forwarded to the config entry
  import flow without the discovered fields, and async_setup_entry then
  retries discovery via the same ConfigEntryNotReady path.

Refs cavefire#62

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@matiasjrossi matiasjrossi marked this pull request as ready for review May 25, 2026 02:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] fetch_urls silently fails on startup leaving authorize_url unpopulated, causing KeyError on login

1 participant