Skip to content

feat: improve metadata observability#3463

Draft
Rachit-Gandhi wants to merge 1 commit into
apache:mainfrom
Rachit-Gandhi:feat/metadata-observability-3356
Draft

feat: improve metadata observability#3463
Rachit-Gandhi wants to merge 1 commit into
apache:mainfrom
Rachit-Gandhi:feat/metadata-observability-3356

Conversation

@Rachit-Gandhi

Copy link
Copy Markdown

Summary

  • add contextual wrapping for metadata service-name mapping register/get/remove failures
  • publish and collect metadata mapping register/get/remove metrics from DelegateMetadataReport
  • add MetadataError/MetadataErrorKind so metadata report, RPC, URL-build, and nil-metadata failures are categorizable with context

Verification

  • go test ./metadata/mapping/metadata ./metadata ./metrics/metadata ./registry/servicediscovery -count=1
  • go test ./...

Relates to #3356

@sonarqubecloud

Copy link
Copy Markdown

package servicediscovery

import (
stderrors "errors"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
stderrors "errors"
"errors"

@Alanxtl Alanxtl left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the new metadata.MetadataError currently feels over-wrapped.

MetadataError carries Kind/Source/App/Revision/RegistryID/StorageType, but Error() only returns the wrapped error string. The production paths also mostly log or wrap the error with %v, so those new fields are not visible in logs or final error messages. At the moment, the fields are only consumed by tests via errors.As, which makes this look like type-level categorization without real observability value. Either those fields should be emitted in logs/metrics/error text, or the PR should keep simpler wrapping at the call sites instead of introducing a public error type.

Comment on lines +127 to +129
var ch = make(chan metrics.MetricsEvent, 10)
metrics.Subscribe(constant.MetricsMetadata, ch)
defer close(ch)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metadata/report_instance_test.go:127, :173, :218
The tests call metrics.Subscribe(constant.MetricsMetadata, ch) and then defer close(ch). But metrics.Publish keeps sending to the channel stored in the global listener map. Closing the channel manually does not unsubscribe it, so any later metadata metric publish can panic with send on closed channel. These tests should clean up with metrics.Unsubscribe(constant.MetricsMetadata) or use a safer cancellable subscription pattern.、

@Rachit-Gandhi

Copy link
Copy Markdown
Author

Sure @Alanxtl will work on it, Thanks for the review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Improve observability and error semantics for application-level metadata / 增强应用级 metadata 链路观测与错误语义

2 participants