chore(policy): polish logging #13379
a02eb15 to
a9d05d1
Compare
```diff
-} => match original_dst {
-    Some(original_dst) => outbound::Backend {
-        metadata: Some(Metadata {
-            kind: Some(metadata::Kind::Resource(api::meta::Resource {
-                group: "policy.linkerd.io".to_string(),
-                kind: "EgressNetwork".to_string(),
-                name,
-                namespace,
-                section: Default::default(),
-                port: u16::from(policy.port).into(),
-            })),
-        }),
+} => {
+    debug_assert!(
+        original_dst.is_some(),
+        "Must not serve EgressNetwork for named lookups; IP:PORT required"
+    );
+    let metadata = Some(Metadata {
+        kind: Some(metadata::Kind::Resource(api::meta::Resource {
+            group: "policy.linkerd.io".to_string(),
+            kind: "EgressNetwork".to_string(),
+            name,
+            namespace,
+            section: Default::default(),
+            port: u16::from(policy.port).into(),
+        })),
+    });
+
+    let Some(addr) = original_dst else {
+        tracing::error!(
+            ?metadata,
+            "Unexpected state: EgressNetworks should only be returned when lookup is by IP:PORT; synthesizing invalid backend"
+        );
+        return outbound::Backend {
+            metadata,
+            queue: None,
+            kind: None,
+        };
+    };
+
+    outbound::Backend {
+        metadata,
         queue: Some(default_queue_config()),
         kind: Some(outbound::backend::Kind::Forward(
             destination::WeightedAddr {
-                addr: Some(original_dst.into()),
+                addr: Some(addr.into()),
                 weight: 1,
                 ..Default::default()
             },
         )),
-    },
-    None => {
-        tracing::error!("no original_dst for Egresspolicy");
-        outbound::Backend {
-            metadata: Some(Metadata {
-                kind: Some(metadata::Kind::Default("invalid".to_string())),
-            }),
-            queue: None,
-            kind: None,
-        }
-    }
-},
```
@zaharidichev I tried to get clearer about this state, but I want to confirm that I understand it properly: this is really exceptional, it should not be possible for a named address to resolve to an EgressNetwork, so if we hit this, there's a bug.
To that end, I've made a debug_assert! so this will blow up in tests, etc. I've preserved the initial metadata for clarity and clarified the error message.
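For readers unfamiliar with the pattern, here is a minimal standalone sketch of the "assert in debug builds, degrade gracefully in release builds" approach described above. The `Backend` struct, the `egress_backend` function, and the `eprintln!` call (standing in for `tracing::error!`) are hypothetical simplifications, not the controller's actual types.

```rust
use std::net::SocketAddr;

/// Hypothetical, simplified stand-in for the generated `outbound::Backend` type.
#[derive(Debug, PartialEq)]
struct Backend {
    metadata: String,
    forward_to: Option<SocketAddr>,
}

/// Builds a backend for an EgressNetwork. A missing `original_dst` is treated
/// as a bug: it panics in debug builds (including tests) and yields an invalid
/// backend that still carries its metadata in release builds.
fn egress_backend(name: &str, original_dst: Option<SocketAddr>) -> Backend {
    debug_assert!(
        original_dst.is_some(),
        "Must not serve EgressNetwork for named lookups; IP:PORT required"
    );
    let metadata = format!("EgressNetwork/{name}");

    let Some(addr) = original_dst else {
        // Stand-in for `tracing::error!`; only reachable when debug
        // assertions are compiled out.
        eprintln!("ERROR Unexpected state; synthesizing invalid backend for {metadata}");
        return Backend {
            metadata,
            forward_to: None,
        };
    };

    Backend {
        metadata,
        forward_to: Some(addr),
    }
}
```

Calling `egress_backend("all-egress", None)` would panic under `cargo test`, which is the point: the bug surfaces loudly during development, while production traffic gets a logged error and an identifiable invalid backend.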
96d6e52 to
cbf6550
Compare
Reconciliation logs have been improved so that we log the number of resources being inspected, as well as the total patches in a reconciliation. Tracing contexts are set so we know which resource is being updated.

The awkward "Lease non-holder skipping controller update" DEBUG messages have been consolidated in a helper as

```rust
fn reconcile_if_leader(&self) {
    let lease = self.claims.borrow();
    if !lease.is_current_for(&self.name) {
        tracing::trace!(%lease.holder, "Reconcilation skipped");
        return;
    }
    self.reconcile();
}
```
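To make the helper's behavior easy to try in isolation, the following is a self-contained sketch under stated assumptions: `Lease`, `Controller`, and the `reconciliations` counter are hypothetical stand-ins, a `RefCell` replaces whatever `self.claims` really is, and `eprintln!` replaces `tracing::trace!`. Unlike the original helper, this version returns a `bool` so the outcome can be observed.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical lease record; the real claim type likely carries more fields.
struct Lease {
    holder: String,
}

impl Lease {
    /// True when `name` is the current lease holder.
    fn is_current_for(&self, name: &str) -> bool {
        self.holder == name
    }
}

/// Hypothetical controller with just enough state to show the gate.
struct Controller {
    name: String,
    claims: RefCell<Lease>,
    reconciliations: Cell<u32>,
}

impl Controller {
    /// Stand-in for the real reconciliation; it only counts invocations.
    fn reconcile(&self) {
        self.reconciliations.set(self.reconciliations.get() + 1);
    }

    /// Reconciles only when this controller holds the lease.
    /// Returns whether a reconciliation actually ran.
    fn reconcile_if_leader(&self) -> bool {
        let lease = self.claims.borrow();
        if !lease.is_current_for(&self.name) {
            // Housekeeping detail, so it is emitted at the lowest level.
            eprintln!("TRACE Reconciliation skipped lease.holder={}", lease.holder);
            return false;
        }
        self.reconcile();
        true
    }
}
```

Keeping the leadership check and the skip message in one helper means every call site logs the same thing at the same level.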
bd88260 to
5e3ff63
Compare
cratelyn
left a comment
this looks good to me. thank you for doing this! ✔️
note that two ci jobs appear to be failing, my understanding is that these are known to be flaky.
5e3ff63 to
712c9bd
Compare
alpeb
left a comment
Nice! 👍
There are a few things about the policy controller logging that can be cleaned up for consistency and clarity:

* We frequently log ERROR messages when processing resources with unexpected values. These messages are more appropriately emitted at WARN--we want to surface these situations, but they are not really exceptional.
* The leadership status of the status controller is not logged at INFO level, so it's not possible to know about status changes without DEBUG logging.
* We generally use Sentence-cased log messages when emitting user-facing messages. There are a few situations where we are not consistent.
* The status controller reconciliation logging is somewhat noisy and misleading.
* The status controller does not log any messages when patching resources.

```
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder has changed
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder has changed
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder has changed
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
```

The "Lease holder has changed" message actually indicates that the _lease_ has changed, though the holder may be unchanged.

To improve logging clarity, this change does the following:

* Adds an INFO level log when the leadership status of the controller changes.
* Adds an INFO level log when the status controller patches resources.
* Adds DEBUG level logs when the status controller patches resources.
* Reconciliation housekeeping logging is moved to TRACE level.
* Consistently uses sentence capitalization in user-facing log messages
* Reduces ERROR messages to WARN when handling invalid user-provided data (including cluster resources). This ensures that ERRORs are reserved for exceptional policy controller states.
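One way to picture the leadership logging described above is a small state tracker that reports INFO only when this controller's leadership actually flips, and TRACE for lease updates that leave the holder unchanged. This is a sketch, not the controller's implementation: the `Leadership` type, the `Level` enum, and the message strings are all hypothetical.

```rust
/// Hypothetical log levels, standing in for `tracing` levels.
#[derive(Debug, PartialEq)]
enum Level {
    Info,
    Trace,
}

/// Hypothetical tracker that remembers whether this controller currently leads.
struct Leadership {
    name: String,
    is_leader: bool,
}

impl Leadership {
    fn new(name: &str) -> Self {
        Self {
            name: name.to_string(),
            is_leader: false,
        }
    }

    /// Observes a lease update and returns the level and message to log.
    /// Only a real change in this controller's leadership is reported at
    /// INFO; a renewed lease with the same holder is housekeeping.
    fn observe(&mut self, holder: &str) -> (Level, &'static str) {
        let now_leader = holder == self.name;
        let was_leader = std::mem::replace(&mut self.is_leader, now_leader);
        match (was_leader, now_leader) {
            (false, true) => (Level::Info, "Became leader"),
            (true, false) => (Level::Info, "Lost leadership"),
            _ => (Level::Trace, "Lease updated"),
        }
    }
}
```

With this shape, repeated lease renewals by the same holder never reach INFO or DEBUG output, which addresses the noisy sequence shown in the log excerpt.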
712c9bd to
9bd5387
Compare