c-fteixeira
Contributor

Even though we could track or parse the logs, failing the container seems more appropriate when it is unable to do its main task. Let me know if this is too drastic.

@c-fteixeira c-fteixeira requested a review from gflarity April 5, 2025 19:18
@gflarity
Contributor

gflarity commented Apr 7, 2025

Now that we're using spiffe-jwt instead, there were still issues? It does seem a bit drastic, but if something is happening regularly this might be needed.

I believe this will only restart the container, though, and not the whole pod, so if there's a bug or issue with spiffe-jwt, panicking like this might not help at all.

@c-fteixeira
Contributor Author

there were still issues?

I would not call it "regularly", but yes, I've noticed a few instances.
The restarts would only make it easier to detect, even if they don't auto-fix it for now.

@gflarity
Contributor

gflarity commented Apr 7, 2025

I guess we have no idea why it's happening?

@gflarity
Contributor

gflarity commented Apr 7, 2025

Anyway, this is fine for now. I wonder if just having metrics around failures would improve the alerting and make it easier to debug while it's happening. I've created a ticket for integrating metrics.

@c-fteixeira c-fteixeira merged commit d8ec153 into main Apr 7, 2025
5 checks passed