Allow sending logs to stdout by using STDOUT_LOG env var #334


Merged
merged 2 commits into postgresml:main on Feb 28, 2023

Conversation


@magec magec commented Feb 28, 2023

Hi!

We are currently running PgCat in production 💃 (for a few hours now at least), thanks for the good work!

Two things have come up during the deployment:

  • Logs are all written to stderr: We expect error logging to go to stderr and warnings/info/etc. to go to stdout. We monitor the output of every piece of software we run and have alarms set up to detect an increase in error logging. The current logging implementation does not allow fine-grained control over this.

  • Stats filled up the buffer: There is a buffer for sending stats in the code, currently set to 100_000. When we started using this, during traffic peaks the buffer filled up, left the system in an inconsistent state, and produced a flood of messages like
    [2023-02-23T12:09:59.609264Z WARN pgcat::stats] Got event ServerActive { client_id: 1612304972, server_id: 1350173897 } for unregistered server Started. Our current deployment reaches 100k clients connected to 4 different instances of pgcat, so at high traffic that buffer can fill up.

We propose a solution here where the logger can be configured to also log to stdout (maintaining backward compatibility), and where the stats buffer is increased (5x, from 100_000 to 500_000).

Logging mechanism

The idea was to use a special kind of logger that allows sending logs to different targets depending on the log level.

By default, if nothing is set, it acts as a regular env_logger and sends everything to standard error.

If the STDOUT_LOG environment variable is defined, it is used to configure the standard-output logger.

The behavior is:

  • If the message is an error, it is written to standard error.
  • If it is not an error and it matches the log level of the standard-output logger (the STDOUT_LOG env var), it is sent to standard output.
  • Otherwise, it is sent to the stderr logger, which logs it or not depending on the value of the RUST_LOG env var.

To summarize: if no STDOUT_LOG env var is present, the logger behaves like the default one. If STDOUT_LOG is set, everything except errors that matches the log level set in STDOUT_LOG is sent to stdout. You can also have a more esoteric configuration, for example RUST_LOG=debug and STDOUT_LOG=info: errors go to stderr, warns and infos go to stdout, and debugs go to stderr.
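For illustration, here is a minimal sketch of that routing. It is not the exact code in this PR; it assumes the log and env_logger crates and builds two env_logger instances, one targeting stdout (configured from STDOUT_LOG) and one targeting stderr (configured from RUST_LOG). Names like MultiLogger and init_logging are illustrative only.

```rust
use env_logger::{Builder, Logger, Target};
use log::{Level, LevelFilter, Log, Metadata, Record};

/// Hypothetical logger that routes records to stdout or stderr
/// depending on level and the STDOUT_LOG / RUST_LOG env vars.
struct MultiLogger {
    stdout_logger: Option<Logger>, // built from STDOUT_LOG, writes to stdout
    stderr_logger: Logger,         // built from RUST_LOG, writes to stderr
}

impl Log for MultiLogger {
    fn enabled(&self, metadata: &Metadata) -> bool {
        self.stderr_logger.enabled(metadata)
            || self
                .stdout_logger
                .as_ref()
                .map(|l| l.enabled(metadata))
                .unwrap_or(false)
    }

    fn log(&self, record: &Record) {
        // Errors always go to stderr.
        if record.level() == Level::Error {
            self.stderr_logger.log(record);
            return;
        }
        // Non-errors go to stdout if they match the STDOUT_LOG filter...
        if let Some(stdout_logger) = &self.stdout_logger {
            if stdout_logger.enabled(record.metadata()) {
                stdout_logger.log(record);
                return;
            }
        }
        // ...otherwise fall back to the stderr logger (RUST_LOG decides).
        self.stderr_logger.log(record);
    }

    fn flush(&self) {
        self.stderr_logger.flush();
        if let Some(l) = &self.stdout_logger {
            l.flush();
        }
    }
}

fn init_logging() {
    let stderr_logger = Builder::from_env("RUST_LOG").target(Target::Stderr).build();

    // Only build the stdout logger when STDOUT_LOG is set,
    // so the default behavior stays unchanged.
    let stdout_logger = std::env::var("STDOUT_LOG").ok().map(|filter| {
        Builder::new()
            .parse_filters(&filter)
            .target(Target::Stdout)
            .build()
    });

    log::set_boxed_logger(Box::new(MultiLogger {
        stdout_logger,
        stderr_logger,
    }))
    .unwrap();
    // Let the inner loggers do the filtering.
    log::set_max_level(LevelFilter::Trace);
}
```

With RUST_LOG=debug and STDOUT_LOG=info, an info record matches the stdout logger and goes to stdout, while a debug record falls through to the stderr logger and goes to stderr.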

In a previous contribution I was told that 'We are not supporting env vars for configuration yet', but I implemented it this way because this is a special case: it is the logging system. If you think the name of the env var is not the right one, or you have another proposal for the implementation, just let me know. Still, independently of the implementation, I think logging everything to stderr is not flexible enough; that is actually why stderr and stdout exist separately.

If we agree on a merge, I'll write documentation in the README for whatever was agreed upon.

@@ -160,7 +161,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
     let client_server_map: ClientServerMap = Arc::new(Mutex::new(HashMap::new()));

     // Statistics reporting.
-    let (stats_tx, stats_rx) = mpsc::channel(100_000);
+    let (stats_tx, stats_rx) = mpsc::channel(500_000);
Contributor
great find! @levkk do you think we could set up an mpmc channel for this to avoid this bottleneck?

@levkk levkk commented Feb 28, 2023

Not sure how mpmc would help here, since we'd have to dedup the events in the end anyway. The events count things like statistics: how many clients are connected, queries & transactions sent, etc.

The current metrics implementation is a bit lax in its accuracy guarantees. I'm thinking more and more that we should switch to atomics instead and count what we need on each ConnectionPool. Since atomics can be read and written very quickly from many threads, I think it'll work nicely.

This change is good, however, to keep us going. 500_000 should be enough, and the only constraint here is memory. The events themselves should be pretty small, and since this is just a limit to help absorb burst workloads, I think it will continue to work.
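For context, the atomics approach mentioned above could look roughly like this. This is a hedged sketch, not code from pgcat; the type, field, and method names are made up for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical per-pool counters, updated lock-free from many tasks
/// instead of sending events through a bounded channel.
#[derive(Default)]
struct PoolStats {
    queries: AtomicU64,
    transactions: AtomicU64,
    clients_connected: AtomicU64,
}

impl PoolStats {
    fn record_query(&self) {
        self.queries.fetch_add(1, Ordering::Relaxed);
    }

    fn record_transaction(&self) {
        self.transactions.fetch_add(1, Ordering::Relaxed);
    }

    fn client_connected(&self) {
        self.clients_connected.fetch_add(1, Ordering::Relaxed);
    }

    fn client_disconnected(&self) {
        self.clients_connected.fetch_sub(1, Ordering::Relaxed);
    }

    /// A periodic reporter reads the counters directly.
    fn snapshot(&self) -> (u64, u64, u64) {
        (
            self.queries.load(Ordering::Relaxed),
            self.transactions.load(Ordering::Relaxed),
            self.clients_connected.load(Ordering::Relaxed),
        )
    }
}
```

With counters like these, a reporter reads the values on a timer instead of draining a channel, so traffic bursts can no longer overflow a stats buffer.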

@levkk levkk left a comment

This is great!

@levkk levkk merged commit 9241df1 into postgresml:main Feb 28, 2023