
@YasserB94
Contributor

Why?

Whilst creating a WordPress plugin to send exceptions to Flare, I noticed that reports triggered from the wp-cli would silently fail.

In 1.0 I had a middleware that would just remove the whole stack trace to resolve this. I spoke to @rubenvanassche about this in a mail-thread a couple weeks ago.

After some digging, I noticed that an error would be thrown if I added JSON_THROW_ON_ERROR to the ReportTrimmer here:

```php
public function needsToBeTrimmed(array $payload): bool
{
    return strlen((string)json_encode($payload)) > self::getMaxPayloadSize();
}
```

It would throw `JsonException: Malformed UTF-8 characters, possibly incorrectly encoded`.
Further down the road I noticed that even if I paved the way for the report to reach the API, either curl would fail or the API would reject it.

TL;DR: Non-UTF-8 bytes from binaries end up in the stack trace and silently make the report fail.
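
A minimal sketch of the silent failure described above, using plain `json_encode` rather than Flare itself. The payload keys are hypothetical, and `"\xB1\x31"` merely stands in for binary data that ended up in a stack frame.

```php
<?php

// "\xB1" is a lone continuation byte, so this string is not valid UTF-8.
// The array shape is illustrative only, not Flare's real report format.
$payload = ['stacktrace' => [['code_snippet' => "binary: \xB1\x31"]]];

// Without flags, json_encode() returns false instead of raising an error.
// Casting that to string gives '', so a size check sees 0 bytes and passes.
$encoded = json_encode($payload);
$length = strlen((string) $encoded);

// With JSON_THROW_ON_ERROR the same failure surfaces as a JsonException.
$message = null;
try {
    json_encode($payload, JSON_THROW_ON_ERROR);
} catch (JsonException $e) {
    $message = $e->getMessage();
}

var_dump($encoded, $length, $message);
```

This is why the size check in `needsToBeTrimmed` never complains: the failed encode looks like an empty, well-sized payload.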

Changes in this PR

Disclaimer: I tried to do this right, although I figured I'd rather create a PR and iterate on feedback than spend years attempting to wrap some code with a golden bow.

I added a Support class that checks whether an array can be json_encoded. If it can't, it recurses through the whole payload and replaces every entry that cannot be json_encoded.

For the replacement, I opted to include the error message, the number of bytes omitted, and a prefix so it's clear that the value was replaced internally by the client.

Additional Ideas/caveats

  • Add a config option to enable/disable this behaviour
  • Add an option for an early exit: check after each replacement whether the whole $payload can now be encoded, and stop recursing as soon as it can
  • There is a lot going on in Flare (Thank you Spatie!), I feel like there is a better place to put this code. If anyone shares the same itch, I would love to learn
  • I think currently a resource would be replaced by null in json_encode? Though now it would be replaced by a NON ENCODABLE message.
  • Not sure if I should add a test about this behaviour in ReportTest as well? (Though it may be some repetition asserting that Malformed UTF is replaced.)
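
On the resource caveat in the list above, a small sketch of how plain `json_encode` treats a resource, as far as I understand the standard behaviour: by default the whole encode fails, and `null` is only substituted when `JSON_PARTIAL_OUTPUT_ON_ERROR` is passed. The `handle` key is made up for the example.

```php
<?php

$resource = fopen('php://memory', 'r');

// Default flags: the unsupported type makes the whole call fail.
$default = json_encode(['handle' => $resource]);
$error = json_last_error_msg();

// JSON_PARTIAL_OUTPUT_ON_ERROR substitutes null for the unsupported value.
$partial = json_encode(['handle' => $resource], JSON_PARTIAL_OUTPUT_ON_ERROR);

fclose($resource);

var_dump($default, $error, $partial);
```

So whether a resource currently ends up as `null` depends on which flags the existing `json_encode` calls use.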

Art

After adding these changes, I was able to successfully see my exception/report in flare.
[Screenshot: stacktrace_wp-cli]

@YasserB94 YasserB94 changed the title from "Ensure reports are json_encodable." to "Ensure reports are json_encodable. (Fix silently failing reports with Malformed UTF8 in stacktrace)" Aug 11, 2025
@YasserB94
Contributor Author

Just as a follow-up:

Might it be better to pass the following flag to the existing json_encode calls instead of the approach above?
JSON_INVALID_UTF8_SUBSTITUTE
Happy to look into this.
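
For reference, a minimal sketch of what that flag does on its own (available since PHP 7.2). The `line` key and the sample string are made up for illustration.

```php
<?php

// "\xB1" is an invalid UTF-8 byte in the middle of otherwise valid text.
$value = "ok \xB1 end";

// Each invalid byte sequence is replaced by U+FFFD, which json_encode()
// emits as the escape sequence \ufffd unless JSON_UNESCAPED_UNICODE is set.
$substituted = json_encode(['line' => $value], JSON_INVALID_UTF8_SUBSTITUTE);

// The result is valid JSON and decodes without errors.
$decoded = json_decode($substituted, true, 512, JSON_THROW_ON_ERROR);

var_dump($substituted, $decoded);
```

The trade-off compared to the recursive replacement is that the flag only covers malformed UTF-8, and the report no longer says how many bytes were dropped or why.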

@YasserB94
Contributor Author

I noticed the following issue in the phpstan workflow (71cae65) output:

Error: Method Spatie\FlareClient\Report::toArray() should return array<string, mixed> but returns array<int|string, mixed>.
Error: Process completed with exit code 1.

This is fixed in 3660995, though it brought up a typing caveat in the following method:

```php
/**
 * @template K of array-key
 *
 * @param  array<K,mixed>  $input
 * @return array<K,mixed>
 */
protected static function replaceNonEncodableEntries(array $input): array
{
    foreach ($input as $key => $value) {
        try {
            json_encode($input[$key], JSON_THROW_ON_ERROR);
        } catch (JsonException $e) {
            if (is_string($value)) {
                $input[$key] = sprintf('[%s: CORRUPT DATA]: %d bytes | ERR: %s', self::REPLACED_ENTRY_PREFIX, strlen($value), $e->getMessage());
            } elseif (is_array($value)) {
                $input[$key] = self::replaceNonEncodableEntries($value);
            } else {
                $input[$key] = sprintf('[%s: NON ENCODABLE]: type %s', self::REPLACED_ENTRY_PREFIX, gettype($value));
            }
        }
    }

    return $input;
}
```
If a non-encodable value is anything other than a string or an array, it is also replaced by a string. I'm not sure whether this raises an issue. Should the `else` branch of the replacement instead keep the type of the value intact?
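
To make the typing caveat concrete, here is a self-contained sketch that wraps the same logic in a hypothetical `SanitizerSketch` class. The class name, the value of `REPLACED_ENTRY_PREFIX`, and the payload keys are assumptions for the example, not taken from the PR.

```php
<?php

final class SanitizerSketch
{
    // Hypothetical value; the real constant is not shown in this thread.
    private const REPLACED_ENTRY_PREFIX = 'FLARE CLIENT';

    public static function replaceNonEncodableEntries(array $input): array
    {
        foreach ($input as $key => $value) {
            try {
                json_encode($value, JSON_THROW_ON_ERROR);
            } catch (JsonException $e) {
                if (is_string($value)) {
                    $input[$key] = sprintf('[%s: CORRUPT DATA]: %d bytes | ERR: %s', self::REPLACED_ENTRY_PREFIX, strlen($value), $e->getMessage());
                } elseif (is_array($value)) {
                    $input[$key] = self::replaceNonEncodableEntries($value);
                } else {
                    $input[$key] = sprintf('[%s: NON ENCODABLE]: type %s', self::REPLACED_ENTRY_PREFIX, gettype($value));
                }
            }
        }

        return $input;
    }
}

$handle = fopen('php://memory', 'r');

$result = SanitizerSketch::replaceNonEncodableEntries([
    'line' => 42,                                      // encodable, left untouched
    'snippet' => "\xB1",                               // malformed string
    'nested' => ['ok' => 'fine', 'bad' => "\xB1\xB1"], // only 'bad' is replaced
    'handle' => $handle,                               // resource becomes a string
]);

fclose($handle);

print_r($result);
```

Encodable values keep their type and arrays stay arrays, but the resource turns into a descriptive string, which is the type change the question above is about.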

@rubenvanassche
Member

Hi @YasserB94,

Sorry for the late response to your PR, we've been quite busy with the whole performance monitoring launch. While I like the idea, you're the first one to have this problem, and I don't think it is that useful to spend computer cycles on making a JSON string from an array only to throw it away again.

So what I propose is that we keep the sanitizer as is and add it to the CurlSender & GuzzleSender, with an option to enable sanitizing the payload that is disabled by default.

That way we transform to JSON when we need it and immediately also do it for traces.

Last question: does it happen a lot in WordPress that you've got malformed UTF-8? I haven't had such an issue with Laravel in the last 7 years or so.

Kind regards,

Ruben

@YasserB94
Contributor Author

Hey @rubenvanassche,

No worries! I've been loving all the posts and updates from Spatie the past weeks! 😄

I like that you bring up the compute cycles spent on encoding the JSON and walking through the array. It had been itching at the back of my mind as well.

Thanks for pointing out a better spot for this code as well, I wasn't sure where to put it all. I figured it was better not to get paralysed and just await a review.

For your last question: nope, it doesn't happen a lot. Exceptions and errors are handled great by Flare.
I wrote a simple wp flare:test command, which uses the wp-cli to send a Flare exception. The CLI itself includes binary data, which ends up in the raw stack trace, hence my discovery of this.
I could for sure just enable the option there, so that's a great suggestion as well.

I have been unable to discover why raw binary data is added to the PHP files. It's somewhere near the bottom of my list of rabbit holes to dive into in this wonderful digital world.

I'll turn this PR into a draft and do my best at adding the changes above 😄

@YasserB94 YasserB94 marked this pull request as draft November 18, 2025 09:32
@YasserB94 YasserB94 marked this pull request as ready for review November 19, 2025 08:20
@YasserB94 YasserB94 force-pushed the fix/prevent-json-encoding-crashes branch from abb4431 to 10c51c1 Compare November 19, 2025 08:23
- Added shared sanitization behaviour into an AbstractSender class
  - Injected the PayloadSanitizer here
  - Optionally call it in the prepare payload method when configured
  - Used the existing config array to take in the option
  - added the logic from ReportSanitizer into JsonEncodableSanitizer
- Bound the sanitizer in the container
@YasserB94 YasserB94 force-pushed the fix/prevent-json-encoding-crashes branch from 10c51c1 to f15086a Compare November 19, 2025 08:28
@YasserB94
Contributor Author

YasserB94 commented Nov 19, 2025

@rubenvanassche

I hope I understood you correctly and applied the changes in f15086a

Changes made

  • I have renamed the ReportSanitizer to PayloadSanitizer
    • This is a contract injected into all Senders, by default it uses the JsonEncodableSanitizer
  • As sanitising the payload is shared between senders I have added an AbstractSender class with a preparePayloadForEncoding method, alongside a $shouldSanitizePayloads property.
    • the property is set using the existing $config array with key sanitize_malformed_data
    • The prepare method will only use the sanitiser when set to true.
    • I updated existing senders to extend this AbstractSender base class.
  • I have updated the container bindings as well to reflect these changes
  • Tests have been updated to test the JsonEncodableSanitizer
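
A rough sketch of how the pieces in the list above could fit together. The class, property, method, and config key names come from this comment, but every signature, the `sanitize()` method name, the simplified sanitizer body, and the `EchoSender` stand-in are assumptions; the real code in f15086a may well differ. It needs PHP 8.0+ for constructor promotion and union types.

```php
<?php

interface PayloadSanitizer
{
    // Method name is an assumption; the contract is not shown in the thread.
    public function sanitize(array $payload): array;
}

final class JsonEncodableSanitizer implements PayloadSanitizer
{
    public function sanitize(array $payload): array
    {
        foreach ($payload as $key => $value) {
            try {
                json_encode($value, JSON_THROW_ON_ERROR);
            } catch (JsonException $e) {
                // Simplified placeholder instead of the PR's detailed message.
                $payload[$key] = is_array($value) ? $this->sanitize($value) : '[REPLACED]';
            }
        }

        return $payload;
    }
}

abstract class AbstractSender
{
    protected bool $shouldSanitizePayloads;

    public function __construct(protected PayloadSanitizer $sanitizer, array $config = [])
    {
        // Opt-in: sanitizing stays off unless the config key is set.
        $this->shouldSanitizePayloads = (bool) ($config['sanitize_malformed_data'] ?? false);
    }

    protected function preparePayloadForEncoding(array $payload): array
    {
        return $this->shouldSanitizePayloads
            ? $this->sanitizer->sanitize($payload)
            : $payload;
    }
}

// Hypothetical stand-in for CurlSender/GuzzleSender: encodes instead of sending.
final class EchoSender extends AbstractSender
{
    public function post(array $payload): string|false
    {
        return json_encode($this->preparePayloadForEncoding($payload));
    }
}

$payload = ['message' => "\xB1"];

$disabled = (new EchoSender(new JsonEncodableSanitizer()))->post($payload);
$enabled = (new EchoSender(new JsonEncodableSanitizer(), ['sanitize_malformed_data' => true]))->post($payload);

var_dump($disabled, $enabled);
```

With the option off, the sender behaves as before and no extra encode pass is spent on payloads that are already fine.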

Thoughts

  • I wanted to assert that the preparePayloadForEncoding method gets called in both the CurlSender's and the GuzzleSender's post method, but I feel like more changes are required to be able to do this (extracting curl methods, injecting the Guzzle client, ...) and I am not sure whether these are suited for this PR.
  • Please do comment on the naming of things should they be unclear to you, I tried not to break my head over these too much 😛
  • I first opted to use a constructor argument instead of the config array to configure this option, though it felt like I was modifying too much existing code.
  • I refrained from adding a sanitizeMalformedPayloads() option to the config, as I feel like this could be a request for a different PR (or may not be wanted at all).
