amorenoz

We all love Wireshark!
Here is a proof of concept of using Wireshark as a UI for Retis.

The code will surely need some refinement, but I'm sending it out early for feedback and experimentation.

In a nutshell, the PR does the following:

  • Uses schemars to build a JSON schema that represents our Event
  • Adds a custom block to the beginning of the pcap file containing the JSON schema
  • Adds a custom option to each packet containing the JSON representation of the event
  • Adds a Wireshark plugin that:
    • Creates two dissectors for the Retis protocol: one for the custom block and one post-dissector (called for every packet).
    • Dissects the JSON schema from the custom block and dynamically registers fields for the "Retis" protocol.
    • Dissects every packet option, enriching each packet with the Retis metadata.
  • Relaxes the requirement of having to provide a probe filter on retis pcap.
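
To make the file layout in the list above more concrete, here is a minimal Python sketch of a pcapng Custom Block and a custom option, following the layout in the pcapng specification. It is not the code from this PR: the little-endian byte order, the helper names and the `EXAMPLE_PEN` placeholder are assumptions made for illustration, and the Private Enterprise Number Retis actually uses is not stated here.

```python
import json
import struct

# Values taken from the pcapng specification.
CUSTOM_BLOCK = 0x00000BAD   # Custom Block, copyable variant
CUSTOM_OPTION_UTF8 = 2988   # custom option carrying a UTF-8 string, copyable

# Placeholder Private Enterprise Number; NOT the value Retis uses.
EXAMPLE_PEN = 0


def _pad32(data):
    """Pad data with NUL bytes up to a 32-bit boundary."""
    return data + b"\x00" * (-len(data) % 4)


def build_custom_block(pen, payload):
    """Build a little-endian Custom Block: type, length, PEN, data, length."""
    body = struct.pack("<I", pen) + _pad32(payload)
    total_len = 12 + len(body)  # block type + both length fields + body
    header = struct.pack("<II", CUSTOM_BLOCK, total_len)
    return header + body + struct.pack("<I", total_len)


def build_custom_option(pen, text):
    """Build a custom UTF-8 option: code, length, PEN, padded data."""
    data = text.encode("utf-8")
    # The option length counts the PEN plus the unpadded data.
    header = struct.pack("<HHI", CUSTOM_OPTION_UTF8, 4 + len(data), pen)
    return header + _pad32(data)


def parse_custom_block(block):
    """Return (block_type, pen, payload) with the NUL padding stripped."""
    block_type, total_len, pen = struct.unpack_from("<III", block, 0)
    payload = block[12:total_len - 4].rstrip(b"\x00")
    return block_type, pen, payload
```

The Custom Block has no explicit payload length, so this sketch relies on the JSON text never ending in a NUL byte and strips the padding when parsing. It also ignores any options that could follow the custom data.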

The result is all the retis metadata inside Wireshark, the user can create columns and filters based on them.
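
Columns and filters need stable field names, so the schema has to be turned into a flat list of fields at some point. The sketch below shows one way this could be done in Python; it is not the plugin's code. The `retis.` prefix, the dotted naming and the sample schema are assumptions, and only local `$ref` pointers and the `anyOf [T, null]` pattern are handled (recursive schemas are not).

```python
def resolve(schema, node):
    """Follow local '#/...' references until a concrete node is reached."""
    while "$ref" in node:
        target = schema
        for part in node["$ref"].lstrip("#/").split("/"):
            target = target[part]
        node = target
    return node


def flatten(schema, node=None, prefix="retis"):
    """Yield (dotted_field_name, json_type) pairs for every leaf property."""
    node = resolve(schema, schema if node is None else node)
    # Optional fields are often written as anyOf [T, null]; pick T.
    for alt in node.get("anyOf", []):
        alt = resolve(schema, alt)
        if alt.get("type") != "null":
            node = alt
            break
    props = node.get("properties")
    if not props:
        yield prefix, node.get("type", "unknown")
        return
    for key, child in props.items():
        yield from flatten(schema, child, f"{prefix}.{key}")


# Hypothetical schema fragment, only for illustration.
example_schema = {
    "type": "object",
    "properties": {
        "skb": {"anyOf": [{"$ref": "#/definitions/Skb"}, {"type": "null"}]},
        "ts": {"type": "integer"},
    },
    "definitions": {
        "Skb": {"type": "object", "properties": {"len": {"type": "integer"}}},
    },
}
fields = dict(flatten(example_schema))
```

With such a mapping, each leaf could be registered as one protocol field, which is what would make an expression on a name like `retis.skb.len` usable as a column or display filter.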

Further experiments:

  • One obvious issue I found is that the TCP analysis engine flags the same packet hitting different probes as a retransmission. I tried changing the interface name from "{dev}-{ns}" to "{probe_type}/{probe_name}". That, plus setting the "deinterlacing conversations" option to "VMI" (which adds VLAN, MAC and interface to the 5-tuple), fixes some of the wrong TCP analysis, but not all of it, because a packet can go through the same probe multiple times. For probes such as netif_receive_skb we could add dev and ns back (in addition to the probe info), but for other probes (NFT, OVS) we would still see the packet multiple times. So maybe we just need a warning somewhere, or we could take the issue upstream to the Wireshark community.
  • The name of each field in the Retis protocol is extracted from the "description" in the JSON schema which, in turn, comes from the doc-comment of each struct. These comments are currently inconsistent, so maybe we should revisit some of them. We also sometimes write multi-line comments (which is perfectly fine), so I used the arguably dirty trick of truncating the string at the first "." or "\n". Any other ideas? We could also use the JSON "key" (which is the current fallback), i.e. the name of the field.
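
The truncation trick from the last bullet can be sketched in a few lines of Python. This only illustrates the rule as described (cut at the first "." or newline, fall back to the JSON key); the function name `field_label` is made up and the plugin's actual implementation may differ.

```python
import re


def field_label(key, description=None):
    """Return a short label: the first sentence or line of the description.

    Falls back to the JSON key when there is no usable description.
    """
    if description:
        first = re.split(r"[.\n]", description.strip(), maxsplit=1)[0].strip()
        if first:
            return first
    return key
```

For example, `field_label("ifindex", "Interface index. Set by the kernel.")` gives `"Interface index"`, while `field_label("dev")` gives `"dev"`. The rule is fragile for descriptions that contain abbreviations or addresses: `"Address, e.g. 10.0.0.1"` is cut down to `"Address, e"`, which is one argument for using the key instead.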

Dependencies:

amorenoz added 11 commits June 5, 2025 15:50
Instead of returning a Boxed EventSection, make the factories insert
sections in the event directly.

Signed-off-by: Adrian Moreno <[email protected]>
Instead of a dynamically allocated hashmap of sections, make it a struct
with optional fields.

This has the following benefits:
- cleaner internal usage (no more downcasts)
- getting rid of another Id (SectionId)
- cleaner python access "event.skb"
- cleaner marshaling, Event now implements {De,Se}rialize
- cleaner event-section generation, no longer needed to internally bind
  an event section to a SectionId

Signed-off-by: Adrian Moreno <[email protected]>
There is no need to treat EventSections as a common trait. For python
handling, we have pyo3::Py and for json-related stuff we have
serde::Serialize.

Since Event is Serialize, Series is as well. So converting them to/from
json can be done directly on the types without going through
serde_json::Value.

Signed-off-by: Adrian Moreno <[email protected]>
Now that Event directly implements serde::Serialize, we can easily make
Series do so as well.

With that, there is no need to go through serde_json::Value as
intermediate step to serialize and deserialize them.

Signed-off-by: Adrian Moreno <[email protected]>
Using schemars, we can generate a JSON schema that describes how the
Event struct gets serialized.

Signed-off-by: Adrian Moreno <[email protected]>
DO NOT MERGE: It still depends on non-upstreamed pcap-file work.

Add a custom block at the beginning of the pcap file containing the
json-schema of the Event and a custom option on each packet with the
serialized Event data.

Signed-off-by: Adrian Moreno <[email protected]>
The retis wireshark plugin reads the json schema from the first
custom block and uses that information to dissect each packet and add
the retis metadata to the protocol tree.

Signed-off-by: Adrian Moreno <[email protected]>
Shamelessly doing so in a way that wireshark events look nice.

Signed-off-by: Adrian Moreno <[email protected]>
Just adding the probe is redundant (already present in interface info).

Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
TEST.
In a naive attempt to make Wireshark not display the same packet
traversing different probes as a retransmit, use the probe as interface.

Signed-off-by: Adrian Moreno <[email protected]>
@atenart

atenart commented Jul 21, 2025

I tested this PR a bit. Retis captured data integrates quite well with the Wireshark workflow: it's not only a way to see packets in Wireshark while having access to extra Retis data, but also a way to use that data (for filtering, for example). It's really nice!

This is an RFC so I'll only make two high level comments:

  • I'm wondering how generic the dissector actually is. From a quick look it does not seem really linked to Retis but is more a generic way to add extra data in a custom block (JSON formatted). Could this become a third party project living on its own? Or even shipped by default? I know there's a JSON dissector (for JSON in the payload); maybe this could be extended to also support what we do here?
  • When generating the PCAP output in Retis the logic is currently to translate the events 1:1 to their JSON representation. It might however make sense to tweak this representation, e.g. adding the computed tracking id for easier filtering or not including the raw packet as it's duplicate data.

@amorenoz

> This is an RFC so I'll only make two high level comments:
>
> * I'm wondering how generic the dissector actually is. From a quick look it does not seem really linked to Retis but is more a generic way to add extra data in a custom block (JSON formatted). Could this become a third party project living on its own? Or even shipped by default? I know there's a JSON dissector (for JSON in the payload); maybe this could be extended to also support what we do here?

At the moment it's quite generic. If it remains generic after the proper review and rework of this feature, I agree we could move it outside or even consider proposing it to upstream Wireshark.

> * When generating the PCAP output in Retis the logic is currently to translate the events 1:1 to their JSON representation. It might however make sense to tweak this representation, e.g. adding the computed tracking id for easier filtering or not including the raw packet as it's duplicate data.

Agreed, some tweaking seems appropriate, e.g. removing the base64 representation of the raw packet.
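
As a rough illustration of that tweak, here is a Python sketch that drops the raw packet from an event before it is serialized and adds a precomputed tracking id. The key names (`packet`, `data`, `tracking_id`) and the function name are hypothetical, since the final JSON layout is still open, and how the tracking id is computed is left to the caller.

```python
import copy


def trim_for_pcap(event, tracking_id=None):
    """Return a copy of the event without the raw packet bytes.

    The raw packet is already stored in the capture itself, so the
    base64 copy inside the event would only duplicate it.
    """
    trimmed = copy.deepcopy(event)
    packet = trimmed.get("packet")
    if isinstance(packet, dict):
        packet.pop("data", None)  # hypothetical key holding the base64 packet
        if not packet:
            del trimmed["packet"]  # nothing left in the section
    if tracking_id is not None:
        trimmed["tracking_id"] = tracking_id  # hypothetical key
    return trimmed
```

Working on a deep copy keeps the original event intact for the other outputs, at the cost of one extra copy per packet.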
