@william-silversmith william-silversmith commented Nov 16, 2025

Adds the ostd file format.

Adds Skeleton.to_osdt and Skeleton.from_osdt.

Adds osdt command line tool for converting, viewing, and printing info from skeleton files.

Why ostd

The ostd file format fills a gap among existing skeleton file formats by offering a self-contained, high-performance, bit-rot-safe binary format for serializing skeletal structures, with support for optional vertex attributes. Incorporating metadata is not a design goal, as this core file is intended to be wrapped in a container file format for that purpose.

Skeleton file formats typically represent geometry as text, JSON, or XML, which uses excess space and requires a comparatively slow parser. Examples include SWC, CSV/TSV, NML, and OBJ. SWC is not easily extensible with additional attributes, and it only supports trees, whereas some skeletons include loops. Other file formats are designed to handle multiple kinds of objects. Still others, e.g. TRK, only support paths. Precomputed only supports a fully general (and space-hungry) edge list.

Precomputed has most of the features one would desire, but it is inflexible about the vertex and edge data types and requires a separate info file to interpret the binary, so it is not stand-alone. Furthermore, it gives no indication of which physical scale to use (nanometers? micrometers?), a weakness shared by most of the other formats. Precomputed also only supports an edge list representation, which is space inefficient. Precomputed smartly includes a 3x4 transform matrix for affine transforms that map from, e.g., voxel to physical space, but a fully forward-compatible design would use a 4x4 matrix, which supports homogeneous coordinates and perspective transforms and, being square, is more broadly compatible with graphics pipelines.
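To illustrate the 3x4 versus 4x4 point, here is a minimal sketch in plain Python. It is not code from this PR: the helper names `promote_3x4_to_4x4` and `apply_transform` and the example voxel size and offset are all hypothetical. It shows that any 3x4 affine matrix can be embedded losslessly in a 4x4 matrix by appending the row `[0, 0, 0, 1]`, and that the 4x4 form also leaves room for a perspective divide.

```python
def promote_3x4_to_4x4(m34):
    """Embed a 3x4 affine matrix in a 4x4 homogeneous matrix (hypothetical helper)."""
    if len(m34) != 3 or any(len(row) != 4 for row in m34):
        raise ValueError("expected a 3x4 matrix")
    # Appending [0, 0, 0, 1] keeps the transform purely affine.
    return [list(row) for row in m34] + [[0.0, 0.0, 0.0, 1.0]]


def apply_transform(m44, point):
    """Apply a 4x4 homogeneous transform to an (x, y, z) point."""
    x, y, z = point
    vec = (x, y, z, 1.0)
    out = [sum(m44[r][c] * vec[c] for c in range(4)) for r in range(4)]
    w = out[3]  # equals 1.0 for affine transforms; differs for perspective ones
    return tuple(v / w for v in out[:3])


# Assumed example: 4 x 4 x 40 nm voxels with a (100, 200, 300) nm offset.
voxel_to_nm = [
    [4.0, 0.0, 0.0, 100.0],
    [0.0, 4.0, 0.0, 200.0],
    [0.0, 0.0, 40.0, 300.0],
]
m44 = promote_3x4_to_4x4(voxel_to_nm)
physical = apply_transform(m44, (1, 2, 3))
```

Under these assumptions, the voxel coordinate `(1, 2, 3)` maps to `(104.0, 208.0, 420.0)` nm.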

Another design issue in Precomputed is that it specifies that edge lists must be little-endian uint32. On its face this is a sensible tradeoff between space and maximum representable size, but many years later we are finally encountering skeletons with more than 2^32 vertices (at least at certain stages of processing).
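One way to avoid a fixed index width is to pick the narrowest unsigned type that can address every vertex. The sketch below is an illustration only, not necessarily how this PR selects the edge dtype. The function name `smallest_index_dtype` is hypothetical, and the labels `U8`, `U32`, and `U64` are assumed by analogy with the `U16` shown in the header dump.

```python
# Candidate unsigned widths, narrowest first. Labels are assumed, not taken from the spec.
UNSIGNED_WIDTHS = (("U8", 8), ("U16", 16), ("U32", 32), ("U64", 64))


def smallest_index_dtype(num_vertices):
    """Return the narrowest unsigned dtype label able to index num_vertices vertices."""
    if num_vertices < 0:
        raise ValueError("num_vertices must be non-negative")
    max_index = max(num_vertices - 1, 0)  # indices run from 0 to num_vertices - 1
    for name, bits in UNSIGNED_WIDTHS:
        if max_index < (1 << bits):
            return name
    raise OverflowError("vertex count exceeds 64-bit indexing")


small = smallest_index_dtype(48184)        # vertex count from the header dump
limit = smallest_index_dtype(2**32)        # largest count uint32 can index
too_big = smallest_index_dtype(2**32 + 1)  # one vertex past the uint32 limit
```

With these rules, the 48,184-vertex example fits in 16-bit indices, which agrees with the `edge dtype: U16` line in the header dump, while a skeleton with one vertex more than 2^32 needs 64-bit indices.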

Run Time Information

Some vital statistics based on a FlyWire SWC file.

ENCODE/DECODE:

save ostd 0.008s
load ostd 0.006s
save swc 0.158s
load swc 0.058s

RELATIVE SIZE:

819K 720575940608181122.ostd
2.2M 720575940608181122.swc

HEADER INFO (doesn't include all info in transform, spatial, or attributes)

magic:             b'ostd'
version:           0

id:                339587994209515012196002228007625829611

num verts:         48184
num edges:         48183

num components:    1
cable length:      10801182.0 nm
graph type:        TREE
representation:    LINKED_PATHS

vertex dtype:      F32
edge dtype:        U16

vert compression:  NONE
edge compression:  NONE

current space:     1
has_transform?:    True
voxel centered?:   True
coord frame:       +X-Y-Z

header  bytes:     88 bytes
vertex bytes:      578212 bytes
edge bytes:        19746 bytes
index bytes:       0 bytes
attr header bytes: 54 bytes
total bytes:       839096 bytes

crc16:             15890

attributes: radius (nm), vertex_types (0)


william-silversmith commented Nov 24, 2025

At least for attributes, it might make more sense to represent units as a 7-tuple of exponents, one for each fundamental physical quantity. This could be packed into a uint32. The advantage is that this scheme would allow the user to represent almost any unit whose exponents fall within a certain range (and I think exponents greater than 4 are pretty rare).

For simplicity, it would be good to use a unified system, but for the header, only a length unit makes sense.

A downside of this system is that representing non-metric units is tricky, though how many people using imperial units would be annoyed at having to convert them?
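The packing idea above could be sketched as follows. This is only an illustration under stated assumptions, not a proposed spec. It assumes the seven quantities are the SI base units in the order m, kg, s, A, K, mol, cd, and that each exponent is stored as a signed 4-bit value (range -8 to 7), which uses 28 of the 32 bits. The names `pack_unit` and `unpack_unit` are hypothetical.

```python
# Assumed ordering of the seven SI base units (not taken from the PR).
BASE_UNITS = ("m", "kg", "s", "A", "K", "mol", "cd")
BITS = 4                      # bits per exponent: 7 * 4 = 28 bits, 4 left spare
MASK = (1 << BITS) - 1
MIN_EXP, MAX_EXP = -8, 7      # range of a signed 4-bit two's complement value


def pack_unit(exponents):
    """Pack seven small signed exponents into one integer that fits in a uint32."""
    if len(exponents) != len(BASE_UNITS):
        raise ValueError("expected one exponent per base unit")
    packed = 0
    for i, exp in enumerate(exponents):
        if not MIN_EXP <= exp <= MAX_EXP:
            raise ValueError(f"exponent {exp} out of range for {BASE_UNITS[i]}")
        packed |= (exp & MASK) << (i * BITS)  # two's complement nibble
    return packed


def unpack_unit(packed):
    """Recover the seven signed exponents from a packed integer."""
    exponents = []
    for i in range(len(BASE_UNITS)):
        nibble = (packed >> (i * BITS)) & MASK
        exponents.append(nibble - 16 if nibble >= 8 else nibble)
    return tuple(exponents)


# Velocity, m^1 * s^-1, as an example of a derived unit.
velocity = pack_unit((1, 0, -1, 0, 0, 0, 0))
roundtrip = unpack_unit(velocity)
```

A tuple of exponents alone does not capture a scale prefix such as nano or micro, so that would still need to be stored separately.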
