Skip to content

Feature/new uast serialization #453

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 5, 2018

Conversation

mcarmonaa
Copy link
Contributor

@mcarmonaa mcarmonaa commented Aug 31, 2018

Instead of serialize UASTs as a JSON array of marshaled UAST nodes encoded as base64, now the list of nodes are serialized as a sequence of marshaled nodes preceded by their lengths, it is:

BigEndianInt32(len(marhsal(node))+marshal(node)+
BigEndianInt32(len(marhsal(node))+marshal(node)+
BigEndianInt32(len(marhsal(node))+marshal(node)+
...

The node size is being encoded as an int32 in big endian to make it directly readable from the JVM just in case we will need manage it from spark. If you think there are better options just let me know and I'll change it.

Also, there is a new flag in the cli --old-uast-serialization and an environment variable GITBASE_UAST_SERIALIZATION to choose if this mode or the old one. This way, we don't brake compatibility for the moment.

@mcarmonaa mcarmonaa requested a review from a team August 31, 2018 11:10
@erizocosmico
Copy link
Contributor

I think given this is a v0 we could actually break compatibility and just enable this mode. WDYT @ajnavarro?

@ajnavarro
Copy link
Contributor

@erizocosmico yep, the idea is to use this new format as default, but provide a way to make it work with dependant applications providing the flag (the flag will be removed in next versions).

@@ -56,6 +56,7 @@ type Server struct {
DisableGit bool `long:"no-git" description:"disable the load of git standard repositories."`
DisableSiva bool `long:"no-siva" description:"disable the load of siva files."`
Verbose bool `short:"v" description:"Activates the verbose mode"`
AltUast bool `long:"alt-uast-serialization" description:"serialize uast in a new format" env:"GITBASE_UAST_SERIALIZATION"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you change the flag to do the opposite? it should active old serialization format.

@ajnavarro
Copy link
Contributor

Pinging @bzz @carlosms to make them aware of that change (anyways, on new release notes we will add a paragraph explaining this breaking change).

@ajnavarro ajnavarro requested review from smola and bzz August 31, 2018 14:05
@bzz
Copy link
Contributor

bzz commented Sep 5, 2018

Thank you for kind notice @ajnavarro! @carlosms thank you for taking care of it

@ajnavarro ajnavarro merged commit 63901f6 into src-d:master Sep 5, 2018
@mcarmonaa mcarmonaa deleted the feature/new-uast-serialization branch September 5, 2018 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants