Dgraph's RDF parser used in dgraph live does not appear to understand \uXXXX escape sequences inside facets.
The dgraph documentation says that it follows the RDF N-Quads spec, which requires support for \uXXXX escape sequences in string literals, but dgraph's implementation does not appear to honor them inside facet values.
I tried adding these test cases to the chunker package's TestLex, and the second one fails:
diff --git a/chunker/rdf_parser_test.go b/chunker/rdf_parser_test.go
index f2c45df5..7c733bbd 100644
--- a/chunker/rdf_parser_test.go
+++ b/chunker/rdf_parser_test.go
@@ -503,6 +503,28 @@ var testNQuads = []struct {
},
expectedErr: false,
},
+ {
+ input: `<alice> <lives> "wonderland" (friend="hatter").`,
+ nq: api.NQuad{
+ Subject: "alice",
+ Predicate: "lives",
+ ObjectId: "",
+ ObjectValue: &api.Value{Val: &api.Value_DefaultVal{DefaultVal: `wonderland`}},
+ Facets: []*api.Facet{{Key: "friend", Value: []byte("hatter"), Tokens: []string{"\001hatter"}}},
+ },
+ expectedErr: false,
+ },
+ {
+ input: `<alice> <lives> "wonderland" (friend="hatter \u0045") .`,
+ nq: api.NQuad{
+ Subject: "alice",
+ Predicate: "lives",
+ ObjectId: "",
+ ObjectValue: &api.Value{Val: &api.Value_DefaultVal{DefaultVal: `wonderland`}},
+ Facets: []*api.Facet{{Key: "friend", Value: []byte("hatter E"), Tokens: []string{"\001hatter E"}}},
+ },
+ expectedErr: false,
+ },
{
input: `<alice> <lives> "\u004 wonderland" .`,
expectedErr: true, // should have 4 hex values after \u
The failure:
--- FAIL: TestLex (0.00s)
rdf_parser_test.go:1008:
Error Trace: rdf_parser_test.go:1008
Error: Received unexpected error:
while lexing <alice> <lives> "wonderland" (friend="hatter \u0045"). at line 1 column 37: Not a valid escape char: 'u'
github.com/dgraph-io/dgraph/lex.(*Lexer).ValidateResult
/Users/adg/t/dgraph/dgraph/lex/lexer.go:200
github.com/dgraph-io/dgraph/chunker.ParseRDF
/Users/adg/t/dgraph/dgraph/chunker/rdf_parser.go:84
github.com/dgraph-io/dgraph/chunker.TestLex
/Users/adg/t/dgraph/dgraph/chunker/rdf_parser_test.go:1000
testing.tRunner
/Users/adg/go/src/testing/testing.go:909
runtime.goexit
/Users/adg/go/src/runtime/asm_amd64.s:1357
Test: TestLex
Messages: Got error for input: "<alice> <lives> \"wonderland\" (friend=\"hatter \\u0045\")."
FAIL
This manifests itself in dgraph live in that if you pass it an RDF of the form
<alice> <lives> "wonderland" (friend="hatter \u0045") .
it will fail with the above error.
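For reference, the decoding I would expect for a facet value can be sketched as below. This is only an illustration of the behavior the N-Quads spec describes, not dgraph's lexer code: `unescapeUnicode` is a hypothetical helper name, it handles only the four-digit `\uXXXX` form, it passes every other backslash escape through untouched, and it makes no attempt to handle `\UXXXXXXXX` or surrogate pairs.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unescapeUnicode is a hypothetical helper (not part of dgraph) that
// replaces each \uXXXX sequence in s with the rune it encodes.
// Other backslash escapes are copied through unchanged.
func unescapeUnicode(s string) (string, error) {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] != '\\' || i+1 >= len(s) {
			b.WriteByte(s[i])
			continue
		}
		if s[i+1] != 'u' {
			// Some other escape such as \" or \\: keep both bytes
			// and skip past the escaped character.
			b.WriteByte(s[i])
			b.WriteByte(s[i+1])
			i++
			continue
		}
		// Need exactly four hex digits after \u.
		if i+6 > len(s) {
			return "", fmt.Errorf("truncated \\u escape at offset %d", i)
		}
		code, err := strconv.ParseUint(s[i+2:i+6], 16, 32)
		if err != nil {
			return "", fmt.Errorf("invalid \\u escape at offset %d: %v", i, err)
		}
		b.WriteRune(rune(code))
		i += 5 // skip 'u' and the four hex digits
	}
	return b.String(), nil
}

func main() {
	out, err := unescapeUnicode(`hatter \u0045`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints "hatter E"
}
```

With something along these lines applied to facet values, the second test case above would yield the facet value "hatter E", while a malformed sequence such as `\u004 ` would still be rejected.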