Fix data race in regular expression processing#4065
Conversation
There was a problem hiding this comment.
✅ A review job has been created and sent to the PullRequest network.
@mangalaman93 you can click here to see the review status or cancel the code review job.
There was a problem hiding this comment.
Looks good! Before committing, please check out my comment about two-dimensional slice copying -- apologies if I misunderstood the bug fix.
Reviewed with ❤️ by PullRequest
martinmr
left a comment
There was a problem hiding this comment.
except for a couple of comments.
- add a comment to explain why the fix is needed and what the issue is. Otherwise it's unclear why we are doing this instead of the much simpler
uids = arg.q.UidList. - Make the formatting changes in a separate PR.
Reviewed 2 of 3 files at r1.
Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @pawanrawal)
worker/task.go, line 964 at r1 (raw file):
// If this is a filter eval, use the given uid list (good) case arg.q.UidList != nil: uids.Uids = append(arg.q.UidList.Uids[:0:0], arg.q.UidList.Uids...)
I would add a comment here explaining why we are doing this for future reference. Why we are doing this instead of just setting uids to arg.q.UidList will not be obvious to someone who lacks the context of this issue.
martinmr
left a comment
There was a problem hiding this comment.
Reviewed 1 of 3 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @pawanrawal)
pawanrawal
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @martinmr)
worker/task.go, line 964 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
I would add a comment here explaining why we are doing this for future reference. Why we are doing this instead of just setting
uidsto arg.q.UidList will not be obvious to someone who lacks the context of this issue.
I agree, please add a elaborate comment. It took me a while to understand the race here. Basically concurrent regex filters SubGraphs are sharing their SrcUid slice pointer which is passed to processTask in the form of UidList. In this case we mutate this pointer in algo.IntersectWith below and hence the race.
There might be other cases in this package where a similar thing might be happening which makes me wonder if we should just create a copy of SrcUids while passing to UidList in query.go and fix it all in one place.
mangalaman93
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @martinmr, and @pullrequest[bot])
worker/task.go, line 964 at r1 (raw file):
Previously, pullrequest[bot] wrote…
I don't have context on this change or know much about the code, but I'm assuming the race condition has to do with
args.q.UidListbeing mutated elsewhere and this storing a pointer to the slice, instead of the slice's item values. If this is the case and we care about the preserving the original values of the sub items (e.g. arg.q.UidList.Uids[1][2]), then we'll need to copy each individual slice inUids. Also, I'd recommend using golangcopyfunction to make it more clear with what is going on. The following code snippet demonstrates what I'm referring to. It includes three ways of "copying" a slice -- to clarify, I'm recommending using the way foo4 is copied.package main type foo struct { bar [][]string } func main() { foo1 := foo{ bar: [][]string{[]string{"a"}}, } foo2 := foo{} foo3 := foo{} foo4 := foo{} foo3.bar = foo1.bar foo2.bar = append(foo1.bar[:0:0], foo1.bar...) foo4.bar = make([][]string, len(foo1.bar)) for i := range foo1.bar { foo4.bar[i] = make([]string, len(foo1.bar[i])) copy(foo4.bar[i], foo1.bar[i]) } foo1.bar[0][0] = "z" println(foo1.bar[0][0]) // prints z println(foo2.bar[0][0]) // prints z println(foo3.bar[0][0]) // prints z println(foo4.bar[0][0]) // prints a }If my assumption about what we're fixing is wrong, sorry and just ignore me :P
The type of arg.q.UidList.Uids is `[]uint64=. Not sure if that was understood. I also like the idea of using copy function instead. It is more readable.
worker/task.go, line 964 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
I agree, please add a elaborate comment. It took me a while to understand the race here. Basically concurrent regex filters SubGraphs are sharing their SrcUid slice pointer which is passed to processTask in the form of UidList. In this case we mutate this pointer in
algo.IntersectWithbelow and hence the race.There might be other cases in this package where a similar thing might be happening which makes me wonder if we should just create a copy of SrcUids while passing to UidList in
query.goand fix it all in one place.
Done.
I plan to work on adding support for filtering on predicates even when an index doesn't exist. I will write more test cases there to ensure that there are no more races in this code. #4077
0111d7a to
e79698a
Compare
There was a problem hiding this comment.
⚠️ Warning
PullRequest detected a force-push on this branch. This may have caused some information to be lost, and additional time may be required to complete review of the code. Read More
11a28eb to
dc7ebf8
Compare
pawanrawal
left a comment
There was a problem hiding this comment.
Reviewed 1 of 3 files at r1, 2 of 2 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @martinmr, and @pullrequest[bot])
worker/task.go, line 964 at r1 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
The type of
arg.q.UidList.Uidsis `[]uint64=. Not sure if that was understood. I also like the idea of using copy function instead. It is more readable.
Are you going to change the code to use copy then?
manishrjain
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @mangalaman93, @martinmr, and @pullrequest[bot])
worker/task.go, line 968 at r2 (raw file):
// a copy of the list to avoid the race conditions later. uids = &pb.List{} uids.Uids = append(arg.q.UidList.Uids[:0:0], arg.q.UidList.Uids...)
why is this a problem only for regular expressions. Seems like this should be a general enough problem applicable to other things as well.
mangalaman93
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @manishrjain, @martinmr, and @pullrequest[bot])
worker/task.go, line 964 at r1 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Are you going to change the code to use
copythen?
I did, but then there are two things that I realized (and reverted the change) -
- The code for copying is longer if I use
- More likelihood for bug, for example I forgot to check for
nilcondition.
I have mentioned in the comment that the code below makes a copy.
worker/task.go, line 968 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
why is this a problem only for regular expressions. Seems like this should be a general enough problem applicable to other things as well.
I thought so too, I did a race condition test but didn't find any. My understanding is that because regexp is only function that is allowed in filter condition even when predicates do not have an index, and this was new code that was added to support the aforementioned case, the race condition is only in this part of the code.
manishrjain
left a comment
There was a problem hiding this comment.
Expand on comments about why this is a problem affecting only this piece of code.
Reviewed 1 of 3 files at r1, 2 of 2 files at r2.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @mangalaman93, @martinmr, and @pullrequest[bot])
worker/task.go, line 968 at r2 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
I thought so too, I did a race condition test but didn't find any. My understanding is that because
regexpis only function that is allowed infiltercondition even when predicates do not have an index, and this was new code that was added to support the aforementioned case, the race condition is only in this part of the code.
Add a comment with this explanation, so future readers know why you're doing extra work here, when not anywhere else. In general, smaller fixes need longer comments.
dc7ebf8 to
4168eaa
Compare
Fixes #4030
This change is