Skip to content

Fix EndTranscodingSession() call and potential race#2735

Merged
cyberj0g merged 1 commit intomasterfrom
ip/fix-fv-errors
Jan 25, 2023
Merged

Fix EndTranscodingSession() call and potential race#2735
cyberj0g merged 1 commit intomasterfrom
ip/fix-fv-errors

Conversation

@cyberj0g
Copy link
Copy Markdown
Contributor

@cyberj0g cyberj0g commented Jan 25, 2023

Addresses #2732

Despite all efforts, I was yet unable to reproduce the error locally, however, I added a fix for related issue, and the fix which potentially should address this issue.

Findings

  • EndTranscodingSession() call here was failing, because of context cancelled error. It's not clear why this context, which is, presumably, goes all the way down from HandlePush(), is causing that error on gRPC communication, but it's consistently reproducible. Fixed by changing it to Context.Background().
  • EndTranscodingSession() is called again here on stream end. There might be a race condition between receiving from the stop channel and locking the mutex to cleanup the session on T. We had reports of noop transcoding jobs (empty segments?), could it be that it caused EndTranscodingSession() to be called twice in a very quick succession, if above context cancelled issue doesn't happen on all systems? Addressed by removing the session id from the map right in EndTranscodingSession() with mutex locked.

@cyberj0g cyberj0g requested a review from thomshutt January 25, 2023 13:04
@codecov
Copy link
Copy Markdown

codecov bot commented Jan 25, 2023

Codecov Report

Merging #2735 (def4696) into master (af495ed) will increase coverage by 0.01272%.
The diff coverage is 50.00000%.

Impacted file tree graph

@@                 Coverage Diff                 @@
##              master       #2735         +/-   ##
===================================================
+ Coverage   56.33553%   56.34825%   +0.01272%     
===================================================
  Files             88          88                 
  Lines          19146       19147          +1     
===================================================
+ Hits           10786       10789          +3     
+ Misses          7768        7767          -1     
+ Partials         592         591          -1     
Impacted Files Coverage Δ
core/orchestrator.go 78.20690% <0.00000%> (+0.13793%) ⬆️
core/lb.go 80.11696% <50.00000%> (+0.70520%) ⬆️
server/rpc.go 72.02381% <100.00000%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update af495ed...def4696. Read the comment docs.

Impacted Files Coverage Δ
core/orchestrator.go 78.20690% <0.00000%> (+0.13793%) ⬆️
core/lb.go 80.11696% <50.00000%> (+0.70520%) ⬆️
server/rpc.go 72.02381% <100.00000%> (ø)

@cyberj0g cyberj0g merged commit d8ad5c7 into master Jan 25, 2023
@cyberj0g cyberj0g deleted the ip/fix-fv-errors branch January 25, 2023 15:38
eliteprox pushed a commit to eliteprox/go-livepeer that referenced this pull request Feb 3, 2023
@thomshutt thomshutt mentioned this pull request Feb 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants