Skip to content

drivers: Refactor package as a more generic and performant library#2223

Merged
victorges merged 9 commits intomasterfrom
vg/feat/drivers-lib
May 19, 2022
Merged

drivers: Refactor package as a more generic and performant library#2223
victorges merged 9 commits intomasterfrom
vg/feat/drivers-lib

Conversation

@victorges
Copy link
Copy Markdown
Contributor

@victorges victorges commented Feb 1, 2022

What does this pull request do? Explain your changes. (required)
This is to make some improvements to the drivers package so that it can also be used
as a library from other projects, specifically the task-runner for VOD tasks. It is currently
using the drivers library for a platform-agnostic way of accessing Object Stores for reading
and writing VOD assets.

The greatest advantage of that is that we can just use the same Object Store API from
livepeer-com to declare where assets should be stored as well as stream recordings,
keeping the whole interface leaner.

Specific updates (required)

  • Make the Size property optional in our File representation. It is optional in the AWS SDK and is not returned in all implementations of it, like previous versions of Filebase, so it was required at some point (not anymore) but I still think it makes sense to keep it.
  • Make the drivers receive an io.Reader instead of the whole buffered file. The reader is compatible with clients that have the whole bytes in memory, but requiring all clients to buffer the entire contents in memory is not ideal. This would have made the VOD tasks much more inefficient, needing to read the entire files in memory before sending to the OS.
  • Use an S3 Uploader for uploading files to S3. This avoids the need of buffering the files before uploading, since the simpler upload SDK requires the file to be an io.ReadSeeker due to hashing purposes. The uploader is the suggested solution in the open issue in their SDK repo.

How did you test each of these updates (required)

  • Fixed and then ran/passed all tests in go-livepeer
  • Used the drivers lib in the task-runner for VOD and already deployed it in production, it's working great
  • Did a bunch of performance tests with Storj folks to come up with the optimal configurations for the S3 uploader (which end up being good both for Storj and for GCP)

Does this pull request close any open issues?
Part of livepeer/task-runner#14

Checklist:

(does it need a changelog update?)

@victorges victorges changed the base branch from master to vg/fix/drivers-file-size February 1, 2022 16:19
@victorges victorges force-pushed the vg/feat/drivers-lib branch from ad83025 to 32501a0 Compare March 3, 2022 14:16
@victorges victorges changed the base branch from vg/fix/drivers-file-size to master March 29, 2022 23:13
@victorges victorges force-pushed the vg/feat/drivers-lib branch 3 times, most recently from a33e9b4 to 9f8c762 Compare March 29, 2022 23:25
@victorges victorges changed the title Vg/feat/drivers lib drivers: Export drivers package as a more generic and performant library Mar 29, 2022
@victorges victorges changed the title drivers: Export drivers package as a more generic and performant library drivers: Refactor package as a more generic and performant library Mar 29, 2022
@victorges victorges marked this pull request as ready for review March 29, 2022 23:34
@victorges victorges requested review from leszko and yondonfu and removed request for darkdarkdragon March 29, 2022 23:34
Copy link
Copy Markdown
Contributor

@leszko leszko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 2 comments, other than that LGTM 👍

Also remember to:

  • Update CHANGELOG_PENDING.md
  • Consider squashing some commits before merging

@victorges victorges force-pushed the vg/feat/drivers-lib branch from 9f8c762 to 8726c58 Compare April 13, 2022 15:41
@victorges victorges force-pushed the vg/feat/drivers-lib branch from 8726c58 to f142c67 Compare May 6, 2022 19:09
victorges added 6 commits May 19, 2022 12:51
The Filebase S3-compatible API does not return a
file size in the response for files greater than 255 bytes.

That caused a panic on s3.go:267 when reading recordings.

With this change, the problem won't happen anymore cause
we don't even try to dereference the ContentLength, but
hold it as a nilable pointer that should be checked wherever
it is used (right now: nowhere).
We want to use this lib for saving large files to ObjectStore
as well for the VOD tasks. Since they can be really large, we
don't want to have to buffer the whole files in memory for
saving. So changing this API to receive a Reader instead makes
more sense.
Avoid the need to buffer the whole input in memory before saving.
Will also be more performant for large files.
Increase the concurrency and the part size so the uploads
are more performant in our production environment.

These are also more compatible with Storj's configuration
which saves files in 64MB chunks on the decentralized network,
so that part size also works well for them. But it still improves
upload performance to gcloud in around 10x as well.
@victorges victorges force-pushed the vg/feat/drivers-lib branch from f142c67 to b76b14f Compare May 19, 2022 15:51
@codecov
Copy link
Copy Markdown

codecov bot commented May 19, 2022

Codecov Report

Merging #2223 (1f5892b) into master (30b7330) will decrease coverage by 0.06086%.
The diff coverage is 33.33333%.

❗ Current head 1f5892b differs from pull request most recent head 5c28f2f. Consider uploading reports for the commit 5c28f2f to get more accurate results

Impacted file tree graph

@@                 Coverage Diff                 @@
##              master       #2223         +/-   ##
===================================================
- Coverage   54.93891%   54.87805%   -0.06086%     
===================================================
  Files             93          93                 
  Lines          19316       19352         +36     
===================================================
+ Hits           10612       10620          +8     
- Misses          8115        8140         +25     
- Partials         589         592          +3     
Impacted Files Coverage Δ
core/orchestrator.go 78.75723% <0.00000%> (ø)
drivers/gs.go 10.23622% <0.00000%> (-0.16379%) ⬇️
drivers/s3.go 11.08179% <27.27273%> (+0.94093%) ⬆️
drivers/local.go 50.00000% <30.00000%> (-1.11111%) ⬇️
server/segment_rpc.go 78.66667% <50.00000%> (-0.08519%) ⬇️
drivers/drivers.go 27.96610% <66.66667%> (ø)
drivers/overwrite_queue.go 84.48276% <100.00000%> (ø)
drivers/session_mock.go 52.63158% <100.00000%> (ø)
server/broadcast.go 78.24497% <100.00000%> (ø)
server/mediaserver.go 66.13672% <100.00000%> (ø)
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 30b7330...5c28f2f. Read the comment docs.

victorges added 3 commits May 19, 2022 13:12
It returns an error so had to change a few
signatures.
Also add comments with context about how those numbers came up.
@victorges victorges merged commit 13165dd into master May 19, 2022
@victorges victorges deleted the vg/feat/drivers-lib branch May 19, 2022 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants