Skip to content

[DataPipe] Fix OnDiskCacheHolder to list all files for decompressing operations #203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

ejguan
Copy link
Contributor

@ejguan ejguan commented Feb 8, 2022

Stack from ghstack:

Add FileLister to make sure OnDiskCacheHolder can list all of files after any 1-to-N operations like decompression.

Differential Revision: D34085743

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 8, 2022
@ejguan ejguan mentioned this pull request Feb 8, 2022
@ejguan ejguan requested a review from NivekT February 8, 2022 20:03
@ejguan ejguan changed the title [DataPipe] Fix OnDiskCacheHolder to list all files for decompressed operations [DataPipe] Fix OnDiskCacheHolder to list all files for decompressing operations Feb 8, 2022
…ompressing operations"



Add `FileLister` to make sure `OnDiskCacheHolder` can list all of files after any 1-to-N operations like decompression.

[ghstack-poisoned]
@@ -137,6 +137,30 @@ def _read_and_decode(x):
self.assertTrue(os.path.exists(expected_csv_path))
self.assertEqual(expected_csv_path, csv_path)

# Cache decompressed archive but only check root directory
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following test won't pass with previous implementation.

…ompressing operations"



Add `FileLister` to make sure `OnDiskCacheHolder` can list all of files after any 1-to-N operations like decompression.

[ghstack-poisoned]
@ejguan
Copy link
Contributor Author

ejguan commented Feb 8, 2022

@ejguan has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ejguan ejguan requested a review from VitalyFedyunin February 9, 2022 15:49
Copy link
Contributor

@NivekT NivekT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

…ompressing operations"



Add `FileLister` to make sure `OnDiskCacheHolder` can list all of files after any 1-to-N operations like decompression.

Differential Revision: [D34085743](https://our.internmc.facebook.com/intern/diff/D34085743)

[ghstack-poisoned]
…ompressing operations"



Add `FileLister` to make sure `OnDiskCacheHolder` can list all of files after any 1-to-N operations like decompression.

Differential Revision: [D34085743](https://our.internmc.facebook.com/intern/diff/D34085743)

[ghstack-poisoned]
@ejguan
Copy link
Contributor Author

ejguan commented Feb 9, 2022

@ejguan has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

…ompressing operations"



Add `FileLister` to make sure `OnDiskCacheHolder` can list all of files after any 1-to-N operations like decompression.

Differential Revision: [D34085743](https://our.internmc.facebook.com/intern/diff/D34085743)

[ghstack-poisoned]
@ejguan
Copy link
Contributor Author

ejguan commented Feb 9, 2022

@ejguan has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

…ompressing operations"



Add `FileLister` to make sure `OnDiskCacheHolder` can list all of files after any 1-to-N operations like decompression.

Differential Revision: [D34085743](https://our.internmc.facebook.com/intern/diff/D34085743)

[ghstack-poisoned]
@ejguan
Copy link
Contributor Author

ejguan commented Feb 9, 2022

@ejguan has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot facebook-github-bot deleted the gh/ejguan/19/head branch February 13, 2022 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants