Skip to content

In fs.ftpfs.FTPFS.upload, replaced self._manage_ftp by self.ftp. #457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 25, 2021

Conversation

atollk
Copy link
Member

@atollk atollk commented Mar 19, 2021

Type of changes

  • Other

(performance improvement)

Checklist

  • I've run the latest black with default args on new code.
  • I've updated CHANGELOG.md and CONTRIBUTORS.md where appropriate.
  • I've added tests for new code.
  • I accept that @PyFilesystem/maintainers may be pedantic in the code review.

Description

Fixes #455

_manage_ftp opens a new FTP connection, which is not necessary in this case, as self._lock is obtained anyway.

See issue 455 for a detailed discussion.
@atollk
Copy link
Member Author

atollk commented Mar 19, 2021

I have left CHANGELOG.md unchanged for now until the full scope of this PR has been set in stone.

I have not added tests since I wouldn't really know how, except for maybe hijacking ftplib itself to check that no inapproprate FTP commands are sent. If that's desired, let me know and I can add it.

@coveralls
Copy link

coveralls commented Mar 19, 2021

Coverage Status

Coverage decreased (-0.0009%) to 95.321% when pulling 20e260b on atollk:issue_455 into 64d7a52 on PyFilesystem:master.

@althonos
Copy link
Member

althonos commented Mar 25, 2021

I think the tests here are already present, as long as the function is tested i think we are fine. Maybe mocking the _manage_ftp, and checking with assert_not_called after calling FTPFS.upload could work.

The question here is whether to extend this to other methods like download, readbytes, writebytes, etc. For instance, in fs.sshfs I implemented all SSHFS methods with a lock, and only open a new connection when a file is obtained with open or openbin. This is likely faster when the FS object is used sequentially, but I'm not sure about parallel performance.

I think we will need benchmarking to know the right approach.

@althonos althonos added this to the v2.4.13 milestone Mar 25, 2021
@atollk
Copy link
Member Author

atollk commented Mar 25, 2021

Sure thing. Is there already an established framework for benchmarks in PyFs or some form of CI?

@althonos
Copy link
Member

I don't think Python has an established benchmarking framework, unfortunately.

In order to avoid blowing the scope of this PR, I'll just merge these changes, and we can consider a performance improvement of larger scale after we made sure which strategy is the most efficient.

@althonos althonos merged commit 94f0323 into PyFilesystem:master Mar 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fs.ftpfs.FTPFS.upload opens new connections
3 participants