This repository was archived by the owner on Feb 12, 2024. It is now read-only.

major bottleneck in the usage of write-file-atomic #1785

Closed
@mcollina

Description

  • Version: master
  • Platform: Mac OS X
  • Subsystem: unix-fs

I think I have identified the source of the majority of latency (and thus GC) issues in js-ipfs. As noted in ipfs/js-ipfs-repo#183, js-ipfs needs to make sure that files are properly written to disk to avoid missing-file issues in case of crashes; this matches the behavior of the Go implementation.

The write-file-atomic module on npm is used to implement this behavior. In our latest benchmarks, this module is the major bottleneck of both synchronous and asynchronous interactions with ipfs.

On the synchronous side, write-file-atomic uses fs.realpath(). Inside fs.realpath(), the LOOP function shows up in our flamegraphs because it needs to walk down the path and resolve all links (see the flamegraph screenshot from 2018-12-18). This function alone can account for 5-10% of the total execution time.
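As a rough way to see that cost in isolation, here is a minimal micro-benchmark sketch; the target path and the iteration count are illustrative, not taken from the issue:

```js
// Micro-benchmark sketch: measure the standalone cost of fs.realpathSync()
// on a moderately deep path. Use any existing path on your machine.
const fs = require('fs')

const target = '/usr/local/lib/node_modules' // illustrative, adjust as needed

const start = process.hrtime.bigint()
for (let i = 0; i < 10000; i++) {
  // realpath walks every segment of the path, resolving symlinks as it goes
  fs.realpathSync(target)
}
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6
console.log(`10000 realpathSync calls took ${elapsedMs.toFixed(1)} ms`)
```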

On the asynchronous side, write-file-atomic schedules at least 5 async file system operations and 11 microtask queue operations. Moreover, it does an lstat for every segment of the path, and schedules as many process.nextTick callbacks. For a folder at depth 6, this grows to 10 file system operations, 11 microtask queue operations, and 10 process.nextTick calls.
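To make the depth scaling concrete, here is a small sketch (mine, not code from write-file-atomic) that mimics the per-segment walk: one lstat per component of the destination path, so the fs operation count grows linearly with directory depth:

```js
// Sketch of the per-segment walk: one lstat() per path component, mirroring
// what realpath-style resolution has to do to detect symlinks along the way.
const fs = require('fs')
const path = require('path')

function lstatWalk (p) {
  const resolved = path.resolve(p)
  const parts = resolved.split(path.sep).filter(Boolean)
  let current = path.parse(resolved).root
  let lstats = 0
  for (const part of parts) {
    current = path.join(current, part)
    fs.lstatSync(current) // assumes each component exists; a missing one throws
    lstats++
  }
  return lstats
}

// a destination at depth 6 costs ~6 lstat calls before any data is written
console.log(lstatWalk('/tmp/a/b/c/d/e/out.txt'))
```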

js-ipfs writes a lot of small files to disk, and all of that activity adds up. Note that files are not written in parallel but in sequence, so all of that activity sits in the hot path and increases the overall latency of adding and receiving a file. Some of those fs operations can take up to 20ms to complete: when transferring big files, this immediately becomes the major bottleneck of ipfs. As an example, adding a 64MB file with a chunk size of 262144 into a folder that is 5 levels deep from the root will cause 256 * 5 * 5 = 6400 asynchronous operations, plus 256 * 20 = 5120 scheduling (Promise + nextTick) operations.
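The same arithmetic written out, with every input taken from the example above:

```js
// Back-of-envelope transcription of the numbers in the example above.
const chunks = (64 * 1024 * 1024) / 262144 // 256 chunks of 256 KiB each
const asyncOps = chunks * 5 * 5            // 6400 asynchronous fs operations
const scheduling = chunks * 20             // 5120 Promise/nextTick schedulings
console.log({ chunks, asyncOps, scheduling })
```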

https://upload.clinicjs.org/public/c9d1a87e0ac4c3e3ea7330016f069d9c346ea12f3c97c673eded56aad54311ae/72590.clinic-bubbleprof.html#x11-c194

(Bubbleprof screenshot, 2018-12-17.)

To solve this issue, I recommend:

  1. clarify the algorithm for “atomic writes” with the Go team, ideally reviewing their code.
  2. write a new tiny module for doing atomic writes in the context of ipfs and replace write-file-atomic (see the sketch after this list).
  3. investigate whether the files could be processed in parallel instead of serially, ideally using a sliding-window approach (also sketched below).
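For point 2, here is a minimal sketch of what such a module could look like, assuming the usual strategy of writing a temp file in the destination directory, fsyncing it, and renaming it into place. The names and the error handling are illustrative only; this is not the module that was eventually written:

```js
// Minimal atomic-write sketch: write to a temp file next to the destination,
// fsync, then rename. rename() is atomic on POSIX filesystems as long as the
// source and destination live on the same filesystem.
const fs = require('fs')
const path = require('path')
const { promisify } = require('util')

const open = promisify(fs.open)
const write = promisify(fs.write)
const fsync = promisify(fs.fsync)
const close = promisify(fs.close)
const rename = promisify(fs.rename)
const unlink = promisify(fs.unlink)

async function writeAtomic (dest, data) { // data is a Buffer
  // the temp file must be in the same directory so the final rename is atomic
  const tmp = path.join(path.dirname(dest), `.${path.basename(dest)}.${process.pid}.tmp`)
  const fd = await open(tmp, 'w')
  try {
    await write(fd, data, 0, data.length)
    await fsync(fd) // flush to disk before publishing the file via rename
  } finally {
    await close(fd)
  }
  try {
    await rename(tmp, dest)
  } catch (err) {
    await unlink(tmp).catch(() => {}) // best-effort cleanup of the temp file
    throw err
  }
}
```

And for point 3, a sliding window can be as simple as capping the number of in-flight writes; the limit of 8 below is arbitrary:

```js
// Sliding-window sketch: keep at most `limit` writes in flight instead of
// writing strictly in sequence (error handling omitted for brevity).
async function writeAllWindowed (entries, limit = 8) {
  const pending = new Set()
  for (const { dest, data } of entries) {
    const p = writeAtomic(dest, data).finally(() => pending.delete(p))
    pending.add(p)
    if (pending.size >= limit) {
      await Promise.race(pending) // wait for a slot in the window to free up
    }
  }
  await Promise.all(pending) // drain the remaining writes
}
```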

Labels: P2 (Medium: Good to have, but can wait until someone steps up), exp/expert (Having worked on the specific codebase is important)
