bcftools merge seemingly opens all files together, hitting open file limits #2474

@jchorl

I tried to run bcftools merge on thousands of files, and it hit the open file descriptor limit on my Linux machine.

I reproduced this by generating thousands of VCFs, then running:

docker run -it --rm -v "$(pwd)":/work -w /work --ulimit nofile=2048:2048 ubuntu:24.04 bash

apt-get update
apt-get install -y bcftools

bcftools concat -a -O z -f file-list.txt -o /dev/null

The result:

root@b0b48a108195:/work# bcftools concat -a -O z -f file-list.txt -o /dev/null
Checking the headers and starting positions of 10000 files
[E::hts_idx_load3] Could not load local index file 'generated_vcfs/06_02044_chr6.vcf.bgz.csi' : Too many open files
Failed to open generated_vcfs/06_02044_chr6.vcf.bgz: could not load index
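
For anyone triaging this, a quick pre-flight check can show whether a file list is likely to hit the limit before bcftools starts. This is only a sketch: count_inputs and exceeds_fd_limit are hypothetical helper names, not part of bcftools, and the check assumes one path per line in the list. bcftools may hold more than one descriptor per input (the error above occurred while loading an index), so treat the result as a lower bound.

```shell
# count_inputs LIST: print the number of non-empty lines (one path per line).
count_inputs() {
    grep -c . "$1"
}

# exceeds_fd_limit LIST [LIMIT]: exit 0 when the number of inputs is at least
# LIMIT. LIMIT defaults to the shell's soft open-file limit (ulimit -n).
exceeds_fd_limit() {
    efl_n=$(count_inputs "$1")
    efl_limit=${2:-$(ulimit -n)}
    # An unlimited soft limit can never be exceeded by this check.
    [ "$efl_limit" = "unlimited" ] && return 1
    [ "$efl_n" -ge "$efl_limit" ]
}
```

Example: exceeds_fd_limit file-list.txt && echo "file list is at or above the open file limit"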

Intuitively, I would expect merge to handle many, many files. I know I can just do a recursive merge, but does it need to open all the files at the same time?
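
For reference, the recursive (batched) merge workaround mentioned above could look roughly like the sketch below. It assumes every input is bgzipped and indexed, that each batch ends up with at least two files (bcftools merge normally refuses a single input), and that the number of intermediate files is itself below the limit. split_batches and batched_merge are hypothetical helper names; the bcftools calls use only merge -l/-O/-o and index.

```shell
# split_batches LIST BATCH_SIZE PREFIX
# Writes PREFIXaaaa, PREFIXaaab, ... each holding at most BATCH_SIZE paths.
split_batches() {
    split -l "$2" -a 4 "$1" "$3"
}

# batched_merge LIST BATCH_SIZE OUT
# Merges each batch into an intermediate file, then merges the intermediates,
# so that roughly BATCH_SIZE inputs are open at any one time.
# The body runs in a subshell so its variables and set -e stay local.
batched_merge() (
    set -e
    tmp=$(mktemp -d)
    split_batches "$1" "$2" "$tmp/batch."
    : > "$tmp/partials.txt"
    for b in "$tmp"/batch.*; do
        bcftools merge -l "$b" -O z -o "$b.vcf.gz"
        bcftools index "$b.vcf.gz"   # merge needs indexed inputs
        echo "$b.vcf.gz" >> "$tmp/partials.txt"
    done
    # Second round: merge the per-batch outputs into the final file.
    bcftools merge -l "$tmp/partials.txt" -O z -o "$3"
    rm -rf "$tmp"
)
```

Example: batched_merge file-list.txt 500 merged.vcf.gz

This costs an extra pass of compression and indexing per batch, which is why it would be nicer if merge did not need every file open at once.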

Thanks!!
