add support to TarArchiver for more common compression suffixes #8369

Open: wants to merge 1 commit into base: main

@toffaletti toffaletti commented Mar 13, 2025

  • include more suffixes for: gzip, bzip2, lzma, xz

Motivation:

Add support for compression algorithms that can achieve higher compression ratios than gzip.
Support another common extension for gzip as well.

Modifications:

I modified the list of supported extensions for the TarArchiver.

I researched various manpages to see what was commonly supported.

  • The GNU tar manpage has the most extensive list of suffixes.
  • FreeBSD tar supports most of what GNU tar does in terms of compression algorithms.
  • OpenBSD tar doesn't support the a (auto-compress) flag, which TarArchiver.compress uses (ref). It also doesn't seem to support anything beyond gzip and bzip2 (no lzma or xz).
  • NetBSD tar has gzip, bzip2, lzma, xz
  • Windows tar.exe seems to be a port of BSD tar similar in support to NetBSD.
  • macOS tar is an older FreeBSD tar, similar support to NetBSD.

Result:

More compression algorithms will be supported.
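
As a rough illustration of the change described above, the expanded set might be built along these lines. This is only a sketch and not the PR's actual diff: the specific suffixes are an assumption based on the auto-compress suffix list in the GNU tar manpage, and `suffixesByAlgorithm` and `proposedExtensions` are hypothetical names.

```swift
// Hypothetical sketch: archive suffixes grouped by compression algorithm.
// The exact list is an assumption, not the contents of this PR.
let suffixesByAlgorithm: [String: [String]] = [
    "gzip": ["tar.gz", "tgz"],
    "bzip2": ["tar.bz2", "tbz2", "tbz"],
    "lzma": ["tar.lzma", "tlz"],
    "xz": ["tar.xz", "txz"],
]

// Plain tar plus every compressed variant, flattened into one set.
let proposedExtensions: Set<String> =
    Set(["tar"] + suffixesByAlgorithm.values.flatMap { $0 })
```

Grouping by algorithm keeps it easy to drop a whole family of suffixes on a platform whose tar lacks that algorithm.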

@toffaletti
Author

I wasn't sure about adding test files for all these extensions?

@@ -16,7 +16,14 @@ import struct TSCBasic.FileSystemError

/// An `Archiver` that handles Tar archives using the command-line `tar` tool.
public struct TarArchiver: Archiver {
    public let supportedExtensions: Set<String> = ["tar", "tar.gz"]

@toffaletti
Author

It wasn't clear to me whether "tar.gz" is really needed or if "gz" is good enough here? The tar commands don't seem to care:

 tar tzf test.gz | head
6.0.3-RELEASE_rhel_ubi9_x86_64.artifactbundle/
6.0.3-RELEASE_rhel_ubi9_x86_64.artifactbundle/6.0.3-RELEASE_rhel_ubi9_x86_64/
6.0.3-RELEASE_rhel_ubi9_x86_64.artifactbundle/info.json
6.0.3-RELEASE_rhel_ubi9_x86_64.artifactbundle/6.0.3-RELEASE_rhel_ubi9_x86_64/x86_64-unknown-linux-gnu/
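
The question above comes down to how a file name is matched against the extension set. One way to reason about it is a longest-suffix match, sketched below. `matchedExtension` is a hypothetical helper for illustration and is not how SwiftPM is known to do the matching.

```swift
// Hypothetical helper: returns the longest supported extension that the
// file name ends with, or nil when none matches.
func matchedExtension(of fileName: String, in supported: Set<String>) -> String? {
    return supported
        .filter { fileName.hasSuffix("." + $0) }
        .max { $0.count < $1.count }
}
```

Under this scheme, a set containing only "tar" and "tar.gz" would reject test.gz even though tar itself reads the file, while a set containing "gz" would accept any gzip file, whether or not it wraps a tar archive.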

@dschaefer2
Member

> I wasn't sure about adding test files for all these extensions?

Tests would be extremely helpful. It's hard to know whether these things work or will stay working without them.

@MaxDesiatov added the "needs tests" label (This change needs test coverage) on Mar 17, 2025
@MaxDesiatov
Contributor

MaxDesiatov commented Mar 17, 2025

Another thing to keep in mind is that not all of these formats are supported by tar on all platforms that SwiftPM supports. We need to make sure that, for unsupported formats, an appropriate diagnostic message is emitted to the user, and that this support is detected correctly on a given platform.
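
A minimal sketch of what such a diagnostic could look like follows. All names here are hypothetical: `UnsupportedArchiveFormat` is not an existing SwiftPM type, and `probe` stands in for whatever platform detection ends up being used.

```swift
// Hypothetical error type; SwiftPM's real diagnostics may look different.
struct UnsupportedArchiveFormat: Error, CustomStringConvertible {
    let suffix: String

    var description: String {
        "the tar tool on this platform cannot handle '.\(suffix)' archives"
    }
}

// `probe` is an injected check that reports whether the local tar can
// handle the given suffix. How it decides is left open in this sketch.
func ensureSupported(_ suffix: String, probe: (String) -> Bool) throws {
    guard probe(suffix) else {
        throw UnsupportedArchiveFormat(suffix: suffix)
    }
}
```

Injecting the probe as a closure keeps the diagnostic path testable without depending on which tar is installed on the test machine.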

@toffaletti
Author

> Another thing to keep in mind is that not all of these formats are supported by tar on all platforms that SwiftPM supports. We need to make sure that, for unsupported formats, an appropriate diagnostic message is emitted to the user, and that this support is detected correctly on a given platform.

Is there a canonical list of supported platforms and their versions?

@MaxDesiatov
Contributor

> Is there a canonical list of supported platforms and their versions?

Yes, it's available at https://swift.org/install

@tbkka

tbkka commented Jul 15, 2025

There are many, many different tar implementations on many different platforms. If you want consistent cross-platform behavior, you'll need to implement this yourself. A few notes:

  • GNU tar only detects the compression and format when reading from a file on disk. It can't auto-detect when reading from stdin.
  • bsdtar can auto-detect from any input source.
  • Few other tar implementations do any auto-detection at all. So for portability, you'll need to pass appropriate command-line flags to tar.
  • The specific compression formats supported by either of the above depend on how it was built and what platform it's running on. Both will attempt to run command-line gzip, xz, etc. to handle decompression, which means that they require the appropriate command-line tool to be installed. bsdtar can be built to use compression libraries for fully internal compression, but those are all optional. So the only way to know for sure is to actually try feeding something to tar: you can't just look at the OS version and know anything useful. (In particular, different Linux distros vary widely in what tar implementation they include by default. Many use GNU tar, but by no means all.)
  • Windows only recently started including tar as a standard command-line utility. Older Windows systems lack this.
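
The point about passing explicit flags instead of relying on auto-detection could be sketched as below. The function names are hypothetical. The flags shown (-z, -j, -J, --lzma) exist in GNU tar and bsdtar, but other tar implementations may not accept all of them, so this is an assumption to verify per platform.

```swift
// Hypothetical mapping from archive suffix to an explicit tar compression flag.
// Returns nil for plain tar archives and for unknown suffixes.
func compressionFlag(forSuffix suffix: String) -> String? {
    switch suffix {
    case "tar.gz", "tgz": return "-z"
    case "tar.bz2", "tbz2", "tbz": return "-j"
    case "tar.xz", "txz": return "-J"
    case "tar.lzma", "tlz": return "--lzma"
    default: return nil
    }
}

// Builds the argument list for an extraction, naming the compression
// explicitly so that tar does not need to auto-detect it.
func extractArguments(archive: String, destination: String, suffix: String) -> [String] {
    var arguments = ["-x"]
    if let flag = compressionFlag(forSuffix: suffix) {
        arguments.append(flag)
    }
    arguments += ["-f", archive, "-C", destination]
    return arguments
}
```

Even with explicit flags, tar can still fail when the underlying compression tool or library is missing, so the result of the actual invocation remains the only reliable signal of support.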
