Write sparse index #563

Merged
merged 31 commits into from
Nov 4, 2022
7f012cf
add `is_sparse` access method for `State`
Oct 20, 2022
8a8a53e
add sparse index text fixtures
Oct 20, 2022
5589a7f
add temporary sparse index playground testfile
Oct 20, 2022
ddaa003
added tests for reading sparse indexes
Oct 20, 2022
762e4cb
capability to write `sdir`extension
Oct 20, 2022
66a675f
added first tests and implementation for writing the `sdir` extension
Oct 20, 2022
77a9d42
updated docs
Oct 27, 2022
b05a2e7
update `gix progress` records
Oct 27, 2022
cd1c752
regenerated archive
Oct 27, 2022
4a6d46f
WIP: sketch out how a write implementation could work
Oct 27, 2022
70963f5
Merge branch 'main' into write-sparse-index
Byron Oct 27, 2022
5bfd947
thanks clippy
Byron Oct 27, 2022
a929bcf
refactor
Byron Oct 27, 2022
2012b27
respect the current 'is_sparse()` state when writing.
Byron Oct 27, 2022
3e37443
Make clear in code that mandatory extensions will always be written…
Byron Oct 27, 2022
3173c0b
added fixture, adjusted tests, refactor
Nov 2, 2022
646b868
thanks clippy
Nov 2, 2022
3683963
refactor
Nov 2, 2022
e41ad0f
add and use `checked_is_sparse()` instead of cached `is_sparse` flag
Nov 2, 2022
c4e6849
Merge branch 'main' into write-sparse-index
Byron Nov 3, 2022
5406630
Merge branch 'main' into write-sparse-index (upgrade to Rust 1.65)
Byron Nov 4, 2022
fe1e646
make note of `extension.worktreeConfig`
Byron Nov 4, 2022
1ec27f8
take note of additional options for promisor packs and partial clone …
Byron Nov 4, 2022
ad44982
notes about the split-index extension.
Byron Nov 4, 2022
e61957e
bake knowledge about sparse related config parameters into types.
Byron Nov 4, 2022
53af48c
Act like git and write a sparse index even if it contains no dir entr…
Byron Nov 4, 2022
0a74625
refactor
Byron Nov 4, 2022
177d1c8
Remove tests and scaffolding code that probably won't be implemented …
Byron Nov 4, 2022
3de621e
update crate-status for `git-index` to match current state.
Byron Nov 4, 2022
49b539b
thanks clippy
Byron Nov 4, 2022
da96d34
plan `index.version` for when we can write V4 indices.
Byron Nov 4, 2022
22 changes: 17 additions & 5 deletions crate-status.md
@@ -378,12 +378,26 @@ The git staging area.
* [x] REUC resolving undo
* [x] UNTR untracked cache
* [x] FSMN file system monitor cache V1 and V2
* [x] EOIE end of index entry
* [x] IEOT index entry offset table
* [x] 'link' base indices to take information from, split index
* [x] 'sdir' sparse directory entries - marker
* [x] 'sdir' [sparse directory entries](https://github.blog/2021-08-16-highlights-from-git-2-33/) - marker
* [x] verification of entries and extensions as well as checksum
* write
* [x] V2
* [x] V3 - extension bits
* [ ] V4
* extensions
* [x] TREE
* [ ] REUC
* [ ] UNTR
* [ ] FSMN
* [x] EOIE
* [x] 'sdir'
* [ ] 'link'
* `stat` update
* [ ] optional threaded `stat` based on thread_cost (aka preload)
* [ ] handling of `.gitignore` and system file exclude configuration
* [x] handling of `.gitignore` and system file exclude configuration
* [ ] handle potential races
* maintain extensions when altering the cache
* [ ] TREE for speeding up tree generation
@@ -394,14 +408,12 @@ The git staging area.
* [ ] IEOT index entry offset table
* [ ] 'link' base indices to take information from, split index
* [ ] 'sdir' sparse directory entries
* additional support
* [ ] non-sparse
* [ ] sparse (search for [`sparse index` here](https://github.blog/2021-08-16-highlights-from-git-2-33/))
* add and remove entries
* [x] API documentation
* [ ] Some examples

### git-commitgraph

* [x] read-only access
* [x] Graph lookup of commit information to obtain timestamps, generation and parents, and extra edges
* [ ] Bloom filter index
12 changes: 11 additions & 1 deletion git-index/src/access.rs → git-index/src/access/mod.rs
@@ -1,6 +1,9 @@
use crate::{entry, extension, Entry, PathStorage, State, Version};
use bstr::{BStr, ByteSlice};

use crate::{entry, extension, Entry, PathStorage, State, Version};
// TODO: integrate this somehow, somewhere, depending on later usage.
#[allow(dead_code)]
mod sparse;

/// General information and entries
impl State {
@@ -100,6 +103,13 @@ impl State {
pub fn entry(&self, idx: usize) -> &Entry {
&self.entries[idx]
}

/// Returns a boolean value indicating whether the index is sparse or not.
///
/// An index is sparse if it contains at least one [Mode::DIR][entry::Mode::DIR] entry.
pub fn is_sparse(&self) -> bool {
self.is_sparse
}
}

/// Extensions
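
The doc comment on `is_sparse()` defines sparseness as "at least one `Mode::DIR` entry". A minimal sketch of that definition follows, computed by scanning entries rather than reading a cached flag. The `Entry` struct and the `has_dir_entry` helper are hypothetical stand-ins for illustration, not the crate's types; only the two mode constants are taken from the diff.

```rust
// Mode values as they appear in `git-index/src/entry/mode.rs` in this diff.
const DIR: u32 = 0o040000;
const FILE: u32 = 0o100644;

/// Hypothetical, simplified index entry: just a path and a mode.
struct Entry {
    path: &'static str,
    mode: u32,
}

/// Hypothetical helper: true if at least one entry is a sparse directory entry.
fn has_dir_entry(entries: &[Entry]) -> bool {
    entries.iter().any(|e| e.mode == DIR)
}

fn main() {
    let full = [Entry { path: "a", mode: FILE }, Entry { path: "c1/a", mode: FILE }];
    let sparse = [Entry { path: "a", mode: FILE }, Entry { path: "d/", mode: DIR }];

    for (name, entries) in [("full", &full[..]), ("sparse", &sparse[..])] {
        let paths: Vec<_> = entries.iter().map(|e| e.path).collect();
        println!("{name}: {paths:?} has DIR entry: {}", has_dir_entry(entries));
    }

    assert!(!has_dir_entry(&full));
    assert!(has_dir_entry(&sparse));
}
```

Note that a later commit in this PR ("Act like git and write a sparse index even if it contains no dir entr…") suggests the final behavior is not determined by this scan alone, so treat the sketch as an illustration of the documented definition only.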
59 changes: 59 additions & 0 deletions git-index/src/access/sparse.rs
@@ -0,0 +1,59 @@
/// Configuration related to sparse indexes.
#[derive(Debug, Default, Clone, Copy)]
pub struct Options {
/// If true, certain entries in the index will be excluded / skipped for certain operations,
/// based on the ignore patterns in the `.git/info/sparse-checkout` file. These entries will
/// carry the [`SKIP_WORKTREE`][crate::entry::Flags::SKIP_WORKTREE] flag.
///
/// This typically is the value of `core.sparseCheckout` in the git configuration.
pub sparse_checkout: bool,

/// Interpret the `.git/info/sparse-checkout` file using _cone mode_.
///
/// If true, _cone mode_ is active and entire directories will be included in the checkout, as well as files in the root
/// of the repository.
/// If false, non-cone mode is active and entries to _include_ will be matched with patterns like those found in `.gitignore` files.
///
/// This typically is the value of `core.sparseCheckoutCone` in the git configuration.
pub directory_patterns_only: bool,

/// If true, will attempt to write a sparse index file which only works in cone mode.
///
/// A sparse index has [`DIR` entries][crate::entry::Mode::DIR] that represent entire directories to be skipped
/// during checkout and other operations due to the added presence of
/// the [`SKIP_WORKTREE`][crate::entry::Flags::SKIP_WORKTREE] flag.
///
/// This is typically the value of `index.sparse` in the git configuration.
pub write_sparse_index: bool,
}

impl Options {
/// Derive a valid mode from all parameters that affect the 'sparseness' of the index.
///
/// Some combinations of them degenerate to one particular mode.
pub fn sparse_mode(&self) -> Mode {
match (
self.sparse_checkout,
self.directory_patterns_only,
self.write_sparse_index,
) {
(true, true, true) => Mode::IncludeDirectoriesStoreIncludedEntriesAndExcludedDirs,
(true, true, false) => Mode::IncludeDirectoriesStoreAllEntriesSkipUnmatched,
(true, false, _) => Mode::IncludeByIgnorePatternStoreAllEntriesSkipUnmatched,
(false, _, _) => Mode::Disabled,
}
}
}

/// Describes how a sparse index should be written, or whether one should be written at all.
#[derive(Debug)]
pub enum Mode {
/// Index with DIR entries for excluded directories and file entries for included ones; the `.git/info/sparse-checkout` file holds directory-only include patterns.
IncludeDirectoriesStoreIncludedEntriesAndExcludedDirs,
/// Index with all file entries and skip-worktree flags for exclusion; the `.git/info/sparse-checkout` file holds directory-only include patterns.
IncludeDirectoriesStoreAllEntriesSkipUnmatched,
/// Index with all file entries and skip-worktree flags for exclusion; the `.git/info/sparse-checkout` file holds `ignore` patterns that select the entries to include.
IncludeByIgnorePatternStoreAllEntriesSkipUnmatched,
/// A regular index with all entries, none of which is excluded; the `.git/info/sparse-checkout` file is not considered.
Disabled,
}
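
To make the degenerate combinations in `sparse_mode()` easier to see, here is a standalone sketch that copies the `Options` and `Mode` types from this file and checks a few inputs. The `PartialEq` derive on `Mode` is an addition for the assertions only and is not part of the diff; since the module is private and marked `dead_code`, the copy is used here rather than a crate import.

```rust
#[derive(Debug, Default, Clone, Copy)]
struct Options {
    sparse_checkout: bool,         // `core.sparseCheckout`
    directory_patterns_only: bool, // `core.sparseCheckoutCone`
    write_sparse_index: bool,      // `index.sparse`
}

// `PartialEq` is added only so the results can be compared below.
#[derive(Debug, PartialEq)]
enum Mode {
    IncludeDirectoriesStoreIncludedEntriesAndExcludedDirs,
    IncludeDirectoriesStoreAllEntriesSkipUnmatched,
    IncludeByIgnorePatternStoreAllEntriesSkipUnmatched,
    Disabled,
}

impl Options {
    fn sparse_mode(&self) -> Mode {
        match (self.sparse_checkout, self.directory_patterns_only, self.write_sparse_index) {
            (true, true, true) => Mode::IncludeDirectoriesStoreIncludedEntriesAndExcludedDirs,
            (true, true, false) => Mode::IncludeDirectoriesStoreAllEntriesSkipUnmatched,
            (true, false, _) => Mode::IncludeByIgnorePatternStoreAllEntriesSkipUnmatched,
            (false, _, _) => Mode::Disabled,
        }
    }
}

fn main() {
    // All three settings enabled, as configured in `v2_sparse_index_no_dirs.sh`.
    let cone_sparse = Options { sparse_checkout: true, directory_patterns_only: true, write_sparse_index: true };
    assert_eq!(cone_sparse.sparse_mode(), Mode::IncludeDirectoriesStoreIncludedEntriesAndExcludedDirs);

    // Cone mode without a sparse index keeps every file entry.
    let cone_full = Options { write_sparse_index: false, ..cone_sparse };
    assert_eq!(cone_full.sparse_mode(), Mode::IncludeDirectoriesStoreAllEntriesSkipUnmatched);

    // Outside cone mode, `write_sparse_index` has no effect on the result.
    let non_cone = Options { directory_patterns_only: false, ..cone_sparse };
    assert_eq!(non_cone.sparse_mode(), Mode::IncludeByIgnorePatternStoreAllEntriesSkipUnmatched);

    // Without sparse checkout, the other two settings are ignored.
    let disabled = Options { sparse_checkout: false, ..cone_sparse };
    assert_eq!(disabled.sparse_mode(), Mode::Disabled);
    assert_eq!(Options::default().sparse_mode(), Mode::Disabled);
}
```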
3 changes: 2 additions & 1 deletion git-index/src/entry/mode.rs
@@ -2,7 +2,8 @@ use bitflags::bitflags;
bitflags! {
/// The kind of file of an entry.
pub struct Mode: u32 {
/// directory (only used for sparse checkouts), equivalent to a tree
/// directory (only used for sparse checkouts), equivalent to a tree, which is _excluded_ from the index via
/// cone-mode.
const DIR = 0o040000;
/// regular file
const FILE = 0o100644;
7 changes: 1 addition & 6 deletions git-index/src/extension/mod.rs
@@ -93,9 +93,4 @@ pub(crate) mod resolve_undo;
pub mod untracked_cache;

///
pub mod sparse {
use crate::extension::Signature;

/// The signature of the sparse index extension, nothing more than an indicator at this time.
pub const SIGNATURE: Signature = *b"sdir";
}
pub mod sparse;
11 changes: 11 additions & 0 deletions git-index/src/extension/sparse.rs
@@ -0,0 +1,11 @@
use crate::extension::Signature;

/// The signature of the sparse index extension, nothing more than an indicator at this time.
pub const SIGNATURE: Signature = *b"sdir";

/// Serialize the sparse index extension to `out`
pub fn write_to(mut out: impl std::io::Write) -> Result<(), std::io::Error> {
out.write_all(&SIGNATURE)?;
out.write_all(&0_u32.to_be_bytes())?;
Ok(())
}
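
For reviewers who want to see the exact bytes this produces, the following sketch reproduces `write_to` standalone and inspects its output: the 4-byte `sdir` signature followed by a 32-bit big-endian size of zero, i.e. an extension with an empty body. The local `Signature` alias is assumed to be `[u8; 4]`, which is what `*b"sdir"` implies.

```rust
use std::io::Write;

// Assumed to match `crate::extension::Signature`.
type Signature = [u8; 4];
const SIGNATURE: Signature = *b"sdir";

/// Same body as `write_to` in this diff: signature, then a zero size field.
fn write_to(mut out: impl Write) -> Result<(), std::io::Error> {
    out.write_all(&SIGNATURE)?;
    out.write_all(&0_u32.to_be_bytes())?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    let mut buf: Vec<u8> = Vec::new();
    write_to(&mut buf)?;

    // 4 signature bytes + 4 size bytes, and no payload.
    assert_eq!(buf.len(), 8);
    assert_eq!(&buf[..4], &b"sdir"[..]);
    assert_eq!(&buf[4..], &[0u8, 0, 0, 0][..]);
    Ok(())
}
```

The signature is lower-case, which matches the note in `write.rs` about lower-case extensions being mandatory.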
1 change: 0 additions & 1 deletion git-index/src/lib.rs
@@ -97,7 +97,6 @@ pub struct State {
/// A memory area keeping all index paths, in full length, independently of the index version.
path_backing: PathStorage,
/// True if one entry in the index has a special marker mode
#[allow(dead_code)]
is_sparse: bool,

// Extensions
32 changes: 21 additions & 11 deletions git-index/src/write.rs
@@ -1,20 +1,24 @@
use std::{convert::TryInto, io::Write};

use crate::{entry, extension, write::util::CountBytes, State, Version};
use std::{convert::TryInto, io::Write};

/// A way to specify which extensions to write.
/// A way to specify which of the optional extensions to write.
#[derive(Debug, Copy, Clone)]
pub enum Extensions {
/// Writes all available extensions to avoid loosing any information, and to allow accelerated reading of the index file.
/// Writes all available optional extensions to avoid losing any information.
All,
/// Only write the given extensions, with each extension being marked by a boolean flag.
/// Only write the given optional extensions, with each extension being marked by a boolean flag.
///
/// # Note: mandatory extensions
///
/// Mandatory extensions, like `sdir` or other lower-case ones, cannot be configured here: whether they are present or
/// absent depends on the state of the index itself, and the index would not be valid otherwise.
Given {
/// Write the tree-cache extension, if present.
tree_cache: bool,
/// Write the end-of-index-entry extension.
end_of_index_entry: bool,
},
/// Write no extension at all for what should be the smallest possible index
/// Write no optional extension at all for what should be the smallest possible index
None,
}
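
The hunk below calls `extensions.should_write(signature)`, whose body is not part of this diff. As a rough sketch of how such a check could map the `Extensions` variants to a decision, assuming it returns `Option<Signature>` as the call site suggests: the `TREE` and `EOIE` constants and the method body are illustrative assumptions, not the crate's code.

```rust
// Assumed signature type and constants, for illustration only.
type Signature = [u8; 4];
const TREE: Signature = *b"TREE";
const EOIE: Signature = *b"EOIE";
const SDIR: Signature = *b"sdir";

#[derive(Debug, Copy, Clone)]
enum Extensions {
    All,
    Given { tree_cache: bool, end_of_index_entry: bool },
    None,
}

impl Extensions {
    /// Hypothetical body: return the signature if this optional extension should be written.
    fn should_write(&self, signature: Signature) -> Option<Signature> {
        match *self {
            Extensions::None => None,
            Extensions::All => Some(signature),
            Extensions::Given { tree_cache, end_of_index_entry } => {
                let enabled = match signature {
                    TREE => tree_cache,
                    EOIE => end_of_index_entry,
                    // Anything else is not selectable through `Given`.
                    _ => false,
                };
                enabled.then(|| signature)
            }
        }
    }
}

fn main() {
    let only_tree = Extensions::Given { tree_cache: true, end_of_index_entry: false };
    assert_eq!(only_tree.should_write(TREE), Some(TREE));
    assert_eq!(only_tree.should_write(EOIE), None);
    // Mandatory extensions such as `sdir` are decided by the index state, not by `Given`.
    assert_eq!(only_tree.should_write(SDIR), None);
    assert_eq!(Extensions::All.should_write(EOIE), Some(EOIE));
    assert_eq!(Extensions::None.should_write(TREE), None);
}
```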

@@ -90,11 +94,17 @@ impl State {
T: std::io::Write,
{
type WriteExtFn<'a> = &'a dyn Fn(&mut dyn std::io::Write) -> Option<std::io::Result<extension::Signature>>;
let extensions: &[WriteExtFn<'_>] = &[&|write| {
extensions
.should_write(extension::tree::SIGNATURE)
.and_then(|signature| self.tree().map(|tree| tree.write_to(write).map(|_| signature)))
}];
let extensions: &[WriteExtFn<'_>] = &[
&|write| {
extensions
.should_write(extension::tree::SIGNATURE)
.and_then(|signature| self.tree().map(|tree| tree.write_to(write).map(|_| signature)))
},
&|write| {
self.is_sparse()
.then(|| extension::sparse::write_to(write).map(|_| extension::sparse::SIGNATURE))
},
];

let mut offset_to_previous_ext = offset_to_extensions;
let mut out = Vec::with_capacity(5);
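
The closure table above can be tried in isolation. The sketch below keeps the same `WriteExtFn` shape, where each closure returns `None` to skip its extension or `Some(result)` after writing it, and shows that `sdir` is emitted only when the index is sparse. The driving loop, the `write_extensions` function, and the `write_sdir` helper are simplifications made up for this example; the real code also tracks offsets, which is omitted here.

```rust
use std::io::{self, Write};

// Assumed to match `extension::Signature`.
type Signature = [u8; 4];
const SDIR: Signature = *b"sdir";

/// Hypothetical stand-in for `extension::sparse::write_to`.
fn write_sdir(out: &mut dyn Write) -> io::Result<()> {
    out.write_all(&SDIR)?;
    out.write_all(&0_u32.to_be_bytes())
}

/// Run every extension writer and collect the signatures that were written.
fn write_extensions(is_sparse: bool, out: &mut Vec<u8>) -> io::Result<Vec<Signature>> {
    type WriteExtFn<'a> = &'a dyn Fn(&mut dyn Write) -> Option<io::Result<Signature>>;
    let writers: &[WriteExtFn<'_>] = &[
        // Mandatory extension: written if and only if the index is sparse.
        &|write| is_sparse.then(|| write_sdir(write).map(|_| SDIR)),
    ];

    let mut written = Vec::new();
    for write_ext in writers {
        // `None` means "nothing to write"; `Some(Err(_))` aborts via `?`.
        if let Some(result) = write_ext(&mut *out) {
            written.push(result?);
        }
    }
    Ok(written)
}

fn main() -> io::Result<()> {
    let mut buf: Vec<u8> = Vec::new();
    assert!(write_extensions(false, &mut buf)?.is_empty());
    assert!(buf.is_empty());

    assert_eq!(write_extensions(true, &mut buf)?, vec![SDIR]);
    assert_eq!(buf.len(), 8);
    Ok(())
}
```

Returning `Option<io::Result<_>>` keeps "skipped" and "failed" distinct, which is why the loop matches on `Some` before applying `?`.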
5 Git LFS files not shown
20 changes: 20 additions & 0 deletions git-index/tests/fixtures/make_index/v2_sparse_index_no_dirs.sh
@@ -0,0 +1,20 @@
#!/bin/bash
set -eu -o pipefail

git init -q

touch a b c

git add .
git commit -m "init"

git config extensions.worktreeConfig true

git config --worktree core.sparseCheckout true
git config --worktree core.sparseCheckoutCone true
git config --worktree index.sparse true

echo "/*" > .git/info/sparse-checkout &&
echo "!/*/" >> .git/info/sparse-checkout

git checkout main
16 changes: 16 additions & 0 deletions git-index/tests/fixtures/make_index/v3_skip_worktree.sh
@@ -0,0 +1,16 @@
#!/bin/bash
set -eu -o pipefail

git init -q

touch a b
mkdir c1
(cd c1 && touch a b && mkdir c2 && cd c2 && touch a b)
(cd c1 && mkdir c3 && cd c3 && touch a b)
mkdir d
(cd d && touch a b && mkdir c4 && cd c4 && touch a b c5)

git add .
git commit -m "init"

git sparse-checkout set c1/c2
16 changes: 16 additions & 0 deletions git-index/tests/fixtures/make_index/v3_sparse_index.sh
@@ -0,0 +1,16 @@
#!/bin/bash
set -eu -o pipefail

git init -q

touch a b
mkdir c1
(cd c1 && touch a b && mkdir c2 && cd c2 && touch a b)
(cd c1 && mkdir c3 && cd c3 && touch a b)
mkdir d
(cd d && touch a b && mkdir c4 && cd c4 && touch a b c5)

git add .
git commit -m "init"

git sparse-checkout set c1/c2 --sparse-index
16 changes: 16 additions & 0 deletions git-index/tests/fixtures/make_index/v3_sparse_index_non_cone.sh
@@ -0,0 +1,16 @@
#!/bin/bash
set -eu -o pipefail

git init -q

touch a b
mkdir c1
(cd c1 && touch a b && mkdir c2 && cd c2 && touch a b)
(cd c1 && mkdir c3 && cd c3 && touch a b)
mkdir d
(cd d && touch a b && mkdir c4 && cd c4 && touch a b c5)

git add .
git commit -m "init"

git sparse-checkout set c1/c2 --no-cone