AbhiRam162105
Contributor

Feature Documentation: .gitingest Ignore Patterns

This pull request changes src/gitingest/query_ingestion.py so that query ingestion applies ignore patterns from a .gitingest file. The main changes are importing the tomllib library and adding a new function that handles the .gitingest file.

Enhancements to query ingestion:

Overview

This feature adds support for a .gitingest configuration file in the project root directory, in which users define file and directory ignore patterns in TOML format. Files and directories that match these patterns are excluded from query ingestion.

Purpose

The primary goal of this feature is to give users a simple way to define ignore patterns for files and directories that should not be ingested. Skipping those files reduces unnecessary processing and gives users better control over which repository contents are ingested.

File Format

The .gitingest file is written in TOML format and consists of a [config] section where users specify ignore patterns as a list.

Example .gitingest File:

[config]
ignore_patterns = ["README.md", "tests/", "*.log", "docs/*.md"]

Explanation of Fields:

  • ignore_patterns: A list of file paths, directory paths, or glob patterns that define which files should be excluded from ingestion.
    • Specific file names (e.g., README.md) will be ignored.
    • Directory names (e.g., tests/) will cause all files inside the directory to be ignored.
    • Wildcard patterns (e.g., *.log) allow for flexible filtering of file types.
    • Path-based patterns (e.g., docs/*.md) allow filtering within specific subdirectories.
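The four pattern kinds above could be matched roughly as follows. This is a sketch under assumed semantics, using fnmatch from the standard library; the helper is_ignored is hypothetical and the matching rules actually applied by gitingest may differ.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Check a '/'-separated path (relative to the repo root) against patterns."""
    name = PurePosixPath(rel_path).name
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore everything below a directory with
            # this name, at any depth (assumed behaviour).
            if ("/" + pattern) in ("/" + rel_path):
                return True
        elif "/" in pattern:
            # Path-based pattern: match against the full relative path.
            # Note: fnmatch's '*' also matches '/', so 'docs/*.md' would
            # match files in nested subdirectories of docs/ as well.
            if fnmatchcase(rel_path, pattern):
                return True
        else:
            # Bare file name or wildcard: match against the file name only.
            if fnmatchcase(name, pattern):
                return True
    return False
```

With the example patterns, this sketch ignores README.md, tests/test_a.py, app/debug.log, and docs/guide.md, while src/main.py is kept.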

Future Enhancements

  • Support for negation rules (e.g., re-including a specific file that a broader pattern would otherwise exclude).
  • Support for inclusion rules (e.g., ingesting only a specific file type and excluding everything else).
  • Allow .gitingest to be placed in subdirectories for more granular control.
  • Tests that verify the .gitingest file handling.

With .gitingest support, users can control which files the query ingestion system in gitingest processes, which makes ingestion more usable and avoids work on files they do not need.

@AbhiRam162105
Contributor Author

@cyclotruc this is a fairly bare-bones implementation. I'll look into writing tests for this feature and adding options to include files.

Until then, can you merge this? :)

@cyclotruc cyclotruc merged commit f90595d into coderamp-labs:main Feb 19, 2025
18 checks passed
filipchristiansen added a commit that referenced this pull request Mar 13, 2025
…on (#191)


Co-authored-by: Romain Courtois <[email protected]>
Co-authored-by: Filip Christiansen <[email protected]>
filipchristiansen added a commit that referenced this pull request Mar 13, 2025
…on (#191)

Co-authored-by: Romain Courtois <[email protected]>
Co-authored-by: Filip Christiansen <[email protected]>
Signed-off-by: Filip Christiansen <[email protected]>