Skip to content

Clean up repo analysis results and refactor #6963

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ZainRizvi
Copy link
Contributor

@ZainRizvi ZainRizvi commented Jul 31, 2025

Updating script to include more repo analysis and organize it under a subdir since there will be more scripts coming

These are tools used to do a one-off investigation on org repos, but checking them in in case they turn out to be more widely useful

Copy link

vercel bot commented Jul 31, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Updated (UTC)
torchci ⬜️ Ignored (Inspect) Visit Preview Jul 31, 2025 10:18pm

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 31, 2025
@ZainRizvi ZainRizvi force-pushed the zainr/migration-analysis branch from 673af9e to 65e877e Compare July 31, 2025 21:22
@ZainRizvi ZainRizvi force-pushed the zainr/migration-analysis branch from 215695e to 50bc9bb Compare July 31, 2025 22:18
@ZainRizvi ZainRizvi requested a review from a team August 1, 2025 22:34
@ZainRizvi ZainRizvi marked this pull request as ready for review August 1, 2025 22:34
@zxiiro zxiiro requested a review from Copilot August 7, 2025 16:17
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors organization analytics tools by extracting cache management into a reusable module and reorganizing scripts under a dedicated subdirectory structure.

  • Extracted cache management functionality from the main script into a separate cache_manager.py module
  • Added proper requirements.txt and documentation for the analytics toolset
  • Enhanced runner usage analysis with improved filtering and alphabetical sorting of output

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tools/analytics/org/requirements.txt Defines Python dependencies for the analytics tools
tools/analytics/org/cache_manager.py New module providing caching functionality for GitHub API responses
tools/analytics/org/analyze_runner_usage.py Refactored to use extracted cache manager and added output sorting
tools/analytics/org/README.md Documentation for the analytics tools and their usage
tools/analytics/org/.gitignore Ignores cache directory and temporary files

Comment on lines +18 to +19
CACHE_DIR.mkdir(exist_ok=True)

Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line creates the global CACHE_DIR but should create self.cache_dir instead. The instance variable cache_dir is set on line 20-21, making this line redundant and potentially confusing.

Suggested change
CACHE_DIR.mkdir(exist_ok=True)

Copilot uses AI. Check for mistakes.

Comment on lines +133 to +139
def make_cached_request(url: str, headers: Dict[str, str]) -> Optional[Dict]:
"""
Make an HTTP request with caching. Returns the JSON response if successful.

Args:
url: The URL to request
headers: Headers for the request (required)
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The headers parameter should be Optional[Dict[str, str]] to match the usage pattern and handle cases where headers might not be provided, even though the current implementation requires them.

Suggested change
def make_cached_request(url: str, headers: Dict[str, str]) -> Optional[Dict]:
"""
Make an HTTP request with caching. Returns the JSON response if successful.
Args:
url: The URL to request
headers: Headers for the request (required)
def make_cached_request(url: str, headers: Optional[Dict[str, str]] = None) -> Optional[Dict]:
"""
Make an HTTP request with caching. Returns the JSON response if successful.
Args:
url: The URL to request
headers: Optional headers for the request (default: None)

Copilot uses AI. Check for mistakes.

# Add more runner labels to exclude here as needed
]

USELESS_RUNNER_LABELS = [
"self-hosted", # really, a useless label we want to ignoreß
"self-hosted", # really, a useless label we want to ignore
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment contains a typo: 'ignoreß' should be 'ignore'. The character 'ß' appears to be a copy-paste error.

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants