Anarogk
Collaborator

@Anarogk Anarogk commented Jul 29, 2025

  • Added support for Azure DevOps to load organizations, repositories, projects, and members.
  • Added cleanup scripts for these entities.

@Anarogk Anarogk requested a review from mpurusottamc July 29, 2025 16:21

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds comprehensive Azure DevOps support to cartography, enabling data ingestion from Azure DevOps organizations, projects, repositories, and members. The implementation follows cartography's established patterns for third-party service integrations with proper authentication, data transformation, and cleanup mechanisms.

Key changes include:

  • Added Azure DevOps SDK dependency and authentication flow using OAuth 2.0 refresh tokens
  • Implemented sync modules for organizations, projects, repositories, and members with Neo4j graph storage
  • Added CLI configuration options and cleanup job definitions

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| setup.py | Added azure-devops dependency with version constraints |
| cartography/sync.py | Added Azure DevOps import and sync function builder |
| cartography/intel/azuredevops/ | Core Azure DevOps integration modules for API calls, data sync, and graph operations |
| cartography/data/jobs/cleanup/ | Cleanup job definitions for Azure DevOps entities |
| cartography/config.py | Added Azure DevOps configuration parameter |
| cartography/cli.py | Added CLI argument parsing and Azure DevOps runner function |

Comments suppressed due to low confidence (1)

setup.py:78

  • The azure-devops package version constraint specifies >=7.0.0,<8.0.0, but the azure-devops package may not have version 7.0.0 available. The package follows different versioning patterns and this constraint should be verified against actual available versions.
        "azure-devops>=7.0.0,<8.0.0",

account['client_secret'],
account['refresh_token'],
)
if not token_data or 'access_token' not in token_data:

Copilot AI Jul 29, 2025

The get_access_token function returns a string (access token), but this code expects a dictionary with an 'access_token' key. This logic error will cause authentication to always fail.

Copilot uses AI. Check for mistakes.

logger.error(f"Failed to retrieve Azure DevOps access token for tenant {account.get('tenant_id')}")
continue

access_token = token_data['access_token']

Copilot AI Jul 29, 2025

This line attempts to access 'access_token' key from token_data, but get_access_token returns a string directly, not a dictionary. This will cause a TypeError.
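
The two comments on this block describe the same mismatch. The following is a minimal sketch of it, assuming get_access_token returns a bare string as the review states. The stub function and the token value are hypothetical and are not code from this PR.

```python
def get_access_token_stub() -> str:
    """Hypothetical stand-in for get_access_token; returns a bare token string."""
    return "eyJ0eXAi.example.token"


token_data = get_access_token_stub()

# On a str, `in` is a substring test, so the guard from the diff evaluates to
# True for any realistic token and the account would be skipped.
guard_trips = not token_data or "access_token" not in token_data

# Indexing a str with a string key raises TypeError.
try:
    token_data["access_token"]
    index_error = None
except TypeError as exc:
    index_error = exc

# One possible fix: treat the return value as the token itself.
access_token = token_data if token_data else None
```

If get_access_token is instead changed to return the full token response, the original dictionary-style check would be the consistent choice; the point is only that caller and callee need to agree on one shape.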

Comment on lines 51 to 54
access_token,
org_name,
url,
common_job_parameters,

Copilot AI Jul 29, 2025

The organization.sync function is called with incorrect parameter order. Based on the function definition, it expects (neo4j_session, common_job_parameters, access_token, url, org_name), but it's being called with (neo4j_session, access_token, org_name, url, common_job_parameters).

Suggested change
- access_token,
- org_name,
- url,
- common_job_parameters,
+ common_job_parameters,
+ access_token,
+ url,
+ org_name,
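
An alternative to reordering the positional arguments is to pass them by keyword, which makes this class of bug harder to reintroduce. The sketch below assumes the parameter order quoted in the comment above; the function body and all argument values are hypothetical placeholders.

```python
from typing import Any, Dict


def sync(
    neo4j_session: Any,
    common_job_parameters: Dict[str, Any],
    access_token: str,
    url: str,
    org_name: str,
) -> Dict[str, Any]:
    """Hypothetical stand-in using the parameter order quoted in the review."""
    return {
        "common_job_parameters": common_job_parameters,
        "access_token": access_token,
        "url": url,
        "org_name": org_name,
    }


# Keyword arguments bind by name, so the order at the call site no longer matters.
received = sync(
    neo4j_session=None,
    access_token="token-value",
    org_name="example-org",
    url="https://dev.azure.com",
    common_job_parameters={"UPDATE_TAG": 1},
)
```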

Comment on lines 80 to 84
run_cleanup_job(
"azure_devops_projects_cleanup.json",
neo4j_session,
common_job_parameters,
)

Copilot AI Jul 29, 2025

The cleanup function is called twice in this sync function - once at line 62 through cleanup() and again at lines 80-84. This is redundant and could cause unnecessary database operations.

Suggested change
- run_cleanup_job(
-     "azure_devops_projects_cleanup.json",
-     neo4j_session,
-     common_job_parameters,
- )
+ # Cleanup is already handled by the call to cleanup() at line 62.
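
As a rough illustration of the intended shape, the sketch below has sync call cleanup() exactly once. It assumes cleanup() wraps the run_cleanup_job call shown in the diff; run_cleanup_job is replaced here by a hypothetical recorder so the example runs without Neo4j, and the fetch and load steps are omitted.

```python
from typing import Any, Dict, List

cleanup_calls: List[str] = []


def run_cleanup_job(job_name: str, neo4j_session: Any, common_job_parameters: Dict[str, Any]) -> None:
    """Hypothetical recorder standing in for the real cleanup runner."""
    cleanup_calls.append(job_name)


def cleanup(neo4j_session: Any, common_job_parameters: Dict[str, Any]) -> None:
    run_cleanup_job(
        "azure_devops_projects_cleanup.json",
        neo4j_session,
        common_job_parameters,
    )


def sync(neo4j_session: Any, common_job_parameters: Dict[str, Any]) -> None:
    # Fetching and loading projects would happen here (omitted in this sketch).
    cleanup(neo4j_session, common_job_parameters)  # the only cleanup call


sync(None, {"UPDATE_TAG": 1})
```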

Comment on lines 104 to 108
run_cleanup_job(
"azure_devops_members_cleanup.json",
neo4j_session,
common_job_parameters,
)

Copilot AI Jul 29, 2025

Similar to projects.py, the cleanup function is called twice in this sync function - once at line 84 through cleanup() and again at lines 104-108. This redundancy should be removed.

Suggested change
- run_cleanup_job(
-     "azure_devops_members_cleanup.json",
-     neo4j_session,
-     common_job_parameters,
- )
+ cleanup(neo4j_session, common_job_parameters)

TIMEOUT = (60, 60)


def get_access_token(tenant_id, client_id, client_secret, refresh_token):

Copilot AI Jul 29, 2025

Function lacks type hints for parameters and return type. The function returns a string but this is not documented in the signature.

Suggested change
- def get_access_token(tenant_id, client_id, client_secret, refresh_token):
+ def get_access_token(tenant_id: str, client_id: str, client_secret: str, refresh_token: str) -> str:
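
For context, a typed version of this function might look like the sketch below. The body of get_access_token is not shown in this conversation, so everything past the signature is an assumption: the sketch uses the Microsoft identity platform v2.0 token endpoint with a refresh_token grant, the standard library instead of whatever HTTP client the PR uses, a single 60-second timeout rather than the (60, 60) tuple, and a hypothetical injectable `post` parameter so it can be exercised without network access. It returns Optional[str] rather than str so a missing token is explicit.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Optional

TIMEOUT = 60  # seconds; assumption, the PR defines TIMEOUT = (60, 60)


def _post_form(url: str, form: Dict[str, str]) -> Dict[str, Any]:
    """POST a URL-encoded form and decode the JSON response body."""
    data = urllib.parse.urlencode(form).encode("utf-8")
    request = urllib.request.Request(url, data=data, method="POST")
    with urllib.request.urlopen(request, timeout=TIMEOUT) as response:
        return json.loads(response.read().decode("utf-8"))


def get_access_token(
    tenant_id: str,
    client_id: str,
    client_secret: str,
    refresh_token: str,
    post: Callable[[str, Dict[str, str]], Dict[str, Any]] = _post_form,
) -> Optional[str]:
    """Exchange a refresh token for an access token; None if the response has none."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }
    body = post(url, form)
    return body.get("access_token")
```

With a return type of Optional[str], the caller's guard reduces to `if not access_token:`, which matches the string-based handling suggested in the earlier comments.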

@Anarogk Anarogk requested a review from sash2721 July 30, 2025 04:42