feat: add GlobalCaption to db/api for text annotation #625

Merged
surygeng merged 6 commits into main from jgeng/global-caption-db-ingestion on Mar 19, 2026

Conversation

@surygeng
Contributor

Description

Adds GlobalCaption as a new annotation shape type in the DB/API layer. S3 ingestion support was added in #603. This PR is a follow-up that enables DB import and GraphQL API access.

Changes

  • apiv2/schema/schema.yaml: Added GlobalCaption to annotation_file_shape_type_enum and updated shape_type pattern regex
  • Codegen (make update-schema): Regenerated enums, DB models, GraphQL types, and Alembic migration
  • test_infra/test_files/.../foo-1.0.json: Added GlobalCaption file entry to shared test fixture
  • apiv2/db_import/tests/: Updated test expectations for the new annotation file
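
As a rough illustration of the schema change listed above, the sketch below pairs an enum with a matching validation pattern, so that adding GlobalCaption to one automatically extends the other. The class name, the helper name, and every enum member other than GlobalCaption are assumptions made for this example; the real definitions are generated from apiv2/schema/schema.yaml and may look different.

```python
import re
from enum import Enum


class AnnotationFileShapeType(str, Enum):
    """Illustrative stand-in for annotation_file_shape_type_enum."""

    # Only GlobalCaption comes from this PR; the other members are assumed.
    SegmentationMask = "SegmentationMask"
    Point = "Point"
    GlobalCaption = "GlobalCaption"


# Build the shape_type pattern from the enum so the two cannot drift apart.
SHAPE_TYPE_PATTERN = re.compile(
    "^(" + "|".join(re.escape(member.value) for member in AnnotationFileShapeType) + ")$"
)


def is_valid_shape_type(value: str) -> bool:
    """Return True when value is exactly one of the known shape types."""
    return SHAPE_TYPE_PATTERN.fullmatch(value) is not None
```

Deriving the regex from the enum is a design choice of this sketch, not something the PR states; in the PR both are maintained in the YAML schema and regenerated with make update-schema.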

"s3_path": f"s3://test-public-bucket/{path}100-foo-1.0_globalcaption.json",
"https_path": f"{http_prefix}/{path}100-foo-1.0_globalcaption.json",
"source": "community",
"format": "json",
Contributor Author

@uermel Just to confirm, we should use "json" not "saber" here for the format, right?

Contributor

@surygeng It should be the other way around. We decided to use "saber" instead of "json".
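
For reference, a minimal sketch of what the expected test entry might look like with the format switched to "saber", as suggested in the reply above. The helper name and its parameters are hypothetical; only the dictionary keys, the bucket name, and the file name come from the diff excerpt, and the final merged value of "format" is not shown in this conversation.

```python
def expected_globalcaption_file(path: str, http_prefix: str, file_format: str = "saber") -> dict:
    """Build the expected annotation-file record for the GlobalCaption fixture.

    Hypothetical helper: `path` is the key prefix (with trailing slash) and
    `http_prefix` is the HTTPS base URL used by the tests.
    """
    file_name = f"{path}100-foo-1.0_globalcaption.json"
    return {
        "s3_path": f"s3://test-public-bucket/{file_name}",
        "https_path": f"{http_prefix}/{file_name}",
        "source": "community",
        # "saber" rather than "json", per the review discussion.
        "format": file_format,
    }
```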

Jinghui Geng added 2 commits March 16, 2026 16:26
@uermel
Contributor

uermel commented Mar 16, 2026

@surygeng I'm a bit surprised this passes tests without any changes to cryoet-data-portal-backend/apiv2/db_import/importers/annotation.py.

Since we're using a glob for any JSON file here, this risks matching both the annotation file and the annotation metadata file (as I encountered when trying to test). How is that avoided here?

@@ -0,0 +1,23 @@
"""autogenerated
Contributor

I think this file change is not needed. It is an empty change. :)

Contributor Author

Removed

Contributor

@PALLAVIKHEDLE PALLAVIKHEDLE left a comment

Thanks!

@surygeng
Contributor Author

@surygeng I'm a bit surprised this passes tests without any changes to cryoet-data-portal-backend/apiv2/db_import/importers/annotation.py.

Since we're using a glob for any JSON file here, this risks matching both the annotation file and the annotation metadata file (as I encountered when trying to test). How is that avoided here?

I updated the glob pattern in the get_finder_args() function to exclude the _globalcaption.json files. This is the same logic used for the S3 ingestion. Let's see if this works.
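
A minimal sketch of the idea, assuming the metadata finder matches on a broad *.json glob: the plain glob picks up both the metadata file and the GlobalCaption annotation file, while an extra suffix check keeps only the metadata. The function and constant names below are hypothetical and do not reproduce get_finder_args() or its actual arguments.

```python
from fnmatch import fnmatch

# Assumed for illustration: metadata files are found with a broad JSON glob.
METADATA_GLOB = "*.json"
# Annotation data files that also end in .json and must not be treated as metadata.
EXCLUDED_SUFFIXES = ("_globalcaption.json",)


def match_broad(names: list[str]) -> list[str]:
    """Naive matching: every JSON file counts, including annotation data."""
    return [name for name in names if fnmatch(name, METADATA_GLOB)]


def match_metadata_only(names: list[str]) -> list[str]:
    """Same glob, but skip files carrying an excluded annotation suffix."""
    return [
        name
        for name in names
        if fnmatch(name, METADATA_GLOB) and not name.endswith(EXCLUDED_SUFFIXES)
    ]


files = ["100-foo-1.0.json", "100-foo-1.0_globalcaption.json", "100-foo-1.0.mrc"]
broad = match_broad(files)          # contains both JSON files
filtered = match_metadata_only(files)  # contains only the metadata file
```

Keeping the exclusions in a tuple makes it straightforward to add further JSON-based annotation suffixes later, should any be introduced.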

@@ -0,0 +1,16 @@
{
Contributor Author

@manasaV3 @uermel I added this _globalcaption.json test file here, but I'm not sure if it's necessary.

uermel approved these changes Mar 19, 2026
@surygeng surygeng merged commit 756b680 into main Mar 19, 2026
10 checks passed
@surygeng surygeng deleted the jgeng/global-caption-db-ingestion branch March 19, 2026 00:08

4 participants