Skip to content

Conversation

SubhadityaMukherjee
Copy link
Contributor

Reference Issue

What does this PR implement/fix? Explain your changes.

This adds a function to the run class that allows adding an arbitrary file of a limited size to be uploaded along with the run. Something similar has been implemented in OpenML pytorch and I thought it would be nice to eventually have it as part of OpenML python as well.

How should this PR be tested?

Adding a random file and continuing the run as usual is enough. I will add tests and test it after the servers are back online.

Any other comments?

This is my first PR here so please be gentle 🙈

Copy link
Contributor

@LennartPurucker LennartPurucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heyho, coming back to this after a long time.

What are the use cases here exactly? What kind of files do we want users to upload?
I guess the website already supports this, so we could support this as well but I would try to avoid adding new code that is not used.

Also, the current way to support it seems very hacky. I recommend instead to add an official list-like parameter in the init (None/empty by default), and which one can then fill through such a function. And then _get_file_elements has code to check if this parameter is not empty and adds its content to the file elements.

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 8.33333% with 22 lines in your changes missing coverage. Please review.

Project coverage is 83.71%. Comparing base (c9bfc0f) to head (254a471).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
openml/runs/run.py 8.33% 22 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1387      +/-   ##
===========================================
- Coverage    84.09%   83.71%   -0.39%     
===========================================
  Files           38       38              
  Lines         5231     5255      +24     
===========================================
  Hits          4399     4399              
- Misses         832      856      +24     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@PGijsbers PGijsbers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some notes.

@LennartPurucker the use case would be sharing models with a run.

@@ -707,3 +707,46 @@ def _to_dict(self) -> dict[str, dict]: # noqa: PLR0912, C901
current,
)
return description

def _check_file_size(self, file: Any, max_file_size_mb: int = 100) -> bool:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return False
return True

def add_file_to_run(self, file: Any, name: str, max_file_size_mb: int = 100) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Lennart we should just have proper support if we add this, e.g., simply keep a files: list[Path] property and update the _get_file_elements function accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants