-
Notifications
You must be signed in to change notification settings - Fork 26
fix: simplify get_packages()
#378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
get_packages()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this
R/connect.R
Outdated
url <- v1_url("packages") | ||
query_params <- list( | ||
name = name, | ||
page_number = page_number, | ||
page_size = page_size | ||
page_size = "1000000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this really be a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, this should be handled at a different layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Holding myself back from trying to do this for all query params in all requests for now
R/connect.R
Outdated
@@ -830,12 +830,12 @@ Connect <- R6::R6Class( | |||
#' @param name The package name to filter by. | |||
#' @param page_number The page number. | |||
#' @param page_size Page size, max 500. | |||
packages = function(name = NULL, page_number = 1, page_size = 500) { | |||
packages = function(name = NULL, page_number = 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not opposed to having this as an argument if it defaults to something sensible. In case someone really does know what they're doing and wants to fiddle with it that's ok. My complain in prior PRs was that we force folks to deal with this in cases where they don't want to.
But if you would rather hide it entirely, that's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really got the impression you wanted to hide it entirely from our previous discussion.
Either way — this is the back-end method which is called by the user-facing function get_packages()
which has been drastically simplified. I think given that, it's reasonable to leave limit
here, so that if people really want to mess around with it, they can. The main entrypoint for users will be much simpler.
get_packages(client, page_size = 2, limit = 2), | ||
get_packages(client), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was limit = 2
here before? I don't see that being removed, but maybe it was just eaten in ...
somewhere?
Regardless, this is one place where having the ability to specify makes testing easier (in a way that encourages good factoring of our own code...). This particular parameter is just passed through so probably NBD to have a test that confirms that, but if there was any sort of processing or whatever it certainly would make something like that easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where limit
was removed from.
Side note, slightly related: connectapi
used to paginate until the API returned empty pages, which required explicitly setting the limit at 2 in the tests. The pagination quirk has now been fixed: paginating until the response is empty is not required because the responses provide the total
number of objects to be returned.
|
||
agg_response <- res | ||
agg_length <- length(agg_response) | ||
while (length(res) > 0 && agg_length < limit) { | ||
while (length(res) > 0 && agg_length < limit && agg_length < total_items) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is limit every greater than total_items? Just curious what sparked this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
limit
is often greater than total_items
— it's set to Inf
by default.
The result of this was that the gating factor that causes connectapi
to stop requesting new pages from the API was the API returning zero items — when total_items
is provided in each response.
This wasn't actually that bad in effect, but it made mocks more complex (both in code and on disk) because each API request had to have an extra empty page. We could previously work around this by providing limit
explicitly in functions in tests, but removing that from user-facing functions is a good prompt to actually address the underlying itch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with this. For future reference, it's worth updating the pull request description to reflect changes made during the review process.
@tdstein 🫡 Ah yeah, thanks for pointing that out |
Intent
Simplify
get_packages()
, removing pagination-related parameters.Fixes #
Approach
page_size
andlimit
from the user-facing functionget_packages()
.page_offset()
function to cease iteration when the total number of aggregated items reaches the total number of items that the server will return. (Previously, it would continue to request a new page until the number of new items was 0, but the responses always contain the total number of items to be returned!)Checklist
NEWS.md
(referencing the connected issue if necessary)?devtools::document()
?