
Update metadata calc in text after removing sha256 #74

Merged
10 changes: 5 additions & 5 deletions pep-0458.txt
@@ -554,9 +554,9 @@ split all targets in the *bins* role by delegating them to 16,384
*bin-n* roles (see C10 in Table 2). Each *bin-n* role would sign
for the PyPI targets whose SHA2-512 hashes fall into that bin
(see and Figure 2 and `Consistent Snapshots`_). It was found
-that this number of bins would result in a 6-10% metadata overhead
+that this number of bins would result in a 5-9% metadata overhead
(relative to the average size of downloaded distribution files; see V13 and
-V15 in Table 3) for returning users, and a 70% overhead for new
+V15 in Table 3) for returning users, and a 69% overhead for new
users who are installing pip for the first time (see V17 in Table 3).
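
For readers following the hunk above, here is a minimal sketch of how a target might be mapped to one of the 16,384 *bin-n* roles. It assumes the bin is chosen from the leading bits of the SHA2-512 hash of the target path; the function name `bin_index`, the example path, and the choice to hash the path rather than the file contents are all illustrative assumptions, not something this diff specifies.

```python
import hashlib

NUM_BINS = 16_384  # C10 in Table 2; equal to 2**14


def bin_index(target_path, num_bins=NUM_BINS):
    """Return a bin number in [0, num_bins) for a target path.

    Hypothetical helper: assumes num_bins is a power of two and that
    the bin is taken from the leading bits of the SHA2-512 digest.
    """
    if num_bins <= 0 or num_bins & (num_bins - 1):
        raise ValueError("num_bins must be a power of two")
    prefix_bits = num_bins.bit_length() - 1  # 14 bits for 16,384 bins
    digest = hashlib.sha512(target_path.encode("utf-8")).digest()
    leading = int.from_bytes(digest[:4], "big")  # first 32 bits of the hash
    return leading >> (32 - prefix_bits)  # keep only the top prefix_bits


# Hypothetical target path, used only for illustration.
print(bin_index("packages/example-1.0.tar.gz"))
```

Because the hash is roughly uniform, each bin would hold about (number of targets / 16,384) entries, which is why the per-bin metadata size, and hence the overhead quoted above, grows with the number of targets when the bin count stays fixed.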


@@ -590,7 +590,7 @@ A few assumptions used in calculating these metadata overhead percentages:
| C10 | # of bins | 16,384 |
+------+--------------------------------------------------+-----------+

-C8 by computed querying the number of release files.
+C8 was computed by querying the number of release files.
C9 was derived by taking the average between a rough estimate of the average
size of release files *downloaded* over the past 31 days (1,628,321 bytes),
and the average size of releases files on disk (2,740,465 bytes).
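
As a quick check of the C9 derivation described above, the following sketch averages the two byte counts quoted in the text. It assumes "the average between" means a simple arithmetic mean of the two figures; the constant and function names are illustrative.

```python
# Figures quoted in the text above (bytes).
AVG_DOWNLOADED_BYTES = 1_628_321  # rough average size of files downloaded over 31 days
AVG_ON_DISK_BYTES = 2_740_465     # average size of release files on disk


def mean_release_size(downloaded=AVG_DOWNLOADED_BYTES, on_disk=AVG_ON_DISK_BYTES):
    """Arithmetic mean of the two estimates (assumed reading of 'average between')."""
    return (downloaded + on_disk) / 2


print(mean_release_size())  # 2184393.0
```

Under that assumption, C9 comes out to 2,184,393 bytes, the denominator against which the metadata overhead percentages are expressed.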
@@ -645,8 +645,8 @@ __ https://docs.google.com/spreadsheets/d/11_XkeHrf4GdhMYVqpYWsug6JNz5ZK6HvvmDZX

This number of bins SHOULD increase when the metadata overhead for returning
users exceeds 50%. Presently, this SHOULD happen when the number of targets
-increase at least 8x from over 2M to over 18M, at which point the metadata
-overhead for returning and new users would be around 46-51% and 111%
+increase at least 10x from over 2M to over 22M, at which point the metadata
+overhead for returning and new users would be around 50-54% and 114%
respectively, assuming that the number of bins stay fixed. If the number of
bins is increased, then the cost for all users would effectively be the cost
for new users, because their cost would be dominated by the (once-in-a-while)