Fix for sampling error in NumPy v1.9 [SPARK-3995][PYSPARK] #2889

Closed
wants to merge 1 commit into from

freeman-lab
Contributor

Change the maximum value of the default seed used during RDD sampling so that it is strictly less than 2 ** 32. This avoids a bug with the most recent version of NumPy (v1.9), which cannot accept random seeds at or above this bound.

Adds an extra test that uses the default seed (instead of setting it manually, as in the docstrings).
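
As a rough illustration of the idea (not the actual patch), the sketch below draws a default seed that stays strictly below 2 ** 32 and then checks a sampler built without an explicit seed. The names `SEED_LIMIT`, `default_seed`, and `SketchSampler` are hypothetical and exist only for this example; the sampler uses Python's standard `random` module rather than NumPy so that it runs anywhere.

```python
import random

# Exclusive upper bound on seeds, as stated in the description above.
SEED_LIMIT = 2 ** 32


def default_seed():
    """Return a random seed in [0, 2 ** 32). Hypothetical helper."""
    # random.randint includes both endpoints, hence the "- 1".
    return random.randint(0, SEED_LIMIT - 1)


class SketchSampler(object):
    """Hypothetical stand-in for a sampler; not PySpark's actual class."""

    def __init__(self, fraction, seed=None):
        self.fraction = fraction
        # Fall back to a bounded default only when no seed is supplied.
        self.seed = seed if seed is not None else default_seed()

    def sample(self, items):
        # Keep each item independently with probability `fraction`.
        rng = random.Random(self.seed)
        return [x for x in items if rng.random() < self.fraction]


if __name__ == "__main__":
    sampler = SketchSampler(0.5)  # default seed, as in the extra test
    assert 0 <= sampler.seed < SEED_LIMIT
    print(len(sampler.sample(range(100))))
```

Checking `seed is not None` rather than the truthiness of `seed` keeps an explicit seed of 0 from being silently replaced by a random one.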

@mengxr

- Fixes bug in NumPy v1.9 which truncates random seeds larger than or equal to 2 ** 32
- Add an extra test for sampling with default seed
@SparkQA

SparkQA commented Oct 22, 2014

QA tests have started for PR 2889 at commit dc385ef.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 22, 2014

QA tests have finished for PR 2889 at commit dc385ef.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22025/

@mengxr
Contributor

mengxr commented Oct 22, 2014

LGTM. Merged into master. Thanks!


4 participants