[SPARK-8707] RDD#toDebugString fails if any cached RDD has invalid partitions #7127

Closed
wants to merge 1 commit into from

Conversation

navis
Contributor

@navis navis commented Jun 30, 2015

Added numPartitions(evaluate: Boolean) to RDD. With "evaluate=true", the method is the same as "partitions.length". With "evaluate=false", it looks only at checked-out or already evaluated partitions in the RDD to get the number of partitions; if neither is available, it returns -1. RDDInfo.partitionNum calls numPartitions only when it is accessed.
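
The behavior described above can be sketched without Spark. This is a minimal illustration, not the code from this patch: LazyPartitioned is a hypothetical stand-in for an RDD, its partition array is just an Array[Int], and the checked-out case is left out for brevity.

```scala
// Hypothetical stand-in for an RDD (not Spark's class): partitions are
// computed lazily by `compute`, which may throw if they are invalid.
final class LazyPartitioned(compute: () => Array[Int]) {
  // Holds the partition array once it has been evaluated.
  private var cached: Option[Array[Int]] = None

  // Evaluates the partitions on first access and caches the result.
  def partitions: Array[Int] = cached.getOrElse {
    val ps = compute()
    cached = Some(ps)
    ps
  }

  // evaluate = true  : same as partitions.length (may trigger evaluation).
  // evaluate = false : never triggers evaluation; returns -1 if unknown.
  def numPartitions(evaluate: Boolean): Int =
    if (evaluate) partitions.length
    else cached.map(_.length).getOrElse(-1)
}
```

With this shape, asking an unevaluated instance for numPartitions(evaluate = false) returns -1 instead of running `compute`, so a debug or info string can be built even when the partitions would fail to evaluate.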

@kmadhugit

We may not need to introduce another version of toDebugString with an argument to get around this issue. The reported problem is that if there are any unrelated invalid RDDs in the same application, it tries to evaluate all of them unnecessarily and fails. So we should restrict toDebugString to evaluating only its own partitions.
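
One way to picture this suggestion is a storage summary that takes a filter, as the getRDDStorageInfo(filter: RDD[_] => Boolean) signature quoted in the review does. The sketch below is only an illustration under assumptions: Tracked and StorageInfoSketch are hypothetical names, and a partition count that may throw stands in for an RDD with invalid partitions.

```scala
// Hypothetical stand-in for a persisted RDD: an id plus a partition count
// that may throw when it is evaluated.
final case class Tracked(id: Int, partitionCount: () => Int)

object StorageInfoSketch {
  // Builds (id, partition count) pairs only for entries accepted by
  // `filter`, so an unrelated entry with invalid partitions is never
  // evaluated.
  def storageInfo(all: Seq[Tracked], filter: Tracked => Boolean): Seq[(Int, Int)] =
    all.filter(filter).map(t => (t.id, t.partitionCount()))
}
```

Filtering before evaluating is the point of the design: a caller that only cares about its own lineage passes a filter for those ids and is not affected by failures elsewhere.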

@andrewor14
Contributor

add to whitelist

@andrewor14
Contributor

retest this please

@SparkQA

SparkQA commented Jul 2, 2015

Test build #36350 has finished for PR 7127 at commit acf3661.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@navis
Contributor Author

navis commented Jul 5, 2015

Addressed comments. Thanks, @kmadhugit

@SparkQA

SparkQA commented Jul 5, 2015

Test build #36539 has finished for PR 7127 at commit 966bcd9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Hex(child: Expression) extends UnaryExpression with ExpectsInputTypes
    • case class UnHex(child: Expression) extends UnaryExpression with ExpectsInputTypes
    • case class ShiftLeft(left: Expression, right: Expression)
    • case class ShiftRight(left: Expression, right: Expression)
    • case class ShiftRightUnsigned(left: Expression, right: Expression)
    • case class Levenshtein(left: Expression, right: Expression) extends BinaryExpression

@SparkQA

SparkQA commented Aug 21, 2015

Test build #41341 has finished for PR 7127 at commit 6698646.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 26, 2015

Test build #41580 has finished for PR 7127 at commit c61b350.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

@DeveloperApi
def getRDDStorageInfo(filter: RDD[_] => Boolean): Array[RDDInfo] = {
Contributor

I would just make this private[spark]. Let's try to limit the number of things we expose.

@andrewor14
Contributor

LGTM, left a few minor comments. I'll merge this once you address them.

@andrewor14
Contributor

retest this please

@SparkQA

SparkQA commented Sep 1, 2015

Test build #41888 has finished for PR 7127 at commit c61b350.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor

I've merged this into master after applying the change myself

@asfgit asfgit closed this in 0985d2c Sep 3, 2015
4 participants