
[SPARK-26076][Build][Minor] Revise ambiguous error message from load-spark-env.sh #23049


Closed
gengliangwang wants to merge 5 commits into apache:master from gengliangwang:reviseEnvScript

Conversation

gengliangwang (Member) commented Nov 15, 2018

What changes were proposed in this pull request?

When I tried to run scripts (e.g. `start-master.sh`/`start-history-server.sh`) on the latest master, I got this error:

```
Presence of build for multiple Scala versions detected.
Either clean one of them or, export SPARK_SCALA_VERSION in spark-env.sh.
```

The error message is quite confusing. Without reading `load-spark-env.sh`, I didn't know which directory to remove, or where to find and edit `spark-env.sh`.

This PR makes the error message clearer. It also changes the script so that it needs less maintenance when we add or drop Scala versions in the future.
Now that #22967 is in, we can revise the error message as follows (in my local setup):

```
Presence of build for multiple Scala versions detected (/Users/gengliangwang/IdeaProjects/spark/assembly/target/scala-2.12 and /Users/gengliangwang/IdeaProjects/spark/assembly/target/scala-2.11).
Remove one of them or, export SPARK_SCALA_VERSION=2.12 in /Users/gengliangwang/IdeaProjects/spark/conf/spark-env.sh.
Visit https://spark.apache.org/docs/latest/configuration.html#environment-variables for more details about setting environment variables in spark-env.sh.
```
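
For reference, the revised check in load-spark-env.sh looks roughly like the sketch below. This is an editorial reconstruction from the diff fragments quoted later in this thread; the exact variable names (e.g. ENV_VAR_DOC), the message wording, and the exit on error are assumptions, not the merged code verbatim:

```sh
# Rough sketch of the revised detection logic in load-spark-env.sh.
if [ -z "$SPARK_SCALA_VERSION" ]; then
  SCALA_VERSION1=2.12
  SCALA_VERSION2=2.11

  ASSEMBLY_DIR1="${SPARK_HOME}/assembly/target/scala-${SCALA_VERSION1}"
  ASSEMBLY_DIR2="${SPARK_HOME}/assembly/target/scala-${SCALA_VERSION2}"
  ENV_VAR_DOC="https://spark.apache.org/docs/latest/configuration.html#environment-variables"

  if [[ -d "$ASSEMBLY_DIR1" && -d "$ASSEMBLY_DIR2" ]]; then
    # Both builds exist: name the conflicting directories, say exactly what
    # to export and where, and point at the docs for spark-env.sh.
    echo "Presence of build for multiple Scala versions detected ($ASSEMBLY_DIR1 and $ASSEMBLY_DIR2)." 1>&2
    echo "Remove one of them or, export SPARK_SCALA_VERSION=${SCALA_VERSION1} in ${SPARK_CONF_DIR}/spark-env.sh." 1>&2
    echo "Visit ${ENV_VAR_DOC} for more details about setting environment variables in spark-env.sh." 1>&2
    exit 1
  fi
fi
```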

How was this patch tested?

Manual test

gengliangwang (Member Author) commented:

@srowen

```diff
@@ -47,8 +47,8 @@ if [ -z "$SPARK_SCALA_VERSION" ]; then
   ASSEMBLY_DIR1="${SPARK_HOME}/assembly/target/scala-2.12"

   if [[ -d "$ASSEMBLY_DIR2" && -d "$ASSEMBLY_DIR1" ]]; then
-    echo -e "Presence of build for multiple Scala versions detected." 1>&2
-    echo -e 'Either clean one of them or, export SPARK_SCALA_VERSION in spark-env.sh.' 1>&2
+    echo -e "Presence of build for both scala versions(SCALA 2.11 and SCALA 2.12) detected." 1>&2
```
Member commented:

How about something like "Multiple Scala versions detected ($ASSEMBLY_DIR1 and $ASSEMBLY_DIR2). Remove one, or export SPARK_SCALA_VERSION=(version) in load-spark-env.sh"

That adds more info and maybe is less maintenance as we add or drop Scala versions.

Member Author commented:

Makes sense. Let me do a quick update.

@gengliangwang gengliangwang changed the title [SPARK-26076][Build] Revise ambiguous error message from load-spark-env.sh [SPARK-26076][Build][Minor] Revise ambiguous error message from load-spark-env.sh Nov 15, 2018

SparkQA commented Nov 15, 2018

Test build #98879 has finished for PR 23049 at commit 1289df1.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Nov 15, 2018

Test build #98876 has finished for PR 23049 at commit bf94264.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

```sh
ASSEMBLY_DIR1="${SPARK_HOME}/assembly/target/scala-${SCALA_VERSION1}"
ASSEMBLY_DIR2="${SPARK_HOME}/assembly/target/scala-${SCALA_VERSION2}"
if [[ -d "$ASSEMBLY_DIR1" && -d "$ASSEMBLY_DIR2" ]]; then
  echo -e "Presence of build for multiple Scala versions detected($ASSEMBLY_DIR1 and $ASSEMBLY_DIR2)." 1>&2
```
Member commented:

Nit: space after 'detected'. I think you can put this in one string if you like.
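
For illustration only (not the merged code), the nit applied to the echo above would give a single string with the space restored:

```sh
echo -e "Presence of build for multiple Scala versions detected ($ASSEMBLY_DIR1 and $ASSEMBLY_DIR2)." 1>&2
```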

vanzin (Contributor) commented Nov 15, 2018

The reason why it mentioned spark-env.sh is that spark-env.sh is a user-defined script that gets loaded by load-spark-env.sh. From that script:

```sh
if [ -f "${SPARK_CONF_DIR}/spark-env.sh" ]; then
  # Promote all variable declarations to environment (exported) variables
  set -a
  . "${SPARK_CONF_DIR}/spark-env.sh"
  set +a
fi
```

Users shouldn't modify load-spark-env.sh, so your updated message is actually wrong.
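
For context, the workflow the message should point users to is roughly the following sketch (a hypothetical example, not part of this PR; the 2.12 value is just the one from the example message above):

```sh
# Hypothetical user workflow: create spark-env.sh from the template Spark
# ships, then pin the Scala version there, instead of editing load-spark-env.sh.
cp "${SPARK_HOME}/conf/spark-env.sh.template" "${SPARK_HOME}/conf/spark-env.sh"
echo 'export SPARK_SCALA_VERSION=2.12' >> "${SPARK_HOME}/conf/spark-env.sh"
```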

gengliangwang (Member Author) commented:

Hi @vanzin, thanks for pointing it out! I have updated the script and the PR description.


SparkQA commented Nov 16, 2018

Test build #98899 has finished for PR 23049 at commit daf5e33.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Nov 16, 2018

Test build #98900 has finished for PR 23049 at commit 3269862.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

gengliangwang (Member Author) commented:

retest this please.


SparkQA commented Nov 16, 2018

Test build #98908 has finished for PR 23049 at commit 3269862.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin (Contributor) commented Nov 16, 2018

I'm not sure this is actually helping much or making things less confusing.

If spark-env.sh doesn't exist, your message will tell people to modify a file that doesn't exist.

If it's triggered in a subsequent invocation of this script, where the file has already been loaded, it will just say spark-env.sh without its location, like before. So nothing is gained.

spark-env.sh is actually documented in Spark's configuration doc. So if you want to point people at the doc instead, I could see that as an improvement. But your current change doesn't really change anything from before, to me.

gengliangwang (Member Author) commented Nov 17, 2018

@vanzin I see your point. I will add a link to https://spark.apache.org/docs/latest/configuration.html. Thanks for the suggestion.

In my case, I didn't know where to find or edit spark-env.sh at the time. I tried running `find . -name spark-env.sh` and got nothing.
The script will print spark-env.sh without its location only if SPARK_ENV_LOADED is not set.

In addition, the new message shows what the value of SPARK_SCALA_VERSION should be.
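
A sketch of that idea (editorial, with assumed variable names; not the exact merged code): resolve the full path while SPARK_CONF_DIR is being determined, and fall back to the bare file name otherwise.

```sh
# Sketch: remember where spark-env.sh lives so a later error message can
# print its full path instead of the bare file name.
SPARK_ENV_SH="spark-env.sh"                        # fallback: bare name
if [ -z "$SPARK_ENV_LOADED" ]; then
  export SPARK_ENV_LOADED=1
  export SPARK_CONF_DIR="${SPARK_CONF_DIR:-"${SPARK_HOME}/conf"}"
  SPARK_ENV_SH="${SPARK_CONF_DIR}/${SPARK_ENV_SH}" # full path once known
  if [ -f "$SPARK_ENV_SH" ]; then
    # Promote all variable declarations to environment (exported) variables
    set -a
    . "$SPARK_ENV_SH"
    set +a
  fi
fi
```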


SparkQA commented Nov 17, 2018

Test build #98954 has finished for PR 23049 at commit 07cd7f5.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

gengliangwang (Member Author) commented:

retest this please.


SparkQA commented Nov 17, 2018

Test build #98957 has finished for PR 23049 at commit 07cd7f5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

srowen (Member) commented Nov 20, 2018

Merged to master

asfgit closed this in c34c422 on Nov 20, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
[SPARK-26076][Build][Minor] Revise ambiguous error message from load-spark-env.sh

Closes apache#23049 from gengliangwang/reviseEnvScript.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Sean Owen <[email protected]>