[SPARK-6980] [CORE] [WIP] Akka timeout exceptions indicate which conf controls them #5741

Closed
wants to merge 2 commits into from

BryanCutler
Member

First shot at adding a description to an Akka timeout that indicates which configuration property it came from.

@BryanCutler
Member Author

I am still pretty new to Scala, so any comments/suggestions are much appreciated! @squito

@squito
Contributor

squito commented Apr 29, 2015

jenkins, this is ok to test

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31255 has finished for PR 5741 at commit 9368e48.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ConfiguredTimeout(timeout_duration: FiniteDuration, timeout_description: String = null)
  • This patch does not change any dependencies.

* @param timeout_duration timeout duration in milliseconds
* @param timeout_description description to be displayed in a timeout exception
*/
class ConfiguredTimeout(timeout_duration: FiniteDuration, timeout_description: String = null) {
Contributor

Scala convention is camelCase, so timeoutDuration and timeoutDescription. But in this case, I think the "timeout" prefix is a little redundant anyway, so maybe just duration and description. I don't think you should have a null default for description.

@squito
Contributor

squito commented Apr 29, 2015

Thanks for working on this @BryanCutler! My major concern at this point is the use of the implicit conversion. Also, we need to add a test to AkkaUtilsSuite for askWithReply that checks the timeout still works and that we get the right error message when there is no response in time.

@BryanCutler
Member Author

Thanks for the quick feedback @squito! I probably should have explained a little, but I only added the implicit conversion to maintain API compatibility by also allowing a FiniteDuration input parameter. If that's not needed, then I can definitely remove it.

I'll fix up the code with your suggestions and see what I can do about adding a test case for this. Hopefully I'll be able to get it done in the next couple of days.

@BryanCutler
Member Author

Made another commit addressing the feedback from @squito. Changes are:

  • removed the implicit conversion; ConfiguredTimeout must now be constructed with a description
  • a more descriptive message is printed when a timeout occurs
  • added a method to create a ConfiguredTimeout from the properties used in RpcUtils.askTimeout
  • added a test in AkkaUtilsSuite and a new suite, ActorSystemSuite, to test with an ActorSystem
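
For readers following along, here is a minimal sketch of the idea behind these changes, not the code from this PR. It assumes a hypothetical `awaitResult` helper and uses only `scala.concurrent` rather than Akka; the property name passed as the description is purely illustrative.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical sketch: a timeout that remembers which setting controls it.
class ConfiguredTimeout(val duration: FiniteDuration, val description: String) {
  // No null default: every timeout must say where it came from.
  require(description != null && description.nonEmpty, "a description is required")

  /** Wait for `future`; on timeout, rethrow with the description appended. */
  def awaitResult[T](future: Future[T]): T =
    try {
      Await.result(future, duration)
    } catch {
      case e: TimeoutException =>
        val described = new TimeoutException(
          s"${e.getMessage}. This timeout is controlled by $description")
        described.initCause(e) // keep the original exception for debugging
        throw described
    }
}
```

A caller would construct it as `new ConfiguredTimeout(30.seconds, "spark.rpc.askTimeout")` and wrap each blocking wait in `awaitResult`, so the resulting exception message names the setting to adjust.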

@SparkQA

SparkQA commented May 3, 2015

Test build #31686 has finished for PR 5741 at commit d4bb0e9.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):
    • class ConfiguredTimeout(duration: FiniteDuration, description: String)
  • This patch adds the following new dependencies:
    • activation-1.1.jar
    • jaxb-api-2.2.2.jar
    • jaxb-impl-2.2.3-1.jar
    • mesos-0.21.0-shaded-protobuf.jar
  • This patch removes the following dependencies:
    • jaxb-api-2.2.7.jar
    • jaxb-core-2.2.7.jar
    • jaxb-impl-2.2.7.jar
    • mesos-0.21.1-shaded-protobuf.jar
    • pmml-agent-1.1.15.jar
    • pmml-model-1.1.15.jar
    • pmml-schema-1.1.15.jar
    • spark-unsafe_2.10-1.4.0-SNAPSHOT.jar

@squito
Contributor

squito commented May 5, 2015

Hi @BryanCutler, thanks for the updates.

I realized that between the time I dreamed this up and when I actually created the JIRA, a layer of abstraction was added to the RPC layer. So in addition to AkkaUtils.askWithReply, we also want to fix AkkaRpcEnv.sendWithReply, as that is where most of the calls are really being made. As you pointed out, these timeouts are all created by RpcUtils.askTimeout, so how about we just change that method to always return a ConfiguredTimeout?

Sorry that I spec'ed this poorly in the beginning and that it now requires more work, but I feel pretty strongly that we should get all of the timeouts covered, and the best way to do that is to make the compiler do the checks for us. It also makes me want to fix RpcUtils.lookupTimeout as well. Just changing AkkaUtils.askWithReply is still useful, so you don't have to take on the whole thing, but there's no huge time pressure, so feel free to keep going on it.
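
A rough sketch of what "make the compiler do the checks" could look like, under stated assumptions: a plain `Map` stands in for SparkConf, `TimeoutSketch`, `configuredTimeout`, and `describe` are hypothetical names, and the case class is a stand-in for the PR's class rather than its actual definition.

```scala
import scala.concurrent.duration._

// Stand-in for the PR's class; only the two fields matter here.
final case class ConfiguredTimeout(duration: FiniteDuration, description: String)

object TimeoutSketch {
  // Hypothetical helper: use the first property in `props` that is set in
  // `conf`, otherwise fall back to `default` under the first property's name.
  def configuredTimeout(
      conf: Map[String, String],
      props: Seq[String],
      default: String): ConfiguredTimeout = {
    require(props.nonEmpty, "at least one property name is required")
    val (prop, raw) = props.collectFirst {
      case p if conf.contains(p) => (p, conf(p))
    }.getOrElse((props.head, default))
    Duration(raw) match {
      case d: FiniteDuration => ConfiguredTimeout(d, prop)
      case other =>
        throw new IllegalArgumentException(s"$prop must be finite, got $other")
    }
  }

  // Accepting only ConfiguredTimeout means a bare FiniteDuration does not
  // compile at this call site, so no timeout can lose its description.
  def describe(timeout: ConfiguredTimeout): String =
    s"timed out after ${timeout.duration}; controlled by ${timeout.description}"
}
```

The design point is that once the ask/send methods take a `ConfiguredTimeout` instead of a `FiniteDuration`, any call site still passing an undescribed duration becomes a compile error rather than something a reviewer has to spot.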


conf.set(shortProp, "1s")

ssc = new StreamingContext(master, appName, batchDuration)
Contributor

You don't need to create a StreamingContext for this test at all; it only involves Akka actors. In fact, I think this test should be moved to AkkaUtilsSuite as well.

Member Author

No problem. In my initial tests I created the Actors from the ActorSystem, so I thought that was required. I'll rework this into AkkaUtilsSuite and shorten the timeouts as you mentioned below.

@BryanCutler
Member Author

Hey @squito, I agree that it would be best to improve all of these timeouts. I'll extend the scope to include AkkaRpcEnv.sendWithReply and modify askTimeout and lookupTimeout. Should we also try to handle the timeouts that come from using these operators:
    sender ! message
    sender ? message
Or, since this usage is outside the Spark realm, do we not need to worry about it?

@zsxwing
Member

zsxwing commented May 7, 2015

@BryanCutler Could you take a look at org.apache.spark.rpc.RpcEndpointRef and org.apache.spark.rpc.RpcEnv and apply ConfiguredTimeout there? AkkaUtils.askWithReply won't be used once we finish refactoring the RPC layer.

@BryanCutler
Member Author

Sure, I can do that. I'll follow up with another PR soon, thanks!

@squito
Contributor

squito commented May 8, 2015

Thanks for the feedback @zsxwing! And thanks for sticking with this, Bryan.

@squito
Contributor

squito commented May 8, 2015

@BryanCutler to answer your earlier question about sender ! message and sender ? message:

! doesn't have a timeout, so there is nothing to change there. As for other occurrences of ?, my guess is that the other refactoring going on will eliminate them. It wouldn't hurt for you to change them as well so they don't get overlooked, but that might be more work and/or redundant with the refactoring. Sorry we're going after a bit of a moving target here :)

@hardmettle

@BryanCutler @squito One more thing I was confused about: how would you differentiate between the constructors (in the apply method), since description and timeoutProp are both strings?
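
One possible way to sidestep the ambiguity, offered only as a sketch and not as what this PR does: two `apply` overloads whose parameter lists have the same types cannot coexist, so the factories can be given distinct names instead. The names `withDescription` and `fromProperty`, the `Map` standing in for SparkConf, and the seconds-only parsing are all assumptions made for illustration.

```scala
import scala.concurrent.duration._

// Private constructor: instances can only be created through the named factories.
final class ConfiguredTimeout private (
    val duration: FiniteDuration,
    val description: String)

object ConfiguredTimeout {
  // The caller supplies a human-readable description directly.
  def withDescription(duration: FiniteDuration, description: String): ConfiguredTimeout =
    new ConfiguredTimeout(duration, description)

  // The caller names a property; its value (whole seconds in this sketch) is
  // looked up in `conf`, and the property name itself becomes the description.
  def fromProperty(
      conf: Map[String, String],
      timeoutProp: String,
      default: FiniteDuration): ConfiguredTimeout = {
    val duration = conf.get(timeoutProp).map(_.toLong.seconds).getOrElse(default)
    new ConfiguredTimeout(duration, timeoutProp)
  }
}
```

With distinct names, the call site states which meaning the string carries, so the compiler never has to choose between two signatures that differ only in intent.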

@BryanCutler
Member Author

Closing, will continue in PR #6205

@BryanCutler BryanCutler deleted the akka-timeout-6980 branch October 29, 2015 21:34