misc(native-pos): Extract AbstractNativeProcess from NativeExecutionProcess (#27696)#27696
misc(native-pos): Extract AbstractNativeProcess from NativeExecutionProcess (#27696)#27696kevintang2022 wants to merge 1 commit intoprestodb:masterfrom
Conversation
Reviewer's GuideRefactors native worker process management by extracting the generic subprocess lifecycle and HTTP health-check logic from NativeExecutionProcess into a new AbstractNativeProcess base class, leaving NativeExecutionProcess as a thin subclass that supplies session-specific env, Spark-based path resolution, and worker configuration file population while preserving external behavior. Sequence diagram for starting a native execution process via AbstractNativeProcesssequenceDiagram
actor SparkExecutor
participant NativeExecutionProcess
participant AbstractNativeProcess
participant ProcessBuilder
participant NativeProcessBinary
participant PrestoSparkHttpServerClient
SparkExecutor->>NativeExecutionProcess: start()
NativeExecutionProcess->>AbstractNativeProcess: start()
AbstractNativeProcess->>AbstractNativeProcess: getLaunchCommand()
AbstractNativeProcess->>AbstractNativeProcess: resolveProcessWorkingPath("./")
AbstractNativeProcess->>NativeExecutionProcess: populateConfigurationFiles(configBasePath)
NativeExecutionProcess-->>AbstractNativeProcess: (writes worker config, node, catalog)
AbstractNativeProcess->>ProcessBuilder: new ProcessBuilder(launchCommand)
AbstractNativeProcess->>NativeExecutionProcess: customizeProcessBuilder(processBuilder)
NativeExecutionProcess-->>AbstractNativeProcess: (set INIT_PRESTO_QUERY_ID env)
AbstractNativeProcess->>ProcessBuilder: start()
ProcessBuilder-->>AbstractNativeProcess: Process
AbstractNativeProcess->>AbstractNativeProcess: ProcessOutputPipe.start()
AbstractNativeProcess->>AbstractNativeProcess: getServerInfoWithRetry()
loop retry until success or error budget exhausted
AbstractNativeProcess->>PrestoSparkHttpServerClient: getServerInfo()
PrestoSparkHttpServerClient-->>AbstractNativeProcess: BaseResponse_ServerInfo or error
end
alt server info obtained
AbstractNativeProcess-->>NativeExecutionProcess: start() returns
NativeExecutionProcess-->>SparkExecutor: start() returns
else failure after retries
AbstractNativeProcess->>AbstractNativeProcess: close()
AbstractNativeProcess->>AbstractNativeProcess: propagateStartFailure(Throwable)
AbstractNativeProcess-->>SparkExecutor: PrestoSparkFatalException (executor fails)
end
Class diagram for AbstractNativeProcess refactorclassDiagram
class AbstractNativeProcess {
-String executablePath
-String programArguments
-PrestoSparkHttpServerClient serverClient
-URI location
-int port
-Executor executor
-RequestErrorTracker errorTracker
-Process process
-ProcessOutputPipe processOutputPipe
+AbstractNativeProcess(executablePath, programArguments, nodeInternalAddress, httpClient, executor, scheduledExecutorService, serverInfoCodec, maxErrorDuration)
+start()
+terminateWithCore(timeout)
+close()
+isAlive() boolean
+getCrashReport() String
+getPort() int
+getLocation() URI
+getServerInfoWithRetry() SettableFuture_ServerInfo
+customizeProcessBuilder(processBuilder)
+populateConfigurationFiles(configBasePath) *abstract*
+resolveProcessWorkingPath(path) String
+propagateStartFailure(t) RuntimeException
+getAvailableTcpPort(nodeInternalAddress) int *static*
+workerConfigFile(configBasePath) Path *static*
+workerNodeConfigFile(configBasePath) Path *static*
+workerCatalogDir(configBasePath) Path *static*
}
class NativeExecutionProcess {
-Session session
-WorkerProperty workerProperty
-Optional_nativeTempStorageHandle nativeTempStorageHandle
+NativeExecutionProcess(executablePath, programArguments, session, httpClient, executor, scheduledExecutorService, serverInfoCodec, maxErrorDuration, workerProperty, nativeTempStorageHandle)
+customizeProcessBuilder(processBuilder)
+populateConfigurationFiles(configBasePath)
+resolveProcessWorkingPath(path) String
+updateWorkerProperties()
+updateWorkerMemoryProperties()
}
class AbstractNativeProcess_ProcessOutputPipe {
-long pid
-InputStream inputStream
-OutputStream outputStream
-StringBuilder abortMessage
-AtomicBoolean started
+AbstractNativeProcess_ProcessOutputPipe(pid, inputStream, outputStream)
+start()
+run()
+getAbortMessage() String
}
AbstractNativeProcess <|-- NativeExecutionProcess
AbstractNativeProcess *-- AbstractNativeProcess_ProcessOutputPipe
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
85bd9bc to
3c8d166
Compare
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The config file name constants lost their leading '/' and are now resolved via Path.resolve, which changes paths like Paths.get(configBasePath, "/config.properties") to configBasePath.resolve("config.properties"); please double-check this doesn’t alter the effective etc-dir layout compared to the previous behavior.
- propagateStartFailure currently wraps the original Throwable as t.getCause(), which can lose the top-level exception and stack; consider passing t directly (or preserving the full cause chain) to avoid losing debugging context when a native process fails to start.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The config file name constants lost their leading '/' and are now resolved via Path.resolve, which changes paths like Paths.get(configBasePath, "/config.properties") to configBasePath.resolve("config.properties"); please double-check this doesn’t alter the effective etc-dir layout compared to the previous behavior.
- propagateStartFailure currently wraps the original Throwable as t.getCause(), which can lose the top-level exception and stack; consider passing t directly (or preserving the full cause chain) to avoid losing debugging context when a native process fails to start.
## Individual Comments
### Comment 1
<location path="presto-spark-base/src/main/java/com/facebook/presto/spark/execution/nativeprocess/AbstractNativeProcess.java" line_range="186-188" />
<code_context>
+ * <p>The return type lets callers write {@code throw propagateStartFailure(t);} for
+ * flow analysis, even though control never reaches the {@code throw}.
+ */
+ protected RuntimeException propagateStartFailure(Throwable t)
+ {
+ throw new PrestoSparkFatalException(t.getMessage(), t.getCause());
+ }
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Preserve the original throwable when propagating start failures to avoid losing stack information.
Since `t` may be a `PrestoException` (or other non-wrapped exception) with a null `getCause()`, this construction can drop the original stack trace. Prefer using `t` as the cause (e.g., `new PrestoSparkFatalException("Native process start failed", t)`) so the full stack is preserved. Also, the method’s declared return type (`RuntimeException`) is misleading because it always throws an `Error` and never returns.
</issue_to_address>
### Comment 2
<location path="presto-spark-base/src/main/java/com/facebook/presto/spark/execution/nativeprocess/AbstractNativeProcess.java" line_range="416-420" />
<code_context>
+ ImmutableList.Builder<String> command = ImmutableList.builder();
+ List<String> argsList = Arrays.asList(programArguments.split("\\s+"));
+ boolean etcDirSet = false;
+ for (int i = 0; i < argsList.size(); i++) {
+ String arg = argsList.get(i);
+ if (arg.equals("--etc_dir")) {
+ etcDirSet = true;
+ configPath = argsList.get(i + 1);
+ break;
+ }
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Harden parsing of the `--etc_dir` argument to avoid potential IndexOutOfBoundsException.
In `getLaunchCommand`, when `arg.equals("--etc_dir")` you access `argsList.get(i + 1)` without checking bounds. If `--etc_dir` is last or malformed, this will throw `IndexOutOfBoundsException`. Since this runs during startup error handling, add an `i + 1 < argsList.size()` check and, on failure, throw a `PrestoException` with a clear message instead of relying on the raw index error.
Suggested implementation:
```java
private List<String> getLaunchCommand()
throws IOException
{
String configPath = Paths.get(resolveProcessWorkingPath("./"), String.valueOf(port)).toAbsolutePath().toString();
ImmutableList.Builder<String> command = ImmutableList.builder();
List<String> argsList = Arrays.asList(programArguments.split("\\s+"));
boolean etcDirSet = false;
for (int i = 0; i < argsList.size(); i++) {
String arg = argsList.get(i);
if ("--etc_dir".equals(arg)) {
etcDirSet = true;
if (i + 1 >= argsList.size()) {
throw new PrestoException(
GENERIC_INTERNAL_ERROR,
"Missing value for --etc_dir argument in programArguments: " + programArguments);
}
configPath = argsList.get(i + 1);
break;
}
}
```
To compile successfully, ensure the following imports are present at the top of `AbstractNativeProcess.java` (if they are not already there):
1. `import com.facebook.presto.spi.PrestoException;`
2. `import static com.facebook.presto.spi.StandardErrorCode.GENERIC_INTERNAL_ERROR;`
If a different error code is preferred for configuration issues in this module (e.g., `CONFIGURATION_INVALID` or similar), replace `GENERIC_INTERNAL_ERROR` with that project-standard error code.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| protected RuntimeException propagateStartFailure(Throwable t) | ||
| { | ||
| throw new PrestoSparkFatalException(t.getMessage(), t.getCause()); |
There was a problem hiding this comment.
suggestion (bug_risk): Preserve the original throwable when propagating start failures to avoid losing stack information.
Since t may be a PrestoException (or other non-wrapped exception) with a null getCause(), this construction can drop the original stack trace. Prefer using t as the cause (e.g., new PrestoSparkFatalException("Native process start failed", t)) so the full stack is preserved. Also, the method’s declared return type (RuntimeException) is misleading because it always throws an Error and never returns.
| for (int i = 0; i < argsList.size(); i++) { | ||
| String arg = argsList.get(i); | ||
| if (arg.equals("--etc_dir")) { | ||
| etcDirSet = true; | ||
| configPath = argsList.get(i + 1); |
There was a problem hiding this comment.
suggestion (bug_risk): Harden parsing of the --etc_dir argument to avoid potential IndexOutOfBoundsException.
In getLaunchCommand, when arg.equals("--etc_dir") you access argsList.get(i + 1) without checking bounds. If --etc_dir is last or malformed, this will throw IndexOutOfBoundsException. Since this runs during startup error handling, add an i + 1 < argsList.size() check and, on failure, throw a PrestoException with a clear message instead of relying on the raw index error.
Suggested implementation:
private List<String> getLaunchCommand()
throws IOException
{
String configPath = Paths.get(resolveProcessWorkingPath("./"), String.valueOf(port)).toAbsolutePath().toString();
ImmutableList.Builder<String> command = ImmutableList.builder();
List<String> argsList = Arrays.asList(programArguments.split("\\s+"));
boolean etcDirSet = false;
for (int i = 0; i < argsList.size(); i++) {
String arg = argsList.get(i);
if ("--etc_dir".equals(arg)) {
etcDirSet = true;
if (i + 1 >= argsList.size()) {
throw new PrestoException(
GENERIC_INTERNAL_ERROR,
"Missing value for --etc_dir argument in programArguments: " + programArguments);
}
configPath = argsList.get(i + 1);
break;
}
}To compile successfully, ensure the following imports are present at the top of AbstractNativeProcess.java (if they are not already there):
import com.facebook.presto.spi.PrestoException;import static com.facebook.presto.spi.StandardErrorCode.GENERIC_INTERNAL_ERROR;
If a different error code is preferred for configuration issues in this module (e.g., CONFIGURATION_INVALID or similar), replace GENERIC_INTERNAL_ERROR with that project-standard error code.
…rocess (prestodb#27696) Summary: ## Description Extracts the common subprocess lifecycle code from `NativeExecutionProcess` into a new `AbstractNativeProcess` base class. The original `NativeExecutionProcess` becomes a thin subclass that supplies execution-specific configuration. Behavior is unchanged. ## Motivation and Context The presto-on-spark driver needs to launch a second native subprocess (a metadata-only sidecar — see follow-up diffs in this stack) that shares ~95% of its lifecycle with the existing per-task native worker process: spawn the binary, write per-process etc/ files, install a shutdown hook, monitor for unexpected exit. Without this refactor each new native subprocess type would have to re-implement the lifecycle boilerplate. ## Impact No user-facing impact. `NativeExecutionProcess` is internal to presto-on-spark and its public behavior (constructor signature, start/stop semantics, configuration files written) is preserved. Subclasses can override hooks like `populateConfigurationFiles` and `resolveProcessWorkingPath`. ## Test Plan Existing `NativeExecutionProcess` unit tests continue to pass. The new abstract class is exercised by tests in subsequent diffs in this stack that introduce a second concrete subclass. ## Contributor checklist - [x] My PR adheres to the code style of this project. - [x] My code builds clean without any errors or warnings. - [x] I am willing to help maintain this change if there are any issues in the future. ## Release Notes \`\`\` == NO RELEASE NOTE == \`\`\` Differential Revision: D103025709
3c8d166 to
308bcdc
Compare
…rocess (prestodb#27696) Summary: ## Description Extracts the common subprocess lifecycle code from `NativeExecutionProcess` into a new `AbstractNativeProcess` base class. The original `NativeExecutionProcess` becomes a thin subclass that supplies execution-specific configuration. Behavior is unchanged. ## Motivation and Context The presto-on-spark driver needs to launch a second native subprocess (a metadata-only sidecar — see follow-up diffs in this stack) that shares ~95% of its lifecycle with the existing per-task native worker process: spawn the binary, write per-process etc/ files, install a shutdown hook, monitor for unexpected exit. Without this refactor each new native subprocess type would have to re-implement the lifecycle boilerplate. ## Impact No user-facing impact. `NativeExecutionProcess` is internal to presto-on-spark and its public behavior (constructor signature, start/stop semantics, configuration files written) is preserved. Subclasses can override hooks like `populateConfigurationFiles` and `resolveProcessWorkingPath`. ## Test Plan Existing `NativeExecutionProcess` unit tests continue to pass. The new abstract class is exercised by tests in subsequent diffs in this stack that introduce a second concrete subclass. ## Contributor checklist - [x] My PR adheres to the code style of this project. - [x] My code builds clean without any errors or warnings. - [x] I am willing to help maintain this change if there are any issues in the future. ## Release Notes \`\`\` == NO RELEASE NOTE == \`\`\` Differential Revision: D103025709
…rocess (prestodb#27696) Summary: ## Description Extracts the common subprocess lifecycle code from `NativeExecutionProcess` into a new `AbstractNativeProcess` base class. The original `NativeExecutionProcess` becomes a thin subclass that supplies execution-specific configuration. Behavior is unchanged. ## Motivation and Context The presto-on-spark driver needs to launch a second native subprocess (a metadata-only sidecar — see follow-up diffs in this stack) that shares ~95% of its lifecycle with the existing per-task native worker process: spawn the binary, write per-process etc/ files, install a shutdown hook, monitor for unexpected exit. Without this refactor each new native subprocess type would have to re-implement the lifecycle boilerplate. ## Impact No user-facing impact. `NativeExecutionProcess` is internal to presto-on-spark and its public behavior (constructor signature, start/stop semantics, configuration files written) is preserved. Subclasses can override hooks like `populateConfigurationFiles` and `resolveProcessWorkingPath`. ## Test Plan Existing `NativeExecutionProcess` unit tests continue to pass. The new abstract class is exercised by tests in subsequent diffs in this stack that introduce a second concrete subclass. ## Contributor checklist - [x] My PR adheres to the code style of this project. - [x] My code builds clean without any errors or warnings. - [x] I am willing to help maintain this change if there are any issues in the future. ## Release Notes \`\`\` == NO RELEASE NOTE == \`\`\` Differential Revision: D103025709
jja725
left a comment
There was a problem hiding this comment.
Thanks for the refactoring
Summary:
Description
Extracts the common subprocess lifecycle code from
NativeExecutionProcessinto a newAbstractNativeProcessbase class. The originalNativeExecutionProcessbecomes a thin subclass that supplies execution-specific configuration. Behavior is unchanged.Motivation and Context
The presto-on-spark driver needs to launch a second native subprocess (a metadata-only sidecar — see follow-up diffs in this stack) that shares ~95% of its lifecycle with the existing per-task native worker process: spawn the binary, write per-process etc/ files, install a shutdown hook, monitor for unexpected exit. Without this refactor each new native subprocess type would have to re-implement the lifecycle boilerplate.
Impact
No user-facing impact.
NativeExecutionProcessis internal to presto-on-spark and its public behavior (constructor signature, start/stop semantics, configuration files written) is preserved. Subclasses can override hooks likepopulateConfigurationFilesandresolveProcessWorkingPath.Test Plan
Existing
NativeExecutionProcessunit tests continue to pass. The new abstract class is exercised by tests in subsequent diffs in this stack that introduce a second concrete subclass.Contributor checklist
Release Notes
```
== NO RELEASE NOTE ==
```
Differential Revision: D103025709