Closed
Description
Failed to create training datasets.
To Reproduce
Reproducible in many of the examples, but easiest to reproduce in Poker.
- Define only the app, environment and raw columns in YAML
- Run cx deploy to process the raw columns
- Define a model that uses the raw columns
- Run cx deploy again
Stack Trace
cx logs dnn/training_dataset
Failed to start:
time="2019-04-18T20:12:09Z" level=info msg="Creating a docker executor"
time="2019-04-18T20:12:09Z" level=info msg="Executor (version: v2.2.1, build_date: 2018-10-11T16:27:29Z) initialized with template:\narchiveLocation: {}\ninputs: {}\nmetadata:\n labels:\n appName: recommendations\n argo: \"true\"\n workloadID: bord6fnh1lma5hyn8my3\n workloadType: data-job\nname: bord6fnh1lma5hyn8my3\noutputs: {}\nresource:\n action: create\n failureCondition: status.applicationState.state in (FAILED,SUBMISSION_FAILED,UNKNOWN)\n manifest: |-\n {\n \"kind\": \"SparkApplication\",\n \"apiVersion\": \"sparkoperator.k8s.io/v1alpha1\",\n \"metadata\": {\n \"name\": \"bord6fnh1lma5hyn8my3\",\n \"namespace\": \"cortex\",\n \"creationTimestamp\": null,\n \"labels\": {\n \"appName\": \"recommendations\",\n \"workloadID\": \"bord6fnh1lma5hyn8my3\",\n \"workloadType\": \"data-job\"\n },\n \"ownerReferences\": [\n {\n \"apiVersion\": \"argoproj.io/v1alpha1\",\n \"kind\": \"Workflow\",\n \"name\": \"argo-recommendations-rplw6\",\n \"uid\": \"3dca2989-6216-11e9-aaf1-02cc01957708\",\n \"blockOwnerDeletion\": false\n }\n ]\n },\n \"spec\": {\n \"type\": \"Python\",\n \"mode\": \"cluster\",\n \"image\": \"969758392368.dkr.ecr.us-west-2.amazonaws.com/cortexlabs/spark:latest\",\n \"imagePullPolicy\": \"Always\",\n \"mainApplicationFile\": \"local:///src/spark_job/spark_job.py\",\n \"arguments\": [\n \"--workload-id=bord6fnh1lma5hyn8my3 --context=s3://cortex-cluster-vishal/apps/recommendations/contexts/9063143c7366987a974e14c07bf21c40bed64e03a3aa0fed55a670c7756e317.msgpack --cache-dir=/mnt/context --raw-columns= --aggregates= --transformed-columns= --training-datasets=d6c73248656984e3d08a6165cd3b34de27253021cb94c232526f96776999d73\"\n ],\n \"driver\": {\n \"cores\": 0,\n \"memory\": \"0k\",\n \"envVars\": {\n \"CORTEX_CACHE_DIR\": \"/mnt/context\",\n \"CORTEX_CONTEXT_S3_PATH\": \"s3://cortex-cluster-vishal/apps/recommendations/contexts/9063143c7366987a974e14c07bf21c40bed64e03a3aa0fed55a670c7756e317.msgpack\",\n \"CORTEX_SPARK_VERBOSITY\": \"WARN\",\n 
\"CORTEX_WORKLOAD_ID\": \"bord6fnh1lma5hyn8my3\"\n },\n \"envSecretKeyRefs\": {\n \"AWS_ACCESS_KEY_ID\": {\n \"name\": \"aws-credentials\",\n \"key\": \"AWS_ACCESS_KEY_ID\"\n },\n \"AWS_SECRET_ACCESS_KEY\": {\n \"name\": \"aws-credentials\",\n \"key\": \"AWS_SECRET_ACCESS_KEY\"\n }\n },\n \"labels\": {\n \"appName\": \"recommendations\",\n \"userFacing\": \"true\",\n \"workloadID\": \"bord6fnh1lma5hyn8my3\",\n \"workloadType\": \"data-job\"\n },\n \"podName\": \"bord6fnh1lma5hyn8my3\",\n \"serviceAccount\": \"spark\"\n },\n \"executor\": {\n \"cores\": 0,\n \"memory\": \"0k\",\n \"envVars\": {\n \"CORTEX_CACHE_DIR\": \"/mnt/context\",\n \"CORTEX_CONTEXT_S3_PATH\": \"s3://cortex-cluster-vishal/apps/recommendations/contexts/9063143c7366987a974e14c07bf21c40bed64e03a3aa0fed55a670c7756e317.msgpack\",\n \"CORTEX_SPARK_VERBOSITY\": \"WARN\",\n \"CORTEX_WORKLOAD_ID\": \"bord6fnh1lma5hyn8my3\"\n },\n \"envSecretKeyRefs\": {\n \"AWS_ACCESS_KEY_ID\": {\n \"name\": \"aws-credentials\",\n \"key\": \"AWS_ACCESS_KEY_ID\"\n },\n \"AWS_SECRET_ACCESS_KEY\": {\n \"name\": \"aws-credentials\",\n \"key\": \"AWS_SECRET_ACCESS_KEY\"\n }\n },\n \"labels\": {\n \"appName\": \"recommendations\",\n \"workloadID\": \"bord6fnh1lma5hyn8my3\",\n \"workloadType\": \"data-job\"\n },\n \"instances\": 0\n },\n \"deps\": {\n \"pyFiles\": [\n \"local:///src/spark_job/spark_util.py\",\n \"local:///src/lib/*.py\"\n ]\n },\n \"restartPolicy\": {\n \"type\": \"Never\"\n },\n \"pythonVersion\": \"3\"\n },\n \"status\": {\n \"lastSubmissionAttemptTime\": null,\n \"completionTime\": null,\n \"driverInfo\": {},\n \"applicationState\": {\n \"state\": \"\",\n \"errorMessage\": \"\"\n }\n }\n }\n successCondition: status.applicationState.state in (COMPLETED)\n"
time="2019-04-18T20:12:09Z" level=info msg="Loading manifest to /tmp/manifest.yaml"
time="2019-04-18T20:12:09Z" level=info msg="kubectl create -f /tmp/manifest.yaml -o name"
time="2019-04-18T20:12:10Z" level=fatal msg="The SparkApplication \"bord6fnh1lma5hyn8my3\" is invalid: []: Invalid value: map[string]interface {}{\"apiVersion\":\"sparkoperator.k8s.io/v1alpha1\", \"kind\":\"SparkApplication\", \"metadata\":map[string]interface {}{\"name\":\"bord6fnh1lma5hyn8my3\", \"namespace\":\"cortex\", \"creationTimestamp\":\"2019-04-18T20:12:09Z\", \"labels\":map[string]interface {}{\"workloadID\":\"bord6fnh1lma5hyn8my3\", \"workloadType\":\"data-job\", \"appName\":\"recommendations\"}, \"ownerReferences\":[]interface {}{map[string]interface {}{\"apiVersion\":\"argoproj.io/v1alpha1\", \"kind\":\"Workflow\", \"name\":\"argo-recommendations-rplw6\", \"uid\":\"3dca2989-6216-11e9-aaf1-02cc01957708\", \"blockOwnerDeletion\":false}}, \"generation\":1, \"uid\":\"3f1c787f-6216-11e9-aaf1-02cc01957708\", \"selfLink\":\"\"}, \"spec\":map[string]interface {}{\"image\":\"969758392368.dkr.ecr.us-west-2.amazonaws.com/cortexlabs/spark:latest\", \"mainApplicationFile\":\"local:///src/spark_job/spark_job.py\", \"mode\":\"cluster\", \"restartPolicy\":map[string]interface {}{\"type\":\"Never\"}, \"type\":\"Python\", \"driver\":map[string]interface {}{\"serviceAccount\":\"spark\", \"cores\":0, \"envSecretKeyRefs\":map[string]interface {}{\"AWS_ACCESS_KEY_ID\":map[string]interface {}{\"key\":\"AWS_ACCESS_KEY_ID\", \"name\":\"aws-credentials\"}, \"AWS_SECRET_ACCESS_KEY\":map[string]interface {}{\"key\":\"AWS_SECRET_ACCESS_KEY\", \"name\":\"aws-credentials\"}}, \"envVars\":map[string]interface {}{\"CORTEX_CACHE_DIR\":\"/mnt/context\", \"CORTEX_CONTEXT_S3_PATH\":\"s3://cortex-cluster-vishal/apps/recommendations/contexts/9063143c7366987a974e14c07bf21c40bed64e03a3aa0fed55a670c7756e317.msgpack\", \"CORTEX_SPARK_VERBOSITY\":\"WARN\", \"CORTEX_WORKLOAD_ID\":\"bord6fnh1lma5hyn8my3\"}, \"labels\":map[string]interface {}{\"appName\":\"recommendations\", \"userFacing\":\"true\", \"workloadID\":\"bord6fnh1lma5hyn8my3\", \"workloadType\":\"data-job\"}, \"memory\":\"0k\", 
\"podName\":\"bord6fnh1lma5hyn8my3\"}, \"deps\":map[string]interface {}{\"pyFiles\":[]interface {}{\"local:///src/spark_job/spark_util.py\", \"local:///src/lib/*.py\"}}, \"executor\":map[string]interface {}{\"envVars\":map[string]interface {}{\"CORTEX_CACHE_DIR\":\"/mnt/context\", \"CORTEX_CONTEXT_S3_PATH\":\"s3://cortex-cluster-vishal/apps/recommendations/contexts/9063143c7366987a974e14c07bf21c40bed64e03a3aa0fed55a670c7756e317.msgpack\", \"CORTEX_SPARK_VERBOSITY\":\"WARN\", \"CORTEX_WORKLOAD_ID\":\"bord6fnh1lma5hyn8my3\"}, \"instances\":0, \"labels\":map[string]interface {}{\"appName\":\"recommendations\", \"workloadID\":\"bord6fnh1lma5hyn8my3\", \"workloadType\":\"data-job\"}, \"memory\":\"0k\", \"cores\":0, \"envSecretKeyRefs\":map[string]interface {}{\"AWS_ACCESS_KEY_ID\":map[string]interface {}{\"key\":\"AWS_ACCESS_KEY_ID\", \"name\":\"aws-credentials\"}, \"AWS_SECRET_ACCESS_KEY\":map[string]interface {}{\"name\":\"aws-credentials\", \"key\":\"AWS_SECRET_ACCESS_KEY\"}}}, \"imagePullPolicy\":\"Always\", \"pythonVersion\":\"3\", \"arguments\":[]interface {}{\"--workload-id=bord6fnh1lma5hyn8my3 --context=s3://cortex-cluster-vishal/apps/recommendations/contexts/9063143c7366987a974e14c07bf21c40bed64e03a3aa0fed55a670c7756e317.msgpack --cache-dir=/mnt/context --raw-columns= --aggregates= --transformed-columns= --training-datasets=d6c73248656984e3d08a6165cd3b34de27253021cb94c232526f96776999d73\"}}, \"status\":map[string]interface {}{\"lastSubmissionAttemptTime\":interface {}(nil), \"applicationState\":map[string]interface {}{\"errorMessage\":\"\", \"state\":\"\"}, \"completionTime\":interface {}(nil), \"driverInfo\":map[string]interface {}{}}}: validation failure list:\nspec.driver.cores in body should be greater than 0\nspec.executor.instances in body should be greater than or equal to 1\nspec.executor.cores in body should be greater than 
0\ngithub.com/argoproj/argo/errors.New\n\t/root/go/src/github.com/argoproj/argo/errors/errors.go:48\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).ExecResource\n\t/root/go/src/github.com/argoproj/argo/workflow/executor/resource.go:36\ngithub.com/argoproj/argo/cmd/argoexec/commands.execResource\n\t/root/go/src/github.com/argoproj/argo/cmd/argoexec/commands/resource.go:38\ngithub.com/argoproj/argo/cmd/argoexec/commands.glob..func2\n\t/root/go/src/github.com/argoproj/argo/cmd/argoexec/commands/resource.go:23\ngithub.com/argoproj/argo/vendor/github.com/spf13/cobra.(*Command).execute\n\t/root/go/src/github.com/argoproj/argo/vendor/github.com/spf13/cobra/command.go:766\ngithub.com/argoproj/argo/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/root/go/src/github.com/argoproj/argo/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/argoproj/argo/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/root/go/src/github.com/argoproj/argo/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/root/go/src/github.com/argoproj/argo/cmd/argoexec/main.go:15\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:198\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:2361"
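The fatal message at the end of the log comes down to three resource fields in the submitted SparkApplication: spec.driver.cores, spec.executor.cores and spec.executor.instances are all 0 in the manifest. Below is a minimal sketch of a pre-submission check for those three constraints, assuming the manifest is available as a parsed dict. The function name validate_spark_resources is hypothetical and not part of Cortex or the Spark operator, and the check covers only the constraints reported in this log, not the operator's full validation schema.

```python
def validate_spark_resources(manifest):
    """Return the violations this log reports for a SparkApplication manifest.

    Hypothetical helper: it mirrors only the three constraints listed in the
    validation failure above.
    """
    spec = manifest.get("spec", {})
    driver = spec.get("driver", {})
    executor = spec.get("executor", {})

    errors = []
    if driver.get("cores", 0) <= 0:
        errors.append("spec.driver.cores in body should be greater than 0")
    if executor.get("instances", 0) < 1:
        errors.append(
            "spec.executor.instances in body should be greater than or equal to 1"
        )
    if executor.get("cores", 0) <= 0:
        errors.append("spec.executor.cores in body should be greater than 0")
    return errors


# Resource values copied from the manifest in the log above.
failing_manifest = {
    "spec": {
        "driver": {"cores": 0, "memory": "0k"},
        "executor": {"cores": 0, "memory": "0k", "instances": 0},
    }
}

for message in validate_spark_resources(failing_manifest):
    print(message)
```

Running this against the values from the log prints the same three messages as the validation failure list, which suggests the data job was submitted without any Spark resources computed for it.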
Version
master