Commit 30421dd

[HWORKS-2190][APPEND] Updating job configuration to include file, pyfiles, archives and jars (#478)
* updating docs for jobs configs to include files, pyFiles, jars and archives
* updating based on review comments
* updating documentation for notebooks and python Jobs
1 parent 4acf8b6 commit 30421dd

File tree

4 files changed: +13 -2 lines changed

docs/user_guides/projects/jobs/notebook_job.md

Lines changed: 1 addition & 0 deletions
@@ -179,6 +179,7 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `resourceConfig.gpus` | number (int) | Number of GPUs to be allocated | `0` |
 | `logRedirection` | boolean | Whether logs are redirected | `true` |
 | `jobType` | string | Type of job | `"PYTHON"` |
+| `files` | string | HDFS path(s) to files to be provided to the Notebook Job. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/file1.py,hdfs:///Project/<project_name>/Resources/file2.txt"` | `null` |


 ## Accessing project data
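The `files` field added above takes a single comma-separated string of HDFS paths, not a list. A minimal sketch of assembling such a payload, assuming a config dict shaped like the documented table; the `with_files` helper and the `my_project` project name are illustrative, not part of the Hopsworks API:

```python
# Build a Python/Notebook job configuration payload with extra files attached.
# `with_files` is a hypothetical helper, not a Hopsworks API call.

def with_files(config: dict, paths: list) -> dict:
    """Return a copy of `config` with `files` set to a comma-separated string."""
    updated = dict(config)
    updated["files"] = ",".join(paths) if paths else None  # default is null/None
    return updated

base = {
    "jobType": "PYTHON",
    "logRedirection": True,
    "files": None,  # default per the table above
}

cfg = with_files(base, [
    "hdfs:///Project/my_project/Resources/file1.py",
    "hdfs:///Project/my_project/Resources/file2.txt",
])
print(cfg["files"])
# -> hdfs:///Project/my_project/Resources/file1.py,hdfs:///Project/my_project/Resources/file2.txt
```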

docs/user_guides/projects/jobs/pyspark_job.md

Lines changed: 5 additions & 1 deletion
@@ -217,7 +217,7 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | Field | Type | Description | Default |
 | ------------------------------------------ | -------------- |-----------------------------------------------------| -------------------------- |
 | `type` | string | Type of the job configuration | `"sparkJobConfiguration"` |
-| `appPath` | string | Project path to script (e.g `Resources/foo.py`) | `null` |
+| `appPath` | string | Project path to script (e.g `Resources/foo.py`) | `null` |
 | `environmentName` | string | Name of the project spark environment | `"spark-feature-pipeline"` |
 | `spark.driver.cores` | number (float) | Number of CPU cores allocated for the driver | `1.0` |
 | `spark.driver.memory` | number (int) | Memory allocated for the driver (in MB) | `2048` |
@@ -229,6 +229,10 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `spark.dynamicAllocation.maxExecutors` | number (int) | Maximum number of executors with dynamic allocation | `2` |
 | `spark.dynamicAllocation.initialExecutors` | number (int) | Initial number of executors with dynamic allocation | `1` |
 | `spark.blacklist.enabled` | boolean | Whether executor/node blacklisting is enabled | `false` |
+| `files` | string | HDFS path(s) to files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/file1.py,hdfs:///Project/<project_name>/Resources/file2.txt"` | `null` |
+| `pyFiles` | string | HDFS path(s) to Python files to be provided to the Spark application. These will be added to the `PYTHONPATH` so they can be imported as modules. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/module1.py,hdfs:///Project/<project_name>/Resources/module2.py"` | `null` |
+| `jars` | string | HDFS path(s) to JAR files to be provided to the Spark application. These will be added to the classpath. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/lib1.jar,hdfs:///Project/<project_name>/Resources/lib2.jar"` | `null` |
+| `archives` | string | HDFS path(s) to archive files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/archive1.zip,hdfs:///Project/<project_name>/Resources/archive2.tar.gz"` | `null` |


 ## Accessing project data
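Taken together, the four new fields slot into the documented payload as plain comma-separated strings. A sketch of a `sparkJobConfiguration` dict built from the defaults in the table above; `my_project` and the `Resources/...` paths are placeholders:

```python
# Sketch of a sparkJobConfiguration payload carrying the four new fields.
# Values mirror the defaults documented above; paths are placeholders.

spark_config = {
    "type": "sparkJobConfiguration",
    "appPath": "Resources/foo.py",
    "environmentName": "spark-feature-pipeline",
    "spark.driver.cores": 1.0,
    "spark.driver.memory": 2048,
    "spark.blacklist.enabled": False,
    # Each of the four fields is one comma-separated string, not a list:
    "files": "hdfs:///Project/my_project/Resources/file1.py,"
             "hdfs:///Project/my_project/Resources/file2.txt",
    "pyFiles": "hdfs:///Project/my_project/Resources/module1.py",  # importable as modules
    "jars": "hdfs:///Project/my_project/Resources/lib1.jar",       # added to the classpath
    "archives": "hdfs:///Project/my_project/Resources/archive1.zip",
}

# Splitting on "," recovers the individual HDFS paths:
print(spark_config["files"].split(","))
```

In the Hopsworks Python client, a payload like this would typically be obtained from `jobs_api.get_configuration(...)` and adjusted before creating the job; check the field names against the tables above rather than this sketch.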

docs/user_guides/projects/jobs/python_job.md

Lines changed: 1 addition & 0 deletions
@@ -177,6 +177,7 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `resourceConfig.gpus` | number (int) | Number of GPUs to be allocated | `0` |
 | `logRedirection` | boolean | Whether logs are redirected | `true` |
 | `jobType` | string | Type of job | `"PYTHON"` |
+| `files` | string | HDFS path(s) to files to be provided to the Python Job. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/file1.py,hdfs:///Project/<project_name>/Resources/file2.txt"` | `null` |


 ## Accessing project data

docs/user_guides/projects/jobs/spark_job.md

Lines changed: 6 additions & 1 deletion
@@ -230,7 +230,12 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `spark.dynamicAllocation.minExecutors` | number (int) | Minimum number of executors with dynamic allocation | `1` |
 | `spark.dynamicAllocation.maxExecutors` | number (int) | Maximum number of executors with dynamic allocation | `2` |
 | `spark.dynamicAllocation.initialExecutors` | number (int) | Initial number of executors with dynamic allocation | `1` |
-| `spark.blacklist.enabled` | boolean | Whether executor/node blacklisting is enabled | `false` |
+| `spark.blacklist.enabled` | boolean | Whether executor/node blacklisting is enabled | `false` |
+| `files` | string | HDFS path(s) to files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/file1.py,hdfs:///Project/<project_name>/Resources/file2.txt"` | `null` |
+| `pyFiles` | string | HDFS path(s) to Python files to be provided to the Spark application. These will be added to the `PYTHONPATH` so they can be imported as modules. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/module1.py,hdfs:///Project/<project_name>/Resources/module2.py"` | `null` |
+| `jars` | string | HDFS path(s) to JAR files to be provided to the Spark application. These will be added to the classpath. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/lib1.jar,hdfs:///Project/<project_name>/Resources/lib2.jar"` | `null` |
+| `archives` | string | HDFS path(s) to archive files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/archive1.zip,hdfs:///Project/<project_name>/Resources/archive2.tar.gz"` | `null` |

 ## Accessing project data