Creates a new job for an existing AWS Glue DataBrew recipe in the current AWS account. You can create a standalone job using either a project, or a combination of a recipe and a dataset.
See also: AWS API Documentation
See ‘aws help’ for descriptions of global parameters.
create-recipe-job
[--dataset-name <value>]
[--encryption-key-arn <value>]
[--encryption-mode <value>]
--name <value>
[--log-subscription <value>]
[--max-capacity <value>]
[--max-retries <value>]
--outputs <value>
[--project-name <value>]
[--recipe-reference <value>]
--role-arn <value>
[--tags <value>]
[--timeout <value>]
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
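As a minimal sketch of the synopsis above, the following assembles a call that supplies the three required options (--name, --outputs, --role-arn) together with a recipe and dataset combination. Every resource name, the account ID in the role ARN, and the recipe version string are hypothetical placeholders, not values taken from this reference. The command is printed rather than executed so it can be reviewed without credentials.

```shell
# Hypothetical resource names; substitute your own.
JOB_NAME="sample-recipe-job"
DATASET_NAME="sample-dataset"
RECIPE_NAME="sample-recipe"
RECIPE_VERSION="1.0"
ROLE_ARN="arn:aws:iam::111122223333:role/SampleDataBrewRole"
OUTPUT_BUCKET="sample-output-bucket"

# Collect the arguments as positional parameters so each value stays one word.
# The shorthand values are quoted so the shell does not expand the braces.
set -- aws databrew create-recipe-job \
  --name "$JOB_NAME" \
  --dataset-name "$DATASET_NAME" \
  --recipe-reference "Name=$RECIPE_NAME,RecipeVersion=$RECIPE_VERSION" \
  --role-arn "$ROLE_ARN" \
  --outputs "Format=CSV,Location={Bucket=$OUTPUT_BUCKET,Key=recipe-output},Overwrite=true"

# Dry run: print the assembled command for review.
# To create the job for real, replace the next line with: "$@"
printf '%s\n' "$*"
```

For a project-based job, the --dataset-name and --recipe-reference pair would presumably be replaced by a single --project-name option.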
--dataset-name
(string)
The name of the dataset that this job processes.
--encryption-key-arn
(string)
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.
--encryption-mode
(string)
The encryption mode for the job, which can be one of the following:
SSE-KMS - Server-side encryption with AWS KMS-managed keys.
SSE-S3 - Server-side encryption with keys managed by Amazon S3.
Possible values:
SSE-KMS
SSE-S3
--name
(string)
A unique name for the job.
--log-subscription
(string)
A value that enables or disables Amazon CloudWatch logging for the current AWS account. If logging is enabled, CloudWatch writes one log stream for each job run.
Possible values:
ENABLE
DISABLE
--max-capacity
(integer)
The maximum number of nodes that DataBrew can consume when the job processes data.
--max-retries
(integer)
The maximum number of times to retry the job after a job run fails.
--outputs
(list)
One or more artifacts that represent the output from running the job.
(structure)
Represents individual output from a particular job run.
CompressionFormat -> (string)
The compression algorithm used to compress the output text of the job.
Format -> (string)
The data format of the output of the job.
PartitionColumns -> (list)
The names of one or more partition columns for the output of the job.
(string)
Location -> (structure)
The location in Amazon S3 where the job writes its output.
Bucket -> (string)
The S3 bucket name.
Key -> (string)
The unique name of the object in the bucket.
Overwrite -> (boolean)
A value that, if true, means that any data in the location specified for output is overwritten with new output.
Shorthand Syntax:
CompressionFormat=string,Format=string,PartitionColumns=string,string,Location={Bucket=string,Key=string},Overwrite=boolean ...
JSON Syntax:
[
{
"CompressionFormat": "GZIP"|"LZ4"|"SNAPPY"|"BZIP2"|"DEFLATE"|"LZO"|"BROTLI"|"ZSTD"|"ZLIB",
"Format": "CSV"|"JSON"|"PARQUET"|"GLUEPARQUET"|"AVRO"|"ORC"|"XML",
"PartitionColumns": ["string", ...],
"Location": {
"Bucket": "string",
"Key": "string"
},
"Overwrite": true|false
}
...
]
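When the output definition grows beyond a few fields, the JSON syntax above can be kept in a local file and passed with the AWS CLI's file:// prefix instead of shorthand. The sketch below writes one such file; the bucket name, key, and partition column are hypothetical, and the chosen format and compression values are only one combination from the lists above.

```shell
# Write a hypothetical output definition to a local file.
cat > outputs.json <<'EOF'
[
  {
    "CompressionFormat": "GZIP",
    "Format": "CSV",
    "PartitionColumns": ["region"],
    "Location": {
      "Bucket": "sample-output-bucket",
      "Key": "recipe-output"
    },
    "Overwrite": true
  }
]
EOF

# The file:// prefix tells the AWS CLI to read the parameter value from a file.
OUTPUTS_ARG="file://outputs.json"

# Dry run: show the option as it would appear on the create-recipe-job command line.
echo "--outputs $OUTPUTS_ARG"
```

Keeping the definition in a file also avoids shell quoting problems with the braces and commas in the shorthand form.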
--project-name
(string)
The name of an existing project to base the job on. Use this as an alternative to specifying a recipe (--recipe-reference) together with a dataset (--dataset-name).
--recipe-reference
(structure)
Represents all of the attributes of an AWS Glue DataBrew recipe.
Name -> (string)
The name of the recipe.
RecipeVersion -> (string)
The identifier for the version of the recipe.
Shorthand Syntax:
Name=string,RecipeVersion=string
JSON Syntax:
{
"Name": "string",
"RecipeVersion": "string"
}
--role-arn
(string)
The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role to be assumed for this request.
--tags
(map)
Metadata tags to apply to this job.
key -> (string)
value -> (string)
Shorthand Syntax:
KeyName1=string,KeyName2=string
JSON Syntax:
{"string": "string"
...}
--timeout
(integer)
The job’s timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.
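The optional settings described above (encryption, logging, capacity, retries, timeout, and tags) can be combined in one call. The following sketch only assembles those optional arguments; the KMS key ARN, tag names, and numeric values are hypothetical and chosen purely for illustration. They would be appended to a command that already carries the required --name, --outputs, and --role-arn options.

```shell
# Hypothetical KMS key ARN; substitute a key from your own account.
KMS_KEY_ARN="arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"

# Optional settings, collected as positional parameters.
set -- \
  --encryption-mode SSE-KMS \
  --encryption-key-arn "$KMS_KEY_ARN" \
  --log-subscription ENABLE \
  --max-capacity 5 \
  --max-retries 1 \
  --timeout 60 \
  --tags "Team=analytics,Stage=test"

# Dry run: print the optional arguments for review.
printf '%s\n' "$*"
```

With these illustrative values, a run exceeding 60 minutes would end with a status of TIMEOUT, and a failed run would be retried at most once.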
--cli-input-json | --cli-input-yaml
(string)
Reads arguments from the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, those values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. This may not be specified along with --cli-input-yaml.
--generate-cli-skeleton
(string)
Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. Similarly, if provided yaml-input it will print a sample input YAML that can be used with --cli-input-yaml. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command.
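A common way to use the two options above together is to save the skeleton to a file, fill it in, and pass it back with --cli-input-json. The sketch below writes such a file by hand; its top-level key names are assumed to match what `aws databrew create-recipe-job --generate-cli-skeleton input` prints, and all resource names are hypothetical. It also illustrates the override rule: an option given on the command line takes precedence over the value in the file.

```shell
# Hypothetical input file. The key names are assumed to mirror the skeleton
# printed by: aws databrew create-recipe-job --generate-cli-skeleton input
cat > job.json <<'EOF'
{
  "Name": "sample-recipe-job",
  "DatasetName": "sample-dataset",
  "RecipeReference": {
    "Name": "sample-recipe",
    "RecipeVersion": "1.0"
  },
  "RoleArn": "arn:aws:iam::111122223333:role/SampleDataBrewRole",
  "Outputs": [
    {
      "Format": "CSV",
      "Location": {"Bucket": "sample-output-bucket", "Key": "recipe-output"}
    }
  ],
  "MaxRetries": 0
}
EOF

# Command-line arguments override values from the file, so this call
# would use 2 retries instead of the 0 set in job.json.
set -- aws databrew create-recipe-job \
  --cli-input-json file://job.json \
  --max-retries 2

# Dry run: print the command. Replace the next line with "$@" to execute it.
printf '%s\n' "$*"
```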