[ aws . sagemaker ]

create-auto-ml-job

Description

Creates an AutoPilot job.

After you run an AutoPilot job, you can find the best-performing model and then deploy it by following the steps described in Step 6.1: Deploy the Model to Amazon SageMaker Hosting Services.

For information about how to use AutoPilot, see Use AutoPilot to Automate Model Development.

See also: AWS API Documentation

See ‘aws help’ for descriptions of global parameters.

Synopsis

  create-auto-ml-job
--auto-ml-job-name <value>
--input-data-config <value>
--output-data-config <value>
[--problem-type <value>]
[--auto-ml-job-objective <value>]
[--auto-ml-job-config <value>]
--role-arn <value>
[--generate-candidate-definitions-only | --no-generate-candidate-definitions-only]
[--tags <value>]
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
[--cli-auto-prompt <value>]

Options

--auto-ml-job-name (string)

Identifies an AutoPilot job. The name must be unique within your account and is case-insensitive.

--input-data-config (list)

Similar to InputDataConfig supported by Tuning. Format(s) supported: CSV. Minimum of 1000 rows.

(structure)

Similar to Channel. A channel is a named input source that training algorithms can consume. Refer to Channel for detailed descriptions.

DataSource -> (structure)

The data source.

S3DataSource -> (structure)

The Amazon S3 location of the input data.

Note

The input data must be in CSV format and contain at least 1000 rows.

S3DataType -> (string)

The data type.

S3Uri -> (string)

The URL to the Amazon S3 data source.

CompressionType -> (string)

You can use Gzip or None. The default value is None.

TargetAttributeName -> (string)

The name of the target variable in supervised learning, usually denoted ‘y’.

Shorthand Syntax:

DataSource={S3DataSource={S3DataType=string,S3Uri=string}},CompressionType=string,TargetAttributeName=string ...

JSON Syntax:

[
  {
    "DataSource": {
      "S3DataSource": {
        "S3DataType": "ManifestFile"|"S3Prefix",
        "S3Uri": "string"
      }
    },
    "CompressionType": "None"|"Gzip",
    "TargetAttributeName": "string"
  }
  ...
]
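
As a rough illustration of the structure above, the following Python sketch builds a single-channel value for --input-data-config and serializes it to JSON. The helper name make_input_channel, the bucket path, and the target column name are hypothetical placeholders, not part of the CLI; only the field names and allowed values come from the syntax shown above.

```python
import json

# Allowed values taken from the JSON syntax above.
ALLOWED_S3_DATA_TYPES = {"ManifestFile", "S3Prefix"}
ALLOWED_COMPRESSION = {"None", "Gzip"}


def make_input_channel(s3_uri, target, s3_data_type="S3Prefix", compression="None"):
    """Build one channel for --input-data-config (hypothetical helper)."""
    if s3_data_type not in ALLOWED_S3_DATA_TYPES:
        raise ValueError(f"unsupported S3DataType: {s3_data_type}")
    if compression not in ALLOWED_COMPRESSION:
        raise ValueError(f"unsupported CompressionType: {compression}")
    return {
        "DataSource": {
            "S3DataSource": {"S3DataType": s3_data_type, "S3Uri": s3_uri}
        },
        "CompressionType": compression,
        "TargetAttributeName": target,
    }


# Placeholder bucket and column name; replace them with your own.
input_data_config = [make_input_channel("s3://example-bucket/train/", "label")]
input_data_config_json = json.dumps(input_data_config)
print(input_data_config_json)
```

The printed string can be passed as the value of --input-data-config. The sketch does not check that the data is CSV or has at least 1000 rows; that still has to hold for the files in Amazon S3.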

--output-data-config (structure)

Similar to OutputDataConfig supported by Tuning. Format(s) supported: CSV.

KmsKeyId -> (string)

The AWS KMS encryption key ID.

S3OutputPath -> (string)

The Amazon S3 output path. Must be 128 characters or fewer.

Shorthand Syntax:

KmsKeyId=string,S3OutputPath=string

JSON Syntax:

{
  "KmsKeyId": "string",
  "S3OutputPath": "string"
}
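
A minimal sketch of preparing this structure in Python, checking the 128-character limit on S3OutputPath before the value reaches the CLI. The helper name and bucket path are hypothetical, and treating KmsKeyId as omittable is an assumption of this sketch rather than something stated above.

```python
import json

MAX_S3_OUTPUT_PATH_LENGTH = 128  # limit stated for S3OutputPath above


def make_output_data_config(s3_output_path, kms_key_id=None):
    """Build the --output-data-config structure (hypothetical helper)."""
    if len(s3_output_path) > MAX_S3_OUTPUT_PATH_LENGTH:
        raise ValueError(
            f"S3OutputPath is {len(s3_output_path)} characters; "
            f"the limit is {MAX_S3_OUTPUT_PATH_LENGTH}"
        )
    config = {"S3OutputPath": s3_output_path}
    # Assumption: KmsKeyId can be left out when no KMS key is used.
    if kms_key_id is not None:
        config["KmsKeyId"] = kms_key_id
    return config


# Placeholder bucket; replace it with your own.
output_data_config = make_output_data_config("s3://example-bucket/automl-output/")
output_data_config_json = json.dumps(output_data_config)
print(output_data_config_json)
```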

--problem-type (string)

Defines the kind of preprocessing and algorithms intended for the candidates. Options include: BinaryClassification, MulticlassClassification, and Regression.

Possible values:

  • BinaryClassification

  • MulticlassClassification

  • Regression

--auto-ml-job-objective (structure)

Defines the job’s objective. You provide a MetricName, and AutoML infers whether to minimize or maximize it. If no objective is provided, the most commonly used objective metric for the problem type is selected.

MetricName -> (string)

The name of the metric.

Shorthand Syntax:

MetricName=string

JSON Syntax:

{
  "MetricName": "Accuracy"|"MSE"|"F1"|"F1macro"
}
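
For illustration, the sketch below validates a metric name against the values listed in the JSON syntax above and prints the argument in both shorthand and JSON form. The helper name make_objective is hypothetical.

```python
import json

# Metric names listed in the JSON syntax above.
ALLOWED_METRICS = ("Accuracy", "MSE", "F1", "F1macro")


def make_objective(metric_name):
    """Build the --auto-ml-job-objective structure (hypothetical helper)."""
    if metric_name not in ALLOWED_METRICS:
        raise ValueError(
            f"unknown MetricName {metric_name!r}; expected one of {ALLOWED_METRICS}"
        )
    return {"MetricName": metric_name}


objective = make_objective("F1")
objective_shorthand = f"MetricName={objective['MetricName']}"
objective_json = json.dumps(objective)
print(objective_shorthand)
print(objective_json)
```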

--auto-ml-job-config (structure)

Contains CompletionCriteria and SecurityConfig.

CompletionCriteria -> (structure)

How long a job is allowed to run, or how many candidates a job is allowed to generate.

MaxCandidates -> (integer)

The maximum number of times a training job is allowed to run.

MaxRuntimePerTrainingJobInSeconds -> (integer)

The maximum time, in seconds, that a single training job is allowed to run.

MaxAutoMLJobRuntimeInSeconds -> (integer)

The maximum time, in seconds, an AutoML job is allowed to wait for a trial to complete. It must be equal to or greater than MaxRuntimePerTrainingJobInSeconds.

SecurityConfig -> (structure)

Security configuration for traffic encryption or Amazon VPC settings.

VolumeKmsKeyId -> (string)

The key used to encrypt stored data.

EnableInterContainerTrafficEncryption -> (boolean)

Whether to use traffic encryption between the container layers.

VpcConfig -> (structure)

VPC configuration.

SecurityGroupIds -> (list)

The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

(string)

Subnets -> (list)

The IDs of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

(string)

JSON Syntax:

{
  "CompletionCriteria": {
    "MaxCandidates": integer,
    "MaxRuntimePerTrainingJobInSeconds": integer,
    "MaxAutoMLJobRuntimeInSeconds": integer
  },
  "SecurityConfig": {
    "VolumeKmsKeyId": "string",
    "EnableInterContainerTrafficEncryption": true|false,
    "VpcConfig": {
      "SecurityGroupIds": ["string", ...],
      "Subnets": ["string", ...]
    }
  }
}
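
The constraint that MaxAutoMLJobRuntimeInSeconds must be equal to or greater than MaxRuntimePerTrainingJobInSeconds can be checked before the request is sent. The following is a sketch only: the helper name and the numeric values are hypothetical, and it assumes every field of the structure may be omitted.

```python
import json


def make_job_config(max_candidates=None, max_runtime_per_training_job=None,
                    max_automl_job_runtime=None, security_config=None):
    """Build the --auto-ml-job-config structure (hypothetical helper)."""
    # Enforce the ordering constraint described for CompletionCriteria.
    if (max_runtime_per_training_job is not None
            and max_automl_job_runtime is not None
            and max_automl_job_runtime < max_runtime_per_training_job):
        raise ValueError(
            "MaxAutoMLJobRuntimeInSeconds must be equal to or greater than "
            "MaxRuntimePerTrainingJobInSeconds"
        )
    criteria = {}
    if max_candidates is not None:
        criteria["MaxCandidates"] = max_candidates
    if max_runtime_per_training_job is not None:
        criteria["MaxRuntimePerTrainingJobInSeconds"] = max_runtime_per_training_job
    if max_automl_job_runtime is not None:
        criteria["MaxAutoMLJobRuntimeInSeconds"] = max_automl_job_runtime

    config = {}
    if criteria:
        config["CompletionCriteria"] = criteria
    if security_config:
        config["SecurityConfig"] = security_config
    return config


# Illustrative limits: 10 candidates, 1 hour per training job, 4 hours overall.
job_config = make_job_config(
    max_candidates=10,
    max_runtime_per_training_job=3600,
    max_automl_job_runtime=14400,
    security_config={"EnableInterContainerTrafficEncryption": True},
)
job_config_json = json.dumps(job_config)
print(job_config_json)
```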

--role-arn (string)

The ARN of the IAM role that is used to access the data.

--generate-candidate-definitions-only | --no-generate-candidate-definitions-only (boolean)

Generates possible candidates without training a model. A candidate is a combination of data preprocessors, algorithms, and algorithm parameter settings.

--tags (list)

Each tag consists of a key and an optional value. Tag keys must be unique per resource.

(structure)

Describes a tag.

Key -> (string)

The tag key.

Value -> (string)

The tag value.

Shorthand Syntax:

Key=string,Value=string ...

JSON Syntax:

[
  {
    "Key": "string",
    "Value": "string"
  }
  ...
]
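
As a small sketch of the two syntaxes above, the following Python code turns key/value pairs into the tag list, rejects duplicate keys, and renders both the JSON and the shorthand form. The helper names and tag values are hypothetical, and the shorthand rendering assumes values contain no spaces or commas.

```python
import json


def make_tags(pairs):
    """Build the --tags list from (key, value) pairs (hypothetical helper)."""
    seen = set()
    tags = []
    for key, value in pairs:
        # Tag keys must be unique per resource.
        if key in seen:
            raise ValueError(f"duplicate tag key: {key}")
        seen.add(key)
        tags.append({"Key": key, "Value": value})
    return tags


def tags_to_shorthand(tags):
    """Render tags as 'Key=...,Value=...' items separated by spaces."""
    return " ".join(f"Key={t['Key']},Value={t['Value']}" for t in tags)


tags = make_tags([("project", "churn"), ("stage", "dev")])
tags_json = json.dumps(tags)
tags_shorthand = tags_to_shorthand(tags)
print(tags_json)
print(tags_shorthand)
```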

--cli-input-json | --cli-input-yaml (string)

Reads arguments from the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, those values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. This may not be specified along with --cli-input-yaml.

--generate-cli-skeleton (string)

Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. Similarly, if provided yaml-input it will print a sample input YAML that can be used with --cli-input-yaml. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command.

--cli-auto-prompt (boolean)

Automatically prompt for CLI input parameters.
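
To show how the options fit together, here is a sketch that assembles a complete request document for --cli-input-json. The top-level key names are assumed to match what --generate-cli-skeleton prints for this command, and every value (job name, bucket, column name, account ID, role name) is a made-up placeholder.

```python
import json

# All values below are placeholders for illustration only.
request = {
    "AutoMLJobName": "example-automl-job",
    "InputDataConfig": [
        {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://example-bucket/train/",
                }
            },
            "CompressionType": "None",
            "TargetAttributeName": "label",
        }
    ],
    "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/automl-output/"},
    "ProblemType": "BinaryClassification",
    "AutoMLJobObjective": {"MetricName": "F1"},
    "RoleArn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    "GenerateCandidateDefinitionsOnly": False,
}

request_json = json.dumps(request, indent=2)
print(request_json)
```

If the printed JSON is saved to a file such as create-auto-ml-job.json (a hypothetical name), it can be supplied as --cli-input-json file://create-auto-ml-job.json. Any option given on the command line still overrides the corresponding value in the file.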

Output

AutoMLJobArn -> (string)

When a job is created, it is assigned a unique ARN.
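
A short sketch of reading this output in Python. The sample response is invented for illustration: the account ID, Region, and job name are placeholders, and the automl-job/ resource prefix in the ARN is an assumption rather than something this page specifies.

```python
import json

# Invented sample of the command's JSON output (placeholder ARN).
cli_output = (
    '{"AutoMLJobArn": '
    '"arn:aws:sagemaker:us-east-1:123456789012:automl-job/example-automl-job"}'
)

response = json.loads(cli_output)
job_arn = response["AutoMLJobArn"]

# Assumption: the job name is the part of the ARN after the last "/".
job_name = job_arn.rsplit("/", 1)[-1]

print(job_arn)
print(job_name)
```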