Returns the description of an endpoint.
See also: AWS API Documentation
See ‘aws help’ for descriptions of global parameters.
describe-endpoint
--endpoint-name <value>
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
--endpoint-name
(string)
The name of the endpoint.
--cli-input-json | --cli-input-yaml
(string)
Reads arguments from the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, those values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. This may not be specified along with --cli-input-yaml.
--generate-cli-skeleton
(string)
Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. Similarly, if provided yaml-input, it will print a sample input YAML that can be used with --cli-input-yaml. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command.
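As a minimal sketch of how --cli-input-json might be used from a script, the Python below builds the JSON input and the argument list for the command. The `sagemaker` service prefix, the endpoint name `my-endpoint`, and the helper name `build_describe_endpoint_command` are assumptions made for illustration; the sketch only constructs the command and does not run it.

```python
import json


def build_describe_endpoint_command(endpoint_name):
    """Return an argv list that passes the input through --cli-input-json."""
    # EndpointName is the input key that corresponds to --endpoint-name.
    cli_input = json.dumps({"EndpointName": endpoint_name})
    # Assumption: the command lives under the "sagemaker" service prefix.
    return ["aws", "sagemaker", "describe-endpoint", "--cli-input-json", cli_input]


# "my-endpoint" is a hypothetical endpoint name.
command = build_describe_endpoint_command("my-endpoint")
print(command)
```

Building the argument as a list (rather than one shell string) avoids quoting problems when the JSON contains characters the shell would interpret.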
EndpointName -> (string)
Name of the endpoint.
EndpointArn -> (string)
The Amazon Resource Name (ARN) of the endpoint.
EndpointConfigName -> (string)
The name of the endpoint configuration associated with this endpoint.
ProductionVariants -> (list)
An array of ProductionVariantSummary objects, one for each model hosted behind this endpoint.
(structure)
Describes weight and capacities for a production variant associated with an endpoint. If you sent a request to the UpdateEndpointWeightsAndCapacities API and the endpoint status is Updating, you get different desired and current values.
VariantName -> (string)
The name of the variant.
DeployedImages -> (list)
An array of DeployedImage objects that specify the Amazon EC2 Container Registry paths of the inference images deployed on instances of this ProductionVariant.
(structure)
Gets the Amazon EC2 Container Registry path of the Docker image of the model that is hosted in this ProductionVariant.
If you used the registry/repository[:tag] form to specify the image path of the primary container when you created the model hosted in this ProductionVariant, the path resolves to a path of the form registry/repository[@digest]. A digest is a hash value that identifies a specific version of an image. For information about Amazon ECR paths, see Pulling an Image in the Amazon ECR User Guide.
SpecifiedImage -> (string)
The image path you specified when you created the model.
ResolvedImage -> (string)
The specific digest path of the image hosted in this ProductionVariant.
ResolutionTime -> (timestamp)
The date and time when the image path for the model resolved to the ResolvedImage.
CurrentWeight -> (float)
The weight associated with the variant.
DesiredWeight -> (float)
The requested weight, as specified in the UpdateEndpointWeightsAndCapacities request.
CurrentInstanceCount -> (integer)
The number of instances associated with the variant.
DesiredInstanceCount -> (integer)
The number of instances requested in the UpdateEndpointWeightsAndCapacities request.
VariantStatus -> (list)
The endpoint variant status which describes the current deployment stage status or operational status.
(structure)
Describes the status of the production variant.
Status -> (string)
The endpoint variant status which describes the current deployment stage status or operational status.
Creating: Creating inference resources for the production variant.
Deleting: Terminating inference resources for the production variant.
Updating: Updating capacity for the production variant.
ActivatingTraffic: Turning on traffic for the production variant.
Baking: Waiting period to monitor the CloudWatch alarms in the automatic rollback configuration.
StatusMessage -> (string)
A message that describes the status of the production variant.
StartTime -> (timestamp)
The start time of the current status change.
CurrentServerlessConfig -> (structure)
The serverless configuration for the endpoint.
MemorySizeInMB -> (integer)
The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
MaxConcurrency -> (integer)
The maximum number of concurrent invocations your serverless endpoint can process.
DesiredServerlessConfig -> (structure)
The serverless configuration requested for the endpoint update.
MemorySizeInMB -> (integer)
The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
MaxConcurrency -> (integer)
The maximum number of concurrent invocations your serverless endpoint can process.
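The current and desired fields of each ProductionVariants entry can be compared to see which variants are still converging. The following is a minimal sketch that works on the parsed JSON output of the command; the helper name `variants_with_pending_changes` and the sample values are hypothetical.

```python
def variants_with_pending_changes(response):
    """Return names of variants whose desired values differ from current ones."""
    changed = []
    for variant in response.get("ProductionVariants", []):
        weight_differs = variant.get("DesiredWeight") != variant.get("CurrentWeight")
        count_differs = (
            variant.get("DesiredInstanceCount") != variant.get("CurrentInstanceCount")
        )
        if weight_differs or count_differs:
            changed.append(variant["VariantName"])
    return changed


# Hypothetical response fragment, for illustration only.
sample_variants = {
    "ProductionVariants": [
        {"VariantName": "variant-a", "CurrentWeight": 1.0, "DesiredWeight": 1.0,
         "CurrentInstanceCount": 2, "DesiredInstanceCount": 4},
        {"VariantName": "variant-b", "CurrentWeight": 1.0, "DesiredWeight": 1.0,
         "CurrentInstanceCount": 1, "DesiredInstanceCount": 1},
    ]
}
print(variants_with_pending_changes(sample_variants))  # ['variant-a']
```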
DataCaptureConfig -> (structure)
The currently active data capture configuration used by your Endpoint.
EnableCapture -> (boolean)
Whether data capture is enabled or disabled.
CaptureStatus -> (string)
Whether data capture is currently functional.
CurrentSamplingPercentage -> (integer)
The percentage of requests being captured by your Endpoint.
DestinationS3Uri -> (string)
The Amazon S3 location being used to capture the data.
KmsKeyId -> (string)
The KMS key being used to encrypt the data in Amazon S3.
EndpointStatus -> (string)
The status of the endpoint.
OutOfService: Endpoint is not available to take incoming requests.
Creating: CreateEndpoint is executing.
Updating: UpdateEndpoint or UpdateEndpointWeightsAndCapacities is executing.
SystemUpdating: Endpoint is undergoing maintenance and cannot be updated, deleted, or re-scaled until it has completed. This maintenance operation does not change any customer-specified values such as VPC config, KMS encryption, model, instance type, or instance count.
RollingBack: Endpoint fails to scale up or down or change its variant weight and is in the process of rolling back to its previous configuration. Once the rollback completes, the endpoint returns to an InService status. This transitional status only applies to an endpoint that has autoscaling enabled and is undergoing variant weight or capacity changes as part of an UpdateEndpointWeightsAndCapacities call or when the UpdateEndpointWeightsAndCapacities operation is called explicitly.
InService: Endpoint is available to process incoming requests.
Deleting: DeleteEndpoint is executing.
Failed: Endpoint could not be created, updated, or re-scaled. Use the FailureReason field for information about the failure. DeleteEndpoint is the only operation that can be performed on a failed endpoint.
FailureReason -> (string)
If the status of the endpoint is Failed, the reason why it failed.
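A script that consumes this output will usually branch on EndpointStatus. The sketch below shows one possible way to do that on the parsed JSON output; the grouping of statuses into `TRANSITIONAL`, the helper name `summarize_status`, and the returned strings are assumptions for illustration, not part of the command.

```python
# Assumption: these EndpointStatus values are treated as "still changing".
TRANSITIONAL = {"Creating", "Updating", "SystemUpdating", "RollingBack", "Deleting"}


def summarize_status(response):
    """Return a short human-readable summary of the endpoint status."""
    status = response["EndpointStatus"]
    if status == "InService":
        return "ready"
    if status == "Failed":
        # FailureReason is only meaningful when the status is Failed.
        return "failed: " + response.get("FailureReason", "no reason reported")
    if status in TRANSITIONAL:
        return "in progress (" + status + ")"
    # OutOfService and any value not handled above end up here.
    return "unavailable (" + status + ")"


# Hypothetical response fragments, for illustration only.
print(summarize_status({"EndpointStatus": "Updating"}))
print(summarize_status({"EndpointStatus": "Failed", "FailureReason": "example reason"}))
```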
CreationTime -> (timestamp)
A timestamp that shows when the endpoint was created.
LastModifiedTime -> (timestamp)
A timestamp that shows when the endpoint was last modified.
LastDeploymentConfig -> (structure)
The most recent deployment configuration for the endpoint.
BlueGreenUpdatePolicy -> (structure)
Update policy for a blue/green deployment. If this update policy is specified, SageMaker creates a new fleet during the deployment while maintaining the old fleet. SageMaker flips traffic to the new fleet according to the specified traffic routing configuration. Only one update policy should be used in the deployment configuration. If no update policy is specified, SageMaker uses a blue/green deployment strategy with all at once traffic shifting by default.
TrafficRoutingConfiguration -> (structure)
Defines the traffic routing strategy to shift traffic from the old fleet to the new fleet during an endpoint deployment.
Type -> (string)
Traffic routing strategy type.
ALL_AT_ONCE: Endpoint traffic shifts to the new fleet in a single step.
CANARY: Endpoint traffic shifts to the new fleet in two steps. The first step is the canary, which is a small portion of the traffic. The second step is the remainder of the traffic.
LINEAR: Endpoint traffic shifts to the new fleet in n steps of a configurable size.
WaitIntervalInSeconds -> (integer)
The waiting time (in seconds) between incremental steps to turn on traffic on the new endpoint fleet.
CanarySize -> (structure)
Batch size for the first step to turn on traffic on the new endpoint fleet. Value must be less than or equal to 50% of the variant’s total instance count.
Type -> (string)
Specifies the endpoint capacity type.
INSTANCE_COUNT: The endpoint activates based on the number of instances.
CAPACITY_PERCENT: The endpoint activates based on the specified percentage of capacity.
Value -> (integer)
Defines the capacity size, either as a number of instances or a capacity percentage.
LinearStepSize -> (structure)
Batch size for each step to turn on traffic on the new endpoint fleet. Value must be 10-50% of the variant’s total instance count.
Type -> (string)
Specifies the endpoint capacity type.
INSTANCE_COUNT: The endpoint activates based on the number of instances.
CAPACITY_PERCENT: The endpoint activates based on the specified percentage of capacity.
Value -> (integer)
Defines the capacity size, either as a number of instances or a capacity percentage.
TerminationWaitInSeconds -> (integer)
Additional waiting time in seconds after the completion of an endpoint deployment before terminating the old endpoint fleet. Default is 0.
MaximumExecutionTimeoutInSeconds -> (integer)
Maximum execution timeout for the deployment. Note that the timeout value should be larger than the total waiting time specified in TerminationWaitInSeconds and WaitIntervalInSeconds.
AutoRollbackConfiguration -> (structure)
Automatic rollback configuration for handling endpoint deployment failures and recovery.
Alarms -> (list)
List of CloudWatch alarms in your account that are configured to monitor metrics on an endpoint. If any alarms are tripped during a deployment, SageMaker rolls back the deployment.
(structure)
An Amazon CloudWatch alarm configured to monitor metrics on an endpoint.
AlarmName -> (string)
The name of a CloudWatch alarm in your account.
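The note on MaximumExecutionTimeoutInSeconds can be turned into a simple sanity check on a returned BlueGreenUpdatePolicy. This is a rough sketch only: it assumes the total waiting time can be approximated as TerminationWaitInSeconds plus a single WaitIntervalInSeconds, which may understate the wait for LINEAR routing with several steps. The helper name `timeout_covers_waits` and the sample values are hypothetical.

```python
def timeout_covers_waits(blue_green_policy):
    """Check that the execution timeout exceeds the configured waiting time.

    Assumption: total waiting time is approximated as
    TerminationWaitInSeconds + WaitIntervalInSeconds (one interval only).
    """
    timeout = blue_green_policy.get("MaximumExecutionTimeoutInSeconds")
    if timeout is None:
        return True  # nothing to check when no timeout is present
    routing = blue_green_policy.get("TrafficRoutingConfiguration", {})
    total_wait = (
        blue_green_policy.get("TerminationWaitInSeconds", 0)  # default is 0
        + routing.get("WaitIntervalInSeconds", 0)
    )
    return timeout > total_wait


# Hypothetical policy fragment, for illustration only.
sample_policy = {
    "TrafficRoutingConfiguration": {
        "Type": "CANARY",
        "WaitIntervalInSeconds": 300,
        "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": 10},
    },
    "TerminationWaitInSeconds": 120,
    "MaximumExecutionTimeoutInSeconds": 1800,
}
print(timeout_covers_waits(sample_policy))  # True
```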
AsyncInferenceConfig -> (structure)
Returns the description of an endpoint configuration created using the CreateEndpointConfig API (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html).
ClientConfig -> (structure)
Configures the behavior of the client used by SageMaker to interact with the model container during asynchronous inference.
MaxConcurrentInvocationsPerInstance -> (integer)
The maximum number of concurrent requests sent by the SageMaker client to the model container. If no value is provided, SageMaker chooses an optimal value.
OutputConfig -> (structure)
Specifies the configuration for asynchronous inference invocation outputs.
KmsKeyId -> (string)
The Amazon Web Services Key Management Service (Amazon Web Services KMS) key that SageMaker uses to encrypt the asynchronous inference output in Amazon S3.
S3OutputPath -> (string)
The Amazon S3 location to upload inference responses to.
NotificationConfig -> (structure)
Specifies the configuration for notifications of inference results for asynchronous inference.
SuccessTopic -> (string)
Amazon SNS topic to post a notification to when inference completes successfully. If no topic is provided, no notification is sent on success.
ErrorTopic -> (string)
Amazon SNS topic to post a notification to when inference fails. If no topic is provided, no notification is sent on failure.
PendingDeploymentSummary -> (structure)
Returns the summary of an in-progress deployment. This field is only returned when the endpoint is creating or updating with a new endpoint configuration.
EndpointConfigName -> (string)
The name of the endpoint configuration used in the deployment.
ProductionVariants -> (list)
List of PendingProductionVariantSummary objects.
(structure)
The production variant summary for a deployment when an endpoint is creating or updating with the CreateEndpoint or UpdateEndpoint operations. Describes the VariantStatus, weight and capacity for a production variant associated with an endpoint.
VariantName -> (string)
The name of the variant.
DeployedImages -> (list)
An array of DeployedImage objects that specify the Amazon EC2 Container Registry paths of the inference images deployed on instances of this ProductionVariant.
(structure)
Gets the Amazon EC2 Container Registry path of the Docker image of the model that is hosted in this ProductionVariant.
If you used the registry/repository[:tag] form to specify the image path of the primary container when you created the model hosted in this ProductionVariant, the path resolves to a path of the form registry/repository[@digest]. A digest is a hash value that identifies a specific version of an image. For information about Amazon ECR paths, see Pulling an Image in the Amazon ECR User Guide.
SpecifiedImage -> (string)
The image path you specified when you created the model.
ResolvedImage -> (string)
The specific digest path of the image hosted in this ProductionVariant.
ResolutionTime -> (timestamp)
The date and time when the image path for the model resolved to the ResolvedImage.
CurrentWeight -> (float)
The weight associated with the variant.
DesiredWeight -> (float)
The requested weight for the variant in this deployment, as specified in the endpoint configuration for the endpoint. The value is taken from the request to the CreateEndpointConfig operation.
CurrentInstanceCount -> (integer)
The number of instances associated with the variant.
DesiredInstanceCount -> (integer)
The number of instances requested in this deployment, as specified in the endpoint configuration for the endpoint. The value is taken from the request to the CreateEndpointConfig operation.
InstanceType -> (string)
The type of instances associated with the variant.
AcceleratorType -> (string)
The size of the Elastic Inference (EI) instance to use for the production variant. EI instances provide on-demand GPU computing for inference. For more information, see Using Elastic Inference in Amazon SageMaker.
VariantStatus -> (list)
The endpoint variant status which describes the current deployment stage status or operational status.
(structure)
Describes the status of the production variant.
Status -> (string)
The endpoint variant status which describes the current deployment stage status or operational status.
Creating: Creating inference resources for the production variant.
Deleting: Terminating inference resources for the production variant.
Updating: Updating capacity for the production variant.
ActivatingTraffic: Turning on traffic for the production variant.
Baking: Waiting period to monitor the CloudWatch alarms in the automatic rollback configuration.
StatusMessage -> (string)
A message that describes the status of the production variant.
StartTime -> (timestamp)
The start time of the current status change.
CurrentServerlessConfig -> (structure)
The serverless configuration for the endpoint.
MemorySizeInMB -> (integer)
The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
MaxConcurrency -> (integer)
The maximum number of concurrent invocations your serverless endpoint can process.
DesiredServerlessConfig -> (structure)
The serverless configuration requested for this deployment, as specified in the endpoint configuration for the endpoint.
MemorySizeInMB -> (integer)
The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
MaxConcurrency -> (integer)
The maximum number of concurrent invocations your serverless endpoint can process.
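The valid MemorySizeInMB values listed for the serverless configurations follow a regular 1 GB step, so they can be generated rather than hard-coded. A small sketch, with `VALID_MEMORY_SIZES_MB` and `is_valid_memory_size` as hypothetical names:

```python
# 1024 MB to 6144 MB inclusive, in 1 GB (1024 MB) increments.
VALID_MEMORY_SIZES_MB = tuple(range(1024, 6144 + 1, 1024))


def is_valid_memory_size(memory_size_in_mb):
    """Return True if the value is one of the documented memory sizes."""
    return memory_size_in_mb in VALID_MEMORY_SIZES_MB


print(VALID_MEMORY_SIZES_MB)  # (1024, 2048, 3072, 4096, 5120, 6144)
print(is_valid_memory_size(3072))  # True
print(is_valid_memory_size(1500))  # False
```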
StartTime -> (timestamp)
The start time of the deployment.
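Because PendingDeploymentSummary is only returned while the endpoint is creating or updating with a new endpoint configuration, code that reads it should tolerate its absence. The following sketch collects the status values of each pending variant from the parsed JSON output; the helper name `pending_variant_statuses` and the sample data are hypothetical.

```python
def pending_variant_statuses(response):
    """Map each pending variant name to its list of Status strings."""
    pending = response.get("PendingDeploymentSummary")
    if pending is None:
        return {}  # no deployment in progress, so the field is absent
    return {
        variant["VariantName"]: [
            entry["Status"] for entry in variant.get("VariantStatus", [])
        ]
        for variant in pending.get("ProductionVariants", [])
    }


# Hypothetical response fragment, for illustration only.
sample_pending = {
    "PendingDeploymentSummary": {
        "EndpointConfigName": "example-config",
        "ProductionVariants": [
            {"VariantName": "variant-a",
             "VariantStatus": [{"Status": "Creating"}]},
        ],
    }
}
print(pending_variant_statuses(sample_pending))  # {'variant-a': ['Creating']}
```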