[ aws . entityresolution ]
Creates a MatchingWorkflow object, which stores the configuration of the data processing job to be run. A MatchingWorkflow with the same name must not already exist. To modify an existing workflow, use the UpdateMatchingWorkflow API.
See also: AWS API Documentation
create-matching-workflow uses document type values. Document types follow the JSON data model, where valid values are strings, numbers, booleans, null, arrays, and objects. For command input, options and nested parameters that are labeled with the type document must be provided as JSON. Shorthand syntax does not support document types.
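As a rough illustration of the JSON requirement for document types, the following Python sketch serializes a document-type value to JSON text before it is placed on the command line. The helper name to_document_json and every key in the sample are hypothetical; the real keys of a document such as providerConfiguration depend on the provider service.

```python
import json


def to_document_json(value):
    """Serialize a document-type value to JSON text.

    Raises TypeError for values the json module cannot encode
    (for example a set) and ValueError for NaN or Infinity,
    which are not valid JSON numbers.
    """
    return json.dumps(value, allow_nan=False)


# Hypothetical document; every key and value here is a placeholder.
provider_configuration = {
    "exampleKey": "exampleValue",
    "enabled": True,
    "limits": [1, 2.5, None],
}

document_text = to_document_json(provider_configuration)
print(document_text)
```

The resulting text is what would be supplied wherever a document value is expected; shorthand syntax is not an option for these fields.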
create-matching-workflow
[--description <value>]
[--incremental-run-config <value>]
--input-source-config <value>
--output-source-config <value>
--resolution-techniques <value>
--role-arn <value>
[--tags <value>]
--workflow-name <value>
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
[--debug]
[--endpoint-url <value>]
[--no-verify-ssl]
[--no-paginate]
[--output <value>]
[--query <value>]
[--profile <value>]
[--region <value>]
[--version <value>]
[--color <value>]
[--no-sign-request]
[--ca-bundle <value>]
[--cli-read-timeout <value>]
[--cli-connect-timeout <value>]
[--cli-binary-format <value>]
[--no-cli-pager]
[--cli-auto-prompt]
[--no-cli-auto-prompt]
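The synopsis shows five options without brackets, which marks them as required. A minimal Python sketch, assuming a hypothetical helper build_command and placeholder values throughout, assembles an argument list and refuses to proceed when a required option is missing. It only prints the command and does not run it.

```python
import json
import shlex

# Options that appear without brackets in the synopsis.
REQUIRED_OPTIONS = (
    "--input-source-config",
    "--output-source-config",
    "--resolution-techniques",
    "--role-arn",
    "--workflow-name",
)


def build_command(options):
    """Return an argv list for create-matching-workflow.

    `options` maps option names to values. Dicts and lists are
    serialized to JSON; strings are passed through unchanged.
    """
    missing = [name for name in REQUIRED_OPTIONS if name not in options]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))
    argv = ["aws", "entityresolution", "create-matching-workflow"]
    for name, value in options.items():
        argv.append(name)
        argv.append(value if isinstance(value, str) else json.dumps(value))
    return argv


# All ARNs, names, and paths below are placeholders.
argv = build_command({
    "--workflow-name": "example-workflow",
    "--role-arn": "arn:aws:iam::123456789012:role/ExampleRole",
    "--input-source-config": [{
        "applyNormalization": True,
        "inputSourceARN": "arn:aws:glue:us-east-1:123456789012:table/example_db/example_table",
        "schemaName": "example-schema",
    }],
    "--output-source-config": [{
        "applyNormalization": True,
        "output": [{"hashed": False, "name": "email"}],
        "outputS3Path": "s3://amzn-s3-demo-bucket/output/",
    }],
    "--resolution-techniques": {"resolutionType": "ML_MATCHING"},
})

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```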
--description
(string)
A description of the workflow.
--incremental-run-config
(structure)
An object which defines an incremental run type and has only incrementalRunType as a field.
incrementalRunType -> (string)
The type of incremental run. It takes only one value: IMMEDIATE.
Shorthand Syntax:
incrementalRunType=string
JSON Syntax:
{
"incrementalRunType": "IMMEDIATE"
}
--input-source-config
(list)
A list of InputSource objects, which have the fields InputSourceARN and SchemaName.
(structure)
An object containing InputSourceARN, SchemaName, and ApplyNormalization.
applyNormalization -> (boolean)
Normalizes the attributes defined in the schema in the input data. For example, if an attribute has an AttributeType of PHONE_NUMBER, and the data in the input table is in the format 1234567890, Entity Resolution will normalize this field in the output to (123)-456-7890.
inputSourceARN -> (string)
A Glue table ARN for the input source table.
schemaName -> (string)
The name of the schema to be retrieved.
Shorthand Syntax:
applyNormalization=boolean,inputSourceARN=string,schemaName=string ...
JSON Syntax:
[
{
"applyNormalization": true|false,
"inputSourceARN": "string",
"schemaName": "string"
}
...
]
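The JSON form of --input-source-config can be generated rather than typed by hand. The sketch below is one way to do that in Python; the helper name input_source, the Glue table ARN, and the schema name are placeholders chosen for illustration.

```python
import json


def input_source(input_source_arn, schema_name, apply_normalization):
    """Build one InputSource entry using the field names from the JSON syntax."""
    return {
        "applyNormalization": bool(apply_normalization),
        "inputSourceARN": input_source_arn,
        "schemaName": schema_name,
    }


# Placeholder ARN and schema name; substitute your own table and schema.
input_source_config = [
    input_source(
        "arn:aws:glue:us-east-1:123456789012:table/example_db/example_table",
        "example-schema",
        True,
    ),
]

input_source_json = json.dumps(input_source_config)
print(input_source_json)
```

The printed text is a JSON list, so it can be passed as the value of --input-source-config in place of the shorthand form.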
--output-source-config
(list)
A list of OutputSource objects, each of which contains the fields OutputS3Path, ApplyNormalization, and Output.
(structure)
An OutputSource object, which contains the fields KMSArn, ApplyNormalization, Output, and OutputS3Path.
KMSArn -> (string)
Customer KMS ARN for encryption at rest. If not provided, the system will use an Entity Resolution managed KMS key.
applyNormalization -> (boolean)
Normalizes the attributes defined in the schema in the input data. For example, if an attribute has an AttributeType of PHONE_NUMBER, and the data in the input table is in the format 1234567890, Entity Resolution will normalize this field in the output to (123)-456-7890.
output -> (list)
A list of OutputAttribute objects, each of which has the fields Name and Hashed. Each of these objects selects a column to be included in the output table, and whether the values of the column should be hashed.
(structure)
An OutputAttribute object, which selects one column to be included in the output table and whether the values of that column should be hashed.
hashed -> (boolean)
Enables the ability to hash the column values in the output.
name -> (string)
A name of a column to be written to the output. This must be an InputField name in the schema mapping.
outputS3Path -> (string)
The S3 path to which Entity Resolution will write the output table.
Shorthand Syntax:
KMSArn=string,applyNormalization=boolean,output=[{hashed=boolean,name=string},{hashed=boolean,name=string}],outputS3Path=string ...
JSON Syntax:
[
{
"KMSArn": "string",
"applyNormalization": true|false,
"output": [
{
"hashed": true|false,
"name": "string"
}
...
],
"outputS3Path": "string"
}
...
]
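The nested output list is the part of --output-source-config that is easiest to get wrong in shorthand. A small Python sketch, assuming a hypothetical helper output_source and placeholder column names and paths, builds the structure and leaves KMSArn out when no key is supplied, in which case the description above says an Entity Resolution managed key is used.

```python
import json


def output_source(output_s3_path, columns, apply_normalization, kms_arn=None):
    """Build one OutputSource entry.

    `columns` is a list of (name, hashed) pairs, one per output column.
    """
    entry = {
        "applyNormalization": bool(apply_normalization),
        "output": [{"hashed": bool(hashed), "name": name} for name, hashed in columns],
        "outputS3Path": output_s3_path,
    }
    if kms_arn is not None:
        entry["KMSArn"] = kms_arn
    return entry


# Placeholder bucket and column names.
output_source_config = [
    output_source(
        "s3://amzn-s3-demo-bucket/output/",
        [("email", True), ("full_name", False)],
        apply_normalization=True,
    ),
]

output_source_json = json.dumps(output_source_config)
print(output_source_json)
```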
--resolution-techniques
(structure)
An object which defines the resolutionType and the ruleBasedProperties.
providerProperties -> (structure)
The properties of the provider service.
intermediateSourceConfiguration -> (structure)
The Amazon S3 location that temporarily stores your data while it processes. Your information won’t be saved permanently.
intermediateS3Path -> (string)
The Amazon S3 location (bucket and prefix). For example: s3://provider_bucket/DOC-EXAMPLE-BUCKET
providerConfiguration -> (document)
The required configuration fields to use with the provider service.
providerServiceArn -> (string)
The ARN of the provider service.
resolutionType -> (string)
The type of matching. There are three types of matching: RULE_MATCHING, ML_MATCHING, and PROVIDER.
ruleBasedProperties -> (structure)
An object which defines the list of matching rules to run and has a field Rules, which is a list of rule objects.
attributeMatchingModel -> (string)
The comparison type. You can choose either ONE_TO_ONE or MANY_TO_MANY as the AttributeMatchingModel. When choosing MANY_TO_MANY, the system can match attributes across the sub-types of an attribute type. For example, if the value of the Email field of Profile A and the value of the BusinessEmail field of Profile B match, the two profiles are matched on the Email type. When choosing ONE_TO_ONE, the system can only match if the sub-types are exact matches. For example, the two profiles are matched on the Email type only when the value of the Email field of Profile A and the value of the Email field of Profile B match.
rules -> (list)
A list of Rule objects, each of which has the fields RuleName and MatchingKeys.
(structure)
An object containing RuleName and MatchingKeys.
matchingKeys -> (list)
A list of MatchingKeys. The MatchingKeys must have been defined in the SchemaMapping. Two records are considered to match according to this rule if all of the MatchingKeys match.
(string)
ruleName -> (string)
A name for the matching rule.
JSON Syntax:
{
"providerProperties": {
"intermediateSourceConfiguration": {
"intermediateS3Path": "string"
},
"providerConfiguration": {...},
"providerServiceArn": "string"
},
"resolutionType": "RULE_MATCHING"|"ML_MATCHING"|"PROVIDER",
"ruleBasedProperties": {
"attributeMatchingModel": "ONE_TO_ONE"|"MANY_TO_MANY",
"rules": [
{
"matchingKeys": ["string", ...],
"ruleName": "string"
}
...
]
}
}
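For a rule-based workflow, the --resolution-techniques value can be assembled as in the following Python sketch. The helper name rule_based_techniques, the rule names, and the matching keys are hypothetical. Leaving out providerProperties for RULE_MATCHING is an assumption made here for brevity, not something this page states.

```python
import json

# The two values shown for attributeMatchingModel in the JSON syntax.
ATTRIBUTE_MATCHING_MODELS = ("ONE_TO_ONE", "MANY_TO_MANY")


def rule_based_techniques(attribute_matching_model, rules):
    """Build a resolutionTechniques object for RULE_MATCHING.

    `rules` is a list of (rule_name, matching_keys) pairs.
    """
    if attribute_matching_model not in ATTRIBUTE_MATCHING_MODELS:
        raise ValueError("unknown attributeMatchingModel: " + attribute_matching_model)
    return {
        "resolutionType": "RULE_MATCHING",
        "ruleBasedProperties": {
            "attributeMatchingModel": attribute_matching_model,
            "rules": [
                {"matchingKeys": list(keys), "ruleName": name}
                for name, keys in rules
            ],
        },
    }


# Placeholder rule names and matching keys; real keys must exist in the schema mapping.
resolution_techniques = rule_based_techniques(
    "ONE_TO_ONE",
    [("ExampleRule1", ["Email", "Phone"]), ("ExampleRule2", ["Email"])],
)

resolution_techniques_json = json.dumps(resolution_techniques)
print(resolution_techniques_json)
```

Only the JSON form is listed for this option, so the serialized text is what would be passed as its value.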
--role-arn
(string)
The Amazon Resource Name (ARN) of the IAM role. Entity Resolution assumes this role to create resources on your behalf as part of workflow execution.
--tags
(map)
The tags used to organize, track, or control access for this resource.
key -> (string)
value -> (string)
Shorthand Syntax:
KeyName1=string,KeyName2=string
JSON Syntax:
{"string": "string"
...}
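Both tag syntaxes carry the same key/value map. The Python sketch below renders a dictionary in each form; the helper name tags_to_shorthand and the sample tags are hypothetical. It deliberately rejects keys or values containing commas, equals signs, or spaces, which is a conservative choice for this example: such values are simpler to pass in the JSON form.

```python
import json


def tags_to_shorthand(tags):
    """Render a tag map as KeyName1=value,KeyName2=value.

    Rejects keys or values containing ',', '=' or a space, since
    those would need quoting in shorthand; use JSON for them instead.
    """
    for key, value in tags.items():
        if any(ch in key + value for ch in ",= "):
            raise ValueError("use JSON syntax for tag: " + repr((key, value)))
    return ",".join(key + "=" + value for key, value in tags.items())


# Placeholder tags.
tags = {"Project": "demo", "Stage": "test"}

shorthand = tags_to_shorthand(tags)
tags_json = json.dumps(tags)
print(shorthand)
print(tags_json)
```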
--workflow-name
(string)
The name of the workflow. There can’t be multiple MatchingWorkflows with the same name.
--cli-input-json | --cli-input-yaml
(string)
Reads arguments from the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, those values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. This may not be specified along with --cli-input-yaml.
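When the option values become long, it can be convenient to keep the whole request in one file and pass it with --cli-input-json. The Python sketch below writes such a file and prints a matching command line without running it. It assumes the input keys are the camelCase names that --generate-cli-skeleton prints (the same names used in the output fields of this command), and every ARN, name, and path is a placeholder.

```python
import json
import tempfile

# Placeholder request; key names are assumed to match the CLI skeleton.
workflow_input = {
    "workflowName": "example-workflow",
    "roleArn": "arn:aws:iam::123456789012:role/ExampleRole",
    "inputSourceConfig": [{
        "applyNormalization": True,
        "inputSourceARN": "arn:aws:glue:us-east-1:123456789012:table/example_db/example_table",
        "schemaName": "example-schema",
    }],
    "outputSourceConfig": [{
        "applyNormalization": True,
        "output": [{"hashed": False, "name": "email"}],
        "outputS3Path": "s3://amzn-s3-demo-bucket/output/",
    }],
    "resolutionTechniques": {
        "resolutionType": "RULE_MATCHING",
        "ruleBasedProperties": {
            "attributeMatchingModel": "ONE_TO_ONE",
            "rules": [{"matchingKeys": ["Email"], "ruleName": "ExampleRule"}],
        },
    },
}

# Write the request to a temporary file that is kept after closing.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as handle:
    json.dump(workflow_input, handle, indent=2)
    input_path = handle.name

# file:// tells the CLI to read the option value from the file.
command = [
    "aws", "entityresolution", "create-matching-workflow",
    "--cli-input-json", "file://" + input_path,
]
print(" ".join(command))
```

Any option given on the command line alongside --cli-input-json overrides the corresponding value from the file, so a shared file can be reused with, for example, a different --workflow-name.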
--generate-cli-skeleton
(string)
Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. Similarly, if provided yaml-input it will print a sample input YAML that can be used with --cli-input-yaml. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command. The generated JSON skeleton is not stable between versions of the AWS CLI and there are no backwards compatibility guarantees in the JSON skeleton generated.
--debug
(boolean)
Turn on debug logging.
--endpoint-url
(string)
Override command’s default URL with the given URL.
--no-verify-ssl
(boolean)
By default, the AWS CLI uses SSL when communicating with AWS services. For each SSL connection, the AWS CLI will verify SSL certificates. This option overrides the default behavior of verifying SSL certificates.
--no-paginate
(boolean)
Disable automatic pagination.
--output
(string)
The formatting style for command output.
--query
(string)
A JMESPath query to use in filtering the response data.
--profile
(string)
Use a specific profile from your credential file.
--region
(string)
The region to use. Overrides config/env settings.
--version
(string)
Display the version of this tool.
--color
(string)
Turn on/off color output.
--no-sign-request
(boolean)
Do not sign requests. Credentials will not be loaded if this argument is provided.
--ca-bundle
(string)
The CA certificate bundle to use when verifying SSL certificates. Overrides config/env settings.
--cli-read-timeout
(int)
The maximum socket read time in seconds. If the value is set to 0, the socket read will be blocking and will not time out. The default value is 60 seconds.
--cli-connect-timeout
(int)
The maximum socket connect time in seconds. If the value is set to 0, the socket connect will be blocking and will not time out. The default value is 60 seconds.
--cli-binary-format
(string)
The formatting style to be used for binary blobs. The default format is base64. The base64 format expects binary blobs to be provided as a base64 encoded string. The raw-in-base64-out format preserves compatibility with AWS CLI V1 behavior and binary values must be passed literally. When providing contents from a file that map to a binary blob, fileb:// will always be treated as binary and use the file contents directly regardless of the cli-binary-format setting. When using file:// the file contents will need to be properly formatted for the configured cli-binary-format.
--no-cli-pager
(boolean)
Disable cli pager for output.
--cli-auto-prompt
(boolean)
Automatically prompt for CLI input parameters.
--no-cli-auto-prompt
(boolean)
Disable automatic prompting for CLI input parameters.
Output:
description -> (string)
A description of the workflow.
incrementalRunConfig -> (structure)
An object which defines an incremental run type and has only incrementalRunType as a field.
incrementalRunType -> (string)
The type of incremental run. It takes only one value: IMMEDIATE.
inputSourceConfig -> (list)
A list of InputSource objects, which have the fields InputSourceARN and SchemaName.
(structure)
An object containing InputSourceARN, SchemaName, and ApplyNormalization.
applyNormalization -> (boolean)
Normalizes the attributes defined in the schema in the input data. For example, if an attribute has an AttributeType of PHONE_NUMBER, and the data in the input table is in the format 1234567890, Entity Resolution will normalize this field in the output to (123)-456-7890.
inputSourceARN -> (string)
A Glue table ARN for the input source table.
schemaName -> (string)
The name of the schema to be retrieved.
outputSourceConfig -> (list)
A list of OutputSource objects, each of which contains the fields OutputS3Path, ApplyNormalization, and Output.
(structure)
An OutputSource object, which contains the fields KMSArn, ApplyNormalization, Output, and OutputS3Path.
KMSArn -> (string)
Customer KMS ARN for encryption at rest. If not provided, the system will use an Entity Resolution managed KMS key.
applyNormalization -> (boolean)
Normalizes the attributes defined in the schema in the input data. For example, if an attribute has an AttributeType of PHONE_NUMBER, and the data in the input table is in the format 1234567890, Entity Resolution will normalize this field in the output to (123)-456-7890.
output -> (list)
A list of OutputAttribute objects, each of which has the fields Name and Hashed. Each of these objects selects a column to be included in the output table, and whether the values of the column should be hashed.
(structure)
An OutputAttribute object, which selects one column to be included in the output table and whether the values of that column should be hashed.
hashed -> (boolean)
Enables the ability to hash the column values in the output.
name -> (string)
A name of a column to be written to the output. This must be an InputField name in the schema mapping.
outputS3Path -> (string)
The S3 path to which Entity Resolution will write the output table.
resolutionTechniques -> (structure)
An object which defines the resolutionType and the ruleBasedProperties.
providerProperties -> (structure)
The properties of the provider service.
intermediateSourceConfiguration -> (structure)
The Amazon S3 location that temporarily stores your data while it processes. Your information won’t be saved permanently.
intermediateS3Path -> (string)
The Amazon S3 location (bucket and prefix). For example: s3://provider_bucket/DOC-EXAMPLE-BUCKET
providerConfiguration -> (document)
The required configuration fields to use with the provider service.
providerServiceArn -> (string)
The ARN of the provider service.
resolutionType -> (string)
The type of matching. There are three types of matching: RULE_MATCHING, ML_MATCHING, and PROVIDER.
ruleBasedProperties -> (structure)
An object which defines the list of matching rules to run and has a field Rules, which is a list of rule objects.
attributeMatchingModel -> (string)
The comparison type. You can choose either ONE_TO_ONE or MANY_TO_MANY as the AttributeMatchingModel. When choosing MANY_TO_MANY, the system can match attributes across the sub-types of an attribute type. For example, if the value of the Email field of Profile A and the value of the BusinessEmail field of Profile B match, the two profiles are matched on the Email type. When choosing ONE_TO_ONE, the system can only match if the sub-types are exact matches. For example, the two profiles are matched on the Email type only when the value of the Email field of Profile A and the value of the Email field of Profile B match.
rules -> (list)
A list of Rule objects, each of which has the fields RuleName and MatchingKeys.
(structure)
An object containing RuleName and MatchingKeys.
matchingKeys -> (list)
A list of MatchingKeys. The MatchingKeys must have been defined in the SchemaMapping. Two records are considered to match according to this rule if all of the MatchingKeys match.
(string)
ruleName -> (string)
A name for the matching rule.
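The statement that two records match under a rule only if all of its MatchingKeys match can be pictured with a toy Python check. This is a simplified illustration and not how Entity Resolution is implemented: it assumes records are plain dictionaries keyed directly by matching key, compares values with exact equality, and treats a missing key as a non-match. The function name records_match and the sample data are hypothetical.

```python
def records_match(rule, record_a, record_b):
    """Return True when every matching key of the rule is present in
    both records with equal values (toy semantics, exact equality)."""
    for key in rule["matchingKeys"]:
        if key not in record_a or key not in record_b:
            return False
        if record_a[key] != record_b[key]:
            return False
    return True


# Placeholder rule and records.
rule = {"ruleName": "ExampleRule", "matchingKeys": ["Email", "Phone"]}
profile_a = {"Email": "pat@example.com", "Phone": "(123)-456-7890", "Name": "Pat"}
profile_b = {"Email": "pat@example.com", "Phone": "(123)-456-7890", "Name": "Patricia"}
profile_c = {"Email": "pat@example.com", "Phone": "(555)-000-0000", "Name": "Pat"}

# Name is not a matching key, so it does not affect the result.
print(records_match(rule, profile_a, profile_b))
# Phone differs, so one of the two matching keys fails.
print(records_match(rule, profile_a, profile_c))
```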
roleArn -> (string)
The Amazon Resource Name (ARN) of the IAM role. Entity Resolution assumes this role to create resources on your behalf as part of workflow execution.
workflowArn -> (string)
The ARN (Amazon Resource Name) that Entity Resolution generated for the MatchingWorkflow.
workflowName -> (string)
The name of the workflow.
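A script that calls this command will usually want the generated ARN from the JSON output. The Python sketch below parses a hypothetical, abbreviated response; the values are placeholders written for this example and not captured from a real call, and only fields listed above are used.

```python
import json

# Hypothetical, abbreviated response text; all values are placeholders.
sample_response = """
{
    "workflowName": "example-workflow",
    "workflowArn": "arn:aws:entityresolution:us-east-1:123456789012:matchingworkflow/example-workflow",
    "roleArn": "arn:aws:iam::123456789012:role/ExampleRole",
    "resolutionTechniques": {"resolutionType": "ML_MATCHING"}
}
"""

response = json.loads(sample_response)

# workflowArn is the identifier Entity Resolution generated for the workflow.
workflow_arn = response["workflowArn"]
resolution_type = response["resolutionTechniques"]["resolutionType"]

# description is optional on input, so read it defensively.
description = response.get("description", "")

print(workflow_arn)
print(resolution_type)
```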