Retrieves metadata for a specified crawler.
See also: AWS API Documentation
See ‘aws help’ for descriptions of global parameters.
get-crawler
--name <value>
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
[--cli-auto-prompt <value>]
--name (string)
The name of the crawler to retrieve metadata for.
--cli-input-json | --cli-input-yaml (string)
Reads arguments from the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, those values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. This may not be specified along with --cli-input-yaml.
--generate-cli-skeleton (string)
Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. Similarly, if provided yaml-input it will print a sample input YAML that can be used with --cli-input-yaml. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command.
--cli-auto-prompt (boolean)
Automatically prompt for CLI input parameters.
See ‘aws help’ for descriptions of global parameters.
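The --generate-cli-skeleton / --cli-input-json pair above can be illustrated with a short sketch that writes a minimal input file for this command. The crawler name "my-crawler" and the file name are placeholders, not real resources:

```python
import json

# A minimal input file for --cli-input-json, matching the skeleton that
# `aws glue get-crawler --generate-cli-skeleton` would print. The crawler
# name "my-crawler" is a hypothetical example.
params = {"Name": "my-crawler"}

with open("get-crawler-input.json", "w") as f:
    json.dump(params, f, indent=4)

# The file can then be passed to the CLI as:
#   aws glue get-crawler --cli-input-json file://get-crawler-input.json
print(open("get-crawler-input.json").read())
```

Any arguments given on the command line alongside --cli-input-json would override the values in this file.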
Crawler -> (structure)
The metadata for the specified crawler.
Name -> (string)
The name of the crawler.
Role -> (string)
The Amazon Resource Name (ARN) of an IAM role that’s used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
Targets -> (structure)
A collection of targets to crawl.
S3Targets -> (list)
Specifies Amazon Simple Storage Service (Amazon S3) targets.
(structure)
Specifies a data store in Amazon Simple Storage Service (Amazon S3).
Path -> (string)
The path to the Amazon S3 target.
Exclusions -> (list)
A list of glob patterns used to exclude objects from the crawl. For more information, see Catalog Tables with a Crawler.
(string)
JdbcTargets -> (list)
Specifies JDBC targets.
(structure)
Specifies a JDBC data store to crawl.
ConnectionName -> (string)
The name of the connection to use to connect to the JDBC target.
Path -> (string)
The path of the JDBC target.
Exclusions -> (list)
A list of glob patterns used to exclude objects from the crawl. For more information, see Catalog Tables with a Crawler.
(string)
DynamoDBTargets -> (list)
Specifies Amazon DynamoDB targets.
(structure)
Specifies an Amazon DynamoDB table to crawl.
Path -> (string)
The name of the DynamoDB table to crawl.
scanAll -> (boolean)
Indicates whether to scan all the records, or to sample rows from the table. Scanning all the records can take a long time when the table is not a high-throughput table.
A value of true means to scan all records, while a value of false means to sample the records. If no value is specified, the value defaults to true.
scanRate -> (double)
The percentage of the configured read capacity units to use by the AWS Glue crawler. Read capacity units is a term defined by DynamoDB, and is a numeric value that acts as a rate limiter for the number of reads that can be performed on that table per second.
The valid values are null or a value between 0.1 and 1.5. A null value is used when the user does not provide a value, and defaults to 0.5 of the configured Read Capacity Units (for provisioned tables), or 0.25 of the maximum configured Read Capacity Units (for tables using on-demand mode).
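The scanRate defaults described above can be made concrete with a small sketch. effective_scan_rate is a hypothetical helper written for illustration, not part of the AWS CLI or the Glue API:

```python
def effective_scan_rate(scan_rate, provisioned_rcu=None):
    """Resolve the scanRate defaults described above.

    scan_rate: the crawler's scanRate, or None if the user did not set one.
    provisioned_rcu: the table's provisioned read capacity units, or None
    for a table in on-demand mode.
    Returns the fraction of read capacity the crawler will use.
    """
    if scan_rate is not None:
        if not 0.1 <= scan_rate <= 1.5:
            raise ValueError("scanRate must be between 0.1 and 1.5")
        return scan_rate
    # Defaults: 0.5 of the configured RCU for provisioned tables,
    # 0.25 of the maximum configured RCU for on-demand tables.
    return 0.5 if provisioned_rcu is not None else 0.25

print(effective_scan_rate(None, provisioned_rcu=100))  # provisioned default: 0.5
print(effective_scan_rate(None))                       # on-demand default: 0.25
print(effective_scan_rate(1.2))                        # explicit value is used as-is
```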
CatalogTargets -> (list)
Specifies AWS Glue Data Catalog targets.
(structure)
Specifies an AWS Glue Data Catalog target.
DatabaseName -> (string)
The name of the database to be synchronized.
Tables -> (list)
A list of the tables to be synchronized.
(string)
DatabaseName -> (string)
The name of the database in which the crawler’s output is stored.
Description -> (string)
A description of the crawler.
Classifiers -> (list)
A list of UTF-8 strings that specify the custom classifiers that are associated with the crawler.
(string)
SchemaChangePolicy -> (structure)
The policy that specifies update and delete behaviors for the crawler.
UpdateBehavior -> (string)
The update behavior when the crawler finds a changed schema.
DeleteBehavior -> (string)
The deletion behavior when the crawler finds a deleted object.
State -> (string)
Indicates whether the crawler is running, or whether a run is pending.
TablePrefix -> (string)
The prefix added to the names of tables that are created.
Schedule -> (structure)
For scheduled crawlers, the schedule when the crawler runs.
ScheduleExpression -> (string)
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
State -> (string)
The state of the schedule.
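The ScheduleExpression example above can be sketched as a tiny formatting helper. format_glue_cron is a hypothetical name used only for this illustration; it is not a Glue or CLI function:

```python
def format_glue_cron(minute, hour):
    """Build a Glue-style daily cron expression using the six-field
    syntax described in Time-Based Schedules for Jobs and Crawlers."""
    return f"cron({minute} {hour} * * ? *)"

# The example from the description: every day at 12:15 UTC.
print(format_glue_cron(15, 12))  # cron(15 12 * * ? *)
```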
CrawlElapsedTime -> (long)
If the crawler is running, contains the total time elapsed since the last crawl began.
CreationTime -> (timestamp)
The time that the crawler was created.
LastUpdated -> (timestamp)
The time that the crawler was last updated.
LastCrawl -> (structure)
The status of the last crawl, and potentially error information if an error occurred.
Status -> (string)
Status of the last crawl.
ErrorMessage -> (string)
If an error occurred, the error information about the last crawl.
LogGroup -> (string)
The log group for the last crawl.
LogStream -> (string)
The log stream for the last crawl.
MessagePrefix -> (string)
The prefix for a message about this crawl.
StartTime -> (timestamp)
The time at which the crawl started.
Version -> (long)
The version of the crawler.
Configuration -> (string)
Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler’s behavior. For more information, see Configuring a Crawler.
CrawlerSecurityConfiguration -> (string)
The name of the SecurityConfiguration structure to be used by this crawler.
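Putting the output shape together, a get-crawler response can be inspected as ordinary JSON. The sample below is hand-written for illustration (crawler name, role ARN, account ID, bucket, and database are all made up), not real API output:

```python
import json

# A hand-written sample of the Crawler structure described above.
response_text = """
{
    "Crawler": {
        "Name": "my-crawler",
        "Role": "arn:aws:iam::123456789012:role/GlueCrawlerRole",
        "DatabaseName": "my-database",
        "State": "READY",
        "Targets": {
            "S3Targets": [{"Path": "s3://my-bucket/data", "Exclusions": []}]
        },
        "LastCrawl": {"Status": "SUCCEEDED", "LogGroup": "/aws-glue/crawlers"}
    }
}
"""

crawler = json.loads(response_text)["Crawler"]
print(crawler["Name"], crawler["State"])      # top-level fields
print(crawler["LastCrawl"]["Status"])         # nested LastCrawl status
```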