[ aws . transcribe ]
Starts a batch job to transcribe medical speech to text.
See also: AWS API Documentation
See ‘aws help’ for descriptions of global parameters.
start-medical-transcription-job
--medical-transcription-job-name <value>
--language-code <value>
[--media-sample-rate-hertz <value>]
[--media-format <value>]
--media <value>
--output-bucket-name <value>
[--output-key <value>]
[--output-encryption-kms-key-id <value>]
[--settings <value>]
--specialty <value>
--type <value>
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
--medical-transcription-job-name
(string)
The name of the medical transcription job. You can’t use the strings “
.
” or “..
” by themselves as the job name. The name must also be unique within an AWS account. If you try to create a medical transcription job with the same name as a previous medical transcription job, you get aConflictException
error.
--language-code
(string)
The language code for the language spoken in the input media file. US English (en-US) is the valid value for medical transcription jobs. Any other value you enter for language code results in a
BadRequestException
error.Possible values:
af-ZA
ar-AE
ar-SA
cy-GB
da-DK
de-CH
de-DE
en-AB
en-AU
en-GB
en-IE
en-IN
en-US
en-WL
es-ES
es-US
fa-IR
fr-CA
fr-FR
ga-IE
gd-GB
he-IL
hi-IN
id-ID
it-IT
ja-JP
ko-KR
ms-MY
nl-NL
pt-BR
pt-PT
ru-RU
ta-IN
te-IN
tr-TR
zh-CN
--media-sample-rate-hertz
(integer)
The sample rate, in Hertz, of the audio track in the input media file.
If you do not specify the media sample rate, Amazon Transcribe Medical determines the sample rate. If you specify the sample rate, it must match the rate detected by Amazon Transcribe Medical. In most cases, you should leave the
MediaSampleRateHertz
field blank and let Amazon Transcribe Medical determine the sample rate.
--media-format
(string)
The audio format of the input media file.
Possible values:
mp3
mp4
wav
flac
ogg
amr
webm
--media
(structure)
Describes the input media file in a transcription request.
MediaFileUri -> (string)
The S3 object location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:
For example:
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide .
Shorthand Syntax:
MediaFileUri=string
JSON Syntax:
{
"MediaFileUri": "string"
}
--output-bucket-name
(string)
The Amazon S3 location where the transcription is stored.
You must set
OutputBucketName
for Amazon Transcribe Medical to store the transcription results. Your transcript appears in the S3 location you specify. When you call the GetMedicalTranscriptionJob , the operation returns this location in theTranscriptFileUri
field. The S3 bucket must have permissions that allow Amazon Transcribe Medical to put files in the bucket. For more information, see Permissions Required for IAM User Roles .You can specify an AWS Key Management Service (KMS) key to encrypt the output of your transcription using the
OutputEncryptionKMSKeyId
parameter. If you don’t specify a KMS key, Amazon Transcribe Medical uses the default Amazon S3 key for server-side encryption of transcripts that are placed in your S3 bucket.
--output-key
(string)
You can specify a location in an Amazon S3 bucket to store the output of your medical transcription job.
If you don’t specify an output key, Amazon Transcribe Medical stores the output of your transcription job in the Amazon S3 bucket you specified. By default, the object key is “your-transcription-job-name.json”.
You can use output keys to specify the Amazon S3 prefix and file name of the transcription output. For example, specifying the Amazon S3 prefix, “folder1/folder2/”, as an output key would lead to the output being stored as “folder1/folder2/your-transcription-job-name.json”. If you specify “my-other-job-name.json” as the output key, the object key is changed to “my-other-job-name.json”. You can use an output key to change both the prefix and the file name, for example “folder/my-other-job-name.json”.
If you specify an output key, you must also specify an S3 bucket in the
OutputBucketName
parameter.
--output-encryption-kms-key-id
(string)
The Amazon Resource Name (ARN) of the AWS Key Management Service (KMS) key used to encrypt the output of the transcription job. The user calling the StartMedicalTranscriptionJob operation must have permission to use the specified KMS key.
You use either of the following to identify a KMS key in the current account:
KMS Key ID: “1234abcd-12ab-34cd-56ef-1234567890ab”
KMS Key Alias: “alias/ExampleAlias”
You can use either of the following to identify a KMS key in the current account or another account:
Amazon Resource Name (ARN) of a KMS key in the current account or another account: “arn:aws:kms:region:account ID:key/1234abcd-12ab-34cd-56ef-1234567890ab”
ARN of a KMS Key Alias: “arn:aws:kms:region:account ID:alias/ExampleAlias”
If you don’t specify an encryption key, the output of the medical transcription job is encrypted with the default Amazon S3 key (SSE-S3).
If you specify a KMS key to encrypt your output, you must also specify an output location in the
OutputBucketName
parameter.
--settings
(structure)
Optional settings for the medical transcription job.
ShowSpeakerLabels -> (boolean)
Determines whether the transcription job uses speaker recognition to identify different speakers in the input audio. Speaker recognition labels individual speakers in the audio file. If you set the
ShowSpeakerLabels
field to true, you must also set the maximum number of speaker labels in theMaxSpeakerLabels
field.You can’t set both
ShowSpeakerLabels
andChannelIdentification
in the same request. If you set both, your request returns aBadRequestException
.MaxSpeakerLabels -> (integer)
The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers are identified as a single speaker. If you specify the
MaxSpeakerLabels
field, you must set theShowSpeakerLabels
field to true.ChannelIdentification -> (boolean)
Instructs Amazon Transcribe Medical to process each audio channel separately and then merge the transcription output of each channel into a single transcription.
Amazon Transcribe Medical also produces a transcription of each item detected on an audio channel, including the start time and end time of the item and alternative transcriptions of item. The alternative transcriptions also come with confidence scores provided by Amazon Transcribe Medical.
You can’t set both
ShowSpeakerLabels
andChannelIdentification
in the same request. If you set both, your request returns aBadRequestException
ShowAlternatives -> (boolean)
Determines whether alternative transcripts are generated along with the transcript that has the highest confidence. If you set
ShowAlternatives
field to true, you must also set the maximum number of alternatives to return in theMaxAlternatives
field.MaxAlternatives -> (integer)
The maximum number of alternatives that you tell the service to return. If you specify the
MaxAlternatives
field, you must set theShowAlternatives
field to true.VocabularyName -> (string)
The name of the vocabulary to use when processing a medical transcription job.
Shorthand Syntax:
ShowSpeakerLabels=boolean,MaxSpeakerLabels=integer,ChannelIdentification=boolean,ShowAlternatives=boolean,MaxAlternatives=integer,VocabularyName=string
JSON Syntax:
{
"ShowSpeakerLabels": true|false,
"MaxSpeakerLabels": integer,
"ChannelIdentification": true|false,
"ShowAlternatives": true|false,
"MaxAlternatives": integer,
"VocabularyName": "string"
}
--specialty
(string)
The medical specialty of any clinician speaking in the input media.
Possible values:
PRIMARYCARE
--type
(string)
The type of speech in the input audio.
CONVERSATION
refers to conversations between two or more speakers, e.g., a conversations between doctors and patients.DICTATION
refers to single-speaker dictated speech, e.g., for clinical notes.Possible values:
CONVERSATION
DICTATION
--cli-input-json
| --cli-input-yaml
(string)
Reads arguments from the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton
. If other arguments are provided on the command line, those values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. This may not be specified along with --cli-input-yaml
.
--generate-cli-skeleton
(string)
Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input
, prints a sample input JSON that can be used as an argument for --cli-input-json
. Similarly, if provided yaml-input
it will print a sample input YAML that can be used with --cli-input-yaml
. If provided with the value output
, it validates the command inputs and returns a sample output JSON for that command.
See ‘aws help’ for descriptions of global parameters.
Example 1: To transcribe a medical dictation stored as an audio file
The following start-medical-transcription-job
example transcribes an audio file. You specify the location of the transcription output in the OutputBucketName
parameter.
aws transcribe start-medical-transcription-job \
--cli-input-json file://myfile.json
Contents of myfile.json
:
{
"MedicalTranscriptionJobName": "simple-dictation-medical-transcription-job",
"LanguageCode": "language-code",
"Specialty": "PRIMARYCARE",
"Type": "DICTATION",
"OutputBucketName":"DOC-EXAMPLE-BUCKET",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
}
}
Output:
{
"MedicalTranscriptionJob": {
"MedicalTranscriptionJobName": "simple-dictation-medical-transcription-job",
"TranscriptionJobStatus": "IN_PROGRESS",
"LanguageCode": "language-code",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"StartTime": "2020-09-20T00:35:22.256000+00:00",
"CreationTime": "2020-09-20T00:35:22.218000+00:00",
"Specialty": "PRIMARYCARE",
"Type": "DICTATION"
}
}
For more information, see Batch Transcription Overview in the Amazon Transcribe Developer Guide.
Example 2: To transcribe a clinician-patient dialogue stored as an audio file
The following start-medical-transcription-job
example transcribes an audio file containing a clinician-patient dialogue. You specify the location of the transcription output in the OutputBucketName parameter.
aws transcribe start-medical-transcription-job \
--cli-input-json file://mysecondfile.json
Contents of mysecondfile.json
:
{
"MedicalTranscriptionJobName": "simple-dictation-medical-transcription-job",
"LanguageCode": "language-code",
"Specialty": "PRIMARYCARE",
"Type": "CONVERSATION",
"OutputBucketName":"DOC-EXAMPLE-BUCKET",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
}
}
Output:
{
"MedicalTranscriptionJob": {
"MedicalTranscriptionJobName": "simple-conversation-medical-transcription-job",
"TranscriptionJobStatus": "IN_PROGRESS",
"LanguageCode": "language-code",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"StartTime": "2020-09-20T23:19:49.965000+00:00",
"CreationTime": "2020-09-20T23:19:49.941000+00:00",
"Specialty": "PRIMARYCARE",
"Type": "CONVERSATION"
}
}
For more information, see Batch Transcription Overview in the Amazon Transcribe Developer Guide.
Example 3: To transcribe a multichannel audio file of a clinician-patient dialogue
The following start-medical-transcription-job
example transcribes the audio from each channel in the audio file and merges the separate transcriptions from each channel into a single transcription output. You specify the location of the transcription output in the OutputBucketName
parameter.
aws transcribe start-medical-transcription-job \
--cli-input-json file://mythirdfile.json
Contents of mythirdfile.json
:
{
"MedicalTranscriptionJobName": "multichannel-conversation-medical-transcription-job",
"LanguageCode": "language-code",
"Specialty": "PRIMARYCARE",
"Type": "CONVERSATION",
"OutputBucketName":"DOC-EXAMPLE-BUCKET",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"Settings":{
"ChannelIdentification": true
}
}
Output:
{
"MedicalTranscriptionJob": {
"MedicalTranscriptionJobName": "multichannel-conversation-medical-transcription-job",
"TranscriptionJobStatus": "IN_PROGRESS",
"LanguageCode": "language-code",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"StartTime": "2020-09-20T23:46:44.081000+00:00",
"CreationTime": "2020-09-20T23:46:44.053000+00:00",
"Settings": {
"ChannelIdentification": true
},
"Specialty": "PRIMARYCARE",
"Type": "CONVERSATION"
}
}
For more information, see Channel Identification in the Amazon Transcribe Developer Guide.
Example 4: To transcribe an audio file of a clinician-patient dialogue and identify the speakers in the transcription output
The following start-medical-transcription-job
example transcribes an audio file and labels the speech of each speaker in the transcription output. You specify the location of the transcription output in the OutputBucketName
parameter.
aws transcribe start-medical-transcription-job \
--cli-input-json file://myfourthfile.json
Contents of myfourthfile.json
:
{
"MedicalTranscriptionJobName": "speaker-id-conversation-medical-transcription-job",
"LanguageCode": "language-code",
"Specialty": "PRIMARYCARE",
"Type": "CONVERSATION",
"OutputBucketName":"DOC-EXAMPLE-BUCKET",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"Settings":{
"ShowSpeakerLabels": true,
"MaxSpeakerLabels": 2
}
}
Output:
{
"MedicalTranscriptionJob": {
"MedicalTranscriptionJobName": "speaker-id-conversation-medical-transcription-job",
"TranscriptionJobStatus": "IN_PROGRESS",
"LanguageCode": "language-code",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"StartTime": "2020-09-21T18:43:37.265000+00:00",
"CreationTime": "2020-09-21T18:43:37.157000+00:00",
"Settings": {
"ShowSpeakerLabels": true,
"MaxSpeakerLabels": 2
},
"Specialty": "PRIMARYCARE",
"Type": "CONVERSATION"
}
}
For more information, see Identifying Speakers in the Amazon Transcribe Developer Guide.
Example 5: To transcribe a medical conversation stored as an audio file with up to two transcription alternatives
The following start-medical-transcription-job
example creates up to two alternative transcriptions from a single audio file. Every transcriptions has a level of confidence associated with it. By default, Amazon Transcribe returns the transcription with the highest confidence level. You can specify that Amazon Transcribe return additional transcriptions with lower confidence levels. You specify the location of the transcription output in the OutputBucketName
parameter.
aws transcribe start-medical-transcription-job \
--cli-input-json file://myfifthfile.json
Contents of myfifthfile.json
:
{
"MedicalTranscriptionJobName": "alternatives-conversation-medical-transcription-job",
"LanguageCode": "language-code",
"Specialty": "PRIMARYCARE",
"Type": "CONVERSATION",
"OutputBucketName":"DOC-EXAMPLE-BUCKET",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"Settings":{
"ShowAlternatives": true,
"MaxAlternatives": 2
}
}
Output:
{
"MedicalTranscriptionJob": {
"MedicalTranscriptionJobName": "alternatives-conversation-medical-transcription-job",
"TranscriptionJobStatus": "IN_PROGRESS",
"LanguageCode": "language-code",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"StartTime": "2020-09-21T19:09:18.199000+00:00",
"CreationTime": "2020-09-21T19:09:18.171000+00:00",
"Settings": {
"ShowAlternatives": true,
"MaxAlternatives": 2
},
"Specialty": "PRIMARYCARE",
"Type": "CONVERSATION"
}
}
For more information, see Alternative Transcriptions in the Amazon Transcribe Developer Guide.
Example 6: To transcribe an audio file of a medical dictation with up to two alternative transcriptions
The following start-medical-transcription-job
example transcribes an audio file and uses a vocabulary filter to mask any unwanted words. You specify the location of the transcription output in the OutputBucketName parameter.
aws transcribe start-medical-transcription-job \
--cli-input-json file://mysixthfile.json
Contents of mysixthfile.json
:
{
"MedicalTranscriptionJobName": "alternatives-conversation-medical-transcription-job",
"LanguageCode": "language-code",
"Specialty": "PRIMARYCARE",
"Type": "DICTATION",
"OutputBucketName":"DOC-EXAMPLE-BUCKET",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"Settings":{
"ShowAlternatives": true,
"MaxAlternatives": 2
}
}
Output:
{
"MedicalTranscriptionJob": {
"MedicalTranscriptionJobName": "alternatives-dictation-medical-transcription-job",
"TranscriptionJobStatus": "IN_PROGRESS",
"LanguageCode": "language-code",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"StartTime": "2020-09-21T21:01:14.592000+00:00",
"CreationTime": "2020-09-21T21:01:14.569000+00:00",
"Settings": {
"ShowAlternatives": true,
"MaxAlternatives": 2
},
"Specialty": "PRIMARYCARE",
"Type": "DICTATION"
}
}
For more information, see Alternative Transcriptions in the Amazon Transcribe Developer Guide.
Example 7: To transcribe an audio file of a medical dictation with increased accuracy by using a custom vocabulary
The following start-medical-transcription-job
example transcribes an audio file and uses a medical custom vocabulary you’ve previously created to increase the transcription accuracy. You specify the location of the transcription output in the OutputBucketName
parameter.
aws transcribe start-transcription-job \
--cli-input-json file://myseventhfile.json
Contents of mysixthfile.json
:
{
"MedicalTranscriptionJobName": "vocabulary-dictation-medical-transcription-job",
"LanguageCode": "language-code",
"Specialty": "PRIMARYCARE",
"Type": "DICTATION",
"OutputBucketName":"DOC-EXAMPLE-BUCKET",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"Settings":{
"VocabularyName": "cli-medical-vocab-1"
}
}
Output:
{
"MedicalTranscriptionJob": {
"MedicalTranscriptionJobName": "vocabulary-dictation-medical-transcription-job",
"TranscriptionJobStatus": "IN_PROGRESS",
"LanguageCode": "language-code",
"Media": {
"MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/your-audio-file.extension"
},
"StartTime": "2020-09-21T21:17:27.045000+00:00",
"CreationTime": "2020-09-21T21:17:27.016000+00:00",
"Settings": {
"VocabularyName": "cli-medical-vocab-1"
},
"Specialty": "PRIMARYCARE",
"Type": "DICTATION"
}
}
For more information, see Medical Custom Vocabularies in the Amazon Transcribe Developer Guide.
MedicalTranscriptionJob -> (structure)
A batch job submitted to transcribe medical speech to text.
MedicalTranscriptionJobName -> (string)
The name for a given medical transcription job.
TranscriptionJobStatus -> (string)
The completion status of a medical transcription job.
LanguageCode -> (string)
The language code for the language spoken in the source audio file. US English (en-US) is the only supported language for medical transcriptions. Any other value you enter for language code results in a
BadRequestException
error.MediaSampleRateHertz -> (integer)
The sample rate, in Hertz, of the source audio containing medical information.
If you don’t specify the sample rate, Amazon Transcribe Medical determines it for you. If you choose to specify the sample rate, it must match the rate detected by Amazon Transcribe Medical. In most cases, you should leave the
MediaSampleHertz
blank and let Amazon Transcribe Medical determine the sample rate.MediaFormat -> (string)
The format of the input media file.
Media -> (structure)
Describes the input media file in a transcription request.
MediaFileUri -> (string)
The S3 object location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:
For example:
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide .
Transcript -> (structure)
An object that contains the
MedicalTranscript
. TheMedicalTranscript
contains theTranscriptFileUri
.TranscriptFileUri -> (string)
The S3 object location of the medical transcript.
Use this URI to access the medical transcript. This URI points to the S3 bucket you created to store the medical transcript.
StartTime -> (timestamp)
A timestamp that shows when the job started processing.
CreationTime -> (timestamp)
A timestamp that shows when the job was created.
CompletionTime -> (timestamp)
A timestamp that shows when the job was completed.
FailureReason -> (string)
If the
TranscriptionJobStatus
field isFAILED
, this field contains information about why the job failed.The
FailureReason
field contains one of the following values:
Unsupported media format
- The media format specified in theMediaFormat
field of the request isn’t valid. See the description of theMediaFormat
field for a list of valid values.
The media format provided does not match the detected media format
- The media format of the audio file doesn’t match the format specified in theMediaFormat
field in the request. Check the media format of your media file and make sure the two values match.
Invalid sample rate for audio file
- The sample rate specified in theMediaSampleRateHertz
of the request isn’t valid. The sample rate must be between 8000 and 48000 Hertz.
The sample rate provided does not match the detected sample rate
- The sample rate in the audio file doesn’t match the sample rate specified in theMediaSampleRateHertz
field in the request. Check the sample rate of your media file and make sure that the two values match.
Invalid file size: file size too large
- The size of your audio file is larger than what Amazon Transcribe Medical can process. For more information, see Guidelines and Quotas in the Amazon Transcribe Medical Guide
Invalid number of channels: number of channels too large
- Your audio contains more channels than Amazon Transcribe Medical is configured to process. To request additional channels, see Amazon Transcribe Medical Endpoints and Quotas in the Amazon Web Services General ReferenceSettings -> (structure)
Object that contains object.
ShowSpeakerLabels -> (boolean)
Determines whether the transcription job uses speaker recognition to identify different speakers in the input audio. Speaker recognition labels individual speakers in the audio file. If you set the
ShowSpeakerLabels
field to true, you must also set the maximum number of speaker labels in theMaxSpeakerLabels
field.You can’t set both
ShowSpeakerLabels
andChannelIdentification
in the same request. If you set both, your request returns aBadRequestException
.MaxSpeakerLabels -> (integer)
The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers are identified as a single speaker. If you specify the
MaxSpeakerLabels
field, you must set theShowSpeakerLabels
field to true.ChannelIdentification -> (boolean)
Instructs Amazon Transcribe Medical to process each audio channel separately and then merge the transcription output of each channel into a single transcription.
Amazon Transcribe Medical also produces a transcription of each item detected on an audio channel, including the start time and end time of the item and alternative transcriptions of item. The alternative transcriptions also come with confidence scores provided by Amazon Transcribe Medical.
You can’t set both
ShowSpeakerLabels
andChannelIdentification
in the same request. If you set both, your request returns aBadRequestException
ShowAlternatives -> (boolean)
Determines whether alternative transcripts are generated along with the transcript that has the highest confidence. If you set
ShowAlternatives
field to true, you must also set the maximum number of alternatives to return in theMaxAlternatives
field.MaxAlternatives -> (integer)
The maximum number of alternatives that you tell the service to return. If you specify the
MaxAlternatives
field, you must set theShowAlternatives
field to true.VocabularyName -> (string)
The name of the vocabulary to use when processing a medical transcription job.
Specialty -> (string)
The medical specialty of any clinicians providing a dictation or having a conversation.
PRIMARYCARE
is the only available setting for this object. This specialty enables you to generate transcriptions for the following medical fields:
Family Medicine
Type -> (string)
The type of speech in the transcription job.
CONVERSATION
is generally used for patient-physician dialogues.DICTATION
is the setting for physicians speaking their notes after seeing a patient. For more information, see how-it-works-med