A.I. Transcriber: Documentation
Transcribe your audio or video files. The tool uses OpenAI’s context-correcting Whisper model and Azure Cognitive Services processing to produce transcriptions, with the ability to identify and tag multiple speakers.
How to Use the A.I. Transcriber:
STEP 1: Upload your audio or video file.
STEP 2: Select your transcription settings (see the “Transcription Settings and Usage Notes” section below).
STEP 3: Begin the transcription task (click the “Start Transcription” button, which becomes visible once your file has been uploaded).
You will receive a pop-up message when your transcription task has started successfully, and the transcription task will be added to the table on the page with a Status of “In Progress”. The Status will change to a green checkmark when your transcription is ready.
Depending on the length and quality of your audio recording and on other factors such as server load, it can take a few minutes for your transcription to complete. You do not need to keep the page open for the progress to continue. You can select the “Send Email When Ready” option when you begin your transcription to be notified when the transcription task has completed.
Accessing and Using Your Transcriptions:
Once your transcription is ready, you will see three Action buttons for each transcription in the table on the page:
Chat: Creates a new, specialized “Transcript Chat” session in A.I. Chat for Professionals, loaded with the full transcription text as context. This is a powerful way to create summaries and analyses from your transcript (such as meeting minutes, interview summaries, and key theme extraction).
Read: Loads your transcription into the viewing pane at the bottom of the page. From there, you can also select Copy (copies the full transcription text to your clipboard) or Download (downloads the full transcription text as a .txt file).
Delete: Permanently deletes the transcription task and results.
Transcription Settings and Usage Notes:
Choose the following settings before you begin a transcription:
Identify Speakers: The A.I. will identify unique speakers within multi-speaker recordings and attribute words to the people who spoke them (tagged as “Speaker 1”, “Speaker 2” and so on). When this setting is selected, you will be prompted to choose the maximum number of speakers that the A.I. should attempt to identify (up to 35). Providing an accurate count of the speakers actually heard in the file can help improve speaker identification. Timestamps are always included when Identify Speakers is selected. Speaker identification is accurate enough for most purposes, but not perfect. The clearer the recording, the better the performance.
Include Timestamps: Adds timestamps to the transcription and lets you select the timestamp interval in seconds (i.e., the time between timestamps). Timestamps are always included when Identify Speakers is selected, and a timestamp is added automatically whenever the current speaker changes.
Send Email When Ready: Depending on the length and quality of your audio recording and on other factors such as server load, it can take a few minutes for your transcription to complete. Select the Send Email When Ready option to be notified when it is complete; there is no need to keep the Transcriber tool open while you wait.
Language: Select the language spoken in your audio recording. Current options: English and French.
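To illustrate the timestamp interval described under Include Timestamps above, the following is a minimal Python sketch that lists where interval-based timestamps would fall in a recording of a given length. The function name and the HH:MM:SS format are assumptions made for illustration; the Transcriber's actual timestamp format may differ, and the extra timestamps added at speaker changes are not modeled here.

```python
def timestamp_marks(duration_seconds: int, interval_seconds: int) -> list[str]:
    """Return the positions of interval-based timestamps as HH:MM:SS strings.

    Hypothetical helper: the HH:MM:SS format is an assumption, not the
    Transcriber's documented output format.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be a positive number of seconds")
    marks = []
    # One mark at the start, then one every interval_seconds until the end.
    for t in range(0, duration_seconds, interval_seconds):
        hours, remainder = divmod(t, 3600)
        minutes, seconds = divmod(remainder, 60)
        marks.append(f"{hours:02d}:{minutes:02d}:{seconds:02d}")
    return marks


# A 95-second recording with a 30-second timestamp interval.
print(timestamp_marks(95, 30))
# ['00:00:00', '00:00:30', '00:01:00', '00:01:30']
```

A shorter interval gives more frequent timestamps and makes it easier to find a passage in the original recording, at the cost of a busier transcript.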
Other notes:
- You can upload audio or video files in almost any format or size for transcription.
- You can have multiple transcriptions running simultaneously.
Qoken Consumption (Usage):
Qoken usage: 1.20 Qokens per hour of audio transcribed.
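As a rough guide, the rate above can be turned into an estimate for a specific recording. The Python sketch below assumes that usage scales linearly with audio duration (for example, that a 30-minute file costs half the hourly rate); this documentation does not state whether usage is pro-rated or rounded, so treat the result as an estimate only. The function name is hypothetical.

```python
QOKENS_PER_AUDIO_HOUR = 1.20  # rate stated in this documentation


def estimate_qokens(duration_seconds: float) -> float:
    """Estimate Qoken usage for a recording.

    Assumption: usage is pro-rated linearly by duration. The actual
    billing rules (rounding, minimums) are not documented here.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return QOKENS_PER_AUDIO_HOUR * duration_seconds / 3600


# A 45-minute recording: 1.20 * 0.75 hours.
print(round(estimate_qokens(45 * 60), 2))
# 0.9
```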