Looking for free software that can convert speech to text? Have you considered Whisper AI?
Whisper is software released by OpenAI that does just that: it converts speech to text for free.
There is other freemium software that I will mention towards the end of this article. Whisper, however, is open source and free to use.
You can run it on your own computer if you have a GPU (graphics processing unit). If you don’t have the necessary computer, don’t fret. I’ll show you how to run it from Google Drive using an add-on called Google Colaboratory.
It is a great way to learn a bit about artificial intelligence. Using Whisper has some pros and cons, which I cover in the later part of this article.
Anyway, here are the steps to install Whisper.
These are the steps to install Whisper AI to Google Colaboratory
- Install Google Colaboratory
- Install Whisper AI
- Configure Google Colaboratory
- Run Whisper AI
- Download the text, vtt or srt files
The above steps may seem daunting, but they are not. It is mostly a matter of copying and pasting. You won’t be installing any software on your computer, so you are safe. Everything runs on Google’s servers, and the notebook is saved to your Google Drive.
You can read more about the Whisper AI model on GitHub [1].
Here are the necessary commands and links
- Copy the line below into Google Colaboratory and press Run
!pip install git+https://github.com/openai/whisper.git
- Copy the line below into Google Colaboratory and press Run
!sudo apt update && sudo apt install -y ffmpeg
Next, open the Runtime menu, click Change runtime type, select T4 GPU, then click Save (see the image below).
- Copy the line below. Replace FullFileName with the name of the file you uploaded, including the file extension, e.g. SampleAudio.mp3. Keep the quotes around the file name.
For the language, replace “LanguageNoQuotes” (including its quotes) with the language name, for example English or Spanish.
!whisper "FullFileName" --model medium --language "LanguageNoQuotes"
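An example is:
!whisper "MyAudio.mp3" --model medium --language English
Note: Sometimes you need to type the third item above (the !whisper command) by hand instead of copying and pasting it, because Whisper did not run for me until I typed it in. A likely cause is that web pages can turn straight quotes and double hyphens into curly quotes and long dashes, which the command line does not accept. This applies only to the third item.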
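One possible reason a pasted command fails is that web pages often convert straight quotes and double hyphens into curly quotes and long dashes. As a minimal sketch, the command can instead be assembled in Python so that only plain characters are used. The function name build_whisper_command is made up for illustration, and it relies only on the --model and --language options shown above.

```python
import shlex


def build_whisper_command(file_name: str, language: str, model: str = "medium") -> str:
    """Return a Colab-style whisper command made of plain ASCII characters."""
    args = ["whisper", file_name, "--model", model, "--language", language]
    # shlex.join adds quotes only where the shell needs them (e.g. spaces in names)
    return "!" + shlex.join(args)


# Print the command so it can be pasted into a Colab cell
print(build_whisper_command("MyAudio.mp3", "English"))
```

File names that contain spaces are quoted automatically, so the printed line should be safe to paste into a new cell.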
After Whisper AI finishes, you will see several new files in the folder on the left-hand side, ending in .srt, .vtt or other caption formats.
It will take some time depending on the length of your audio file. For a start, I suggest trying it with something about a minute long.
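Once the run finishes, the caption files can also be located with a few lines of Python instead of browsing the folder. This is a minimal sketch, assuming the output files sit in the current working folder and use the .txt, .srt and .vtt extensions. The helper name find_caption_files is hypothetical.

```python
from pathlib import Path

# Assumption: these are the caption/text formats you want to download
CAPTION_EXTENSIONS = {".txt", ".srt", ".vtt"}


def find_caption_files(folder: str = ".") -> list:
    """Return the sorted names of caption files found directly inside folder."""
    return sorted(
        path.name
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in CAPTION_EXTENSIONS
    )


# List whatever caption files exist in the current folder
print(find_caption_files("."))
```

In Colab, each returned name could then be passed to files.download from the google.colab module to save it to your computer.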
Step by Step Video Guide
Disadvantages of Whisper AI + Google Colaboratory
The main disadvantage of using Whisper with Google Colaboratory is that you may need to go through all the steps above when you want to use Whisper later. The speed of transcribing is also not the best since we are using free Google GPUs.
You can install Whisper AI on your own computer, but you need to have a GPU (graphics processing unit).
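Before trying a local install, a quick check like the one below can tell you whether the computer has the pieces mentioned in this article. It is only a rough sketch: it assumes that finding the nvidia-smi tool means an NVIDIA GPU with drivers is present, which is a proxy and not a guarantee. The name check_local_setup is made up for illustration.

```python
import shutil


def check_local_setup() -> dict:
    """Report whether ffmpeg and the NVIDIA driver tool are on the PATH."""
    return {
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # Assumption: nvidia-smi being present suggests an NVIDIA GPU with drivers
        "nvidia_gpu_tool": shutil.which("nvidia-smi") is not None,
    }


for name, found in check_local_setup().items():
    print(f"{name}: {'found' if found else 'not found'}")
```

If either item is reported as not found, the Google Colaboratory route described above is probably the easier option.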
Whisper AI Alternative
Since I managed to get Whisper AI to work, I also wanted to try an alternative speech-to-text service that is web based. I came across TurboScribe, which has a freemium model.
TurboScribe is an alternative to using Whisper AI, and it uses the same technology.
Since it is commercial software, the hardware behind it is better, so you can get faster transcribing speeds. The files that you upload are also stored there, unlike with Google Colaboratory.
Based on several short tests, TurboScribe is much faster and more convenient.
I found that in TurboScribe, translating to another language happens almost immediately, since it translates from the transcribed text. Whisper AI, on the other hand, runs through the audio again and then transcribes it in the other language, so it takes much longer.
There are slight differences in the captions created by Whisper AI vs TurboScribe in English and in Malay. You can download a zip file containing an audio sample (90 seconds long) and the transcribed text in English and Malay to compare them.
Here is the sample audio and transcription; click to download. No registration required.
You will notice that Whisper AI tends to create longer sentences, while TurboScribe tends to create shorter ones. Shorter captions are easier to read, which gives TurboScribe an advantage in keeping your viewers’ attention.
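If you want to put a number on that difference, a small script can compute the average number of words per caption in an .srt file. This is a rough sketch, assuming the standard SRT layout of a counter line, a timestamp line containing "-->", then the caption text. The function name and the sample captions are made up for illustration and are not taken from the downloadable files.

```python
import re


def average_words_per_caption(srt_text: str) -> float:
    """Return the mean number of words per caption in SRT-formatted text."""
    # Captions in an SRT file are separated by blank lines
    blocks = re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip())
    word_counts = []
    for block in blocks:
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:  # the timestamp line; caption text follows it
                text = " ".join(lines[i + 1:])
                word_counts.append(len(text.split()))
                break
    return sum(word_counts) / len(word_counts) if word_counts else 0.0


# Made-up sample captions, only to show the expected input shape
sample = """1
00:00:00,000 --> 00:00:02,000
Hello there

2
00:00:02,000 --> 00:00:06,000
This is a longer caption line
"""
print(average_words_per_caption(sample))
```

Running it on the .srt file from each tool, read with open(...).read(), gives two numbers you can compare directly.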
If you want to be kept updated in tech and AI, join my newsletter here.
[1] Whisper AI description on GitHub: https://github.com/openai/whisper