The Internet is full of posts asking for help with transcribing audio or video files. RealSpeaker can convert audio and video files into text at RUR 7 ($0.11) per minute. Invest Foresight learned the details of the project from its founder, Victor Osetrov, and also tested the program's performance.
How it works
The user goes to the RealSpeaker website, uploads an audio or video file and selects the language. The system measures the file's running time and issues a bill. Once the bill is paid online, transcription starts and the speech is converted into text. The program turns a one-hour interview into text in about 20 minutes. The resulting text contains no punctuation marks and therefore requires editing; it also needs some corrections, since it may turn out to contain words that were not in the original file. But even such an unfinished text can save a lot of time. Solving the punctuation problem is the next stage of RealSpeaker's work.
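The billing step can be illustrated with a little arithmetic. The sketch below is not RealSpeaker's actual code: the function name is hypothetical, the rounding of partial minutes up to a full minute is an assumption, and the processing estimate simply extrapolates the "one hour in about 20 minutes" figure quoted above.

```python
import math

RATE_RUB_PER_MINUTE = 7      # per-minute price quoted in the article
PROCESSING_RATIO = 20 / 60   # article: a one-hour file takes about 20 minutes


def estimate_order(duration_seconds):
    """Return (price in RUR, rough processing time in minutes) for one file.

    Hypothetical helper for illustration only.
    """
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    # Assumption: a partial minute is billed as a full minute.
    billed_minutes = math.ceil(duration_seconds / 60)
    price_rub = billed_minutes * RATE_RUB_PER_MINUTE
    # Linear extrapolation of the quoted turnaround; real times may differ.
    processing_minutes = duration_seconds / 60 * PROCESSING_RATIO
    return price_rub, processing_minutes


price, wait = estimate_order(60 * 60)  # a one-hour interview
print(f"Bill: RUR {price}, ready in about {wait:.0f} minutes")
```

Under these assumptions, a one-hour interview would cost RUR 420 and be ready in roughly 20 minutes.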
RealSpeaker is now used by call centers, telecom companies, media reporters, bloggers, students, authors, lecturers, and anyone else who needs a voice recording turned into typed text. RealSpeaker gets requests to transcribe courtroom hearings and large meetings of municipal authorities (in Togliatti and some cities in Siberia, for example), which have to publish the texts of their sessions. There are companies willing to use RealSpeaker’s API and integrate it into their own commercial products. One minute of service costs RUR 7 for Russian nationals and $0.08 to $0.12 for foreign customers. The service has been available on demand since last February, and within the next few weeks it will also be available by subscription.
RealSpeaker can recognize 16 languages, including Russian, Ukrainian, Spanish, UK English and US English, French, German, Portuguese, Chinese, Korean and Egyptian Arabic. RealSpeaker built its Russian vocabulary database on its own, in the course of transcribing customers’ voice recordings. For other languages, vocabulary databases are purchased from companies specializing in voice recognition.
From Yoshkar-Ola to Galway
Victor Osetrov graduated from university in Yoshkar-Ola (Republic of Mari El). In 2012 he registered RealSpeaker in both Russia and the US and became a resident of an IT park in Kazan (Tatarstan). The startup’s first investor was Svetlana Nikiforova, a business angel and the wife of Russia’s communications minister. Nikiforova’s Startobaza acquired a 10% stake in RealSpeaker USA Inc. A further 10% of that company was purchased by US citizens Pavel Pogodin and Natalia Bugorskaya. The names of other investors have not been disclosed. In 2014 Osetrov went to Chile to launch a version of the RealSpeaker program for Latin America. Since then he has operated from abroad, having lived in Spain, the US and South Korea; he now divides his time between Ireland (Galway) and France (Brest in Brittany).
Until 2013 RealSpeaker lived on money from sponsors and external investors. The startup’s main sponsors included the Russian Academy of Sciences, the Skolkovo Foundation, Tatarstan’s Innovative Development Agency, Microsoft, the Startobaza managing company, the Bortnik Fund for Innovations Support, and others. At that early stage, the company built up its audience and launched a stable version of the program. Since 2013, the company has no longer depended on external funding and has financed its activities from sales. By 2016 RealSpeaker had attracted over 3,500 regular customers, whose licenses were priced between $5 and $69.
From software to web
In 2016 RealSpeaker had to stop selling software and launch a web version. RealSpeaker used to sell software that analyzed lip movements and voice frequencies to transcribe speech into typed text in real time. In 2016, though, the company that supplied the library for video recognition of facial features was acquired by Intel and subsequently changed its line of business. Besides, Windows 10 was released with the Cortana assistant, which was incompatible with RealSpeaker software. Amazon released its Alexa voice service and Apple released its Siri voice assistant. These voice assistants operate in real time and can be used free of charge, so RealSpeaker found it hard to sell its software in such an environment. The company therefore decided to focus on transcribing media files into typed text and to give up real-time speech recognition.
“80% of all audio files are unavailable to search engines because they have not been converted into text. When conversion is needed, it is usually urgent, and our clients are willing to pay”, Victor Osetrov explains. “In Russia, the conversion is usually done manually since labor is cheap”.
RealSpeaker started selling its new product in Russia because its previous product was already known there. Of the company’s customers, 60% are in Russia and a further 10% are in Ukraine. RealSpeaker claims it has no direct competitors in Russia at the moment. To distribute the new product, a new company was set up in Ireland. It has the same name, RealSpeaker, and its shareholders are an Irish government company, some private Spanish and Irish investors, Victor Osetrov and his team members. The names and shares of the investors are not disclosed. Svetlana Nikiforova has by now sold her stake in RealSpeaker.
RealSpeaker has not run any advertising campaign, yet some 50 files are uploaded to its website for transcription on a weekday and some 30 on a day off. Each file may be from a few minutes to three hours long. RealSpeaker is getting ready to launch the official version of its product so that it can offer it to business customers. So far, all orders have been submitted by individuals. By the end of the year the company plans to reach a turnover of 1 million euros. Still, the profit will be rather small, since the cost of renting servers to store files is very high.
By Natalia Kuznetsova