Speech to text

5.23'22

Vosk

Vosk Server

The server requires about 8 GB of memory or more.

docker run -d -p 2700:2700 alphacep/kaldi-en:latest

Client

git clone https://github.com/alphacep/vosk-server
cd vosk-server/websocket
./test.py test.wav
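
For reference, the exchange that test.py performs with the server can be sketched roughly as below. This is a minimal sketch, not the script itself: it assumes the third-party websockets package is installed (pip3 install websockets) and that the server listens on ws://localhost:2700. The helper names config_message, iter_chunks and transcribe are hypothetical. The client sends a JSON config with the sample rate, streams raw PCM chunks, then sends an EOF message and reads the final result.

```python
import asyncio
import json
import sys
import wave


def config_message(sample_rate):
    """Build the JSON config message sent before any audio."""
    return json.dumps({"config": {"sample_rate": sample_rate}})


def iter_chunks(wav, seconds=0.2):
    """Yield raw PCM chunks of roughly `seconds` of audio each."""
    frames_per_chunk = int(wav.getframerate() * seconds)
    while True:
        data = wav.readframes(frames_per_chunk)
        if not data:
            break
        yield data


async def transcribe(path, uri="ws://localhost:2700"):
    """Stream one WAV file to the server and print each JSON reply."""
    import websockets  # third-party; imported here so the helpers work without it

    with wave.open(path, "rb") as wav:
        async with websockets.connect(uri) as ws:
            await ws.send(config_message(wav.getframerate()))
            for chunk in iter_chunks(wav):
                await ws.send(chunk)
                print(await ws.recv())  # partial or intermediate result
            await ws.send('{"eof" : 1}')
            print(await ws.recv())  # final result


# Only run when a WAV path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    asyncio.run(transcribe(sys.argv[1]))
```

Saved under a name of your choice (for example client_sketch.py, a hypothetical filename), it would be run as: python3 client_sketch.py test.wav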

Client with microphone

pip3 install sounddevice
./test_microphone.py -u ws://localhost:2700

Julius

https://github.com/julius-speech/julius

Build

git clone https://github.com/julius-speech/julius.git
cd julius

Fix for macOS: add the line below to libsent/src/adin_mic_darwin_coreaudio.c

#include <sent/stddefs.h>

Then configure and build:

./configure --enable-words-int
make -j4

Run

Download the Julius model named ENVR-v5.4.Dnn.Bin.zip

unzip ENVR-v5.4.Dnn.Bin.zip
cd ENVR-v5.4.Dnn.Bin

Edit the config file dnn.jconf (lines marked - are removed, lines marked + are added):

-feature_options -htkconf wav_config -cvn -cmnload ENVR-v5.3.norm -cmnstatic
+feature_options -htkconf wav_config -cvn -cmnload ENVR-v5.3.norm -cvnstatic

+state_prior_log10nize false
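
The two edits above can also be applied with a small script. This is a minimal sketch, assuming the option line in dnn.jconf starts with feature_options exactly as shown in the diff; the function name patch_dnn_jconf is hypothetical. It swaps -cmnstatic for -cvnstatic on that line and appends state_prior_log10nize false if no such setting is present yet.

```python
import sys


def patch_dnn_jconf(text):
    """Return dnn.jconf text with the two edits applied (idempotent)."""
    lines = []
    for line in text.splitlines():
        # Only touch the feature_options line; -cmnload is left as-is.
        if line.startswith("feature_options"):
            line = line.replace("-cmnstatic", "-cvnstatic")
        lines.append(line)
    # Append the extra setting only if the file does not already define it.
    has_setting = any(
        l.split()[:1] == ["state_prior_log10nize"] for l in lines
    )
    if not has_setting:
        lines.append("state_prior_log10nize false")
    return "\n".join(lines) + "\n"


# Usage: python3 <this script> dnn.jconf  (rewrites the file in place)
if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    with open(path) as f:
        patched = patch_dnn_jconf(f.read())
    with open(path, "w") as f:
        f.write(patched)
```

Keep a backup copy of dnn.jconf before running it, since the file is overwritten.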

Recognize audio file

../julius/julius/julius -C julius.jconf -dnnconf dnn.jconf

Edit test.dbl to test with other audio files.
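
To test many files, test.dbl can be generated instead of edited by hand. This is a minimal sketch under two assumptions that are not stated above: test.dbl is a plain list of audio file paths, one per line, and the model expects 16 kHz, mono, 16-bit WAV input. The function names is_supported and write_filelist are hypothetical.

```python
import sys
import wave
from pathlib import Path


def is_supported(path, rate=16000):
    """Check for mono 16-bit PCM at the assumed sample rate."""
    with wave.open(str(path), "rb") as wav:
        return (
            wav.getnchannels() == 1
            and wav.getsampwidth() == 2
            and wav.getframerate() == rate
        )


def write_filelist(wav_dir, out_path="test.dbl"):
    """Write matching .wav paths to out_path, one per line; return them."""
    paths = [p for p in sorted(Path(wav_dir).glob("*.wav")) if is_supported(p)]
    Path(out_path).write_text("".join(f"{p}\n" for p in paths))
    return paths


# Usage: python3 <this script> /path/to/wavs  (writes test.dbl in the current directory)
if __name__ == "__main__" and len(sys.argv) > 1:
    for p in write_filelist(sys.argv[1]):
        print(p)
```

Files that do not match the assumed format are skipped rather than converted, so the printed list shows what will actually be recognized.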
