Coqui STT in Unity

It is possible to enable Speech Recognition (STT) in Unity with the use of Coqui STT. This is a tutorial on how to do that. Many thanks to @kbabilinski for doing most of the hard work porting DeepSpeech to Unity.

Create a new Unity project.

In Unity go to Edit->Project Settings->Player->Other Settings
Set Configuration->API Compatibility Level to .NET 4.x
Tick the box “Allow unsafe code”.

Create a folder structure as shown below:

Move the SampleScene in the root Scenes folder to the CoquiSTT/Scenes folder, then delete the Scenes folder in the root.

Get the Coqui STT source. Go to the git repository below and click on Code->Download Zip

Unzip the source and go to this directory:

Delete the file STTClient.csproj

Select all contents of this folder and drag into the STTClient folder inside Unity.

Currently there is a bug in the source which causes a number of these compile errors:
‘Stream’ is an ambiguous reference between ‘STTClient.Models.Stream’ and ‘System.IO.Stream’

To fix these compile errors, double click on each compile error to open the source file at the error location.
Change each Stream reference to STTClient.Models.Stream

For example, change this:

unsafe void FreeStream(Stream stream);

To this:

unsafe void FreeStream(STTClient.Models.Stream stream);

Save the file and let Unity compile it. Click on another compile error and repeat the process until all compile errors are resolved.

Next is to download the models. Go to git releases:

Download these files:


Because Unity does not show the extension in the Assets folder explorer, it is a bit hard to tell which file is which, so rename these files like this:

coqui-stt-0.9.3-models.tflite rename to coqui-stt-tflite.tflite
coqui-stt-0.9.3-models.scorer rename to coqui-stt-scorer.scorer

Note that pbmm files are not supported anymore in Coqui STT.

Place the tflite and scorer file in the Unity folder CoquiSTT/models/
If you have custom language and acoustic models you can place them here too.

Unzip the native_client.tflite.Windows.tar.xz file. Then look for the file and rename it to libstt.dll

Place the libstt.dll file in the Unity folder CoquiSTT/Plugins/win64/

Repeat this process for any other (available) platforms you want to support. Download the appropriate native_client package for each platform, then change the .so extension to the extension appropriate for that platform. For example, the extension for iOS must be .bundle, while for Android it must remain .so. Finally, place each library file in the appropriate platform folder inside Unity.

In Unity, select the file libstt, then go to the Inspector and untick the box Windows x86.
Click Apply. Do the same for all appropriate platforms.

Download the Unity scripts here:

These files originally come from a DeepSpeech to Unity port made by @kbabilinski. They have been slightly modified to make them work with Coqui STT and contain a few other small improvements.

Place the files ContinuousVoiceRecorder.cs and SpeechTextToText.cs in the Unity folder CoquiSTT/Scripts

The ContinuousVoiceRecorder script feeds the audio into Coqui STT in real time and processes the intermediate result. The SpeechTextToText script detects the user's voice and processes the audio after the user stops talking.
Both examples can auto detect if the user is speaking using a volume threshold.
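
To illustrate the idea of a volume threshold, here is a minimal sketch in Python. It is not the logic used in the two Unity scripts. It only shows the general technique, assuming audio arrives in blocks of float samples between -1.0 and 1.0. The names rms_level and VolumeGate and the default threshold values are made up for this example.

```python
import math


def rms_level(samples):
    """Root mean square of float samples in the range -1.0 to 1.0."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class VolumeGate:
    """Reports when speech starts and when it has stopped."""

    def __init__(self, threshold=0.02, silent_blocks_to_stop=3):
        # Both defaults are hypothetical; tune them for your microphone.
        self.threshold = threshold
        self.silent_blocks_to_stop = silent_blocks_to_stop
        self.speaking = False
        self._silent_blocks = 0

    def update(self, samples):
        """Feed one block of audio; returns 'start', 'stop', or None."""
        loud = rms_level(samples) >= self.threshold
        if loud:
            self._silent_blocks = 0
            if not self.speaking:
                self.speaking = True
                return "start"
            return None
        if self.speaking:
            # Only stop after several quiet blocks in a row, so short
            # pauses between words do not end the recording.
            self._silent_blocks += 1
            if self._silent_blocks >= self.silent_blocks_to_stop:
                self.speaking = False
                self._silent_blocks = 0
                return "stop"
        return None
```

On "start" the application would begin buffering audio, and on "stop" it would hand the buffer to the recognizer.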

Create an empty game object in the Hierarchy and call it VoiceRecorder.

Drag and drop the ContinuousVoiceRecorder and SpeechTextToText script onto the VoiceRecorder game object.

Select the VoiceRecorder game object, then go to the Inspector and deselect one of the two scripts. Only one should run at a time.

Go to Hierarchy->VoiceRecorder->Inspector->Speech Text To Text (script)->Tflite File Name and enter the file name: coqui-stt-tflite.tflite

Go to Hierarchy->VoiceRecorder->Inspector->Speech Text To Text (script)->Scorer File Name and enter the file name: coqui-stt-scorer.scorer

Do the same with the “Continuous Voice Recorder (Script)” in the Inspector.

Check that your microphone works.

Press Play and say something. The transcription should appear in Console as a Debug.Log

Here is a unitypackage minus the tflite and scorer files (to save space). You will still need to modify the project settings after importing:


Speech recognition with Coqui STT on Windows

Speech recognition and Flight Simulators

Recently, a few airplane addons for consumer PC based flight simulators have reached a level of detail which makes them suitable for home based practice for certain scenarios. There is one big shortcoming of this use case though: multi crew operations. In the real world, most commercial aircraft are flown by two pilots. The interaction between the two pilots is vast and strictly determined using Standard Operating Procedures (SOPs). For example, the pilot who is flying (PF) will instruct the other pilot (Pilot Monitoring, PM) “Gear Down”. PM will then put the gear lever down. Another example of heavy interaction between pilots is how checklists are read. PF will ask for “Landing Checklist”. PM will then respond with “Cabin Crew”. PF will then say “Advised”, followed by PM saying “Auto Thrust”, and PF responding “Speed”, etc.

Situations like this require speech recognition (Speech To Text, STT). Speech recognition has existed for a fairly long time but recognition quality for a relatively small amount of custom short phrases full of jargon has historically been very poor.

For a few years now, commercial AI-based STT solutions which accept custom phrases have been available (such as Microsoft Azure), but they are not suitable for 3rd party app deployment. This is because these services are cloud based (slow, need an internet connection) and charge on a data amount or time basis, making it difficult to create a suitable end user pricing system. In general these cloud based systems are also quite expensive.

Luckily there are now open source solutions available which are free, fast, and can run locally. Their accuracy and customization abilities rival paid counterparts. One such software package is DeepSpeech from Mozilla, but this project was recently scaled down. The core developers of DeepSpeech forked the project and created Coqui STT (and Coqui TTS), which is now in active development.

Out of the box, the supplied model for Coqui STT is not suitable for a flight simulator environment. This is because of the jargon used and phrases being out of context. For example, “gear up” might be recognized as “get up” and “flap one” might be recognized as “let one” (which is more common in everyday use). It is not necessary to train a new acoustic model (large database of sound files with transcription, requiring a huge amount of processing time). Instead, a small custom language model (scorer) will be created to improve accuracy without requiring much processing time. A language model is basically a text file with custom sentences such as “flap one”, “gear down”, etc. To further improve accuracy, data from custom sound files can be incorporated into the model, if needed.

The other required part is converting Text To Speech (TTS). This is also very useful to simulate interaction with a synthetic ATC (Air Traffic Control) system. TTS has been possible for a long time. Most solutions sound quite robotic but Coqui has a TTS solution which sounds surprisingly natural. This will be discussed in another blog post.

The documentation for Coqui STT is somewhat outdated at this moment, so I created a tutorial on how to get this up and running.

Coqui is mainly Linux based but both training and deployment can run on Windows using WSL (Windows Subsystem for Linux). WSL is a virtual Linux environment inside of Windows. It also allows you to use Windows based code editors like VS Code or Visual Studio. Additionally, WSL requires fewer resources than a virtual machine and is free. WSL only works on x64 processors (or ARM).

Note that some users have reported issues with specific firewall applications blocking internet access in WSL.

Coqui STT tutorial

If you have an NVIDIA Pascal GPU or later, you must install a special driver before installing WSL2:
Do not install any NVIDIA driver from a Linux terminal.

To install WSL2 (with Ubuntu 20.04 LTS), follow this guide:
Or this guide:

Don’t forget to reboot after you install WSL2.

-If you start Ubuntu (via Start menu) and you get the error message “wsl 2 requires an update to its kernel component” then run windows update and try again.
-The powershell must run with administrator permissions.
-There is no need to install the windows terminal.
-If you get an error relating to virtual disk, compression, and encryption, you have to disable disk compression for a certain folder. To fix this error, right click on this folder in Explorer: C:\Users\User\AppData\Local\Packages\CanonicalGroupLimited…
Then un-tick the box at: Properties->General->Advanced->Compress contents to save disk space
Also un-tick Encrypt contents. Restart the PC and try starting Ubuntu again (via the start menu)
-To open a new Linux terminal, go to Windows search and type “ubuntu” to open the Ubuntu app.
-You can paste text into the Linux terminal by right clicking in the terminal window. If that doesn’t work, click on the Ubuntu Icon on the terminal window then tick the box in Properties->Options->Use Ctrl+Shift+C…
-In Windows, the Linux folder is located here: < \\wsl$ >
To open the Linux folder in Windows Explorer, type the above directory in the Path edit box. The Linux folder is only visible if a Linux terminal is running.
-When files are added or removed in the Linux environment, the change does not always show up in Windows Explorer. To refresh, navigate out of the directory, then navigate back.
-Currently, Coqui STT only works with Python 3.6 and TensorFlow 1.15.4

If the command "wsl --install" doesn't work, first install all windows updates. Then try "wsl --install -d Ubuntu".

All commands are typed into the Linux terminal.

To make formatting clear, each command is separated by a new line. Long commands appear without a new line.

Install python:

sudo apt-get update

sudo apt-get upgrade

sudo apt-get -y purge python3.8

sudo apt-get -y autoremove

sudo apt install software-properties-common

sudo add-apt-repository ppa:deadsnakes/ppa

sudo apt-get update

sudo apt-get install python3.6-venv

python3.6 -m venv venv-stt

Next is to activate the virtual environment. All STT commands have to be executed in the virtual environment. After installing, you only have to execute this line to start the virtual environment in a new terminal:

source $HOME/venv-stt/bin/activate

-Once the virtual environment is active, the terminal will change to something like this:
(venv-stt) username@USER-PC

Install pip and STT:

python3.6 -m pip install -U pip

python3.6 -m pip install stt

Or for GPU support you can run this:

python3.6 -m pip install stt-gpu

Download the tflite acoustic models:

curl -LO
curl -LO

-If the links above don’t work then you can find the files here:
-The pbmm model is not supported anymore, only the tflite model is supported.
-The scorer file is a generic language model and will not work well in a flight simulator environment. A custom language model has to be generated. More on that later.
-There is also a quantized tflite file available. This model is both smaller and faster than the unquantized model but it is also less accurate.

Custom language model

To create a custom language model, follow the steps below.

Clone the Coqui STT Git Repo:

git clone

Install dependencies:

pip install progressbar

pip install progressbar2

sudo apt-get update

sudo apt-get install build-essential libboost-all-dev cmake zlib1g-dev libbz2-dev liblzma-dev

sudo apt-get install libboost-all-dev libeigen3-dev

Get KenLM repository and build:

git clone

cd kenlm

mkdir build

cd build

sudo apt-get -y install cmake

sudo apt-get install build-essential

cmake ..

make -j 4

sudo make install

cd ~

Create a language model text file:

Use a text file with one sentence per line. Use UTF-8 encoding. This text should not contain any markup language. Remove any punctuation, but you can keep the apostrophe as a character. Numbers should be written in full (i.e. as a cardinal), that is, as eight rather than 8.

A sample file with WAV, CSV, and a language model file can be found here:

Place the language model in a file called language_model.txt and place it in the home/username directory (replace username with your username). In Windows Explorer, type < \\wsl$ > in the path bar to see the Linux drive. The full directory where you should place the file is this:

Example language model file format:

flap one
flap two
flap three
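
The cleanup rules above can be applied with a small script. The following is a minimal sketch, not part of the Coqui tooling. The function names are made up for this example. Lowercasing and the restriction to a-z are assumptions that fit an English alphabet file, so adjust them for other languages. Lines containing digits are rejected rather than converted, because numbers have to be written out by hand.

```python
import re

# Characters kept in the language model text: a-z, apostrophe and space.
# This assumes an English alphabet; adjust it for other languages.
_DISALLOWED = re.compile(r"[^a-z' ]")


def clean_sentence(line):
    """Lowercase a sentence and strip punctuation, keeping apostrophes."""
    text = line.strip().lower()
    text = _DISALLOWED.sub(" ", text)
    # Collapse runs of whitespace left behind by removed characters.
    return " ".join(text.split())


def clean_corpus(lines):
    """Return cleaned sentences, skipping lines that end up empty.

    Raises ValueError for lines with digits, since numbers must be
    written in full ("eight", not "8").
    """
    cleaned = []
    for line in lines:
        if any(ch.isdigit() for ch in line):
            raise ValueError(f"Write numbers in full: {line.strip()!r}")
        sentence = clean_sentence(line)
        if sentence:
            cleaned.append(sentence)
    return cleaned
```

Write the returned sentences to language_model.txt, one per line, using UTF-8 encoding.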

Create the required binary and vocab files from the language model:

python3.6 STT/data/lm/ --input_txt language_model.txt --output_dir . --top_k 500000 --kenlm_bins kenlm/build/bin/ --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie --discount_fallback

-The generate_lm command will save the new language model as two files on disk: lm.binary and vocab-500000.txt

Get the generate_scorer_package binary and extract:

curl -LO

tar -xvf native_client.tflite.Linux.tar.xz --directory STT/native_client/

Install dependencies:

pip install --upgrade pip setuptools

pip install optuna

python3.6 -m pip uninstall tensorflow

python3.6 -m pip install tensorflow==1.15.4

pip install coqui-stt-ctcdecoder

pip install coqui-stt-training

To fix “command not found”:

sudo chmod +x STT/native_client/generate_scorer_package

To fix “ not found”:

Copy this file:


Then move the file here:


Generate scorer file:

sudo STT/native_client/generate_scorer_package --alphabet STT/data/alphabet.txt --lm lm.binary --vocab vocab-500000.txt --package kenlm.scorer --default_alpha 0.5891777425167632 --default_beta 0.6619145283338659

-If you get this error: “Invalid label 0”, it probably means that the path to the alphabet file is incorrect.
-The message: “Doesn’t look like a character based (Bytes Are All You Need) model”, is not an error.
-The message: “--force_bytes_output_mode was not specified, using value infered from vocabulary contents: false”, is not an error.
-The output is kenlm.scorer

Make some WAV files with phrases which exist in the language model.

Audio file format:
Audio files should be WAV, 16 bit, 16 kHz, mono.
Place the audio and csv files (you can use the sample files mentioned earlier) in home/username/STT/data/
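
A wrong audio format is easy to miss, so it can help to check the files before using them. Below is a minimal sketch using the wave module from the Python standard library. The function name check_wav_format is made up for this example.

```python
import wave


def check_wav_format(path):
    """Return a list of problems; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16 bit
            problems.append(f"expected 16 bit, found {wav.getsampwidth() * 8} bit")
        if wav.getframerate() != 16000:
            problems.append(f"expected 16000 Hz, found {wav.getframerate()} Hz")
    return problems
```

For example, check_wav_format("STT/data/flap_1.wav") should return an empty list for a correctly recorded file.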

Check the accuracy of the custom language model:

stt --model model.tflite --scorer kenlm.scorer --audio STT/data/flap_1.wav


stt --model model_quantized.tflite --scorer kenlm.scorer --audio STT/data/flap_1.wav

The output will be the last line in the console.

If the accuracy is not so good, a new scorer file (language model) must be generated with updated alpha and beta values. To do this, first create audio and csv files of the language model. I made a special tool for this which you can find here (source included, C#, WPF):

Download (WPF C# source included):

Name the text file sample.csv and place in STT/data/
Do not name the sample file train.csv, dev.csv, or test.csv, otherwise the scripts won’t be able to automatically create the required database split.

CSV file format for manual editing:

The first line of the csv file needs to be exactly this: wav_filename,wav_filesize,transcript
The wav_filesize is the file size in bytes. You can get this in Windows by right clicking on the file->Properties->Size: (not size on disk). For wav_filename you can use either the path to the WAV file or just the WAV file if it is placed in the same directory as the csv file.

The transcript should be exactly the same as the audio file. It should not contain any markup language. Remove any punctuation, but you can keep the apostrophe as a character. Numbers should be written in full (i.e. as a cardinal), that is, as eight rather than 8.

Example csv file format:

flap_1.wav,50464,flap one
flap_2.wav,50712,flap two
flap_3.wav,54208,flap three
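
If you prefer a script over manual editing, the CSV file can be generated from the WAV files. The following is a minimal sketch using only the Python standard library. The function name write_sample_csv and the transcripts dictionary are made up for this example, and it assumes the WAV files sit in the same directory as the CSV file.

```python
import csv
import os


def write_sample_csv(wav_dir, transcripts, csv_path):
    """Write a CSV with the header and columns described above.

    transcripts maps a WAV file name (inside wav_dir) to its transcript.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for name in sorted(transcripts):
            # File size in bytes, the same value Windows shows as "Size".
            size = os.path.getsize(os.path.join(wav_dir, name))
            writer.writerow([name, size, transcripts[name]])
```

For example: write_sample_csv("STT/data", {"flap_1.wav": "flap one"}, "STT/data/sample.csv")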

Get a checkpoint file from the generic model and extract it:

curl -LO

tar -xvf coqui-stt-1.0.0-checkpoint.tar.gz

-Do not download and extract the checkpoints file on Windows and then transfer to the Linux folder because then lm_optimizer will fail.

Generate new alpha and beta values:

python3.6 STT/ --alphabet_config_path STT/data/alphabet.txt --scorer_path kenlm.scorer --auto_input_dataset STT/data/sample.csv --checkpoint_dir coqui-stt-1.0.0-checkpoint --n_trials 6 --n_hidden 2048 --lm_alpha_max 5 --lm_beta_max 5

-The script will try random alpha and beta values to see which gives the best result so you will get a different output every time.
-For n_trials use 2400 if you have time as that gives a more accurate output.
-The output is for example: Best params: lm_alpha=1.58401780601227 and lm_beta=1.796448020609769
-This will also output the following files (see the dataset distribution chapter for more details): train.csv, dev.csv, test.csv into STT/data/
-The WAV files corresponding to the csv files should be present also.

Copy the updated alpha and beta values, then run generate_scorer_package again using the updated values:

sudo STT/native_client/generate_scorer_package --alphabet STT/data/alphabet.txt --lm lm.binary --vocab vocab-500000.txt --package kenlm.scorer --default_alpha 1.58401780601227 --default_beta 1.796448020609769
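
Copying the values by hand is error prone, so this step can be scripted. The sketch below assumes the output line looks exactly like the "Best params" example above. The function names are made up for this example. It only builds the argument list for the generate_scorer_package call used in this tutorial and does not run it.

```python
import re

# Matches e.g. "Best params: lm_alpha=1.58 and lm_beta=1.79"
_BEST = re.compile(r"lm_alpha=([0-9.eE+-]+) and lm_beta=([0-9.eE+-]+)")


def parse_best_params(output_line):
    """Extract (alpha, beta) as strings from a 'Best params' line."""
    match = _BEST.search(output_line)
    if match is None:
        raise ValueError("no lm_alpha/lm_beta values found")
    return match.group(1), match.group(2)


def scorer_command(alpha, beta):
    """Build the generate_scorer_package call used in this tutorial."""
    return [
        "STT/native_client/generate_scorer_package",
        "--alphabet", "STT/data/alphabet.txt",
        "--lm", "lm.binary",
        "--vocab", "vocab-500000.txt",
        "--package", "kenlm.scorer",
        "--default_alpha", alpha,
        "--default_beta", beta,
    ]
```

The returned list can be passed to subprocess.run from the home directory.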

Transcribe the audio files again to see if the model now performs better:

stt --model model.tflite --scorer kenlm.scorer --audio STT/data/flap_1.wav


You can further improve the recognition accuracy of a custom language model using a process called “fine-tuning”. This is done using a pre-trained generic acoustic model and custom WAV files with transcripts specific to the custom language model. This works especially well if you use WAV files from your own voice.

The following command performs fine-tuning using the default checkpoint dataset and the WAV files supplied by the CSV files:

python3.6 -m coqui_stt_training.train --checkpoint_dir coqui-stt-1.0.0-checkpoint --train_files STT/data/train.csv --dev_files STT/data/dev.csv --test_files STT/data/test.csv --n_hidden 2048 --load_cudnn true --epochs 3

-The output will be placed in --checkpoint_dir
-The flag “--epochs 3” is only meant for a quick test to see if everything is working. Remove it for actual training, which will take a long time.

The three CSV files were created by the --auto_input_dataset flag in the earlier alpha and beta optimization step. If you only have one dataset with 100% coverage (sample.csv), you can run the command below instead. Keep in mind though that auto_input_dataset creates a random distribution, while the csv files used should stay the same throughout the training process:

python3.6 -m coqui_stt_training.train --checkpoint_dir coqui-stt-1.0.0-checkpoint --auto_input_dataset STT/data/sample.csv --n_hidden 2048 --load_cudnn true --epochs 3

The training output is not suitable for deployment and has to be converted to a tflite model first:

python3 -m coqui_stt_training.export --checkpoint_dir coqui-stt-1.0.0-checkpoint --export_dir STT/data/


To finish up, generate new alpha and beta values using the train and dev csv files:

python3.6 STT/ --alphabet_config_path STT/data/alphabet.txt --scorer_path kenlm.scorer --test_files STT/data/train.csv STT/data/dev.csv --checkpoint_dir coqui-stt-1.0.0-checkpoint --n_trials 6 --n_hidden 2048 --lm_alpha_max 5 --lm_beta_max 5

-Use --n_trials 2400 if you have time as that gives more accurate values.
-The script will try random alpha and beta values to see which gives the best result so you will get a different output every time.

Now create a new scorer package using the alpha and beta values from the previous step:

sudo STT/native_client/generate_scorer_package --alphabet STT/data/alphabet.txt --lm lm.binary --vocab vocab-500000.txt --package kenlm.scorer --default_alpha 0.931289039105002 --default_beta 1.1834137581510284

The fine-tuned model is now finished, so we can test its performance and transcribe some audio using the new tflite file:

stt --model STT/data/output_graph.tflite --scorer kenlm.scorer --audio STT/data/flap_1.wav

Dataset distribution

For machine learning, three datasets are required. For an improved workflow, all WAV files and the three different csv files should be placed in the same folder. You can either use one csv file with 100% of the WAV files (sample.csv in the example in this blog) and automatically generate the three separate csv files from that (using the --auto_input_dataset flag), or you can distribute the data using your own script. If you distribute the data using your own script, the ratio of content should be more or less like this (WAV samples must be randomly taken from all the data):

70% of all WAV and corresponding csv files for training (train).
20% of all WAV and corresponding csv files for validating (dev).
10% of all WAV and corresponding csv files for testing (test).

Meaning of train, dev, and test:

Training (train): for training the model.
Validation (dev): to keep checking the performance of your model in order to know when to stop training.
Testing (test): used to check the performance once training is finished.

Do not create the dataset manually as that is too much work and is not reproducible.
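
If you want to distribute the data with your own script, the sketch below shows one way to do it. It is not the method used by --auto_input_dataset. It only illustrates a random 70/20/10 split that is reproducible because of a fixed seed. The function name split_dataset is made up for this example.

```python
import csv
import random


def split_dataset(sample_csv, train_csv, dev_csv, test_csv, seed=0):
    """Split sample_csv into 70% train, 20% dev, 10% test, reproducibly."""
    with open(sample_csv, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    header, samples = rows[0], rows[1:]

    # A fixed seed gives the same random split on every run.
    random.Random(seed).shuffle(samples)

    # Integer arithmetic avoids floating point rounding surprises.
    n_train = len(samples) * 7 // 10
    n_dev = len(samples) * 2 // 10
    parts = {
        train_csv: samples[:n_train],
        dev_csv: samples[n_train:n_train + n_dev],
        test_csv: samples[n_train + n_dev:],
    }
    for path, part in parts.items():
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f, lineterminator="\n")
            writer.writerow(header)
            writer.writerows(part)
    # Number of samples written to each file.
    return {path: len(part) for path, part in parts.items()}
```

Because every sample ends up in exactly one file, the three data sets do not overlap.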

The data sets should not overlap too much. More information about overlap, overfitting, and the train, dev, and test data sets is available here:


GPU inference is not supported. The GPU can only be used for training.

-Use “--train_cudnn true” instead of “--load_cudnn true” if you have a CUDA GPU. The graphics card drivers need to be installed correctly and the GPU must be supported by Coqui STT.

-CUDA software requirements:

-When using WSL, you need a special NVIDIA driver:

-For TensorFlow 1.15.4 use CUDA 10.0 and cuDNN 7.4.2

-More information:


Further improvements to the recognition accuracy can be made by loading a different language model (scorer) depending on the phase of flight. For example, during taxiing different phrases are expected compared to the cruise phase. Also use a numbers-only language model when only numbers are expected to be heard.

Another way to improve command recognition is to force the language model to recognize words in the correct order. For example, let’s consider the language model below:

flap one
gear down

A misheard output could be “flap down” or “gear one”. Currently there is no way to force words to be recognized in the correct order as specified in the language model. However, there is a hack to accomplish the same thing. Words in sentences can be forced to be recognized in the correct order by removing the spaces. So the language model in the example above would be like this:

flapone
geardown
Removing the space between words will only work for a system required to recognize commands, but for our use case it will do.
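
With this hack the recognizer returns tokens without spaces, so the application needs to map them back to readable commands. A minimal sketch of that mapping is shown below. The function names are made up for this example.

```python
def join_words(sentence):
    """Turn 'flap one' into the single token 'flapone'."""
    return "".join(sentence.split())


def build_command_lookup(commands):
    """Map each joined token back to the original, readable command."""
    return {join_words(command): command for command in commands}
```

The keys of the lookup are the lines to write into the language model text file, and the values are what the application shows or acts upon.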

With a custom language model, only words within that language model will be recognized so the solution below is only applicable if a generic language model is used.

As a last line of defense, include common transcription mistakes in the command detection logic. For example, if your code is looking for “gear down”, also execute the command for transcriptions like “git down”, “dear down”, “gear done”, etc, provided the altitude and speed of the aircraft is reasonable for lowering the gear. This is also how humans operate on instructions in a familiar environment, through context aware assumption. For example, when you are PM and the airplane is at final approach and you hear PF say “gear don”, you know what he means and you will lower the gear. But if you are at cruising altitude and you hear “gear don”, or even the correct phrase “gear down”, then you might say “say again?”. This logic can be included in the deployment application and it should work quite well.
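
The logic described above could look something like the sketch below. The list of misheard variants is taken from the examples in the text. The altitude and speed limits are hypothetical placeholders for illustration only and are not real aircraft data.

```python
# Transcriptions which are all treated as "gear down".
GEAR_DOWN_VARIANTS = {"gear down", "git down", "dear down", "gear done", "gear don"}

# Hypothetical limits for illustration only, not real aircraft data.
MAX_GEAR_ALTITUDE_FT = 5000
MAX_GEAR_SPEED_KT = 250


def handle_transcript(transcript, altitude_ft, speed_kt):
    """Return 'lower gear', 'say again', or 'ignore' for a transcript."""
    # Normalize case and whitespace before comparing.
    phrase = " ".join(transcript.lower().split())
    heard = phrase in GEAR_DOWN_VARIANTS
    plausible = (altitude_ft <= MAX_GEAR_ALTITUDE_FT
                 and speed_kt <= MAX_GEAR_SPEED_KT)
    if heard and plausible:
        return "lower gear"
    if heard:
        # The phrase matches but the flight phase makes it unlikely.
        return "say again"
    return "ignore"
```

Note that even the correct phrase “gear down” results in “say again” when the aircraft state makes the command implausible.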

Other important points (applicable to both recording for fine-tuning and inference, where inference/deployment is speech recognition applied to an audio sample):

-Check the microphone output waveform (looking for volume). If the microphone output is too loud, it can lead to clipping. When the microphone volume is too low, STT will not work well, even if the Signal to Noise Ratio is high.
-Some microphones use automatic gain. This will lead to a lot of background noise in the silent segment before speech, resulting in poor STT quality. Disable this feature if possible. Automatic gain can be easily seen in sound recording applications such as Audacity by inspecting the resulting waveform before, during, and after speech when a lot of static noise is present.
-There are many types of microphones. There are differences in direction sensitivity, general sensitivity, frequency response, and recording quality. Use the same type of microphone for both training and inference. Keep in mind though that an overly sensitive microphone used in a noisy environment will not lead to good results.
-Microphones can pick up physical vibrations just as easily as sound so make sure that the microphone is unaffected by environmental vibrations caused by fans, typing, etc.
-Have as little background noise as possible. A silent background is always better unless the speech corpus has been mostly trained on noisy backgrounds. In a flight simulator environment (or any simulator or game environment), a headset has to be used because the speaker output will interfere with the STT process.
-Do not breathe on the microphone.
-Use speech training data which matches the deployment environment (type of microphone, speaker demographics, etc).
-Speak clearly, not too fast and articulate well.
-The audio buffer fed to the recognizer must not be clipped at the beginning or end of the recording. Clipping can happen due to inaccurate Voice Activity Detection (VAD) results or when using Push To Talk (PTT), due to timing issues with the start recording and end recording of the Audio API used. If audio buffering is applied incorrectly, it can happen that audio data from a previous spoken sentence is included in the next one. To check for this behavior, make fast successive utterances using PTT and inspect the individual wave forms.
-Some audio players like Windows Groove Music do not play short sound files correctly and make it appear the audio is cut off at the end. Keep this in mind when debugging language model files or inference code applications. It is best to play sound files in Audacity so that you can see the waveform at the same time.
-When using PTT, it is important not to press the talk button too late and not to release the talk button too early. For the button release, a small delay can be built in to solve the problem of users releasing the button a fraction too early, but for the start of the audio recording, no workaround exists.
-The end user program needs to use either threads or tasks when calling STT and audio buffering related functions in order for the UI to remain responsive. This adds code complexity and should be thoroughly tested as it can easily lead to crashes.
-When using continuous inference, some additional logic has to be added which resets the output text if it is to be used for commands, otherwise STT will just keep adding words to the output text.
-Continuous inference in a noisy environment will lead to a lot of false positives, especially if the background noise contains speech.
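
Microphone volume problems such as clipping or a very low level can also be checked in code. The sketch below computes a few basic numbers from 16 bit mono PCM data, which is the sample format used throughout this tutorial. The function name analyze_pcm16 is made up for this example. Which values count as too loud or too quiet depends on your setup, so no thresholds are included.

```python
import array
import math
import sys


def analyze_pcm16(pcm_bytes):
    """Return (peak, rms, clipped_fraction) for 16 bit mono PCM bytes."""
    samples = array.array("h")  # signed 16 bit integers
    samples.frombytes(pcm_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0, 0.0, 0.0
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Samples at the limits of the 16 bit range indicate clipping.
    clipped = sum(1 for s in samples if s >= 32767 or s <= -32768)
    return peak, rms, clipped / len(samples)
```

A clipped fraction above zero means the microphone volume is too high. A very low rms value during speech means it is too low.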

When developing an end user application (for deployment/inference), it is very important to write exactly the same audio stream which is sent to the recognizer, to a WAV file as well. This way the audio can be inspected with a wave editor like Audacity. In doing so, buffer clipping issues (both in volume and start/end) can be easily detected.
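
A minimal sketch of this idea is shown below, using the wave module from the Python standard library. The class name DebugWavWriter is made up for this example. It assumes the recognizer receives 16 bit, 16 kHz, mono PCM data as bytes.

```python
import wave


class DebugWavWriter:
    """Collects every buffer sent to the recognizer and saves it as a WAV."""

    def __init__(self, sample_rate=16000):
        self.sample_rate = sample_rate
        self._chunks = []

    def feed(self, pcm_bytes):
        # Call this with exactly the bytes handed to the recognizer.
        self._chunks.append(bytes(pcm_bytes))

    def save(self, path):
        """Write the collected audio to path and start a new recording."""
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)   # mono
            wav.setsampwidth(2)   # 16 bit
            wav.setframerate(self.sample_rate)
            wav.writeframes(b"".join(self._chunks))
        self._chunks.clear()
```

Call feed at the same place in the code where the audio is passed to the recognizer, then call save once per utterance so that each recording can be inspected separately.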

Below is an example of three separate recordings which show that the buffer is cut short at the end of the recording and instead is added to the next recording, caused by buggy audio buffering code on my part. In addition, the microphone volume is set too high:

Without an output like this, inference will simply fail and you have no idea why.

Below is an example of microphone automatic gain. When the recording is started the signal is low but then progressively increases to very loud as the driver automatically increases the gain causing the background noise (mostly the fan from the laptop) to be much more pronounced. After some louder audio input is detected (typing on keyboard), the gain is automatically adjusted and a new background noise base line can be seen. The automatic gain applied in the driver cannot be modified on this laptop (Dell XPS 13 9343) and is therefore not suitable for STT as it is hard for the first word to be recognized due to the excessive gain on the microphone.

Practical Usage

In order to use Coqui STT in a flight simulator and do something useful with it, we need to use real time streaming audio for continuous recognition using a custom program on Windows. How to do this with Visual Studio and C# will be discussed in another blog post.

Deployment on Windows

Transcribing voice to text using Coqui STT can be done with a standalone Windows executable.

C# WPF sample

This is a tutorial on how to get the C# WPF source sample to compile using Microsoft Visual Studio.

Get the Coqui STT source. Go to the git repository below and click on Code->Download Zip

Unzip the source and copy this folder to another location:

Double click on STT.WPF.sln in the \dotnet\STTWPF\ folder.

If you get the error “Project Target Framework not installed”, click on download.
Install all missing developer packs. Some more related errors might appear in the Error list.

After installing all missing framework developer packs, restart Visual Studio.

Select the STT.WPF build target (Debug and x64). Click start to build.

Currently there is a bug in the source which causes a number of these compile errors:
‘Stream’ is an ambiguous reference between ‘STTClient.Models.Stream’ and ‘System.IO.Stream’

To fix these compile errors, double click on each compile error to open the source file at the error location.
Change each Stream reference to STTClient.Models.Stream

For example, change this:
unsafe void FreeStream(Stream stream);
To this:
unsafe void FreeStream(STTClient.Models.Stream stream);

Next is to download the models. Go to git releases:

Download these files:


Copy the kenlm.scorer file from here: home/username/kenlm.scorer

Place the tflite and scorer file in the executable directory (dotnet\STTWPF\bin\x64\Debug)
Unzip native_client.tflite.Windows.tar.xz, then place all *.so files in the executable directory too.

Go to the Solution Explorer, right click STTClient, then select Build.

Select STT.WPF as the build target on the top bar, then click Start to build. If any additional ambiguous Stream errors pop up, fix those first as stated above.

An error will pop up: “Cannot find the model file”. Go to the line which triggers the error and change the file name to “model_quantized.tflite”.

Save and run again.

In the program, click on “Enable external” (to enable the Coqui STT scorer).
Select a microphone device, then click Record, say something, then click stop. The transcript should show up.

Create a project from scratch

To create a project from scratch, follow the instructions below.

In Visual Studio, create a new C# WPF Application project. Name it TestSTT.
Do not select “place solution and project in the same directory”.

Get the Coqui STT source. Go to the git repository below and click on Code->Download Zip

Unzip the source and copy this folder:

Place the STTClient folder into the project folder.

Go to Solution Explorer->Right click on your Project (not the solution)->Properties->Build tab->
Select Platform target: x64
Select “Allow unsafe code”
Change the target in the task bar to x64 (Debug, x64, Start).

Go to Solution Explorer->Right click on the solution->Add->Existing project. Open the STTClient folder which you placed in the project folder before, then select the STTClient.csproj file and add it to the project.

Go to Solution Explorer->Right click on your Project (not the solution, and not STTClient)->Add->Project Reference. Then on the left side bar select Projects->Solution. Tick the box next to STTClient, then click Ok.

Go to Solution Explorer->Right click on “STTClient”->Build

Currently there is a bug in the source which causes a number of these compile errors:
‘Stream’ is an ambiguous reference between ‘STTClient.Models.Stream’ and ‘System.IO.Stream’

To fix these compile errors, double click on each compile error to open the source file at the error location.
Change each Stream reference to STTClient.Models.Stream

For example, change this:
unsafe void FreeStream(Stream stream);
To this:
unsafe void FreeStream(STTClient.Models.Stream stream);

Scroll past this error (it will be fixed after changing the stream reference):
“does not implement interface member”

Make sure all “Stream is an ambiguous reference” related errors are fixed first.

Go to Solution Explorer->Right click on “STTClient”->Build. This time it should succeed. If it complains about a missing metafile for STTClient.dll, close Visual Studio and open it again, then build again.

Next is to download the models. Go to git releases:

Download these files:


Place the tflite and kenlm.scorer file in the executable directory (TestSTT\bin\Debug\netcoreapp3.1)
Unzip native_client.tflite.Windows.tar.xz, then place all *.so files in the executable directory.

Open App.xaml.cs or MainWindow.xaml.cs and add this line:
using STTClient.Interfaces;
If this gives an error, add the STTClient reference to the project as described before.

Demo App

A WPF C# demo application which supports PTT and continuous inference is available here:

The tflite and scorer file have to be added to the executable directory.

Check the required gain in a wave editor like Audacity first. Too much gain will lead to clipping.

VAD is not supported yet.

Press F7 for PTT. The window doesn’t have to have the focus in order to receive the key press because it uses a global key hook.

The audio heard by the STT engine is sent to the /WavOutput directory if “Output debug WAV” is ticked.


Once GPU inference is supported, it will lead to better performance. But whether CPU or GPU inference is used, it is still good to keep in mind performance issues.

On a 2.2 GHz i5 processor, CPU usage is about 30% continuously if FeedAudioContent and IntermediateDecode are used in a loop. This is not acceptable in a real time 3D rendering use case. It would be better to either use a PTT solution (only send an audio sample for recognition when needed), or use streaming VAD to detect when audio is to be decoded into speech. The disadvantage of the latter is that it has to use a silence period to detect end of speech which leads to both latency and recognition errors if pauses in speech are too long.

The best solution depends on the use case. For ATC conversations, a PTT button is definitely the best option as that is both realistic (PTT is done in real life also), has the lowest latency, and is the most performant. For cockpit crew interaction, it is best to offer a user configurable option of either PTT, continuous recognition, or VAD, depending on the user preference.

When using VAD, it is important to set the microphone volume at an appropriate level. If STT requires a different volume, the wave data can be scaled in code. Note that volume detection alone is not a good way to implement VAD. More information here:
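
For the in-code scaling mentioned above, a minimal sketch could look like this. It assumes the audio is available as 16-bit PCM samples in a short[] array. The AudioGain class and its Apply method are hypothetical names, not part of the Coqui STT client.

```csharp
using System;

public static class AudioGain
{
    // Multiplies 16-bit PCM samples by a gain factor.
    // Values outside the 16-bit range are clamped so they clip instead of wrapping around.
    public static short[] Apply(short[] samples, float gain)
    {
        var result = new short[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            float scaled = samples[i] * gain;
            if (scaled > short.MaxValue) scaled = short.MaxValue;
            if (scaled < short.MinValue) scaled = short.MinValue;
            result[i] = (short)Math.Round(scaled);
        }
        return result;
    }

    public static void Main()
    {
        short[] input = { 1000, -2000, 20000, -30000 };
        short[] output = Apply(input, 2.0f);
        Console.WriteLine(string.Join(", ", output)); // prints "2000, -4000, 32767, -32768"
    }
}
```

The last two samples in the example hit the clamp, which is the clipping that too much gain causes.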

Note that using continuous speech recognition with a small language model will lead to many false positives if the background noise level is too high and contains speech fragments.

Further reading

Sociopaths in Aviation

It is normal that you can’t get along with everyone, there is nothing wrong with that. The highly complex but logical environment we operate in is mostly dictated by SOPs and regulations, you can have a good day at work with someone who you normally wouldn’t hang out with. That is a good thing. But every once in a while you come across someone who very few people like to work with. It’s not just you. We have all seen it and it doesn’t just apply to aviation.

Luckily the danger of personality issues is a widely recognized problem in Aviation. There have been many incidents and accidents in which personality clashes were a factor. CRM classes alone cannot solve this issue because it is not possible to change someone’s personality unless that person puts a significant amount of effort into that over a long period of time. People can change their behavior but only with their own free will. But, when under pressure, the worst part of someone’s personality usually surfaces.

A person should be able to set aside his personality issues, focus on the job, and do what’s right not who’s right. But this is a theoretical notion which sometimes doesn’t match with reality. If you disagree, you haven’t been flying long enough.

We are all people with feelings and empathy, so why are some people horrible to work with? Because some people have very little empathy. They are sociopaths, and are a danger to Aviation. Let me explain. Sociopathy is a personality disorder which has many facets but the things you see the most in pilots regarding this are a combination of:

-Lack of empathy.

-Being arrogant, the feeling of being “better” than everyone else.

-Talking about other people as if they are stupid and clueless.

-Unable to keep positive professional relationships.

-Not being fair.

-Verbally blunt, lack of tact.

-Drive to acquire ever higher positions.

-Abuse of power.


-Not taking input from others seriously. Wanting to have it their way.

In a theoretical professional environment these things shouldn’t be a problem as it should be possible to operate the aircraft safely, even if you can’t get along with someone. Unfortunately this is not the reality. When someone pisses you off enough so that it occupies your thoughts for a long time, it can become a safety issue.

You are less likely to help a person who has made a mistake but has previously mistreated you. That is human psychology and very hard to resist. Another big issue is that your thoughts are likely to be somewhere else if you just had an “event” on the social level with your colleague but are now in a high workload situation.

Social friction is especially a problem when there is a high cockpit gradient, a senior pilot flying with a far less senior pilot. The less senior pilot is even less likely to help out if his/her colleague makes a mistake. This is both due to the unwillingness to speak up, and the inner doubt this creates. I have been there when I was a cadet. It is a serious issue.

The issues don’t have to start in the airplane. If someone misbehaves sufficiently in the office, the simulator, or even during days off, the negative effects of it can be carried over into the aircraft.

I think the problem of personality disorders like sociopathy, or worse, psychopathy in Aviation is not being taken seriously enough. Airlines and flight schools seem to favor the cleverest pilots who score the highest in the aptitude tests. Sure there are some personality tests like questionnaires and group assignments but these are easily faked. Because sociopaths are generally also highly intelligent, they know they have to modify their behavior temporarily in order to get hired.

So how to solve this problem? There is only one way in my opinion. Don’t engage in verbal conflict and operate the aircraft to the best of your abilities within the framework of CRM. When the situation is bad enough, ask the rostering department not to fly with that person again and call in sick if you have to. Someone else is not going to change so if the situation is a safety issue, it is best to avoid it altogether.

Hopefully this shines some light on a publicly little discussed topic. If you have anything to add, please let me know in the comments.

A320 Descent Energy Management

When I first started flying jets, I struggled with descent energy management. I ended up too high, too low, and didn’t know when to use the speed brakes. It wasn’t until I had about 2000 hours on the jet before I finally understood the principle. But why did it take so long? I used to think I was the only one who took so long to understand this subject but in hindsight, that didn’t turn out to be the case.

I started flying on the B737 and later transitioned to the A320. This was easy as far as energy management is concerned because these aircraft behave practically the same way. Ten years after I started flying, I started instructing on the A320. It is interesting to see flying from an instructor point of view, and I learned something very important: just because someone else understands something, doesn’t mean it is easy. What I observed is that all new pilots struggle with descent energy management. Every single one of them. What is even more interesting is that I sometimes see experienced captains who still don’t get it.

So I was right after all. Descent Management isn’t easy. It is very difficult. But it is not rocket science either, so why did it take so long for me to understand it, and why do some pilots never seem to understand it? I believe there are a few reasons for this. There are some instructors who themselves don’t get it. Some instructors use over complicated and non-intuitive methods and math, confusing the student. Most instructors have forgotten that it is a difficult subject and subsequently don’t spend enough time explaining it. Often technical details are taken for granted but what is obvious to you might not be obvious to someone else. And finally, one of the problems is that there is no standard way of dealing with Descent Energy Management. There are many ways to do it but each instructor insists on their own way, confusing the student pilot even more.

One other problem is that the student pilot is often overloaded with new information. The learning curve is steep and Descent Energy Management is just one of many subjects to master. The first few days a fresh new pilot who just graduated from flight school (Cadet) flies on the real aircraft, the crew is given an additional pilot (safety pilot) to make sure the aircraft can land safely in case the Captain becomes incapacitated. This has mostly to do with landing skills. However, if you can’t put the aircraft on the approach with an appropriate energy state (speed and altitude), you are not going to land it in the first place. So it is important to make descent energy management a priority right from the start.

Currently very little documentation exists about Descent Energy Management on the A320. There are some documents about this subject available, some even from Airbus, but they are all either too theoretical, not practical, not written for pilots (for ATC instead), or overly simplified. So to address this issue, I decided to write a book about the subject myself. That being said, Descent Energy Management is not something you can learn from a book alone as it takes a lot of experience to fully master this subject. A well written and easy to understand guide on the subject will make the learning process much easier though. And hopefully it will become the standard one day, doing away with the jungle of different methods out there.

The book focuses on what most inexperienced pilots struggle with but it contains everything there is to say about the subject. It also contains a lot of real world examples and lots of advice, especially on how to make things easier.

Below are some excerpts of the book.

energy management 5

energy management 4energy management 3energy management 2

There is also a quiz at the end to test your understanding of the subject.

energy management 7

I am currently looking for a publisher for the book. Stay tuned.

A320 ECAM Rendering

I finished a few more EFIS vector graphics displays. They are created in a CAD program, converted to SVG, modified in Inkscape, converted to XAML with ViewerSVG, and rendered in Unity using NoesisGUI. The complete process is described here.

Here are some screenshots:


In Unity, it looks like this:

EWD and SD

The Status Display (SD) consists of two separate parts: the top graphics part and the bottom table with the TAT, SAT, ISA, etc. Using two separate parts is easier to maintain if a design error is detected. They can be blended in code using NoesisGUI. All symbols can be animated in NoesisGUI too.

The source vector graphics files contain all symbols. Here is an example:


Path Tracing in Unity using Octane

Unity recently added support for Path Tracing using Octane so I decided to give it a try. I was in the beta program for a few months and it turned out the A320 CAD model in Unity caused quite a few problems due to the large amount of materials used. But eventually it did work.

Here is a sample render. Right click->View Image to expand. (WordPress really has to make this easier)


The images below are tone mapped in Photoshop. This looks a bit better.

render Octane 5render Octane 4render Octane 3render Octane 2

Although it works pretty much out of the box, I discovered a few issues:

-All materials appear slightly more rough when rendered with Octane. Unfortunately there is no global slider available to fix this issue.

-Although the renderer is full HDR (32 bit float RGBA which is 128 bit per pixel), it requires careful tweaking of the sun intensity, exposure, gamma, sky turbidity, and tone mapping in order to avoid white highlights or the entire scene looking too dark. This is a common issue with renderers and is described in detail here. You can also work around this problem by saving the render as a 16 bit EXR and then modifying it in Photoshop but that solution is less than ideal. An out of the box solution which is more physically inspired would be preferable.

-Currently there are a few bugs which require some workarounds. This includes GameObjects with disabled MeshRenderers still being rendered and spot lights casting shadows.

-A model designed for realtime rendering does not necessarily look good with Path Tracing. This is not the fault of the Path Tracer but due to the fact that flat geometry with detail in AO maps is not rendered. Have a look at the screws on the FCU and you can see that it lacks AO. I did not try enabling the AO maps (not sure if that is possible) but that would make other geometry look worse due to quality issues. My AO maps are just not designed for close up renders.

Even with the current issues, it is still an easy and quick way to get a nice looking render. And it is free 🙂

The A320 CAD model is available for purchase. Contact for more information.

Fast light source rendering

Rendering light sources is typically done using individual sprites but this can become computationally expensive pretty quickly if you use thousands of lights. A better approach is to use a single mesh and make the individual triangles face the camera in the shader. This way you can render a huge amount of lights (21844 with Unity 2017.2, or 1.431.655.765 lights with Unity 2017.3) in one draw call.

The lights don’t actually light other objects and the effect needs a good bloom shader, but the aim is to make the lights themselves look realistic.

Note that Unity’s post processing stack V1 bloom shader does not work well with SpriteLights. However, the current V2 beta (available on github) works exceptionally well, even better than Sonic Ether’s bloom shader as it has almost no flicker.

The funny thing is that there are thousands of references available on how a light affects an object. But the amount of references available on how the light itself looks you can count on one hand. I once found a scientific paper, but that’s about it. Perhaps that is why very few people get it right. Often you see an emissive sphere with a flare sprite slapped on top of it. But that is a far cry from a physically based approach, which I will describe here.

Most lights have a lens, which makes them either highly directional like a flashlight, or horizontally directional, the result of a cylindrical Fresnel lens. This directional behavior is simulated with a phase function which shows nicely on a polar graph. Here you can see two common light radiation patterns:


The blue graph has the function 1 + cos(theta*2) where theta is the angle between the light normal and the vector from the light to the camera. The output of the function is the irradiance. Adding this to the shader gives the lights a nice angular effect.
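
As a minimal sketch, the function above can be evaluated either from the angle or directly from the cosine of the angle (for example the dot product of two unit vectors), using the identity 1 + cos(2*theta) = 2*cos(theta)^2. The LightPhase class and its method names are hypothetical, and the actual shader code may differ.

```csharp
using System;

public static class LightPhase
{
    // Evaluates 1 + cos(2 * theta), with theta in radians.
    public static double Evaluate(double theta)
    {
        return 1.0 + Math.Cos(2.0 * theta);
    }

    // Same function, computed from cos(theta) without any trigonometry:
    // 1 + cos(2 * theta) = 2 * cos(theta)^2
    public static double FromCosine(double cosTheta)
    {
        return 2.0 * cosTheta * cosTheta;
    }

    public static void Main()
    {
        // Print the pattern every 30 degrees from 0 to 90.
        for (int deg = 0; deg <= 90; deg += 30)
        {
            double theta = deg * Math.PI / 180.0;
            Console.WriteLine(deg + " deg: " + Evaluate(theta).ToString("F3"));
        }
    }
}
```

The function peaks when looking straight into the light and drops to zero at 90 degrees. Note that it rises again beyond 90 degrees, so a light that should be dark from behind needs an extra clamp on the cosine.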


Next is the attenuation. Contrary to popular belief, focused lights (in the extreme case, lasers) still attenuate with the inverse square law, as described here:…distance-grows-similar-to-other-light-sources

But contrary to even popular scientific belief, lights themselves don’t behave in quite the same way, or at least not perceptually. The inverse square law states that the intensity is inversely proportional to the square of the distance. Because of this:


You see this reference all over, for example here:


Yet the light itself is brighter than bar number 4, which is about at the same distance as the light to the camera. The light itself doesn’t seem to attenuate with the inverse square law. So why is this? Turns out that in order to model high gain light sources (such as directional lights), you need to place the source location far behind the actual source location. Then you can apply the inverse square law like this:


Note that highly directional lights have a very flat attenuation curve, which can be approximated with a linear function if needed in order to save GPU cycles.
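
The original figure with the formula is not reproduced here, so the sketch below is only one plausible form of the idea and not necessarily the author’s exact formula. It assumes a virtual source placed sourceOffset units behind the real light and normalizes the result to 1 at the light itself. All names are hypothetical.

```csharp
using System;

public static class LightAttenuation
{
    // Inverse square law measured from a virtual source located sourceOffset
    // behind the real light. distance is measured from the real light.
    // The result is 1 at distance 0 and falls off as 1 / (sourceOffset + distance)^2.
    public static double RelativeIntensity(double distance, double sourceOffset)
    {
        double ratio = sourceOffset / (sourceOffset + distance);
        return ratio * ratio;
    }

    public static void Main()
    {
        // A small offset behaves like an ordinary point light,
        // a large offset gives the flat curve of a highly directional light.
        foreach (double offset in new[] { 1.0, 100.0, 10000.0 })
        {
            Console.WriteLine("offset " + offset + ": " +
                RelativeIntensity(1000.0, offset).ToString("E3"));
        }
    }
}
```

With a large offset the curve is nearly flat over typical viewing distances, which is why a linear approximation can be good enough.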

Some more reading about the subject here (chapter Validity of the Inverse Square Law):

One other problem is that the light will disappear if it gets too far from the camera. This is the result of the light being smaller than one pixel. That is fine for normal objects but not for lights because even extremely distant or small lights are easily visible in real life, for example a star. It would be nice if we would have a programmable rasterizer, but so far no luck. Instead, I scale the lights up when they are smaller than one pixel, so they remain the same screen size. Together with the attenuation, this gives a very realistic effect. And all of this is done in the shader so it is very fast, about 0.4 ms for 10.000 lights on a 780ti.
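
The scale-up step can be sketched outside the shader as follows. This assumes a simple perspective projection with a vertical field of view, and the MinPixelScale class and its parameters are hypothetical. The real implementation runs in the shader and may differ.

```csharp
using System;

public static class MinPixelScale
{
    // Approximate on-screen height in pixels of an object of the given world size.
    public static double ProjectedPixels(double worldSize, double distance,
                                         double fovYRadians, double screenHeightPx)
    {
        // Height of the view frustum at this distance.
        double viewHeight = 2.0 * distance * Math.Tan(fovYRadians * 0.5);
        return worldSize / viewHeight * screenHeightPx;
    }

    // Factor by which the light has to be scaled so it never gets smaller than minPixels.
    // Math.Max is used instead of an if-else, in line with the branch-free shader style.
    public static double ScaleFactor(double worldSize, double distance,
                                     double fovYRadians, double screenHeightPx,
                                     double minPixels)
    {
        double pixels = ProjectedPixels(worldSize, distance, fovYRadians, screenHeightPx);
        return Math.Max(1.0, minPixels / pixels);
    }

    public static void Main()
    {
        double fov = Math.PI / 2.0; // 90 degrees vertical field of view
        Console.WriteLine(ScaleFactor(1.0, 100.0, fov, 1080.0, 1.0).ToString("F2"));
        Console.WriteLine(ScaleFactor(1.0, 5400.0, fov, 1080.0, 1.0).ToString("F2"));
    }
}
```

Nearby lights keep their real size (factor 1), while distant lights are scaled up just enough to stay one pixel in size.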

Since I made this system for a flight simulator, I included some specific lights you find in aviation, like walking strobe lights (also done entirely in the shader):


And PAPI lights, which are a bit of a corner case. They radiate light in a split pattern like this (used by pilots to see if they are high or low on the approach):


Simulated here, also entirely in the shader.


Normally there are only 4 of these lights in a row, but here are 10.000, just for the fun of it. They have a small transition where the colors are blended (just like in reality), which you won’t find in any simulator product, even multi million dollar professional simulators. That’s a simple lerp() by the way.

I should also note that the shaders don’t use any conditional if-else statements but use lerp, clamp, and scaling trickery instead. So it plays nice even on low-end hardware.
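
To illustrate the branch-free style, here is a rough C# sketch of a PAPI color blend using only clamp and lerp. The angles, the transition width, and all names are illustrative assumptions and are not taken from the shader source.

```csharp
using System;

public static class PapiColor
{
    static double Clamp01(double v)
    {
        return Math.Max(0.0, Math.Min(1.0, v));
    }

    static double Lerp(double a, double b, double t)
    {
        return a + (b - a) * t;
    }

    // Returns { r, g, b } for one PAPI unit.
    // Below the set angle the light is red, above it is white,
    // and within transitionDeg around the set angle the colors are blended.
    public static double[] Evaluate(double viewAngleDeg, double setAngleDeg, double transitionDeg)
    {
        double t = Clamp01((viewAngleDeg - setAngleDeg) / transitionDeg + 0.5);
        double greenBlue = Lerp(0.0, 1.0, t);
        return new[] { 1.0, greenBlue, greenBlue };
    }

    public static void Main()
    {
        foreach (double angle in new[] { 2.0, 3.0, 4.0 })
        {
            double[] c = Evaluate(angle, 3.0, 0.1);
            Console.WriteLine(angle + " deg: " + string.Join(", ", c));
        }
    }
}
```

The clamp takes the place of the if-else: far below the set angle t saturates at 0 (red), far above it saturates at 1 (white).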

Available here (supports the Built-in render pipeline, URP, and HDRP):

Driving non linear gauges


Setting the needle of a gauge in code is easy when the scale is linear but it gets surprisingly complicated when the scale is not linear.

There are a few ways to deal with this problem. The easiest is to simply map different linear ranges to different segments of the gauge. However, this creates a change in needle speed when crossing the boundary. A better way is to create a logarithmic function which best fits the scale. But this can be difficult to maintain and it can be hard to make the needle follow the scale exactly, especially when the scale is not logarithmic to begin with.

The best way to deal with this problem is to make the needle follow a spline. The needle angle vs scale value pairs are stored in an array, which is treated as the control points of a Catmull-Rom spline.

The function of a Catmull-Rom spline is defined as:
0.5 * (2*P1 + (-P0 + P2) * t + (2*P0 - 5*P1 + 4*P2 - P3) * t^2 + (-P0 + 3*P1 - 3*P2 + P3) * t^3)
Variables P0 to P3 are the control points. Variable t is the position on the spline, with a range of 0 to 1. This only creates a spline with one section and 4 control points. To create a spline with more control points, the spline segments have to be stitched together.

The points P0 to P3 are vectors where in the case of the gauge, x is the needle angle, and y is the scale value at that angle.
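
The formula above translates directly into code. The sketch below evaluates one coordinate at a time, so it is called once for x (needle angle) and once for y (scale value). The class name and the sample control points are made up for illustration.

```csharp
using System;

public static class CatmullRom
{
    // Evaluates one coordinate of a Catmull-Rom segment at t (0 to 1).
    // The curve runs from p1 (t = 0) to p2 (t = 1); p0 and p3 only shape it.
    public static double Evaluate(double p0, double p1, double p2, double p3, double t)
    {
        double t2 = t * t;
        double t3 = t2 * t;
        return 0.5 * (2.0 * p1
                      + (-p0 + p2) * t
                      + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t2
                      + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t3);
    }

    public static void Main()
    {
        // Hypothetical gauge samples: needle angle (x) and scale value (y).
        double[] angle = { 0.0, 30.0, 60.0, 90.0 };
        double[] value = { 0.0, 10.0, 30.0, 70.0 };

        for (int i = 0; i <= 4; i++)
        {
            double t = i / 4.0;
            double x = Evaluate(angle[0], angle[1], angle[2], angle[3], t);
            double y = Evaluate(value[0], value[1], value[2], value[3], t);
            Console.WriteLine("t=" + t + "  angle=" + x.ToString("F2") + "  value=" + y.ToString("F2"));
        }
    }
}
```

At t = 0 the result is exactly P1 and at t = 1 it is exactly P2, which is why the spline passes through the sampled scale values.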

A Catmull-Rom spline with multiple control points placed in a zig-zag shape. Note that the first and last control points are not shown here:

A Catmull-Rom spline with 6 control points placed in curved shape. Note that the spline does not exist at the first and last segment:
spline curve Unity

It is also possible to make the spline into a closed loop. For that, the first and last two control points have to overlap.

Using a spline like this will make the needle follow the sampled points (scale values) exactly using smooth interpolation in between. To get an intermediate position on the spline, a value between 0 and 1 (t) has to be supplied to the spline function. The problem is that t is not known because the needle angle (x) has to be found for a certain scale number (y).

There are two ways to find t. One way is by using a brute force method of calculating many points on the spline and then finding the closest one to the number we are looking for. This works but is not exactly elegant, not to mention the performance and memory overhead involved. A better way is to find t mathematically. This is quite complicated but luckily it has been done before:

The blog post explains how to substitute the variables from a standard linear equation with parts of the spline formula. This allows you to solve a Cubic equation which gives you the intersection points of a straight line and a spline. Solving a Cubic equation is not exactly easy either, but luckily it has been implemented in code here:

Due to a bug in the WordPress code formatting system, I am unable to post the code here without breaking the rest of the blog text. However, the code is available in the Unity source in a link below. The source code includes the Catmull-Rom spline, a function which creates a Cubic function from a line-spline intersection, and a function which solves it. It supports multiple spline segments.

In case of the gauge, we need to make a horizontal line (y) at the location of the scale value we want to find the needle angle for. This will give us the intersection (t). This is not a coordinate yet, but if you simply plug this value (t) in the spline function, it will give a point with values x (needle angle, yay!), and y (scale value). The scale value was already known but it can be used to check the result.
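
The lookup can be sketched without the Cubic solver if the scale value is assumed to increase steadily along the segment. The sketch below swaps in a simple bisection search for t instead of the analytic solution used in the Unity project, so it is not the author’s implementation. All names and sample values are hypothetical.

```csharp
using System;

public static class GaugeLookup
{
    // One coordinate of a Catmull-Rom segment, running from p1 (t = 0) to p2 (t = 1).
    static double Spline(double p0, double p1, double p2, double p3, double t)
    {
        double t2 = t * t, t3 = t2 * t;
        return 0.5 * (2.0 * p1 + (-p0 + p2) * t
                      + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t2
                      + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t3);
    }

    // Finds t where the scale value (y) equals target.
    // Assumption: y increases monotonically with t on this segment.
    public static double FindT(double[] y, double target)
    {
        double lo = 0.0, hi = 1.0;
        for (int i = 0; i < 50; i++)
        {
            double mid = 0.5 * (lo + hi);
            if (Spline(y[0], y[1], y[2], y[3], mid) < target) lo = mid;
            else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    // Needle angle (x) for a given scale value (y): find t first, then plug it into x.
    public static double NeedleAngle(double[] angles, double[] values, double target)
    {
        double t = FindT(values, target);
        return Spline(angles[0], angles[1], angles[2], angles[3], t);
    }

    public static void Main()
    {
        double[] angles = { 0.0, 30.0, 60.0, 90.0 };
        double[] values = { 0.0, 10.0, 30.0, 70.0 };
        Console.WriteLine(NeedleAngle(angles, values, 20.0).ToString("F2"));
    }
}
```

Bisection is easier to follow than solving a Cubic equation, but it needs many spline evaluations per lookup and only finds one intersection, so the analytic solution remains the better choice for the general case.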

Here is an implementation in Unity which calculates the intersection between a line and a spline:
spline bend Unity

The Unity project can be found here:

Note that the Unity project contains both implementations for solving a Cubic function, which were used to verify the result.

The red cubes are the control points of the spline. The yellow cubes create a straight line. The magenta cubes are the intersection points between the line and the spline. The green cube can be moved along the spline by moving the slider. To use, press Run, then move the cubes in the Scene window.

Another closely related application is to make a gauge follow a non-linear animation, for example the EGT of a jet engine during startup. A video of the event would be recorded and used to capture sample points consisting of EGT vs time. The time (x) and EGT (y) values would then be used to create a spline, allowing smooth interpolation between the original sample points. The line-spline intersection function can then be used to get the EGT for any point in time.

So there you have it. A real world use case of finding the intersection points between a line and a spline by solving a Cubic equation. Learning mathematics was not a waste of time after all 😉

I added the spline and spline solve functions in the Math3D Unity wiki too:
The functions are called GetPointOnSpline() and GetLineSplineIntersections()

Real time EFIS vector graphics

after start small

This is a tutorial on how to create a real time rendering system for a PFD, ND, ECAM, MCDU, LCD, or any other electronic aircraft display. This can be done in two different ways:

Meshes
-Create all graphics as separate meshes.
-Place the meshes at different heights relative to each other, simulating layers.
-Render it with a separate orthographic camera into a render texture.
-Assign the render texture to the display material.

Vector graphics
-Create an SVG vector graphics file containing the graphics.
-Render the vector graphics directly into a render texture.
-Assign the render texture to the display material.

The latter is much easier to maintain, easier to animate, and much faster to render. In order to render vector graphics, a 3rd party tool called NoesisGUI is used. Unlike the name suggests, it can be used to render anything xaml based, not just a GUI. It can be found here:

The vector graphics can be created in a vector graphics drawing program like Inkscape, but this is not designed for precision which makes the workflow very cumbersome. I tried simply eyeballing the design using a perspective corrected photo as a background, but even with perspective and barrel distortion removed, a photograph is not accurate enough.

Instead, I decided to create an initial sketch with a CAD program. The constraint based parametric workflow is a joy to work with, and much faster and more accurate than using a freehand vector based program. It is best to use QCAD as this can export a good quality SVG file. However, I already know Autodesk Inventor, so I used that to create the sketch instead.

Here are some screenshots of the CAD drawings. They only contain sketches and no solid geometry. Everything was physically measured in the aircraft so all dimensions are correct.  Note that I used two different sketches because the large amount of constraints in a single sketch made the sketch unstable and slow. In addition, the ISO drawing information symbols (info box on bottom right and edge outline) are removed. This tutorial assumes the drawing is made in mm.

The CAD drawing contains no fill data, line width, colors, and layers. This will be added later using Inkscape. The purpose of the CAD drawing is just to place lines and text at the correct location.

Make sure to create a square outline in the CAD sketch because this will be used to center the drawing when imported into Inkscape.

-In Inventor, create a drawing (*.dwg) file. Part or Assembly files won’t work. On the drawing file, you can create sketches the same way as in a part file.
-On the model tree, delete the outline (Default Border) and the info box on the bottom right (ANSI – Large).
-Create a display sketch.

Next, a few conversion steps have to be performed in order to get the CAD drawing into Inkscape:
-Go to Inventor->File->Save As->Save Copy As->DXF. Then Click Options on the save dialog, select file version AutoCAD 2013 DXF (important, otherwise the output will be corrupt if an embedded image is present). Do not select “Model Geometry Only”, otherwise the QCAD import won’t work. Delete the border and text box in the model tree instead. Now click Next->Finish and then click Save. Exporting a DXF file can take a long time if an embedded image is present.
-QCAD->File->Open (do not use Import). Select the exported DXF file.
-QCAD->File->Advanced SVG Export: select “Preserve Geometry” (to prevent text being converted to a path).
-Select Export.

Note that Inkscape can import DXF files, but this is buggy. As a workaround, QCAD is used to convert the DXF into an SVG file.

A few settings in Inkscape have to be changed to make sure the SVG coordinates are the same as in the CAD drawing. This makes it easier to make modifications.

-Inkscape->Edit -> Preferences -> Behavior -> Transforms-> Store transformation = Optimized.
-Inkscape->Edit -> Preferences -> Input / Output -> SVG Output -> Path Data -> Path string format = absolute.
-Inkscape->Edit->Preferences->Behavior->Snapping->Delay = 0.
-Inkscape->File->Document Properties->Page->General->Display Units-> mm.
-Inkscape->File->Document Properties->Page->Page size->Custom Size-> mm.
-Set the Custom Size to the size of the display and make sure it is square. For example 158, 158.
-Set the scale to 1.
-Save the file and keep a copy as a template for future designs.
-Close Inkscape, open the svg file in a text editor, and remove the translate transform of all layers (transform=”translate), caused by page resize.
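
The last step can also be scripted instead of done by hand. The sketch below is an assumption on how that could be done with a regular expression: it removes translate transforms from group (g) elements only, which is how Inkscape stores layers. The SvgCleanup name and the sample SVG string are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SvgCleanup
{
    // Removes transform="translate(...)" attributes from <g> elements.
    // Transforms on other elements (paths, text) are left untouched.
    public static string StripLayerTranslate(string svg)
    {
        return Regex.Replace(
            svg,
            "(<g\\b[^>]*?)\\s+transform=\"translate\\([^)]*\\)\"",
            "$1");
    }

    public static void Main()
    {
        string svg = "<g id=\"layer1\" transform=\"translate(0,-139)\"><path d=\"M 0,0\"/></g>";
        Console.WriteLine(StripLayerTranslate(svg));
    }
}
```

To use it on a file, read the template with File.ReadAllText, pass the text through the function, and write it back with File.WriteAllText. Keep a backup, as a regular expression is not a full XML parser.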

Import the converted CAD drawing into Inkscape:
-Start Inkscape and open the template file.
-Open the converted CAD drawing: Inkscape->File->Import->SVG
-If part of the drawing looks incorrect, it will be fixed with ungrouping later.
-Position the drawing so it fits nicely in the middle of the viewbox. Make two guide lines and edit the location (double click on guide) so they are exactly at a corner. Then use snapping to align the outline square with the guide lines.
-Select the imported object, then go to Object->Ungroup.
-Select all, then ungroup again. Do this a few times until there are no more groups. This will also fix any incorrectly placed geometry.

Now the SVG file is ready to be modified so it looks exactly like the real display. There are a few operations which must be performed.

All lines are imported into separate path segments. If a shape needs to have a fill or if the segments need to be dynamically changed together in code, the line segments need to be stitched together. Below is what a shape looks like when it consists of separate path segments. Note that it looks like a single segment.
Below is what the shape looks like if all individual path elements are selected. Now it is clear that it is not one single shape.
Select all individual path segments as shown above. Then go to Path->Combine (or Ctrl-K). Now all segments are fused into a single object which looks like this:
Even though the path segments are fused into a single object, it is not possible to add a Fill yet. This is because the nodes of the line segments are not joined together. To do this, select the object, then select the “Edit paths by nodes” tool (icon just below arrow select tool). With the object selected, drag to select all nodes at the same time. After this operation, it is not evident that all nodes are selected, but they are. Now click on the icon called “Join selected nodes” (or Shift-J). After the nodes are joined together, the shape looks like this:
The diamonds on the corners are an indication that the join operation was successful. Now the Fill or any cutting operations work correctly.

All shapes which need to be animated in code need to have a proper ID set in Inkscape. To modify the shape ID, go to Inkscape->Select shape->Object properties->ID. Change the ID and click on “Set”.

Repeat the process for all applicable shapes. Even if a shape does not need a fill, it is still recommended to fuse path segments together where it makes sense. For example, all pitch lines for the attitude scale are fused together into one single path. This makes it easier to manage (set layers, change colors, change stroke settings, etc.) The end result should look something like the screenshot below. Note the two diagonal lines at the right. They are guide lines, used to align shapes.

PFD inkscape full

Note that the layout looks very messy. This is because all available symbols of the A320 PFD are present. The state of the symbols (color, text, number, position, etc) will be set in code (C#) at a later stage. Alternatively, you can create a copy of the SVG and delete/hide certain elements if you only want to make screenshots of certain display states.

Even if all elements are shown, it only uses two draw calls (set pass calls) in Unity, so NoesisGUI renders it very fast.

Here is a screenshot of a more realistic display state:

after start

Because vector graphics are used, it is possible to zoom infinitely while maintaining quality:

PFD full zoom

Note the small black outline on some of the symbols. This is used for added contrast, a feature which the real PFD has too. It is not possible to add an outline to a shape which does not consist of closed line segments. So to add the black outline, a duplicate is created, its color changed to black, its stroke width set a bit bigger, and it is moved to a z-order just below the original. The two paths are then grouped together.

Here is a closeup photo of the real display where you can see the black contrast outlines too. Fun fact: by counting pixels and measuring the size of the display, you can figure out the resolution of the screen. It is about 768×768. Not exactly a Retina display, but the size is only 158 mm square so the pixel density is quite high, especially for the time it was designed. Right click->View Image to enlarge.

photo closeup

Once the display is rendered, it is not possible to zoom in with the camera and maintain visual quality, but the same vector graphics can be added to a higher resolution texture to achieve the same effect.

Once the SVG file is done, it has to be exported to an xaml file because this is the format used by NoesisGUI. Unfortunately the xaml exporter from Inkscape is very buggy and is unusable. Luckily there is a standalone converter available which creates high quality xaml files. It is called ViewerSVG and is available here:

To convert the SVG to xaml with ViewerSVG do the following:
-Drag and drop the SVG file onto ViewerSVG.
-Select the Export icon (bottom left corner).
-On the top right corner change Target Platform to Silverlight XAML.
-On the bottom right corner change New Width to 1024 (assuming the texture you want to create for NoesisGUI is this size).
-Click on the Transform button.
-Click Save.


Now the xaml file is ready to be used by NoesisGUI. We will use Unity to render the result but NoesisGUI also has a native C++ SDK so you can use it in a different game engine.

In order to use the xaml file in Unity, do the following:
-Create new Unity project.
-Import the NoesisGUI unitypackage.
-Before adding the XAML to Unity, open it and modify all FontFamily lines so that a # character is in front of the font name. For example: FontFamily="#Arial".
-Drag and drop all fonts used in the xaml to the same directory as where the xaml file will be placed in Unity.
-Drag and drop the xaml file in the same directory as the fonts. When the xaml file is imported into Unity, it will automatically generate an asset file. This asset file is the one used by NoesisGUI, not the xaml file. Updating the xaml file will not re-import the asset file so it is best to delete the xaml file in the Unity folder.

To render the xaml to a mesh plane, do the following:
-Add a NoesisView component to the display Game Object (should be a square mesh, UV mapped correctly).
-Add the XAML asset file to the NoesisView component (not the xaml file but the .asset file which was automatically generated).
-Disable keyboard, mouse, and touch checkboxes.
-Set anti aliasing to PPAA (GPU).
-Create a render texture (no anti aliasing, and Depth Buffer set to 24 bit with stencil).
-Add the render texture to the appropriate texture slot on the material from the display Game Object.
-Press Play and check if the xaml file is rendered correctly.

Here are some screenshots from Unity. I use the standard specular shader with a slight red specular tint to simulate the anti reflective coating. The render texture is added to the Emission slot only. The Emission color is set to gray, otherwise the display is too bright. The Emission color can be changed in code to simulate display brightness change.

PFD day, PFD night

The following code can be used to animate the vector graphics. The C# file has to be placed on the display Game Object.

Note that GetComponent has to be called with its type argument, so the call is written as GetComponent<NoesisView>();

Add to top of C# script:

using Noesis;

Get a handle to a path:

NoesisView panel = GetComponent<NoesisView>();
Path obj = (Path)panel.Content.FindName("line6866");

Get a handle to a group:

NoesisView panel = GetComponent<NoesisView>();
Noesis.Canvas obj = (Noesis.Canvas)panel.Content.FindName("g865");

Enable transformations:

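RotateTransform rotateTransform = new RotateTransform();
TranslateTransform translateTransform = new TranslateTransform();
TransformGroup transformGroup = new TransformGroup();
//Add both transforms to the group, otherwise they have no effect.
transformGroup.Children.Add(rotateTransform);
transformGroup.Children.Add(translateTransform);
obj.RenderTransform = transformGroup;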

Move a path:

translateTransform.X = 5f;

It can be hard to figure out how to precisely position items due to the scale and unit transformations caused by converting the file from SVG to XAML. The easiest way to figure out what positioning factor to use is to create a line (path) with a stroke width of 1. Then find the XAML code for that line in a text editor and copy the stroke width from there. This will be the factor to use in order to precisely position items on the canvas.

In Inkscape, the origin is at the bottom left. In XAML, the origin is at the top left. This has to be taken into account when positioning items.
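
As a rough sketch, the positioning factor and the flipped Y axis can be combined in one small helper. The class name, method name, and parameters below are made up for illustration. It also assumes that the same factor applies to both axes and that the XAML canvas covers the whole Inkscape page.

```csharp
public static class InkscapeToXaml
{
    // Hypothetical helper: converts a point read in Inkscape (origin bottom
    // left, Y up) to XAML canvas coordinates (origin top left, Y down).
    // documentHeight is the page height in Inkscape units and factor is the
    // positioning factor found with the stroke width test.
    public static void Convert(float x, float y, float documentHeight, float factor,
                               out float xamlX, out float xamlY)
    {
        xamlX = x * factor;
        xamlY = (documentHeight - y) * factor;
    }
}
```

The results can then be used as the X and Y values of a TranslateTransform.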

If you want to get the absolute position of an element, use this code (it only works if the xaml layout has finished building, hence Content.Loaded):

NoesisView panel = GetComponent<NoesisView>();
TextBlock text = (TextBlock)panel.Content.FindName("text7246");
panel.Content.Loaded += (s, e) =>
{
    Point pos = text.PointToScreen(new Point(0, 0));
};

Set the draw order of an element:

Panel.SetZIndex(text, 5);

Insert a new element in a specific location in the tree (affecting the draw order):

int index = canvas.Children.IndexOf(existingElement);
canvas.Children.Insert(index + 1, newElement);

In order to rotate a shape around the pivot point, a RenderTransformOrigin property has to be present in the xaml. The RenderTransformOrigin uses the range 0 to 1 and is based on 4 properties: Width, Height, Canvas.Left, and Canvas.Top. These properties must be present in the xaml shape and set to the shape bounding box. Additionally, this property has to be added: Stretch="Fill". For example:

RenderTransformOrigin="0.008,0.989" Width="307.24" Height="214.098" Canvas.Left="508.8" Canvas.Top="628.8" Stretch="Fill"
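
If the pivot point is known in absolute canvas coordinates, the relative 0 to 1 values can be worked out from the bounding box. The helper below is only a sketch with made up names. It assumes the bounding box values are the same ones written to Width, Height, Canvas.Left, and Canvas.Top.

```csharp
public static class PivotMath
{
    // Hypothetical helper: converts an absolute pivot point on the canvas to
    // the relative 0 to 1 range used by RenderTransformOrigin.
    // left, top, width, and height describe the shape bounding box.
    public static void ToRenderTransformOrigin(float pivotX, float pivotY,
                                               float left, float top,
                                               float width, float height,
                                               out float originX, out float originY)
    {
        originX = (pivotX - left) / width;
        originY = (pivotY - top) / height;
    }
}
```

With the bounding box from the example above, a pivot at roughly (511.26, 840.54) gives approximately 0.008 and 0.989.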

When the required xaml code is present, the shape can be rotated around the pivot point using this code:

rotateTransform.Angle = 30f;

If an object has a MatrixTransform in the xaml code, you can’t use rotateTransform, otherwise you will get scaling issues. In that case use the code below. Bear in mind though that the RotateAt pivot point coordinates are absolute canvas coordinates, not in the relative 0-1 range as with RenderTransformOrigin used by rotateTransform.Angle. If you don’t want to use MatrixTransform, you need to wrap the shape or group in another group (without a matrix transform) and use rotateTransform.Angle instead.

NoesisView panel = GetComponent<NoesisView>();
Noesis.Canvas obj = (Noesis.Canvas)panel.Content.FindName("g840");
MatrixTransform matrixTransform = (MatrixTransform)obj.RenderTransform;
Transform2 matrix = matrixTransform.Matrix;
matrix.RotateAt(0.5f, 232, 490);
matrixTransform.Matrix = matrix;

Hide a path:

obj.Visibility = Visibility.Hidden;

Set the color of a path:

obj.Stroke = new SolidColorBrush(Noesis.Color.FromLinearRGB(255, 255, 0));

Set the stroke thickness of a path:

obj.StrokeThickness = 3f;

Modify an existing path:

string dataString = obj.Data.ToString();
//Modify the path string here.
dataString = "M157.626,80.6264L149.626,88.6264";
StreamGeometry streamGeometry = new StreamGeometry();
//Apply the modified string to the new geometry.
streamGeometry.SetData(dataString);
obj.Data = streamGeometry;

Arc segments are drawn as a path with an arc data command (letter “A”). For example:
Data="F1 M320, 484 A22, 22, 0, 0, 1, 318, 494"
Animating this is slightly more complex. According to the path markup syntax documentation, it means the following:

F1 = non-zero fill rule.
M320 = start point x.
484 = start point y.
A22 = the x radius of the arc.
22 = the y radius of the arc.
0 = the rotation of the ellipse in degrees.
0 = set to 1 if the angle of the arc should be 180 degrees or greater, otherwise set to 0.
1 = set to 1 if the arc is drawn in a positive-angle direction, otherwise set to 0.
318 = end point x.
494 = end point y.

In order to draw a round arc segment (circle instead of ellipse), the x and y radius values must be the same. When animating the arc, you need to calculate the start or end point. This can be done using trigonometry. Let’s say we are animating the APU EGT gauge arc segment. Using Microsoft Blend for Visual Studio we can easily experiment with an xaml file. Looking at the code, it appears that the start point is the bottom right of the arc, so we need to change the end point in order to animate the arc (note that Cos and Sin functions require angles in radians, not degrees).

First calculate the arc center (only needs to be done once). The easiest is to just get it from your CAD source. If you need to calculate it, it can get a bit complex, as is described here (note: if the blog fails to load, copy paste the link and download as pdf here). When the arc center is known, the circle end point can be calculated using the known radius (22 in the example) and the given angle.

endPointX = radius * Cos(angleRadians) + arcCenterX
endPointY = radius * Sin(angleRadians) + arcCenterY
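
Here is a minimal sketch of the same calculation in C#, together with a function which builds a data string in the same layout as the arc example. The class and method names are made up for illustration. The sweep flag is fixed to 1 as in the example, and the large arc flag is left to the caller because it depends on how far the arc has to sweep.

```csharp
using System;
using System.Globalization;

public static class ArcData
{
    // Calculates the arc end point for a given angle. The angle is measured
    // from the positive x axis. Because y points down in XAML, a larger angle
    // moves the point clockwise on screen.
    public static void EndPoint(double centerX, double centerY, double radius,
                                double angleDegrees, out double endX, out double endY)
    {
        double angleRadians = angleDegrees * Math.PI / 180.0;
        endX = radius * Math.Cos(angleRadians) + centerX;
        endY = radius * Math.Sin(angleRadians) + centerY;
    }

    // Builds a path data string for a round arc (x radius equals y radius).
    // InvariantCulture makes sure a dot is used as the decimal separator.
    public static string Build(double startX, double startY, double radius,
                               bool isLargeArc, double endX, double endY)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "F1 M{0},{1} A{2},{2},0,{3},1,{4},{5}",
            startX, startY, radius, isLargeArc ? 1 : 0, endX, endY);
    }
}
```

The resulting string can be applied to the path in the same way as in the "Modify an existing path" snippet.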

Cloning an existing path is described below. This requires all relevant properties to be copied. Only Fill, Stroke, and StrokeThickness are shown here:

Path obj2 = new Path();
obj2.Data = obj.Data;
obj2.Fill = obj.Fill;
obj2.Stroke = obj.Stroke;
obj2.StrokeThickness = obj.StrokeThickness;
Noesis.Canvas canvas = (Noesis.Canvas)panel.Content.FindName("layer1");
//Add the clone to the canvas, otherwise it is not drawn.
canvas.Children.Add(obj2);

Create a path which can be drawn using commands instead of a string (does not draw anything yet):

Noesis.Canvas canvas = (Noesis.Canvas)panel.Content.FindName("layer1");
Path shapePath = new Path();
shapePath.Stroke = new SolidColorBrush(Colors.Green);
shapePath.StrokeThickness = 1;
StreamGeometry streamGeometry = new StreamGeometry();
streamGeometry.FillRule = FillRule.EvenOdd;
shapePath.Data = streamGeometry;
//Add the path to the canvas so it can be drawn later.
canvas.Children.Add(shapePath);

Draw a path using commands instead of a string:

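using(StreamGeometryContext ctx = streamGeometry.Open())
{
    ctx.BeginFigure(new Point(10, 90), true);
    ctx.LineTo(new Point(20, 90));
    //The sweep direction is an assumption here, pick the one which matches your arc.
    ctx.ArcTo(new Point(60, 60), new Size(new Point(10, 10)), 0, false, SweepDirection.Clockwise);
}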

Change text:

TextBlock fdText = (TextBlock)panel.Content.FindName("text7246");
fdText.Text = "ABC";

Show/hide text:

TextBlock fdText = (TextBlock)panel.Content.FindName("text7246");
fdText.Visibility = Visibility.Hidden;
fdText.Visibility = Visibility.Visible;

Change text color:

TextBlock fdText = (TextBlock)panel.Content.FindName("text7246");
fdText.Foreground = new SolidColorBrush(Noesis.Color.FromLinearRGB(255, 255, 0));

Add xaml code from another xaml file:

NoesisView panel = GetComponent<NoesisView>();

Noesis.Canvas root = (Noesis.Canvas)panel.Content.FindName("layer1");
Noesis.Canvas xaml = (Noesis.Canvas)Noesis.GUI.LoadXaml("Assets/file1.xaml");

//Place new xaml content in current xaml.
root.Children.Add(xaml);

Replace the entire xaml file with another xaml file by using Resources.Load():

The xaml file which has been converted to an .asset file has to be placed in a folder called Assets/Resources. The file is then referenced in the Resources.Load() function without the extension. For example “Assets/Resources/file1.asset” becomes “file1”.

NoesisView panel = GetComponent<NoesisView>();
NoesisXaml xaml = (NoesisXaml)UnityEngine.Resources.Load("file1", typeof(NoesisXaml));
panel.Xaml = xaml;

Replace the entire xaml file with another xaml file by using a public variable:

//Drag and drop the xaml file on the Inspector from the script.
public NoesisXaml xaml;

//At Start() function:
NoesisView panel = GetComponent<NoesisView>();
panel.Xaml = xaml;

Note that the Back Up Speed Scale (BUSS) is also present but hidden to prevent clutter. Just do a text search for BUSS and you can enable the code manually.

All shapes, text, and groups, have an appropriate ID so you can find them easily.

The green altimeter numbers are hidden by a mask so you can animate them without having to worry about overdraw. The big numbers are called altLeftA, altLeftB, altMidA, altMidB, altRightA, altRightB. Not all numbers can be seen in the original SVG file because of the clipping mask but they are still there. Here is a screenshot with the clipping mask removed, revealing some hidden numbers.

alt clip

In order to animate the speed, altitude, and heading bar, simply move the index notches and numbers, change a number when it is out of view, and re-position it accordingly.
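
One possible way to do the bookkeeping for a vertical tape is sketched below. This is not taken from the SVG file. The class name, parameters, and the idea of a fixed pool of labels are assumptions for illustration. Each label always shows a multiple of the step size near the current value, so a label which scrolls out of view is automatically reused for the next number.

```csharp
using System;

public static class TapeLayout
{
    // Fills values and offsets for a fixed set of number labels on a tape.
    // step is the spacing between numbers (for example 20 knots) and
    // unitsToCanvas converts tape units to canvas units.
    // offsets are vertical distances from the tape center. Larger values sit
    // higher on the tape and y points down in XAML, so they get a negative offset.
    public static void Update(float current, float step, float unitsToCanvas,
                              int[] values, float[] offsets)
    {
        int count = values.Length;
        int nearest = (int)Math.Round(current / step);
        for (int i = 0; i < count; i++)
        {
            int index = nearest + i - count / 2;
            values[i] = (int)(index * step);
            offsets[i] = -(values[i] - current) * unitsToCanvas;
        }
    }
}
```

Each value can then be written to the Text property of a TextBlock and each offset to the Y value of its TranslateTransform. The heading bar moves horizontally and would additionally need the numbers to wrap around at 360 degrees.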

To animate the Vertical Speed needle, change the start and end point of the line and underlying black contrast shape. Do not use rotate and scale as that can lead to unexpected results. The VS needle only goes to 6000 fpm and then stops. Any higher value is only visible in the VS number box. Near the VS needle is a line called “VSreference”. The right point of the line is the virtual pivot point of the VS needle. So any VS needle deflection must be drawn between that point and the current VS value. The line should only start drawing at the edge of the screen though.
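
The needle geometry can be sketched as follows. The names are made up, and it is assumed that the pivot (the right point of “VSreference”), the needle tip on the scale, and the x position of the screen edge are already known in canvas coordinates. The first function limits the value to the needle range, the second finds the point where the needle should start drawing.

```csharp
using System;

public static class VsNeedle
{
    // Limits the vertical speed to the needle range of +/- 6000 fpm.
    public static float Clamp(float fpm)
    {
        return Math.Max(-6000f, Math.Min(6000f, fpm));
    }

    // Returns the y coordinate where the line from the pivot to the tip
    // crosses the vertical line x = edgeX, so the needle can start there.
    // Assumes the tip and the pivot do not share the same x coordinate.
    public static float YAtEdge(float pivotX, float pivotY,
                                float tipX, float tipY, float edgeX)
    {
        float t = (edgeX - pivotX) / (tipX - pivotX);
        return pivotY + t * (tipY - pivotY);
    }
}
```

The visible needle is then the line from (edgeX, YAtEdge) to the tip, and the black contrast shape uses the same two points.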

The black outline used on some objects to increase contrast can cause aliasing at low resolution. In that case it is best to disable them.

Note that the attitude pitch angle scale is not linear and this has to be taken into account when setting a pitch angle.
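
One way to deal with a non-linear scale is a small lookup table with linear interpolation between the entries. The sketch below uses placeholder numbers which are not measured from the real scale, and it assumes the scale is symmetric for nose up and nose down. The names are made up for illustration.

```csharp
public static class PitchScale
{
    // Placeholder calibration table: pitch angle in degrees and the matching
    // offset on the canvas. These numbers are made up for illustration and
    // must be replaced with values measured from the actual pitch scale.
    static readonly float[] angles  = { 0f, 10f, 20f, 30f };
    static readonly float[] offsets = { 0f, 80f, 140f, 180f };

    // Returns the canvas offset for a pitch angle, interpolating linearly
    // between table entries and clamping beyond the last entry.
    public static float AngleToOffset(float pitchDegrees)
    {
        float sign = pitchDegrees < 0f ? -1f : 1f;
        float a = System.Math.Abs(pitchDegrees);
        int last = angles.Length - 1;
        if (a >= angles[last]) return sign * offsets[last];
        int i = 0;
        while (a > angles[i + 1]) i++;
        float t = (a - angles[i]) / (angles[i + 1] - angles[i]);
        return sign * (offsets[i] + t * (offsets[i + 1] - offsets[i]));
    }
}
```

The returned offset can be used as the translation of the pitch scale group.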

If you want to explore different layers and groups, it is best to do this via the built-in Inkscape XML editor because clicking on objects and groups in the viewport can be troublesome.

The rising runway symbol to be moved is a group called RISING_RWY_MOVE.

Note that flowed text is not supported by ViewerSVG. QCAD creates regular text from imported CAD drawings but if you create text inside of Inkscape, do not drag to make a text box. Instead just select the text tool, click, and type. This creates regular text which does not create any problems.

Another text type that can cause problems (incorrect placement) is a “tspan” element. These elements are not created when text created in CAD is imported, but they are created when text is created or duplicated inside of Inkscape. To prevent any placement errors, delete all tspan elements. To do this, close Inkscape and edit the svg file in a text editor. Use a regular text element (without tspan) as an example.

Send me an email or post a comment if you have any questions.