Difficult spelling breakdowns

Submitted by admin on 10 November 2018

Auto Breakdown should be able to add the spelling breakdown to words. In some languages, such as Thai, this is not quite so simple.

Take, for example, the Thai word:

สนาม

https://lingopolo.org/thai/word/field-yard

When we first run Auto Breakdown on it, we get an error saying that the third character in this word is not found:

character not found

The problem becomes clear when we go to the page for this letter (https://lingopolo.org/thai/word/vowel-a-long-version):

vowel a

Notice how the vowel is normally displayed with a dash symbol (-) before the letter. Because the entry's text includes the dash, Auto Breakdown cannot match it to the bare vowel in the word.
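To make the mismatch concrete, here is a minimal Python sketch. It assumes the entry is stored with a plain hyphen in front of the vowel; the variable names are purely illustrative and are not taken from the Lingopolo software.

```python
# Illustrative only: compare the vowel as it appears inside the word
# with the vowel as the entry displays it (assumed: hyphen + vowel).
word = "สนาม"
entry_title = "-า"  # assumption: the entry text is a hyphen followed by the vowel

third_char = word[2]  # Python indexes from 0, so this is the 3rd character

print(third_char == entry_title)                   # False
print([f"U+{ord(c):04X}" for c in entry_title])    # ['U+002D', 'U+0E32']
print(f"U+{ord(third_char):04X}")                  # U+0E32
```

The entry contains two code points while the word contains only one at that position, so a straight comparison cannot succeed.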

To tell the software which symbol to use, we have added a special field, "Spelling symbol". It specifies exactly which part of the letter should be used when matching the spelling breakdown:

spelling symbol field

The tricky part is sometimes working out how to type just the part that is wanted. A normal keyboard often doesn't let you type an accent on its own. One way around this is to use the on-screen keyboard in Google Translate. For example, https://translate.google.be/?um=1&ie=UTF-8&hl=en&client=tw-ob#th/en/%E0… types just the letter in the above scenario:

Google Translate keyboard

This letter (or accent), separated from any other symbols, can now be entered in the "Spelling symbol" field, which Auto Breakdown will then use for matching.

Spelling symbol filled in

Now, when we try the Auto Breakdown again, it will work:

Entry found

Of course, we only have to do this once for each letter which has a problem. Once the Spelling symbol field has been filled in, it should work for every spelling breakdown which uses that letter.
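The idea can be sketched in Python as follows. This is not the actual Auto Breakdown code: the `Entry` class, the `match_key` method and the `find_missing` function are hypothetical names, and the sketch assumes one entry per character, which is simpler than real Thai spelling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    title: str                             # text shown on the entry page, e.g. "-า"
    spelling_symbol: Optional[str] = None  # hypothetical model of the "Spelling symbol" field

    def match_key(self) -> str:
        # Prefer the explicit spelling symbol; fall back to the displayed title.
        return self.spelling_symbol or self.title


def find_missing(word, entries):
    """Return (position, character) pairs that have no matching entry."""
    index = {entry.match_key(): entry for entry in entries}
    return [(pos, ch) for pos, ch in enumerate(word, start=1) if ch not in index]


consonants = [Entry("ส"), Entry("น"), Entry("ม")]

before = find_missing("สนาม", consonants + [Entry("-า")])
after = find_missing("สนาม", consonants + [Entry("-า", spelling_symbol="า")])

print(before)  # [(3, 'า')]
print(after)   # []
```

Because the spelling symbol is stored on the letter's entry rather than on the word, every later lookup of that letter benefits from it.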

Note that sometimes the character you type in the Spelling symbol field will be invisible. To check whether it has been entered correctly, you may need to copy and paste the contents of the field into a UTF-8 encoder/decoder to see more clearly what is there, e.g.

https://mothereff.in/utf-8 

UTF-8 encoder/decoder
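If you have Python to hand, a few lines using the standard `unicodedata` module can do a similar check offline. This is only a sketch; the `describe` function is a made-up helper, not part of Lingopolo.

```python
import unicodedata


def describe(text):
    """List each character's code point, Unicode name and UTF-8 bytes."""
    lines = []
    for ch in text:
        name = unicodedata.name(ch, "<no name>")
        utf8 = " ".join(f"{b:02x}" for b in ch.encode("utf-8"))
        lines.append(f"U+{ord(ch):04X} {name} [{utf8}]")
    return lines


# Paste the contents of the Spelling symbol field between the quotes.
for line in describe("า"):
    print(line)  # U+0E32 THAI CHARACTER SARA AA [e0 b8 b2]
```

Any extra line in the output, such as a zero-width space, means the field contains more than the one symbol you intended.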