Understand spoken language

Rank score

Submitted by admin on 19 March 2017

Introduction

The rank score is a value given to each word or phrase to determine where it belongs in the learning order of all words and phrases; the higher the rank score, the sooner the student should learn it.

For example, a student should learn the words for common animals like dog, cat, bird (which appear in Animals 1), before obscure words like guinea pig, razorbill, and snail (which appear in Animals 10).

Usage in assigning words to Sublessons

Sublessons are lessons like Animals 1, Animals 2, ... Animals 10. We want to put each word automatically into the correct sublesson, so that Animals 1 has the easiest, most common words and Animals 10 has the hardest, most obscure words.

To do this we calculate a ranking score.

The ranking score is a bit like the Google rank score; the higher the score for a word, the earlier a student should learn it.
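As a rough illustration of the idea, the sketch below sorts words by descending rank score and cuts the sorted list into numbered sublessons. The function name `assign_sublessons`, the fixed `words_per_sublesson` size, and every score except those for dog and cat (which are quoted elsewhere on this page) are assumptions made for the example; the site's actual splitting rule is not described here.

```python
def assign_sublessons(rank_scores, lesson_name, words_per_sublesson):
    """Split words into numbered sublessons, highest rank score first.

    Hypothetical helper: the fixed chunk size is an assumption.
    """
    # Sort by descending rank score; ties are broken alphabetically.
    ordered = sorted(rank_scores, key=lambda word: (-rank_scores[word], word))
    sublessons = {}
    for start in range(0, len(ordered), words_per_sublesson):
        number = start // words_per_sublesson + 1
        chunk = ordered[start:start + words_per_sublesson]
        sublessons[f"{lesson_name} {number}"] = chunk
    return sublessons


# dog and cat scores are quoted on this page; the others are invented.
scores = {"dog": 2695, "cat": 1096, "bird": 900, "snail": 120, "razorbill": 101}
lessons = assign_sublessons(scores, "Animals", words_per_sublesson=3)
print(lessons)
```

With these sample values, the three highest-scoring words land in Animals 1 and the remaining two in Animals 2.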

Let's compare the rank scores of some words in these sublessons. Below is each word, followed by its rank score in brackets. Notice how words with a higher rank score are learnt before words with a lower score.

Some Animals 1 words:

Some Animals 10 words:

Where to see the Rank score

The rank score for every word can be seen on the "Rank values" tab:

 

How the Rank score is calculated

Number of Examples

The main contributing factor to the rank score is how many example sentences a word has; the more example sentences, the higher the score. There are some more subtle measures, like the length of the word, to help separate words with equal numbers of example sentences, but 90% of the score comes from a simple count of how many example sentences a word has. This corresponds quite accurately to how useful a word is to someone. If a word has many example sentences, a student probably should be learning the word sooner than a word with few example sentences.

For example, dog has 21 example sentences (see the "Examples count" of 21 above, or simply look at the dog page). 100 points are given for every example, so 21 examples add 2100 to the score. The actual score is 2695, so another 595 points come from the other, more subtle factors.

As another example, cat has only 9 example sentences, which contributes 9 x 100 = 900 to the cat score. The actual cat rank score is 1096, so another 196 points come from the other, more subtle factors.

At the extreme, the (feminine) has 430 example sentences and has a rank score of 59,698.
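The arithmetic in these examples can be checked with a few lines of Python. This is only a sketch of the example-count portion of the score: the names `POINTS_PER_EXAMPLE`, `examples_contribution` and `other_factors` are made up for illustration, and the remainder is simply whatever the other factors add.

```python
POINTS_PER_EXAMPLE = 100  # "100 points is given for every example"


def examples_contribution(example_count):
    """Points contributed by the number of example sentences."""
    return POINTS_PER_EXAMPLE * example_count


def other_factors(actual_score, example_count):
    """Part of the actual score not explained by the example count."""
    return actual_score - examples_contribution(example_count)


# (example count, actual rank score) as quoted in the text
quoted = {
    "dog": (21, 2695),
    "cat": (9, 1096),
    "the (feminine)": (430, 59698),
}

for word, (count, score) in quoted.items():
    print(word, examples_contribution(count), other_factors(score, count))
# dog 2100 595
# cat 900 196
# the (feminine) 43000 16698
```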

Length

Length is one of the other criteria. Generally a shorter word is more common and useful, and should therefore be learnt earlier than a longer word, so a student should probably learn the word ten before the word registration. Part of the rank score therefore comes from how long the word or phrase is: the shorter it is, the bigger the bonus. The bonus added to the score is one hundred minus the length of the word or phrase. So "dix", the word for "ten", has length 3, and 100 - 3 = 97 is added to the score. The word for registration has length 15, so 100 - 15 = 85 is added to the score.
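A minimal sketch of the length bonus, assuming length is simply the number of characters. The function name `length_bonus` is hypothetical, and because the 15-letter word for registration is not spelled out in the text, a 15-character placeholder stands in for it.

```python
def length_bonus(text):
    """Bonus of one hundred minus the length of the word or phrase."""
    return 100 - len(text)


print(length_bonus("dix"))      # 97
print(length_bonus("x" * 15))   # 85 (placeholder for the 15-letter word)
```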

Rank of breakdown items

Phrases also have a rank score. The problem for phrases is that, in general, they have no example sentences which use them. A phrase like a dog has no example sentences. How then to calculate a rank score? A rank score for a phrase is made just a bit lower than the rank score of the lowest ranked word in the phrase.

For example, the rank score of a dog is 2688.

To calculate this, we look at the rank score of the words which make it up:

  • a (masculine) is one of the most common words in the language, and so has a huge rank score of 68,598 (it has 624 example sentences)
  • dog is ranked a lot lower, and has a rank score of only 2,695.

The rank score of "a dog" is therefore made just a little lower than the rank score for dog. It is calculated by taking the score of the lowest-ranked word and then deducting the length of the phrase.
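The rule can be sketched as the minimum of the word scores minus the phrase length. The text does not say exactly how the phrase length is counted, so it is passed in as a parameter here; the value 7 is not stated in the text but inferred from the gap between the quoted scores (2695 - 2688). The function name `phrase_rank` is hypothetical.

```python
def phrase_rank(word_scores, phrase_length):
    """Lowest word score in the phrase, minus the length of the phrase."""
    return min(word_scores) - phrase_length


# Scores quoted above: a (masculine) = 68,598 and dog = 2,695.
# phrase_length=7 is an inference from the quoted result, not a stated fact.
print(phrase_rank([68598, 2695], phrase_length=7))  # 2688
```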

Reflexively applied

Part of the rank score of a word comes from the whole breakdown tree to which it belongs. For example, the word I (before a vowel sound) only has 17 examples. However, it is used in the phrase "I have" 50 times. In that case the score is increased not just for the number of direct uses, but also for the number of indirect uses. That is what the "Examples tree count" is showing; look at the Rank values tab for I (before a vowel sound), and you will see that there are 17 examples, but 129 indirect examples.
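One way to picture the indirect count is to walk up the breakdown tree and add the example counts of every phrase that contains the word. The sketch below is an assumption about how such a count could be gathered, not the site's actual method: the `used_in` mapping, the function name, and the "I have a dog" entry are invented, and it reads "used 50 times" as 50 examples of "I have". How the indirect count is converted into points is not stated in the text, so that step is left out.

```python
def indirect_example_count(item, direct_examples, used_in):
    """Sum the direct example counts of every phrase that contains item,
    directly or through a longer phrase. Each phrase is counted once."""
    seen = set()
    stack = list(used_in.get(item, []))
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue
        seen.add(parent)
        stack.extend(used_in.get(parent, []))
    return sum(direct_examples.get(parent, 0) for parent in seen)


# 17 is quoted in the text; 50 is one reading of it; 4 is invented.
direct_examples = {
    "I (before a vowel sound)": 17,
    "I have": 50,
    "I have a dog": 4,
}
# Hypothetical breakdown tree: item -> phrases that contain it.
used_in = {
    "I (before a vowel sound)": ["I have"],
    "I have": ["I have a dog"],
}

print(indirect_example_count("I (before a vowel sound)", direct_examples, used_in))  # 54
```

The 129 indirect examples quoted for I (before a vowel sound) would come from the full breakdown tree, which this small sample does not reproduce.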

Rank Detail tab

Every lesson has a "Rank Detail" tab which shows the words in order of their rank score.

For example, take a look at the Birds Rank Detail tab.

The global list of rank scores

Since every lesson has a Rank Detail tab, the place to look if we want to see the rank scores of all the words and phrases is the Rank Detail tab of the All lesson, which is the lesson containing all words and phrases.

This is the list which shows us the ideal order in which students should learn each word or phrase, ranked from highest to lowest.

Restricting rank of Example lessons

When we generate the example phrase lessons for a sublesson, we restrict the examples to the range of rank values of the words in that sublesson. Why? So that the examples are at the same level as the words. For example, the Animals 1 examples are restricted to the range of rank values of the Animals 1 words, so they will all be easier than the words in Animals 2. Imagine that the Animals 1 words have rank values from 3000 down to 1000. The Animals 1 Examples would also all be in this range. The words in Animals 2 might have rank values from 1000 down to 800, and all their examples would also be kept in that range.

The reason is that very simple words with very high rank values, e.g. "dog", "the", can appear in extremely advanced and complicated sentences with extremely low rank values, e.g. "I went with my dog to an equestrian class where we studied metaphysics together and reflected on whether the universe obeys Ptolemaic geometrical norms". We don't want very complex sentences at level 1 of the sublessons!
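The restriction can be sketched as a simple range filter. It is assumed here that each example sentence has its own rank value and that the allowed range runs from the lowest to the highest rank value among the sublesson's words; the function name `restrict_examples` and all sample sentences and scores are invented for illustration.

```python
def restrict_examples(example_scores, lesson_word_scores):
    """Keep only examples whose rank value lies within the range of
    rank values of the words in the sublesson (inclusive)."""
    low = min(lesson_word_scores.values())
    high = max(lesson_word_scores.values())
    return [
        sentence
        for sentence, score in example_scores.items()
        if low <= score <= high
    ]


# Hypothetical Animals 1 words spanning the 3000 to 1000 range from the text.
animals_1_words = {"dog": 3000, "cat": 1800, "bird": 1000}

# Hypothetical example sentences with invented rank values.
examples = {
    "I have a dog": 2500,
    "The cat is black": 1200,
    "My dog studies metaphysics": 40,
}

print(restrict_examples(examples, animals_1_words))
# ['I have a dog', 'The cat is black']
```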