The automatic process sometimes creates the same word twice (or more!). That is, two or more pages end up with identical English and an identical Target language translation.
Look at the following example. We see the page for the English word "a", which translates to the Spanish word "una". So far, so good. We notice there are "4 examples". OK, not every word has a lot of examples of use.
However, there will usually (though not always) be a clue: a "This is a duplicate" warning, which points to a word with exactly the same English and exactly the same Target language translation.
In this example, the page we are on has the English "a" and the translation "una", and the duplicate page has the identical values; the English is "a" and the translation is "una".
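The duplicate check described above amounts to grouping words by their (English, translation) pair. Here is a minimal sketch of that idea in Python. The record layout and the field names (`id`, `english`, `translation`, `clarifier`) are assumptions made for illustration and are not taken from the site itself; the third record is an invented non-duplicate added for contrast.

```python
from collections import defaultdict

# Hypothetical word records; field names are assumptions for illustration only.
words = [
    {"id": 1, "english": "a", "translation": "una", "clarifier": ""},
    {"id": 2, "english": "a", "translation": "una", "clarifier": "feminine"},
    {"id": 3, "english": "a", "translation": "un", "clarifier": "masculine"},
]


def find_duplicate_candidates(words):
    """Group word ids by (English, translation) and keep groups with more than one id."""
    groups = defaultdict(list)
    for word in words:
        # The Clarifier is deliberately ignored: duplicates may differ only there.
        groups[(word["english"], word["translation"])].append(word["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


candidates = find_duplicate_candidates(words)
```

A pair found this way is only a candidate. As the rest of this page explains, the two words should be merged only if they also have the same meaning.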
If we drill through to this other page, it does look like the page is just the same word duplicated, although the duplicate in this case has "(feminine)" as the Clarifier field:
When we look further on the page, we see that this other version of "a" has 578 examples:
When we look carefully at the two pages, it is clear (in this example at least) that these are really the same word: they have the same English, the same Target language translation, and, most importantly, exactly the same meaning.
They therefore belong on only one page. You can use ChatGPT to explain the breakdown of some of the sentences on each page in order to be sure whether or not the meanings of the words really are the same.
So, once it is clear that the two words need to be merged, here is the procedure:
- decide which version of the word will be deleted. Usually, choose the one with the fewest examples. Here, it is easy; one version has only 4 examples and the other has 578, so we choose to delete the version with only 4 examples.
- change the word which is soon to be deleted, so that it is clear in the literal breakdowns that this is the version being removed:
  - make it Unpublished
  - add text such as "deprecated" or "to delete soon" to the Clarifier field
- go through each example which uses the "to delete soon" word, and edit its literal breakdown so that it uses the other version of the word instead
- when there are no examples left which use the "to delete soon" word, then that word can be safely deleted
For example, we start with:
- "a", 4 examples
- "a (feminine)", 578 examples
When we have changed the 4 examples to use the other version of the word, we will have:
- "a", 0 examples
- "a (feminine)", 582 examples
At this stage, it is safe to delete the "a" page.
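
To make the bookkeeping concrete, here is a small in-memory sketch of the merge procedure in Python. It is an illustration under stated assumptions, not the site's actual data model: words are assumed to be dictionaries keyed by a hypothetical id, each example's literal breakdown is assumed to be a list of word ids, and the function names are invented for this sketch.

```python
def count_examples(examples, word_id):
    """Count the examples whose literal breakdown uses the given word."""
    return sum(1 for breakdown in examples if word_id in breakdown)


def merge_duplicate(words, examples, id_a, id_b):
    """Merge two duplicate words and return the id of the version that is kept."""
    # Step 1: the version with the fewest examples is the one to delete.
    doomed, keeper = sorted((id_a, id_b), key=lambda i: count_examples(examples, i))

    # Step 2: mark the doomed version so it stands out in the literal breakdowns.
    words[doomed]["published"] = False
    words[doomed]["clarifier"] = "to delete soon"

    # Step 3: point every literal breakdown at the version being kept.
    for breakdown in examples:
        for position, word_id in enumerate(breakdown):
            if word_id == doomed:
                breakdown[position] = keeper

    # Step 4: delete only once no example uses the doomed version.
    if count_examples(examples, doomed) != 0:
        raise RuntimeError("some examples still use the word to be deleted")
    del words[doomed]
    return keeper


# Hypothetical data mirroring the example: "a" has 4 examples, "a (feminine)" has 578.
words = {
    1: {"english": "a", "translation": "una", "clarifier": "", "published": True},
    2: {"english": "a", "translation": "una", "clarifier": "feminine", "published": True},
}
examples = [[1] for _ in range(4)] + [[2] for _ in range(578)]

kept = merge_duplicate(words, examples, 1, 2)
print(kept, count_examples(examples, kept))  # prints "2 582"
```

The deletion comes last and is guarded by a check, which mirrors the rule above: a word is only safe to delete when no example uses it any more.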