
VMs: Re: list of words from candidate languages...




You will be surprised if you consider conjugated forms (which are not to be found in a dictionary). Imagine "AM" as ed/ng... (English) or ng/en/nd... (German) or masu/shita/desu... (Japanese). These are all common conjugation endings. And don't assume that the VMS is a single substitution cipher.

Cheers
Claus

-----Original Message-----
From: Guy Thibault [mailto:gthibaul@xxxxxxxxxxxxxxxxxx]
Sent: Wednesday, 23 July 2003 14:00
To: Vms-List@Voynich. Net
Subject: VMs: list of words from candidate languages...


Hello

For some time I have been playing with the file Intrln17.txt... I loaded it into a SQL database and I run queries on the occurrence of words. For instance, this query:

    select mot, count(*)
    from voynich
    where codlangage = 'F'
      and charindex('%', mot) = 0
      and charindex('*', mot) = 0
      and charindex('!', mot) = 0
      and charindex('[', mot) = 0
      and charindex(']', mot) = 0
      and charindex('|', mot) = 0
      and patindex('%AM%', mot) <> 0
    group by mot
    having count(*) > 10
    order by len(mot), mot desc

produces this list:

AM      233
TAM     41
SAM     16
RAM     75
OAM     24
HAM     42
EAM     12
DAM     90
8AM     827
2AM     163
TDAM    18
T8AM    14
ORAM    45
OHAM    139
OEAM    41
ODAM    181
O8AM    54
HZAM    13
GHAM    36
GDAM    38
G8AM    19
EDAM    39
ARAM    11
4OAM    18
TODAM   12
TO8AM   39
TC8AM   30
SO8AM   18
SC8AM   12
OEDAM   44
4OHAM   94
4ODAM   336
4O8AM   39
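
For readers without a SQL database, the same count can be sketched in Python. This is only a rough equivalent under assumptions: the words are taken to be already extracted from the interlinear file into a plain list (so the `codlangage = 'F'` filter is assumed to have been applied beforehand), and the function name `count_words_containing` is hypothetical. Python's default string ordering may also differ from the database's collation.

```python
from collections import Counter

# Characters the SQL query filters out of the transcription words.
EXCLUDED = set("%*![]|")


def count_words_containing(words, key="AM", threshold=10):
    """Return (word, count) pairs for clean words containing `key`
    that occur more than `threshold` times."""
    counts = Counter(
        w for w in words
        if key in w and not (EXCLUDED & set(w))
    )
    rows = [(w, n) for w, n in counts.items() if n > threshold]
    # Mirror "order by len(mot), mot desc": sort by word descending first,
    # then (stably) by length ascending.
    rows.sort(key=lambda row: row[0], reverse=True)
    rows.sort(key=lambda row: len(row[0]))
    return rows


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    sample = ["8AM"] * 12 + ["AM"] * 11 + ["DAM"] * 11 + ["RAM"] * 3 + ["O*AM"] * 20
    for word, n in count_words_containing(sample):
        print(f"{word:<8}{n}")
```

Like `patindex('%AM%', mot)`, the `key in w` test matches AM anywhere in the word, not only at the end.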

This helps me identify a key (in this case the "word" AM) and its occurrences in the "text". I think this particular key (AM) has interesting enough behaviour to make it a good candidate for identifying the language... Lots of 4-letter words ending in AM, like baLL, taLL, waLL in English... Which brings me to a suggestion for the group...

FWIW, it might be of interest to all to have a place where one could download a list of words in any particular language that one might think of... I suppose most of you scan the web for dictionaries and then extract the words to get a list... Like I did! If we put all of our files in one place (say the Voynich site) we'd have a very good sample of languages, from Old French, English, Italian and Latin to even Khowar :) Of course the idea is to "train" a program to match the "key's behaviour" ("AM") against the target language's word list...
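
One possible starting point for such a comparison, as a minimal sketch only: count how many distinct words in a language's word list share each ending, so that an ending behaving like AM (or like "LL" in English) stands out. The function names `ending_profile` and `top_endings` are hypothetical, and the word list is assumed to be a plain list of strings; a real matching program would need a much richer measure than this.

```python
from collections import Counter


def ending_profile(words, ending_len=2):
    """Count how many distinct words share each ending of length `ending_len`."""
    # Lower-case and de-duplicate so that "Ball" and "ball" count once.
    distinct = {w.lower() for w in words if len(w) >= ending_len}
    return Counter(w[-ending_len:] for w in distinct)


def top_endings(words, ending_len=2, n=5):
    """Return the `n` most common endings with their distinct-word counts."""
    return ending_profile(words, ending_len).most_common(n)


if __name__ == "__main__":
    # Tiny made-up English sample, only to show the call.
    english = ["ball", "tall", "wall", "call", "bird", "Ball"]
    print(top_endings(english, ending_len=2, n=1))
```

Running the same profile on each candidate language's list would give a set of endings to compare against the AM counts above.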

If anyone is interested in having the VB code that loads the interlinear file into SQL, I can email it. The end result is a 7 MB SQL database, which I could also upload to an FTP site...

That's it for now, I just thought it worth mentioning the idea of a common place where one could get "words" from various languages...


Cheers.
