Re: WG: average word length in VMS

Bruce Grant wrote:
> 1.    tabulating frequencies for all final strings of length 1, 2, 3, ...
> 2.    walking through each word from back to front, trying to find a spot
>       where the final string becomes much less frequent.
> E.g. if the word is "SOLUTION" there would be lots ending in "-N", "-ON",
> "-ION", "-TION" but not too many ending in "-UTION".
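
For reference, the two quoted steps can be sketched roughly as follows.
This is only a minimal illustration under my own assumptions, not
Bruce's actual procedure: the names suffix_counts and split_suffix, the
drop_ratio threshold of 0.5 and the six-word toy list are all
hypothetical.

```python
from collections import Counter

def suffix_counts(words, max_len=6):
    """Step 1: count how many tokens end in each final string of length 1..max_len."""
    counts = Counter()
    for w in words:
        for k in range(1, min(max_len, len(w)) + 1):
            counts[w[-k:]] += 1
    return counts

def split_suffix(word, counts, drop_ratio=0.5, max_len=6):
    """Step 2: walk from the back of the word and cut where adding one more
    letter makes the final string much less frequent (assumed threshold)."""
    # Only compare lengths that were actually tabulated (k + 1 <= max_len).
    for k in range(1, min(max_len - 1, len(word) - 1) + 1):
        shorter = counts[word[-k:]]
        longer = counts[word[-(k + 1):]]
        if longer < drop_ratio * shorter:
            return word[:-k], word[-k:]
    return word, ""  # no sharp drop found

# Toy word list, invented for illustration only.
words = ["SOLUTION", "NATION", "MOTION", "STATION", "RATION", "POTION"]
counts = suffix_counts(words)
stem, ending = split_suffix("SOLUTION", counts)
```

On this toy list the cut lands between "SOLU" and "TION", because all
six words end in "-TION" and only one ends in "-UTION". A real word
list would need a better-motivated threshold than a fixed ratio.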

So how do we distinguish between 'tion' and a regular verb
inflection?  Finding lots of 'tion's in Voynich would most
likely be little more than a red herring.  Consider languages
like English with those kinds of endings, vs. a language like
German with lots of compound words, vs. Chinese with all words
being compound...  The tools needed to solve an unknown cipher
in an unknown language have not been developed yet.  Perhaps
we should apply some of these techniques to known languages
first.  There has to be a more decisive way to fingerprint a
language than entropy vs. average word length.  Again, in line
with my idea to translate the text into a different format:
what if a relative value of 'phonosyntactic oddity' were
assigned to each VMS token?  Wouldn't we be able to see a
pattern that reflected the language's outside influences and
would help ID it?
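
To make the 'oddity' idea a bit more concrete, here is one possible
sketch. It is purely an assumption about how such a value might be
computed: I take "oddity" to mean the average surprisal (in bits) of a
token's letter pairs under an add-one smoothed bigram model trained on
the text itself. The names train_bigrams and oddity and the handful of
toy tokens are hypothetical and are not a real transcription sample.

```python
import math
from collections import Counter

def train_bigrams(words):
    """Collect letter-pair counts, with ^ and $ marking word start and end."""
    pair_counts = Counter()
    first_counts = Counter()
    alphabet = set()
    for w in words:
        padded = "^" + w + "$"
        alphabet.update(padded)
        for a, b in zip(padded, padded[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return pair_counts, first_counts, len(alphabet)

def oddity(token, model):
    """Mean surprisal per letter pair; higher means less typical of the text."""
    pair_counts, first_counts, v = model
    padded = "^" + token + "$"
    pairs = list(zip(padded, padded[1:]))
    total = 0.0
    for a, b in pairs:
        # Add-one smoothing so unseen pairs get a small nonzero probability.
        p = (pair_counts[(a, b)] + 1) / (first_counts[a] + v)
        total += -math.log2(p)
    return total / len(pairs)

# Toy tokens for illustration only.
model = train_bigrams(["daiin", "dain", "chedy", "shedy", "qokedy", "qokaiin"])
typical = oddity("daiin", model)
strange = oddity("nqdka", model)
```

Tokens built from common letter sequences score low and tokens with
rare sequences score high, so plotting the scores in running order
would show whether the 'odd' tokens cluster anywhere. Whether such
clusters say anything about outside influences is exactly the open
question.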