VMs: Numbercrunching "word" tuples
Very interesting for me, being a newbie. What tools and input transcription
did you use? And how long is the input transcription? (If you give me a
pointer I'll look up the rest myself.) Your list is very interesting. I
can't help but wonder about coincidences like the following:
23 chedy qokaiin *
20 qokaiin chedy
24 daiin daiin **
22 chol chol
15 qokeedy qokeedy
19 shedy qokeedy ***
19 shedy qokedy
I don't have the tools yet to check it myself, but where would patterns like
these appear frequently in a natural language? I tried to find such
patterns in a random English book but found few at first sight:
* I think that pairs like "if that" / "that if" might be the most frequent.
** At the moment I can only think of an example in Dutch that always
confuses my spelling checker. It goes something like this: "Alle
aandachtspunten die onderzocht zijn, zijn in orde bevonden." (Roughly: "All
points of attention that have been examined have been found to be in
order.") And then there's a comma in between the two "zijn".
*** This one is easier: "of this", "of the" and "of a".
Not that this yields many insights, but it's an amusing exercise. And this
kind of analysis could even be applied to Chinese characters :-)
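
For anyone who wants to try the same check on another text, here is a
minimal sketch in Python. It assumes the text is plain, whitespace-separated
words; the function names and the sample string are made up for illustration
(the sample reuses a few words from the list above and is not a real
transcription). It counts adjacent word pairs, then reports pairs that occur
in both orders and words that directly follow themselves.

```python
from collections import Counter

def pair_counts(text):
    """Count every adjacent word pair in a whitespace-separated text."""
    words = text.split()
    return Counter(zip(words, words[1:]))

def reversed_pairs(counts):
    """Map (a, b) -> (count of "a b", count of "b a") for pairs seen in both orders."""
    result = {}
    for (a, b), n in counts.items():
        # a < b reports each unordered pair once and skips doubled words
        if a < b and (b, a) in counts:
            result[(a, b)] = (n, counts[(b, a)])
    return result

def doubled_words(counts):
    """Map word -> how often it is immediately followed by itself."""
    return {a: n for (a, b), n in counts.items() if a == b}

# Toy input only, not real transcription data.
sample = "chedy qokaiin chedy daiin daiin chol chol chol"
counts = pair_counts(sample)
print(reversed_pairs(counts))
print(doubled_words(counts))
```

Run on an English or Dutch book, the same functions would show how common
"if that" / "that if" style pairs really are compared with the list above.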
I am still inclined to write a "long pattern seeker" that would find long
repeating patterns ignoring the spaces. Maybe longer patterns than 4 words
would emerge?
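
One possible shape for such a seeker, as a rough sketch only: strip all
whitespace, sort the suffixes of the resulting string, and compare
neighbouring suffixes to find shared prefixes. The name longest_repeats and
the min_len parameter are my own assumptions, and this naive version is only
practical for fairly short inputs, since sorting whole suffixes is slow on
long texts. Repeats are allowed to overlap.

```python
def longest_repeats(text, min_len=4):
    """Return repeated substrings (whitespace ignored), longest first.

    Only the longest shared prefix of each pair of neighbouring
    sorted suffixes is reported, and only if it has at least
    min_len characters.
    """
    s = "".join(text.split())  # drop spaces and line breaks
    # Indices of all suffixes, ordered by the suffix text they start
    suffixes = sorted(range(len(s)), key=lambda i: s[i:])
    found = set()
    for i, j in zip(suffixes, suffixes[1:]):
        k = 0
        # Length of the common prefix of the two neighbouring suffixes
        while i + k < len(s) and j + k < len(s) and s[i + k] == s[j + k]:
            k += 1
        if k >= min_len:
            found.add(s[i:i + k])
    return sorted(found, key=lambda r: (-len(r), r))

# Toy input: the same letters with the word breaks in different places.
repeats = longest_repeats("qokeedy shedy x qoke edyshedy")
print(repeats[0])
```

Because the spaces are removed first, a sequence is found even when the two
occurrences are split into "words" differently.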
Greetings, Petr