
VMs: Numbercrunching "word" tuples



Very interesting for me, being a newbie. Which tools and which input
transcription did you use? And how long is the input transcription? (If you
give me a pointer I'll look up the rest myself.) Your list is very
interesting. I can't help but wonder about coincidences like the following:

23 chedy qokaiin *
20 qokaiin chedy

24 daiin daiin **
22 chol chol
15 qokeedy qokeedy

19 shedy qokeedy ***
19 shedy qokedy

I don't have the tools yet to check it myself, but where would patterns like
this appear frequently in a natural language? I tried to find these
patterns in a random English book but found few at first sight:

* I think that pairs like "if that" / "that if" might be the most frequent.
** At the moment I can only think of an example in Dutch that always
confuses my spelling checker. It goes something like this: "Alle
aandachtspunten die onderzocht zijn, zijn in orde bevonden." (Roughly: "All
points of attention that have been examined have been found to be in
order.") The two "zijn" are adjacent, separated only by a comma.
*** This one is easier: "of this", "of the" and "of a".
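
For anyone who wants to check such patterns in an English book or in a transcription, here is a minimal Python sketch of the kind of counting involved. It is not the tool used for the list quoted above; the function names and the sample sentence are made up for illustration, and it assumes the text is already split into words on whitespace.

```python
from collections import Counter

def pair_counts(words):
    """Count adjacent word pairs (bigrams) in a list of words."""
    return Counter(zip(words, words[1:]))

def reversed_pairs(counts):
    """Pairs seen in both orders: {(a, b): (count of 'a b', count of 'b a')}."""
    return {(a, b): (n, counts[(b, a)])
            for (a, b), n in counts.items()
            if a < b and (b, a) in counts}

def doubled_words(counts):
    """Words that directly follow themselves: {word: count of 'word word'}."""
    return {a: n for (a, b), n in counts.items() if a == b}

# Made-up sample sentence, not taken from any real corpus.
sample = "if that is so then that if is odd and that that is all".split()
counts = pair_counts(sample)
print(reversed_pairs(counts))   # pairs like "if that" / "that if"
print(doubled_words(counts))    # doubles like "that that"
```

Running the same three functions over a word list from a transcription file and over a plain English text would allow a side-by-side comparison of how often each kind of pattern turns up.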

Not that this yields many insights, but it's an amusing exercise. And this
kind of analysis could even be applied to Chinese characters :-)

I am still inclined to write a "long pattern seeker" that would find long
repeating patterns ignoring the spaces. Maybe longer patterns than 4 words
would emerge?
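
A naive first sketch of such a seeker, in Python, could look like the following. The names `repeated_substrings` and `longest_repeats` and the sample string are hypothetical, and the brute-force approach is only meant to show the idea: it removes all whitespace, then looks for the longest character sequence that occurs at least twice. For a full transcription something like a suffix array would probably scale better.

```python
from collections import defaultdict

def repeated_substrings(text, length):
    """Find character sequences of the given length that occur more than once.

    All whitespace is removed first, so word breaks are ignored.
    Returns {substring: [start positions in the space-free stream]}.
    """
    stream = "".join(text.split())
    positions = defaultdict(list)
    for i in range(len(stream) - length + 1):
        positions[stream[i:i + length]].append(i)
    return {s: p for s, p in positions.items() if len(p) > 1}

def longest_repeats(text, max_length=50):
    """Try lengths from max_length downwards and return the first hits."""
    for length in range(max_length, 0, -1):
        found = repeated_substrings(text, length)
        if found:
            return length, found
    return 0, {}

# Made-up sample with one deliberately shifted space ("qo keedychedy").
sample = "qokeedy chedy qokaiin shedy qo keedychedy daiin"
print(longest_repeats(sample))
```

Because the spaces are dropped before comparing, the repeat is found even though the word breaks differ between the two occurrences.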

Greetings, Petr