Re: VMs: VBScript for finding repeating strings

Hi Petr,

Just some quick comments on your VBScript string autocorrelation test.

(1) To my eyes, your 12-unit cut-off string match is giving dense ("black triangle") regions wherever the text contains a good number of matches longer than 12 units.

(2) These long matches seem to be heavy in the kind of [qo-ch/sh/k/t-e/ee/eee/eo-dy] "structured words" we've all come to know and love so much.

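(3) The length of the match is calculated in EVA units, not in glyph units, so a glyph transcribed as several EVA letters (such as "ch") counts more than once towards the cut-off.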

(4) You might consider replacing both probable glyphs (like "cfh", "ch", "sh", "ee", etc.) and probable verbose pairs (like "qo", "dy") with single non-EVA letters (like "A", "B", "C", ...), and reducing your cut-off length to suit, to see what other patterns you notice. This might reduce your reliance on "qoteedy"-style matches, and hence help other kinds of matches within the text show up.
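
To make the substitution idea in (4) concrete, here is a rough sketch in Python (not your VBScript). The table of sequences and the capital letters assigned to them are purely illustrative assumptions, as is the function name; swap in whatever glyph and verbose-pair list you prefer. Longer sequences are tried first, so that "eee" is not consumed as "ee" + "e".

```python
# Hypothetical substitution table: multi-letter EVA sequences mapped to
# single placeholder letters. The choice of sequences and letters is an
# assumption for illustration only.
SUBSTITUTIONS = [
    ("cfh", "A"),
    ("eee", "B"),
    ("ch", "C"),
    ("sh", "D"),
    ("ee", "E"),
    ("qo", "F"),
    ("dy", "G"),
]


def collapse_glyphs(text, table=SUBSTITUTIONS):
    """Replace each listed EVA sequence with its single placeholder letter.

    Scans left to right and, at each position, tries the longest
    sequences first; unlisted characters are copied through unchanged.
    """
    ordered = sorted(table, key=lambda pair: len(pair[0]), reverse=True)
    out = []
    i = 0
    while i < len(text):
        for seq, symbol in ordered:
            if text.startswith(seq, i):
                out.append(symbol)
                i += len(seq)
                break
        else:
            # No listed sequence starts here, so keep the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ("qoteedy", "cheeedy", "cfhy"):
        print(word, "->", collapse_glyphs(word))
```

After this step a word like "qoteedy" is four units long rather than seven, which is why the cut-off would need to come down with it.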

Alternatively, you might try adapting your code to compensate for matches longer than your cut-off (which I predict are what is giving you your black triangles).
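
One possible reading of "compensate", again sketched in Python rather than VBScript and without knowing how your script actually scores matches: report each repeated stretch once, at its full length, instead of once per cut-off-sized window inside it. The function name and the (first position, second position, length) output format are assumptions of mine.

```python
from collections import defaultdict


def maximal_matches(text, cutoff):
    """Return (i, j, length) for repeats of at least `cutoff` units.

    A window match at (i, j) is skipped when the preceding characters
    also agree, because it is then just the continuation of a longer
    match that has already been (or will be) reported from its start.
    """
    # Index every cutoff-length window by its content.
    positions = defaultdict(list)
    for i in range(len(text) - cutoff + 1):
        positions[text[i:i + cutoff]].append(i)

    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                i, j = starts[a], starts[b]
                if i > 0 and text[i - 1] == text[j - 1]:
                    continue  # interior of a longer match
                # Extend rightwards for as long as the two copies agree.
                length = cutoff
                while j + length < len(text) and text[i + length] == text[j + length]:
                    length += 1
                matches.append((i, j, length))
    return sorted(matches)


if __name__ == "__main__":
    sample = "abcdeXabcdeY"
    print(maximal_matches(sample, 3))
```

In the sample, the five-unit repeat "abcde" contains three separate three-unit window matches, but is reported as a single match of length 5. Note that this sketch allows the two copies to overlap each other, which may or may not be what you want.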

Cheers, .....Nick Pelling.....
