
Re: VMs: VBScript for finding repeating strings



> Could you provide a really simple slow description of what we are
> looking at in the triangular plot?
> How do you get from VMS to that image - step by step?

Gladly, but you'll have to set your mail reader to a fixed-width font
such as Courier, otherwise the diagrams will come out as gibberish :-)

Suppose the input file is:

abc.cde.ab.fg

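Then the algorithm shifts the file against itself, one position further
at each step, and looks for places where a character lines up with an
identical character. The size of the shift is the "distance":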

abc.cde.ab.fg
-abc.cde.ab.fg

distance = 1, no matches

abc.cde.ab.fg
--abc.cde.ab.fg

distance = 2, one match
the character "c" at positions 2 and 4 (counting from 0), length = 1

abc.cde.ab.fg
---abc.cde.ab.fg

distance = 3, one match
the character "." at positions 7 and 10, length = 1

et cetera ...

abc.cde.ab.fg
--------abc.cde.ab.fg

distance = 8, one match
the string "ab" at positions 0 and 8, length = 2

If these were the only matches, they would give the following set of dots
(x = position of the first occurrence, y = position of the second):

x=2, y=4
x=7, y=10
x=0, y=8

Actually, I ran the script and it found more matches than I had spotted
by eye:

File :  abc.txt
Lines:  1
Chars:  15
String1 String2 Distance Length String
2       4       2        1      |c|
7       10      3        1      |.|
3       7       4        1      |.|
3       10      7        1      |.|
0       8       8        2      |ab|
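The shifting procedure can be sketched in a few lines. This is a Python sketch, not the VBScript that produced the table; the function name find_repeats and its cutoff parameter are made up for illustration, and it assumes a match is a maximal run of aligned identical characters that is kept only if it is longer than the cutoff. It is run on the 13 visible characters of the example, without any line ending.

```python
def find_repeats(text, cutoff=0):
    """Shift `text` against itself and collect runs of matching characters.

    Returns tuples (pos1, pos2, distance, length, string), ordered by
    distance. Assumption: a run is kept only if its length > cutoff.
    """
    matches = []
    n = len(text)
    for distance in range(1, n):
        run_start = None
        # Go one index past the last comparable position so that a run
        # reaching the end of the text is still flushed.
        for i in range(n - distance + 1):
            same = i < n - distance and text[i] == text[i + distance]
            if same and run_start is None:
                run_start = i                      # a new run begins here
            elif not same and run_start is not None:
                length = i - run_start             # the run ended just before i
                if length > cutoff:
                    matches.append((run_start, run_start + distance,
                                    distance, length, text[run_start:i]))
                run_start = None
    return matches


if __name__ == "__main__":
    print("String1 String2 Distance Length String")
    for pos1, pos2, distance, length, string in find_repeats("abc.cde.ab.fg"):
        print(f"{pos1:<7} {pos2:<7} {distance:<8} {length:<6} |{string}|")
```

With the example input this gives the same five rows as the table above, in the same order.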

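If we make an x-y plot of these matches (x = String1, counted from 0 at
the left; y = String2, with y = 10 in the top row and y = 1 in the
bottom row), it looks like this: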

,..X.,.X..,....,
,....,....,....,
X....,....,....,
,..X.,....,....,
,....,....,....,
,....,....,....,
,.X..,....,....,
,....,....,....,
,....,....,....,
,....,....,....,
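For anyone who wants to redraw the picture, here is a small Python sketch that turns the match positions into the same kind of character plot. The function name render_plot and the default width and height of 16 and 10 are assumptions read off the plot above, not taken from the original script.

```python
def render_plot(points, width=16, height=10):
    """Draw (x, y) points as text, with the largest y in the top row.

    'X' marks a match, ',' marks every fifth column, '.' fills the rest.
    """
    rows = []
    for y in range(height, 0, -1):        # top row is y = height, bottom is y = 1
        row = []
        for x in range(width):
            if (x, y) in points:
                row.append("X")
            elif x % 5 == 0:
                row.append(",")
            else:
                row.append(".")
        rows.append("".join(row))
    return "\n".join(rows)


if __name__ == "__main__":
    # (String1, String2) pairs from the table of matches
    dots = {(2, 4), (7, 10), (3, 7), (3, 10), (0, 8)}
    print(render_plot(dots))
```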

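Because the second position is always larger than the first (y > x), the
dots can only fall on one side of the diagonal, which is where the
triangular shape comes from. If you take a bigger input file, you get a
bigger triangle.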

In the above example I set the cutoff at 0, so every match is accepted.
In my VMS calculation I set the cutoff at 12, so only repeated strings
longer than 12 characters are printed.
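To make the effect of the cutoff concrete, here is a tiny Python sketch applied to the five rows of the small example. The helper name apply_cutoff is hypothetical, and it assumes the comparison is strictly "length > cutoff", as described above.

```python
# Rows from the example table: (String1, String2, Distance, Length, String)
MATCHES = [
    (2, 4, 2, 1, "c"),
    (7, 10, 3, 1, "."),
    (3, 7, 4, 1, "."),
    (3, 10, 7, 1, "."),
    (0, 8, 8, 2, "ab"),
]


def apply_cutoff(matches, cutoff):
    """Keep only matches whose length is strictly greater than the cutoff."""
    return [m for m in matches if m[3] > cutoff]


if __name__ == "__main__":
    for cutoff in (0, 1, 12):
        print(cutoff, apply_cutoff(MATCHES, cutoff))
```

With cutoff 0 all five rows survive, with cutoff 1 only "ab" remains, and with cutoff 12 nothing in this tiny example is long enough to be printed.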

