Every state in the PFSM assigns a probability to every input token. (If a token is not allowed in a state, its probability is zero.) If the probability of a token in a certain state is P, then each appearance of that token in that state generates log2(1/P) bits of information. The input text can be run through the PFSM and the information from every input token totaled up (or averaged per token). The best-fit PFSM is the one whose combination of transitions and probability assignments yields the smallest total (or average) information when the sample text is run through it.

<technical>
A PFSM, of course, is a pair of functions p() and x() on a set S of states and a set T of tokens. If s is a state and t is a token, then p(s,t) is the probability that t will be the input token when the PFSM is in state s, and x(s,t) is the next state that the PFSM transitions to when t is the input token in state s. Suppose t_0, t_1, t_2, ... is the sequence of input tokens. Generate the sequence s_0, s_1, s_2, ... of states, where s_0 is the initial state and s_(n+1) = x(s_n, t_n). Then the total information generated by the input text is

    sum over n of log2(1/p(s_n, t_n)).

(A Python sketch of this computation appears after the quoted message below.)
</technical>

-Ben

-----Original Message-----
From: owner-vms-list@xxxxxxxxxxx on behalf of Jim Gillogly
Sent: Tue 3/8/2005 4:35 PM
To: vms-list@xxxxxxxxxxx
Cc:
Subject: Re: VMs: RE: RE: Best-fit 2-state PFSMs

They all looked legible and interesting to me.  How did you do the best fit?
--
Jim Gillogly
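Here is a minimal Python sketch of the scoring described in the <technical> block above. The two-state transition table and the sample token string are hypothetical placeholders, not a machine fitted to any real text. It relies on one fact: for a fixed transition function x, the state sequence is fixed, so the probability assignment that minimizes the total information is simply the empirical frequency of each token in each state. The search for the best-fit PFSM therefore reduces to a search over transition structures.

import math
from collections import defaultdict

# Hypothetical two-state PFSM over the tokens "a" and "b"; the transition
# table is a made-up placeholder. x[s][t] is the next state when token t
# arrives in state s.
x = {
    0: {"a": 1, "b": 0},
    1: {"a": 0, "b": 1},
}

def fit_probabilities(tokens, x, states, s0=0):
    """For a fixed transition function x, return the probability
    assignment p(s, t) that minimizes the total information:
    the empirical frequency of each token in each state."""
    counts = {s: defaultdict(int) for s in states}
    s = s0
    for t in tokens:
        counts[s][t] += 1
        s = x[s][t]
    p = {}
    for s in states:
        n = sum(counts[s].values())
        p[s] = {t: c / n for t, c in counts[s].items()} if n else {}
    return p

def total_information(tokens, p, x, s0=0):
    """Run the tokens through the PFSM, summing log2(1/p(s_n, t_n))."""
    s = s0
    bits = 0.0
    for t in tokens:
        prob = p[s].get(t, 0.0)
        if prob == 0.0:
            return float("inf")  # token not allowed in this state
        bits += math.log2(1.0 / prob)
        s = x[s][t]
    return bits

tokens = list("abbaab")  # stand-in for the tokenized sample text
p = fit_probabilities(tokens, x, states=[0, 1])
bits = total_information(tokens, p, x)
print(f"total: {bits:.2f} bits, average: {bits / len(tokens):.2f} bits/token")

Running this prints the total and average bits per token for the toy string; a token with zero probability scores as infinite information, matching the rule that a disallowed token rules the machine out.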