
Re: Collaboration on VMS

Hi John,

At 19:25 10/08/01 -0400, John Grove wrote:
If anyone is going to build a system that everyone can access - great! I
would suggest looking into the benefits of XML for cross-platform support.
It would be a major undertaking to convert all the old data into marked-up
content, create specific XSL Translators so the data can be shared and
displayed regardless of operating systems or software.

I've also spent a large part of this year working with XML: while it's an extremely useful *data interchange* mechanism, the primary issue here is about *data modelling*.

Bodies of knowledge (basically, databases) are usually framed as sets (tables) of atomic data each with their own multi-dimensional space - the columns of each table are dimensions.

But this is a very static knowledge model, and here we have the difficulty that a lot of the data may be flat wrong. The weakness of the normal approach here is that, ultimately, users of it need to absorb an encyclopaedia-full of data before they can infer where the edges of it lie.

I'm trying here to build a newer "edge-centric" model of knowledge, defined in terms of what we don't know - outside-in, rather than inside-out. Most sources of information are "declarative" - I suppose the model I'm constructing is more "interrogatory". We'll see how it goes. :-)

I've thought that we could actually go as far as marking up the character
set, tokens, titles, labels, etc... but considered the time that would take
and said to myself - I have to learn a lot more about content markup before
I could even suggest such a thing. However, along comes someone else that
wishes to improve the 'knowledge sharing' capabilities of this list - so
I'll suggest that a web-based product using W3C standards in markup should
allow for an incredibly versatile product we could use to share, search,
cross-reference, and crunch.

Remapping the various VMS transcriptions into XML would be extremely easy: but, again, I have to say that the existing transcriptions already serve the function of data interchange admirably.

One idea I had was to write a web page in JavaScript that contained one (or all) of the VMS transcriptions, and let you define and combine rules persistently, and examine the output (and its entropy).
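As a rough sketch of what "apply rules, then examine the entropy" might look like, assuming modern JavaScript and first-order (single-character) Shannon entropy: the function names `applyRules` and `entropy`, the rule format, and the sample string are all hypothetical placeholders, not real transcription data or an existing tool.

```javascript
// Hypothetical rule format: an ordered list of [pattern, replacement] pairs.
// Each rule is applied to the output of the previous one.
function applyRules(text, rules) {
  return rules.reduce(
    (out, [pattern, replacement]) => out.replace(pattern, replacement),
    text
  );
}

// First-order Shannon entropy, in bits per character.
// Returns 0 for an empty string.
function entropy(text) {
  const chars = Array.from(text);
  const counts = new Map();
  for (const ch of chars) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  let h = 0;
  for (const n of counts.values()) {
    const p = n / chars.length;
    h -= p * Math.log2(p);
  }
  return h;
}

// Made-up example: strip whitespace, then merge the pair "ab" into one token.
const exampleRules = [
  [/\s+/g, ""],
  [/ab/g, "X"],
];
const rewritten = applyRules("ab ab cd", exampleRules);
console.log(rewritten, entropy(rewritten));
```

Comparing `entropy` before and after a rule set is applied is one simple way to see how much a proposed rewriting changes the character statistics; higher-order measures would need a different function.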

Surprisingly, on similar tasks I've found JavaScript to be amply fast: the transparency and ease of developing the user interface give it a huge advantage.

Also, it's not widely known that JavaScript 1.2 has a regular expression handler that's essentially the same as Perl 4's, only running on the client side: this hugely increases its suitability for use in this context.

So, would building a simple on-line tool with a set of predefined (and named) regexp rules/filters ("Language A only", "Herbals only", etc) that can be joined and executed interactively be the kind of thing you have in mind? This could open up the statistical analysis of the VMS into more of an openly collaborative venture.
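A minimal sketch of how named regexp filters might be joined, assuming modern JavaScript. The line format (`<page> <section> <language>: <text>`), the sample lines, and the `select` function are invented purely for illustration and do not correspond to any existing transcription file.

```javascript
// Hypothetical line format: "<page> <section> <language>: <text>".
// These lines are placeholders, not real transcription data.
const sampleLines = [
  "p1 herbal A: aaa bbb",
  "p2 herbal B: ccc ddd",
  "p3 astro A: eee fff",
];

// Named filters: a line is kept only if it matches the filter's pattern.
// No "g" flag, so RegExp.prototype.test stays stateless between calls.
const filters = new Map([
  ["Language A only", /^\S+ \S+ A:/],
  ["Herbals only", /^\S+ herbal /],
]);

// Join filters by name: keep the lines that match every named filter (AND).
function select(lines, names) {
  const patterns = names.map((name) => {
    if (!filters.has(name)) {
      throw new Error("Unknown filter: " + name);
    }
    return filters.get(name);
  });
  return lines.filter((line) => patterns.every((re) => re.test(line)));
}

console.log(select(sampleLines, ["Language A only", "Herbals only"]));
```

Joining with AND is only one choice; an interactive tool could equally offer OR or NOT combinations over the same named patterns.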

Cheers, .....Nick Pelling.....