Monday, January 9, 2012

(empo-tymshft) The solution to "Losing information at CERN" - how about Mesh?

Modern-day users employ the World Wide Web for a variety of tasks.

But how many organizations require the World Wide Web to keep track of information?

Yet keeping track of information was exactly the problem facing CERN in 1989, or at least that's what Tim Berners-Lee claimed. To get funding for his hypertext proposal, he had to present a business case to CERN, and in 1989 he did. Here's an excerpt from that business case:

Losing Information at CERN

CERN is a wonderful organisation. It involves several thousand people, many of them very creative, all working toward common goals. Although they are nominally organised into a hierarchical management structure, this does not constrain the way people will communicate, and share information, equipment and software across groups.

The actual observed working structure of the organisation is a multiply connected "web" whose interconnections evolve with time. In this environment, a new person arriving, or someone taking on a new task, is normally given a few hints as to who would be useful people to talk to. Information about what facilities exist and how to find out about them travels in the corridor gossip and occasional newsletters, and the details about what is required to be done spread in a similar way. All things considered, the result is remarkably successful, despite occasional misunderstandings and duplicated effort.

A problem, however, is the high turnover of people. When two years is a typical length of stay, information is constantly being lost. The introduction of the new people demands a fair amount of their time and that of others before they have any idea of what goes on. The technical details of past projects are sometimes lost forever, or only recovered after a detective investigation in an emergency. Often, the information has been recorded, it just cannot be found.

If a CERN experiment were a static once-only development, all the information could be written in a big book. As it is, CERN is constantly changing as new ideas are produced, as new technology becomes available, and in order to get around unforeseen technical problems. When a change is necessary, it normally affects only a small part of the organisation. A local reason arises for changing a part of the experiment or detector. At this point, one has to dig around to find out what other parts and people will be affected. Keeping a book up to date becomes impractical, and the structure of the book needs to be constantly revised.

The sort of information we are discussing answers, for example, questions like

* Where is this module used?
* Who wrote this code? Where does he work?
* What documents exist about that concept?
* Which laboratories are included in that project?
* Which systems depend on this device?
* What documents refer to this one?

The problems of information loss may be particularly acute at CERN, but in this case (as in certain others), CERN is a model in miniature of the rest of world in a few years time. CERN meets now some problems which the rest of the world will have to face soon. In 10 years, there may be many commercial solutions to the problems above, while today we need something to allow us to continue.


Berners-Lee then examined existing organizational methods, all of which were too dependent upon a particular person or a particular organizational structure to be of long-term practical use.

The solution? Hypertext.

In 1980, I wrote a program for keeping track of software with which I was involved in the PS control system. Called Enquire, it allowed one to store snippets of information, and to link related pieces together in any way. To find information, one progressed via the links from one sheet to another, rather like in the old computer game "adventure". I used this for my personal record of people and modules. It was similar to the application Hypercard produced more recently by Apple for the Macintosh. A difference was that Enquire, although lacking the fancy graphics, ran on a multiuser system, and allowed many people to access the same data.
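To make the idea of linked sheets a little more concrete, here is a minimal Python sketch of notes joined by labeled links. It is only an illustration under my own assumptions: the `Notebook` class, its method names, and the sample sheets are all hypothetical, and nothing here reflects how Enquire was actually implemented.

```python
from collections import defaultdict


class Notebook:
    """Toy store of named sheets joined by labeled links (hypothetical design)."""

    def __init__(self):
        self.sheets = {}                    # sheet name -> snippet of text
        self.links = defaultdict(list)      # source -> [(label, target), ...]
        self.backlinks = defaultdict(list)  # target -> [(label, source), ...]

    def add_sheet(self, name, text):
        self.sheets[name] = text

    def link(self, source, label, target):
        # Both ends must already exist, so a link never points at nothing.
        for name in (source, target):
            if name not in self.sheets:
                raise KeyError(f"unknown sheet: {name}")
        self.links[source].append((label, target))
        self.backlinks[target].append((label, source))

    def follow(self, source, label):
        """Sheets reached from `source` through links carrying `label`."""
        return [t for l, t in self.links.get(source, []) if l == label]

    def referrers(self, target):
        """Sheets that link to `target` ("What documents refer to this one?")."""
        return [s for _, s in self.backlinks.get(target, [])]


# Made-up sample data, for illustration only.
nb = Notebook()
nb.add_sheet("control program", "Runs the accelerator controls.")
nb.add_sheet("timing module", "Generates timing pulses.")
nb.add_sheet("Alice", "Works in the controls group.")
nb.link("control program", "uses", "timing module")
nb.link("timing module", "written by", "Alice")

print(nb.referrers("timing module"))             # ['control program']
print(nb.follow("timing module", "written by"))  # ['Alice']
```

Keeping a reverse index alongside the forward links is what makes questions like "Where is this module used?" cheap to answer, since you can walk a link in either direction.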

Now I was not reading Berners-Lee's paper in 1989, but a multi-user hypertext application would have sounded impressive to me - even if I couldn't have dreamed of what it would become.

And even Berners-Lee's proposal didn't envision the Web as we know it today. Here are the existing sources of information at CERN that Berners-Lee envisioned accessing through hypertext:

uucp News
This is a Unix electronic conferencing system. A server for uucp news could make links between notes on the same subject, as well as showing the structure of the conferences.

VAX/Notes
This is Digital's electronic conferencing system. It has a fairly wide following in FermiLab, but much less in CERN. The topology of a conference is quite restricting.

CERNDOC
This is a document registration and distribution system running on CERN's VM machine. As well as documents, categories and projects, keywords and authors lend themselves to representation as hypertext nodes.

File systems
This would allow any file to be linked to from other hypertext documents.

The Telephone Book
Even this could be viewed as hypertext, with links between people and sections, sections and groups, people and floors of buildings, etc.

The unix manual
This is a large body of computer-readable text, currently organised in a flat way, but which also contains link information in a standard format ("See also..").

Databases
A generic tool could perhaps be made to allow any database which uses a commercial DBMS to be displayed as a hypertext view.


Of course, the paper was designed to address specific needs at CERN, and it didn't go on to consider what would happen if the system were available at CERN...and at NIST...and at Reed College...and at Microsoft...and at a Pink Floyd website.