Energy efficient museum buildings

Copenhagen, Denmark, September 2011



text for data storage lecture

[title page]

I am concerned that the digital data so diligently entered into databases recording the climate around museum objects, or their condition, is likely to be much less durable than even our most crumbly physical relics.

I have been a scientific programmer since the mid-1960s. When I started, the storage medium for both programs and data was paper. It came in two forms. One was a thin, hard card made by IBM which held one line of code at a time. It is very durable as a material, but presents one major risk to the durability of the embedded data.

[tripping maid]

Note the 1960s style for free-range office assistants. My wife worked for the British National Bibliography at that time. Cards that found their way to the foot of the stairs, or of the lift shaft in one incident, served us well for writing shopping lists.

[paper tape]

The alternative recording medium was punched paper tape. It had the advantage that the data stream was serialised, so one had not only to drop the roll but also to tear it into pieces to disturb the data order. However, the paper was impregnated with oil to lubricate the reader mechanism. This oil rotted the paper within five years. Failure was naturally concentrated at the symbol with all holes punched across the tape. I forget what that symbol was, but its use was another major risk.

One merit of these storage media, even when dispersed or fragmented, is that the code is human readable, like Morse code. One just compares the pattern of holes and blanks with the code described on a single sheet of paper with 128 symbols, comprising the American alphabet, digits, punctuation and some long-obsolete printer instructions. This was the ASCII code, standardised in 1963 and still going strong as the core symbol set of programming languages.
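Reading holes against the code sheet can be sketched in a few lines of Python. This is only an illustration: the seven-channel layout, the `o` for a hole and `.` for a blank, and the reading order with the most significant bit on the left are all assumptions, as are the function names `decode_row` and `decode_tape`.

```python
def decode_row(row: str) -> str:
    """Turn one row of tape, e.g. 'o..o...', into its ASCII character.

    Assumed layout: 7 channels, 'o' is a hole (bit 1), '.' is a blank (bit 0),
    most significant bit on the left.
    """
    if len(row) != 7 or set(row) - {"o", "."}:
        raise ValueError(f"not a 7-channel row: {row!r}")
    bits = row.replace("o", "1").replace(".", "0")
    return chr(int(bits, 2))


def decode_tape(rows) -> str:
    """Decode a sequence of rows in the order they appear on the tape."""
    return "".join(decode_row(row) for row in rows)


# H is code 72 (binary 1001000), I is code 73 (binary 1001001).
tape = ["o..o...", "o..o..o"]
print(decode_tape(tape))
```

Under this assumed layout a row with all seven holes punched decodes to code 127, the ASCII delete character. Whether that is the symbol that tore the tapes described above is not something the sketch can confirm.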

[frog]

This single-page standard has proved very versatile, extending far from its original nerd use into the realm of the fine arts. Here, for example, is an American frog painted in ASCII symbols in a fixed-width font.

I'll leave this dollar display up as a suitable background for the story of how information exchange standards have developed in recent times. The open standard for exchanging office documents is currently ODF, the OpenDocument Format. The specification has expanded from the single ASCII page to about 600 pages. It has become the standard ISO/IEC 26300. The underlying structure is XML, a markup system similar to HTML tags, meaning that every data point is enclosed by information about what it means, with both data and explanation in the ASCII code, or more exactly in UTF-8, which is the extension of ASCII to all human languages. In contrast to HTML, which can generate a perfectly good screen image with a mere dozen tags, ODF has so many tags that it is scarcely writable by humans.
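The idea that each data point travels with its own explanation can be shown with a small sketch. The record below is hypothetical: its tag names (`reading`, `place`, `temperature`, `relative-humidity`) are invented for illustration and are not ODF tags. The only library used is Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical climate record. The tag and attribute names are made up
# for this example and are not taken from the ODF specification.
RECORD = """<reading>
  <place>Nationalmuseet, København</place>
  <temperature unit="degC">18.5</temperature>
  <relative-humidity unit="percent">52</relative-humidity>
</reading>"""


def describe(xml_bytes: bytes) -> dict:
    """Return {tag: (text, unit)} for each data point in the record."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: (child.text, child.get("unit")) for child in root}


# Store as UTF-8 bytes: plain ASCII characters take one byte each,
# while the Danish letter in the place name takes two.
stored = RECORD.encode("utf-8")
fields = describe(stored)
for tag, (value, unit) in fields.items():
    print(tag, value, unit)
```

The price of this self-description is bulk: even here the tags outweigh the three values they carry.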

The Microsoft corporation, long devoted to secret formats for both programs and data storage, realised that the open ODF format was a threat to its world monopoly on desktop computers. It has tried to spoil the progress of ODF by presenting its own information exchange format for acceptance as a standard: OOXML, Office Open XML. In January this year Microsoft sent to the ISO standards committee a 6000 page specification, giving the members 30 days to study it and make criticisms. While they were doing this, and finding many parts of the specification where only Microsoft programs could be used to generate or alter the data, Microsoft was busy persuading countries with no computers to join the standards organisation to vote for the Microsoft proposal. They also bribed a few officials in IT-advanced countries. This blatant corruption of the standard confirmation process nearly succeeded. An after effect of this loading of the committee has been that the new members cannot be bothered to vote for anything that does not pay them. Since a minimum fraction of members must vote to pass any IT-related standard, the standards committee is now paralysed. Hey-ho: this is the free enterprise global economy in action.

[dateformats]

Here is an extract from a detailed comparison of the ODF and OOXML formats, which gives the flavour of the XML code. It represents a single cell in a spreadsheet, containing a date. Microsoft stores the date as a number of days, counted on the wrong assumption that the year 1900 was a leap year. ODF stores the date in the format specified by another ISO standard for date and time, ISO 8601.
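The consequence of the leap year error is easiest to see by converting a day count into an ISO date. The sketch below assumes the widely documented spreadsheet convention in which serial 1 is 1 January 1900 and serial 60 stands for the nonexistent 29 February 1900; the function name `serial_to_iso` is made up for this example.

```python
from datetime import date, timedelta


def serial_to_iso(serial: int) -> str:
    """Convert a 1900-based spreadsheet day count to an ISO 8601 date string.

    Assumed convention: serial 1 is 1900-01-01, and serial 60 is the
    fictitious 1900-02-29, so every later serial is one day too high.
    """
    if serial < 1:
        raise ValueError("serial must be 1 or greater")
    if serial == 60:
        raise ValueError("serial 60 is 29 February 1900, which never existed")
    # Before the phantom day the count is honest; after it, subtract one day.
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return (epoch + timedelta(days=serial)).isoformat()


for serial in (1, 59, 61, 39448):
    print(serial, serial_to_iso(serial))
```

A reader of the day count must know the convention and its bug to recover the date, whereas the ISO string `2008-01-01` can be read without any program at all.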

That's the story at the big world level. How are conservators fitting into this development?

[condition reports]

Since the 1970s there has always been someone in the conservation institution I belonged to working on a condition report of the institution's collection: carefully prepared by the diligent conservator, and having no discernible effect on the management.

Every institution has its own database format, with a weird trail of obsolete identifiers. The conservation department of our hosts here, to take a local example, registers the normal place of residence of the objects it treats by the hundred. I don't mean hundreds of records, I mean the administrative unit used all over northern Europe in olden times. It's called the hundred, or wapentake, in English: I live in the Coleridge hundred in the county of Devon. In Danish it is called the herred. It has been obsolete since 1970, but it lives on in the natmus database.

[cidoc-condition report]

All this confusion is about to change because we have a rejuvenated CIDOC. That is the International Committee for Documentation of the International Council of Museums (ICOM-CIDOC).

The standard for museum documentation, called the Conceptual Reference Model, CRM, is now canonised as ISO 21127:2006. It costs 123 euros for 108 pages. The brevity is encouraging, but asking money for a standard is hardly going to encourage its universal use. It is an object-oriented scheme, surely an unfamiliar concept to most of you, and not explicable in the time available to me. It is unfamiliar also to the designers of the Access database which powers most of the papers at this conference that invoke a database to record an object's climate or condition.

I think it is well thought out, but may be ahead of its time. I leave it to the climate recording, risk analysis and condition report experts in the audience to comment on their relationship to the CIDOC standard. But be aware that every modern jargon phrase, such as "object-oriented", is likely to be pre-empted for alternative use before mainstream people have understood it.

[stonehenge]


Page last modified on November 09, 2008, at 09:56 AM