Understanding EAD3 and XML

This isn’t a full overview of EAD or XML, but rather some basics that you should find helpful when learning EAD and using this site.

What is XML?

XML stands for eXtensible Markup Language, a language used to encode data. If you’re familiar with HTML, XML will look similar (especially if you know the rules for closing and nesting tags that XHTML introduced), but it’s more flexible. If you’re not using a controlling document (DTD, Schema, etc.), you can make up whatever elements you want. You can also write your own DTD (Document Type Definition) or Schema, which codifies a set of XML tags and describes how they interact with each other. People write big schemas to describe how a camera should encode information about a picture it took: Was the flash on? How was the light balance set? What’s the model of the camera? Or they write little schemas to encode a recipe so they can search by ingredient.

The only real rules of XML are that you close your tags and nest them properly within each other (this makes a document “well-formed”) and that you follow the rules of any schema/DTD you’re using (this makes it “valid”). This site will teach you how the EAD3 Schema works.

Elements

Elements are generally straightforward. They normally have an opening tag that looks like <this> and a closing tag that looks like </this>, with any additional data in between. They may contain attributes inside the opening tag and child elements before the closing tag. The same element may be allowed inside one parent element or inside several different ones.

<parent>
	<child class="value">Some text</child>
</parent>

The parent and child elements in that example are fairly obvious. @class is an attribute (attributes will be distinguished with the “@” on this site) and “value” is the attribute’s value.
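If it helps to see how software reads those pieces, here is a minimal sketch using Python’s standard-library XML parser. It uses the same snippet as above; the variable names are just my own choices for illustration.

```python
import xml.etree.ElementTree as ET

# The same snippet as above, held in a string.
snippet = '<parent><child class="value">Some text</child></parent>'

parent = ET.fromstring(snippet)   # the root element, <parent>
child = parent.find("child")      # the first <child> inside <parent>

print(parent.tag)          # name of the parent element
print(child.tag)           # name of the child element
print(child.get("class"))  # value of the @class attribute
print(child.text)          # text between <child> and </child>
```

The parser hands back the element name, the attribute value, and the text as three separate things, which is why it matters where you put each one.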

Some elements may contain text. Some may only contain other elements. Some may only contain attributes (these elements normally don’t have a closing tag; you just end the element with “/>”). Sometimes elements must be used in a particular order within their parent elements. For example:

<ead>
	<control></control>
	<archdesc></archdesc>
</ead>

is the required order of the major elements in EAD3. You can’t have <archdesc> before <control> without breaking validation (you want your XML documents to be valid so that they play nicely with software).

If the element order is required, I’ll be sure to say so in the element’s page.
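To make the ordering rule concrete, here is a rough sketch in Python that checks only the one rule described above: <control> first, then <archdesc>, directly inside <ead>. The function name is hypothetical, and this is not real schema validation; a validating tool checks the whole EAD3 Schema, not just this one rule.

```python
import xml.etree.ElementTree as ET

def top_level_order_ok(xml_text):
    """Return True if the children of the root are <control> then <archdesc>.

    Hypothetical helper for illustration; it checks only this single rule.
    """
    root = ET.fromstring(xml_text)
    # Drop any "{namespace}" prefix the parser adds, keeping the plain name.
    names = [child.tag.split("}")[-1] for child in root]
    return names == ["control", "archdesc"]

right_order = "<ead><control></control><archdesc></archdesc></ead>"
wrong_order = "<ead><archdesc></archdesc><control></control></ead>"

print(top_level_order_ok(right_order))
print(top_level_order_ok(wrong_order))
```

Both strings are well-formed XML. Only the first one follows the order that EAD3 requires, which is the difference a schema is there to catch.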

Attributes

An attribute tells you something about an element. You may be familiar with this common one: <a href="https://eadiva.com/">EADiva</a>, in which “href” is the attribute. Its value tells you which webpage the text inside the a element links to. (Really, the a element is kind of the opposite of your average EAD element, in that the href is probably more important than the text.)

Some elements in EAD3, mostly linking ones, will contain their primary data in an attribute. Sometimes attributes have a controlled value list to standardize the vocabulary and allow for better machine processing.

For example:

	<publicationstatus value="inprocess"/>

This is an element which contains no text but has a designated attribute, “value,” which may only contain one of the following three values: “inprocess,” “approved,” “published.”
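A controlled value list is easy for software to check, which is much of the point. Here is a small Python sketch of that idea, using the three values listed above. The function name and the “draft” value are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The three values listed above for @value on <publicationstatus>.
ALLOWED_VALUES = {"inprocess", "approved", "published"}

def status_is_allowed(xml_text):
    """Return True if @value is one of the three controlled values."""
    element = ET.fromstring(xml_text)
    return element.get("value") in ALLOWED_VALUES

print(status_is_allowed('<publicationstatus value="inprocess"/>'))
print(status_is_allowed('<publicationstatus value="draft"/>'))  # "draft" is not on the list
```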

Attributes may be used in any order within an EAD element, unlike child elements, whose order may be controlled. Each element page will contain information about the attributes the element may contain and any restrictions on their content. Sometimes an attribute has a list of controlled values that includes something like “othertype.” The element page may then instruct you to add the @othertype attribute and designate a value there. This allows for local values while still standardizing for general archival software.

Nesting

Nesting is pretty straightforward. It means that this is NOT ok:

<tag1>
	<tag2>
	</tag1>
</tag2>

Instead, it should look like:

<tag1>
	<tag2>
	</tag2>
</tag1>

Another way to look at that would be: <strong>The entire city was <em>in flames</strong></em> is WRONG because the <em> should be closed before the <strong>. Written as <strong>The entire city was <em>in flames</em></strong>, it is correct.
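An XML parser will refuse badly nested tags outright. As a minimal sketch, here is how Python’s standard-library parser treats the two versions of that sentence. The helper name is my own.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for mismatched or unclosed tags, among other problems.
        return False

bad = "<strong>The entire city was <em>in flames</strong></em>"
good = "<strong>The entire city was <em>in flames</em></strong>"

print(is_well_formed(bad))
print(is_well_formed(good))
```

Web browsers are forgiving about this kind of mistake in HTML. XML tools are not, so it’s worth getting the habit right.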

You’ll occasionally see things called “wrapper” elements. These elements don’t contain any text, only other EAD elements. They’re designed for grouping elements which are related, rather than for specifically encoding text. Other elements can handle both plain text and subelements.

The Mozilla Developer Network has a good introduction to XML and links to other places where you can learn more about it.

What is EAD?

EAD stands for Encoded Archival Description. This website describes EAD3, the third release of EAD. It is a public domain XML Schema created by and for archivists to be used in the creation of XML-based finding aids. These finding aids can be read by browsers, search tools, etc. Various programs and scripts can convert them into HTML pages or PDFs, or save them in databases where they can be edited and re-exported.

EAD finding aids can be created manually in text or XML editors, through specially-designed Access (and other) database exports, and through special EAD-creating software. ArchivesSpace is a next-generation descriptive information management system which will be EAD3-compliant. For EAD 2002, one may use older tools: Archivists’ Toolkit and Archon.

This site will assist you in learning the elements specific to EAD3 and how they should be used to describe a collection for a finding aid/index/guide/inventory. It includes links to many other sites which can help you learn even more.