OpenRefine, Access Points, Archivists’ Toolkit, and Chaos -> Order guest post

The contents of this post relate to EAD 2002.

After reading posts on the Chaos –> Order blog and articles about using OpenRefine to handle metadata stored in Excel and Access files, I found myself asking how this could be done with an Archivists’ Toolkit (MySQL) database. Since the literature was not forthcoming, I did my own experiment, which Maureen graciously offered me space on the site to describe.

EAD3 <control> preview!

I’m working on a sample file for EAD3 examples. I’m intending to use all possible tags, which will take some work. This is the first part, a sample of <control>, the replacement for <eadheader>. It’s still a draft and will probably be updated before I finish, but it’s fairly complete and I thought I’d share as a taste:

<control countryencoding="iso3166-1" dateencoding="iso8601" langencoding="iso639-2b" relatedencoding="iso15511" scriptencoding="iso15924">
	<recordid instanceurl="">S.0001</recordid>
	<otherrecordid localtype="manuscript">Slytherin-S-1</otherrecordid>
	<representation href="" linktitle="HTML transformation of the finding aid" show="new"/>
			<titleproper>Manuscripts of Salazar Slytherin: Finding Aid</titleproper>
				<subtitle>A guide to the letters, papers, and spell books held at Hogwarts School of Witchcraft and Wizardry Archives</subtitle>
			<author>Felicia Flitwick</author>
			<sponsor>Funding for the initial creation of this electronic finding aid was made possible in 1992 by a generous gift of Slytherin's class of 1972. Funding for encoding was made possible in 2008 by a gift from Slytherin's class of 1998.</sponsor>
			<edition>2nd edition</edition>
			<p>Edition was updated during EAD3 encoding to conform to DACS standards.</p>
			<publisher>Hogwarts School of Witchcraft and Wizardry</publisher>
			<date normal="2014-03-01">1 March 2014</date>
				<addressline>Hogwarts School of Witchcraft and Wizardry, Archives</addressline>
				<addressline>Hogwarts Castle</addressline>
				<addressline>Inverness IV27 H64</addressline>
				<addressline>Scotland, UK</addressline>
				<addressline>+44 01549 511100</addressline>
			<titleproper>Manuscripts of the Founders of Hogwarts School of Witchcraft and Wizardry</titleproper> <num>4</num>
				<p>The archivist is aware that certain families who came up through Slytherin still possess original documents and realia from the house's founder. Please consider donating or bequeathing these to our archives so that we may present Salazar Slytherin's life and views more completely.</p>
	<maintenancestatus value="revised"/>
	<publicationstatus value="published"/>
		<otheragencycode localtype="wizlib">H</otheragencycode>
		<agencyname>Hogwarts School of Witchcraft and Wizardry Archives</agencyname>
		<language langcode="eng">English</language>
		<script scriptcode="latn">Latin</script>
		<citation href="" lastdatetimeverified="2014-03-01T16:30:21Z" linktitle="DACS in HTML on SAA website" actuate="onload" show="new">Describing Archives: a Content Standard</citation>
			<p>DACS was used as the primary description standard.</p>
		<citation href="" linktitle="WizLib Standards" actuate="onload" show="new">Standards for Wizarding Libraries and Archives: 2012</citation>
			<p>The 2012 edition of the Standards for Wizarding Libraries and Archives was used to handle descriptions specific to the wizarding world, such as the Wizarding Agency Code.</p>
	<localcontrol localtype="house">
	<localcontrol localtype="section">
			<eventtype value="derived"/>
			<eventdatetime standarddatetime="2014-03-01T08:05:33Z">1 March 2014</eventdatetime>
			<agenttype value="machine"/>
			<eventdescription>Conversion from EAD 2002 finding aid using XSL transformation.</eventdescription>
			<eventtype value="revised"/>
			<eventdatetime standarddatetime="2014-03-01T10:05:23Z">1 March 2014</eventdatetime>
			<agenttype value="human"/>
			<agent>Felicia Flitwick</agent>
			<eventdescription>Conversion from EAD 2002 revised.</eventdescription>
			<eventtype value="revised"/>
			<eventdatetime standarddatetime="2014-03-09T14:23:42Z">9 March 2014</eventdatetime>
			<agenttype value="human"/>
			<agent>Felicia Flitwick</agent>
			<eventdescription>Minor revisions.</eventdescription>
			<sourceentry>Hogwarts: a History</sourceentry>
					<dc:title>Hogwarts: a History</dc:title>
					<dc:creator>Bathilda Bagshot</dc:creator>
					<dc:identifier>DA880.B34 1968</dc:identifier>
				<p>Basic biographical and historical information was taken from <title><part>Hogwarts: a History</part></title>.</p>

Site Functionality Improvement: Linked DACS

The contents of this post relate to EAD 2002.

Now that DACS is available in HTML online, I’ve gone through the relevant element pages and included links to the specific DACS page which applies to them. A small update, but one which I think will make DACS-compliant encoding much easier.

Interpretation, Unraveling, and Examination of EAD3

The contents of this post relate to EAD 2002.

A few weeks ago, I finally downloaded the most recent releases of EAD3 from GitHub in preparation for designing the EAD3 version of EADiva (more coming on that, but not until the official one goes live). I finished going through it yesterday and made copious notes, essentially untangling things as I went.

I started farther down, in the attribute & model section and then did the rest. What I have isn’t perfect, but I was still so pleased with it as a work-in-progress that I decided to share in case it was helpful or interesting to others. You can view it here and leave comments if you see something you have a question about/think I got wrong.

This is the RNG file because it’s just so much easier to read RNG than XSD, at least in my opinion.

View my notes on the EAD3 RNG schema.

How I Introduced Myself to EAD and Archivists’ Toolkit

The contents of this post relate to EAD 2002.

This was actually the topic of a paper I co-wrote for a student session at Fall 2012’s MARAC conference. Ever since then, I’ve been wanting to rewrite it as a simple, linked how-to blog post, because other people have asked me where to get started and I’d like to show that it’s not actually too hard.

Step 1) Install Archivists’ Toolkit on your computer

I’ve already written up instructions for one way to set up Archivists’ Toolkit with a local database on your computer. You can also install the software and simply connect to the Archivists’ Toolkit sandbox database which they host and run.

Step 2) Find an EAD finding aid and import it into Archivists’ Toolkit

This is where you have it easier than I did. I was part of a group project that directly approached UMD’s manager of digital collections and asked her for an EAD finding aid to work with. You can just use one of the sample EAD finding aids I’m hosting on the site. And if you want some variety, the archivist from Syracuse also shared the location of all their EAD files.

For easiest import, I would suggest using the Miriam Butterworth and Natalie Babbitt papers, which don’t make references to any files local to the repository.

If you use another file and get an error regarding not being able to find the EAD 2002 DTD (because the repository had linked to a local version), open the file in an editor and change the link from whatever their URL ending in ead.dtd was to:

Then import. If you then get an import log message saying that a mandatory field is missing and the record is invalid, just know that you’ll need to fill those fields in before you can use the record in Archivists’ Toolkit. In the Berliner family papers, you’ll need to fill in an extent type and number. Just fill in something like 2 linear feet and voila, you can view the rest.
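If you'd rather not hand-edit the file, the DTD link change described above can be scripted. Here's a minimal Python sketch, assuming the file has a DOCTYPE declaration for ead whose system identifier ends in ead.dtd. The function name and the locations in the usage example are placeholders for illustration, not the actual address to use.

```python
import re

# Finds the quoted system identifier (a path or URL ending in ead.dtd)
# that closes the DOCTYPE declaration for <ead>.
DTD_REF = re.compile(
    r'(<!DOCTYPE\s+ead\b[^>]*?)(["\'])[^"\'>]*ead\.dtd\2(\s*[\[>])'
)

def repoint_dtd(xml_text, new_location):
    """Return xml_text with the ead.dtd link replaced by new_location."""
    def swap(match):
        before, quote, after = match.group(1), match.group(2), match.group(3)
        return f"{before}{quote}{new_location}{quote}{after}"
    # Only the first DOCTYPE is touched; everything else is left as-is.
    return DTD_REF.sub(swap, xml_text, count=1)

# Usage with a made-up repository path and a made-up replacement:
sample = '<!DOCTYPE ead SYSTEM "http://repository.example/dtds/ead.dtd">\n<ead/>'
print(repoint_dtd(sample, "dtds/local/ead.dtd"))
```

Working on the raw text rather than parsing the XML keeps the rest of the finding aid byte-for-byte unchanged, which matters if you want to compare your copy against the original later.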

I’ll recommend against using the Syracuse ones for this stage, because some local practices which are great for handling a whole system are a problem if you don’t have those files available. However, they’re quite useful on the whole for viewing later on. Just be aware that phrases like &su_name; are short-codes which pull in all the repository’s data so that people don’t have to type it every time.
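To see which of these short-codes a file relies on before you try importing it, a few lines of Python will list them. This is only a sketch: it scans the raw text for anything shaped like &name; and skips the five entities that are built into XML. The function name is made up for the example.

```python
import re

# An entity reference looks like &name; (numeric ones like &#160; are ignored).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.\-]*);")
# These five are defined by XML itself, so they never need a local file.
BUILT_IN = {"amp", "lt", "gt", "quot", "apos"}

def custom_entities(xml_text):
    """Return the sorted names of non-built-in entity references."""
    found = set(ENTITY_REF.findall(xml_text))
    return sorted(found - BUILT_IN)

print(custom_entities("<publisher>&su_name; &amp; friends, &su_name;</publisher>"))
# prints ['su_name']
```

An empty list suggests the file stands on its own; anything else is a name the repository defines somewhere you probably don't have access to.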

Step 3) Open the EAD file in an XML-friendly reader

This doesn’t have to be an XML reader, per se, just something like Notepad++ (Windows), which highlights the syntax and has a word wrap function (under View in Notepad++) for easier reading. You can also view all but the Helen Lyman papers (which are in a very large file) on EADiva in a syntax-highlighting XML layout.

Step 4) Compare, compare, compare

Look how the record displays in Archivists’ Toolkit. Find the same fields in the EAD file. Look at the structure, where things show up. Look up any questions about elements, what they mean, what their attributes mean, etc, on EADiva. Remember, the syntax here is quite easy: Bam, you’re at the page.

Bonus step

View a finding aid’s display on a repository’s website. Any finding aid. Open Archivists’ Toolkit. Copy data from the finding aid, like title, date, etc. into the fields you think are right for it in Archivists’ Toolkit. See what happens when you export the file.

Post-Post Script

You can also import files into ArchivesSpace’s current beta and see how they look on there too. Or you can try the bonus step using the ArchivesSpace back-end vs. Archivists’ Toolkit, just to stretch your wings. I think it’s a good idea to work with both, since it’ll be a few years before many repositories transfer from AT to AS.

This was my method, except that EADiva didn’t yet exist and I instead spent my hours poring over the Library of Congress’s EAD tag library. If you taught yourself EAD, how did you do it? Was your method similar?

Sample EAD Finding Aids Now Available on EADiva!

The contents of this post relate to EAD 2002.

Thanks to generous and quick responses by Syracuse University Libraries Special Collections, University of Connecticut’s Thomas J. Dodd Research Center, and the Thomas and Katherine Detre Library and Archives, EADiva now has a selection of “live” EAD samples, actual files being used by actual repositories.

Some students reading EADiva have asked for these and I think they’re a great resource for anyone who wants to see how the elements function in the real world. I’m hoping that even more repositories decide to participate, but these five finding aids are a great start.

What can you do with these? Well, you can view them on the site, download the XML file, and look at how the repository’s catalog displays the finding aid to the public. If you download them, please make private use only, of course.

If you have Archivists’ Toolkit installed locally you can import the EAD files and examine the resources created by each.

You may try your hand at creating XSL Transformations for the files. Besides using W3Schools, try this XSLT tutorial on the basics. Taking some time to learn about XPath will make XSL much easier.
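If you want a feel for XPath before writing any XSL, you can try path expressions from Python. The sketch below uses the standard library's ElementTree, which supports only a limited subset of XPath, and a tiny made-up finding aid rather than one of the sample files. Real files that declare a namespace would need it included in the paths.

```python
import xml.etree.ElementTree as ET

# A made-up, heavily trimmed EAD 2002-style fragment for illustration.
SAMPLE = """<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example family papers</unittitle>
      <unitdate>1900-1950</unitdate>
    </did>
    <dsc>
      <c01 level="series"><did><unittitle>Correspondence</unittitle></did></c01>
      <c01 level="series"><did><unittitle>Photographs</unittitle></did></c01>
    </dsc>
  </archdesc>
</ead>"""

root = ET.fromstring(SAMPLE)

# A path from the root element down to the collection-level title.
collection_title = root.findtext("./archdesc/did/unittitle")

# ".//" searches at any depth; [@level='series'] filters on an attribute.
series_titles = [
    el.text for el in root.findall(".//c01[@level='series']/did/unittitle")
]

print(collection_title)  # prints Example family papers
print(series_titles)     # prints ['Correspondence', 'Photographs']
```

The same two expressions would work as select or match patterns in a stylesheet, so this is a cheap way to test a path before building a template around it.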

How to Install Archivists’ Toolkit With a Local XAMPP Database on Your Computer

The contents of this post relate to EAD 2002.

With pictures!

(Disclaimers–this is not for an institutional setting. Also, this is geared toward Windows users. Specifically, I’m installing XAMPP on a Windows 7 machine. I’ve installed it on XP as well and still have an XP installation running at work. There are other ways to install & run AT locally. This is what I do.)

Installing Archivists’ Toolkit on your own computer makes sense for a lot of budding archivists. If we’re not using it at work, our free time may be the best way to build relevant experience. But why not just use the sandbox database which AT has graciously provided?

A few reasons to have the database installed locally:

  1. You want to keep your experiments private or are working with non-public data.
  2. You want to be able to work offline.
  3. You’re trying to create something more permanent and don’t want to risk its deletion by another user.
  4. You’re worried that the sandbox might someday go away.
  5. You can use phpMyAdmin to easily back up the database and move it elsewhere.
  6. You want a little phpMyAdmin experience or are interested in running other things on a local database.

Let’s get started.

Step 1: Install XAMPP on your computer

What is XAMPP? It’s a simplified Apache distribution with PHP, MySQL, and Perl. The Windows version includes Apache, MySQL, PHP + PEAR, Perl, mod_php, mod_perl, mod_ssl, OpenSSL, phpMyAdmin, Webalizer, Mercury Mail Transport System for Win32 and NetWare Systems v3.32, Ming, FileZilla FTP Server, mcrypt, eAccelerator, SQLite, and WEB-DAV + mod_auth_mysql, so there’s a lot more you can do with it. For example, I run locally-hosted WordPress sites where I edit themes and such (I hand-wrote the theme for on a local site).

If you don’t know what that means, don’t panic. It’s a small server you can run on your computer. It won’t be hooked up to the internet, so you don’t have to worry about people breaking in and you can use it whether or not you’re online.

Scroll down on the Windows download page and select the download option. Since we’re doing this easy-style, select the Installer option below. This will be a .exe file.

The download link will likely take you to SourceForge, which is a legitimate download site. Once you’ve downloaded the installation file, double-click like any program you’ve installed before. You may get a warning that firewall or antivirus will slow down installation. That’s fine, I just chose to keep them on anyway.

During setup, you’ll have an option of what you want to install. I recommend installing everything it offers. At the least you’ll need MySQL and phpMyAdmin (Apache and PHP cannot be unchecked). I chose NOT to learn more about Bitnami, which can help one install WordPress and more on the computer.

XAMPP will then install on your computer. It may take a while.

A picture of XAMPP slowly installing

When XAMPP has finished installing, choose to start the control panel. You will always need the control panel to be running to use Archivists’ Toolkit.

A picture of the completion screen for XAMPP with the option ticked to start the control panel on finishing

In order to set up (or, once installed, turn on) your database, you’ll need to click Start on both Apache and MySQL (arrows below point to the relevant buttons, which currently read Stop, since I’ve already turned it on).

The Apache and MySQL functions have been turned on. They're already displaying Stop which has been turned on

When you do this the first time, you may get the following error message from the Windows firewall.

An image of the windows firewall error you may encounter

That’s fine, allow it to connect.

Step 2: Setting up a Local database for Archivists’ Toolkit

Now, go to http://localhost/phpmyadmin/ in your browser (this link will only work if you’ve already installed and activated XAMPP). You may see a notice that your root user doesn’t have a password set up. As long as you’re just running this locally, it doesn’t matter. You probably don’t need to upgrade phpMyAdmin, even if it’s a slightly older version.

Click the Databases tab at the top and use the create database box to create your database as below. I’m calling mine “atdatabase.” I simply selected “Collation” as the type.

An image of the database creation screen. atdatabase is entered in the Create Database field and Collation is selected

Under the Users tab, create a new user. I chose to create one simply called atuser with the password eadiva. I clicked the Check All option to give atuser access to all the databases (this is a bit of a short-cut, but for the purposes of running AT it works just fine). Click “Go” at the bottom right corner to create the user.
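For the curious, the clicks above correspond roughly to three SQL statements. The Python sketch below only builds and prints them; it doesn't connect to anything. The helper name is made up, the localhost host is an assumption (phpMyAdmin lets you choose the host when adding a user), and the statements phpMyAdmin actually generates may differ in detail.

```python
def setup_statements(database, user, password, host="localhost"):
    """Build SQL roughly matching the database and user steps above.

    Values are dropped into the text without escaping, so this is for
    reading and learning, not for input you don't control.
    """
    return [
        f"CREATE DATABASE {database};",
        f"CREATE USER '{user}'@'{host}' IDENTIFIED BY '{password}';",
        # "Check All" grants privileges on every database, hence *.*
        f"GRANT ALL PRIVILEGES ON *.* TO '{user}'@'{host}';",
    ]

for statement in setup_statements("atdatabase", "atuser", "eadiva"):
    print(statement)
```

Replacing *.* with atdatabase.* in the last statement would limit the user to the one database, which is the tidier choice if you plan to run other things on the same server.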

An image of creating a new user and assigning it privileges

After the database and user have been created, we’ll switch over to installing Archivists’ Toolkit itself. Leave the XAMPP control panel open and on.

Step 3: Install Archivists’ Toolkit and Initialize the Database

Download the most recent update of release 2.0 and execute the installation file.

An image of the Archivists Toolkit installer

Walk through the basic installation steps.

Then open C:/Program Files (x86)/Archivists Toolkit 2.0 (or wherever your program files are stored). Execute Maintenance Program 2.0.exe.

An image of the Maintenance Program file within the Archivists Toolkit folder

Select “Initialize a blank database.”

Initialize a blank database is checked

Enter the location of your database, the username and password, and the type.

An image of the various login information written below entered into the appropriate field

The information entered in the image above is:


Your database location should begin with jdbc:mysql://localhost/ and be followed by the name of your database.
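Put as a tiny Python sketch, the rule looks like this. The function name is made up, and the check on the database name is just a guard against typos rather than MySQL's actual naming rules.

```python
def at_database_location(database_name, host="localhost"):
    """Build the database location string in the form described above."""
    bad = not database_name or any(
        ch.isspace() or ch == "/" for ch in database_name
    )
    if bad:
        raise ValueError("database name must be non-empty, with no spaces or slashes")
    return f"jdbc:mysql://{host}/{database_name}"

print(at_database_location("atdatabase"))
# prints jdbc:mysql://localhost/atdatabase
```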

The next screen will prompt you to enter a repository name:

an image of the repository name fields, both for a long and formal repository title and a short form

Call it whatever you like, but be aware that it’ll require work to change later. The next screen prompts you to enter a username and password. Enter something you’ll remember or *gasp* write it down (I would never advocate for this for a real repository or an important password, but this is just a local installation of AT). I use my initials for both.

Then click Finish and let the software initialize your database. This will create the tables that AT needs to function. It may take a little while. During all of this, XAMPP needs to be turned on, or else the database can’t initialize.

Step 4: Set up Archivists’ Toolkit

When done initializing, open the newly-installed Archivists’ Toolkit.

It will require you to enter the database login information. As before, for mine, it’s:


and click Save.

Connection settings image for the database

It will then prompt you for the username and password you created during initialization (the ones for which I used my initials).

Once in, you can start using Archivists’ Toolkit. Your initial login will give you superuser (fullest) access to the software. You may go to Setup -> Users to create users with a variety of privileges and test out what you can and can’t do.

How to use AT once you’ve installed a local copy is a whole ‘nother tutorial series.

Step 5: How to start Archivists’ Toolkit next time

When you finish with AT the first time, simply exit out and close XAMPP. You will have to open XAMPP and press the Quit button in order to close it.

In order to restart AT, you’ll need to start XAMPP first. Simply open it from your Programs list or wherever you’ve put a shortcut. Turn on Apache and MySQL again just as you did before.
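If AT ever refuses to connect and you aren't sure whether MySQL is actually up, you can check from Python. This sketch assumes MySQL is on its default port, 3306, which is what a stock XAMPP install uses unless you've changed it. It only tests that something is accepting connections there, not that your login works.

```python
import socket

def mysql_is_listening(host="127.0.0.1", port=3306, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # Open and immediately close a connection; no data is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not running".
        return False

if mysql_is_listening():
    print("MySQL is up; go ahead and open AT.")
else:
    print("Nothing on port 3306 yet; start MySQL in the XAMPP control panel.")
```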

When restarting XAMPP, be sure to turn on both Apache and MySQL

Now you can open AT. This time it will automatically access the database and simply prompt you for your login.

Should you wish to create and initialize multiple databases or to access both the sandbox AND your local database from your AT installation, you’ll have the option to “Select Server” when logging in, which lets you change the server you’ll be logging into.

I’ve tried to make this tutorial thorough and heavily-illustrated, since once you have AT running it doesn’t take nearly as much technical savvy to use. Please comment below or contact me if you have any questions and I’ll do my best to answer and clarify the tutorial.

Code4Lib Article on EAD Tag Usage Analysis

The contents of this post relate to EAD 2002.

The newest issue (22) of the Code4Lib Journal includes a fascinating article analyzing EAD tag usage across repositories. The team analyzed EAD-encoded finding aids, using the beta ArchivesGrid discovery system to document usage and more. The article includes raw data about how often elements were used, or repeated, in finding aids. The authors note that “not all elements are created equally,” so we needn’t expect, say, 50%+ usage of <frontmatter> (which is actually down at 37%) because it can be entirely generated using information from <eadheader>.

It’s definitely worth reading to see what current (perhaps not best) practices are across 120,000+ documents. The authors address not only the percentage of usage of such heavily-used tags as <unitdate> (which, at 73%, was not used within <did> as often as I thought it would have been) but also the variance in descriptive practices which may make a tag less useful for sorting. For example, if <unittitle> is “Records” or “Reports,” searching that field alone is not going to produce useful results unless, say, the <persname> or <corpname> or other relevant field is included as well.
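You can run a small-scale version of this kind of count on your own files. The Python sketch below is not the authors' method, just one simple way to tally element names in a finding aid and to see what share of a set of files uses a given tag. The function names are made up, and files that rely on repository-defined entities won't parse until those are resolved.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def tag_usage(xml_text):
    """Count how many times each element name occurs in one finding aid."""
    root = ET.fromstring(xml_text)
    # Drop any "{namespace}" prefix so namespaced and plain files count alike.
    return Counter(el.tag.rsplit("}", 1)[-1] for el in root.iter())

def share_using(tag, documents):
    """Fraction of documents (XML strings) that use tag at least once."""
    if not documents:
        return 0.0
    using = sum(1 for doc in documents if tag_usage(doc)[tag] > 0)
    return using / len(documents)

# Two made-up, heavily trimmed documents for illustration.
docs = [
    "<ead><frontmatter/><archdesc><did><unittitle>Records</unittitle></did></archdesc></ead>",
    "<ead><archdesc><did><unittitle>Reports</unittitle><unitdate>1900</unitdate></did></archdesc></ead>",
]
print(tag_usage(docs[1])["unitdate"])    # prints 1
print(share_using("frontmatter", docs))  # prints 0.5
```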

Also, thanks to M. Bron, M. Proffitt, and B. Washburn for the EADiva recommendation.

Now with more DACS!

The contents of this post relate to EAD 2002.

Thanks to some good feedback I got this past week, I’ve integrated DACS references into relevant pages, using the DACS/EAD cross-reference from Appendix C of the DACS manual (PDF). Unfortunately, DACS is not available in a format like that of EADiva or the EAD tag library, which meant I couldn’t actually link. So I gave section and page numbers, which should help people quickly locate the right material in the PDF file.

My integration came entirely from the crosswalk, so not every element has a DACS reference.

Introducing EADiva

The contents of this post relate to EAD 2002.

Welcome to EADiva, a plain-talking EAD tag library. Despite its tongue-in-cheek name, I have taken this project very seriously and have attempted to create a friendlier version of the Library of Congress EAD tag library.

My goal in creating this site was to make a resource oriented toward people who are attempting to learn EAD but may not have much more experience with XML than one gets in basic library school classes. The Library of Congress’s tag library uses some terms which may confuse a person who is unable to read a DTD. Attributes are described separately from elements, causing a need for much back-and-forth to understand how they relate to the elements. Element pages also don’t link to the elements they mention, making the user spend much more time navigating.

I cannot overstate how much the LoC’s tag library has helped me over the past 7 months as I’ve written the element pages for this site. It is a tremendous resource and I outline some of its advantages on the about page for this site. But when I studied it for my library school projects, I saw room for improvement. I saw places where I knew that some of my fellow students who have less technical backgrounds might find themselves frustrated. I think many people who could use a better knowledge of EAD, students and professionals alike, need a different starting place and don’t need the level of technical knowledge to understand a DTD. Thus, I built EADiva.

The name comes from a joke with one of my classmates as we corresponded about a paper on the subject. But it stuck in my head, evolving into the idea for this site. I don’t claim to be an EADiva, but I would like to evolve into one. If you would too, this site is here to help.