Notes from an Islandora Installation (aka Islandora Digital Edition Brain Dump)

I recently spent some time installing Islandora (Drupal 7 plus a Fedora Commons repository = open-source, best-practices framework for managing institutional collections) as part of my digital dissertation work, with the goal of using the Editing Modernism in Canada (EMiC) digital edition modules (Islandora Critical Edition and Critical Edition Advanced) as a platform for my Infinite Ulysses participatory digital edition.

Why Islandora?

For various reasons having to do with the focus and scope of my particular dissertational project, I didn’t ultimately end up using Islandora or the EMiC modules for my digital edition [1], but I spent enough time with them to know they’re fantastic tools for creating an online scholarly collection of critical text- or image-based editions. Here are some highlights of what Islandora plus the EMiC modules can let you do:

  • Upload and OCR your texts!
  • Batch ingest of pages of a book or newspaper
  • TEI and RDF encoding GUI (incorporating the magic of CWRC-Writer within Drupal)
  • Highlight words/phrases of text—or circle/rectangle/draw a line around parts of a facsimile image—and add textual annotations
  • Internet Archive reader for your finished edition! (flip-pages animation, autoplay, zoom)
  • A Fedora Commons repository managing your digital objects

Plus, Islandora has a slew of active repositories for additional features, such as integrating Solr search and ingesting web archives, as well as solution packs for handling many types of digital object (newspapers, large images, audio, PDFs, video…)

How do I set it up?

You’ll want to start with the Islandora wiki for both installation instructions and answers as to what Islandora is and what it can do; there’s also the main Islandora site. Other places to look for help include the main Islandora listserv for Islandora users, the Islandora Dev listserv (if you’re looking for help in developing on Islandora), and listservs for any particular pieces of the installation process that give you trouble (e.g. I found the Fedora Commons developer mailing list and the Adore Djatoka wiki helpful, as well as this blog post on Fedora set-up).

BUT FIRST! While the Islandora wiki instructions will help you install Islandora on your own server, I’d highly recommend trying two other tactics first, because regular installation requires a lot of sysadminy folderol (fine and great fun if you’re a sysadmin, or if you’re interested in picking up some new knowledge in that area and have the time). Islandora’s a fantastic platform, but the main problems I ran into while installing and playing with it were server configuration ones: finding the right paths, selecting the right versions of software, setting permissions correctly.

I didn’t find that following the wiki straight through resulted in a working site. That’s not to find fault with the official documentation, though: documentation is very difficult with a thing like Islandora because server configs differ, dependencies develop new versions, and packages cease to be available [2]. The page on setting up Solr and gSearch was a particular stumbling point for me and others, and I’d recommend you start by reading the comments at the bottom and then doing a quick search on the Islandora listserv when you get there.

Start Here!

1. I’d strongly recommend playing around with Islandora using the virtual image they offer (bottom of the page) to make sure it’s right for your needs.

2. Then, consider forgoing the wiki instruction route for this “one-click automated Islandora install”. I haven’t tried it myself (it became available after I’d switched over to plain Drupal 7), but various users have chimed in on that thread that it works, or with questions when they ran into issues (that were quickly answered by the community). You’ll need to learn Vagrant and Chef to make this route work, but that should both a) be less work than learning how to do the server config stuff you’d need to do for the Islandora install otherwise, and b) be really great in the long run for your Islandora site (Vagrant lets you not mess with the system configuration on your host computer and quickly deploy a local dev site to the public). Also, it looks like that thread instructs you on the stuff you’d need to learn about Vagrant/Chef.

Setting up Islandora without Vagrant or the VM: Random Notes

If you want to set up Islandora using the wiki instructions (rather than using the Vagrant route mentioned above), I’ve compiled below random notes I took during my set-up process. These are meant as supplementary to the Islandora wiki and listserv and in no way can be followed on their own to set up your site—but it seemed worthwhile to share somewhere as one possible reference if you run into issues while setting up your Islandora site. Caveat emptor: as I mentioned, I’m no longer working on an Islandora site for my dissertation project, so I won’t be able to answer any Islandora-y questions. If you do have any questions specifically about using Islandora for digital edition sites, get in contact with the excellent people at Editing Modernism in Canada, who have been developing the modules to add those editing capabilities.

My Server Set-Up

I used Linode to set up a Virtual Private Server (VPS) with Islandora and was very happy with their services. If you’re used to using regular web hosting, the difference with a VPS is that you have access to a “whole server”, so you can fiddle around with what version of PHP is installed, etc. Linode was not only reliable and fairly priced, they also have truly amazing documentation on how to set up your server. So don’t be hesitant about creating your own Islandora site just because you’ve never used a VPS or set up a server before: Linode can walk you through it. Additionally, there’s this blog post by Feross Aboukhadijeh on “How To Set Up Your Linode For Maximum Awesomeness” to further help you with your Linode set-up. I’d recommend starting by getting a basic Ubuntu 12.04 server set up by following the Linode instructions (also check out their set of “stackscripts” that automate things like setting up LAMP on your server).

Random Notes

  • I found it extremely helpful to keep a written log of everything I was doing: things I tried, files I altered, paths to error logs I’d want to be checking, tutorials I’d tried (I also started auto-saving my command line input). I decided to cut a lot of that out for this post because it was very specific to my server config and needs (and also contained some trial and error and exasperated comments…), but I’d strongly encourage you to do the same. I’d never forced myself to keep such thorough notes on what I was doing before, and I’ll be doing it in all my tech work from now on. You’re going to have a million passwords you’ll need to record anyway (Tomcat user, Drupal, Drupal database, Fedora database…)
  • Always check out the ReadMe for any Islandora/EMiC module you’re installing (i.e. on the GitHub repository page for the module, scroll down past the file list to read the text, or look for the text file labeled README in the module folder you downloaded). There are often additional instructions beyond the usual way you install a module (e.g. dependencies to install, permissions to set).
  • I wanted to use the Drupal maintenance page module to hide my site from the public, but this prevented me from setting clean URLs on the site, which is required to make Islandora work.
  • I started seeing a problem where Fedora kept crashing/closing down on its own (which I could see by going to the yoursite.com:8080/fedora/admin page and not seeing the Fedora dashboard). This was solved by adding more memory to the server (i.e. purchasing a higher Linode plan, in my case).
  • Be mindful of what pieces of the platform need what version of Java. Running $FEDORA_HOME/tomcat/bin/startup.sh at one point, the output showed me that my JRE_HOME variable was pointing at the Java 6 OpenJDK, when it needed to be accessing the Sun JRE. I have a hazy memory that there’s one piece of the Islandora set-up that requires one version of Java, and another that needs a different version.
  • Some random “where is that again?” notes:
    • yoursite.com:8080/fedora/admin (your Fedora Repository admin dashboard)
    • yoursite.com:8080/fedoragsearch (your Fedora gSearch admin; credentials are set in $FEDORA_HOME/server/config/fedora-users.xml)
    • yoursite.com:8080 (your Tomcat manager; set an admin user in $CATALINA_HOME/conf/tomcat-users.xml and you can log in to view various processes running on your server)
    • Check out $FEDORA_HOME/data/fedora-xacml-policies/repository-policies to make sure your Fedora access policies are secure
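Since I kept retyping those URLs and double-checking my Java variables, here is a minimal Python sketch of the kind of sanity check that could have saved me some time. It is an illustration, not part of Islandora: the function names (`service_urls`, `effective_java_home`, `probe`) and the script name are my own inventions, and it assumes everything is served over plain HTTP on port 8080 as in the notes above. The Java part relies on Tomcat’s start-up scripts using JRE_HOME when it is set and falling back to JAVA_HOME otherwise.

```python
import os
import sys
import urllib.error
import urllib.request


def service_urls(host, port=8080):
    """Build the admin URLs from the notes above for a given host (hypothetical helper)."""
    base = f"http://{host}:{port}"
    return {
        "tomcat": base,
        "fedora_admin": f"{base}/fedora/admin",
        "gsearch": f"{base}/fedoragsearch",
    }


def effective_java_home(env):
    """Return the Java home Tomcat's scripts would pick: JRE_HOME first, else JAVA_HOME."""
    return env.get("JRE_HOME") or env.get("JAVA_HOME")


def probe(url, timeout=5):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 401 or 403 still means the service is up and asking for a login.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: treat as "down".
        return None


# Only touch the network when a host is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    print("Java home Tomcat would use:", effective_java_home(os.environ))
    for name, url in service_urls(sys.argv[1]).items():
        status = probe(url)
        print(f"{name:13} {url} -> {status if status is not None else 'no response'}")
```

Saved as, say, check_islandora.py, this would be run as `python check_islandora.py yoursite.com`. A “no response” on the Fedora admin line is the same symptom I described above when Fedora was running out of memory, so that is the first place I would look.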

/end brain dump

I’ll be posting details of my current digital edition tech stack/set-up soon (and eventually there should be thorough, user-friendly documentation on creating a participatory digital edition of your own with capabilities like the ones mine will have).

  1. e.g. I don’t need the Fedora Commons extra database and collection management abilities to work with my single text, and I need installation to be lightweight and easy so others can replicate my site.
  2. Something I’ve learned from creating user documentation for BitCurator is that we have to write documentation for things not even in our suite of tools or they can be major stumbling blocks to people using our software: stuff like getting 7-zip to decompress a file, or VirtualBox to recognize a given USB stick.