Electronic Texts Then and Now: Susan Hockey’s Electronic Texts in the Humanities

Hockey, Susan. Electronic Texts in the Humanities. Oxford: Oxford University Press, 2001.

Susan Hockey’s Electronic Texts in the Humanities is a concise introduction to a range of digital text topics: corpus projects, OCR, TEI (and older, now-outdated text encoding schemes), SGML, text analysis (written before the many online text analysis tools freely available today), and stylometrics. The book’s ten-year-old publication date leaves much of the technical material outdated (e.g. it discusses CD-ROMs of electronic texts–though I believe E-Lit Volume 2 is available on CD-ROM, so that’s not so much dating in itself–and calls multimedia “bells and whistles” when it’s more like mortar and scaffolding to a useful digital text project), but the book can still be read in part as a history of electronic textual work.

Uh-oh, Binaries (Scholar vs. reader, searcher vs. browser)

Audience and project mission dictate design decisions in digital text projects. As dangerous as binaries can be–especially when they artificially ensconce people in non-overlapping categories–I’ve found that the goals of most literary digital texts (that is, scholar-focused digital texts) are so different from those of reader-focused digital texts that laying some of these dichotomies bare could help us arrive at more specific design recommendations for a reader audience. Because I’m interested in projects that privilege the audience as critical participants, I find the museum a good starting place for thinking about digital texts. I’ll have more to say once I’ve finished reading Nina Simon’s The Participatory Museum, but I suspect there’s something to be said for the difference between traditional galleries and the more recent emphasis on interactivity and “living collections” at museums. The digital archive is often a catalogue of textual items with excellent metadata but no tools to assist non-specialists in juxtaposing the items or gaining a macro view of a collection–not so different from the static holdings of a museum; the collaborative, interactive, reader-privileging “reader-focused” genre of digital text I’ve been working to define is more like an immersive exhibit at one of our better museums.

The sheer scale of scholarly digital projects means their deep database structure often shows through in, or at least deeply influences, their site design. The more-or-less obvious catalogue look of the great digital archives works well for searching, but not so much for browsing. Susan Hockey recognizes this distinction between digital text design for (re)searching and for browsing (which I claim is a reading rather than a research activity*): “Several of these projects have taken to calling themselves an ‘archive’ rather than an edition, simply because they present a lot of source material to the user but much less in the way of editorial annotations or navigational tools… the onus is on the reader of the edition, the user, to determine what is useful and to choose what route to take through the material” (132-133).

Scholarly digital texts often make it hard for a reader to get a sense of the whole (133), something a digital edition should be able to do better than an analog edition. An analog edition has a table of contents you can skim over and pages you can visually measure with a quick glance at the width of the book’s spine (unless, as Douglas Hofstadter suggests, publishers start to add extra pages to books so that the number of pages ahead doesn’t give away the nearness of the ending), but its physical properties do not represent an overview of the textual content in any interpretive sense. A digital edition, on the other hand, can provide a number of macro views of the material; for example, a digital text of Ulysses could include an interactive map of the paths different characters take through Dublin or a clock representing the different points in the novel’s day.

Another useful binary I’ve been considering is the difference between “browser” and “searcher” digital text user personas–a difference of reading in a broadly defined area and seeking specific answers (I explored this divide in my 2010 digital text user study). Uncle Tom’s Cabin and American Culture, one of the few literary digital texts I’ve found to really bridge scholar and reader audiences, underlines this divide by offering separate “browse”, “search”, and “interpret” modes (see? binaries aren’t so bad).

To help me start to look at the design features that feed into assisting a particular digital text audience, I’ve come up with the following ten dichotomies illustrating the difference between scholar-focused digital texts (i.e. the majority of digital text projects) and those digital texts with more aesthetic, immersive, reader-focused aims:

Scholar-Focused Digital Texts | Reader-Focused Digital Texts
computation | aesthetics
analysis of the text with scholarly performance (e.g. monographs) as an end | performance of the text
immersion in the item via mark-up, image analysis tools, etc. | immersion in the text’s discourse field
bibliographic data | cultural references/discourse field
specific research queries | conjectural play
research | reading
specific engagement, fixed goals | broad engagement, open goals
searching | browsing
segmented, individual items in catalog-based archives | connections and juxtapositions
more texts per project | fewer texts per project (if more than one text, often with a single text showcased among them)

By looking at where current digital text projects fall on the overlapping spectra of usefulness for scholars and readers, I’m hoping to pull out more specific design features that support the latter’s needs. I’ve begun work on a chart listing the scholar-focused and reader-focused features of some of the more well-known digital text projects–to be blogged here in the future.

If You Build It, Would They Come? (or, Would developing more reader-focused digital texts be worth it?)

Two of the defining characteristics of scholarly as opposed to reader-focused digital texts–that they target professional scholars, and that they tend to offer more of a warehouse than a boutique presentation of texts–are largely practical. The audience for digital texts will always include scholars; many projects are created by user-developers (e.g. The Walt Whitman Archive) who rely on the tools they create. Giant archives make sense under the growing pressure for a more public humanities, for tools serving more of the revelers in the Big Tent, for the availability of large corpora; they can serve more people and uses for a longer amount of time than a smaller project around a single text, and are easier to argue as meriting funding.

How, then, might we countenance smaller–let’s say, single-text–projects?

The Digital Ulysses project (a.k.a. Ulysses in Hypermedia), a single-text digital text, garnered attention and funding over the years of its development in part because Joyce’s novel is so different from most single works. As one of the more encyclopedic fictional works, Ulysses is a very large text in many ways. The hyper-referentiality of the work invited hypertexting, while its puzzles (both solvable and unsolved) and the challenge it presents to the first-time reader argued for an interpretive performance of the book, something like bringing Harry Blamires’ beloved Bloomsday Book online.

Almost any digital text project centered on a single text of less than Ulysses’ status, however, is going to raise questions of audience size and audience frequency: is the funding or effort spent on the project going to be rewarded with regular use? Specifics of different single-text projects will vary, but I suspect that anything significantly smaller in scope than Ulysses might have an audience smaller than the typical large-scale DH project (almost by definition), but perhaps parallel to the audience size, shelf life, and use frequency of analog literature projects (e.g. monographs, journal articles). If we’re only addressing the average audience size for the bulk of a humanities field, is that a problem? And might we even be addressing a wider audience than most analog monographs–isn’t the amateur but critical reader of Blake a wider audience than the limited circle of Blake-devoted scholars?

Another argument for a wide audience: reader-focused digital texts, since they are not bound by an underlying catalog structure, can better present the discourse field around a text. With a project like my Digital Dos Passos work, which seeks to present the many media instances referenced in Dos Passos’ U.S.A. trilogy, the goal is to create an engagement with material that is not only pertinent to the Dos Passos scholar, but to readers interested in American history, media, subcultures, and literary and political Modernisms. By presenting the world around a text, reader-focused digital texts can engage a wider audience than more specific scholarly productions.

Digital text production might arguably involve higher costs (e.g. hosting, hiring programming assistance, extra time to set up the code framework for literary content), but viewing this as an impediment to producing single-text projects errs by applying the needs of large-scale projects to productions that can be accomplished by one or two people over the same span of time taken for a dissertation or part of a book (e.g. Tanya Clement’s excellent electronic edition).

Digital literary work currently seems divided between two poles: the commercial (“e-text”: stuff designed for popular consumption on the Kindle or iPad) and the scholarly (the big digital archives). Reader-focused work might seek to align itself with commercial sponsors; indeed, there are some great examples of performative, immersive literary work in the e-lit field from which digital versions of analog texts could take a page (e.g. Deena Larsen’s Marble Springs). Perhaps shifting the source of sponsorship for digital projects could help smaller endeavors; libraries, academic presses, and digital humanities centers are all sites that might assist in the production of smaller, reader-focused digital texts. Part scholarly academic production, part package for reader-focused consumption, Digital Ulysses (sponsored at various times by the University of Pennsylvania Press, the Voyager Company, and the Mellon Foundation) pointed to an audience that belonged neither purely to commercial publishing nor purely to scholarly research. If we can move single-text/reader-focused digital text projects away from their history of one-off, unreusable boutique design, demonstrating new features and standards in our digital text design work, individual digital text projects could advance the field as a whole while also serving readers.

* 2012 retraction: not sure what I meant by that, but I definitely think browsing can be a useful research action–cascading-style access, even more so than random access, can help develop research questions grounded-theory style.

“Aesthetic Provocations”: Reading Speclab toward Reader-Focused Digital Text Design

Drucker, Johanna. SpecLab: Digital Aesthetics and Projects in Speculative Computing. Chicago: University of Chicago Press, 2009.

Johanna Drucker’s Speclab isn’t explicitly about digital text design, but I found strong resonances with it in Speclab’s challenge to the digital humanities to include more aesthetic, performative digital projects, Drucker’s recounting of the design decisions that went into the creation of the Ivanhoe literary exploration game, and the contention that to create the e-books we dreamed of in the ’90s, we need to reevaluate what we’re porting (and not porting) from the analog book. I’ve been working toward a synthesis of design ideas about what, for lack of a less clunky term, I’m currently calling the “reader-focused” digital text: a genre of digital texts as sites of reader engagement and interpretive performance, rather than of (often computational) tools and scholar- (but not reader-)friendly structures such as catalogues. The term situates my vision of a genre of digital text project close to the literary aims of the great scholarly textual projects (e.g. The Walt Whitman Archive), while shifting the emphasis from scholar/research to reader/engagement. Using a somewhat DH-specific term like “digital text” also underlines the difference between such projects and “e-books”/other flat digitized text work, while keeping the scholarly attitude of interpretative discovery around a text. “Digital text” also emphasizes that what is presented is a text–the “discourse field” (as Drucker calls it) around a textual work; not a novel or poem, but the world of reading, interaction, and reception, the “rich, ongoing assembly of artifacts of which the text in question was but an instance” (34). I envision a type of digital text that is literary, cultural, historical: a museum of the text’s discourse field. I’ll be talking about my digital text work more on this blog in the future, so I’ll limit myself to this synopsis of my interests, which should be enough to situate my reading of Speclab.

Designing Ivanhoe
A series of sketches from Ivanhoe’s creation illuminates some of the thinking behind its successive designs: the interface moved from a rigidly grid-based layout to a more stacked, realistic version before reaching its final form (Bethany Nowviskie’s website has some pretty Ivanhoe design sketches, though not the same ones as in the book). Given Speclab’s emphasis on the subjective, the aesthetic, and the non-computational, an attempt to “break out of frames” seems natural–but how to eschew traditional structures of order without overwhelming the viewer with an unsiftable mass of interpretations? One of the early design ideas went uncomfortably near the Microsoft Bob school of interface design–the rectangles of text were broken out of traditional grid form, only to be spread around the browser window like index cards on a researcher’s desk (granted, I’m not sure how fluid this array was–it might have handled nicely if the user were allowed to shuffle the arrangement). The final interface, however–particularly the pie chart showing a breakdown of the game’s moves and characters–strikes a pleasant balance between the need to organize a lot of textual material and a move beyond design constraints based on the original capabilities of browsers.

This section of Speclab raised two design questions for me:
1. Ivanhoe works, in part, because segmentation into individual games cuts down the number of interpretations to be displayed, and because teams of players encourage and police one another into strong interpretive work. If you wanted to similarly allow for a collaborative collection of interpretations around a text–but weren’t working in a game format–how would you design a digital text so that the accumulation of interpretations was readily accessible and easy to browse? (Perhaps collaborative tagging, as with the Library of Congress’s online photographs?)

2. If a digital text project centers on reading an actual long work, how can the interface keep the reader “in” or near the text amid the many options that make a digital presentation useful? Certainly choices as to the look of the site, from small choices like button appearance to larger ones like information visualizations, can keep a reader immersed in the world or mood of a text even when she’s not directly reading it. This solution reminds me, though, that I soon need to unpack two adjectives I apply to digital texts too cavalierly: immersive and interactive. An aesthetic digital text needs to be about more than just a pretty appearance; what does immersion mean, and what would interacting with a text in a digital environment (truly interacting, not just clicking some royal blue links) entail? More specifically: what are the design features that reify these characteristics?

Analog Book/E-Book
Drucker argues that too often, digital forms of text are conceived along a harmful binary: you have an analog book and its digitized version (with maybe a few pretty widgets thrown on top). If a digital text is merely a mirror of its analog counterpart, it isn’t just no better than the analog; it may well be worse. Problems inherent to computer display such as screen viewing and click fatigue might be a large part of why Project Gutenberg has never been most readers’ bookshelf of choice. So, what methods might we use to break away from design boundaries imposed on the digital text by the analog book? Drucker argues that we need to reassess the nature of the book, using digital presentations of text to simulate what a book does, not how it looks (166). To return to my earlier question–what does interactivity really mean in the context of digital texts?–Drucker suggests that an over-reliance on hyperlinks has been a barrier to encouraging the play and conjecture that would occur if a reader were truly interacting with a digital text (166).

Digital Archive, Digital Text…
The Rossetti, Cather, and Whitman digital archives are all lovely in appearance, cleanly structured, and obviously useful to many scholars. Such digital text sites are, foremost, scholarly tools; they provide transparency and traceability for each piece of content in their archives, and their well-structured data helps us make explicit structures inherent (but previously less discussed) in print data. Unlike Ivanhoe or other reader-focused digital text projects, however, these major projects are archives centered on multiple, not single texts. Multiple texts mean a database or catalog, with a catalog look and catalog use. While archives don’t tend to spend a lot of time developing the discourse field around any one work, they do segment and separate these works by treating each as a separate item. It’s difficult to reconcile the desire to describe each work with proper metadata with the drive to performatively demonstrate the interconnected nature (the non-separateness) of the various textual works. Digital archives, as opposed to single-text digital texts, present so much material that can be interlinked and juxtaposed, yet this very quantity of content seems to somehow create a barrier to expressing the discourse field around individual pieces.

Design is too often used simply for the “transmission or delivery” of content, rather than as an agent of mediation in its own right (6). How can we move beyond the rigidity imposed on design by the database background of so many digital textual projects to better express a text’s discourse field? Is it possible to better visually shift emphasis from the individual units of content to relationships between them, to design “forms as relations, rather than entities” (28)? (The Blake Archive’s image comparison tool is one scholarly solution…)

…Digital Interpretation
Ivanhoe presents an interesting possibility for designing a reader-focused digital text that encourages active reading and interpretation: a focus on a text’s potential, as suggested by gaps, inconsistencies, and elisions of detail. Drucker describes the game’s origin as a conjectural reading of the otherwise extremely predictable novel Ivanhoe. As when Shreve jumps into Quentin Compson’s tale-telling in Absalom, Absalom!, saying, “Let me play a while now”, a passive reader becomes active when asked to fill in the gaps of a story or otherwise engage with a text’s discourse field (68). Encouraging interpretation as play need not create a host of overly biased interpretations or conjectures around a text; Drucker argues that interpretation, however unconscious, is the only way we create value in a text (29). A work depends on interpretation and reading to exist; there is no objective text to find. Digital texts would thus need to be ongoing projects allowing readers to add interpretations over the years, perhaps with the project showcasing different interpretations as accumulation thrusts them forward.

Conjecture and play in digital texts make a lot of sense to me. With Joyce’s Ulysses, I can imagine a digital text that would follow Bloom’s path through Dublin and ask you to justify the novel, given what could have occurred had he taken a different route. The way you understand the course of Ulysses comes, to some extent, from imagining what would happen if Bloom had, for example, returned home early (well, and with a little help from Circe). In my Digital Dos Passos digital text, I’m planning a segment that will juxtapose some of the starker, less detailed sections of Dos Passos’ prose with more detailed media and textual examples (e.g. contrasting his sections on American WWI ambulance drivers in Italy to his earlier war novels, actual soldiers’ diaries, photographs from the front, etc.). We’ll see how this experiment turns out…