Digging for Data

Amy C. Smith
Department of Classics
University of Reading


First, some evolutionary history. The Perseus Project, a digital library for the humanities, contains a plethora of original sources —and an increasing number of secondary sources— for the study of Greek and Roman civilization: texts, lexica, plans, maps, images, and archaeological catalogs. Archaeological materials are predominantly Greek.

The brainchild of Greg Crane, it originated in 1985 as a linguistic tool to facilitate the learning of the ancient Greek language. After a morphological tool was combined with texts (in ancient Greek) and translations of Greek literature, Crane quickly realized the multimedia promise of such digital educational tools: this project could also combine literary sources with history, art, and archaeology, as a means of exposing the student/scholar to the full range of remnants of ancient Greece, including primary materials beyond his/her specific focus. Perseus has subsequently been published on two Mac-platform CD-Roms and a platform-independent CD-Rom, and is available on the Internet.

Now Renaissance, New World, and other post-Classical materials have been added to the ever-expanding boundaries of Perseus.

The archaeological materials added to the Perseus Project in its earliest phases (up to 1996) on CD-Roms included a selection of objects (coins and painted vases as well as sculpture), buildings, and sites from the world of Archaic and Classical Greece. Over 70 museums generously shared images of their art objects included in Perseus 2.0. The atomic objects, that is, those that constitute discrete movable entities —coins, vases, sculptures, and even entire buildings— were each given their own catalog entries, which followed traditional lines (noting specific dates and locations, fabric, subject, and form descriptions).

Some more flexible information was added for the archaeological sites, which in most cases spanned several occupation periods: take the example of the panhellenic site of Delphi. Relevant images were linked to each entry. These catalog entries were designed for and perfectly suited the HyperCard environment in which the Perseus Project was originally published. With the Perseus Project at her/his fingertips the student of ancient Greece could now shuffle through the relevant stacks, applying morphological tools to ancient texts, as needed, investigating discrete catalog entries, or sorting through lists of various archaeological materials — organized in whichever order s/he selected. Thus a student of Homer’s Iliad could search for ancient representations of the hero Achilles with which to envision a main character of this epic.

Using Perseus’ archaeological materials as a scholarly resource

The Perseus Project was envisioned from the beginning as a multifaceted library that could have a breadth of material suitable for students of all ages and a depth of material suitable for scholars at any level. The widespread use of the Perseus Project in high schools and colleges attests to the success of the first goal; the second —and particularly the degree to which the art and archaeology materials in the Perseus Project aid the scholar in his or her research— is a debatable matter, which I shall address in this paper.

In constructing this paper I resisted the temptation of stepping into my former role, as Art and Archaeology Editor of the Perseus Project, to show how far the Project has come and to tell what is to come next. These tasks I shall leave to my able colleagues on this panel. Rather, I stand firmly in my current role, as a University faculty member who uses (or tries to use) the Perseus Project as a tool for scholarship: a tool that not only helps us present antiquity intelligibly and usefully to our students, but also helps us tackle research questions. In this role I shall examine my own use of Perseus on two projects, one in which I am now involved and another that I should like to pursue.

The collaborative project, Demos, is part of the consortium that has created stoa.org. You shall hear about that from Chris Blackwell later in this conference. It is an on-line reference tool, a collection of original sources on democracy in Classical Athens, with some annotation, scholarly commentary, and bibliography. In discussing my part in creating materials for Demos I shall focus on the issues encountered in data mining, efficiently linking materials, and sharing data across different web sites.

The second project is an expensive, ambitious, and less evolved project, namely a three-dimensional catalog of Greek Sculpture. In exploring the idea behind this three-dimensional catalog I shall discuss the evolution of the sculpture catalog on the Perseus Project, and focus on the fundamental differences between print and digital catalogs: how radically different a Web catalog might be, in order to take full advantage of the digital medium, and to help scholars answer age-old questions.

Democratic art and archaeology in Demos

To include an art and archaeology component in the Demos Project first I had to identify and categorize the original monuments and visual materials that were relevant to and/or enlightening with regard to the evolution of the Athenian democracy, from 508 to 322 BC. Demos aims to bring primary sources on Athenian democracy to the web in an intelligible manner and encourages the user to dig into the data —the original sources— in doing their own research.

What I’ve called democratic archaeology constitutes the places and things found in the archaeological record that inform us about how Athenian democracy worked.

By democratic art, however, I mean the art objects that illustrated or explicitly reflected aspects of the democracy.

The Democratic Art title page serves as one of many portals through which the interested user may enter Demos. From there one might skip straight to the "further reading" (traditional, as opposed to Web bibliography). Or one can delve into the guts of the catalog. I’ve discerned three categories of democratic art: portraits, eponymous heroes, and political personifications.

For portraits in Demos I have narrowed the range down to historical figures who strongly influenced the democracy: generals and statesmen, as well as writers whose works appear prominently elsewhere in Demos, notably Aristotle and Plato. The origins of portraiture in Greece go back to the Tyrannicides, Harmodios and Aristogeiton, the pair of lovers who killed Hipparchos, the brother of the Athenian tyrant, Hippias, in 514 (they are illustrated here on a composite copy in Naples; not after a copy of the original portrait created by Antenor, but after the copy of 477/6 by Kritios and Nesiotes). They didn’t quite end the tyranny: this was, after all, a lovers’ spat. They failed to kill Hippias, were caught and killed themselves, but thereby became civic heroes, credited posthumously with the overthrow of the tyranny in 511/10.

If you want to read more about it, to dig into the source material, you could follow the text links to the Perseus server. After a brief explanation of the Tyrannicides, with links to the relevant ancient texts, there is a discussion of the evidence regarding the creation of their portraits. And we follow up with the extant portraits. Inevitably Roman copies are prominent in the extant portraits section, but I’ve only included representative or particularly illuminating Roman copies, rather than an exhaustive list of them.

The eponymous heroes, who constitute my second category, are the heroes who were assigned, in consultation with Apollo’s oracle at Delphi, as the named heroes of each of the ten tribes of Athens, the new political and military units into which Attica was divided after the Kleisthenic reforms of 508/7: thus Erechtheus for the Erechtheis tribe, and so on. Erechtheus is thought to be represented on the Choiseul Marble (shown here), a relief-decorated Parthenon Treasury Account (of 410/09). This is political, but because of the diversity of scenes in which these heroes appear, and their common appearances together, in this section I have separated the heroes from the art. After an introductory discussion of the heroes as a group, there is the encyclopaedic text on the individual heroes: the mythology, worship, and tribal connections that are attested in the ancient sources. Then come the images of the heroes, divided into sculptures and paintings, and different variations within those categories. Thus the data can be approached either from the perspective of the hero or of the objects, whichever suits the individual researcher’s needs.

And finally we come to images of political personifications, perhaps the most overlooked of these three categories. But among the Classical vases with labels painted on the vases, there is a high preponderance of personifications with political names. Here is the well known example of the statue of Eirene (Peace), holding Ploutos, the god of wealth. Sculpted by Praxiteles’ father, Kephisodotos, she was erected in the late 370s BC, perhaps after the Peace of Kallias, and placed near the Agora. This statue, known from Roman copies, such as the illustrated one in Munich, as well as Panathenaic vases (of 369 BC), was not the only image of Eirene known in Classical Athens. The copies of Kephisodotos’ statue are enumerated in the Perseus sculpture catalog; the other appearances of Eirene are enumerated on Demos, as for all other political personifications.

Democratic archaeology on Demos remains at an outline stage, but it is there for your perusal. The monuments and materials —many of which have been found through the efforts of the American School of Classical Studies at Athens, in their Agora excavations— give us insights into how the democratic process actually worked.

Creating the Resource

The first issue one encounters in creating a web resource is copyright. I have taken the lazy but safe option, so far, and used my own images, those that I shot, as photographer, with no special agreements. For other images not yet shown, I am seeking permissions from relevant authorities, especially museums.

But in the absence of those images, where the Perseus Project has the images or materials, why not link to them? And while I was art and archaeology editor of the Perseus Project I actually added democratic materials to its database, so that Demos and others could link to them! That catalog is a better repository for the arcane details, so we don’t have to overburden Stoa or, worse yet, create a competing catalog on Stoa.

To link appropriately into Perseus one has to have the right terms, spelled in a uniform manner, and the same catalog numbers designated. Not British Museum 459 but London 459, simply because that’s how it is entered in the Perseus Project. This touches on a big problem encountered early in the development of the Perseus Project: variations in spelling and nomenclature are standard in ancient studies. The Greek hero Achilles, for example, is correctly referred to by transliterations of his Greek name (Akhilleus or Akhilles), by Latinized versions of that name (Achilleus or Achilles), and by at least one other name, Ligyron. While a database of alternate names was built in anticipation of the variation in spellings and names that users would wish to employ, it was built into the CD-Roms and the earlier Web site as a didactic tool. That is, the user would be presented with the variety of names by which an object, site, or person might be known, but would then have to search each alternate name individually. The Perseus search tool now does the searching for us, but the tool is only as good as its underlying data, and the system still requires human intervention: a person to type the names into the "alternate names" database, as they crop up.
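The behaviour described above, in which the search tool rather than the user tries every alternate name, might be sketched as follows. This is only an illustration under stated assumptions: the table layout and the names `ALTERNATE_NAMES`, `build_index`, and `expand_query` are hypothetical and are not taken from the Perseus software; the spellings are those given in the paragraph above.

```python
# Hypothetical alternate-names table. The variants are those named in the text;
# the structure itself is an assumption, not the Perseus database schema.
ALTERNATE_NAMES = {
    "Achilles": ["Akhilleus", "Akhilles", "Achilleus", "Ligyron"],
}


def build_index(table):
    """Map every known spelling (lower-cased) to its canonical name."""
    index = {}
    for canonical, variants in table.items():
        for name in [canonical, *variants]:
            index[name.lower()] = canonical
    return index


def expand_query(term, table, index):
    """Return every spelling to search for, canonical name first."""
    canonical = index.get(term.lower())
    if canonical is None:
        # Nobody has typed this name into the table yet: search it as given.
        return [term]
    return [canonical, *table[canonical]]


INDEX = build_index(ALTERNATE_NAMES)
print(expand_query("ligyron", ALTERNATE_NAMES, INDEX))
```

A name that has not yet been entered by hand, such as Harmodios in this toy table, simply falls through and is searched as typed, which mirrors the dependence on human data entry noted above.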

This has not yet been done for the Athenian heroes and lovers, Harmodios and Aristogeiton, otherwise known as the Tyrannicides. But one can find the connection through the image caption, which is now regularly searched by Perseus’ look-up tool. But how can a search engine make distinctions between the many ancient individuals and places that share names? The overworked Perseus programmers are currently reworking the ontological basis for the project’s data, so that the search tool should, in the future, be able to distinguish between people and places, places in different regions, and so on. The ontology should also serve as a source of keywords and thus a separate research tool.

Even synonyms and alternate spellings (e.g., pottery v. ceramic or terracotta v. terra cotta) were not sufficiently standardized in earlier versions of the Perseus Project to enable the program to infer all necessary cross-references. In the catalogs themselves a certain amount of consistency has been created through the use of pop-up menus.

Of course, for the CD products, a system of keywords was established to allow for searching across multiple catalogs and lists of catalog entries: a keyword search for "pomegranate" in coins would generate a list of two coins with images of this food item. Yet the keywords were limited to iconographic terms (scenes shown, individuals pictured, and attributes worn/held by those characters), and one could only search one catalog at a time with these keywords; this obviously limited one’s ability to find comparable images in different media. But now the programmers have enabled the Perseus search mechanism to search through keywords as well as names, summary descriptions, artists, materials, and captions. Although the developers went to great lengths to include the vast majority of iconographic types, the original database of keywords was limited to subject matter included in the original group of archaeological materials; additional keywords for novel subjects shown on newer materials had to be generated by hand, rather than inferentially in the program.
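The difference between searching one catalog by keyword and searching every catalog across several fields can be shown with a small sketch. Everything here is hypothetical: the record identifiers, captions, and field names are placeholders invented for illustration, and the `search` function is not the Perseus search mechanism.

```python
# Placeholder records. The ids and captions are invented, not Perseus data.
CATALOGS = {
    "coins": [
        {"id": "coin-A", "keywords": ["pomegranate"], "caption": "placeholder"},
        {"id": "coin-B", "keywords": ["owl"], "caption": "placeholder"},
    ],
    "vases": [
        {"id": "vase-A", "keywords": ["Achilles"],
         "caption": "woman holding a pomegranate"},
    ],
}

# Fields the text says are now searched in addition to keywords.
FIELDS = ("keywords", "name", "summary", "artist", "material", "caption")


def search(term, catalogs, fields=FIELDS):
    """Return (catalog, id) pairs for records matching term in any field."""
    term = term.lower()
    hits = []
    for catalog, records in catalogs.items():
        for record in records:
            for field_name in fields:
                value = record.get(field_name, "")
                # Keyword lists are joined so they can be matched like text.
                text = " ".join(value) if isinstance(value, list) else value
                if term in text.lower():
                    hits.append((catalog, record["id"]))
                    break  # one hit per record is enough
    return hits


# Searching all catalogs and all fields at once.
print(search("pomegranate", CATALOGS))
# The older behaviour: one catalog, keywords only.
print(search("pomegranate", {"coins": CATALOGS["coins"]}, fields=("keywords",)))
```

In this toy data the vase is found only because captions are searched, which is the kind of cross-media comparison that the one-catalog keyword search could not offer.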

Automatically generated links between discrete texts are improved with a good ontology. When comparanda, ancient and modern texts, and Greek and Latin words are included in the archaeology databases, they must also be tagged so that relevant links will be generated. Further resources may be accessed only if the data is consistent or thorough, with, for example, standardized bibliographic citations. As a teacher and a scholar I anxiously await the new Perseus ontology.

In anticipation of the development of such an ontology for Perseus and/or for Stoa, the text I have written for Demos has been tagged with smart tags, in XML, that distinguish artists from authors from other historical individuals, from heroes, and so forth. For now these characters are merely highlighted, and hyperlinked to the internal text. So, just as links have been made to Perseus catalog entries that do not yet exist in the form in which they should eventually exist, these tags will, once the ontology exists, enable easy linking to it; the ontology in turn will be a source of cross-references and a resource tool in itself.
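To illustrate how such tags could later be harvested for linking, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element names (`artist`, `author`, `hero`, `personification`) and the sample sentence are assumptions made for the example; the actual tag set used in Demos is not specified here.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element names are illustrative, not the Demos tag set.
SAMPLE = (
    "<p>The statue of <personification>Eirene</personification> was sculpted by "
    "<artist>Kephisodotos</artist>; <author>Aristotle</author> and "
    "<hero>Erechtheus</hero> appear elsewhere.</p>"
)

ROLES = ("artist", "author", "hero", "personification")


def names_by_role(xml_text, roles=ROLES):
    """Group tagged names by the role their element assigns them."""
    root = ET.fromstring(xml_text)
    found = {}
    for element in root.iter():
        if element.tag in roles:
            found.setdefault(element.tag, []).append((element.text or "").strip())
    return found


print(names_by_role(SAMPLE))
```

Because the role travels with each name, a future ontology could be consulted per role (artist, author, hero) rather than by the bare string, which is what would let a search distinguish individuals who share a name.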

Hypertext links are the basic advantage of a digital versus a conventional library. Some connections rely on thoughtful database design, as in the case of our sculpture database, so at this point I shall turn my attention to the potential scholarly value of digital catalogs in general, and particularly for sculpture.

The Digital Potential for Object Catalogs

Catalogs have come a long way in this century. In the 1920s museum catalogs claimed to be written for the interested lay audience, but were written in a manner that makes them barely intelligible to a first year graduate student. Now we have COMPASS, the British Museum’s on-line catalog. As in contemporary print catalogs, images are better, descriptions are less descriptive, less analytical, and less scholarly, but there is plenty of contextual information, such as findspot, collection history, and information on techniques of manufacture. An extended description and discussion could be provided, of course, for those who seek more. Links could be provided to materials off the British Museum web site, but the best you get in this respect is a link to the British Museum’s on-line bookshop where you can buy relevant sources by British Museum scholars writing in books published by the British Museum Press!

The Museum of Fine Arts, Boston, kindly granted the Perseus Project permission to include their most extensive catalog, Caskey and Beazley’s 3-volume catalog of Greek Vases in their collection, published in 1931, on the Perseus Project. This was a godsend in the early ’90s. The black-and-white images were scanned and presented with the full text and, as of 1997, anyone, not just members of elite academic institutions who happened to own this classic three-volume set, could access Sir John Beazley’s erudite texts on one of the best collections of Greek vases in the United States. Yet not everyone finds the texts as useful as they could be. References have not been made useful, and the abbreviations remain arcane: a secret code that is both unintelligible to the novice student and off-putting to the accidental tourist surfing through the world of Perseus. Most of the images, few as they are, are still black-and-white. And the more recent vases in the collection are simply not included. A few key pieces in the collection, such as the Agamemnon vase attributed to the Dokimasia Painter, have been acquired since 1931. This vase, which is, however, included in the Beazley Archive, is the great source for a variant on the golden-age question: was it Queen Clytaemnestra or her lover Aegisthus who killed King Agamemnon? Perseus’ visitors won’t know, for a while yet. The Beazley Archive and the Perseus Project have embarked on a collaboration, however, merging their databases. That would give Perseus viewers access to 60,000 vases, which constitute an extremely thorough scholarly tool.

The addition of multiple images —as shown, for example, in the case of the Berlin Museum’s copy of Kresilas’ portrait of Perikles— is one of the great boons of Perseus. It far surpasses the illustrative material traditionally available in print catalogs. But there is still the size issue, in which we are limited by browsers and bandwidth, as well as museums. This limitation has been addressed by Perseus with its new high-resolution tiles. A catalog of circa 150 gems from the Lewes House Collection, now in the Museum of Fine Arts, Boston, is one of the first projects conceived with high-resolution images. For now the gems are presented through so-called "tombstone" labels —scant information about the what, where, and when of gems— while we await a Web version of Sir John Boardman’s forthcoming scholarly update of Beazley’s detailed 1920s catalog.

With or without that scholarly detail, however, Maria Daniels’ (yet to be released!) high resolution photographs —at 18 MB— of each of these gems, and their casts, bring them to the fore like no museum display ever could. They allow us to study the details better than we could with the naked eye. Finally we will all be able to read the inscription on this sard scarab (Boston 27.723) without making an appointment to visit the curator, to take the relevant gem carefully off display, to procure the magnifying glass, to provide the right amount of light, to try to read that minuscule inscription labelling Ercle—Hercules. Maria and her camera have done that for us, and all we need to do is navigate around the Perseus site, clicking on the part of the gem that we would like to see, at the resolution we prefer. This level of photographic documentation is certainly a scholarly tool.

But back to Perseus’ sculpture catalog. I would like to think it has come a long way since the publication of Perseus’ CD-Roms. Despite the multitude of images, the hypertext environment and multiple search options, Perseus’ early catalogs —quickly converted into Web catalogs in 1995— still fall short of the promise of the Internet.

Even in 1995 the HyperCard presentation seemed less flexible than it had seemed in 1992. In his 1997 review of Perseus 2.0, Nick Eiteljorg justifiably complained that "the complete separation of text and graphics elements in Perseus 2.0 seems more than dated; it projects a bias towards words and against images." He also provided the most substantial and useful criticism of the sculpture catalog: that the catalog information was overly simplified, so that no allowance had been made for the existence of copies of famous original statues and their relation to each other. Indeed a complex challenge is posed by the fragmentary nature of extant Greek sculpture: some sculptures are composed of fragments located in a variety of museums; many more (particularly architectural sculptures) once belonged to a group that may have been separated through time; and the most famous statues from Greek antiquity are lost, yet recognized in multiple copies of the originals that proliferated in Roman society.

In fact, there was no interconnection between the sculptural pieces, aside from those that adorned the same building. Ancient sculpture is fragmentary. Some sculptures exist only in pieces; others have been split up; and many of the ancient world’s most famous statues now survive only in multiple copies created later than the originals. The redesign of Perseus’ sculpture catalog in 1997-1999 was intended to bring the visitor’s attention to these complexities of ancient art. A data field in our database now categorizes artworks according to their degree of entirety and automatically generates links to related objects. More than one relation for each object is permitted, which encourages users to investigate the whole range of possible connections. This multirelational database enables our next phase of catalog development, which will include the construction of series of copies, the contextualization of groups of sculptures, and the reconstruction of lost originals.

A preeminent example of all of these relations is the Parthenon, or Temple of Athena Parthenos in Athens. This building was decorated with three different types of sculptural forms, metopes —square (relief) plaques— on the outer frieze, a continuous relief zone on the interior (Ionic frieze), and two sets of pedimental sculptures, one on the East façade and one on the West.

These sculptural forms relate to one another and to the building as a whole. How do they relate? Of course every scholar has a different idea. But each of these forms also represents a separate group (or multiple groups) of sculptures, e.g., the discrete metopes originally numbered 92. They may also be divided into four thematically distinct groups depending on which of the four faces of the building they decorated.

The Perseus viewer may now find these interconnections as follows: From the Parthenon entry in the architecture catalog one is supposed to be able to link to the sculpture entry. (This isn’t quite working, as of the latest reworking of the user interface.) One could then choose one of the entries relating to a particular group. In this case I’ll choose West Pediment, which shows the deities Athena and Poseidon battling for the patronage of Athens. And we may view the different fragments known to have belonged to the East or the West pediment.

Through the ravages of time, many of the original sculptures have been separated from each other, and some reliefs and figures have been broken, yet these fragments should still be studied in relation to each other. How to fit all the fragments together? Well, that’s up to the individual scholar, but can’t be done until you know —or see— what fragments remain.

Many Greek sculptures are known only through almost identical copies produced in Roman times, rather than the original art work, and such is the case with the chryselephantine (gold and ivory) cult statue of Athena Parthenos that originally stood within that Temple, but had been lost by late antiquity. Of course I could go back to the main architecture entry for the Parthenon, from which I could link to the cult statue that was housed within it. But the entry on the lost statue links me back to known copies, including the stunning modern copy, in the Royal Ontario Museum in Toronto. This modern copy takes its cue from Roman copies, multiple versions of which were created from the first through the third centuries AD.

The Perseus Project’s sculpture catalog was redesigned, between 1997 and 1999, with almost no changes to the existing data and relatively few strategic changes to the database itself. The most important change was the creation of a new data field, named "category," that was used to specify the particular relation by which a sculpture might be connected to another. One of a discrete number of possible relations would be indicated for each piece: "single monument," "statuary group," "original/copies," or "separated fragments." The pertinent relation thus residing in the database would then indicate the way in which that monument should be linked to related monuments. In the case of single monument, no relation would be necessary. "Statuary group" would generate a link to a specified "group" to which the sculpture belonged. "Original/copies" would generate a link to specified "original" that this sculpture copied. And "separated fragments" would likewise generate a link to a specified "whole" to which the sculpture originally belonged. The database also permits the existence of more than one such relation for each sculpture, as in the case of the Parthenon.
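The way a "category" value determines which link is generated can be made concrete with a short sketch. The four category values and the linked roles ("group," "original," "whole") come from the description above; the class names, field names, and the example record are hypothetical and do not reproduce the actual Perseus database.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Category value -> role of the record it links to (None means no link needed).
LINK_ROLE = {
    "single monument": None,
    "statuary group": "group",
    "original/copies": "original",
    "separated fragments": "whole",
}


@dataclass
class Relation:
    category: str
    target: Optional[str] = None  # name of the related group, original, or whole


@dataclass
class Sculpture:
    name: str
    # More than one relation per sculpture is permitted.
    relations: List[Relation] = field(default_factory=list)


def links_for(sculpture: Sculpture) -> List[Tuple[str, str]]:
    """Return (role, target) pairs for every relation that implies a link."""
    links = []
    for relation in sculpture.relations:
        role = LINK_ROLE[relation.category]  # KeyError flags an unknown category
        if role is not None and relation.target is not None:
            links.append((role, relation.target))
    return links


# Hypothetical record: a pediment figure that belongs to a group and is
# also a fragment of a larger whole.
figure = Sculpture(
    name="Example pediment figure",
    relations=[
        Relation("statuary group", "Parthenon West Pediment"),
        Relation("separated fragments", "Example reassembled figure"),
    ],
)
print(links_for(figure))
```

Keeping the relations in a list, rather than in a single field, is what allows one piece to carry several relations at once, as the Parthenon sculptures require.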

The next stage in the evolution of the Perseus Project’s catalog of Greek sculpture has come through improvements in the user interface on Perseus’ Web site. As I have mentioned, the comprehensive text browser —the "Lookup" tool— facilitates access between the sculpture catalog and other Perseus materials. And a thumbnail browser also allows users to browse through objects visually. In the future it should also allow for the visual comparison of a selection of images of the same or different objects chosen by the user, not by the Lookup tool.

More work on contextualizing the individual sculptures, alone and in groups, is needed to establish the variety of relations between individual sculptures and to represent them intelligibly to the widest possible audience. To take full advantage of the multimedia digital environment we should be able to present users with 3-D visualization of the relationships between particular sculptures and of their relationship to sites and buildings. Rapid advances in the accuracy of 3-D digital representations are now allowing us to accomplish faithful digital reconstructions not only of architectural spaces, but also of actual objects: first to rebuild digitally those that exist today, and then to extrapolate in an attempt to visualize those that are lost to us.

A related database of sculpture types (the art works on which copies were based) is also necessary. Scholars have begun to realize not only that Hellenistic and Roman copies of Greek originals should be studied in their own right (with regard to issues of production, collections, taste, and architectural placement/decoration), but also that there is a one-to-many, not a one-to-one, relationship of copies to originals. That is, copies seem to have copied intermediate types or copies of copies of originals. An appropriate treatment of these copies (as copies of copies, not necessarily copies of originals) may be easily tackled in a relational database, so that an infinite genealogy of original types, variations, copy types, and subsequent copies thereof, may be built into the database. A program might even generate a genealogical table that would provide the interested viewer with a quick overview of the relation of several related sculptures in the nexus of originals and copies.
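A genealogical table of this kind could be generated from nothing more than a record of each work's immediate model. The sketch below assumes exactly that: the mapping `MODEL_OF`, the placeholder names in it, and both functions are hypothetical illustrations, not a description of any existing Perseus program.

```python
# Each work points to the work it immediately copied (None for an original).
# The names are placeholders, not real catalog entries.
MODEL_OF = {
    "Original A": None,
    "Type B": "Original A",   # an intermediate type derived from the original
    "Copy C": "Type B",       # a copy of a copy
    "Copy D": "Type B",
    "Copy E": "Original A",   # a direct copy of the original
}


def lineage(name, model_of):
    """Follow a work back through its models to the original."""
    chain = [name]
    while model_of.get(chain[-1]) is not None:
        parent = model_of[chain[-1]]
        if parent in chain:
            raise ValueError("cycle in copy relations at " + parent)
        chain.append(parent)
    return chain


def genealogy_table(model_of):
    """Return an indented outline, one line per work, originals first."""
    children = {}
    for child, parent in model_of.items():
        children.setdefault(parent, []).append(child)

    lines = []

    def walk(node, depth):
        lines.append("  " * depth + node)
        for child in sorted(children.get(node, [])):
            walk(child, depth + 1)

    for root in sorted(children.get(None, [])):
        walk(root, 0)
    return lines


print(lineage("Copy C", MODEL_OF))
print("\n".join(genealogy_table(MODEL_OF)))
```

Storing only the immediate model keeps the data entry simple while still allowing chains of any depth, so copies of copies need no special treatment.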

An exciting multimedia means of visualization is 3-D reconstruction of the original placement of the sculptures, which would show each sculpture in relation to one another and to the architectural complex of which it played an integral part. Although several ambitious projects have attempted to provide accurate scholarly reconstructions of some important monuments from classical antiquity, none, to our knowledge, have attempted to provide accurate renderings of the sculptural decoration of these monuments, but have rather aimed to provide architecturally precise spaces. Rudimentary architectural reconstructions may at least provide contexts in which images of sculptures —both architectural and those in the round— may be visualized. Perseus hopes eventually to have reconstructions of the spatial and architectural complexes in which many of the sculptures in our catalog were originally found, starting with the Apollo sanctuary at Delphi, discussed in Tom Milbank’s contribution to this conference.

A more ambitious future project that involves the relation of copies to originals, and of separated fragments to each other, is the implementation of 3-D scanning technology to create digital images of existing copies or fragments and subsequent reconstructions of the original sculptural works on which they were based or to which they belonged. A detailed measurement and analysis of the copies/fragments would enable one to usefully superimpose missing parts of some copies on others, to scale each copy to the same "actual size," to adjust for copyists’ additions or changes (e.g., to account for dimensions that may have been foreshortened in a relief representation), and finally to arrive at a clearer picture of the original statue(s) on which copies were based.

The technology to enable an accurate digital rendering of the originals is already available, and is now being employed by its inventor, Marc Levoy of the Stanford University Computer Graphics Lab, in the study of the sculptures of Michelangelo, as well as the Severan Marble Plan. While this laser scanning technology has proved successful in the digital recreation of known sculptures, the data acquired through such a process has not yet been used as a basis for comparison of similar copies and extrapolation that might render the most faithful recreations of long lost ancient masterpieces. Such an adaptation of this scanning technology might also enable scholars at all levels to digitally create their own reconstructions. This is my earnest wish for the future.

This file was last updated on 25 July 2001.