Morning Tutorials

Tutorial 1: Making DSpace Your Own

Dorothea Salo and Tim Donohue


Starting an institutional repository on top of DSpace? Using BioMedCentral’s new OpenRepository service? Get control over its look and feel! Learn to modify DSpace to reflect your institution’s branding, and improve usability for both submitters and users. Learn the basics of making your DSpace installation unique with customized code or functionality. While you’re at it, learn about the DSpace developer community and how you can give back.

This introductory tutorial assumes no knowledge of DSpace or Java. Familiarity with basic Unix commands, FTP, and HTML is recommended.

Target Audience

Librarians and staff planning or running DSpace installations who want more control over the technology. Introductory to intermediate-level. Basic Unix, FTP, and HTML familiarity useful, though not required. No Java, JSP, or CSS knowledge assumed.


Dorothea Salo is the Digital Repository Services Librarian at George Mason University; she runs the Mason Archival Repository Service (MARS), GMU’s DSpace installation. Her background includes work with text markup languages, electronic publishing and ebooks, and accessible web design. Despite knowing nothing about Java or DSpace previously, she completed a visual redesign of MARS within two months of her start at GMU, and is now actively working on DSpace patches to improve the platform’s accessibility and usability.

Tim Donohue is a Research Programmer at the University of Illinois at Urbana-Champaign, where he works on the Illinois Digital Environment for Access to Learning and Scholarship (IDEALS). IDEALS is the UIUC institutional repository, built on DSpace and scheduled for wide release in Fall 2006. Tim has a background in Java programming (among other languages) and, before receiving an MLS from UIUC in May 2005, was a technical architect for a Chicago-based consulting firm. He is an active member of the DSpace developer community: submitting patches, reporting bugs, and helping to answer questions on the various listservs.

Tutorial 2: The Fedora Service Framework: Introduction

Sandy Payette and Chris Wilper


This tutorial is an introduction to the Fedora Service Framework, an open-source service-oriented architecture for digital repositories. We will discuss how Fedora acts as an enabling technology for accommodating a diverse set of user needs across different communities, and how it is positioned as an evolutionary technology that can be adapted to fulfill new user requirements over time. Central to the framework is the Core Repository Service, which provides essential capabilities for ingesting, storing, managing, preserving, and disseminating digital content in the form of digital objects. Loosely coupled within the Fedora framework are supporting services that make the repository environment more than just a storage system. We will examine the repository in relationship to ingest, workflow, preservation, search, and other important services. We will discuss the reasons one would choose Fedora as the basis for digital libraries, institutional repositories, digital preservation, scholarly communication, and related problems. Finally, we will review Fedora’s involvement in demonstrating new standards-based approaches to cross-repository interoperability via OpenURL and RDF.

Target Audience

Information science specialists, including technically-oriented librarians and archivists, information technology specialists, and digital library architects, who wish to understand the capabilities of Fedora and the benefits of service-oriented architectures for libraries and other information-oriented institutions. There are no prerequisites for this tutorial.


Sandy Payette, Co-Director of Fedora Project, Cornell Information Science

Chris Wilper, Senior Software Engineer, Fedora Project, Cornell Information Science

Tutorial 3: Introduction to (Teaching / Learning about) Digital Libraries

Edward A. Fox and Marcos André Gonçalves


This tutorial will provide a thorough and deep introduction to the DL field, introducing and building upon a firm theoretical foundation (starting with “5S”: Streams, Structures, Spaces, Scenarios, Societies), giving careful definitions and explanations of all the key parts of a “minimal digital library”, and expanding from that basis to cover key DL issues, illustrated with a well-chosen set of case studies. Attendees will receive a partial draft copy of a new book under development by the co-presenters, with the tentative title “Foundations for Information Systems: Digital Libraries and the 5S Framework”, based in part on ideas explored in Dr. Gonçalves’ dissertation.

Target Audience

  • Audience 1: Those attending JCDL for the first time, to become oriented.
  • Audience 2: Those interested in DL theory in general, or 5S in particular.
  • Audience 3: Those teaching DL courses, so as to be prepared to use the new book.

Level of experience required: introductory. Those at intermediate or advanced levels could benefit as well, since the 5S framework has broad applicability for planners, designers, implementers, and evaluators.


Edward Fox holds a Ph.D. and M.S. in Computer Science from Cornell, and a B.S. from M.I.T. Since 1983 he has been at Virginia Tech, where he serves as Professor. He directs VT’s Digital Library Research Laboratory and the Networked Digital Library of Theses and Dissertations. He is chair of the IEEE Technical Committee on Digital Libraries, and is on the steering committee for JCDL and ICADL. He has been (co-)PI on over 90 research and development projects. In addition to his courses at Virginia Tech (including courses on digital libraries), Dr. Fox has taught over 65 tutorials in more than 23 countries. He has given roughly 50 keynote/banquet/international invited/distinguished speaker presentations, 110 refereed conference/workshop papers, and 280 additional papers/presentations. He has co-authored/edited 12 books, 76 journal/magazine articles, 36 book chapters, and many reports. Fox is Co-Editor-in-Chief for ACM JERIC, and is on the boards of TOIS, IJDL, IP&M, J. UCS, Multimedia Tools and Applications, etc.

Marcos André Gonçalves completed his doctoral degree in Computer Science at Virginia Tech in 2004. He earned a master’s degree from the State University of Campinas (UNICAMP) in 1997 and a bachelor’s degree from the Federal University of Ceará (UFC) in 1995, both in Computer Science. In his dissertation, he proposed one of the first comprehensive formal frameworks for digital libraries (DLs): the 5S framework of Streams, Structures, Spaces, Scenarios, and Societies. He has been working in the DL field since 1997. He has published 5 book chapters, 10 journal/magazine papers, and more than 30 conference/workshop papers in the DL field. He has received 5 awards, including the Lewis Trustee Award from Laspau for promoting collaborative research between the U.S. and Latin America (Brazil) in the DL field and the Best Student Paper Award of the ACM/IEEE 2004 Joint Conference on Digital Libraries. His research interests include DL, IR, and Databases. He is collaborating with Fox on a book (tentative title: “Foundations for Information Systems: Digital Libraries and the 5S Framework”) based in part on his dissertation.

Tutorial 4: Mapping the Intellectual Landscape of Scientific Knowledge in Digital Libraries

Chaomei Chen, Xia Lin, Howard White and Kate McCain


The desire to identify, understand, and track the development of scientific knowledge is deeply rooted in philosophy of science, sociology of science, information science, and a wide variety of scientific disciplines. Increasingly comprehensive and widely accessible digital libraries of scientific publications are becoming essential resources for scientific communities. At the same time, users of such digital libraries are increasingly overwhelmed by the rapidly expanding volume of their collections and by the rapid advance of scientific knowledge. Users need tools that help them grasp, and keep abreast of, the big picture of how scientific knowledge evolves at a macroscopic level, whether in their own fields or in a new field. The aim of this tutorial is to introduce some of the fundamental theories, methodologies, and technologies that can help users of digital libraries to better understand their collections of scientific literature. The tutorial will present the integral role of information visualization, citation analysis, and knowledge domain visualization in augmenting our understanding of scientific knowledge conveyed through digital libraries, knowledge repositories, and disciplinary archives. The tutorial will include hands-on sessions for participants to explore a number of fully operational prototypes.

Target Audience

Anyone who needs to develop a better understanding of the intellectual structure of scientific literature and/or how scientific knowledge evolves in a scientific domain. Researchers, educators, consultants, evaluators, and domain analysts.


Chaomei Chen, Ph.D., Associate Professor, College of Information Science and Technology, Drexel University
Dr. Chen’s research includes information visualization and how relevant techniques can be applied to digital libraries. He is particularly interested in mapping scientific frontiers and identifying structural and temporal patterns of the growth of scientific knowledge and how such patterns change over time. He is the Editor-in-Chief of Information Visualization (Palgrave-Macmillan) and the author of Information Visualization (Springer, 2004) and Mapping Scientific Frontiers (Springer, 2003). He is the designer and developer of the CiteSpace system, a freely available Java application for analyzing trends and changes in scientific literature.

Xia Lin, Ph.D., Associate Professor, College of Information Science and Technology, Drexel University
Dr. Lin received his Ph.D. in Information Science from the University of Maryland at College Park in 1993. Since then, he has been an active member of the SIGIR and digital libraries communities and has published widely in the areas of digital libraries, information visualization, and information retrieval. He has developed several information visualization prototypes that can be applied to very large databases or digital libraries.

Howard White, Ph.D., Professor Emeritus, College of Information Science and Technology, Drexel University
Dr. White has published on bibliometrics and co-citation analysis, evaluation of reference services, expert systems for reference work, innovative online searching, social science data archives, and literature retrieval for meta-analysis and interdisciplinary studies. He is the author of Brief Tests of Collection Strength (Greenwood, 1995) and a co-author, with Marcia Bates and Patrick Wilson, of For Information Specialists: Interpretations of Reference and Bibliographic Work (Ablex, 1992). He received the Research Award of the American Society for Information Science and Technology (ASIST) in 1993; the best JASIS paper award in 1998, with Katherine McCain, for “Visualizing a Discipline: An Author Co-Citation Analysis of Information Science, 1972-1995”; ASIST’s Award of Merit, the highest honor for career achievement, in 2004; and the biennial Derek de Solla Price Memorial Medal from the International Society for Scientometrics and Informetrics, for contributions to the quantitative study of science, in 2005. He also developed the AuthorMap system.

Katherine W. McCain, Ph.D., Professor, College of Information Science and Technology, Drexel University
Dr. McCain’s primary research interests focus on aspects of formal and informal communication including bibliometric studies of scholarly literatures and information transfer in the biomedical sciences. Her secondary interests include evaluation of information retrieval systems and diffusion of innovation. Her major teaching areas comprise resources in science and technology, content representation, and scholarly communication. She also teaches the introductory Ph.D. course, Topics in Information Science. Her background is in the biological sciences, and her professional experience includes management of a biology library.

Tutorial 5: Practical Digital Library Interoperability Standards

Ian Witten


As the field of digital libraries matures and new systems and standards develop, the ability to interoperate between systems becomes paramount. This tutorial gives a practical introduction to many recent standards and de facto standards for interoperability, and illustrates them using open source digital library software—including online demonstrations of interoperation issues and solutions. Core standards that are discussed include Dublin Core, OAI-PMH, METS, and MODS. We use interoperation between Greenstone and DSpace as a motivating case study.

Target Audience

The tutorial is designed for those who want to learn about digital library standards and interoperability in the context of actual digital library software and digital library collections. Interoperability issues that seem abstract when discussed in isolation become immediate and concrete when set in the context of particular practical problems. The tutorial is intended for digital library students, researchers, and practitioners who are interested in practical issues of interoperability. It will also be useful for those seeking to further their knowledge of what existing open source digital library software can do and how to work with it.

Tutorial 6: Introduction to XQuery

David Durand


This tutorial presents the XQuery XML query language defined by the W3C. The goals of the tutorial are to present an introduction to the structure and features of XQuery, along with lots of examples and demonstrations. Students will gain an overview of this new XML processing and management tool, but will also learn enough practical information to be able to use XQuery to create XML retrieval and manipulation tools and interfaces. While we will look at a few different XQuery processors, we will use the popular open-source eXist XML database, because it is freely accessible and because its integrated HTTP server allows the creation of complete applications using only XQuery.

Target Audience

The level of this course is Intermediate. Students should have some prior familiarity with XML and the notion of text markup. XQuery contains a complete programming language for complex queries, but extensive programming experience is not required: simple queries with no advanced features can accomplish a lot.


David Durand has been working with markup systems since the birth of SGML in the 1980s. He served on the W3C XML and XLink working groups, and has contributed to the Text Encoding Initiative’s work on schema languages and hypertext markup. He is CEO of Tizra Inc. and is an Adjunct Associate Professor at Brown University, where he teaches a course on Document Engineering. With Steven J. DeRose, he is co-author of Making Hypermedia Work: A User’s Guide to HyTime.