Collection development in the era of virtual libraries
Nancy K. Roderer
Director, Cushing/Whitney Medical Library
Yale University, U.S.A.

I am of two minds when it comes to the virtual library. Most of the time, since I am an optimistic person, I think that the virtual medical library exists; we certainly have a good deal of one at Yale University and are moving rapidly toward a comprehensive virtual library for some categories of library users. But more of that in a moment.

On my pessimistic days I am reminded of how long the idea of the virtual library, that is, the library that comes to the user, has been in existence and how little has been accomplished. An early idea that had much in common with the virtual library is the radio doctor, the doctor who came to you via the radio. This idea, shown in Figure 1, was featured in Radio News in 1924 (1). More of you have probably heard about Vannevar Bush’s MEMEX, the "device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility" (2, see Figure 2). Bush’s article on MEMEX was published in 1945. In the 1970s and 1980s, as computers became more commonplace and began to be used for creating articles and communicating with others, a series of research projects looked at the possibility of journals moving from paper to computer form and considered how that might happen. It was frequently said, in that era, that we were poised on the brink of a technological revolution in scientific communication. We have been poised on that brink for some time!

But, as I alluded to earlier, we have made some progress. To make my case I want to describe the electronic library services provided to the users of the Yale Medical Library. I chose this because I am most familiar with it; we are not the most advanced medical library in electronic terms, but we have made considerable progress. Figure 3 shows the Medical Library’s home page, the doorway to the vast array of resources we provide to our users (3). I will focus my comments here primarily on journal articles, which account for the great majority of library use.

I date the beginnings of the digital library to the first days of widespread use of MEDLINE outside of the physical library. At Yale, we began with a system called MiniMEDLINE, about 1986, and in 1993 moved over to OVID MEDLINE (4). I count this as the beginning for two reasons. First, users of computerized versions of MEDLINE were able to get not only citations but, from the abstracts, at least some specific information relevant to their needs. Second, and more importantly, the widespread use of MEDLINE from University offices, from the hospital and from home accustomed users to the idea of getting more information from those locations, and they quickly began to ask when they could have journals electronically as well. MEDLINE use has grown by leaps and bounds since its introduction. At Yale last year 95% of the faculty and 100% of the students used MEDLINE. There were just under 200,000 searches done, and over 81% of those searches were done outside of the library. (NLM has data on this phenomenon at a national level.) There is little or no use of the printed Index Medicus.

Yale’s first substantial introduction of electronic journals was last year, when we made OVID’s core biomedical collection available. This has been followed by several additional OVID collections of journals and by the provision of over a hundred Academic Press titles; we also provide access to a large number of medical journals that are available individually over the world wide web. At this point about 300 of our total collection of about 1700 journal titles are available via the computer, and our projections for next year suggest that we will add at least 300 more. The indication from our users is that we have now reached a critical mass of available material; that is, many will consider doing most of their library work from their home or office, with only the occasional visit to the physical library. Certainly at the rate we are going it will not be long before we have the vast bulk of the current journals available electronically.

Perhaps the clearest lesson from our experiences to date is the user response, which has been unequivocally in support of electronic journals. You can see a few of our happy users in Figure 4. The positive response to electronic journals holds for students, faculty, and staff; skilled and naive computer users; and young and old alike. With easier-to-use systems and encouragement from their colleagues, even the previously computer-phobic in the medical center have learned to use MEDLINE and other databases. And once they learn, it is no surprise that they find searching from their office, the hospital and home -- at the point that they need information -- more convenient than coming to the library.

Greater convenience leads to more use. We know that there is significantly more use of MEDLINE than there was previously of Index Medicus and earlier versions of MEDLINE, and we suspect that there is more use of articles as well. Users tell us that they are better informed and that they can now work more efficiently and effectively. The combination of increased use and use from outside of the library has changed the demand for reference service -- a smaller proportion of users ask for assistance, but because there is more use we have about the same number of reference questions as before. Generally, users’ questions are more sophisticated and more difficult to answer, and require more in-depth technical or subject expertise. We also spend much more time than we did previously providing instruction to users, and that instruction is more likely to take place outside of the library than in it.

These experiences at the Yale Medical Library over the last five years have allowed us to get a much better sense of how a virtual library might work, and to try out various ways of dealing with the interesting and sometimes unexpected effects of the virtual library. But let me stop my description of what is happening at Yale and take a large leap forward to the virtual medical library of the future. What I want to describe for you is my personal scenario for the future, and then talk a bit about the challenges and opportunities that it brings.

For me, to talk in any depth about a virtual medical library requires reminding myself, and you as well, of the environment within which journals are created for distribution and use. There are many activities performed between generation of research results recorded in articles and the use of those articles by other researchers (5). Consider the four large and diverse groups of participants in this system - scientists as both authors and readers, publishers, abstracting and indexing services, and libraries. There are many interactions among the activities and the participants in this whole system, and so when we speak about change we must consider the implications for all parts of the system.

So now let's imagine ourselves in the system of the future, say 5-10 years out. To me it seems most likely that the scientific communication system will rely predominantly on electronic materials for not only identification of the works of authors but also for those works themselves. To give you a sense of what I think this system will be like, let me ask and answer a series of questions designed to get at what I feel are particularly important features of this future system. What I will tell you is my own individual point of view, presented less as a prediction than as a stimulant to each of you thinking about that future system. The point I am working up to here is that we -- librarians -- can have opportunities to influence how the system develops and how useful it is to our patrons.

The first question is how will information be packaged? In some ways I don't expect that information will come in radically different packages 5-10 years from now. I imagine that there will still be articles, that is, reports of research made at key points in research projects. I imagine that there will still be books, pulling together a larger body of work by one or more researchers into a review of a topic. At the same time, however, the qualities of electronic communication will permit, and are already leading to, several changes in packaging.

The first change stems from the fact that print on paper is static, and, once distributed, not readily changed. Paper distribution also limits the size of the articles distributed. Electronic versions of articles are easily changed, however, and authors are already beginning to make their work available electronically and then continue to update it as they proceed with their research. The extension of this idea might be a new and longer form of communication that reflects one person's, or one research team's, body of work on a particular and fairly specific topic, created over a period of time. Such an article -- or perhaps we will call it a superarticle -- would be annotated so that you could trace the development of the work over time. It might also -- and this too is already being done in some areas -- allow for comments and annotation by others.

I don't believe that these new articles will be packaged into journals. While the journal is a useful means of grouping together related articles, the same job can be done as readily, or more readily, by automated systems -- the abstracting and indexing services and search engines of today -- that allow us to search much more precisely for the sort of information we need. And, while journals have also served as a means of validating information -- that is, if it is published in a reputable journal with peer review, it is a good article -- that same peer review can take place as effectively, and probably more efficiently, in an electronic environment.

How will users get to the information that they need, that is, what will be the analog to the catalogs and indexes of today? When electronic publications were first discussed, it was often imagined that having the text available for analysis would reduce the need for indexing and other forms of description of articles. In fact, it seems to me that as we deal with larger and larger bodies of electronic materials our needs for descriptions -- metadata -- have increased significantly. The continuing growth in electronic materials also stretches our traditional methods of searching, and I believe that more sophisticated methods of retrieval -- such as those being worked on by NLM in their UMLS project -- will be incorporated into the systems of the future.

How will users judge quality? Currently, many readers look at the publisher and the review process as a general indicator of quality; at the individual article level they look at the author and his or her institution. All of these factors help the reader to choose one article over another for reading, with the final judgement of quality being the reader's own assessment of the contents of the article. In the future, I imagine, readers will continue to rely on publisher and author information. The review system will be modified to include both prescribed and ad hoc reviews -- initially perhaps a peer review process designed by the professional societies and incorporating some level of rating. That might be followed by ad hoc methods of providing for comments to be appended to articles. Another indicator of possible interest to readers would be levels of use of a particular article, easily generated in the electronic environment. I also imagine that readers will make more use of electronic versions of citation indexes to trace the development of research in a particular area.

How will users keep up to date? I imagine that users will continue to use the traditional means of learning about new research results -- hearing from other users, browsing, and selective dissemination of information publications and services such as Current Contents. Added to these, however, will be greater use of personalized selective dissemination of information services -- something like the individualized newspapers available today, where you can indicate what types of information you are interested in, the frequency with which you receive it, and the format in which it is displayed.

But so far all we have talked about is the user perspective. What about the other participants in the communication system, who too will be likely to experience substantial change? Publishers are major players in the medical communication system. What role will they play? While this is a subject worthy of extended discussion, let me just say that I envision that publishers will continue to be involved in the identification of publishable material, in editing, in reviewing, and in delivering both advertising and articles to users -- a major difference will be that the delivery will be electronic rather than on paper. What seems likely, however, is that over time there will be less of this work done by commercial publishers, and more by scientific societies and universities.

What role will vendors play? I have in mind two groups here. I imagine that today’s producers of abstracting and indexing services and databases will still have a role to play in describing publications -- today’s retrieval problem is bigger than ever, and I believe that controlled vocabularies and other means of systematically describing publications will become increasingly important.

There is a second type of vendor, the subscription agent that assists libraries in purchasing subscriptions. In my scenario this group continues to be involved in managing financial transactions, however those are made. A number of vendors today are also involved in providing search interfaces to publications, as are individual publishers and some database vendors. It is difficult to see which of these will provide the interfaces of the future, but I suspect it will not be the individual publishers.

I’ve suggested earlier some of the changes that are taking place in my library -- a continued role in supporting the user and an increased instructional role. In my scenario libraries will also continue to acquire materials, in the sense of selecting materials of particular interest and arranging to pay for their users' use of those articles. I also anticipate that libraries will be heavily involved in the development of interfaces tailored to their individual institutions -- something that is designed to meet the needs of their user population and to point them to the particular publications that the library has "selected". Many libraries do this today with their web pages, and I imagine that this trend will continue.

One last question in this brief overview of the communication system of the future as I see it -- and that addresses the important issue of how we will pay for publications. Today the dominant means of paying for publications is annual subscriptions to journals. I believe this will change to per-use costs, that is, a charge for each unit of information accessed. A variation on this might be prepaying for a certain number of uses. This is a big change, for publishers and libraries both, but I believe that it is both inevitable as we get away from the packaging of articles into journals, and also a more equitable method of payment. A major obstacle in the way of implementation of per-use costs is insufficient knowledge of the level of use of individual articles, but today’s systems are improving our ability to make these counts.

So the overall scenario I see for medical and all scientific communication is one of more flexible units of information, produced increasingly by associations and universities, enhanced by standardized descriptions, and retrievable via increasingly sophisticated interfaces. I’d like to turn now to describing some of the major barriers that I see to this scenario and finally to the opportunities that it provides for libraries.

It is not difficult to see a large number of obstacles in the path of even the relatively conservative scenario I have outlined. The topics listed here are the ones that I think are most significant -- you could certainly add many others. The good news is that there is much work being done on all of them.

Interface variety means that there are currently many steps, involving a number of different systems, that stand between a user desiring some information and the ultimate retrieval of that information. The world wide web has provided an enormous leap forward, making it possible to use many systems in sequence with some consistency of interface. At the same time, we have definitely not reached the point where finding an electronic article is as easy as going to the right place in the stacks and retrieving the appropriate journal. Particularly difficult at this point is the step of going from the identification of a particular article and journal to its retrieval, because of the many interfaces we use to provide access to articles. Systems like OVID and PubMed suggest a good way to go -- direct links from the citation and abstract to the full text of the article. Getting to that point for all articles will not be easy, however, and both technical and organizational issues stand in the way.

Today’s systems have a number of approaches to determining whether or not the user of the system is authorized to use it, that is, a number of different authentication methods. Some systems work on the basis of individual authorization, requiring that the system know who all of the permissible users are. Others work on the basis of the computer from which the request comes, creating difficulties when an authorized user wants to use an unauthorized computer. It seems that the goal should be that any authorized user can readily access publications from any computer.

The need for an economical storage system is a large obstacle. While the cost of computer storage continues to decrease, the size of the literature increases at the same time, and certainly our ultimate goal is to have a large body of literature developed over a long period of time available to users via their computers. Particularly difficult here is the issue of older material, generally used to a lesser extent than current literature but still valuable. A number of ongoing projects address the storage and retrieval of older material, including the University of Michigan’s JSTOR.

All of the participants in the current scientific communication system have investments in the equipment and processes used to create it, and moving to a new system requires a good deal of development and new levels of investment. Publishers and vendors also have a considerable interest in a continuing income stream. It is not surprising that the early experimenters in electronic communication were societies and associations, and that commercial publishers took more of a wait-and-see attitude. We are now at the stage, however, where electronic publishing seems inevitable, and publishers and vendors have a need to participate in determining the future. At the same time, the paper system continues in much the same way as it has been, and that too must be maintained. All of this requires considerable investment and creates considerable anxiety among all participants.

I mentioned earlier that I believe we will ultimately pay for each use of publications. This requires that we know what the volume of use is. Participants in the scholarly communication system also need to know about the volume of use of materials in order to make decisions about the equipment used at each stage to store and access material, and about the telecommunications methods that will be used. Traditionally we have known relatively little about the amount of use of journals, but we are beginning to know more.

It’s impossible to talk about the scholarly communication system without at least mentioning the role of publication in the tenure system. Systems of tenure, that is, the way in which faculty are promoted, are heavily dependent on their research and writing as reflected in their publications. Changing the unit of publication and the system of journals in which they are published requires that the tenure system adapt to the new system of electronic communications -- certainly not impossible, but a major, and difficult, change both operationally and sociologically.

Finally, the current system of copyright, which is more or less silent in the area of electronic publication, is a significant barrier to the scenario I have outlined. Here, as in other areas, the good news is that there has been considerable discussion of how copyright might and should work in the electronic environment. New legislation is being created, but it is easy to imagine that there will be a need for continuing changes in copyright legislation and guidelines as we evolve a new communication system.

As I review these barriers, I note that there are many standards issues underlying them -- needs for greater standardization of interfaces, descriptions of publications, and vocabularies come quickly to mind. Also required is a new level of interaction among the various participants in the communications system, so that what is created serves the needs of all. Many individuals and organizations are working in these areas, notably in the United States the Association of Research Libraries, the Coalition for Networked Information, and the National Digital Library Federation, which address these issues primarily from the library perspective. Ultimately, though, the degree of change that is anticipated will involve all organizations and each of us and our users individually. It is appropriate for us to be well informed about new developments and trends and to focus in on the areas in which we can best influence the development of systems that will better serve our library users.

Which leads me to my final topic area, the description of a few areas in which I think we have the greatest opportunity to contribute to the development of the electronic library at this point in time. I have identified four of these, with each requiring significant work.

The first opportunity I want to call your attention to is experimentation by individuals and groups of libraries. We can all certainly participate here, trying out new products and services in our libraries. This is the best way to learn what our users want, and to learn what changes will be required in our internal library operations to accommodate the digital collection. Over and over again in the Yale Medical Library we find both anticipated and unanticipated effects as we experiment with new systems, and it is clear that there is a considerable evolutionary path between the systems of yesterday and of tomorrow.

Secondly, I want to call your attention to the need for us to participate in identifying organizational schema and standards. How information is organized is at the very core of librarianship, and we are the group that best understands what information our users want and how to organize that information so that they -- and future generations of users -- can find it. We must be well informed on vocabulary issues, and participate in their translation into the electronic environment.

Next, I recommend that you consider the role of library consortia as agents of change, particularly in dealing with the complex changes that are anticipated in moving towards digital libraries. Libraries have a long tradition of coming together into consortia to share materials, to develop systems, and to have a greater influence on changes that are taking place. Library consortia and other forms of working together with our fellow libraries -- such as in the National Network of Libraries of Medicine and buying cooperatives such as the North East Research Libraries consortium in which Yale participates -- give us many opportunities to magnify our individual voices and steer our course.

Finally, I believe that now is the time for those of us in academic institutions to take on the task of working with our faculty members and administration on publication and tenure issues. There are now sufficient examples of how digital collections will function that it is time that universities seriously consider what role they will play in publishing faculty work and how tenure might change in this future environment.

Both of these are large topics, requiring considerable discussion and ultimately widespread action. What we can do at this point is raise the awareness of faculty and administration members of these issues, provide electronic systems that allow them to understand some of the implications, and encourage experimentation with publication and tenure issues within and among universities.

To conclude this whirlwind tour of the coming digital communication system, let me ask you to recall my earlier description of the effects to date of the virtual library, as seen from the Yale School of Medicine -- happy users getting more of the information they need to do their work. I've seen enough that I am optimistic about the future, and excited about conquering the barriers and taking advantage of the opportunities that I've outlined. I look forward to working with all of my colleagues here, and in medical libraries throughout the world, toward increasingly virtual libraries.

References

1. "The Radio Doctor -- Maybe." Radio News, April 1924, p. 1406.

2. http://www.pronet.net.au/~kerry/memex_pic.htm -- picture of Vannevar Bush’s theoretical Memex machine

3. http://ww.med.yale.edu/library/ -- Yale Medical Library’s home page

http://info.med.yale.edu/computing/research/#monthly -- statistics on electronic library use at the Yale New Haven Medical Center

4. Susan E. Grajek, Majlen Helenius, R. Kenny Marone and Nancy K. Roderer. "The MEDLINE Experience at Yale: 1986-1996." Medical Reference Services Quarterly 16:4:1-18, Winter 1997.

5. Donald W. King, Dennis D. McDonald and Nancy K. Roderer. Scientific Journals in the United States: Their Production, Use, and Economics. Stroudsburg, Pennsylvania: Hutchinson Ross Publishing Company, 1981.