E-Journals Report

REPORT ON THE NISO/NFAIS Workshop:

Electronic Journals -- Best Practices

Held February 20, 2000

Philadelphia, PA

Details on the Electronic Journals Workshop

Report prepared by:

Priscilla Caplan (Florida Center for Library Automation and
Chair of the NISO Standards Development Committee)

CONTENTS:

Background and Introduction

Setting the Stage

Four Views on E-Publishing

Commercial primary publisher

Non-profit society publisher

Secondary publisher

Library community

Recommendations from the Workshop Breakout Groups

Editorial standards

Archiving issues

Reference linking

Technical standards

Conclusions and Wrap-up

BACKGROUND

The National Information Standards Organization (NISO) and the National Federation of Abstracting and Indexing Societies (NFAIS) cosponsored a one-day workshop to examine best practices for electronic journals. The Workshop was held as a preconference to the NFAIS Annual Conference 2000 in Philadelphia, PA. It was attended by primary and secondary publishers, representatives of scholarly societies, librarians, and various vendors serving these communities.

The workshop began with an Introduction and an Overview to set the scene for the rest of the meeting. Four presentations focused on the challenges of electronic publishing from the viewpoint of the primary publisher, the non-profit/scholarly society publisher, the secondary publisher, and the library community. Workshop participants met in four discussion groups to discuss issues related to: editorial standards, archiving, linking, and technical standards. The participants reconvened to summarize their conclusions and recommendations.

INTRODUCTION

Pat Harris, Executive Director of NISO, welcomed meeting attendees on behalf of NISO and NFAIS and thanked ISI and RoweCom, the sponsors for the Workshop hospitality. Harris noted the diversity of interests represented and posed the main question: are we all dealing with common problems to which we can work out some common solutions? She noted that the very first NISO standard, Z39.1, addressed the format and arrangement of printed periodicals, and expressed optimism that best practices could also be established in the electronic arena to everyone's benefit.

SETTING THE STAGE

Emily Fayen (RoweCom, Inc., and member of the NISO Standards Development Committee) defined the major stakeholders and the issues related to electronic journal publishing.

Interested parties, Fayen noted, include publishers, authors, scholarly societies, researchers and other users, abstracting and indexing (A&I) services, aggregators, libraries and library consortia. Major issues affecting them all include:

While there is overlap in interest among the stakeholder groups, some concerns are on the forefront:

Fayen concluded that many different groups must work together. While each of these groups has its own priorities, needs and goals, many common interests and interdependencies exist. Fayen urged the Workshop participants to consider how NISO can help, and what short term and longer-term actions and initiatives they might recommend as steps in addressing this set of issues.

Presentations: Four Views on E-Publishing

COMMERCIAL PRIMARY PUBLISHER

Presenter: Howard Ratner, Director, Electronic Publishing and Production, Springer-Verlag New York, Inc., New York, NY

Epublishing has allowed Springer-Verlag to increase page count and frequency. Last year Springer introduced "Online First", in which articles appear online before they are issued in print. This differs from a preprint service, as preprints are traditionally issued before peer review. Online First articles are peer reviewed and include final edits. Ratner described the variety of questions and problems that have emerged in launching this new product. One question that emerged early on was how authors should cite articles, since volume, issue and page numbers are not yet assigned. Springer's answer was to put a visible Digital Object Identifier (DOI), which is retained even after print publication, on every article.

The "published" content of journals issued both electronically and in print can be problematic. PDF versions duplicate the print version, but HTML versions enhance it, containing links to author information, reference links, "cited by" links, and perhaps links to other supplementary materials. This raises the question of what the boundaries of an article are for archiving purposes.

Some Springer print journals have "online only" sections. Ratner described a Springer journal that features research articles and case reports, with the case reports published only in the online version. A&I services did not index the online case reports, so Springer now includes abstracts of case reports in the print version. Ratner noted that "online only" sections of print journals may also include editorials, reviews, discussions and product listings. Must this non-research content be archived? How will users know which content is included in which version?

This leads to a related question: what part of print content should be duplicated online? Originally, only research content was included in Springer's online service, LINK. Now LINK has been expanded to encompass "cover-to-cover" coverage. These examples point out the dependencies between primary and secondary publishers.

Ratner emphasized that online content is more dynamic than print. Publications can be updated piecemeal, "chunk-by-chunk" (not just chapter-by-chapter), like a loose-leaf publication. This new flexibility, however, complicates archiving and raises the problem of identifying a particular version for citation purposes.

Various options exist for archiving, including retaining print, issuing an annual CD-ROM, creating a static online archive such as JSTOR, and relying on services such as OCLC or PubMed Central. Ratner cautioned that publisher commitments vary and, in general, are not entirely trusted by librarians.

Linking to related data in all forms has become centrally important. This includes not only reference linking to cited articles, but also links to databases of scientific data, author web pages and biographies, patent information, and product information. Ratner noted that everybody wants to link to everybody else, but this cannot realistically be done with bilateral contracts. This has led to the emergence of central facilities such as CrossRef (a multi-publisher reference linking service).
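Ratner's point about bilateral contracts can be illustrated with simple arithmetic. The sketch below is not taken from the presentation; it assumes that every pair of publishers would need its own agreement under the bilateral model, and that each publisher needs only one agreement with a central facility. The function names are hypothetical.

```python
def bilateral_agreements(publishers: int) -> int:
    """Agreements needed if every pair of publishers signs its own contract."""
    return publishers * (publishers - 1) // 2


def central_agreements(publishers: int) -> int:
    """Agreements needed if each publisher signs once with a central facility."""
    return publishers


if __name__ == "__main__":
    # Compare how the two models grow as more publishers take part.
    for count in (5, 20, 100):
        print(count, bilateral_agreements(count), central_agreements(count))
```

Under these assumptions the bilateral model grows with the square of the number of publishers, while the central model grows linearly.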

Ratner closed by urging the audience to continue to learn, to listen to both customers and competitors, not to be afraid of change, and to look for ways to cooperate.

NON-PROFIT SOCIETY PUBLISHER

Presenter: John Ewing, Executive Director, American Mathematical Society, Providence, RI

The American Mathematical Society (AMS) has about 30,000 members (one-third international) and 230 employees. Ewing noted that the AMS focuses on research rather than education. The AMS publishes more than 100 book titles per year, including Math Reviews, the online version of which is known as MathSciNet. This database indexes and reviews the mathematical literature. It contains 1.5 million reviews from 1940 onward and adds about 65,000 new items every year.

AMS is a small publisher, with four journals published in both paper and electronic versions, three electronic-only journals, and two member journals. Subscriptions to AMS journals are almost exclusively institutional.

Ewing noted that while the AMS has in many ways been in the forefront of electronic publishing, the mathematics research community values print and the historical record more than immediacy.

In the 1980s AMS invested heavily in linking electronic services such as gophers and listservs, which were made superfluous by the rise of the web. AMS also invested a huge amount in the development of TeX (a math-specific markup language) and mathematical fonts, which have been placed in the public domain. Ewing cautioned that although we like to think we can predict the future more accurately today, this remains to be seen.

In 1994 AMS leadership articulated a vision that included putting existing print journals and Math Reviews online immediately, and developing a line of approximately forty electronic-only journals. In reality, online versions of the print journals were developed over the next couple of years and MathSciNet was launched, which was an instant success. Only three electronic-only titles were published. These titles attracted so few subscriptions they are now given away for free as a benefit of membership. Ewing noted that today less than one-half of one percent of the mathematics literature appears in online-only journals.

Most articles are contributed in LaTeX or some variation such as plain TeX or AMS-TeX; everything is converted to a LaTeX structure. In the 1990s, the electronic versions were made available in a variety of formats, including TeX, DVI, PDF, PostScript and HTML. HTML was particularly difficult, as it could not represent mathematical fonts and all math had to be converted to embedded GIF files for display. In analyzing use statistics the AMS found that very few people were reading papers online, and that the main requirement was a format that printed well. As a result, as of January 2000, HTML has been discontinued and linked PDF is being emphasized. All papers have associated HTML files containing selected information such as the abstract and references.

AMS experimented with "online first" and encountered the problem of how to cite articles before issues and page numbers are assigned. Currently articles are posted with Publisher Item Identifiers (PIIs) for use in pre-pagination citations. AMS is moving toward use of the DOI. Timeliness, however, is not as important in mathematics as in other fields, and the confusion and problems associated with online first publication are not worth it.

Linked references in published articles do not link directly to the full text of the cited paper, but to the record for the cited article in Math Reviews. The entry in Math Reviews links in turn to the full text of the article. This is helpful when the cited paper is not yet available online: readers can get some information by following the link, and when the online version appears the link will be there and no retrospective updating will be needed. Ewing noted that this only works for papers in Math Reviews, and is a lot of work for the AMS staff.
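The indirection described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of the AMS system: the class names, the record identifier, and the URL are all hypothetical. The point it shows is that a citing article stores only the identifier of the review record, so adding the full-text location later requires a change in one place.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ReviewRecord:
    """Hypothetical entry in a review database for one cited paper."""
    record_id: str
    citation: str
    full_text_url: Optional[str] = None  # unknown until the paper goes online


class ReviewIndex:
    """Hypothetical lookup table that citing articles link into."""

    def __init__(self) -> None:
        self._records: Dict[str, ReviewRecord] = {}

    def add(self, record: ReviewRecord) -> None:
        self._records[record.record_id] = record

    def set_full_text(self, record_id: str, url: str) -> None:
        # Only the review record changes; citing articles are untouched.
        self._records[record_id].full_text_url = url

    def follow(self, record_id: str) -> str:
        """Return what a reader sees after following a reference link."""
        record = self._records[record_id]
        if record.full_text_url is None:
            return f"{record.citation} (full text not yet online)"
        return f"{record.citation} -> {record.full_text_url}"


if __name__ == "__main__":
    index = ReviewIndex()
    index.add(ReviewRecord("REC-0001", "A. Author, An example paper (1999)"))
    print(index.follow("REC-0001"))
    index.set_full_text("REC-0001", "https://example.org/fulltext/0001")
    print(index.follow("REC-0001"))
```

The cost of this design matches the caveat Ewing raised: every cited paper needs a record in the index, and someone has to maintain it.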

AMS believes ownership and access to print publications should extend to the electronic versions. The subscription model works well, Ewing noted, not because the literature is needed now, but because this guarantees it will be available when it is needed in the future. If subscriptions do not guarantee backfile access, they will decline. Backfile access and archiving are of particular importance to the math research community. AMS allocates one percent of the subscription price to an archiving fund.

In terms of archiving, Ewing noted that the main challenge is converting from one format to another. AMS faced this problem in the early 1980s in converting Math Reviews. AMS attempted a machine-conversion of 80,000 pages from STI to TeX, but after many false starts, the data was rekeyed.

Ewing noted that electronic publishing provides more efficient document delivery to users, may reduce costs eventually, and in the end provides a better product. However, Ewing concluded that the real value of electronic publishing is still unknown. Illustrating how difficult it is for experts to predict the future, Ewing noted that in 1922 Thomas Edison boldly predicted that the motion picture would replace the textbook in the educational system. Scholarly publishing is a delicate ecosystem, Ewing cautioned, and delicate ecosystems need protection. Even good technology can have bad consequences unexpected by the experts. The goal of scholarly publishers is to protect the literature for future generations; this is particularly true of math, where the literature is very much for the future, not just the present. Therefore participating in and guiding change is an essential responsibility of the scholarly societies.

SECONDARY PUBLISHER

Presenter: Helen Atkins, Director, Database Development, Institute for Scientific Information (ISI), Philadelphia

Secondary publishers index journal articles for the purpose of helping users identify what has been published and gain access to articles that they are interested in. In the days of all print, users "wrote and walked" from print indexes to print journals. In the future, indexes and articles will all be electronic with materials linked by actionable identifiers. At present we are in a transition period and need to take care to create a future that works for everyone.

Users do not want citations to consist of identifiers alone; they need to know the author, when and where an article was published, and what journal it is a part of. Atkins pointed out that identifiers are good for systems but bad for people, and that both identifiers and bibliographic description are required. Serials information that A&I services capture includes journal title, title abbreviation(s), publisher, ISSN, CODEN, volume and issue numbering, cover date, and page ranges for articles. Atkins asked: Which of these will be available for ejournals? Which are important for identifying an article uniquely?

When ISI's flagship product was the print Current Contents, ISI informed publishers of the format requirements for the table of contents page so it could be easily indexed. The ISI requirements did not address bibliographic content; rather, the ISI "suggestions" addressed presentation values, such as using black print on a white background. Today, ISI is less presumptuous about dictating to publishers, but it does have some requests based on experience and on the shared interest of both primary and secondary publishers in getting information to users quickly and accurately. In this spirit, Atkins suggested that publishing practices affecting Content, Structure and Description be:

Focusing on Content, Atkins suggested that publishers:

Focusing on Structure (publication pattern), Atkins recommended that publishers:

Focusing on Description, Atkins suggested:

LIBRARY COMMUNITY

Presenter: Regina Reynolds, Head, National Serials Data Program (NSDP), Library of Congress, Washington, DC

Alice in Wonderland is in many ways an appropriate theme for electronic publishing. As in Wonderland, ejournals have many familiar characteristics, but without warning they may behave in strange ways.

Reynolds noted that the challenges for libraries include:

Reynolds observed that publishers are experimenting with a new medium and need the freedom to try new approaches and models. However, librarians, she noted, need some predictability and stability. Each community must work with the other. Publishers must keep librarians informed, give the same information on print and electronic versions, and minimize change for the sake of change. Librarians must realize ejournals are in a state of transition, and expect experimentation and change. Neither side should forget the user.

The traditional bibliographic rules which governed the print world have not responded rapidly enough to the electronic environment. For example, a serial must be published in designated issues with a number or a date. Reynolds suggested that this definition must be reconsidered in light of the many electronic journals that are published article by article.

A related question is determining when different formats of an ejournal are the same serial publication; for example the PDF and print versions of a journal may be equivalent, while an HTML version is different.

Reynolds suggested the need to harmonize the library community's rules for bibliographic description. Reynolds noted that the International Standard Bibliographic Description-Serials (ISBD-S) proposes a new umbrella category called "continuing resource" which covers both serials published in volumes and issues, and "integrating resources" that change and add new material over time. The ISBD rules must be harmonized with the library cataloging rules. In addition, the rules for assigning ISSNs must be harmonized with library cataloging.

Libraries now must decide whether or not to catalog electronic journals. There are good arguments for a library's catalog to provide complete, one-stop shopping for all materials. Weighing against this is the expense, difficulty and volume of cataloging required. In the end it may be that user expectations do not justify this investment.

Aggregators are a relatively new factor that introduces both positive and negative effects. Aggregators vastly simplify licensing and access to content. They can also bring their own aggravations, including overlapping content, frequent changes in content, and unpublicized changes in content. Managing aggregations, such as providing notification when the service is down and providing access to titles from the catalog, is a challenge in its own right.

Some libraries list ejournals on their web pages (portal pages) as an alternative to cataloging them. However, this is cumbersome when dealing with large numbers of titles, and does not provide good subject access. Others cut corners by not following the rule that requires a separate cataloging record for the electronic version, and instead link to the ejournal from the record for the print journal. This is unsatisfactory when the electronic and print versions differ substantially, or when the title is available through one or more aggregator services.

Some solutions are being tried. OCLC's CORC (Cooperative Online Resource Cataloging, http://www.oclc.org/oclc/research/projects/corc/) project holds some potential for libraries to be able to create cataloging records for electronic resources more economically. Embedded metadata may also hold some promise. Some libraries have begun embedding metadata in HTML headers for their web pages so that software like CORC can use it to build catalog records automatically. Something similar might be possible if NSDP supplied descriptive metadata to publishers applying for ISSNs, and publishers included this on the home pages of their ejournals.
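To make the embedded-metadata idea concrete, the following minimal sketch reads name/content pairs from meta tags in an HTML page using Python's standard html.parser module. It is not CORC's actual mechanism, which the report does not describe. The element names in the sample page (such as "DC.title"), the journal title, the publisher name, and the placeholder ISSN are assumptions chosen for illustration.

```python
from html.parser import HTMLParser
from typing import Dict


class MetaTagCollector(HTMLParser):
    """Collects name/content pairs from <meta> tags in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.metadata: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Skip meta tags that carry no descriptive name/content pair.
        if name and content:
            self.metadata[name] = content


def extract_metadata(html_text: str) -> Dict[str, str]:
    """Return the descriptive metadata embedded in a page's meta tags."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    return collector.metadata


# Hypothetical ejournal home page; all values are placeholders.
SAMPLE_PAGE = """
<html><head>
<meta charset="utf-8">
<meta name="DC.title" content="Journal of Hypothetical Studies">
<meta name="DC.publisher" content="Example Society Press">
<meta name="DC.identifier" content="ISSN 0000-0000">
<title>Journal of Hypothetical Studies</title>
</head><body></body></html>
"""

if __name__ == "__main__":
    for name, content in extract_metadata(SAMPLE_PAGE).items():
        print(f"{name}: {content}")
```

A cataloging tool could map such pairs onto the fields of a draft catalog record, leaving a cataloger to review rather than key the description.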

Reynolds emphasized that presentation is extremely important to libraries.

An informal, unscientific survey of serials catalogers uncovered agreement on five top problems with ejournals:

Other aggravations mentioned by survey respondents included frequent changes to URLs, no notification of URL changes, and long and cumbersome URLs. Aggregators were criticized for not providing coverage information, being unclear whether full text or abstracts were provided, and giving URLs to a homepage for the aggregator service instead of directly to the journal title itself. Title presentation problems include difficulty finding a title at all, multiple versions of the title, and using a different title on the print and electronic versions. Reynolds stressed that the library community's cataloging rules depend on having certain pieces of bibliographic data and she urged publishers to make an effort to include this information for epublications.

Reynolds noted that an early, and still controversial, decision of the NSDP was to assign different ISSNs to print and electronic versions of the same journal. The online ISSN database, which is available by subscription, shows all ISSNs for a title. Common ISSN problems include ejournals displaying no ISSN, or the ejournal using the ISSN of the printed version. Publishers are urged to display the ISSNs for both versions on both the print and online publications.
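Some of the ISSN problems listed above can be caught mechanically. The sketch below is an illustration, not something presented at the Workshop. It uses the standard ISSN check-digit rule (the first seven digits are weighted 8 down to 2 and the sum is taken modulo 11, with a check value of 10 written as X) and a hypothetical helper that flags a title whose print and electronic versions display the same ISSN.

```python
import re

ISSN_PATTERN = re.compile(r"^(\d{4})-(\d{3})([\dX])$")


def is_valid_issn(issn: str) -> bool:
    """Check the NNNN-NNNC form and the modulo-11 check digit of an ISSN."""
    match = ISSN_PATTERN.match(issn.strip().upper())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weights run from 8 for the first digit down to 2 for the seventh.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check_value = (11 - total % 11) % 11
    expected = "X" if check_value == 10 else str(check_value)
    return match.group(3) == expected


def has_distinct_issns(print_issn: str, electronic_issn: str) -> bool:
    """Hypothetical helper: both ISSNs are valid and differ from each other."""
    return (
        is_valid_issn(print_issn)
        and is_valid_issn(electronic_issn)
        and print_issn.strip().upper() != electronic_issn.strip().upper()
    )


if __name__ == "__main__":
    print(is_valid_issn("0378-5955"))
    print(has_distinct_issns("0378-5955", "0378-5955"))
```

A check like this cannot tell whether an ISSN belongs to the title that displays it; that still requires a lookup in the ISSN database.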

Reynolds summarized the librarians' "Wish List to Publishers":

The "Librarians' Wish List for Aggregators":

Reynolds noted that the Yale Medical School has mounted a prototype service called "Search Jake" (Jointly Administered Knowledge Environment, http://jake.med.yale.edu) which shows, for every journal in the database, all places where that journal is abstracted and indexed and where it is available in full text. The Jake database includes over 21,000 journals. Such a service covering all major journals would be invaluable to librarians. Reynolds suggested that building and maintaining such a database might be initiated nationally, along the lines of the CONSER project.

Reynolds concluded that this is the right time to consider a standard or set of guidelines addressing the presentation of ejournals. Defining best practices would guide new ejournal publishers on "how to do it better" and help established publishers provide reliable and predictable information to secondary publishers and librarians, their business partners. NSDP distributes a pamphlet titled "You Name It" to every publisher applying for a pre-publication ISSN. It explains good practices and points to applicable standards. It would be very helpful to have a set of best practices for presentation of ejournals to point to.

RECOMMENDATIONS FROM THE BREAKOUT GROUPS

EDITORIAL STANDARDS: CONTENT AND BIBLIOGRAPHIC ISSUES

Group leader: Helen Atkins

ARCHIVING ISSUES

Group leader: Kathy Klemperer (Harrassowitz)

REFERENCE LINKING

Group leader: Howard Ratner

TECHNICAL STANDARDS: PRESENTATION/FORMAT ISSUES

Group leader: Emily Fayen

CONCLUSIONS AND WRAP-UP

Emily Fayen summarized the main points of consensus:

It was noted that this Workshop did not cover many of the issues relevant to electronic publishing, including access control, rights management, licensing and contracts, and privacy of users.

Possible follow-on steps were suggested, including a re-convening of the workshop group at some future time and the publication of a glossary of terms related to electronic journal publishing. Workshop attendees were thanked for their participation and creative work. The meeting planners will meet in March to discuss the agenda that emerged from the Workshop and to recommend action steps.


Copyright 2000 National Information Standards Organization