
NISO Virtual Conference: Data Curation - Cultivating Past Research Data for Future Consumption

Wednesday, August 31, 2016
11:00 a.m. - 5:00 p.m. (Eastern Time)

System Requirements:

  • NISO has developed a quick tutorial, How to Participate in a NISO Web Event. Please view the recording, which is an overview of the web conferencing system and will help to answer the most commonly asked questions regarding participating in an online Webex event.
  • You will need a computer for the presentation and Q&A.
  • Audio is available through the computer (broadcast) and by telephone. We recommend that you set up telephone audio as a back-up even if you plan to use the broadcast audio, because voice over Internet isn't always 100% reliable.
  • Please check your system in advance to make sure it meets the Cisco WebEx requirements. It is your responsibility to ensure that your system is properly set up before each webinar begins. 

About the Virtual Conference

Research data is an increasingly important component of communicating science. Responsibility for providing curation support for research data is falling on libraries, repositories, and archives. Supporting research data is no small task: it requires expertise in data management, field-specific metadata structures, integration and sharing issues, and potentially access control, rights management, and privacy concerns. Cultivation and curation of this new form of scientific information at scale is a service that many in the scholarly world expect the library to manage, and one that librarians are well positioned to provide.

This virtual conference will explore the many aspects of data curation, including trusted-repository certification, metadata creation and management specific to data, systems deployment issues, facilitation of data sharing services, and data control issues. Speakers will provide first-hand experience of the unique challenges presented by curating data. The session will close with a panel discussion of future trends in data management and how libraries can prepare now to address them.

NEW! All registrants to this virtual conference will receive a login to the associated Training Thursday on Emerging Tools to Improve Management of Data, to be held on September 8. (Separate registration for the training event only is also available.) If you are unable to attend the Training Thursday live, you can view the recording of the session.

Preliminary Agenda

11:00 a.m. – 11:05 a.m. – Introduction
Todd Carpenter, Executive Director, NISO

11:05am – 11:30am -- Research Data and Services in Academic Libraries – US and Europe

Confirmed Speaker: Suzie Allard, Associate Dean for Research, College of Communication & Information, University of Tennessee - Knoxville

Recognition of the importance of research data has academic institutions seeking the best path for providing support to researchers that will help preserve this intellectual asset. Academic libraries are often the locus for providing research data services (RDS), which include data management planning, digital curation, and metadata creation and conversion. Our empirical investigations in the US in 2011 and 2014 and in Europe in 2016 illustrate the various levels of RDS provided by academic libraries and identify the obstacles to expansion and growth of services.

Suzie Allard is Associate Dean for Research (College of Communication and Information) and Professor (School of Information Sciences) at The University of Tennessee. Her research focuses on how scientists and engineers use and communicate information. Current projects center on science data curation, team science, and science information cybersecurity.

11:30am – 12:00pm -- The Data Lifecycle: Curating Partners to Curate Data

Confirmed Speaker: Jennifer L. Lee, Director of Discovery and Access, University of Texas Libraries

As academic libraries evolve their approaches to managing the lifecycle of data, they must forge and manage relationships with partners, old and new, in order to effectively curate digital resources. The University of Texas Libraries engages partners across campus to develop and sustain the curation of our digital assets, including productive collaborations with various computing resources on campus.

Jennifer Lee is Director of Discovery and Access for The University of Texas Libraries at The University of Texas at Austin. She has a background in preservation and digital curation, previously serving as the Preservation Librarian as well as the Head Librarian for Digital Curation Services for the UT Libraries.

12:00pm – 12:30pm -- How to curate research data: An 8 step guide with incentives to collaborate

Confirmed Speaker: Lisa Johnston, Research Data Management/Curation Lead and Co-Director of the University Digital Conservancy, University of Minnesota - Twin Cities

As reproducibility and data sharing emerge as key issues for academic researchers, the data management services offered by the library must continue to scale. Building from a 2013 pilot (http://hdl.handle.net/11299/162338), the data curation workflows used at the University of Minnesota Libraries have grown into a robust service involving multiple data curation specialists who curate data deposited into our Data Repository for the University of Minnesota (DRUM) and appropriate subject data repositories. Data curation steps, including quality assurance, file integrity checks, documentation review, metadata creation for discoverability, and file transformations into archival formats, are value-added services that enhance digital data for long-term preservation and reuse.

This talk will explore the data curation workflows in place not only at my institution but also at 20+ disciplinary and institutional data repositories such as Dryad, ICPSR, Yale, and the U of New Mexico. These experiences, collected in a new ACRL book due out in September 2016 titled "Curating Research Data: Practical Strategies for Your Digital Repository," span the sequential actions that you might take to curate a dataset, from receiving the data (Step 1) to eventual reuse (Step 8). Putting these key data curation workflows into action at individual institutions is just the first step. We will also discuss a new Sloan-funded project called the Data Curation Network (https://sites.google.com/site/datacurationnetwork/) that aims to create a shared staffing model for providing data curation services across academic institutions, thus allowing our services to scale beyond what a single institution might offer alone.

Lisa R. Johnston is an Associate Librarian at the University of Minnesota, Twin Cities. Johnston serves as the Libraries' Research Data Management/Curation Lead and as Co-Director of the University Digital Conservancy, the University of Minnesota's institutional repository. In 2014 Johnston led the team that developed and launched the Data Repository for the University of Minnesota (DRUM), http://hdl.handle.net/11299/166578. She oversees a team of data curation specialists who curate data accepted into the repository. Johnston has presented internationally on topics of academic library services for research data management, authored research articles on data management topics, and co-edited the book Data Information Literacy: Librarians, Data, and the Education of a New Generation of Researchers (Purdue University Press, eds. Carlson and Johnston, 2015), which details a variety of educational approaches used in data management training for STEM graduate students. Johnston holds a Master of Library Science and a Bachelor of Science in Astrophysics, both from Indiana University, and was certified by the Society of American Archivists as a Digital Curation Specialist.

12:30pm – 1:00pm -- Ethics and Legal Requirements

Confirmed Speaker: Melissa Levine, Lead Copyright Officer and Librarian, University of Michigan

Melissa Levine is the Lead Copyright Officer at The University of Michigan Library, where she provides guidance on all aspects of copyright policy, supports the HathiTrust Digital Library, and advises the library’s Research Data Services on its Deep Blue Data Repository. Melissa has held a wide range of posts in museums and libraries.

1:00pm – 1:45pm -- Lunch

1:45pm - 4:30pm -- Five Case Studies: Tools and Resources

Each case study in this segment of the day's program will cover the following:

  • Rationale for the specific service approach towards data curation
  • Current status of the project/initiative/service
  • Unforeseen challenges in properly curating data in the specific context
  • Accomplishments/Successes

1:45pm - 2:15pm Case Study I: Level Up!: Building data services at the J. Willard Marriott Library

Confirmed Speaker: Rebekah Cummings, Research Data Management Librarian, J. Willard Marriott Library, University of Utah

Research data services have become a common fixture in academic libraries, yet many libraries still struggle to develop an appropriate and in-demand mix of services to support their research community. While an elite few offer seemingly endless curatorial assistance, the majority of libraries are building basic to mid-level services such as DMP support, workshops, and consultations. This case study provides a detailed look at the University of Utah Marriott Library’s data services, the rationale behind our current service model, the results of our campus data needs assessment, and how we plan to grow our technical infrastructure into the future. In addition to an overview of our data service mix, we will look closely at one current initiative, the Entertainment, Arts, and Engineering (EAE) Thesis Preservation Project, which highlights curation challenges such as irregular and proprietary file formats, copyright restrictions, long-term preservation, and a lack of appropriate metadata standards. This presentation will highlight the Marriott Library’s data curation accomplishments to date alongside an honest assessment of ongoing challenges.

Rebekah Cummings is the Research Data Management Librarian at the University of Utah Marriott Library. Prior to working for the Marriott Library, Rebekah was the Assistant Director of the Mountain West Digital Library, one of the inaugural service hubs for DPLA. She received her MLIS from UCLA with a specialization in data curation and her B.A. in Philosophy from California State University, Long Beach.

2:15pm - 2:45pm Case Study II: The Metadata is the Message: Assessing, Curating and Publishing Data for the Humanities

Confirmed Speaker: Ashley Clark, XML Applications Programmer, Digital Scholarship Group, Northeastern University

Humanists and digital humanists may not always be aware that they work with data, but they are particularly conscious of the need to publish the diverse products of their work. When those products are many and complex, publication is not just a matter of displaying them to users, but of providing context and prioritizing findability. Just such a publication was required by the "Cultures of Reception" transcriptions created by the Northeastern University Women Writers Project. Facing inconsistencies in both metadata and data, unforeseen complexities in entity relationships, and a loss of organizational memory, the Women Writers Project staff chose to use data curation techniques, such as metadata rehabilitation, to drive website development and publication. This presentation will sketch out the approach, its institutional prerequisites, its benefits, and its disadvantages.

Ashley M. Clark is the XML Applications Programmer in the Northeastern University Libraries' Digital Scholarship Group. She received her MSLIS from the University of Illinois at Urbana-Champaign, with a specialization in data curation, especially in relation to the humanities.

2:45pm - 3:00pm Break

3:00pm - 3:30pm Case Study III: NYU Data Catalog

Confirmed Speakers: Nicole Contaxis, Data Catalog Coordinator, and Ian Lamb, Project Systems Developer, New York University Health Sciences Library

Sharing research data for reuse and reproducibility has become a growing concern for researchers, particularly as publishers, government bodies, funding organizations, and universities encourage or mandate researchers to share their data. When researchers at the NYU Langone Medical Center (NYULMC) reported that they were having difficulty discovering and navigating licenses for datasets relevant to their work, the NYU Health Sciences Library (NYUHSL) developed the NYU Data Catalog. The NYU Data Catalog increases the visibility and discoverability of datasets created by NYU researchers as well as external datasets relevant to biomedical and public health research. By listing local experts for each dataset, the Library aims to foster collaboration between NYU researchers as they discuss and share different datasets. This presentation will focus on the types of datasets being cataloged, the work associated with outreach and curation, and metadata as a part of data curation.

Ian Lamb (ian.lamb@med.nyu.edu) is a full-stack web developer at the New York University Health Sciences Library, part of the NYU School of Medicine. He is the principal developer of the NYUHSL Data Catalog and is currently working on a bibliometrics dashboard for use at the institution. He focuses on building friendly and usable systems to advance the institution’s clinical, educational, and research goals.

Nicole Contaxis (nicole.contaxis@med.nyu.edu) is the Data Catalog Coordinator at the New York University Health Sciences Library. She is responsible for the growth and maintenance of the Data Catalog and handles all outreach efforts. Prior to joining the team at NYUHSL, she was a National Digital Stewardship Resident at the National Library of Medicine, creating a pilot workflow for software preservation.

3:30pm - 4:00pm Case Study IV: Data Curation for Quantitative Social Science Research: A Case Study

Confirmed Speaker: Libbie Stephenson, Social Science Data Archive, UCLA

A successful data curation program is based on an infrastructure consisting of well-formed policies, adherence to established standards and best practices, and comprehensive workflows. Curation of data in social science data archives has been carried out since the 1960s. This case study on the Social Science Data Archive at UCLA will cover the standards and practices that have evolved over time to ensure the long-term usability of these materials, including appraisal and data quality review, ingest and metadata creation based on the OAIS model, application of the Data Documentation Initiative metadata schema, and preservation workflow. The goal of this process is to ensure long-term usability of data and enable replication of analyses, regardless of changes in technology, operating systems, software, or devices. Background for this presentation can be found in Peer, Green and Stephenson (2014) “Committing to Data Quality Review”, International Journal of Digital Curation doi:10.2218/ijdc.v9i1.317.

Libbie Stephenson has been data archivist and director of the Social Science Data Archive, UCLA, since 1977. She is a past president of IASSIST and has served on the governing boards of ICPSR and the Roper Center. Her research and advisory role addresses social science data curation policies and workflows.

4:00pm – 4:30pm Case Study V: A Multi-Decade Case: The Evolution of Data Products and Designated Audiences

Confirmed Speaker: Karen S. Baker, School of Information Sciences, University of Illinois

A three-decade ethnography of data products originating from a single data set illustrates the processes of ongoing description, continuing development, and tailored delivery. This case study demonstrates knowledge mobilization through curation of a set of polar sea ice data products for a multiplicity of audiences.

Karen S. Baker, after careers in oceanography and data management, is a doctoral candidate at the University of Illinois at Urbana-Champaign in the Graduate School of Information Sciences. Karen’s interests are in the study of the data ecosystem, focusing on data practices and data concepts that inform the growth of data infrastructure and information environments.

* * * * * * * * *

4:30 p.m. - 5:00 p.m. Roundtable Discussion 
Moderated by: Todd Carpenter, Executive Director, NISO


SAVE! Register for multiple events.

If paying by credit card, register online.

If paying by check, please use this PDF form.

Registration closes on Tuesday, August 30, 2016 at 4:00 p.m. Eastern.

Registration Costs

  • NISO Member
    • $185.00 (US and Canada)
    • $225.00 (International)
  • Non-Member
    • $245.00 (US and Canada)
    • $285.00 (International)
  • Student
    • $80.00

Additional Information

  • Cancellations made by Wednesday, August 24, 2016 will receive a refund, less a $35 cancellation fee. After that date, there are no refunds.
  • Registrants will receive detailed instructions about accessing the virtual conference via e-mail the Friday prior to the event. (Anyone registering between Monday and the close of registration will receive the message shortly after the registration is received, within normal business hours.) Due to the widespread use of spam blockers, filters, out of office messages, etc., it is your responsibility to contact the NISO office if you do not receive login instructions before the start of the webinar.
  • If you have not received your Login Instruction e-mail by 10 a.m. (ET) on the Tuesday before the virtual conference, please contact the NISO office at nisohq@niso.org for immediate assistance.
  • Registration is per site (access for one computer) and includes access to the online recorded archive of the conference. You may have as many people as you like from the registrant's organization view the conference from that one connection. If you need additional connections, you will need to enter a separate registration for each connection needed.
  • If you are registering someone else from your organization, either use that person's e-mail address when registering or contact nisohq@niso.org to provide alternate contact information.
  • Conference presentation slides and Q&A will be posted to this event webpage following the live conference.
  • Registrants will receive an e-mail message containing access information to the archived conference recording within 48 hours after the event. This recording access is only to be used by the registrant's organization.