
Lost or Found? Discovering Data Needed for Research: Supplementary Materials

Published on Apr 30, 2020

Appendix A

Survey questionnaire

Introduction

This study investigates how participants locate and evaluate data they do not create themselves.

The survey consists of three main sections:
• Part 1: Data Needs
• Part 2: Finding Data
• Part 3: Evaluating Data


Our funding comes from the Netherlands Organization for Scientific Research (NWO). The study is part of a collaborative research project between researchers at the Data Archiving and Networked Services (DANS), the University of Amsterdam, the Vrije Universiteit Amsterdam and Elsevier.

By clicking the button below to start the survey, you indicate your consent to participate in this research. You can read more about the survey and what will be done with the data here (this will launch a new window).

Thank you for your participation.

Please click the >> button to indicate your consent to participate and to begin the survey.

Survey Questions

Part 1: Data Needs

Q1: Which of the following best describes you?

Please select one answer

  • Researcher

  • Student

  • Librarian, archivist or research/data support provider

  • Manager

  • Other. Please specify ____________

 

Q2: Please describe the secondary data that you (might) need. (We define secondary data as data that you do not create yourself).

Please write your answer in the box below:

 

 

Q3: Please select the options that describe the secondary data that you (might) need.

Please select all that apply 

  • Observational or empirical (e.g. sensor data, survey data, interview transcripts, sample data, neuroimages, ethnographic data, diaries)

  • Experimental (e.g. gene sequences, chromatograms, toroid magnetic field data)

  • Simulation (e.g. climate models, economic models)

  • Derived or compiled (e.g. text and data mining, compiled database, 3D models)

  • Other. Please specify ____________

Q4: Why do you use or need secondary data?

Please select all that apply 

  • As the basis for a new study 

  • To calibrate instruments or models

  • For benchmarking 

  • To verify my own data

  • As model, algorithm or system inputs

  • To generate new ideas

  • For teaching/training

  • To prepare for a new project or proposal

  • To experiment with new methods and techniques (e.g. to develop data science skills)

  • To identify trends or make predictions

  • To compare multiple datasets to find commonalities or differences

  • To create summaries, visualizations, or analysis tools

  • To integrate with other data to create a new dataset

  • Other. Please specify ____________

Q5: Have you ever used data outside of your area of expertise?

Please select one answer

  • Yes

  • No

Q5a: How did you find this data? 

Please write your answer in the box below:

 

Part 2: Finding Data

Q6: When you need data, who finds it for you? 

Please select all that apply

  • I find it myself

  • Graduate student

  • Research support professional (e.g. librarian, archivist, data or literature manager)

  • Someone else in my personal network (e.g. peers, collaborators, mentors)

  • Other. Please specify ____________

Q7: How frequently do you use the following to find data?

Please select one answer per row

 

Response options (columns): Often; Occasionally; Never

Rows:
  • Multidisciplinary data repositories
  • Discipline-specific data repositories
  • Governmental agencies and websites
  • Personal networks (e.g. colleagues, peers)
  • Academic literature (e.g. journal articles, conference proceedings)
  • Code repository (e.g. GitHub)
  • General search engines (e.g. Google)
  • Professional associations
  • Data specific search engines
  • Commercial sources
  • Consultation with research support professionals (e.g. librarians, archivists or data managers)

 

Q7_open: Please specify any other resources that you use to find data:

Please write your answer in the box below:

 

Q7a: Which statement(s) describe how you discover data using the academic literature?

Please select all that apply

  • I search the academic literature with the goal of finding data.

  • I find data serendipitously while reading articles or performing literature searches.

  • I follow citations and references in the literature to datasets.

  • I extract and use data from the literature directly (e.g. from tables, graphs, or instrument specifications and parameters)

  • Other. Please specify ____________

Q7b: How successful are you at finding data with a general search engine (e.g. Google)?

Please select one answer

  • Very successful

  • Successful

  • Sometimes successful, sometimes not

  • Rarely successful

  • Not successful

Q8: How frequently do you find data in the following ways?

Please select one answer per row

 

Response options (columns): Often; Occasionally; Never

Rows:
  • By actively searching for data in an online resource
  • Serendipitously, when searching for something else (e.g. when looking for journal articles or news)
  • Serendipitously, when NOT actively looking for something else (e.g. via an email notice or interaction with a colleague)
  • In the course of sharing or managing my own data

Q9: Please indicate if you use the following to discover, access, or make sense of data. 

Please select all that apply

 

Response options (columns): Q10a - Discover; Q10b - Access; Q10c - Making sense of data

Rows:
  • Conversations with personal networks (e.g. colleagues, peers)
  • Contacting the data creator
  • Developing new academic collaborations with data creators
  • Attending conferences
  • Disciplinary mailing lists or discussion forums

Q10: Do you discover data differently than how you discover academic literature? 

Please select one answer

  • Yes

  • Sometimes

  • No

Q10a: How is your process for finding data different than your process for finding academic literature? 

Please write your answer in the box below:

 

Q11: How easy is it to find data? 

Please select one answer

  • Easy

  • Sometimes challenging

  • Difficult

Q11a: Why is it challenging to find the data that you need? 

Please select all that apply

  • The data are not accessible (e.g. behind paywalls, held by industry).

  • I don't know where or how to best look for the data.

  • The data are located in many different places.

  • The data are not digital.

  • Online search tools are inadequate.

  • I do not have the personal network needed to find or access the data.

  • Other. Please specify ____________

Part 3: Evaluating Data

Q12: Please indicate the importance of the following information when deciding whether or not to use secondary data. 

Please select one answer per row

 

Response options (columns): Extremely important; Important; Somewhat important; Less important; Not important

Rows:
  • Data collection conditions and methodology
  • How data has been processed and handled
  • Reputation of data creator
  • Personally knowing the data creator
  • Reputation of data source (e.g. repository or journal)
  • Detailed and complete metadata and documentation
  • Data size
  • Data format
  • Licensing/copyright conditions
  • Correct coverage (time, location, population, etc.)
  • Original purpose of the data
  • Ease of access
  • Topic relevance

Q12_open: Please specify any other information you consider when deciding whether or not to use secondary data.

 

Q13: How important are the following strategies in evaluating and making sense of data? 

Please select one answer per row

 

Response options (columns): Extremely important; Important; Somewhat important; Less important; Not important

Rows:
  • Consulting associated journal articles
  • Consulting data documentation and codebooks
  • Consulting the data creator
  • Consulting personal networks (e.g. colleagues, peers)
  • Exploratory data analysis (e.g. statistical checks, graphical analysis)

Q13_open: Please specify any other strategies you consider to evaluate and make sense of data.

Please write your answer in the box below:

 

Q14: Please indicate the importance of the following in helping you to establish trust in secondary data. 

Please select one answer per row

 

Response options (columns): Extremely important; Important; Somewhat important; Less important; Not important

Rows:
  • Others' prior usage of the data
  • Reputation of source (e.g. repository, journal)
  • Reputation of data creator
  • Transparency in data collection methods
  • Lack of errors
  • Ease of access
  • Personal relationship with the data creator

Q14_open: Please specify any other important aspects you consider to help establish trust in secondary data.

Please write your answer in the box below:

 

Q15: Please indicate the importance of the following in helping you to establish the quality of secondary data. 

Please select one answer per row

 

Response options (columns): Extremely important; Important; Somewhat important; Less important; Not important

Rows:
  • Lack of errors
  • Ease of downloading and exploring data
  • Data size
  • Data completeness
  • Reputation of source (e.g. repository, journal)
  • Resolution or clarity
  • Reputation of data creator
  • Detail or amount of work done to prepare data
  • Consistency of formatting

Q15_open: Please specify any other important aspects you consider to help establish the quality of secondary data.

Please write your answer in the box below:

 

Part 4: Demographics

You are nearly at the end of the survey. Below are some questions to help us classify your answers.

D1: In which subject discipline do you specialize?

Please check all that apply.

  • Agriculture

  • Arts and Humanities

  • Astronomy

  • Biochemistry, Genetics, and Molecular Biology

  • Biological Sciences

  • Business, Management and Accounting

  • Chemical Engineering

  • Chemistry

  • Computer Sciences / IT

  • Decision Sciences

  • Dentistry

  • Earth and Planetary Sciences

  • Economics, Econometrics and Finance

  • Energy

  • Engineering and Technology

  • Environmental Sciences

  • Health professions

  • Immunology and Microbiology

  • Materials Science

  • Mathematics

  • Medicine

  • Multidisciplinary

  • Neuroscience

  • Nursing

  • Pharmacology, Toxicology and Pharmaceutics

  • Physics

  • Psychology

  • Social Science

  • Veterinary

  • Information science

  • Other. Please specify ____________

 

D2: How many years of professional experience do you have in your field?

Please select one answer

  • 0-5

  • 6-15

  • 16-30

  • 31+

D3: In which country do you currently work?

  • Afghanistan

  • Albania

  • Algeria

  • American Samoa

  • Andorra

  • Angola

  • Anguilla

  • Antarctica

  • Antigua and Barbuda

  • Argentina

  • Armenia

  • Aruba

  • Australia

  • Austria

  • Azerbaijan

  • Bahamas

  • Bahrain

  • Bangladesh

  • Barbados

  • Belarus

  • Belgium

  • Belize

  • Benin

  • Bermuda

  • Bhutan

  • Bolivia

  • Bosnia and Herzegovina

  • Botswana

  • Brazil

  • British Indian Ocean Territory

  • Brunei

  • Brunei Darussalam

  • Bulgaria

  • Burkina Faso

  • Burundi

  • Cambodia

  • Cameroon

  • Canada

  • Cape Verde

  • Cayman Islands

  • Central African Republic

  • Chad

  • Chile

  • China

  • Christmas Island

  • Cocos (Keeling) Islands

  • Colombia

  • Comoros

  • Congo

  • Cook Islands

  • Costa Rica

  • Cote d'Ivoire

  • Croatia

  • Cuba

  • Cyprus

  • Czech Republic

  • Denmark

  • Djibouti

  • Dominica

  • Dominican Republic

  • East Timor

  • Ecuador

  • Egypt

  • El Salvador

  • Equatorial Guinea

  • Eritrea

  • Estonia

  • Ethiopia

  • Falkland Islands (Malvinas)

  • Fiji

  • Finland

  • France

  • French Guiana

  • French Polynesia

  • French Southern Territories

  • Gambia

  • Georgia

  • Germany

  • Ghana

  • Gibraltar

  • Greece

  • Greenland

  • Grenada

  • Guadeloupe

  • Guam

  • Guatemala

  • Guinea-Bissau

  • Haiti

  • Heard Island and McDonald Islands

  • Holy See (Vatican City State)

  • Honduras

  • Hong Kong

  • Hungary

  • Iceland

  • India

  • Indonesia

  • Iran (Islamic Republic of)

  • Iraq

  • Ireland

  • Israel

  • Italy

  • Jamaica

  • Japan

  • Jordan

  • Kazakhstan

  • Kenya

  • Kiribati

  • North Korea

  • Kuwait

  • Kyrgyzstan

  • Lao People's Democratic Republic

  • Laos

  • Latvia

  • Lebanon

  • Lesotho

  • Liberia

  • Libyan Arab Jamahiriya

  • Lithuania

  • Luxembourg

  • Macau

  • Madagascar

  • Malawi

  • Malaysia

  • Maldives

  • Mali

  • Malta

  • Martinique

  • Mauritania

  • Mauritius

  • Mexico

  • Micronesia (Federated States of)

  • Monaco

  • Mongolia

  • Montserrat

  • Morocco

  • Mozambique

  • Myanmar

  • Namibia

  • Nauru

  • Nepal

  • Netherlands

  • Netherlands Antilles

  • New Caledonia

  • New Zealand

  • Nicaragua

  • Niger

  • Nigeria

  • Niue

  • Norfolk Island

  • Norway

  • Oman

  • Pakistan

  • Palau

  • Panama

  • Papua New Guinea

  • Paraguay

  • Peru

  • Philippines

  • Pitcairn

  • Poland

  • Portugal

  • Puerto Rico

  • Qatar

  • Reunion

  • Romania

  • Russia

  • Rwanda

  • Saint Helena

  • Saint Kitts and Nevis

  • Saint Lucia

  • Saint Vincent and the Grenadines

  • Samoa

  • Sao Tome and Principe

  • Saudi Arabia

  • Senegal

  • Serbia and Montenegro

  • Seychelles

  • Sierra Leone

  • Singapore

  • Slovakia

  • Slovenia

  • Solomon Islands

  • Somalia

  • South Africa

  • South Korea

  • Spain

  • Sri Lanka

  • Sudan

  • Suriname

  • Swaziland

  • Sweden

  • Switzerland

  • Syrian Arab Republic

  • Taiwan

  • Tajikistan

  • Tanzania

  • Thailand

  • Togo

  • Tonga

  • Trinidad and Tobago

  • Tunisia

  • Turkey

  • Turkmenistan

  • Turks and Caicos Islands

  • Uganda

  • Ukraine

  • United Arab Emirates

  • United Kingdom

  • United States Minor Outlying Islands

  • Uruguay

  • USA

  • Uzbekistan

  • Vanuatu

  • Venezuela

  • Viet Nam

  • Virgin Islands

  • Virgin Islands (US)

  • Virgin Islands, British

  • Wallis and Futuna

  • Yemen

  • Zambia

  • Zimbabwe

  • Palestinian Territory, Occupied

  • Moldova, Republic of

  • Marshall Islands

  • Macedonia, The Former Yugoslav Republic of

  • Liechtenstein

  • Korea, Republic of

  • Guyana

  • Guinea

  • Gabon

  • Faroe Islands

  • Zanzibar

  • Tokelau

D4: What type of organization do you work for?

Please select one answer

  • University or college

  • Research institution

  • Government agency

  • Corporate

  • Independent archive or library

  • Other. Please specify ____________

D5: Please indicate how the following people feel about sharing their research data. 

Please select one answer per row

 

Response options (columns): Data sharing is strongly encouraged; Data sharing is somewhat encouraged; Data sharing is neither encouraged nor discouraged; Data sharing is somewhat discouraged; Data sharing is strongly discouraged; Don't know / Not applicable

Rows:
  • You
  • The people you work with directly
  • Your disciplinary community
  • Your institution

D6: Please indicate how the following people feel about reusing data produced by other people. 

Please select one answer per row

 

Response options (columns): Data reusing is strongly encouraged; Data reusing is somewhat encouraged; Data reusing is neither encouraged nor discouraged; Data reusing is somewhat discouraged; Data reusing is strongly discouraged; Don't know / Not applicable

Rows:
  • You
  • The people you work with directly
  • Your disciplinary community
  • Your institution

D7: Have you ever shared your own research data?

Please select one answer

  • Yes

  • No

D8: Final comments: Do you have anything else that you would like us to know?

Please write your comments in the box below:

 

 

Additional questions asked to participants selecting “Librarian, archivist or research/data support provider” as their role.

 

L3: Do you use or need secondary data for your own research or to support others?

Please select one answer

  • For my own research

  • To support others

  • For both my own research and to support others

L4: Who are the people whom you support?

Please select all that apply

  • Students

  • Researchers

  • Industry employees

  • Other. Please specify ____________

L5: How do you support people with their data needs?

Please select all that apply

  • I teach people about data management planning (e.g. through consultations, workshops, etc.).

  • I teach people how to discover and evaluate data (e.g. through consultations, workshops, etc.).

  • I find data for people.

  • I help people to curate their data.

  • I find literature for people.

  • Other. Please specify ____________

Appendix B

P-Value Tables

Table B1. P-value table for Figure 6: Associations between disciplinary domain and needed data.

Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 155. Significant associations are marked with an asterisk and colored in blue.


Table B2. P-value table for Table 4: Associations between types of data use and needed data type.

Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 70. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.


Table B3. P-value table for Table 4: Associations between types of data use and other data uses.

Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 196. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present; duplicate values were removed.

 
Table B4. P-value table for Figure 8: Associations between disciplinary domain and data use.

Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 434. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.


Table B5. P-value table for Figure 15: Associations between data use and evaluation criteria.

Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 196. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.
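
The table notes above each report a Bonferroni correction with a given number of comparisons m. As a rough illustration only, the sketch below assumes the standard form of the correction, in which each individual p-value is compared against alpha / m. The notes do not spell out the exact procedure used, and the names `bonferroni_threshold` and `is_significant` are hypothetical helpers introduced here, not part of the original analysis.

```python
# Minimal sketch of a Bonferroni-corrected significance check.
# Assumption: each p-value is compared against alpha / m (the standard form).

ALPHA = 0.05  # significance level stated in the table notes

# Number of comparisons (m) reported in the notes for Tables B1-B5.
COMPARISONS = {"B1": 155, "B2": 70, "B3": 196, "B4": 434, "B5": 196}


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Return the per-test threshold alpha / m."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return alpha / m


def is_significant(p_value: float, alpha: float, m: int) -> bool:
    """True if p_value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, m)


# Per-test thresholds implied by each table's m.
thresholds = {table: bonferroni_threshold(ALPHA, m) for table, m in COMPARISONS.items()}

for table, threshold in thresholds.items():
    print(f"Table {table}: m = {COMPARISONS[table]}, per-test threshold = {threshold:.2e}")
```

Under this assumption, a larger m gives a stricter per-test threshold, so an association in Table B4 (m = 434) needs a smaller p-value to be marked significant than one in Table B2 (m = 70).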

Appendix C

Sources Used in Disciplinary Subset

Figure C1. Sources used in the disciplinary subset for respondents selecting only one discipline. Percentages are percentages of respondents. Arts & humanities (n = 43); astronomy (n = 14); biological science (n = 46); computer science (n = 57); earth & planetary science (n = 24); engineering & technology (n = 80); environmental science (n = 22); medicine (n = 91); physics (n = 42); social science (n = 81).


Disclosure Statement

The article associated with this supplement is part of the project Re-SEARCH: Contextual Search for Research Data and was funded by NWO Grant 652.001.002.


©2020 Kathleen Gregory, Paul Groth, Andrea Scharnhorst, and Sally Wyatt. This supplement is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the supplement.
