
Lost or Found? Discovering Data Needed for Research: Supplementary Materials

Published on Apr 30, 2020

Appendix A

Survey questionnaire


Introduction

This study investigates how participants locate and evaluate data they do not create themselves.

The survey consists of three main sections:
• Part 1: Data Needs
• Part 2: Finding Data
• Part 3: Evaluating Data

Our funding comes from the Netherlands Organization for Scientific Research (NWO). The study is part of a collaborative research project between researchers at the Data Archiving and Networked Services (DANS), the University of Amsterdam, the Vrije Universiteit Amsterdam and Elsevier.

By clicking the button below to start the survey, you indicate your consent to participate in this research. You can read more about the survey and what will be done with the data here (this will launch a new window).

Thank you for your participation.

Please click the >> button to indicate consent to participate and to begin the survey.

 

Survey Questions


Part 1: Data Needs

Q1: Which of the following best describes you?

Please select one answer

  • Researcher

  • Student

  • Librarian, archivist or research/data support provider

  • Manager

  • Other. Please specify ____________

 

Q2: Please describe the secondary data that you (might) need. (We define secondary data as data that you do not create yourself).

Please write your answer in the box below:

 

 

Q3: Please select the options that describe the secondary data that you (might) need.

Please select all that apply 

  • Observational or empirical (e.g. sensor data, survey data, interview transcripts, sample data, neuroimages, ethnographic data, diaries)

  • Experimental (e.g. gene sequences, chromatograms, toroid magnetic field data)

  • Simulation (e.g. climate models, economic models)

  • Derived or compiled (e.g. text and data mining, compiled database, 3D models)

  • Other. Please specify ____________


Q4: Why do you use or need secondary data?

Please select all that apply 

  • As the basis for a new study 

  • To calibrate instruments or models

  • For benchmarking 

  • To verify my own data

  • As model, algorithm or system inputs

  • To generate new ideas

  • For teaching/training

  • To prepare for a new project or proposal

  • To experiment with new methods and techniques (e.g. to develop data science skills)

  • To identify trends or make predictions

  • To compare multiple datasets to find commonalities or differences

  • To create summaries, visualizations, or analysis tools

  • To integrate with other data to create a new dataset

  • Other. Please specify ____________


Q5: Have you ever used data outside of your area of expertise?

Please select one answer

  • Yes

  • No

Q5a: How did you find this data? 

Please write your answer in the box below:

 


Part 2: Finding Data

Q6: When you need data, who finds it for you? 

Please select all that apply

  • I find it myself

  • Graduate student

  • Research support professional (e.g. librarian, archivist, data or literature manager)

  • Someone else in my personal network (e.g. peers, collaborators, mentors)

  • Other. Please specify ____________


Q7: How frequently do you use the following to find data?

Please select one answer per row

 

Columns: Often | Occasionally | Never

Rows:

  • Multidisciplinary data repositories

  • Discipline-specific data repositories

  • Governmental agencies and websites

  • Personal networks (e.g. colleagues, peers)

  • Academic literature (e.g. journal articles, conference proceedings)

  • Code repository (e.g. GitHub)

  • General search engines (e.g. Google)

  • Professional associations

  • Data specific search engines

  • Commercial sources

  • Consultation with research support professionals (e.g. librarians, archivists or data managers)




 

Q7_open: Please specify any other resources that you use to find data:

Please write your answer in the box below:

 

Q7a: Which statement(s) describe how you discover data using the academic literature?

Please select all that apply

  • I search the academic literature with the goal of finding data.

  • I find data serendipitously while reading articles or performing literature searches.

  • I follow citations and references in the literature to datasets.

  • I extract and use data from the literature directly (e.g. from tables, graphs, or instrument specifications and parameters)

  • Other. Please specify ____________

Q7b: How successful are you at finding data with a general search engine (e.g. Google)?

Please select one answer

  • Very successful

  • Successful

  • Sometimes successful, sometimes not

  • Rarely successful

  • Not successful


Q8: How frequently do you find data in the following ways?

Please select one answer per row

 

Columns: Often | Occasionally | Never

Rows:

  • By actively searching for data in an online resource

  • Serendipitously, when searching for something else (e.g. when looking for journal articles or news)

  • Serendipitously, when NOT actively looking for something else (e.g. via an email notice or interaction with a colleague)

  • In the course of sharing or managing my own data





Q9: Please indicate if you use the following to discover, access, or make sense of data. 

Please select all that apply

 

Columns: Q10a - Discover | Q10b - Access | Q10c - Making sense of data

Rows:

  • Conversations with personal networks (e.g. colleagues, peers)

  • Contacting the data creator

  • Developing new academic collaborations with data creators

  • Attending conferences

  • Disciplinary mailing lists or discussion forums





Q10: Do you discover data differently than how you discover academic literature? 

Please select one answer

  • Yes

  • Sometimes

  • No

Q10a: How is your process for finding data different than your process for finding academic literature? 

Please write your answer in the box below:

 


Q11: How easy is it to find data? 

Please select one answer

  • Easy

  • Sometimes challenging

  • Difficult

Q11a: Why is it challenging to find the data that you need? 

Please select all that apply

  • The data are not accessible (e.g. behind paywalls, held by industry).

  • I don't know where or how to best look for the data.

  • The data are located in many different places.

  • The data are not digital.

  • Online search tools are inadequate.

  • I do not have the personal network needed to find or access the data.

  • Other. Please specify ____________


Part 3: Evaluating Data

Q12: Please indicate the importance of the following information when deciding whether or not to use secondary data. 

Please select one answer per row

 

Columns: Extremely important | Important | Somewhat important | Less important | Not important

Rows:

  • Data collection conditions and methodology

  • How data has been processed and handled

  • Reputation of data creator

  • Personally knowing the data creator

  • Reputation of data source (e.g. repository or journal)

  • Detailed and complete metadata and documentation

  • Data size

  • Data format

  • Licensing/copyright conditions

  • Correct coverage (time, location, population, etc.)

  • Original purpose of the data

  • Ease of access

  • Topic relevance






Q12_open: Please specify any other information you consider when deciding whether or not to use secondary data.

 


Q13: How important are the following strategies in evaluating and making sense of data? 

Please select one answer per row

 

Columns: Extremely important | Important | Somewhat important | Less important | Not important

Rows:

  • Consulting associated journal articles

  • Consulting data documentation and codebooks

  • Consulting the data creator

  • Consulting personal networks (e.g. colleagues, peers)

  • Exploratory data analysis (e.g. statistical checks, graphical analysis)






Q13_open: Please specify any other strategies you consider to evaluate and make sense of data.

Please write your answer in the box below:

 


Q14: Please indicate the importance of the following in helping you to establish trust in secondary data. 

Please select one answer per row

 

Columns: Extremely important | Important | Somewhat important | Less important | Not important

Rows:

  • Others' prior usage of the data

  • Reputation of source (e.g. repository, journal)

  • Reputation of data creator

  • Transparency in data collection methods

  • Lack of errors

  • Ease of access

  • Personal relationship with the data creator






Q14_open: Please specify any other important aspects you consider to help establish trust in secondary data.

Please write your answer in the box below:

 


Q15: Please indicate the importance of the following in helping you to establish the quality of secondary data. 

Please select one answer per row

 

Columns: Extremely important | Important | Somewhat important | Less important | Not important

Rows:

  • Lack of errors

  • Ease of downloading and exploring data

  • Data size

  • Data completeness

  • Reputation of source (e.g. repository, journal)

  • Resolution or clarity

  • Reputation of data creator

  • Detail or amount of work done to prepare data

  • Consistency of formatting






Q15_open: Please specify any other important aspects you consider to help establish the quality of secondary data.

Please write your answer in the box below:

 


Part 4: Demographics
You are nearly at the end of the survey. Below are some questions to help us classify your answers.

D1: In which subject discipline do you specialize?

Please check all that apply.

  • Agriculture

  • Arts and Humanities

  • Astronomy

  • Biochemistry, Genetics, and Molecular Biology

  • Biological Sciences

  • Business, Management and Accounting

  • Chemical Engineering

  • Chemistry

  • Computer Sciences / IT

  • Decision Sciences

  • Dentistry

  • Earth and Planetary Sciences

  • Economics, Econometrics and Finance

  • Energy

  • Engineering and Technology

  • Environmental Sciences

  • Health professions

  • Immunology and Microbiology

  • Materials Science

  • Mathematics

  • Medicine

  • Multidisciplinary

  • Neuroscience

  • Nursing

  • Pharmacology, Toxicology and Pharmaceutics

  • Physics

  • Psychology

  • Social Science

  • Veterinary

  • Information science

  • Other. Please specify ____________

 

D2: How many years of professional experience do you have in your field?

Please select one answer

  • 0-5

  • 6-15

  • 16-30

  • 31+


D3: In which country do you currently work? 

  • Afghanistan

  • Albania

  • Algeria

  • American Samoa

  • Andorra

  • Angola

  • Anguilla

  • Antarctica

  • Antigua and Barbuda

  • Argentina

  • Armenia

  • Aruba

  • Australia

  • Austria

  • Azerbaijan

  • Bahamas

  • Bahrain

  • Bangladesh

  • Barbados

  • Belarus

  • Belgium

  • Belize

  • Benin

  • Bermuda

  • Bhutan

  • Bolivia

  • Bosnia and Herzegovina

  • Botswana

  • Brazil

  • British Indian Ocean Territory

  • Brunei

  • Brunei Darussalam

  • Bulgaria

  • Burkina Faso

  • Burundi

  • Cambodia

  • Cameroon

  • Canada

  • Cape Verde

  • Cayman Islands

  • Central African Republic

  • Chad

  • Chile

  • China

  • Christmas Island

  • Cocos (Keeling) Islands

  • Colombia

  • Comoros

  • Congo

  • Cook Islands

  • Costa Rica

  • Cote d'Ivoire

  • Croatia

  • Cuba

  • Cyprus

  • Czech Republic

  • Denmark

  • Djibouti

  • Dominica

  • Dominican Republic

  • East Timor

  • Ecuador

  • Egypt

  • El Salvador

  • Equatorial Guinea

  • Eritrea

  • Estonia

  • Ethiopia

  • Falkland Islands (Malvinas)

  • Fiji

  • Finland

  • France

  • French Guiana

  • French Polynesia

  • French Southern Territories

  • Gambia

  • Georgia

  • Germany

  • Ghana

  • Gibraltar

  • Greece

  • Greenland

  • Grenada

  • Guadeloupe

  • Guam

  • Guatemala

  • Guinea-Bissau

  • Haiti

  • Heard Island and McDonald Islands

  • Holy See (Vatican City State)

  • Honduras

  • Hong Kong

  • Hungary

  • Iceland

  • India

  • Indonesia

  • Iran (Islamic Republic of)

  • Iraq

  • Ireland

  • Israel

  • Italy

  • Jamaica

  • Japan

  • Jordan

  • Kazakhstan

  • Kenya

  • Kiribati

  • North Korea

  • Kuwait

  • Kyrgyzstan

  • Lao People's Democratic Republic

  • Laos

  • Latvia

  • Lebanon

  • Lesotho

  • Liberia

  • Libyan Arab Jamahiriya

  • Lithuania

  • Luxembourg

  • Macau

  • Madagascar

  • Malawi

  • Malaysia

  • Maldives

  • Mali

  • Malta

  • Martinique

  • Mauritania

  • Mauritius

  • Mexico

  • Micronesia (Federated States of)

  • Monaco

  • Mongolia

  • Montserrat

  • Morocco

  • Mozambique

  • Myanmar

  • Namibia

  • Nauru

  • Nepal

  • Netherlands

  • Netherlands Antilles

  • New Caledonia

  • New Zealand

  • Nicaragua

  • Niger

  • Nigeria

  • Niue

  • Norfolk Island

  • Norway

  • Oman

  • Pakistan

  • Palau

  • Panama

  • Papua New Guinea

  • Paraguay

  • Peru

  • Philippines

  • Pitcairn

  • Poland

  • Portugal

  • Puerto Rico

  • Qatar

  • Reunion

  • Romania

  • Russia

  • Rwanda

  • Saint Helena

  • Saint Kitts and Nevis

  • Saint Lucia

  • Saint Vincent and the Grenadines

  • Samoa

  • Sao Tome and Principe

  • Saudi Arabia

  • Senegal

  • Serbia and Montenegro

  • Seychelles

  • Sierra Leone

  • Singapore

  • Slovakia

  • Slovenia

  • Solomon Islands

  • Somalia

  • South Africa

  • South Korea

  • Spain

  • Sri Lanka

  • Sudan

  • Suriname

  • Swaziland

  • Sweden

  • Switzerland

  • Syrian Arab Republic

  • Taiwan

  • Tajikistan

  • Tanzania

  • Thailand

  • Togo

  • Tonga

  • Trinidad and Tobago

  • Tunisia

  • Turkey

  • Turkmenistan

  • Turks and Caicos Islands

  • Uganda

  • Ukraine

  • United Arab Emirates

  • United Kingdom

  • United States Minor Outlying Islands

  • Uruguay

  • USA

  • Uzbekistan

  • Vanuatu

  • Venezuela

  • Viet Nam

  • Virgin Islands

  • Virgin Islands (US)

  • Virgin Islands, British

  • Wallis and Futuna

  • Yemen

  • Zambia

  • Zimbabwe

  • Palestinian Territory, Occupied

  • Moldova, Republic of

  • Marshall Islands

  • Macedonia, The Former Yugoslav Republic of

  • Liechtenstein

  • Korea, Republic of

  • Guyana

  • Guinea

  • Gabon

  • Faroe Islands

  • Zanzibar

  • Tokelau


D4: What type of organization do you work for?

Please select one answer

  • University or college

  • Research institution

  • Government agency

  • Corporate

  • Independent archive or library

  • Other. Please specify ____________


D5: Please indicate how the following people feel about sharing their research data. 

Please select one answer per row

 

Columns: Data sharing is strongly encouraged | Data sharing is somewhat encouraged | Data sharing is neither encouraged nor discouraged | Data sharing is somewhat discouraged | Data sharing is strongly discouraged | Don't know / Not applicable

Rows:

  • You

  • The people you work with directly

  • Your disciplinary community

  • Your institution








D6: Please indicate how the following people feel about reusing data produced by other people. 

Please select one answer per row

 

Columns: Data reusing is strongly encouraged | Data reusing is somewhat encouraged | Data reusing is neither encouraged nor discouraged | Data reusing is somewhat discouraged | Data reusing is strongly discouraged | Don't know / Not applicable

Rows:

  • You

  • The people you work with directly

  • Your disciplinary community

  • Your institution








D7: Have you ever shared your own research data?

Please select one answer

  • Yes

  • No


D8: Final comments: Do you have anything else that you would like us to know?

Please write your comments in the box below:

 

 

Additional questions asked to participants selecting “Librarian, archivist or research/data support provider” as their role.

 

L3: Do you use or need secondary data for your own research or to support others?

Please select one answer

  • For my own research

  • To support others

  • For both my own research and to support others


L4: Who are the people whom you support?

Please select all that apply

  • Students

  • Researchers

  • Industry employees

  • Other. Please specify ____________


L5: How do you support people with their data needs?

Please select all that apply

  • I teach people about data management planning (e.g. through consultations, workshops, etc.).

  • I teach people how to discover and evaluate data (e.g. through consultations, workshops, etc.).

  • I find data for people.

  • I help people to curate their data.

  • I find literature for people.

  • Other. Please specify ____________


 

Appendix B


P-Value Tables


Table B1. P-Value Table for Figure 6: Associations Between Disciplinary Domain and Needed Data


Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 155. Significant associations are marked with an asterisk and colored in blue.

 

Table B2. P-Value Table for Table 4: Associations Between Types of Data Use and Needed Data Type


Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 70. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.


 

Table B3. P-Value Table for Table 4: Associations Between Types of Data Use and Other Data Uses


Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 196. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present; duplicate values were removed.

 

Table B4. P-Value Table for Figure 8: Associations Between Disciplinary Domain and Data Use


Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 434. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.


Table B5. P-Value Table for Figure 15: Associations Between Data Use and Evaluation Criteria


Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 196. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.
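
The table notes in this appendix each state a Bonferroni correction with a given number of comparisons m. As a minimal sketch of what that implies, the per-test threshold is the nominal level divided by m. The values of m are taken from the notes for Tables B1 to B5; the function names and the example p-values in the usage note are illustrative assumptions, not the study's analysis code. The notes do not say whether the authors compared raw p-values against alpha / m or multiplied the p-values by m; the two approaches give the same decisions.

```python
ALPHA = 0.05  # nominal significance level stated in the table notes

# Number of comparisons (m) reported in the notes for Tables B1-B5.
COMPARISONS = {"B1": 155, "B2": 70, "B3": 196, "B4": 434, "B5": 196}


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Return the per-test significance threshold alpha / m."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return alpha / m


def is_significant(p_value: float, alpha: float, m: int) -> bool:
    """Return True if p_value is below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, m)


# Adjusted threshold for each table, keyed by table label.
thresholds = {
    table: bonferroni_threshold(ALPHA, m) for table, m in COMPARISONS.items()
}

for table, threshold in thresholds.items():
    print(f"Table {table}: significant if p < {threshold:.6f}")
```

For instance, with m = 155 a hypothetical p-value of 0.0001 would count as significant under this rule, while 0.001 would not, even though both are below .05.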

 

Appendix C

Sources Used in Disciplinary Subset


Figure C1. Sources used in the disciplinary subset for respondents selecting only one discipline. Percentages are of respondents. Arts & humanities (n = 43); astronomy (n = 14); biological science (n = 46); computer science (n = 57); earth & planetary science (n = 24); engineering & technology (n = 80); environmental science (n = 22); medicine (n = 91); physics (n = 42); social science (n = 81).
