Date: 2020-04-30
This study investigates how participants locate and evaluate data they do not create themselves.
The survey consists of three main sections:
• Part 1: Data Needs
• Part 2: Finding Data
• Part 3: Evaluating Data
Our funding comes from the Netherlands Organization for Scientific Research (NWO). The study is part of a collaborative research project between researchers at the Data Archiving and Networked Services (DANS), the University of Amsterdam, the Vrije Universiteit Amsterdam and Elsevier.
By clicking the button below to start the survey, you indicate your consent to participate in this research. You can read more about the survey and what will be done with the data here (this will open a new window).
Thank you for your participation.
Please click the >> button to indicate your consent to participate and to begin the survey.
Q1: Which of the following best describes you?
Please select one answer
Researcher
Student
Librarian, archivist or research/data support provider
Manager
Other. Please specify ____________
Q2: Please describe the secondary data that you (might) need. (We define secondary data as data that you do not create yourself).
Please write your answer in the box below:
Q3: Please select the options that describe the secondary data that you (might) need.
Please select all that apply
Observational or empirical (e.g. sensor data, survey data, interview transcripts, sample data, neuroimages, ethnographic data, diaries)
Experimental (e.g. gene sequences, chromatograms, toroid magnetic field data)
Simulation (e.g. climate models, economic models)
Derived or compiled (e.g. text and data mining, compiled database, 3D models)
Other. Please specify ____________
Q4: Why do you use or need secondary data?
Please select all that apply
As the basis for a new study
To calibrate instruments or models
For benchmarking
To verify my own data
As model, algorithm or system inputs
To generate new ideas
For teaching/training
To prepare for a new project or proposal
To experiment with new methods and techniques (e.g. to develop data science skills)
To identify trends or make predictions
To compare multiple datasets to find commonalities or differences
To create summaries, visualizations, or analysis tools
To integrate with other data to create a new dataset
Other. Please specify ____________
Q5: Have you ever used data outside of your area of expertise?
Please select one answer
Yes
No
Q5a: How did you find this data?
Please write your answer in the box below:
Q6: When you need data, who finds it for you?
Please select all that apply
I find it myself
Graduate student
Research support professional (e.g. librarian, archivist, data or literature manager)
Someone else in my personal network (e.g. peers, collaborators, mentors)
Other. Please specify ____________
Q7: How frequently do you use the following to find data?
Please select one answer per row
Often
Occasionally
Never
Multidisciplinary data repositories
Discipline-specific data repositories
Governmental agencies and websites
Personal networks (e.g. colleagues, peers)
Academic literature (e.g. journal articles, conference proceedings)
Code repository (e.g. GitHub)
General search engines (e.g. Google)
Professional associations
Data specific search engines
Commercial sources
Consultation with research support professionals (e.g. librarians, archivists or data managers)
Q7_open: Please specify any other resources that you use to find data:
Please write your answer in the box below:
Q7a: Which statement(s) describe how you discover data using the academic literature?
Please select all that apply
I search the academic literature with the goal of finding data.
I find data serendipitously while reading articles or performing literature searches.
I follow citations and references in the literature to datasets.
I extract and use data from the literature directly (e.g. from tables, graphs, or instrument specifications and parameters)
Other. Please specify ____________
Q7b: How successful are you at finding data with a general search engine (e.g. Google)?
Please select one answer
Very successful
Successful
Sometimes successful, sometimes not
Rarely successful
Not successful
Q8: How frequently do you find data in the following ways?
Please select one answer per row
Often
Occasionally
Never
By actively searching for data in an online resource
Serendipitously, when searching for something else (e.g. when looking for journal articles or news)
Serendipitously, when NOT actively looking for something else (e.g. via an email notice or interaction with a colleague)
In the course of sharing or managing my own data
Q9: Please indicate if you use the following to discover, access, or make sense of data.
Please select all that apply
Q10a - Discover
Q10b - Access
Q10c - Making sense of data
Conversations with personal networks (e.g. colleagues, peers)
Contacting the data creator
Developing new academic collaborations with data creators
Attending conferences
Disciplinary mailing lists or discussion forums
Q10: Do you discover data differently than how you discover academic literature?
Please select one answer
Yes
Sometimes
No
Q10a: How is your process for finding data different than your process for finding academic literature?
Please write your answer in the box below:
Q11: How easy is it to find data?
Please select one answer
Easy
Sometimes challenging
Difficult
Q11a: Why is it challenging to find the data that you need?
Please select all that apply
The data are not accessible (e.g. behind paywalls, held by industry).
I don't know where or how to best look for the data.
The data are located in many different places.
The data are not digital.
Online search tools are inadequate.
I do not have the personal network needed to find or access the data.
Other. Please specify ____________
Q12: Please indicate the importance of the following information when deciding whether or not to use secondary data.
Please select one answer per row
Extremely important
Important
Somewhat important
Less important
Not important
Data collection conditions and methodology
How data has been processed and handled
Reputation of data creator
Personally knowing the data creator
Reputation of data source (e.g. repository or journal)
Detailed and complete metadata and documentation
Data size
Data format
Licensing/copyright conditions
Correct coverage (time, location, population, etc.)
Original purpose of the data
Ease of access
Topic relevance
Q12_open: Please specify any other information you consider when deciding whether or not to use secondary data.
Q13: How important are the following strategies in evaluating and making sense of data?
Please select one answer per row
Extremely important
Important
Somewhat important
Less important
Not important
Consulting associated journal articles
Consulting data documentation and codebooks
Consulting the data creator
Consulting personal networks (e.g. colleagues, peers)
Exploratory data analysis (e.g. statistical checks, graphical analysis)
Q13_open: Please specify any other strategies you consider to evaluate and make sense of data.
Please write your answer in the box below:
Q14: Please indicate the importance of the following in helping you to establish trust in secondary data.
Please select one answer per row
Extremely important
Important
Somewhat important
Less important
Not important
Others' prior usage of the data
Reputation of source (e.g. repository, journal)
Reputation of data creator
Transparency in data collection methods
Lack of errors
Ease of access
Personal relationship with the data creator
Q14_open: Please specify any other important aspects you consider to help establish trust in secondary data.
Please write your answer in the box below:
Q15: Please indicate the importance of the following in helping you to establish the quality of secondary data.
Please select one answer per row
Extremely important
Important
Somewhat important
Less important
Not important
Lack of errors
Ease of downloading and exploring data
Data size
Data completeness
Reputation of source (e.g. repository, journal)
Resolution or clarity
Reputation of data creator
Detail or amount of work done to prepare data
Consistency of formatting
Q15_open: Please specify any other important aspects you consider to help establish the quality of secondary data.
Please write your answer in the box below:
You are nearly at the end of the survey. Below are some questions to help us classify your answers.
D1: In which subject discipline do you specialize?
Please check all that apply.
D2: How many years of professional experience do you have in your field?
Please select one answer
0-5
6-15
16-30
31+
D3: In which country do you currently work?
D4: What type of organization do you work for?
Please select one answer
University or college
Research institution
Government agency
Corporate
Independent archive or library
Other. Please specify ____________
D5: Please indicate how the following people feel about sharing their research data.
Please select one answer per row
Data sharing is strongly encouraged
Data sharing is somewhat encouraged
Data sharing is neither encouraged nor discouraged
Data sharing is somewhat discouraged
Data sharing is strongly discouraged
Don't know / Not applicable
You
The people you work with directly
Your disciplinary community
Your institution
D6: Please indicate how the following people feel about reusing data produced by other people.
Please select one answer per row
Data reusing is strongly encouraged
Data reusing is somewhat encouraged
Data reusing is neither encouraged nor discouraged
Data reusing is somewhat discouraged
Data reusing is strongly discouraged
Don't know / Not applicable
You
The people you work with directly
Your disciplinary community
Your institution
D7: Have you ever shared your own research data?
Please select one answer
Yes
No
D8: Final comments: Do you have anything else that you would like us to know?
Please write your comments in the box below:
Additional questions asked of participants who selected “Librarian, archivist or research/data support provider” as their role.
L3: Do you use or need secondary data for your own research or to support others?
Please select one answer
For my own research
To support others
For both my own research and to support others
L4: Who are the people whom you support?
Please select all that apply
Students
Researchers
Industry employees
Other. Please specify ____________
L5: How do you support people with their data needs?
Please select all that apply
I teach people about data management planning (e.g. through consultations, workshops, etc.).
I teach people how to discover and evaluate data (e.g. through consultations, workshops, etc.).
I find data for people.
I help people to curate their data.
I find literature for people.
Other. Please specify ____________
Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 155. Significant associations are marked with an asterisk and colored in blue.
Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 70. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.
Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 196. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present; duplicate values were removed.
Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 434. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.
Note. Significance was determined at the p < .05 level with a Bonferroni correction with m = 196. Significant associations are marked with an asterisk and colored in blue. “Other” options are not shown as there were no significant associations present.
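The notes above each apply a Bonferroni correction at the p < .05 level with a different number of comparisons m. As a minimal sketch of what that implies, the snippet below computes the per-comparison threshold alpha / m for the four values of m reported in the notes. It assumes the correction was applied by comparing each unadjusted p-value against alpha / m; the notes do not say whether this form or the equivalent adjusted-p-value form was used, and the function name is illustrative only.

```python
# Per-comparison significance threshold under a Bonferroni correction.
# Assumption: each unadjusted p-value is compared against alpha / m.

ALPHA = 0.05  # family-wise significance level stated in the notes (p < .05)


def bonferroni_threshold(m: int, alpha: float = ALPHA) -> float:
    """Return the per-comparison threshold alpha / m (hypothetical helper)."""
    if m <= 0:
        raise ValueError("m must be a positive number of comparisons")
    return alpha / m


# Values of m taken from the notes above.
M_VALUES = (155, 70, 196, 434)

for m in M_VALUES:
    print(f"m = {m}: significant if p < {bonferroni_threshold(m):.6f}")
```

The larger m is, the stricter the per-comparison threshold becomes, so the note with m = 434 uses the most conservative cutoff of the four.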
The article associated with this supplement is part of the project Re-SEARCH: Contextual Search for Research Data and was funded by NWO Grant 652.001.002.
©2020 Kathleen Gregory, Paul Groth, Andrea Scharnhorst, and Sally Wyatt. This supplement is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the supplement.