Data Sharing

How to Successfully Share Research Data

Sep 6, 2019

Editor's note: We have converted these 8 steps into a checklist for easier use: https://www.teamscopeapp.com/data-sharing-checklist

Introduction

Sharing de-identified research data promotes collaboration within the research community. It allows researchers to increase the utility of their data and gives participants the chance to maximize the benefit of their involvement by letting their data have a long-term impact. Ultimately, a collaborative culture can save costs by enabling others to build upon robust and reliable prior research.

Data sharing, however, requires that a researcher is willing to contribute their data and, more importantly, understands how to make it publicly available. A researcher may still decide to share their data at the end of a study, but without proper preparation it might be too late. Consent forms, copyright licenses, and de-identification are just a few of the prerequisites for sharing your data. Without them, the data could end up siloed from the rest of the world once your study is over.

The success of data sharing depends on how it was planned and whether it was intended from the beginning of a study. When researchers decide early on to share their data, they can include data sharing clauses in their informed consent form so that participants understand the extent of data use, anticipate issues with data anonymization, and incorporate metadata from the start to ensure the usability of the research data itself.

This is our second article on Data Sharing, a series of posts where we cover everything you need to know to maximize the potential of your research. 

In this article, we will go through the steps needed to share your research data successfully.


How to plan for data dissemination

Data sharing starts with planning ahead, yet researchers often become motivated to share their results only when it is too late. Considering these eight steps from the start of your project will allow you to streamline your data sharing plans.

1. Check for existing datasets 

Research is an expensive and laborious activity. Such effort is only reasonable if currently available datasets are insufficient for the proposed study. Apart from doing a literature search, it is wise to review data repositories to check whether the required data is already available.

Excellent places to start your search are Google's Dataset Search and https://www.re3data.org/search.


2. Provide participants with consent forms for sharing de-identified data

Gaining informed consent for data publication from research participants before data is collected is best practice. It avoids the cost and delay of attempting to obtain permission from participants after data has been collected. As long as participants are fully informed and have confidence that no identifying data will be shared, it is reasonable to expect that a large number will be willing to have their de-identified data shared publicly.

Consent forms should:

  • Avoid specifically blocking the possibility for data anonymization and ultimate sharing.
  • State the intention of the researcher to de-identify the data and publish it in a data repository.
  • State the conditions under which others will be given access to the data.

A great resource to learn more about sensitive data and how to properly write consent forms for data sharing is University of Bristol's sensitive research data bootcamp.

3. Pilot your data collection tools

Conducting a pilot study is an effective way to uncover possible issues with all aspects of a project. When the data collected in a project is of poor quality, the project's specific objectives are undermined, as is the usability of that data for the research community.

By conducting a small-scale preliminary version of the final study, a researcher can anticipate issues with metadata, file export formats, and overall data integrity.
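As an illustration of the kind of integrity check a pilot makes possible, here is a minimal Python sketch that scans a CSV export for missing columns, empty values, and duplicate participant IDs. The function name, the column names, and the sample export are all hypothetical; a real project would adapt the checks to its own data dictionary.

```python
import csv
import io

def check_pilot_export(csv_text, id_column, required_columns):
    """Return a list of human-readable problems found in a pilot CSV export."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []

    # Columns the analysis plan needs but the export does not contain.
    for column in required_columns:
        if column not in header:
            problems.append(f"missing column: {column}")

    seen_ids = set()
    for row in reader:
        line_number = reader.line_num  # line of the export just read
        record_id = row.get(id_column) or ""
        if record_id in seen_ids:
            problems.append(f"line {line_number}: duplicate id {record_id}")
        seen_ids.add(record_id)
        # Flag required values that were left blank by the collection tool.
        for column in required_columns:
            if column in header and not (row.get(column) or "").strip():
                problems.append(f"line {line_number}: empty value in {column}")
    return problems

# Hypothetical pilot export with one blank age and one repeated ID.
pilot_export = """participant_id,age,visit_date
P001,34,2019-05-02
P002,,2019-05-03
P002,41,2019-05-04
"""

required = ["participant_id", "age", "visit_date", "consent_to_share"]
for problem in check_pilot_export(pilot_export, "participant_id", required):
    print(problem)
```

Running a check like this on pilot data, rather than on the final dataset, leaves time to fix the form or the export settings before full data collection begins.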
 

4. Choose a copyright license for your data

A copyright license is a legal agreement between someone who wants to use a dataset, image, or text and the rights holder who can give permission to use it. Licenses grant permissions under specific terms, and these conditions are intended to safeguard the researchers' authorship and work.

A Creative Commons (CC) license is one of several public copyright licenses that enable the free distribution of an otherwise copyrighted "work." (Wikipedia)

The most commonly used Creative Commons licenses for research work are:

CC-BY: This license lets others adapt and build upon your work, even for commercial purposes, as long as they credit the original creation. (Creative Commons)

CC0: This license waives all copyrights, and places work as fully as possible in the public domain, so that others may freely build upon, enhance, and reuse the works for any purposes without restriction. (Creative Commons)

5. De-identify Sensitive Data

Before publishing your data, you must ensure that your datasets no longer contain any variables that might lead to the identification of individual respondents. The following variables, if present, must be removed:

  • Name of the respondent
  • Address of the respondent
  • Telephone number
  • Social security number

and the following variables must be recoded:

  • Date of birth
  • GPS data of the respondent
  • Postcode or Zipcode

Researchers may choose to remove identifiers manually or use data anonymization software such as Amnesia.
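For researchers who script this step, the two lists above can be sketched in Python as follows. This is a minimal illustration, not a guarantee of anonymity: the field names, the rounding precision for GPS data, and the two-character postcode cut-off are assumptions chosen for the example, and the right level of coarsening depends on the dataset and its re-identification risk.

```python
from datetime import date

# Variables that must be removed outright (hypothetical field names).
DIRECT_IDENTIFIERS = {"name", "address", "telephone", "social_security_number"}

def deidentify(record, reference_date):
    """Return a copy of one respondent record with identifiers removed or recoded."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Recode date of birth as age in completed years on the reference date.
    birth = date.fromisoformat(cleaned.pop("date_of_birth"))
    had_birthday = (reference_date.month, reference_date.day) >= (birth.month, birth.day)
    cleaned["age_years"] = reference_date.year - birth.year - (0 if had_birthday else 1)

    # Recode GPS data by rounding to one decimal place (about 11 km of latitude).
    cleaned["latitude"] = round(cleaned["latitude"], 1)
    cleaned["longitude"] = round(cleaned["longitude"], 1)

    # Recode the postcode by keeping only its first two characters.
    cleaned["postcode_area"] = cleaned.pop("postcode")[:2]
    return cleaned

# Entirely fictional respondent used only to exercise the function.
respondent = {
    "name": "Jane Doe",
    "address": "1 Example Street",
    "telephone": "000-000-0000",
    "social_security_number": "000-00-0000",
    "date_of_birth": "1994-10-15",
    "latitude": 51.84257,
    "longitude": 5.85278,
    "postcode": "6511",
    "systolic_bp": 118,
}

print(deidentify(respondent, reference_date=date(2019, 9, 6)))
```

The function returns a new dictionary rather than modifying the input, so the original record can stay in secure storage while only the recoded copy is prepared for publication.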


6. Define what access availability your data will have: Public-use vs. Restricted-use

Depending on the nature of the dataset and the feasibility to remove all identifiers, researchers may choose to share their data in a controlled way.

Broadly speaking, researchers can choose between two levels of accessibility:

Public-use: Public-use datasets include data that has been thoroughly filtered to mitigate the risk of confidentiality violations. All data that could lead to the identification of participants will be removed or altered.

Restricted-use: In some cases, it might not be viable to remove all sensitive variables from a dataset, because doing so could make it impossible to reproduce or extend the original study findings. In addition to having a public-use version of their data, researchers can choose to archive a version containing identifiable information as restricted-use. Access to restricted-use data is granted only to specific parties that have agreed to protect the confidentiality of respondents.


7. Organize your data to comply with the FAIR Data Principles

The FAIR Data Principles are a set of guiding principles that make data findable, accessible, interoperable, and reusable. These principles provide guidance to data producers and publishers on how to maximize the utility of research data. 

Metadata is one of the fundamental building blocks of the FAIR Data Principles. Metadata is data that gives descriptive information about any entity. A dataset that meets the FAIR principles has elaborate and precise metadata that describes the dataset and its variables.

Another essential element of the FAIR principles is the use of persistent and unique identifiers. Datasets should be assigned a unique identifier that allows others to track their history and cite them. An example of a widely used identifier for datasets is the Digital Object Identifier (DOI).
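To make this concrete, here is a minimal Python sketch of a metadata record that describes a dataset and its variables and carries a persistent identifier. The field names and the list of required fields are assumptions made for illustration, not a formal metadata standard, and the DOI shown is a placeholder; the repository you choose will define its own schema.

```python
import json

# Hypothetical minimum set of fields a dataset record should carry.
REQUIRED_FIELDS = ("identifier", "title", "creators", "publication_year", "license", "variables")

def missing_metadata(metadata):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]

dataset_metadata = {
    "identifier": "10.1234/example-doi",  # placeholder, not a registered DOI
    "title": "Example household survey, de-identified public-use file",
    "creators": ["Example Research Group"],
    "publication_year": 2019,
    "license": "CC-BY-4.0",
    # One entry per variable, so that others can interpret each column.
    "variables": [
        {"name": "age_years", "description": "Age in completed years", "unit": "years"},
        {"name": "postcode_area", "description": "First two characters of the postcode", "unit": None},
    ],
}

print("Missing fields:", missing_metadata(dataset_metadata))
print(json.dumps(dataset_metadata, indent=2))
```

Storing the record as JSON next to the data file keeps the description machine-readable, which helps with the findable and interoperable parts of FAIR.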


8. Choose a research data repository

Lastly, choose a research data repository. Using a repository will allow your dataset to be preserved over time, be findable by others, and easily citable. 

There are institution-specific, discipline-specific, and general-purpose data repositories. Data repositories provide an online interface where researchers can search for and discover data, though not necessarily obtain direct access if the dataset is restricted-use.

For repositories you can use, see our recently published list of six general-purpose data repositories, some of which are free of charge.


Conclusion

Data sharing can be a fascinating endeavor for researchers. It allows them to improve transparency in their findings, gain more visibility, and enhance the impact of their work. 

Data sharing requires proper planning. When data sharing plans begin as early as proposal writing, a research team can make sure that consent forms do not block data sharing possibilities, that the research tools will yield high-quality data, and that the output files will have the necessary metadata to be useful to others.

The eight steps shared in this article give researchers an understanding of the considerations they must keep in mind when sharing their work in a manner that acknowledges and safeguards the rights of participants, allows others to reuse that data and enables proper attribution and citation.


References:

Preparing data for sharing: guide to social science data archiving. Amsterdam: Pallas Publications, 2010. (CC BY-NC-SA 3.0)

“About The Licenses.” Creative Commons, https://creativecommons.org/licenses/ (CC BY 4.0)

“Sensitive research data bootcamp.” University of Bristol, https://data.blogs.bristol.ac.uk/bootcampsd/

Diego Menchaca

Diego is the founder and CEO of Teamscope. He started Teamscope from a scribble on a table. It instantly became his passion project and a vehicle into the unknown. Diego is originally from Chile and lives in Nijmegen, the Netherlands.
