Data Sharing

6 repositories to share your research data

Aug 20, 2019

Why share research data?

Sharing information stimulates science. When researchers choose to make their data publicly available, they are allowing their work to contribute far beyond their original findings.

The benefits of data sharing are immense. When researchers make their data public, they increase transparency and trust in their work, they enable others to reproduce and validate their findings, and ultimately, contribute to the pace of scientific discovery by allowing others to reuse and build on top of their data.


"If I have seen further it is by standing on the shoulders of Giants."

Isaac Newton, 1675.


While the benefits of data sharing and open science are clear, sadly 86% of medical research data is never reused. A 2014 survey conducted by Wiley with over 2,000 researchers across different fields found that 21% of respondents did not know where to share their data and 16% did not know how to do so.

In this series of articles on data sharing, we break the process down and cover what you need to know to share your research outputs.

In this first article, we will introduce essential concepts of public data and share six powerful platforms to upload and share datasets.


What is a Research Data Repository?

The best way to publish and share research data is with a research data repository. A repository is an online database that allows research data to be preserved across time and helps others find it.

Apart from archiving research data, a repository will assign a DOI to each uploaded object and provide a web page that tells what it is, how to cite it and how many times other researchers have cited or downloaded that object.


What is a DOI?

When a researcher uploads a document to an online data repository, a digital object identifier (DOI) will be assigned. A DOI is a globally unique and persistent string (e.g. 10.6084/m9.figshare.7509368.v1) that identifies your work permanently. 
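
As a rough illustration of how such an identifier is put together, the Python sketch below splits the example DOI into its prefix (which identifies the registrant) and its suffix (which identifies the object), then builds the link that the public resolver at https://doi.org/ redirects to the object's landing page. The function names `split_doi` and `doi_to_url` are invented for this example, and the validation is deliberately minimal.

```python
from typing import Tuple
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into its prefix (registrant) and suffix (object)."""
    prefix, sep, suffix = doi.strip().partition("/")
    # Minimal check only: every DOI prefix starts with "10." and a suffix must follow.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a valid DOI: {doi!r}")
    return prefix, suffix


def doi_to_url(doi: str) -> str:
    """Build the resolver link for a DOI."""
    prefix, suffix = split_doi(doi)
    # Percent-encode any characters that are not safe inside a URL path.
    return f"{DOI_RESOLVER}{prefix}/{quote(suffix, safe='/')}"


print(doi_to_url("10.6084/m9.figshare.7509368.v1"))
# prints https://doi.org/10.6084/m9.figshare.7509368.v1
```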

A data repository can assign a DOI to any document, such as a spreadsheet, an image or a presentation, and at different levels of hierarchy, such as a collection of images or a specific chapter in a book.

Each DOI is registered together with metadata that gives users relevant information about the object, such as the title, author, keywords, year of publication and the URL where the document is stored.
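
To make the idea of DOI metadata more concrete, here is a small Python sketch of what such a record could look like. The `DoiMetadata` class and its field names are hypothetical and only mirror the fields listed above; real registration schemas contain many more fields, and the DOI and URL used here are placeholders rather than a real record.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DoiMetadata:
    """Hypothetical, simplified metadata record for one DOI."""
    doi: str
    title: str
    authors: List[str]
    year: int
    url: str  # where the object is stored; can be updated if it moves
    keywords: List[str] = field(default_factory=list)


record = DoiMetadata(
    doi="10.1234/example.5678",  # placeholder, not a registered DOI
    title="Example survey dataset",
    authors=["A. Researcher"],
    year=2019,
    url="https://repository.example.org/datasets/5678",
    keywords=["survey", "open data"],
)

# The DOI stays fixed while the metadata (for example the URL) can be maintained.
print(f"{record.authors[0]} ({record.year}). {record.title}. doi:{record.doi}")
```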

The International DOI Foundation (IDF) developed and introduced the DOI in 2000. Registration Agencies, a federation of independent organizations, register DOIs and provide the necessary infrastructure that allows researchers to declare and maintain metadata.


Key benefits of the DOI system:

  • A more straightforward way to track research outputs
  • Gives certainty to scientific work
  • The DOI versioning system tracks changes to a work over time
  • Can be assigned to any document
  • Enables proper indexation and citation of research outputs

Once a document has a DOI, others can easily cite it. A handy tool for converting DOIs into citations is the DOI Citation Formatter.
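
For readers who prefer a script, a similar result can usually be obtained through DOI content negotiation, where the resolver is asked for a formatted citation by sending an `Accept: text/x-bibliography` header. The Python sketch below assumes that the registration agency behind the DOI supports this (Crossref and DataCite do, but coverage may vary). The helper names `citation_request` and `fetch_citation` are made up for illustration.

```python
import urllib.request


def citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Prepare a request asking the DOI resolver for a formatted citation."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        # The style parameter names a citation style such as "apa".
        headers={"Accept": f"text/x-bibliography; style={style}"},
    )


def fetch_citation(doi: str, style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the citation text (needs network access)."""
    request = citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

With network access, calling `fetch_citation("10.1016/s0140-6736(10)62234-9")` should return a formatted reference for the Lancet article listed in the references of this post.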


Six repositories to share research data

Now that we have covered the role of a DOI and a data repository, below is a list of 6 data repositories for publishing and sharing research data.

1. figshare

Figshare is an open access data repository where researchers can preserve their research outputs, such as datasets, images, and videos and make them discoverable. 

Figshare allows researchers to upload any file format and assigns a digital object identifier (DOI) for citations. 

Mark Hahnel launched Figshare in January 2011. Hahnel first developed the platform as a personal tool for organizing and publishing the outputs of his PhD in stem cell biology. More than 50 institutions now use this solution. 

Figshare releases 'The State of Open Data' every year to assess the changing academic landscape around open research.

Free accounts on Figshare can upload files of up to 5 GB and get 20 GB of free storage.


2. Mendeley Data

Mendeley Data is an open research data repository, where researchers can store and share their data. Datasets can be shared privately between individuals, as well as publicly with the world. 

Mendeley's mission is to facilitate data sharing. In their own words, "when research data is made publicly available, science benefits:

- the findings can be verified and reproduced

- the data can be reused in new ways

- discovery of relevant research is facilitated

- funders get more value from their funding investment."

Datasets uploaded to Mendeley Data go into a moderation process where they are reviewed. This ensures the content constitutes research data, is scientific, and does not contain a previously published research article. 

Researchers can upload and store their work free of cost on Mendeley Data.

3. Dryad Digital Repository

Dryad is a curated general-purpose repository that makes data discoverable, freely reusable, and citable.

Most types of files can be submitted (e.g., text, spreadsheets, video, photographs, software code) including compressed archives of multiple files.

Since a guiding principle of Dryad is to make its contents freely available for research and educational use, there are no access costs for individual users or institutions. Instead, Dryad supports its operation by charging a US$120 fee each time data is published.


4. Harvard Dataverse

Harvard Dataverse is an online data repository where scientists can preserve, share, cite and explore research data.

The Harvard Dataverse repository is powered by the open-source web application Dataverse, developed by the Institute for Quantitative Social Science at Harvard.

Researchers, journals and institutions may choose to install the Dataverse web application on their own server or use Harvard's installation. Harvard Dataverse is open to all scientific data from all disciplines.

Harvard Dataverse is free and has a limit of 2.5 GB per file and 10 GB per dataset.


5. Open Science Framework

The Open Science Framework (OSF) is a free, open-source research management and collaboration tool designed to help researchers document their project's lifecycle and archive materials. It is built and maintained by the nonprofit Center for Open Science.

Each user, project, component, and file is given a unique, persistent uniform resource locator (URL) to enable sharing and promote attribution. Projects can also be assigned digital object identifiers (DOIs) if they are made publicly available. 

OSF is a free service.


6. Zenodo

Zenodo is a general-purpose open-access repository developed under the European OpenAIRE program and operated by CERN. 

Zenodo began as the OpenAIRE orphan records repository, with the mission to provide open science compliance to researchers without an institutional repository, irrespective of their subject area, funder or nation.

Zenodo encourages users to upload their research outputs early in the research lifecycle by allowing them to remain private. Once an associated paper is published, datasets are automatically made open.

Zenodo has no restriction on the file types that researchers may upload and accepts datasets of up to 50 GB.
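
The size limits quoted in this article can be collected into a quick pre-upload check. The Python sketch below is only an illustration: the numbers are the ones stated above and may have changed since publication, gigabytes are assumed to be decimal, Figshare's 20 GB of free storage is treated as a cap on the total, and the repositories for which no limit is given here (Mendeley Data, Dryad, OSF) are left out. The names `Limits`, `LIMITS` and `repositories_that_fit` are invented for this example.

```python
from typing import Dict, List, NamedTuple, Optional

GB = 1000 ** 3  # assumption: decimal gigabytes


class Limits(NamedTuple):
    per_file: Optional[int]  # largest single file in bytes, None if not stated
    total: Optional[int]     # dataset or storage cap in bytes, None if not stated


# Figures as quoted in this article; check each repository's current terms.
LIMITS: Dict[str, Limits] = {
    "Figshare": Limits(per_file=5 * GB, total=20 * GB),
    "Harvard Dataverse": Limits(per_file=int(2.5 * GB), total=10 * GB),
    "Zenodo": Limits(per_file=None, total=50 * GB),
}


def repositories_that_fit(file_sizes: List[int]) -> List[str]:
    """Return the repositories whose quoted limits allow these files."""
    largest = max(file_sizes, default=0)
    total = sum(file_sizes)
    fits = []
    for name, limits in LIMITS.items():
        if limits.per_file is not None and largest > limits.per_file:
            continue  # one file is too large for this repository
        if limits.total is not None and total > limits.total:
            continue  # the dataset as a whole is too large
        fits.append(name)
    return fits


# A dataset with a 4 GB file and a 3 GB file
print(repositories_that_fit([4 * GB, 3 * GB]))
# prints ['Figshare', 'Zenodo']
```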

 

Conclusion

Research data can save lives, help develop solutions and maximise our knowledge. Promoting collaboration and cooperation among a global research community is the first step to reduce the burden of wasted research.

Although the waste of research data is an alarming issue, with billions of euros lost every year, the outlook is encouraging. The pressure to reduce the burden of wasted research is pushing journals, funders and academic institutions to make data sharing a strict requirement.

We hope with this series of articles on data sharing that we can light up the path for many researchers who are weighing the benefits of making their data open to the world.

The six research data repositories shared in this article are a practical way for researchers to preserve datasets across time and maximize the value of their work.

Cover image by Copernicus Sentinel data (2019), processed by ESA, CC BY-SA 3.0 IGO.

References:

“Harvard Dataverse,” Harvard Dataverse, https://library.harvard.edu/services-tools/harvard-dataverse

“Recommended Data Repositories.” Nature, https://go.nature.com/2zdLYTz

“DOI Marketing Brochure,” International DOI Foundation, http://bit.ly/2KU4HsK

“Managing and sharing data: best practice for researchers.” UK Data Archive, http://bit.ly/2KJHE53

Wikipedia contributors, “Figshare,” Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Figshare&oldid=896290279 (accessed August 20, 2019).

Walport, M., & Brest, P. (2011). Sharing research data to improve public health. The Lancet, 377(9765), 537–539. https://doi.org/10.1016/s0140-6736(10)62234-9

Foster, E. D., & Deardorff, A. (2017). Open Science Framework (OSF). Journal of the Medical Library Association : JMLA, 105(2), 203–206. https://doi.org/10.5195/jmla.2017.88

Wikipedia contributors, “Zenodo,” Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Zenodo&oldid=907771739 (accessed August 20, 2019).

Wikipedia contributors, “Dryad (repository),” Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Dryad_(repository)&oldid=879494242 (accessed August 20, 2019).

“How and Why Researchers Share Data (and Why They don't),” The Wiley Network, Liz Ferguson, http://bit.ly/31TzVHs

“Frequently Asked Questions,” Mendeley Data, https://data.mendeley.com/faq


Diego Menchaca

Diego is the founder and CEO of Teamscope. He started Teamscope from a scribble on a table. It instantly became his passion project and a vehicle into the unknown. Diego is originally from Chile and lives in Nijmegen, the Netherlands.
