The development of open educational resources about OScH has been suggested many times, and it is included as one of the proposed actions in our roadmap. I didn’t know of any initiatives within GOSH on this (sorry if I missed something), so I thought one easy way to get started would be to get OScH included in other (broader) educational resources. So I contacted Jon Tennant (@protohedgehog), who is coordinating the development of the Open Science MOOC (massive open online course), and he agreed to include contributions from the GOSH community about OScH in the course.
So far they have developed a very nice and comprehensive structure for the course, which comprises 10 modules. However, the proposed workflow for conducting open science did not include open hardware or materials: it started with data that had already been collected (this is very common in the open science movement, which seems to be quite dominated by data scientists).
Below I have copied the contents of Chapter 3, which Jon suggested as the place to include OScH. I have already added some content (in italics), but it would be great if more people from GOSH could contribute. The Google Doc is open to contributions from anyone, so you can go there and add more things. I think it can be a good way to raise the visibility of OScH and hopefully get more people involved. It could also create the basis for a more detailed MOOC about OScH.
3. Reproducible Research and Data Analysis
Rationale (3 lines max):
Reproducible research is at the heart of science. There has been an increased need and willingness to open and share research from the data collection right through to the interpretations of results. This has come with its own set of challenges, which include designing workflows that can be adopted by collaborators in a way that does not compromise the integrity of their contribution. This module will introduce the necessary tools required for transparent reporting which is reproducible and readable.
Learning Objectives (specific): Specific objectives will include workflow design, source data management, data manipulation, dynamic reporting and reproducible analysis.
LO3a: Learn about the nature of reproducible research, what the key requirements are, and which resources are available to support a workflow for reproducible research (knowledge)
LO3b: Be able to use available resources for reproducible research; be able to use a workflow which leads to reproducible research
- Open materials and hardware
- Data analysis documentation and open workflows
- Living figures and Markdown
- Pre-registration and prevention of p-hacking
- Reproducible analysis environments (virtualization)
- What are the computing options and environments that allow collaborative and reproducible set up?
- Mention of open materials resources/repositories/standards
Who to involve:
- Andy Byers, Anna Krystalli, Julien Colomb, Rutger Vos
- Brian Nosek (COS) on lessons learned from Reproducibility Studies in Psychology & Cancer Biology
- Lorena Barba, Karl Broman
- GOSH community (openhardware.science)
- AsPredicted [other pre-registration sites - to add]
- Jupyter notebooks & Rmarkdown
- Statcheck, GRIM
- VMs, Docker, Vagrant, binder, nteract.io
- Data hygiene and data provenance, e.g. (link)
- R for Data Science (r4ds)
- Bookdown (Yihui Xie)
- ModernDive (Chester Ismay)
- A Data Cleaner’s Cookbook
- ReproZip - Open source tool for full computational reproducibility
- Software Carpentry and Data Carpentry lessons
- Reproducibility PI Manifesto
- Initial steps toward reproducible research
- ROpenSci’s reproducibility guide
- Barba group Reproducibility Syllabus (summary of the group’s top-10 readings in reproducibility)
- Barba group’s onboarding course “Essential skills for reproducible research computing.” A four-day, intensive, hands-on workshop on the foundational skills that everyone using computers in the pursuit of scientific research should have.
- Barba, Lorena A. (2017): How to run a lab for reproducible research. Figshare, doi:10.6084/m9.figshare.4676170.v1 (presentation slides and presenter notes also on SpeakerDeck).
* Gathering for Open Science Hardware (GOSH, openhardware.science)
* Institutions/projects using open hardware/materials:
  * CERN’s Open Hardware Repository (http://www.ohwr.org/) and Open Hardware License
  * UFRGS Centro de Tecnologia Acadêmica (CTA, http://cta.if.ufrgs.br)
  * Michigan Tech Open Sustainability Technology research group (http://www.mse.mtu.edu/~pearce/Index.html)
  * Open Plant (https://www.openplant.org/)
- Find a core data set that is used throughout the examples
- If possible, the dataset should have a diverse set of formats and styles for different types of analysis
- Workflow design.
- A flowchart of options to help you get started, e.g. "Are your collaborators/supervisors using the same tools?" (YES/NO)
- Possible collaboration with ScienceTogether based on AlternativeTo.net
- This can be created as a Google Doc and shared here for collaboration