Introduction to Open Science
Overview
Teaching: 37 min
Exercises: 21 min
Questions
What is Open Science?
How can I benefit from Open Science?
Why has Open Science become a hot topic?
Objectives
Identify parts of the Open Science movement, their goals and motivations
Explain the main benefits of Open Science
Recognize the barriers and risks in the adoption of Open Science practices
Science works best by exchanging ideas and building on them. The most efficient science involves both questions and experiments being made as fully informed as possible, which requires the free exchange of data and information.
All practices that make knowledge and data freely available fall under the umbrella term of Open Science/Open Research. It makes science more reproducible, transparent, and accessible. As science becomes more open, the way we conduct and communicate science changes continuously.
What is Open Science
Open science is the movement to make scientific research (including publications, data, physical samples, and software) and its dissemination accessible to all levels of an inquiring society, amateur or professional.
Open Science represents a new approach to the scientific process based on cooperative work and new ways of diffusing knowledge by using digital technologies and new collaborative tools.
Open science is transparent and accessible knowledge that is shared and developed through collaborative networks.
Characteristics:
- Using web-based tools to facilitate information exchange and scientific collaboration
- Transparency in experimental methodology, observation, and collection of data
- Public availability and reusability of scientific data, methods and communications
What is the Open Science movement?
Sharing of information is fundamental for science. This began at a significant scale with the invention of scientific journals in 1665. At that time this was the best available alternative to critique & disseminate research, and foster communities of like-minded researchers.
Whilst this was a great step forward, the journal-driven system of science has led to a culture of ‘closed’ science, where knowledge or data is unavailable or unaffordable to many.
The distribution of knowledge has always been subject to improvement. Whilst the internet was initially developed for military purposes, it was hijacked for communication between scientists, which provided a viable route to change the dissemination of science.
Momentum has built up for changing the way science is communicated, reflecting what research communities are calling for: solutions to the majority of problems that we face today (e.g. impact factors, data reusability, the reproducibility crisis, trust in the public science sector, etc.).
Open Science is the movement to increase the transparency and reproducibility of research through the use of open best practices.
After Gema Bueno de la Fuente
Open Science Building Blocks
- Open Access: Research outputs hosted in a way that makes them accessible to everyone. Traditionally Open Access referred to journal articles, but it now includes books, chapters or images.
- Open Data: Data freely and readily available to access, reuse, and share. Smaller data sets have often been made accessible by journals as supplemental materials alongside the articles themselves. However, they should be hosted on dedicated platforms for more convenient and better access.
- Open Software: Software where the source code is made readily available; others are free to use, change, and share it. Examples include the programming language R and its supporting software RStudio, as well as image analysis software such as Fiji/ImageJ.
- Open Notebooks: Lab notebooks hosted online, readily accessible to all. These are popular among some of the large funding bodies and allow anyone to comment on any stage of the experimental record.
- Open Peer Review: A system where peer review reports are published alongside the body of work. This can include reviewers’ reports, correspondence between the parties involved, rebuttals, editorial decisions, etc.
- Citizen Science: Citizens participate in various stages of the research process, from project funding to collecting and analysing data. Lay people most commonly become involved in data collection or image analysis. Platforms such as zooniverse.org help connect projects with lay people interested in playing an active role in research, which can help generate and/or process data that would be unachievable for a single person.
- Scientific social networks: Networks of researchers, which often meet locally in teams but are also connected online, foster open discussions on scientific issues. Online, many people use traditional social media platforms for this, such as Twitter, Instagram, various sub-reddits, and discussion channels on Slack/Discord, although there are also more dedicated spaces such as researchgate.net.
- Open Educational Resources: Educational materials that are free for anyone to access and learn from. These can be anything from talks, instructional videos, and explanations posted on video hosting websites (e.g. YouTube), to entire digital textbooks written and then published freely online.
Exercise 1: Benefits of Open Science (5 min)
Being open has other outcomes/consequences beyond giving free access to information. For example, Open Educational Resources:
- enable collaborative development of courses
- improve teachers’/instructors’ skills by sharing ideas
Select one or two of the following OS parts:
- Open Access
- Open Data
- Open Software
- Open Notebooks
- Open Peer Review
and discuss the benefits of adopting those Open initiatives, or the problems that their adoption solves.
Solution
Possible benefits and consequences for each part:
Open Access
- speed of knowledge distribution
- levels the playing field for underfunded sites which otherwise wouldn’t be able to get past the paywall
- prevents articles being paid for ‘thrice’ (first to produce, second to publish, third to access) by institutions
- greater access to work by others, increasing the chance of exposure & citations
- access to work by lay audiences, thus increasing the social exposure of research
Open Data
- ensures data isn’t lost over time (reusability)
- acceleration of scientific discovery rate
- value for money/reduced redundancy
- permits statistical re-analysis of the data to validate findings
- gives access to datasets which were not published as papers (e.g. negative results, large screening data sets)
- provides an avenue to generate new hypotheses
- permits combination of multiple data sources to address questions, provides greater power than a single data source
Open Software
- great source to learn programming skills
- the ability to modify creates a supportive community of users and rapid innovation
- saves time
- faster bug fixes
- better error scrutiny
- use of the same software/code allows better reproducibility between experiments
- need funds to maintain and update software
Open Notebooks
- 100% transparent science, allowing input from others at early stages of experiments
- source of learning about the process of how science is actually conducted
- allows access to experiments and data which otherwise never get published
- provides access to ‘negative’ results and failed experiments
- anyone, anywhere around the world, at any time, can check in on projects, including many users simultaneously
- possibility of immediate feedback
- thorough evidence of originality of ideas and experiments, negating effect of ‘scooping’
Open Peer Review
- visibility leads to more constructive reviews
- mitigates against editorial conflicts of interest and/or biases
- mitigates against reviewers’ conflicts of interest and/or biases
- allows readers to learn/benefit from comments of the reviewers
Open Educational Materials
- fosters collaboration between educators/others
- shows clearly how a method was taught (e.g. Carpentries materials), so that it can be reproduced anywhere, anytime
- protects materials from becoming technologically obsolete
- everyone who prepares or contributes to the material earns credit (e.g. via GitHub)
- recycle animations and material that is excellent (why reinvent the wheel?)
Motivation: Money (8 min teaching)
One has to consider the moral objectives that accompany the research/publication process: charities/taxpayers pay to fund research, and then pay again to access the research they already funded.
From an economic point of view, scientific outputs generated by public research are a public good that everyone should be able to use at no cost.
According to the EU report “Cost-benefit analysis for FAIR research data”, €10.2bn is lost every year because data is not accessible (plus an additional €16bn when accounting for re-use and research quality).
One goal of Open Science is to make research and research data available to those who funded the research, e.g. charities/taxpayers.
cOAlition S, a group of national research funding organisations backed by the European Commission and the European Research Council, is a big driver trying to get rid of the paywalls that our research is sat behind. They announced Plan S, an initiative to make research publications fully free at the point of access, meaning that all research funded by public funding bodies must be published Open Access from 2021 onwards.
Open Access (a successful example)
The majority of the larger funding bodies in the UK and other countries are now making funding conditional upon Open Access publication.
The initiative is known as Plan S, which requires making “resulting publications available immediately (without embargoes) and under open licences, either in quality Open Access platforms or journals or through immediate deposit in open repositories that fulfil the necessary conditions.”
Exact requirements differ between funding bodies, with the minimum requirement being that a copy be deposited with your home institution.
Details of funding bodies and their involvement and requirements can be found at Plan S/cOAlition S. A cOAlition S journal checker tool to assess compliance is also being developed. The Directory of Open Access Journals (DOAJ) is a tool for finding out which journals are Open Access.
Motivation: Reproducibility
The inherent transparency of Open Science and the easy access to data, methods and analysis details naturally help to address the reproducibility crisis. The openness of scientific communications and of the actual process of evaluating the research (Open Peer Review) increases confidence in the research findings.
Personal motivators
Open Science is advantageous to many parties involved in science (including the research community, funding bodies, the public, and even journals), which is leading to a push for the widespread adoption of Open Science practices.
Large UK funding bodies such as The Wellcome Trust are big supporters of Open Science. We can see from the example of Open Access that, once enforced by funders (the stick), there is wide adoption. But what about the personal motivators, the carrots?
Exercise 2: Personal benefits of being “open” (4 min)
Below are some personal benefits of adopting Open Science practices. Read through them and decide which are the strongest motivators for you. Select the two most important/attractive for you and mark them with +1, then select the two least important for you and mark them with 0.
- receive more citations
- comply with funders’ policies
- get extra value from your work (e.g. collaborators, reuse by modellers, ML specialists)
- demonstrate research impact
- save own time (reproducibility but also communication overhead)
- become pioneers
- distinguish yourself from the crowd
- plan successful research proposals
- gain valuable experience
- form community
- increased speed and/or ease of writing papers
- speed up and help with peer review
- build reputation and presence in the science community
- evidence of your scientific rigour and work ethic
- avoid embarrassment/disaster when you cannot reproduce your results
Can you think of other benefits? How do the personal benefits of Open Science compare to the benefits for the (scientific) society?
(3 min teaching)
The main difference between the public benefits of Open Science practices and the personal motivators of output creators is that the public can benefit almost instantly from open resources. However, the advantages for the data creator come with a delay, typically counted in years. For example, building a reputation will not happen with one dataset, and re-use will lead to citations or collaborations only after the next research cycle.
Barriers and risks of the OS movement:
Exercise 3: Why we are not doing Open Science already (4 min)
Discuss Open Science barriers, mention the reasons for not already being open:
Solution
- sensitive data (anonymising data from administrative health records can be difficult)
- IP
- misuse (fake news)
- lack of confidence (the fear of critics)
- lack of expertise
- the costs in $ and in time
- novelty of data
- it is not mandatory
- lack of credit (publishing negative results is of little benefit to you)
(9 min teaching)
It may seem obvious that we should adopt open science practices, but there are associated challenges with doing so.
Sensitivity of data is sometimes considered a barrier. Shared data needs to be compliant with data privacy laws, leading many to shy away from hosting it publicly. Anonymising data to desensitise it can help overcome this barrier.
The potential for intellectual property on research can dissuade some from adopting open practices. Again, much can be shared if the data is filtered carefully to protect anything relating to intellectual property.
Another risk could be seen with work on COVID-19: pre-prints. A manuscript hosted publicly prior to peer review may accelerate access to knowledge, but can also be misused and/or misunderstood. This can result in political and health decision making based on faulty data, which is counter to society’s best interest.
One concern is that opening up one’s data to the scientific community can lead to the identification of errors, which may lead to feelings of embarrassment. However, this could be considered an upside: we should seek for our work to be scrutinized and for errors to be pointed out, and doing so is the sign of a competent scientist. It is better to have errors pointed out than to risk irreproducible data causing even more embarrassment and disaster.
One of the biggest barriers is the cost involved in “being Open”. Firstly, making outputs readily available and usable to others takes time and significant effort. Secondly, there are costs of hosting and storage. For example, microscopy datasets reach sizes in terabytes, and making such data accessible for 10 years involves a serious financial commitment.
Get involved
Thankfully, incentive structures are beginning to support Open Science practices:
- Universities are signing up to the Declaration on Research Assessment (DORA).
- The Wellcome Trust funds proposals that increase Open Science.
- The Wellcome Trust asks for a description of Open Science activities in grant applications.
You do not want to be left behind!
Where to next
Further reading/links:
Exercise 4: Open Science Quiz (5 min + runs over break)
Which of the following statements about the OS movement are true/false?
- Open Science relies strongly on the Internet
- Open Access eliminates publishing costs
- Open Data facilitates re-use
- Open Data increases confidence in research findings
- In Open Peer Review, readers vote on publication acceptance
- Open Notebooks improve reproducibility
- Open Notebooks can create patenting issues
- Open Access permits the whole society to benefit from scientific findings
- Citizen Science engages the public in the research process
- Citizen Science can help get ecological data quickly
Solution
- Open Science relies strongly on the Internet T
- Open Access eliminates publishing costs F
- Open Data facilitates re-use T
- Open Data increases confidence in research findings T
- In Open Peer Review, readers vote on publication acceptance F
- Open Notebooks improve reproducibility T
- Open Notebooks can create patenting issues T*
- Open Access permits the whole society to benefit from scientific findings T
- Citizen Science engages the public in the research process T
- Citizen Science can help get ecological data quickly T
Attribution
Content of this episode was adapted from:
Key Points
Open Science increases transparency in research
Publicly funded science should be publicly available