Peer Reviewers' Right to View Code and Data

We live in the times of a reproducibility crisis in science (Open Science Collaboration 2015; Baker 2016; National Academies of Sciences, Engineering, and Medicine 2019). Even Ecology and Evolutionary Biology have been shown to be affected by many of the factors involved in decreasing research transparency and enhancing publication bias (Fidler et al. 2017; Fraser et al. 2018; O'Grady 2020). Regrettably, we do not have enough replication studies in Ecology and Evolutionary Biology to quantify the extent of the reproducibility problem (Kelly 2019; Fraser et al. 2020), but those published to date underscore potential issues (Clark et al. 2020a, b; Munday et al. 2020). It appears that Behavioral Ecology is not impervious to the reproducibility crisis (Parker 2013; Ihle et al. 2017; Stevens 2017; Farrar et al. 2020).

There have been many calls and specific suggestions to reverse reproducibility issues (Hampton et al. 2015; Munafò et al. 2017). One of the strategies to improve research transparency is computational and statistical reproducibility (Buckheit and Donoho 1995; National Academies of Sciences, Engineering, and Medicine 2019), which has been specifically considered in many ecological journals (Mislan et al. 2016; Archmiller et al. 2020; Culina et al. 2020). The idea is to make available not only the data but also the code (in R, Python, SAS, etc.) used to process raw data, simulate data, run statistical analyses, and develop figures, so that readers can generate the same outcomes as those presented in the paper (Sandve et al. 2013; Ihle et al. 2017; Perkel 2019; Powers and Hampton 2019). Many ecological journals already ask authors to make their data available with the publication of a paper, and authors appear to have embraced this practice (about 79% of papers comply; Culina et al. 2020), although a large proportion of these datasets appear too incomplete to be reused (Roche et al. 2015). Some journals have also made the availability of code with the publication mandatory or encouraged, but authors appear not to have followed this practice as much (only about 27% of papers comply; Culina et al. 2020). Other papers report similar trends in Ecology (Mislan et al. 2016; Archmiller et al. 2020).

Powers and Hampton (2019) argue that data and code should actually become available during (instead of after) the peer review process. The motivation is to give reviewers the opportunity to inspect the data structure, inspect the code, and even run it if they wish to do so. There are two components to reviewing the code: first, to ensure that the code provided by the authors produces the same results as those reported in the manuscript; second, to ensure that the statistical analyses used in the code are aligned with the experimental design and the structure of the data provided by the authors (i.e., model assumptions tested, appropriate data transformations, formulation of the statistical model matching the temporal and spatial structure of the data points, correct interpretation of statistical model coefficients, P-values, and effect sizes). The benefits of reviewing data/code are not small (Ihle et al. 2017; Powers and Hampton 2019): (a) reviewers are provided with a new dimension in the evaluation of the strengths and weaknesses of the experimental design, data collected, and analyses conducted; (b) reviewers can then provide very specific comments to help authors improve the quality of the manuscript; (c) editors can optimize the limited availability of reviewers' time by better establishing the fit of a manuscript to a given journal from a statistical viewpoint; and (d) the overall research transparency of a discipline can be enhanced. Multiple guidelines have been proposed to implement computational and statistical reproducibility (Noble 2009; Hampton et al. 2015; Mislan et al. 2016; Wilkinson et al. 2016; Cooper and Hsing 2017; Ihle et al. 2017; Powers and Hampton 2019; Van Lissa et al. 2020).
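To make those two components concrete, here is a minimal sketch (in R, using the lme4 package) of what a reviewer-side check could look like. The file name, variable names, and model structure are hypothetical placeholders for whatever the authors actually deposit with a submission:

```r
# Hypothetical reviewer-side check: re-run the authors' analysis and inspect it.
library(lme4)  # assumed mixed-model package; substitute whatever the authors used

dat <- read.csv("data/exploration_scores.csv")  # data file deposited with the submission
str(dat)  # confirm that the data structure matches the described experimental design

# Component 1: re-fit the reported model and compare coefficients, effect sizes,
# and test statistics with the values in the manuscript.
m <- lmer(exploration ~ treatment + (1 | individual_id), data = dat)
summary(m)

# Component 2: check that assumptions and transformations are handled explicitly.
plot(fitted(m), resid(m))           # homogeneity of residual variance
qqnorm(resid(m)); qqline(resid(m))  # approximate normality of residuals
```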

There is something very important for the whole community to realize: allowing data and code to be available during peer review is not intended to identify scientific fraud. It is actually intended to enhance the quality of a manuscript! More specifically, the motivation is to ensure that the experimental design, the structure of the data, and the statistical analyses are all aligned logically. Reviewers may have no problem assessing that alignment with simple experimental designs, simple data sets, and simple statistical analyses (e.g., independent sample t-tests). Yet, the same task may be much more challenging with the more complex experimental designs (e.g., repeated measures) and statistical analyses (e.g., mixed models) often used in Behavioral Ecology. When manuscripts provide little detail about their statistical analyses (e.g., "We analyzed our data with a linear mixed model."), reviewers are placed in a very difficult position that sometimes leads to an unproductive cycle: reviewers may not respond favorably to the manuscript, which deeply frustrates authors.
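As an illustration of why a one-line description is not enough, consider two models that could both be reported as "a linear mixed model" for a repeated-measures design. This is only a sketch with hypothetical variable names, not any particular study's analysis:

```r
library(lme4)

# Both models below could be described as "a linear mixed model", yet they encode
# different assumptions about the repeated-measures structure of the data.
m1 <- lmer(response ~ treatment + (1 | individual), data = dat)          # random intercepts only
m2 <- lmer(response ~ treatment + (1 + trial | individual), data = dat)  # individuals also differ in their trajectories across trials
anova(m1, m2)  # compares the two fits; without the code, a reviewer cannot tell which structure was actually used
```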

Nevertheless, sharing both data and code for peer review is not easy. In my lab, we have spent almost two years figuring out strategies to implement computational and statistical reproducibility in our workflows. We transitioned to a point at which, besides writing a manuscript, we spend days/weeks writing code that can be understood and run by others. Adopting this approach obviously adds to the already long to-do lists of scientists who are already overworked and stressed by deadlines and the high level of competition to publish and get funded. The counterpoint is that the process and the outcome can be quite rewarding. We have learned a lot about bottlenecks in the lab's workflows that made computational and statistical reproducibility challenging. We are now trying to bridge the gap in data curation and statistical procedures between former and current lab members, and it will probably take a few years to close this gap. Still, moving forward, we strive to have data plus code ready for the submission of any manuscript, irrespective of whether the journal makes it mandatory or not.
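For illustration, the kind of conventions we converged on looks roughly like the sketch below; the project layout, package, and file names are examples rather than a prescription:

```r
# Sketch of a self-contained analysis script; the directory layout is illustrative:
#   data/raw/        untouched raw data
#   R/01_clean.R     data processing
#   R/02_models.R    statistical analyses
#   R/03_figures.R   figures
#   README.md        how to run the scripts, and in what order

library(here)       # build file paths relative to the project root
set.seed(20210101)  # make any simulation or resampling step repeatable

dat <- read.csv(here("data", "raw", "exploration_scores.csv"))

# ... data processing, models, and figures go here ...

sessionInfo()       # record R and package versions alongside the results
```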

After all this hard work, I started noticing that my reference points for reviewing manuscripts (our own and others') shifted in a dramatic fashion. Now I have a hard time critically reviewing a manuscript without having very easy access to the data and the code used. For instance, by looking at the code, I can easily establish whether (and most importantly how) the assumptions of statistical models were checked and what was done to deal with assumption violations, without having to ask authors for more details every time I get a new version or having to assume that it was done one way or another. Having access to code plus data has sharpened my ability to notice the strengths and weaknesses of a manuscript and to offer a very detailed roadmap on how authors can make the best out of the data they gathered.

Given this paradigm shift, I have started requesting data plus code from both authors and collaborators in the last few months. Authors argued that they might be willing to make data and code available after the official acceptance of the manuscript. This answer appears rooted in the perception that their data and/or code could be "scooped." Although understandable, this view is not aligned with the reality of mandatory open data (in some cases enforced by both funding agencies and journals) and the open platforms (R, Python, etc.) scientists are embracing worldwide (Lai et al. 2019; http://r4stats.com/articles/popularity/). When journals request sharing data, authors can use a repository or an electronic supplementary file during submission. When it comes to code, using open code from others (provided the source is acknowledged) is accepted in some scientific communities to maintain analytical standards or even improve them (Barnes 2010; Perkel 2019). Sharing data and/or code in certain repositories will generate a DOI (Mislan et al. 2016; https://guides.github.com/activities/citable-code/), ensuring authors have an official record of their work and ultimately decreasing the chances of scooping. More importantly, the peer review process is the moment to assess the computational and statistical robustness of the study and to find areas for improvement. For instance, the three times I have asked authors to share data and code as a reviewer/editor, I found that the code did not match the structure of the data and, when fixed, the results were slightly to very different from the ones originally reported in the first version of the manuscript. Although no conclusion can be drawn from a sample size of 3, the contention still stands: reviewer-suggested modifications of the statistical analyses could change the results and their discussion in major ways. If those modifications in analyses/interpretation are major and occur after publication, they could even lead to the retraction of papers (examples: https://retractionwatch.com/), questioning the credibility of not only authors but also reviewers/editors (Anonymous 2020; Wessel 2020).

When I asked collaborators about sharing data and code, their responses showed an extremely high degree of variation: from receiving them right away to different levels of reticence, including initial second-guessing and even declining the request altogether, arguing it was offensive. The latter response is also understandable. Authors pour their hearts and souls into developing their experimental designs, deciding on specific statistical tests, processing data, running the analyses, etc. When a collaborator out of the blue asks to see the data and code, it can lead to perceptions of mistrust. Still, this can also be an opportunity to educate ourselves and our colleagues. First, when setting up a research collaboration, we ought to lay out the expectations on computational and statistical reproducibility between labs. My lab now states that whatever data and code we generate, we will share them (whether requested or not) as new versions are developed, and that we expect the other lab(s) to re-run them to make sure we are on the same analytical page. We also mention that we would appreciate this being reciprocated during the collaboration. For collaborations already established, our strategy is now to explain the value of computational and statistical reproducibility, attaching the following papers (Mislan et al. 2016; Cooper and Hsing 2017; Powers and Hampton 2019), and eventually kindly asking for data and code. These strategies have smoothed out many of the negative responses we used to get. More importantly, running the code of collaborators has led to strengthening our connections with them. For instance, we now have virtual sessions running sections of code that are unclear, which often turn into excellent conversations about model assumptions, data transformations, statistical model set-up, and even interpretations of the results. These sessions have become great learning resources for everyone involved. When grad students and post-docs participate, these sessions help them better understand the mechanics and value of research collaborations between labs.

Editors also have concerns about encouraging (or making it mandatory for) authors to share data and code at the moment of submission. The argument is that it is already difficult to find reviewers to read manuscripts; thus, burdening reviewers even more by also asking them to assess code and data could unbalance the already strained peer-review process (Hatchwell 2017). This is a fair comment, but it needs to be considered in a broader context. One of the main reasons behind promoting computational/statistical reproducibility during peer review is to strengthen the analytical support for the results of a study. If a journal does not make reviewing data/code even an option, we run the risk that manuscripts published in that journal might not have reached their peak in computational and statistical quality. This could hurt the reputation of a journal. Ultimately, this dilemma can be thought of as a trade-off between reviewers' time allocation and the quality of published papers. There are several non-mutually exclusive strategies to embrace a culture of reviewing data/code while reducing the high implementation costs per reviewer: (a) adopting volunteer/paid code editors for only complex statistical procedures (e.g., mixed models, neural networks, etc.); (b) encouraging (rather than expecting) reviewers to also take a look at the code if willing to do so; (c) adding a reviewer to only go over the code, on top of the two often asked to review the whole manuscript; (d) encouraging authors to upload pre-prints with data and code before submission to get comments from the community, and submitting those comments related to data/code implementation (and the authors' answers) as part of the submission process; (e) requiring authors to get a certificate of code reproducibility before submission via https://codecheck.org.uk/ to ensure their code can be run successfully by anybody; (f) encouraging authors to have a colleague (not associated with the authorship of the manuscript) run the code for reproducibility purposes and provide their name and contact info as part of the submission process; etc. The main idea is that the more people see the authors' code, the higher the chances of improving it (Ihle et al. 2017). That in itself can raise the overall quality of manuscripts published in a given journal, benefiting the whole discipline.

For the whole community to buy into the submission of data and code for peer review, editorial offices should consider changing the way manuscripts are evaluated. One of the fears authors may have is that reviewers' suggestions related to data processing and data analysis may change the results of the study and, consequently, the interest of the journal in further considering the manuscript for publication. This fear stems from a culture in which the chances of acceptance rely heavily on "beautiful results" rather than on the strength of the experimental design addressing a given question. As many advocates of open science have stated, we need to focus on "beautiful methods" rather than "beautiful results" (Nosek et al. 2012). This might not be a popular editorial shift in some journals, but it has the potential to strengthen our discipline by providing authors some certainty that modification of the results (as long as they have a well-defined question followed by a strong experimental design) through the peer review of their code will not be penalized. This is not a small detail, as publishing is still an important component of advancing a scientist's career.

There are a few additional arguments that can be made to hopefully tip the balance in the direction of submitting data and code for peer review in the context of research reproducibility.

First, once a lab embraces a culture of sharing data and code for peer review, its publication rate may increase, because unpublished manuscripts by former students may be easier to submit by PIs and new grad students given the clear understanding of how the data were collected and analyzed in the past. This is not commonly the case when labs do not have high standards of computational and statistical reproducibility, leading to datasets that are never published, which hurts both the lab and the entire discipline.

Second, making the data and code accessible for peer review may facilitate the work of reviewers in major ways, which might lead to more positive decisions about revisions (instead of plain rejections), boosting the chances of eventual acceptance of a manuscript. Obviously, this line of logic would require proper empirical testing.

Third, sharing data and code for peer review provides an opportunity for the analytical foundations of a study to stand on their own rather than on the name and/or prestige of certain principal investigators and/or institutions. This can promote diversity and inclusion goals in academic environments (Grahe et al. 2019).

Fourth, preparing the data and code for peer review can facilitate the process of making them open right after the manuscript is accepted. This in turn can lead to the data and code being more easily reused by the community, increasing the chances of new collaborations, boosting paper citation rates (Piwowar et al. 2007; Piwowar and Vision 2013; McKiernan et al. 2016; Christensen et al. 2019; Colavizza et al. 2020), and making it possible for others to include the results of the study in meta-analyses, as effect sizes and measures of variation can be more easily extracted. Additionally, papers with easily accessible data and code can get greater exposure to undergraduate and graduate students taking statistics courses, as instructors are often in search of papers with open data and code to provide real-life examples in their classes. Ultimately, open data and code can be thought of as a learning resource for early career researchers in our discipline.

The take-home message is that making data and code available for peer review (and reviewing them) does have implementation costs; however, the multiple benefits may outweigh these costs. Editors of journals that publish behavioral ecology papers have the opportunity to lead the way in shifting research practices and make transformative and lasting changes. Some journals are already taking the necessary steps for data/code availability during peer review, such as Behavioral Ecology and Sociobiology (Bakker and Traniello 2020). The hope is that we soon move to a new standard in peer review in the behavioral sciences. It is a timely and healthy move for Behavioral Ecology.

References

  • Anonymous (2020) Editorial: regarding mentorship. Nat Commun 11:6447. https://doi.org/10.1038/s41467-020-20618-x

  • Archmiller AA, Johnson AD, Nolan J, Edwards M, Elliott LH, Ferguson JM, Iannarilli F, Vélez J, Vitense K, Johnson DH, Fieberg J (2020) Computational reproducibility in The Wildlife Society's flagship journals. J Wildl Manag 84:1012–1017. https://doi.org/10.1002/jwmg.21855

  • Baker M (2016) Is there a reproducibility crisis? Nature 533:452–454

  • Bakker TCM, Traniello JFA (2020) Ensuring data access, transparency, and preservation: mandatory data deposition for Behavioral Ecology and Sociobiology. Behav Ecol Sociobiol 74:132

  • Barnes N (2010) Publish your computer code: it is good enough. Nature 467:753

  • Buckheit JB, Donoho DL (1995) WaveLab and reproducible research. In: Antoniadis A, Oppenheim G (eds) Wavelets and statistics. Springer-Verlag, New York, pp 55–81

  • Christensen G, Dafoe A, Miguel E, Moore DA, Rose AK (2019) A study of the impact of data sharing on article citations using journal policies as a natural experiment. PLoS ONE 14:e0225883. https://doi.org/10.1371/journal.pone.0225883

  • Clark TD, Raby GD, Roche DG, Binning SA, Speers-Roesch B, Jutfelt F, Sundin J (2020a) Ocean acidification does not impair the behaviour of coral reef fishes. Nature 577:370–375

  • Clark TD, Raby GD, Roche DG, Binning SA, Speers-Roesch B, Jutfelt F, Sundin J (2020b) Reply to: Methods matter in repeating ocean acidification studies. Nature 586:E25–E27

  • Colavizza G, Hrynaszkiewicz I, Staden I, Whitaker K, McGillivray B (2020) The citation advantage of linking publications to research data. PLoS ONE 15:e0230416. https://doi.org/10.1371/journal.pone.0230416

  • Cooper N, Hsing P-Y (eds) (2017) A guide to reproducible code in Ecology and Evolution. British Ecological Society, London. https://www.britishecologicalsociety.org/wp-content/uploads/2017/12/guide-to-reproducible-code.pdf. Accessed 12 Jan 2021

  • Culina A, van den Berg I, Evans S, Sánchez-Tójar A (2020) Low availability of code in ecology: a call for urgent action. PLoS Biol 18:e3000763

  • Farrar BG, Boeckle M, Clayton NS (2020) Replications in comparative cognition: what should we expect and how can we improve? Anim Behav Cogn 7:1–22

  • Fidler F, Chee YE, Wintle BC, Burgman MA, McCarthy MA, Gordon A (2017) Metaresearch for evaluating reproducibility in ecology and evolution. Bioscience 67:282–289

  • Fraser H, Barnett A, Parker TH, Fidler F (2020) The role of replication studies in ecology. Ecol Evol 10:5197–5207

  • Fraser H, Parker T, Nakagawa S, Barnett A, Fidler F (2018) Questionable research practices in ecology and evolution. PLoS ONE 13:e0200303

  • Grahe JE, Cuccolo K, Leighton DC, Cramblet Alvarez LD (2019) Open science promotes diverse, just, and sustainable research and educational outcomes. Psychol Learn Teach 19:5–20

  • Hampton SE, Anderson SS, Bagby SC et al (2015) The Tao of open science for ecology. Ecosphere 6:1–13. https://doi.org/10.1890/ES14-00402.1

  • Hatchwell BJ (2017) Replication in behavioural ecology: a comment on Ihle et al. Behav Ecol 28:360

  • Ihle M, Winney IS, Krystalli A, Croucher M (2017) Striving for transparent and credible research: practical guidelines for behavioral ecologists. Behav Ecol 28:348–354

  • Kelly CD (2019) Rate and success of study replication in ecology and evolution. PeerJ 7:e7654. https://doi.org/10.7717/peerj.7654

  • Lai J, Lortie CJ, Muenchen RA, Yang J, Ma K (2019) Evaluating the popularity of R in ecology. Ecosphere 10:e02567. https://doi.org/10.1002/ecs2.2567

  • McKiernan EC, Bourne PE, Brown CT et al (2016) How open science helps researchers succeed. eLife 5:e16800

  • Mislan KAS, Heer JM, White EP (2016) Elevating the status of code in ecology. Trends Ecol Evol 31:4–7

  • Munafò MR, Nosek BA, Bishop DVM, Button KS, Chambers CD, Percie du Sert N, Simonsohn U, Wagenmakers E-J, Ware JJ, Ioannidis JPA (2017) A manifesto for reproducible science. Nat Hum Behav 1:0021. https://doi.org/10.1038/s41562-016-0021

  • Munday PL, Dixson DL, Welch MJ et al (2020) Methods matter in repeating ocean acidification studies. Nature 586:E20–E24

  • National Academies of Sciences, Engineering, and Medicine (2019) Reproducibility and replicability in science. The National Academies Press, Washington DC

  • Noble WS (2009) A quick guide to organizing computational biology projects. PLoS Comput Biol 5:e1000424. https://doi.org/10.1371/journal.pcbi.1000424

  • Nosek BA, Spies JR, Motyl M (2012) Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspect Psychol Sci 7:615–631

  • O'Grady C (2020) Ecologists push for more reliable research. Science 370:1260–1261

  • Open Science Collaboration (2015) Estimating the reproducibility of psychological science. Science 349:aac4716. https://doi.org/10.1126/science.aac4716

  • Parker TH (2013) What do we really know about the signalling role of plumage colour in blue tits? A case study of impediments to progress in evolutionary biology. Biol Rev 88:511–536

  • Perkel JM (2019) Paper lets scientists play with each other's results. Nature 567:17–18

  • Piwowar HA, Day RS, Fridsma DB (2007) Sharing detailed research data is associated with increased citation rate. PLoS ONE 2:e308. https://doi.org/10.1371/journal.pone.0000308

  • Piwowar HA, Vision TJ (2013) Data reuse and the open data citation advantage. PeerJ 1:e175

  • Powers SM, Hampton SE (2019) Open science, reproducibility, and transparency in ecology. Ecol Appl 29:e01822. https://doi.org/10.1002/eap.1822

  • Roche DG, Kruuk LEB, Lanfear R, Binning SA (2015) Public data archiving in ecology and evolution: how well are we doing? PLoS Biol 13:e1002295. https://doi.org/10.1371/journal.pbio.1002295

  • Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten simple rules for reproducible computational research. PLoS Comput Biol 9:e1003285

  • Stevens JR (2017) Replicability and reproducibility in comparative psychology. Front Psychol 8:862

  • Van Lissa CJ, Brandmaier AM, Brinkman L, Lamprecht A, Peikert A, Struiksma ME, Vreede B (2020) WORCS: a workflow for open reproducible code in science. PsyArXiv. https://doi.org/10.31234/osf.io/k4wde

  • Wessel L (2020) After scathing critiques of study on gender and mentorship, journal says it is reviewing the work. Science. https://doi.org/10.1126/science.abf8164

  • Wilkinson MD, Dumontier M, Aalbersberg IJ et al (2016) The FAIR guiding principles for scientific data management and stewardship. Sci Data 3:160018. https://doi.org/10.1038/sdata.2016.18

Acknowledgements

I truly thank Theo Bakker, James Traniello, Susan Healy, Nancy Solomon, Tim Parker, and Brian Nosek for sharing their views and input on this topic, and my lab members for embracing the transition into computational and statistical reproducibility.

Author information

Corresponding author

Correspondence to Esteban Fernández-Juricic.

Cite this article

Fernández-Juricic, E. Why sharing data and code during peer review can enhance behavioral ecology research. Behav Ecol Sociobiol 75, 103 (2021). https://doi.org/10.1007/s00265-021-03036-x
