Brussels / 1 & 2 February 2020


AMENDMENT Transforming scattered analyses into a documented, reproducible and shareable workflow


This talk reports on the experience of helping a researcher transform a series of scattered analyses into a documented, reproducible and shareable workflow.
Researchers usually have little time, compared with their other tasks, to write the code for the analyses that answer their scientific questions. As a result, multiple small experiments are developed and their outputs are gathered as best as possible to be presented in a scientific paper. However, science is not only about sharing results but also about sharing methods. How can we make our results reproducible when we have developed multiple, usually undocumented, analyses? What do we do if the program only works with the directory structure of our own computer? It is always possible to take the time to rewrite, rearrange and document the analyses when we want or have to share them. Here, I will take the example of a "collaboration fest" where we dissected the R scripts of a researcher in ecology. We started a reproducible, documented and open-source R package along with its website, automatically built using continuous integration: https://cesco-lab.github.io/Vigie-Chiro_scripts/.
However, can we find, earlier in the process, a better way to use our short programming time slots by adopting a method that will save us time later? To this end, I will present a documentation-first method that takes little time while writing the analyses but saves a lot when the time comes to share your work.

Session type (Lecture or Lightning Talk)

Lecture

Session length (20-40 min, 10 min for a lightning talk)

30 min

Expected prior knowledge / intended audience

No prior knowledge is expected. The example will be about building documentation for R software, but any developer, using any programming language, may be interested in the method.

Speaker bio

Sébastien Rochette has a PhD in marine ecology. After a few years as a researcher in ecology, he joined ThinkR (https://rtask.thinkr.fr), a company providing courses and consultancy around the R software. Alongside commercial activities, he is highly involved in the development of open-source R packages. He also shares his experience with the R community through free tutorials, blog posts, online help and conference talks. https://statnmap.com/

Links to code / slides / material for the talk (optional)

I wrote a blog post in French about what I am planning to present: https://thinkr.fr/transformer-plusieurs-scripts-eparpilles-en-beau-package-r/
This topic is also related to another blog post: https://rtask.thinkr.fr/when-development-starts-with-documentation/

Links to previous talks by the speaker

My talks about R are in my GitHub repository: https://github.com/statnmap/prez/. The "README" lists the talks for which a live video recording is available.
As a researcher, I have also given multiple talks about marine science, modelling and other topics related to my research.

Please note that this talk was originally scheduled to be at 17h. The talk originally in this slot was "Developing from the field." by Audrey Baneyx and Robin de Mourat which will now take place at 17h.

Speakers

Sébastien Rochette

Links