!! Version Control Systems (Guille 20%)
Version control systems (VCSs) are one of the cornerstones of reproducibility.
Informally speaking, a VCS works as a database that saves versions of your project.
That is, it stores every change you make, once you instruct it to do so.
Thus, the main feature of a VCS is that it allows us to freeze our project at any point in time, and then query and recover old versions as they were.
These features improve our day-to-day work with two more direct benefits: they remove the fear of change and allow us to clean up unused and old code.
Inexperienced developers often fear changing some parts of their project.
This happens mainly when developers cannot guarantee that the changed code is right or will work properly.
The absence or the complexity of testing feeds this fear even more.
However, when using a VCS, making mistakes is no longer painful.
We can go back to any saved version and ignore or discard potentially wrong versions.
Finally, a VCS stores not only our versions but also useful metadata.
For example, it records the timestamp of each change, its author, and a comment describing it.
This avoids embedding this kind of identifying data in the code itself, leaving the code more readable and single-purpose, while the VCS manages and labels changes automatically.
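The ideas above (saving versions on request, recovering an old one, and keeping metadata outside the code) can be illustrated with a minimal sketch. The `ToyRepository` class and its method names are hypothetical and exist only for this illustration; this is not how Git is implemented, and a real VCS stores changes far more efficiently.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    """One saved snapshot plus the metadata a VCS records for it."""
    files: dict          # file name -> content at the time of saving
    author: str
    comment: str
    timestamp: datetime


class ToyRepository:
    """Hypothetical in-memory version store, for illustration only."""

    def __init__(self):
        self.versions = []

    def save(self, files, author, comment):
        # Copy the files so later edits cannot alter the saved version.
        version = Version(dict(files), author, comment,
                          datetime.now(timezone.utc))
        self.versions.append(version)
        return len(self.versions) - 1  # the new version's number

    def recover(self, number):
        # Return a copy of the files exactly as they were saved.
        return dict(self.versions[number].files)

    def history(self):
        # Query the metadata without touching file contents.
        return [(n, v.author, v.comment)
                for n, v in enumerate(self.versions)]


repo = ToyRepository()
project = {"analysis.py": "print(1 + 1)"}
first = repo.save(project, "alice", "Initial analysis script")

project["analysis.py"] = "print(1 + 2)  # risky change"
repo.save(project, "alice", "Try a different formula")

# The change turned out to be wrong: go back to the first version.
project = repo.recover(first)
print(project["analysis.py"])
print(repo.history())
```

Note that the author and comment live in the repository, not in `analysis.py`, which is the separation described above.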
In this chapter we will explore the basics of VCSs with Git and GitHub.
We have chosen Git because of its current popularity, and GitHub because it is one of the most prominent hosting platforms for Git.
However, most of the concepts apply readily to other VCSs and platforms.
The chapter starts with setting up a repository, storing changes in it, and querying them.
The second part shows more advanced features such as tagging and branches.
!!! Amateurs, engineers and researchers: What are VCSs useful for?
VCSs are mostly used to store source code.
However, they can usually store any kind of file, from text files to binary files.
This means that we can use VCSs to version text documents, pictures, websites, PDFs, Excel files, and more.
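One way to see why the file type does not matter is that Git identifies file contents purely by their bytes. The sketch below assumes Git's default SHA-1 object format, in which a file's identifier is the hash of a short `blob <size>` header followed by the raw content; the helper name `git_blob_id` is hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


text_file = "hello\n".encode("utf-8")
binary_file = bytes([0x89, 0x50, 0x4E, 0x47])  # first bytes of a PNG signature

# Text and binary contents go through exactly the same procedure.
print(git_blob_id(text_file))
print(git_blob_id(binary_file))
```

Binary files can be stored this way just like text, although line-based comparisons between versions are naturally more informative for text files.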
Software engineers use VCSs to manage their software projects.
They save versions of the project to record all file changes on, at a minimum, a daily basis.
They may also use them to store the project's documentation.
These technologies can, however, be valuable in other fields, particularly in research, where the reproducibility of experimental results and documents is important.
Researchers can use VCSs to version experimental setups and their results, and to track the progress and changes of their research papers.
While this chapter covers the basics of VCSs using Git, the workflow-specific chapters will show how to apply these techniques in specific scenarios: reproducible code, reproducible papers, and reproducible documentation.
!!! Terminology
A Git repository is like a database storing changes.
Nowadays, good practice is to store the main repository on a remote machine, usually a server hosted by our company or university, or in the cloud.
We avoid storing our main repository only on our own machine, as losing that machine would mean losing our project.
As we shall see, a remotely stored Git repository advantageously replaces manual backups.
!!! Metaphors
!!! Semantic Versioning