-
In recent years, open-source software has become an increasingly important part of software infrastructure, spreading from its niche of core developers to the world of business and to that of research and development.
-
Open-source software aligns with the goals of open science, which aims to remove any barrier to knowledge in research and the dissemination of scientific results.
-
Making supporting data freely and easily available makes reproducibility tests easier. Moreover, open-source libraries (where a library is a self-contained software package) make code reusable. The shift from aiming only at reproducible data to providing reusable code can reshape the way scientific discovery is performed.
-
Some recent incentives in academia have leaned toward competition and personal achievement, from publication metrics to personal citation counts. Open-source libraries can in many cases realign collaboration and cooperation with scientific progress.
-
Python provides an intuitive syntax that makes debugging easier and lets you focus on the research project.
-
Python is an open-source language that has seen steady growth since the 2000s.
-
Python provides a rich ecosystem of excellent libraries for numerical and scientific investigations, such as NumPy, SciPy, Matplotlib, and Cython.
-
Python libraries are developed by an inclusive, grassroots community of developers. Most open-source Python libraries are maintained by volunteers, often academic researchers in the case of scientific computing.
-
Install Python. Older macOS versions come with Python 2.x already installed, while recent ones no longer bundle it. Since Python 2 reached end of life in January 2020 and most scientific libraries in the Python ecosystem have stopped supporting it, it is better to install a 3.x version.
-
Download Python 3.x together with a package manager, pip or conda. Pip is the standard package installer for Python: it fetches packages from the Python Package Index (PyPI) and helps manage installations. Conda is the package manager that comes with Anaconda, a distribution that bundles Python with a large set of libraries. For most cases, downloading miniconda, a lighter version of Anaconda that is installed from the Terminal and includes only a limited number of libraries, might work.
-
Download git. Git is a version control system that tracks changes to your code over time. It can be downloaded at https://git-scm.com/.
-
Set up a personal GitHub account at https://github.com/join. You can use it to manage your folders ("repositories") and to copy, download, and upload code, allowing other users to collaborate on the same project. GitHub is not the only platform allowing this; a less popular alternative is GitLab.
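
Once the steps above are done, a quick check can confirm that the pieces are in place. The following is a minimal sketch, not an official setup test: the function names are hypothetical, and it assumes that git, pip, and conda are expected on your PATH and that the libraries of interest are NumPy, SciPy, Matplotlib, and Cython. It uses only the Python standard library.

```python
import importlib.util
import shutil
import sys


def python_is_3x() -> bool:
    """Return True if the running interpreter is Python 3 or newer."""
    return sys.version_info.major >= 3


def tools_on_path(tools=("git", "pip", "conda")) -> dict:
    """Map each command-line tool to its location, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def installed_libraries(names=("numpy", "scipy", "matplotlib", "Cython")) -> dict:
    """Map each import name to True if the library can be found here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    version = sys.version.split()[0]
    print("Python", version, "(3.x)" if python_is_3x() else "(upgrade to 3.x)")
    for tool, path in tools_on_path().items():
        print(f"{tool}: {path or 'not found'}")
    for name, found in installed_libraries().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

Save it to a file and run it with the Python 3 interpreter you just installed. A missing conda is fine if you chose pip, and any missing library can be installed later with whichever package manager you picked.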
-
Stack Overflow is a question-and-answer site where you will most likely end up, via an appropriate Google search, for code-related questions. Do not worry about ending up there continuously; this happens to most coders almost every day.
-
GitHub: you can sometimes find issues and pull requests (PRs) relevant to your problem. Sometimes you might find out that you did everything as you were supposed to, and that the library, protocol, or platform itself has an outstanding issue. This can be initially reassuring ("Yay! I did everything right"), before desperation kicks in: you are literally at the border of known solutions, much as happens when you do research, and you must hack a solution or hope someone does it soon enough.
-
Library documentation: it is sometimes a bit hard to understand the documentation of a given function. It just takes time and some practice.
-
Peers: fellow coders or researchers are sometimes the best allies. Before bugging someone with a random question, check that it has not already been answered somewhere; then do not be intimidated by raising an issue on a project page, asking a question on a Google group or on Gitter, or just poking the coder down the corridor.
Advance to Section 1 - Managing a Code Project.