UQ Library 2024-11-18
We will use Git through GitHub Desktop, so you will need to create a GitHub account and install GitHub Desktop.
Account creation instructions are available on this page.
Installation instructions are available on this page.
If you need to collaborate on a project, a script, some code or a document, there are a few ways to operate. Sending a file back and forth and taking turns is not efficient; a cloud-based office suite requires a connection to the Internet and doesn’t usually keep a clean record of contributions.
Version control allows users to:
- record a clean history of changes;
- keep track of who did what;
- go back to previous versions;
- work offline; and
- resolve potential conflicts.
Programmers use version control systems to collaboratively write code all the time, but it isn’t just for software: books, papers, small data sets, and anything that changes over time or needs to be shared can be stored in a version control system.
A version control system is a tool that keeps track of changes for us, effectively creating different versions of our files. It allows us to decide which changes will be made to the next version (each record of these changes is called a commit), and keeps useful metadata about them. The complete history of commits for a particular project and their metadata make up a repository. Repositories can be kept in sync across different computers, facilitating collaboration among different people.
GitHub is a Git hosting service which stores your Git repository online. This means that you can easily use this platform to share your work, find the work of others, and collaborate. There are many other Git hosting services, such as GitLab and Bitbucket.
GitHub can be quite intimidating the first time you come across it. Let’s break things down a bit.
You can search for repositories in a search engine, or on https://github.com/
For example, if you search for spotify artists analysis on GitHub you can see many projects relating to Spotify.
The highest and most popular link listed is generally the one that you’re after.
You should find your way to this repository: https://github.com/khanhnamle1994/spotify-artists-analysis
Click the link to have a look.
A first look at a GitHub repository can be intimidating, but you should initially ignore the folder structure you see, and scroll down to the Readme section, where you can usually find details about what the project is, how to install and use it, and where to get more help.
The about section on the right gives a brief overview of the repository. Below that it has links to:
- The Readme
- The License type (it’s important to give your code a license so others know what they can do with it, here’s a helpful resource for choosing the right one).
- Popularity (those who like this repo)
- Watching (those who want to be aware of changes)
- Forks (those who have created their own spinoffs)
Near the top of the repository you will see a clock icon followed by some numbers. This is the History of edits to this repository. You can click on this and see the entire history of changes made to this code.
Once you go back to the folder structure you can dig in to find the code that makes up this repository. If you navigate to Data-Processing.R you can see the code that underpins the data processing.
To clone (make a copy of) the code, you can simply click the **Code** button on the main repository screen and import it into GitHub Desktop. This allows you to edit the code and make your own versions of it.
Once you have created a GitHub account, and have GitHub Desktop installed, open GitHub Desktop.
We have a few options here:
- You can clone (make a copy of) a repository that you (or someone else) have previously created online
- You can create a repository within an already existing folder for a project you’re already working on
- You can create a new folder and repository when you begin your project
Let’s create a new folder for our project today by doing the following:
- Click + Create a New Repository on your hard drive..., or go to File > New Repository..., or press Ctrl + N
- In the Create a new repository window fill in the details for your repository:
  - Name: Portfolio
  - Description: A portfolio of my coding projects
  - Local path: in this case we’re going to create a new folder in our Documents. Click Choose... and select Documents. GitHub Desktop will create a new folder here for our project to be stored in (generally you want to save this in the project folder you will be working from going forward).
  - Tick the Initialize this repository with a README box
  - Leave Git ignore and License as None for now.
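GitHub Desktop runs Git for us behind the scenes. As a rough command-line sketch of the same steps, assuming Git 2.28 or newer is installed: the temporary folder stands in for Documents, the name and email are placeholders, and the README text is only an approximation of what GitHub Desktop writes.

```shell
set -e
# Hypothetical location: a temporary folder stands in for Documents
workdir="$(mktemp -d)"
repo="$workdir/Portfolio"

# Create the Portfolio folder and an empty repository whose first branch is "main"
# (the -b option needs Git 2.28 or newer)
git init -b main "$repo"
cd "$repo"

# Identity recorded with each commit (placeholder values)
git config user.name "Your Name"
git config user.email "yourname@email.com"

# Roughly what ticking "Initialize this repository with a README" does
printf '# Portfolio\n\nA portfolio of my coding projects\n' > README.md
git add README.md
git commit -m "Initial commit"
```

The hidden .git folder inside Portfolio is the repository itself: it holds the history of commits and their metadata.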
Now that we’ve created our repository, let’s populate it with some files.
Open RStudio and create a new R script file: File > New File > R Script
Add a line of code to get started:
# a basic R comment
print("Hello World")
Save your script (File > Save As...), navigate to your Portfolio folder, and save your file as process.R
If you navigate back to GitHub Desktop, you will see that it has already identified that there is a new change in our repository. The new lines appear green to demonstrate that these changes have added something new, as opposed to deleting something. We can now begin the process of committing and backing up these changes.
Instead of viewing your version history as a series of documents with different changes (e.g. thesis_final.pdf, thesis_final_1.pdf, etc.), Git views your document as a compilation, or a stack, of different changes through time. This means that you can go back in time to view each change as it was committed to the main document.
When we commit a change, Git writes that change to the branch of our repository that we choose; in this case, that branch is called main. This means that if you want to do some experimenting with your code, you can instead commit it to an experimental branch so that you don’t break your main code while you play. If you want more info about branching, I have provided a link at the bottom of this document.
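We won’t use branches today, but the idea can be sketched on the command line. This is a minimal illustration only, assuming Git 2.28 or newer; the throwaway folder, the placeholder identity, and the branch name experimental are all made up for the example.

```shell
set -e
# Throwaway repository so the sketch runs on its own (hypothetical path)
repo="$(mktemp -d)/Portfolio"
git init -b main "$repo"
cd "$repo"
git config user.name "Your Name"
git config user.email "yourname@email.com"
printf 'print("Hello World")\n' > process.R
git add process.R
git commit -m "Add process.R"

# Create an "experimental" branch and switch to it (git switch needs Git 2.23+)
git switch -c experimental
printf '# trying something out\n' >> process.R
git commit -am "Experiment with process.R"

# Back on main, the file is exactly as it was before the experiment
git switch main
cat process.R
```

The experimental commit still exists on its own branch; main only changes if we later decide to merge it in.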
Before committing your changes, it’s best practice to briefly describe your changes.
Type a brief description into the summary box in the bottom left, which currently reads Update process.R.
- Type something along the lines of “Added a comment and line of code”. You can provide more detail in the Description box below if necessary.
Comments describing our commits are a very important part of Git, as they allow us to quickly understand changes at a glance.
Now we can commit our changes: click Commit to **main** (this is where you could commit it to a different branch if it were an experimental change, but for today we’re just committing to main).
We can go to the History tab to look at this saved commit.
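For reference, the same commit could be made from the Git command line. The sketch below is one way to do it, assuming Git 2.28 or newer; it builds a throwaway repository with a placeholder identity so that it runs on its own.

```shell
set -e
# Throwaway repository so the sketch runs on its own (hypothetical path)
repo="$(mktemp -d)/Portfolio"
git init -b main "$repo"
cd "$repo"
git config user.name "Your Name"
git config user.email "yourname@email.com"

# The script we saved from RStudio
printf '# a basic R comment\nprint("Hello World")\n' > process.R

git status --short      # lists process.R as a new, untracked file
git add process.R       # choose what goes into the next commit
git commit -m "Added a comment and line of code"
git log --oneline       # like the History tab: one line per commit
```

The text after -m plays the role of the summary box in GitHub Desktop.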
This has committed the version history to our local device, but if we want to share this with others, or store it online, we will need to Publish/Push the file to GitHub.
To do this we will first need to sign in to our GitHub.com account:
- Navigate to File > Options... > Accounts > Sign in > Continue with browser
- Sign in to GitHub in your browser
- It will then take you back to GitHub Desktop with you signed in (you may need to click to allow this in your browser)
You can publish your repository in any one of three different ways:
- clicking the blue Publish Repository button
- going to Repository > Push
- or pressing Ctrl + P
You can choose to keep your code private for now, and simply make it public later on. Click Publish Repository.
You will see that the Publish Repository button has changed to Fetch origin. Origin refers to the online repository where your code is kept. You should click this button before starting to work on code, just in case changes have been made elsewhere that you need to pull to the device you’re working on.
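On the command line, publishing amounts to telling Git where origin lives and pushing to it. In this sketch a local bare repository stands in for GitHub so that it runs without an account or network connection; with GitHub, origin would instead be the URL of your own repository. It assumes Git 2.28 or newer, and the paths and identity are placeholders.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the online repository (an assumption made for this example)
git init --bare -b main "$work/remote.git"

# Local repository with one commit
git init -b main "$work/Portfolio"
cd "$work/Portfolio"
git config user.name "Your Name"
git config user.email "yourname@email.com"
printf '# a basic R comment\nprint("Hello World")\n' > process.R
git add process.R
git commit -m "Added a comment and line of code"

# Publish: register the remote under the name "origin", then push main to it
git remote add origin "$work/remote.git"
git push -u origin main

# Like the Fetch origin button: check whether the remote has anything new
git fetch origin
git status -sb          # shows main and the origin/main branch it tracks
```

After the first push with -u, main remembers that it tracks origin/main, so later pushes need no extra arguments.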
Let’s have a look at what it looks like when we edit a file.
Go to RStudio and edit our previous line to: print("Hello GitHub")
If you return to GitHub Desktop, you will see the former text in red, and our new addition highlighted in green.
Before we can push this to GitHub, we need to:
- Describe our edits: “changed the print code”
- Then click Commit to **main**
- Then press Ctrl + P to push it to GitHub.
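The red and green lines in GitHub Desktop correspond to what Git calls a diff. As a small sketch (assuming Git 2.28 or newer, with a throwaway repository and placeholder identity), removed lines start with - and added lines start with +:

```shell
set -e
# Throwaway repository so the sketch runs on its own (hypothetical path)
repo="$(mktemp -d)/Portfolio"
git init -b main "$repo"
cd "$repo"
git config user.name "Your Name"
git config user.email "yourname@email.com"
printf '# a basic R comment\nprint("Hello World")\n' > process.R
git add process.R
git commit -m "Added a comment and line of code"

# Edit the line, as we did in RStudio
printf '# a basic R comment\nprint("Hello GitHub")\n' > process.R

# Compare the working file with the last commit
diff_text="$(git diff)"
printf '%s\n' "$diff_text"

# Commit the edit; -a stages changes to files Git already tracks
git commit -am "changed the print code"
```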
If you click your profile picture in the top right of GitHub online, you can view all Your repositories.
From here you can access your portfolio repository, and then the process.R file we pushed earlier.
Once you’ve navigated to the file, you can click the pencil button to edit the code online.
- Make some edits to the code
- Provide a description of your changes in the Commit changes section (these descriptions may feel unnecessary, but generally when you make changes, you’re changing a number of things, and this can be very very helpful).
- Then click the Commit changes button.
On the desktop you would need to push to the repository. As we are already in the online repository, you don’t need to do that; however, the next time you work on the file somewhere else, you should pull the changes first.
Before doing any edits on a file in a repository that is worked on by many individuals or on different devices, it’s best to Fetch/Pull your repository. You can do this in GitHub Desktop by clicking Pull origin at the top. This will pull the changes from the online repository, and let you know if there are any new additions, deletions, or, if you have simultaneously changed the same section of code, any conflicts (which need to be fixed and resolved before continuing).
We can see the creation, updates, and commits on the left side of GitHub Desktop.
If we simultaneously make edits online, and then make edits on desktop without Fetching/Pulling first, we will have conflicts. Let’s create a conflict.
- Edit the document online, add # Edited Online, then Commit
- Edit the document on desktop, add # Edited on Desktop, save the file, then Commit
- Click Push
You will receive an error that there are newer commits on the remote. Before we Push, we will need to Fetch.
- Click Fetch
- Click Pull origin
A popup will appear asking you to resolve the conflicts in the document before you merge. We need to open the file to view and remove those conflicts.
- Open the file with the conflict. You will see code similar to that below:
<<<<<<< HEAD
# Edited on Desktop
=======
# Edited Online
>>>>>>> 7e2e0c3b4a819b3cf3c285a47a862f975629bf8a
- Here we decide what we want to keep, and what we want to delete. Once we have decided what to keep, we need to remove the unwanted code, as well as the tags that Git has inserted. You will want to make sure the code conflicts are resolved, and that the code works. In this case, let’s keep both edits, and simply remove the tags that Git added. Your code should look like this:
print("Hello GitHub")
# I added this code on GitHub Online
# Edited on Desktop
# Edited Online
- Save your file.
- Return to GitHub Desktop, and it should now update to show that there are no conflicts remaining. Click Continue merge.
- The conflicts are now resolved, and we can once again Push our edits to our GitHub repository.
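The whole conflict exercise can also be reproduced on the command line. In this sketch a local bare repository stands in for GitHub and a second clone named online plays the part of the edit made in the browser; those names, the paths, and the identity are assumptions for the example, and Git 2.28 or newer is assumed.

```shell
set -e
work="$(mktemp -d)"
git init --bare -b main "$work/remote.git"    # stand-in for GitHub

# Desktop copy: create the file and publish it
git init -b main "$work/desktop"
cd "$work/desktop"
git config user.name "Your Name"
git config user.email "yourname@email.com"
printf 'print("Hello GitHub")\n' > process.R
git add process.R
git commit -m "Add process.R"
git remote add origin "$work/remote.git"
git push -u origin main

# "Online" copy: a second clone that edits and pushes first
git clone "$work/remote.git" "$work/online"
cd "$work/online"
git config user.name "Your Name"
git config user.email "yourname@email.com"
printf '# Edited Online\n' >> process.R
git commit -am "Edit online"
git push origin main

# Back on the desktop: edit the same place without pulling first
cd "$work/desktop"
printf '# Edited on Desktop\n' >> process.R
git commit -am "Edit on desktop"
git push origin main || echo "Push rejected: the remote has newer commits"

# Pull (fetch, then merge); Git stops and marks the conflict in process.R
git pull --no-rebase origin main || echo "Merge stopped: conflict in process.R"
cat process.R

# Resolve by keeping both edits and removing the markers, then finish the merge
printf 'print("Hello GitHub")\n# Edited on Desktop\n# Edited Online\n' > process.R
git add process.R
git commit -m "Merge online and desktop edits"
git push origin main
```

Staging the fixed file with git add is how we tell Git that the conflict is resolved; the commit that follows completes the merge.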
We can now have a look at the history tab to see the history we’re starting to create.
Let’s navigate to the file location, and then delete the file so that we can see the process behind this. Once you delete the file, GitHub Desktop will show this as a Change where all lines of code were removed. We can now add comments, commit this change, and then push it to the repository.
Even though we have deleted the file locally, and pushed that deletion to the web-based repository, we can revert that change within GitHub Desktop (you can also do this through the Git command line, but note that you cannot do this on GitHub online).
Within GitHub Desktop click the History tab, right-click on the commit where the file was deleted, and select Revert changes in commit. This will bring our file back.
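On the Git command line, the equivalent is git revert. A minimal sketch, assuming Git 2.28 or newer and using a throwaway repository with a placeholder identity:

```shell
set -e
# Throwaway repository so the sketch runs on its own (hypothetical path)
repo="$(mktemp -d)/Portfolio"
git init -b main "$repo"
cd "$repo"
git config user.name "Your Name"
git config user.email "yourname@email.com"
printf 'print("Hello GitHub")\n' > process.R
git add process.R
git commit -m "Add process.R"

# Delete the file and commit the deletion
git rm process.R
git commit -m "Delete process.R"

# Revert the most recent commit (HEAD): Git adds a new commit that undoes it
git revert --no-edit HEAD
ls
git log --oneline
```

Nothing is erased from the history: the deletion and the revert both stay in the log, and the file is back.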
Before you push to a Git repository, it’s worth making sure that you want to commit everything in that repository. For example, there may be private documents, or large data files you do not want to commit and upload. You can set these to be ignored with a .gitignore file.
Let’s create some documents we want Git to ignore.
- Navigate to your Portfolio folder
- Right-click inside the folder and select New > Text Document, and call it example.txt
Now let’s have Git ignore this file. You can do this in GitHub Desktop by going to Repository > Repository Settings... > Ignored files and then entering the locations, names, or types of files you want to ignore. Let’s enter example.txt.
You can also cover all files of a particular type by entering a pattern such as *.dat (the asterisk means that any file ending in .dat will be ignored).
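The Ignored files box edits a plain text file called .gitignore in the top folder of the repository. A minimal sketch of the same thing by hand, assuming Git 2.28 or newer; results.dat is a made-up file name used only to show the pattern at work:

```shell
set -e
# Throwaway repository so the sketch runs on its own (hypothetical path)
repo="$(mktemp -d)/Portfolio"
git init -b main "$repo"
cd "$repo"

# Two files we do not want to commit
touch example.txt results.dat

# One pattern per line: an exact file name, and every file ending in .dat
printf 'example.txt\n*.dat\n' > .gitignore

git status --short                            # only .gitignore is listed
git check-ignore -v example.txt results.dat   # shows which pattern matched each file
```

The .gitignore file itself should be committed, so that everyone working on the repository ignores the same files.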
The Readme is often the first place people go when looking at a Git Repository, so it’s important to have useful information here, and displayed in a meaningful way. This is especially the case for your Portfolio.
You can use GitHub to create and display your own work. This good example has their details, their achievements, and links to all their major projects: https://github.com/archd3sai/Portfolio
When editing a Readme file you can format it using a simple markup language called Markdown, as well as HTML.
Today we’re going to finish by creating a simple Readme template.
- Navigate to your GitHub repository online.
- Click the pencil icon to edit your README.md
- Create a heading by entering
# Analytics Portfolio - <your name>
You can create headings using # symbols: # will create the largest heading, and ###### will create the smallest.
- On the next line, enter some normal text describing your portfolio
This **Portfolio** contains all of my Analytics projects
You can bold text by putting it between a set of double asterisks, e.g. **example text** is displayed as example text in bold.
- Create a dot point with a link to your contact details by entering
- **Email**: [yourname@email.com](mailto:yourname@email.com)
A - symbol will give you a bullet point. You can add links, including emails, by wrapping the link text in square brackets [ ], and then putting the link address in parentheses ( ).
- Add a second level heading by entering
## Projects
- Insert an image by entering the following HTML code:
<img align="left" width="250" height="150" src="https://user-images.githubusercontent.com/67612228/184837530-9a4537b3-22f0-495c-90d1-6ccdcb4bc4bd.png">
- Scroll to the bottom, describe your commit, and click Commit changes
There is a lot more customisation you can do, and you can find a complete breakdown of GitHub’s Markdown here.
- Branching (creating experimental branches of your code)
- Fully cloning others’ code (this can vary from straightforward to complex)
- Collaboration
- Making your Repo Public
This course borrows information from the Git Bash course Git version control for collaboration, which is based on the longer course Version Control with Git developed by the non-profit organisation The Carpentries. The original material is licensed under a Creative Commons Attribution license (CC-BY 4.0), and this modified version uses the same license. You are therefore free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material
… as long as you give attribution, i.e. you give appropriate credit to the original author, and link to the license.