Getting started

Prelude

Suppose you have data, lots of data. Perhaps they are about penguins, such as these three different species:

Artwork by @allison_horst


If you’re here it’s because you want to learn how to go from this:


to this:

Table 1: Summary of bill and flipper lengths by species (M = mean, SD = standard deviation)

Species    Count  Bill length (mm)  Flipper length (mm)
                  M        SD       M         SD
Adelie     152    38.79    2.66     189.95    6.54
Chinstrap  68     48.83    3.34     195.82    7.13
Gentoo     124    47.50    3.08     217.19    6.48


or this:


Collecting a large amount of data and looking at it in Excel or Numbers is not helpful for humans: it does not give us any insight or knowledge.

Knowledge is obtained by creating suitable summaries and visual displays from the data.
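As a small preview of what "creating a summary" can look like in R, here is a minimal sketch. It assumes the built-in iris data set (flowers, not the penguin data shown above) and uses only base R functions; the column names M and SD simply mirror the table above and are our own choice.

```r
# Count, mean (M), and standard deviation (SD) of petal length per species,
# using the iris data set that ships with R (not the penguin data).
counts <- table(iris$Species)
means  <- tapply(iris$Petal.Length, iris$Species, mean)
sds    <- tapply(iris$Petal.Length, iris$Species, sd)

summary_table <- data.frame(
  Species = names(means),
  Count   = as.vector(counts),
  M       = round(as.vector(means), 2),
  SD      = round(as.vector(sds), 2)
)

print(summary_table)
```

You are not expected to understand every line yet. The point is that a few lines of code turn 150 rows of raw measurements into a three-row table.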

What you need

To succeed in this bootcamp you will only need:

  1. a laptop

Note: If you are using a Chromebook, please contact us via email.

  2. active learning

Just reading the material won’t be enough: you need to type the code yourself and get familiar with errors.

  3. willingness to learn

If you approach the material with an inquisitive attitude, it will be easier to learn.

R

What is R?

R is a programming language: an actual language that a computer can understand. The purpose of a programming language is to instruct the computer to do some boring and long computations on our behalf.

When you learn to program you are in fact learning a new language, just like English, Italian, and so on. The only difference is that, since we will be communicating with a machine, the language itself needs to be unambiguous, concise, and hence very limited in its grammar and scope. Basically, a programming language follows a very strict set of rules. The computer will do exactly what you type. It will not try to guess what you meant: if you make a mistake in the language, the computer will not fix it for you, but will execute exactly what you typed.

If you make an error, there are two possible outcomes:

  1. The computation goes ahead without any sign of errors or messages. This is the most worrying type of error as it’s hard to catch. You will get a result for your computation, but it may make no sense.
  2. The computer will tell you that what you’re asking it to do doesn’t make sense. Easier to fix!
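The two outcomes can be illustrated with a minimal sketch. The numbers are arbitrary, and tryCatch() is used here only so that the example keeps running after the error; when you type code yourself you would simply see the error message in red.

```r
# Goal: the average of 2, 4 and 6, which is 4.

# Silent mistake: division is done before addition, so only 6 is divided by 3.
# R happily returns a number, but it is not the average.
wrong_average <- 2 + 4 + 6 / 3

# Correct version: the parentheses force the sum to be computed first.
right_average <- (2 + 4 + 6) / 3

print(wrong_average)  # 8
print(right_average)  # 4

# Loud mistake: adding a number to a piece of text makes no sense,
# so R stops and reports an error. tryCatch() captures the message as text.
error_message <- tryCatch(
  2 + "four",
  error = function(e) conditionMessage(e)
)
print(error_message)
```

The first mistake produces no warning at all, which is why it is worth checking whether a result is plausible before trusting it.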

The programming language that you will learn is called R. It also has a very fancy logo:

The code you type in the R programming language then needs to be converted into lower-level instructions for the computer, such as “store this number in the memory location with a specific address”. This is done by the interpreter, which is also called R. So R is both the programming language and the interpreter that tells the computer what to do with your commands.

What does R look like? Exactly like the picture below. It comes as a window called the Console, which is where any R code you type will be executed.
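For instance, a first interaction with the Console might look like the sketch below. The two calculations are arbitrary examples: type each line after the > prompt and press Enter.

```r
# R evaluates each line as soon as you press Enter and prints the result.
2 + 3      # prints [1] 5
sqrt(16)   # prints [1] 4
```

The [1] in front of each result just says that the printed line starts with the first element of the result.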

Installing/updating R

Follow the section that applies to your computer.

Installing for Windows


  1. If you are updating R, uninstall all previous versions of R and RTools installed on your PC.

  2. Install RTools

  3. Install R

Installing for macOS


  1. If you are updating R, uninstall all previous R installations by moving the R icon from the Applications folder to the Bin.

  2. Install XQuartz

  3. Install R. Click the release with a title of the form R-Number.Number.Number.pkg (for example R-4.1.0.pkg, but the numbers will change in the future).
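Whichever system you are on, once the installation has finished you can check which version of R you ended up with. The sketch below uses two base R tools; the version 4.1.0 in the comparison is only the example number mentioned above, not a requirement of this bootcamp.

```r
# A one-line description of the installed R version.
installed <- R.version.string
print(installed)

# TRUE if the installed version is at least 4.1.0 (example threshold only).
is_recent <- getRversion() >= "4.1.0"
print(is_recent)
```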

RStudio

What is RStudio?

RStudio is a nicer interface to R. It is simply a wrapper around R that combines the R Console, a text editor, a file explorer, a help panel, and a graphics panel that shows your plots.

In summary:

Source: www.moderndive.com


Let’s see how RStudio looks:

It has four panels, or panes, described below. You can customise the appearance of the panes from the menu View -> Panes -> Pane Layout.

  1. Environment. The environment shows the things you have created, for example data.

  2. Plots and Files. This panel displays any plots you create and, if you click the Files tab, a file explorer for finding files and data stored in Excel or similar formats. There is also a Help tab here, which is where you get help on R code.
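To see these panes in action, you could try a minimal sketch like the one below. The object names and the three numbers are hypothetical, chosen only for illustration.

```r
# Creating objects: in RStudio each one should show up in the Environment pane.
flipper_lengths <- c(181, 186, 195)  # hypothetical values, for illustration only
average_length <- mean(flipper_lengths)

# ls() lists the names of the objects created so far in this session.
print(ls())

# Drawing a plot: in RStudio it should appear in the Plots tab.
plot(flipper_lengths)

# To open the help page of a function in the Help tab,
# type a question mark followed by its name in the Console, e.g. ?mean
```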

Installing/updating RStudio

  1. If you have a previous version of RStudio already installed, uninstall it (if on Windows), or move it to the Bin (if on macOS).

  2. At this link, scroll down until you see RStudio Desktop and “Download RStudio”.

  3. Open RStudio, type the following in the Console, and press Enter after each line:

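# Prefer precompiled (binary) packages over building them from source
options(pkgType = "binary")
# Update every installed package without asking for confirmation each time
update.packages(ask = FALSE)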

Update regularly

It is IMPORTANT that you keep your R and RStudio installations up to date. If you don’t, you will sooner or later run into errors.

Postlude

Whenever we say “open R” or “using R”, what we really mean is “open RStudio” or “using RStudio”.

You should always use RStudio to write code. So, even though you will have two applications on your computer, R and RStudio, you will only need to open RStudio for your day-to-day work.

Readings

For further information, check the following: