Module 1.2: Downloading R
Old Dominion University
You might be thinking, “why can’t we use Excel?”
It’s not that Excel is bad – it’s that R is better.
Using code allows for:
Perhaps the best argument for code over Excel is fixed vs. marginal cost. Using Excel has a much lower fixed cost than learning to code: everyone has seen Excel, the data is nicely formatted in cells, and you can point-and-click to generate nearly everything. Programming is a language, and learning a language is slow and potentially painful. However, once you have written some code and it works, you never have to write it again! In other words, the future marginal cost of coding is much lower.
So, your next question might be, “why R? My comp-sci friends use Python, C++, and SQL.”
The following might seem abstract at the moment, but will hopefully become more concrete and clear as time goes on.
First, we have to download R via cran.r-project.org. This is the actual language, and what you should think of as the “brain” of R.
Second, we need to download RStudio via posit.co. If the previous download is the brain, you should think of RStudio as the body. We will only ever interact with R through RStudio in this course.
When you open RStudio for the first time, you will see four panels like in the previous image. It is likely that your version of RStudio has a white background with blue or black text. If you would like to change this, go to “Tools > Global Options… > Appearance > Editor theme”. I like a darker theme to make it easier on my eyes.
The four panels are as follows:
rm(list = ls())
Before we start writing code, here are some important tips and tricks that will make your life (our lives) easier.
Comment your code by typing # before you type something. This will help you remember what your code does after you have been away from it for a long time. Sometimes, re-reading (deciphering) uncommented code is harder than re-writing it from scratch.
As a final tip, and this one is important, we are going to change some default settings in RStudio.
It might seem like these auto-saving features are a good idea, but trust me: you will be much better off without them.
R has a few different data types:
Boolean: TRUE and FALSE values. Think of these like binary values (0 and 1). Here is a picture of George Boole.
Character: text wrapped in " or '.
To execute / evaluate / run code in R, there are a few different ways to do it. The easiest way is to highlight whatever you are interested in running and press Ctrl (Cmd on Mac) + Enter. You can also just have your cursor on the line and use the same keys to run that specific line. There is also a button on the top right of the Source panel that says “Run”, which will do the same thing.
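To make the data types a bit more concrete, here is a minimal sketch you can run line by line. It assumes the base R function class(), which has not been introduced in these notes, to ask R what type a value is; the example values are just placeholders.

```r
# class() reports the data type of a value.
class(TRUE)      # "logical" is R's name for boolean values
class("hello")   # "character": text in double quotes
class('hello')   # "character": single quotes work the same way
class(5)         # "numeric"
```

Note that R labels booleans as "logical"; the two words mean the same thing here.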
Before moving on to explore basic operations, I want to mention something you’ll see embedded throughout this course. I will be showing code in each module in static code blocks. These code blocks, sometimes called code chunks, might generate output, plots, both, or nothing. Unfortunately, these code blocks are, for all intents and purposes, set in stone. In other words, besides collapsing/expanding them, you cannot really interact or experiment with them. This probably stifles student curiosity, since you’ll probably want to tweak things as you’re going through the notes.
To address this, I have included WebR chunks in each module’s notes. These chunks will look a bit different from the static chunks, and I encourage you to interact with them! You can write, alter, and execute code inside each chunk, and each WebR chunk will “remember” what you’ve run in other chunks. Go ahead and explore a bit with the chunks below:
Starting from data types, we can begin to perform operations on data.
Arithmetic for numeric values: Addition (+), Subtraction (-), Multiplication (*), Division (/)
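As a quick illustration of the four arithmetic operators, here is a small sketch with made-up numbers. The last two lines go slightly beyond the list above and assume only R's standard order of operations.

```r
# Basic arithmetic on numeric values
2 + 3    # addition: 5
7 - 4    # subtraction: 3
6 * 5    # multiplication: 30
9 / 2    # division: 4.5

# R follows the usual order of operations; parentheses override it
2 + 3 * 4      # 14
(2 + 3) * 4    # 20
```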
Most, if not all, of the code blocks (and output) in this course will be collapsible. In other words, if you click on them, you can hide/display the code.
Try some of this in WebR:
Logic for boolean values: And (&), Or (|), Not (!)
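And (&): TRUE only when both values are TRUE.
Or (|): TRUE when at least one value is TRUE (that is, FALSE only when both values are FALSE).
Not (!): flips the value. Applied to TRUE it gives a FALSE statement, and applied to FALSE it gives a TRUE statement.
Try some of these with WebR: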
Some other operations to note are <, >, >=, and <=, since these sit in between logical and numerical operations: they compare numbers and return a boolean.
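The logical and comparison operators can be sketched as follows. The values are arbitrary, and the comments show what R returns for each line.

```r
# And: TRUE only when both sides are TRUE
TRUE & TRUE     # TRUE
TRUE & FALSE    # FALSE

# Or: TRUE when at least one side is TRUE
TRUE | FALSE    # TRUE
FALSE | FALSE   # FALSE

# Not: flips the value
!TRUE           # FALSE
!FALSE          # TRUE

# Comparisons take numbers and return a boolean
3 < 5           # TRUE
3 >= 5          # FALSE

# Comparisons can be combined with logic
(3 < 5) & (5 <= 5)   # TRUE
```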
We will discuss operations for characters and factors later in the course. However, there may be times where you will want to convert data from one type to another. To convert from a number or boolean to character, you can use as.character(). To go from text to numeric, you can use as.numeric().
[1] "5"
[1] 5
[1] FALSE
[1] NA
Notice how the final line produces an NA value. Seeing an NA value is seeing R shrug its shoulders. It is not smart enough to know that "Five" is 5, so it returns a missing value. NA values can mess up a lot of things in R. For example, what is the average of this collection of numbers: 2, 4, NA, 8? R will return NA when asked, because it isn’t sure how to think about the NA. A helpful function, therefore, is is.na(). This returns TRUE if the input is NA and FALSE otherwise.
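The example with 2, 4, NA, 8 can be sketched as below. The variable name values is hypothetical, and c() (which collects values together), mean(), and the na.rm argument are base R tools that have not been introduced in these notes yet, so treat this as a preview rather than something you need to know now.

```r
# The collection from the text: 2, 4, NA, 8
values <- c(2, 4, NA, 8)

mean(values)     # NA: R cannot average a missing value
is.na(values)    # FALSE FALSE  TRUE FALSE: one boolean per element

# Assumption beyond the text: mean() accepts na.rm = TRUE to drop NAs first
mean(values, na.rm = TRUE)   # 4.666667, the average of 2, 4, and 8
```

Whether dropping the NA is the right choice depends on why the value is missing, so it is worth checking with is.na() before reaching for na.rm.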
As a final note, if you run the above in RStudio, you might get output saying Warning: NAs introduced by coercion. This is R giving you a heads up about what I just mentioned above. Sometimes, this warning is expected, but other times it’s a good signal to check your data!
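Here is a minimal sketch of the conversions and the coercion warning described above. The inputs are illustrative and are not necessarily the ones behind the output shown earlier. suppressWarnings() is a base R function used here only to silence the warning when it is expected; it is an addition beyond what these notes cover.

```r
as.character(5)        # "5": number to text
as.character(TRUE)     # "TRUE": boolean to text
as.numeric("5")        # 5: text to number

# "Five" is not a number R can parse, so the result is NA,
# along with the warning "NAs introduced by coercion"
as.numeric("Five")

# Checking the result with is.na() flags the failed conversion
is.na(as.numeric("5"))                        # FALSE
is.na(suppressWarnings(as.numeric("Five")))   # TRUE
```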
ECON 311: Economics, Causality, and Analytics