2024-01-01
Why is it called R?
Created as a programming language for statistics and graphics.
IMO, easier to learn than Python or Julia
Chart showing Usage Shares of Programming Languages in Economics Research.
Taken from Economics and R Blog of Sebastian Kranz. Calculated from file extensions used in code files for published papers.
Arithmetic follows the order of operations you would expect.
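A minimal sketch of what "as expected" means in practice. The variable names are only for illustration and are not from the original slides.

```r
# Multiplication binds tighter than addition.
a <- 2 + 3 * 4      # 3 * 4 is evaluated first, giving 14

# Parentheses override the default order.
b <- (2 + 3) * 4    # 2 + 3 is evaluated first, giving 20

# Exponentiation binds tighter than unary minus.
c_val <- -2^2       # read as -(2^2), giving -4

print(c(a, b, c_val))
```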
Booleans are objects that are either TRUE or FALSE.
They are very useful.
& (and)
| (or)
Value Matching %in%
Checks whether each element of the object on the left is “in” the object on the right.
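A short sketch of the two logical operators and of %in%. The values are made up for illustration.

```r
x <- 5

# & is TRUE only when both sides are TRUE.
both <- x > 3 & x < 10          # TRUE

# | is TRUE when at least one side is TRUE.
either <- x < 3 | x > 4         # TRUE

# %in% checks each element on the left against the set on the right.
found <- 3 %in% c(1, 2, 3)                  # TRUE
each <- c("a", "z") %in% c("a", "b", "c")   # TRUE FALSE

print(both)
print(either)
print(found)
print(each)
```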
Computers handle decimals in odd ways!
Because computers use binary, they cannot represent 0.1 exactly,
just as we cannot represent 1/3 exactly in base 10.
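One common way to see this, sketched with illustrative names. all.equal() compares numbers up to a small tolerance and is one reasonable workaround, not the only one.

```r
sum_val <- 0.1 + 0.2

# Exact comparison fails because neither side is stored exactly.
exact_match <- sum_val == 0.3
print(exact_match)

# Printing more digits reveals the tiny representation error.
print(sum_val, digits = 17)

# all.equal() compares with a tolerance; wrap it in isTRUE() for use in conditions.
close_enough <- isTRUE(all.equal(sum_val, 0.3))
print(close_enough)
```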
In R, we use a special “arrow” operator for assignment: <-
Why not use =?
Technically you can in R, but <- is preferred because = is used to assign values to arguments in a function call.
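A small sketch of the distinction, with made-up values. Inside the call to mean(), x = ... names an argument and does not touch the variable x in the workspace.

```r
x <- 10                      # assignment with the arrow operator
y = 10                       # also works at the top level, but is discouraged

# Here "x =" matches the argument named x of mean(); it is not an assignment.
m <- mean(x = c(1, 2, 3))

print(x)   # still 10
print(m)   # 2
```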
We have seen the “sequence” operator :,
which it turns out is just a shortcut for seq().
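A brief sketch comparing the two, plus two of the extra options seq() offers (by and length.out). The variable names are illustrative.

```r
a <- 1:5
b <- seq(1, 5)

# Both produce the same values.
same_values <- all(a == b)
print(same_values)

# seq() can also step by a fixed amount...
quarters <- seq(0, 1, by = 0.25)
print(quarters)

# ...or produce a fixed number of evenly spaced points.
three_points <- seq(0, 10, length.out = 3)
print(three_points)
```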
How do we know all of the options for a function?
Documentation!
This documentation is also hosted online
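A sketch of a few ways to look up a function's options. The help calls are left as comments because they are meant for an interactive session. paste() is used only as an example function.

```r
# In an interactive session, either of these opens the help page:
#   ?paste
#   help("paste")

# args() shows a function's arguments and defaults in the console.
print(args(paste))

# formals() returns the same information as a list you can inspect.
arg_names <- names(formals(paste))
print(arg_names)
```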
Never use the following:
All of this is important information!
But it shouldn’t be stored as a comment.
Instead we should use a task manager, variables declared at the top of a script, git for version control, etc.
You may be tempted to add this sort of comment
But what happens when you decide to change your code?
If someone else reads your code, do they trust the comment or the code?
Code forces you to do exactly what you say (i.e. square, not cube). But comments do not, so they tend to get out of sync with the code.
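A hypothetical illustration of a comment drifting out of sync, and one way to avoid needing the comment at all. The function name square() is an assumption made for this sketch.

```r
# Out of sync: the comment says "square", but the code was later changed to cube.
# Square the value
y_bad <- 3^3

# A descriptive function name says what happens, and cannot drift from the code.
square <- function(x) x^2
y_good <- square(3)

print(y_bad)    # 27, not what the comment promised
print(y_good)   # 9
```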
“Good code does not need comments”
This is the goal.
Your code should be readable without any comments.
But that’s probably unrealistic for most of us.
Some good rules:
These and more from: Best Practices for Writing Code Comments
I like to use comments to give sections to my code
I find this useful as a way to structure my code and make it more readable later on.
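A sketch of what section comments might look like. The trailing dashes follow a convention that some editors, such as RStudio, use to recognize sections. The section names and values here are invented for illustration.

```r
# Setup ----
values <- c(4, 8, 15)

# Analysis ----
avg <- mean(values)

# Output ----
print(avg)
```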
Character
Logical (“boolean”)
Integer
Numeric
Complex (imaginary numbers)
Raw (bytes)
Characters (a.k.a. “strings”) store text information
We saw logical types before. Stored as TRUE or FALSE.
All whole numbers (no decimal): (…, -2, -1, 0, 1, 2, …)
An exact number storage, compared to the approximate “numeric” type.
To create an integer value, add an L at the end of the number.
Can be useful for setting ID values,
but usually we will store numbers as “numeric” type instead.
Numeric is a class that stores numbers as floating point values.
In R, “double” is the only floating-point type.
Equivalent to “float64” in other languages.
There used to be a “single” precision. Equivalent to “float32”.
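A quick sketch that checks each of the types listed above with class() and typeof(). The example values are arbitrary.

```r
print(class("hello"))     # "character"
print(class(TRUE))        # "logical"
print(class(5L))          # "integer": the L suffix makes it an integer
print(class(5))           # "numeric": no suffix, so it is stored as a double
print(typeof(5))          # "double"
print(class(2 + 3i))      # "complex"
print(class(as.raw(12)))  # "raw"
```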
R has a full set of as.___() functions for each type.
[1] "12"
[1] 12
[1] 0c
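The calls that produced the three outputs above are not shown. As an assumption, conversions like the following would print the same results.

```r
# Number to character string.
as_chr <- as.character(12)
print(as_chr)    # [1] "12"

# Character string to number.
as_num <- as.numeric("12")
print(as_num)    # [1] 12

# Number to a raw byte, printed in hexadecimal.
as_byte <- as.raw(12)
print(as_byte)   # [1] 0c
```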
Some languages are very strict about data types, R is not.
This is convenient, but somewhat dangerous.
R will try to convert other types to a string to paste.
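A short sketch of this automatic conversion, with invented values. The last two lines show why it can be dangerous: mixing types in c() silently turns everything into strings, and TRUE quietly behaves like 1 in arithmetic.

```r
# paste() converts the number and the logical to strings before joining.
label <- paste("Value:", 12, TRUE)
print(label)     # "Value: 12 TRUE"

# Mixing types in a vector converts everything to character.
mixed <- c(1, "a", TRUE)
print(mixed)     # "1" "a" "TRUE"

# Logicals are converted to numbers in arithmetic.
total <- TRUE + 1
print(total)     # 2
```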
Vectors are an ordered set of values all of the same type.
They are created with the c() function (for “concatenate”).
Vectors all have lengths.
Vectors can have names for each element.
Vector elements can be accessed by their position or name using square brackets [].
A lot of functions are “vectorized” to apply to each element.
We will learn later how to vectorize any function.
Vectors are only one dimensional.
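A compact sketch of the vector features described above: creation, length, names, indexing, and vectorized operations. The names and numbers are made up.

```r
scores <- c(90, 85, 72)

# Every vector has a length.
n <- length(scores)
print(n)                       # 3

# Names can be attached to each element.
names(scores) <- c("ann", "bob", "cat")

# Access by position or by name with square brackets.
print(scores[2])               # bob's score
print(scores["cat"])           # cat's score

# Vectorized operations apply to every element at once.
doubled <- scores * 2
print(doubled)
```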
What if I need to store a mix of data types? - Use a list!
Each element has preserved its type!
We can again access list elements by their index position.
Note: x[3] returns a list with one element. Use x[[3]] to get the element itself.
Lists can also have names for each element. We can assign them using names() or at construction.
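A small sketch of the list behavior described above. The list contents and names are illustrative only.

```r
# A list can hold different types, and each keeps its own type.
x <- list(1.5, "a", TRUE)

# Single brackets return a (shorter) list; double brackets return the element.
print(class(x[3]))     # "list"
print(class(x[[3]]))   # "logical"

# Names can be assigned after the fact...
names(x) <- c("num", "chr", "lgl")
print(x$chr)           # "a"

# ...or at construction.
y <- list(num = 1.5, chr = "a")
print(names(y))
```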
We saw earlier that 1 element objects are actually vectors.
This means that we can have lists of multiple element vectors.
We can also have lists of lists!
$l1
$l1[[1]]
[1] 5 6 7
$l1[[2]]
[1] "A" "D" "E"
$l2
$l2[[1]]
[1] 1 2 3 4 5
$l2[[2]]
[1] TRUE
This can be as many list layers deep as you want.
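The code behind the printed output above is not shown. As an assumption, a nested list like this one would print the same way, and the lines after it sketch how to reach into the layers.

```r
nested <- list(
  l1 = list(c(5, 6, 7), c("A", "D", "E")),
  l2 = list(1:5, TRUE)
)

# Chain $ and [[ ]] to walk down one layer at a time.
letters_vec <- nested$l1[[2]]
print(letters_vec)            # "A" "D" "E"

# A final [ ] indexes into the vector stored at the bottom.
third <- nested$l2[[1]][3]
print(third)                  # 3
```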
Lists are much more general than vectors.
So why use vectors?
Think of data frames as “tables” of elements.
Behind the scenes, they are a list of equal-length vectors, one per column.
Data.frame values can be accessed using index values:
x[row, col]
You can leave one index blank to get a whole row or column.
The function str() will return information about the data structure of the passed object.
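A minimal sketch of building a data frame, indexing it with x[row, col], and inspecting it with str(). The column names and values are invented.

```r
df <- data.frame(
  id = 1:3,
  name = c("a", "b", "c"),
  score = c(9.5, 7, 8.25)
)

# One cell: row 2, column 3.
cell <- df[2, 3]
print(cell)              # 7

# Leave the column blank to get a whole row.
second_row <- df[2, ]
print(second_row)

# Leave the row blank to get a whole column.
id_col <- df[, "id"]
print(id_col)

# A data frame is a list underneath.
print(is.list(df))       # TRUE

# str() summarizes the dimensions and the type of each column.
str(df)
```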
A specific type of vector.
Details to be covered in the problem set!
If statements evaluate a condition,
and then execute code if the condition is TRUE.
Nothing is printed, because print() was never run.
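A sketch of an if statement whose condition is FALSE, so the body is skipped. The variable ran is a hypothetical flag added only to make the skipped body visible.

```r
x <- 5
ran <- FALSE

# The condition is FALSE, so nothing inside the braces is executed.
if (x > 10) {
  print("x is greater than 10.")
  ran <- TRUE
}

print(ran)   # FALSE: the body never ran
```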
Sometimes you want to check a series of conditions:
x <- "5"
if (is.numeric(x)) {
  print("X is a number.")
} else if (is.character(x)) {
  print("X is a character.")
}
[1] "X is a character."
This code,
To catch any cases that do not pass any condition, you can use else.
x <- NA
if (is.numeric(x)) {
  print("X is a number.")
} else if (is.character(x)) {
  print("X is a character.")
} else {
  print("I'm not sure what X is.")
  str(x)
}
[1] "I'm not sure what X is."
logi NA
If (and if-else) statements are the basics of controlling the flow of your program.
You can make sections of code that only execute for one dataset, or a robustness check that runs on only one model.
Loops are another key component for controlling your program flow.
Two basic loops are:
for(){}
while(){}
For loops execute code for a defined number of times.
While loops execute code repeatedly until a condition is met.
Here we emulated the function of the for loop from before.
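The original loops are not shown here, so the following is only a sketch of the idea: a for loop over a fixed range, and a while loop that reproduces it by managing the counter by hand.

```r
# for: the number of iterations is fixed up front.
for (i in 1:3) {
  print(i)
}

# while: the same three iterations, but we update the counter ourselves.
j <- 1
while (j <= 3) {
  print(j)
  j <- j + 1   # without this line the loop would never end
}
```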
It is easy to write a while loop that will run forever.
This one is inane, but you can inadvertently construct them.
While loops let you execute code for an unspecified duration.
If you are going to use a while loop, it’s a good idea to set a “safety” option to limit the maximum number of iterations.
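One way to add such a safety limit, sketched with hypothetical names (max_iter, iter, value). The loop halves a number until it drops to 1 or below, and stops early if it ever runs too long.

```r
max_iter <- 1000   # safety limit (assumed value for illustration)
iter <- 0
value <- 100

while (value > 1) {
  value <- value / 2
  iter <- iter + 1

  # Bail out if the loop runs longer than we ever expect it to.
  if (iter >= max_iter) {
    warning("Reached max_iter before the condition was met.")
    break
  }
}

print(iter)    # 7 halvings are needed to get from 100 to below 1
print(value)
```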
There are some disadvantages to loops,
One alternative is to use one of the family of apply() functions.
lapply()
We will see how to use the l + apply() function.
l stands for “list”, which is what the function returns.
One nice thing about lapply() is that it returns the values as a list.
Apply functions assume that each of your elements can be operated on separately.
For loops operate on each element sequentially.
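A sketch contrasting the two approaches on made-up inputs. Squaring each number works with lapply() because no element depends on another. The running total at the end needs the previous result, so a for loop is the more natural fit there.

```r
nums <- list(1, 2, 3)

# lapply(): apply the function to each element independently, get a list back.
squares <- lapply(nums, function(n) n^2)

# The equivalent for loop, filling a pre-allocated list.
squares_loop <- vector("list", length(nums))
for (i in seq_along(nums)) {
  squares_loop[[i]] <- nums[[i]]^2
}

same_result <- identical(squares, squares_loop)
print(same_result)

# A running total depends on the previous step, so it is sequential by nature.
running <- numeric(length(nums))
total <- 0
for (i in seq_along(nums)) {
  total <- total + nums[[i]]
  running[i] <- total
}
print(running)   # 1 3 6
```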
lapply()
Comments
Commenting code can be useful to yourself and others.
In R, a comment is any text on a line that follows a #.
In VS Code, ⌘K ⌘C comments out the selected lines and ⌘K ⌘U uncomments them.