In Section 1.5, we discussed the different types of R objects. For example, a vector holds a single data type and a set number of elements. Here we construct a vector called x that increases from -5 to 5 by one unit:
(x <- -5:5)
[1] -5 -4 -3 -2 -1 0 1 2 3 4 5
The vector x has 11 elements. If I want to know what the 6th element of x is, I can index the vector. To do this, we place [] square brackets after x with the position inside. For example, we index the 6th element of x:
x[6]
[1] 0
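As a small sketch of the same idea, the snippet below rebuilds the vector and pulls out single elements by position. The names `first_value` and `last_value` are only illustrative and do not appear elsewhere in this section.

```r
# Rebuild the vector from the example above
x <- -5:5

# Positions start at 1, so the first element is x[1]
first_value <- x[1]

# length(x) gives the number of elements, so this is the last element
last_value <- x[length(x)]

print(first_value)
print(last_value)
```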
Whenever we use [] next to an R object, R returns the elements at the positions given inside the square brackets. We can index an R object with multiple values:
x[1:3]
[1] -5 -4 -3
x[c(3, 9)]
[1] -3 3
Notice how the second line uses c(). This is necessary when we want to specify non-contiguous elements. Now let’s see how we can index a matrix.
A matrix can be indexed the same way as a vector using the [] brackets. However, since a matrix is a 2-dimensional object, we need to include a comma to separate the dimensions: [,]. The first position indexes the rows and the second position indexes the columns. To begin, we create the following matrix:
(x <- matrix(1:12, nrow = 4, ncol = 3))
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
Now to index the element at row 2 and column 3, use x[2, 3]:
x[2, 3]
[1] 10
We can also index an entire row or an entire column by leaving the other position blank:
x[2, ]
[1] 2 6 10
x[, 3]
[1] 9 10 11 12
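The vector and matrix ideas can be combined: each position inside [,] also accepts a vector of indices. The sketch below assumes the same 4 x 3 matrix, stored here under the illustrative name `m` so that it does not overwrite x; `sub_m` and `row_2` are likewise names made up for this example.

```r
m <- matrix(1:12, nrow = 4, ncol = 3)

# Rows 1 and 4 (non-contiguous, so c() is needed), columns 2 through 3
sub_m <- m[c(1, 4), 2:3]
print(sub_m)

# A single row comes back as a plain vector
row_2 <- m[2, ]
print(row_2)
```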
There are several ways to index a data frame. Since it has a matrix-like format, you can index it the same way as a matrix. Here are a couple of examples using the mtcars data frame.
mtcars[, 2]
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
mtcars[2, ]
              mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
However, a data frame also has labeled components (its variables), so we can index the data frame with a variable name inside the brackets:
mtcars[, "cyl"]
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
Lastly, a data frame can be indexed to a specific variable using the $ notation, as described in Section 1.5.5.
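For a quick comparison, the sketch below extracts the cyl variable with the $ notation and checks that it matches the two bracket-based approaches. It uses only the built-in mtcars data; `cyl_dollar` is an illustrative name.

```r
# Extract the cyl variable by name with $
cyl_dollar <- mtcars$cyl

# Both bracket forms return the same numeric vector
print(identical(cyl_dollar, mtcars[, "cyl"]))
print(identical(cyl_dollar, mtcars[, 2]))

# First few values
print(head(cyl_dollar))
```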
As described in Section 1.5.6, lists contain elements holding different R objects. To index a specific element of a list, you use [[]] double brackets. Below is a toy list:
toy_list <- list(mtcars = mtcars,
                 vector = rep(0, 4),
                 identity = diag(rep(1, 3)))
To access the second element, the vector element, you can type toy_list[[2]]:
toy_list[[2]]
[1] 0 0 0 0
Since the elements are labeled within the list, you can place the label in quotes inside [[]]:
toy_list[["vector"]]
[1] 0 0 0 0
The element can also be accessed using the $ notation:
toy_list$vector
[1] 0 0 0 0
Lastly, you can index further into the list if needed. Here are three ways to access the mpg variable in mtcars from toy_list:
toy_list$mtcars$mpg
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
toy_list[["mtcars"]]$mpg
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
toy_list$mtcars[, 'mpg']
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
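A short sketch of why the double brackets matter: single brackets return a smaller list that still wraps the element, while double brackets and $ return the stored object itself. The list `small_list` is an illustrative stand-in for toy_list, not an object from this section.

```r
small_list <- list(vector = rep(0, 4),
                   identity = diag(rep(1, 3)))

# Single brackets keep the list wrapper
wrapped <- small_list["vector"]
print(is.list(wrapped))

# Double brackets and $ return the stored object
unwrapped <- small_list[["vector"]]
print(is.numeric(unwrapped))
print(is.matrix(small_list$identity))
```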
In R, control flow statements dictate how a program is executed. The first ones we will talk about are the if and else statements. The if statement evaluates a condition. If the condition is satisfied (yields TRUE), R carries out a certain task; if it fails (yields FALSE), the else statement directs R to a different task. Below is a general format:
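A minimal sketch of that general shape follows. The variable `condition`, the variable `result`, and the messages are placeholders chosen for illustration.

```r
# `condition` stands in for any expression that yields a single TRUE or FALSE
condition <- 3 > 2

if (condition) {
  # task carried out when the condition is TRUE
  result <- "condition was TRUE"
} else {
  # task carried out when the condition is FALSE
  result <- "condition was FALSE"
}
print(result)
```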
Below is an example where we generate x from a standard normal distribution and print ‘Positive’ or ‘Non-Positive’ based on the condition x > 0.
x <- rnorm(1)
## if statements
if (x > 0) {
  print("Positive")
} else {
  print("Non-Positive")
}
[1] "Non-Positive"
What if we want to print ‘Negative’ only when the value is negative? We then need to add another if statement after the else statement, since x > 0 only tells us whether the value is positive.
x <- rnorm(1)
if (x > 0) {
  print("Positive")
} else if (x < 0) {
  print("Negative")
}
[1] "Negative"
Above, we add an if statement with the condition (x < 0), which checks whether the number is negative. Lastly, in case x is ever exactly 0, we add a final else statement:
x <- rnorm(1)
if (x > 0) {
  print("Positive")
} else if (x < 0) {
  print("Negative")
} else {
  print("Zero")
}
[1] "Positive"
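Because rnorm() gives a different value on every run, it can help to wrap the same if / else if / else chain in a function and try it on fixed inputs. The function name `sign_label` is an assumption made for this sketch.

```r
# Return a label describing the sign of a single number
sign_label <- function(value) {
  if (value > 0) {
    "Positive"
  } else if (value < 0) {
    "Negative"
  } else {
    "Zero"
  }
}

print(sign_label(2.5))
print(sign_label(-1))
print(sign_label(0))
```

Each branch can now be reached on purpose, including the zero case that a random draw will almost never produce.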
for loops
A for loop is a way to repeat a task a certain number of times. Every time a loop repeats the task, we call it an iteration of the loop. For each iteration, we may change the inputs in a certain way, for example by taking the next value from an indexed vector, and then repeat the task. The general anatomy of a loop looks like:
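A minimal sketch of that anatomy follows. The vector name `values`, the counter `n_iterations`, and the task (squaring each value) are placeholders chosen for illustration.

```r
# The vector that supplies the values i can take
values <- c(2, 4, 6)
n_iterations <- 0

for (i in values) {
  # The task inside the braces runs once per element of `values`
  print(i^2)
  n_iterations <- n_iterations + 1
}

# The loop ran once per element of `values`
print(n_iterations)
```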
The for statement indicates that you will repeat the task inside the braces. The i in the parentheses controls how the task will be completed. The in statement tells R where i can look for its values, and the vector that follows it is an R vector containing the values i can take. That vector also controls how many times the task is repeated, based on its length.
Learning about loops is quite challenging. My recommendation is to read the section below and then break the example code on purpose so you can understand how a for loop works.
for loop
Let’s say we want R to print the numbers one to five separately. We can achieve this by calling print() 5 times.
print(1); print(2); print(3); print(4); print(5)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
However, this takes quite a while to type up. Let’s try to achieve the same task using a for loop.
for (i in 1:5) {
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
Here, i takes each value from the vector 1:5 in turn. Then, R prints the value of i.
Now, let’s try another example with letters. To begin, create a new vector called letters_10 containing the first 10 letters of the alphabet. Use the built-in vector letters to construct the new vector.
letters_10 <- letters[1:10]
Now, we will use a loop to print out the first 10 letters:
for (i in 1:10) {
  print(letters_10[i])
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"
Here, we have i take on the values 1 through 10. Using those values, we index the vector letters_10 by i. The resulting letter is then printed. This task is repeated 10 times.
Lastly, we can replace 1:10 with letters_10 instead:
for (i in letters_10) {
  print(i)
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"
This works because letters_10 holds the values that we want to print, and i takes on the next value of letters_10 on each iteration.
Nested for loops
A nested for loop is a loop that contains another loop within it. Below is a general example of 3 for loops nested within each other:
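A minimal sketch of that general shape, with three loops nested inside one another. The index names `i`, `j`, and `k` and the counting task are illustrative assumptions.

```r
count <- 0

for (i in 1:2) {        # outer loop
  for (j in 1:3) {      # middle loop, runs fully for each i
    for (k in 1:4) {    # inner loop, runs fully for each (i, j) pair
      count <- count + 1
    }
  }
}

# The innermost task runs 2 * 3 * 4 = 24 times
print(count)
```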
As an example, we will use the greekLetters package and its greek_vector vector to obtain Greek letters in R. Lastly, create a vector called greek_10 containing the first 10 Greek letters.
library(greekLetters)
greek_10 <- greek_vector[1:10]
For this example, we want R to print “a” and “α” together:
print(paste0(letters_10[1], greek_10[1]))
[1] "aα"
Now let’s repeat this process to print all possible combinations of the first 3 letters and the first 3 Greek letters:
for (i in 1:3) {
  for (ii in 1:3) {
    print(paste0(letters_10[i], greek_10[ii]))
  }
}
[1] "aα"
[1] "aβ"
[1] "aγ"
[1] "bα"
[1] "bβ"
[1] "bγ"
[1] "cα"
[1] "cβ"
[1] "cγ"
break
A break statement is used to stop a loop midway if a certain condition is met. A general setup of a break statement goes as follows:
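A minimal sketch of that setup, using a fixed vector so the result is the same on every run. The vector `values`, the collector `seen`, and the stopping rule (stop at the first negative value) are assumptions for illustration.

```r
values <- c(3, 1, 4, -1, 5)
seen <- c()

for (v in values) {
  if (v < 0) {
    break  # leave the loop entirely; later elements are never visited
  }
  seen <- c(seen, v)
}

# Only the values before the first negative number were collected
print(seen)
```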
As you can see, there is an if statement in the loop. It tells R when to break the loop. If the if statement were not there, the loop would break on its very first pass, before finishing a single iteration.
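To demonstrate the break statement, we will simulate from a normal distribution with mean 1 and standard deviation 1, storing up to 30 values and stopping as soon as a negative number is simulated:
x <- rep(NA, length.out = 30)
for (i in seq_along(x)) {
  y <- rnorm(1, 1)
  if (y < 0) {
    break
  } else {
    x[i] <- y
  }
}
print(x)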
[1] 0.9773483 1.7917295 1.2907964 2.6436599 0.8576497 2.0094081 2.2106120
[8] 1.5479097 NA NA NA NA NA NA
[15] NA NA NA NA NA NA NA
[22] NA NA NA NA NA NA NA
[29] NA NA
print(y)
[1] -0.1423665
Notice that the vector does not get filled up all the way. That is because the loop breaks once a negative number is simulated.
next
Similar to the break statement, the next statement is used inside loops. It tells R to skip ahead to the next iteration if a certain condition is met.
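A minimal sketch of a next statement, using a fixed vector so the result is the same on every run. The vector `values`, the collector `kept`, and the skipping rule (skip negative values) are assumptions for illustration.

```r
values <- c(3, 1, 4, -1, 5)
kept <- c()

for (v in values) {
  if (v < 0) {
    next  # skip the rest of this iteration and move on to the next value
  }
  kept <- c(kept, v)
}

# The negative value is skipped, but the loop still reaches the final element
print(kept)
```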
The main difference here is that a next statement is used instead of a break statement.
Going back to simulating positive numbers, we will use the same setup but change the break to a next statement.
x <- rep(NA, length.out = 30)
for (i in seq_along(x)) {
  y <- rnorm(1, 1)
  if (y < 0) {
    next
  } else {
    x[i] <- y
  }
}
print(x)
[1] 0.91270459 3.08197264 0.72089222 0.18219414 0.23886831 1.36808335
[7] 3.40865796 0.27035560 NA NA 1.56910622 1.27601839
[13] 1.93305823 0.77069335 1.41109492 1.97015699 1.51544717 1.24035773
[19] NA NA 0.05664593 1.96861889 1.14983838 1.10942886
[25] 1.16644878 0.29784533 0.48227478 1.51119269 1.30830747 1.39608537
As you can see, the vector contains missing values. These correspond to the iterations in which a negative number was simulated.
while loop
The last loop that we will discuss is a while loop. A while loop keeps running for as long as its condition is TRUE and stops once the condition fails. To construct a while loop, we use the while statement with a condition attached to it. In general, a while loop will have the following format:
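A minimal sketch of that format follows. The variables `total` and `steps` and the task (adding 3 each time) are placeholders chosen for illustration.

```r
total <- 0
steps <- 0

# Keep looping while the condition is TRUE
while (total < 10) {
  total <- total + 3   # the task, which also changes what the condition sees
  steps <- steps + 1   # count how many iterations were needed
}

print(total)
print(steps)
```

The loop passes through totals of 3, 6, and 9 and stops once the total reaches 12, because 12 < 10 is FALSE.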
Above, we see that the while statement is followed by a condition. The loop then completes its task and re-evaluates the condition. If the condition yields a FALSE value, the loop stops. Otherwise, it continues.
Basic while loops
To implement a basic while loop, we will return to the previous example of simulating positive numbers. We want to simulate 30 positive numbers from a standard normal distribution:
x <- c()
size <- 0
while (size < 30) {
  y <- rnorm(1)
  if (y > 0) {
    x <- c(x, y)
  }
  size <- length(x)
}
print(size)
[1] 30
print(x)
[1] 0.005245146 0.280625589 0.526876606 0.822030249 2.246205824 1.952491935
[7] 0.670100699 2.311234135 0.688772123 0.320199750 1.397227140 0.830938390
[13] 0.178526953 0.478543903 0.329451783 0.170460111 0.838914598 0.532007459
[19] 1.308559454 1.807544365 0.102020257 0.556702144 0.914246544 1.661145724
[25] 0.128944880 0.479945924 0.034947857 0.153277439 0.011630151 0.856472034
Notice that we do not use an else statement. This is because we do not need R to complete a task if the condition fails.
Infinite while loops
With while loops, we must be wary of potential infinite loops. These occur when the condition never yields a FALSE value. Therefore, R never stops the loop because nothing tells it to.
For example, let’s say we are interested in whether consecutive values of sin(x) ever get arbitrarily close as x increases by one:
x <- 1
diff <- 1
while (diff > 1e-20) {
  old_x <- x
  x <- x + 1
  diff <- abs(sin(x) - sin(old_x))
}
print(x)
print(diff)
My condition above checks whether the absolute difference between sequential values is smaller than 1e-20. Because sin(x) keeps oscillating, that effectively never happens, so the loop keeps running.
To prevent an infinite while loop, we can add a counter to the condition statement. The counter condition must also be true for the loop to continue, so we can arbitrarily stop the loop once it has iterated a certain number of times. We just need to make sure to add one to the counter on every iteration. Below is the code that adds a counter to the while loop:
x <- 1
counter <- 0
diff <- 1
while (diff > 1e-20 && counter < 10^3) {
  old_x <- x
  x <- x + 1
  diff <- abs(sin(x) - sin(old_x))
  counter <- counter + 1
}
print(x)
[1] 1001
print(diff)
[1] 0.09311106
print(counter)
[1] 1000
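One way to tell which part of the condition ended the loop is to inspect the counter afterwards. The sketch below assumes a hypothetical cap `max_iter` and swaps in a sequence that does settle down (halving x each time), so that the tolerance, not the counter, stops the loop. The flag `hit_cap` is likewise an illustrative name.

```r
x <- 1
diff <- 1
counter <- 0
max_iter <- 1000  # hypothetical safety cap on the number of iterations

while (diff > 1e-20 && counter < max_iter) {
  old_x <- x
  x <- x / 2                 # halving makes successive values get closer
  diff <- abs(x - old_x)
  counter <- counter + 1
}

# TRUE would mean the loop stopped only because the cap was reached
hit_cap <- counter >= max_iter
print(counter)
print(hit_cap)
```

If `hit_cap` comes back TRUE, as it would for the sin(x) example above, the loop ended because of the counter rather than because the difference became small enough.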