2  Control Flow

2.1 Indexing

2.1.1 Vectors

In Section 1.5, we discussed different types of R objects. For example, a vector holds a set number of elements of a single data type. Here we construct a vector called x that increases from -5 to 5 in steps of one:

(x <- -5:5)
 [1] -5 -4 -3 -2 -1  0  1  2  3  4  5

The vector x has 11 elements. To find out what the 6th element of x is, we can index the vector. To do this, we place [] square brackets after x with the position inside. For example, we index the 6th element of x:

x[6]
[1] 0

Whenever we place [] after an R object, R returns the elements at the positions given inside the square brackets. We can index an R object with multiple values:

x[1:3]
[1] -5 -4 -3
x[c(3,9)]
[1] -3  3

Notice how the second line uses c(). This is necessary when we want to specify non-contiguous elements. Now let’s see how we can index a matrix.

2.1.2 Matrices

A matrix can be indexed the same way as a vector using the [] brackets. However, since a matrix is a 2-dimensional object, we need to include a comma to separate the dimensions: [,]. The value before the comma indexes the row and the value after the comma indexes the column. To begin, we create the following \(4 \times 3\) matrix:

(x <- matrix(1:12, nrow = 4, ncol = 3))
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

Now to index the element at row 2 and column 3, use x[2, 3]:

x[2, 3]
[1] 10

We can also index an entire row or column by leaving the other position blank:

x[2,]
[1]  2  6 10
x[,3]
[1]  9 10 11 12
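
As a further sketch, the same bracket notation accepts vectors of row and column positions, so several rows and columns can be extracted at once. The names m, sub_m, and row_2 below are chosen only for this illustration so that x is not overwritten.

```r
# Rebuild the 4 x 3 matrix from above under a separate (illustrative) name
m <- matrix(1:12, nrow = 4, ncol = 3)

# Rows 1 and 2, columns 1 and 3: the result is itself a 2 x 2 matrix
sub_m <- m[1:2, c(1, 3)]
print(sub_m)

# A single row is returned as a plain vector by default, not a matrix
row_2 <- m[2, ]
print(is.matrix(row_2))
```

Here c() is used for the non-contiguous columns, just as it was for vectors.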

2.1.3 Data Frames

There are several ways to index a data frame. Since it has a matrix-like format, you can index it the same way as a matrix. Here are a couple of examples using the mtcars data frame:

mtcars[,2]
 [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
mtcars[2,]
              mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4

However, since a data frame has labeled components (its variables), we can also index it with a variable name inside the brackets:

mtcars[, "cyl"]
 [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

Lastly, a data frame can be indexed to a specific variable using the $ notation as described in Section 1.5.5.
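
The short sketch below compares the three approaches on the cyl variable. The object names by_position, by_name, and by_dollar are made up for this illustration.

```r
# mtcars ships with R (datasets package), so nothing needs to be loaded
by_position <- mtcars[, 2]      # second column, by position
by_name     <- mtcars[, "cyl"]  # column named "cyl"
by_dollar   <- mtcars$cyl       # $ notation

# All three approaches return the same numeric vector
print(identical(by_position, by_name))
print(identical(by_name, by_dollar))
```

Indexing by name is usually the safer choice, since it keeps working if the column order changes.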

2.1.4 Lists

As described in Section 1.5.6, lists contain elements holding different R objects. To index a specific element of a list, you will use [[]] double brackets. Below is a toy list:

toy_list <- list(mtcars = mtcars,
                 vector = rep(0, 4),
                 identity = diag(rep(1, 3)))

To access the second element, the vector element, you can type toy_list[[2]]:

toy_list[[2]]
[1] 0 0 0 0

Since the elements are labeled within the list, you can place the label in quotes inside [[]]:

toy_list[["vector"]]
[1] 0 0 0 0

The element can also be accessed using the $ notation:

toy_list$vector
[1] 0 0 0 0

Lastly, you can index the result further if needed. For example, we can access the mpg variable in mtcars from toy_list in several ways:

toy_list$mtcars$mpg
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
toy_list[["mtcars"]]$mpg
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
toy_list$mtcars[,'mpg']
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
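
The same chaining works for the other elements of the list. The sketch below pulls one value out of the identity matrix, and also shows that single brackets, unlike double brackets, return a list rather than the stored object. The names center and wrapped are illustrative.

```r
toy_list <- list(mtcars = mtcars,
                 vector = rep(0, 4),
                 identity = diag(rep(1, 3)))

# Double brackets return the stored object itself, so it can be indexed again
center <- toy_list[["identity"]][2, 2]   # row 2, column 2 of the 3 x 3 matrix
print(center)

# Single brackets keep the list wrapper: the result is a list of length one
wrapped <- toy_list["vector"]
print(class(wrapped))
print(class(toy_list[["vector"]]))
```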

2.2 If/Else Statements

In R, there are control flow statements that dictate how a program is executed. The first set we will talk about are if and else statements. The if statement evaluates a condition. If the condition is satisfied (yields TRUE), R carries out a certain task; if it fails (yields FALSE), the else statement directs R to a different task. Below is the general format:

Important Concept
if (condition) {
  TRUE task
} else {
  FALSE task
}

2.2.1 Example

Below is an example where we generate x from a standard normal distribution and print the statement ‘positive’ or ‘non-positive’ based on the condition x > 0.

x <- rnorm(1)

## if statements
if (x > 0){
  print("Positive")
} else {
  print("Non-Positive")
}
[1] "Non-Positive"

What if we want to print the statement ‘negative’ as well if the value is negative? We will then need to add another if statement after the else statement since x > 0 only lets us know if the value is positive.

x <- rnorm(1)

if (x > 0){
  print("Positive")
} else if (x < 0) {
  print("Negative")
}
[1] "Negative"

Above, we add the if statement with condition (x < 0) indicating if the number is negative. Lastly, if x is ever \(0\), we will want R to let us know it is \(0\). We can achieve this by adding one last else statement:

x <- rnorm(1)

if (x > 0){
  print("Positive")
} else if (x < 0) {
  print("Negative")
} else {
  print("Zero")
}
[1] "Positive"
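
Because rnorm() will essentially never return exactly 0, the final else branch is hard to see in action with simulated data. As a small sketch, we can fix x at 0 by hand and store the result in a variable (label is a name made up for this illustration) to confirm that the last branch is reached.

```r
x <- 0   # fixed value chosen for illustration instead of rnorm(1)

if (x > 0) {
  label <- "Positive"
} else if (x < 0) {
  label <- "Negative"
} else {
  label <- "Zero"       # reached only when both conditions above are FALSE
}
print(label)
```

Changing x to 2.5 or -1 and rerunning the code exercises the other two branches.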

2.3 for loops

A for loop is a way to repeat a task a certain number of times. Each time the loop repeats the task, we call it an iteration of the loop. For each iteration, we may change the inputs in a certain way, typically by taking the next value from a vector, and then repeat the task. The general anatomy of a loop looks like:

Important Concept
for (i in vector){
  perform task
}

The for statement indicates that you will repeat the task inside the curly braces. The i in the parentheses is the variable that changes from one iteration to the next. The in statement tells R where i can look for its values, and vector is a vector R object that contains the values i can take. The length of the vector also controls how many times the task will be repeated.
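
To make the link between the vector and the number of iterations concrete, here is a minimal sketch. The names values, n_iterations, and total are made up for this illustration.

```r
values <- c(2, 4, 6)   # illustrative vector; any vector would work
n_iterations <- 0
total <- 0

for (i in values) {
  n_iterations <- n_iterations + 1  # count each pass through the loop
  total <- total + i                # i holds the current element of values
}

print(n_iterations)   # equals length(values)
print(total)          # sum of the elements: 2 + 4 + 6
```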

Learning about loops is quite challenging. My recommendation is to read the section below and break the example code so you can understand how a for loop works.

2.3.1 Basic for loop

Let’s say we want R to print one to five separately. We can achieve this by repeating the print() 5 times.

print(1); print(2); print(3); print(4); print(5)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

However, this takes quite a while to type up. Let’s try to achieve the same task using a for loop.

for (i in 1:5){
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

Here, i will take each value from the vector 1:5 in turn. Then, R will print out the value of i.

Now, let’s try another example with letters. To begin, create a new vector called letters_10 containing the first 10 letters of the alphabet. Use the built-in vector letters to construct the new vector.

letters_10 <- letters[1:10]

Now, we will use a loop to print out the first 10 letters:

for (i in 1:10) {
  print(letters_10[i])
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"

Here, we have i take on the values 1 through 10. Using those values, we index the vector letters_10 by i. The resulting letter is then printed. This task is repeated 10 times.

Lastly, we can replace 1:10 with letters_10 instead:

for (i in letters_10){
  print(i)
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"

This works because letters_10 contains the values that we want to print, and i takes on the next value of letters_10 in each iteration.
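
When we need both the position and the value, a common approach is to loop over positions generated by seq_along(), which returns 1, 2, ..., up to the length of its argument. The sketch below is one way to write this; it avoids hard-coding the length 10.

```r
letters_10 <- letters[1:10]

# seq_along(letters_10) produces 1, 2, ..., 10 without typing the length
for (i in seq_along(letters_10)) {
  # print the position and the letter at that position together
  print(paste(i, letters_10[i]))
}
```

If letters_10 were later shortened or extended, this loop would adjust automatically, whereas 1:10 would need to be edited by hand.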

2.3.2 Nested for loops

A nested for loop is a loop that contains another loop within it. Below is a general example of 3 for loops nested within each other:

Important Concept
for (i in vector_1) {
  for (ii in vector_2) {
    for (iii in vector_3) {
      perform task
    }
  }
}

As an example, we will use the greekLetters package and its greek_vector vector to obtain Greek letters in R. Then, create a vector called greek_10 containing the first 10 Greek letters.

library(greekLetters)
greek_10 <- greek_vector[1:10]

For this example, we want R to print “a” and “\(\alpha\)” together as demonstrated below:

print(paste0(letters_10[1], greek_10[1]))
[1] "aα"

Now let’s repeat this process to print all possible combinations of the first 3 letters and the first 3 Greek letters:

for (i in 1:3){
  for (ii in 1:3){
    print(paste0(letters_10[i], greek_10[ii]))
  }
}
[1] "aα"
[1] "aβ"
[1] "aγ"
[1] "bα"
[1] "bβ"
[1] "bγ"
[1] "cα"
[1] "cβ"
[1] "cγ"
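
Instead of printing, the results of a nested loop can be stored, which makes the order of the iterations easier to inspect. The sketch below is only an illustration: it uses the digits 1 to 3 as a stand-in for the Greek letters so that it runs without any extra package, and the names first_letters, numbers_3, and combos are made up.

```r
first_letters <- letters[1:3]
numbers_3 <- 1:3   # illustrative stand-in for the Greek letters
combos <- matrix(NA_character_, nrow = 3, ncol = 3)

for (i in 1:3) {
  for (ii in 1:3) {
    # the inner loop runs through all of its values for each value of i
    combos[i, ii] <- paste0(first_letters[i], numbers_3[ii])
  }
}
print(combos)
```

Row i of combos holds the combinations produced while the outer loop was at its i-th value, so the task ran 3 times 3, or 9, times in total.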

2.4 break

A break statement is used to stop a loop midway if a certain condition is met. The general setup of a break statement is as follows:

Important Concept
for (i in vector){
  if (condition) {
    break
  } else {
    task
  }
}

As you can see, there is an if statement in the loop. This is used to tell R when to break the loop. If the if statement were not there, the loop would break during its very first iteration without completing the task.

To demonstrate the break statement, we will simulate from a \(N(1,1)\) distribution until we have 30 positive numbers or we simulate a negative number.

x <- rep(NA, length.out = 30)

for (i in seq_along(x)){
  y <- rnorm(1, mean = 1)
  if (y < 0) {
    break
  } else {
    x[i] <- y
  }
}
print(x)
 [1] 0.9773483 1.7917295 1.2907964 2.6436599 0.8576497 2.0094081 2.2106120
 [8] 1.5479097        NA        NA        NA        NA        NA        NA
[15]        NA        NA        NA        NA        NA        NA        NA
[22]        NA        NA        NA        NA        NA        NA        NA
[29]        NA        NA
print(y)
[1] -0.1423665

Notice that the vector does not get filled up all the way. That is because the loop breaks once a negative number is simulated.
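
Since simulated values change on every run, it can help to see break with fixed inputs. The sketch below uses made-up numbers in a vector called draws (an illustrative name, as is kept) in place of rnorm().

```r
draws <- c(1.2, 0.4, 2.1, -0.3, 0.8)  # fixed, made-up values instead of rnorm()
kept <- rep(NA, length(draws))

for (i in seq_along(draws)) {
  if (draws[i] < 0) {
    break                 # leave the loop entirely at the first negative value
  }
  kept[i] <- draws[i]
}
print(kept)   # positions 4 and 5 stay NA
print(i)      # the loop stopped at position 4
```

The final value 0.8 is never stored, because the loop ended before reaching it.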

2.5 next

Similar to the break statement, the next statement is used inside loops. It tells R to skip the rest of the current iteration and move on to the next iteration if a certain condition is met.

Important Note
for (i in vector){
  if (condition) {
    next
  } else {
    task
  }
}

The main difference here is that a next statement is used instead of a break statement.

Going back to simulating positive numbers, we will use the same setup but change it to a next statement.

x <- rep(NA, length.out = 30)

for (i in seq_along(x)){
  y <- rnorm(1, mean = 1)
  if (y < 0) {
    next
  } else {
    x[i] <- y
  }
}
print(x)
 [1] 0.91270459 3.08197264 0.72089222 0.18219414 0.23886831 1.36808335
 [7] 3.40865796 0.27035560         NA         NA 1.56910622 1.27601839
[13] 1.93305823 0.77069335 1.41109492 1.97015699 1.51544717 1.24035773
[19]         NA         NA 0.05664593 1.96861889 1.14983838 1.10942886
[25] 1.16644878 0.29784533 0.48227478 1.51119269 1.30830747 1.39608537

As you can see, the vector contains missing values. These correspond to the iterations in which a negative number was simulated.
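
For a reproducible comparison with break, the sketch below applies next to a fixed vector of made-up numbers. The names draws and kept are illustrative.

```r
draws <- c(1.2, 0.4, 2.1, -0.3, 0.8)  # fixed, made-up values instead of rnorm()
kept <- rep(NA, length(draws))

for (i in seq_along(draws)) {
  if (draws[i] < 0) {
    next                  # skip this iteration only, then continue the loop
  }
  kept[i] <- draws[i]
}
print(kept)   # only position 4 stays NA
```

Unlike with break, the loop keeps going after the negative value, so the final value 0.8 is stored.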

2.6 while loop

The last loop that we will discuss is the while loop. A while loop keeps running as long as a certain condition holds. To construct a while loop, we use the while statement with a condition attached to it. In general, a while loop has the following format:

Important Concept
while (condition) {
  task
  update condition
}

Above, we see that the while statement is followed by a condition. The loop completes its task and updates the condition. If the condition then yields FALSE, the loop stops. Otherwise, it continues.

2.6.1 Basic while loops

To implement a basic while loop, we will revisit the previous example of simulating positive numbers. We want to keep simulating from \(N(0,1)\) until we have 30 positive values. Here, the loop should continue as long as we have fewer than 30 numbers. Therefore we can use the following code to simulate the values:

x <- c()
size <- 0
while (size < 30){
  y <- rnorm(1) 
  if (y > 0) {
    x <- c(x, y)
  }
  size <- length(x)
}
print(size)
[1] 30
print(x)
 [1] 0.005245146 0.280625589 0.526876606 0.822030249 2.246205824 1.952491935
 [7] 0.670100699 2.311234135 0.688772123 0.320199750 1.397227140 0.830938390
[13] 0.178526953 0.478543903 0.329451783 0.170460111 0.838914598 0.532007459
[19] 1.308559454 1.807544365 0.102020257 0.556702144 0.914246544 1.661145724
[25] 0.128944880 0.479945924 0.034947857 0.153277439 0.011630151 0.856472034

Notice that we do not use an else statement. This is because we do not need R to complete a task if the condition fails.
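
A while loop is also useful when the number of iterations is not known in advance but the stopping rule is. As a small deterministic sketch, the code below doubles a number until it reaches at least 100 and counts how many doublings were needed. The names n and steps are made up for this illustration.

```r
n <- 1
steps <- 0

while (n < 100) {
  n <- n * 2            # task: double the current value
  steps <- steps + 1    # keep track of how many iterations have run
}

print(n)       # 128, the first power of 2 that is at least 100
print(steps)   # 7 doublings were needed
```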

2.6.2 Infinite while loops

With while loops, we must be wary of potential infinite loops. These occur when the condition never yields a FALSE value. Therefore, R will never stop the loop because it does not know when to do so.

For example, let’s say we are interested in whether \(y = \sin(x)\) converges to a certain value as \(x\) increases. As you know, it will not converge; however, we can construct a while loop to check:

x <- 1
diff <- 1
while (diff > 1e-20) {
  old_x <- x
  x <- x + 1
  diff <- abs(sin(x) - sin(old_x))
}
print(x)
print(diff)

The condition above keeps the loop running while the absolute difference between sequential values is larger than \(10^{-20}\). As you may know, the absolute difference does not shrink toward zero, so it will not become that small. Therefore, the loop will continue on without stopping.

To prevent an infinite while loop, we can add a counter to the condition statement. The counter condition must also be TRUE for the loop to continue. Therefore, we can arbitrarily stop the loop once it has iterated a certain number of times. We just need to make sure to add one to the counter in every iteration. Below is the code that adds a counter to the while loop:

x <- 1
counter <- 0
diff <- 1
while (diff > 1e-20 && counter < 10^3) {
  old_x <- x
  x <- x + 1
  diff <- abs(sin(x) - sin(old_x))
  counter <- counter + 1
}
print(x)
[1] 1001
print(diff)
[1] 0.09311106
print(counter)
[1] 1000
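
After a loop with a counter finishes, it is worth checking which part of the condition ended it. The sketch below illustrates this with a different, made-up problem that does converge: repeatedly applying cos() to a starting value, which settles near 0.739. The names max_iter and tol, and their values, are assumptions chosen for this illustration.

```r
x <- 1
diff <- 1
counter <- 0
max_iter <- 1000    # illustrative cap on the number of iterations
tol <- 1e-8         # illustrative tolerance

while (diff > tol && counter < max_iter) {
  old_x <- x
  x <- cos(x)                 # repeatedly apply cos(); this sequence converges
  diff <- abs(x - old_x)
  counter <- counter + 1
}

# Report which part of the condition ended the loop
if (counter >= max_iter) {
  print("Stopped by the counter")
} else {
  print("Stopped because the tolerance was reached")
}
print(x)
```

In the sin() example above, the same check would report that the counter stopped the loop, which signals that the result should not be treated as a converged value.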