Chapter 6 Control Flow
6.1 Conditioning
6.1.1 if
The if
statement (a function indeed) can handle conditional branching of the codes. It can be followed by optional else if
and else
branches.
if ( TRUE ) {
print("Yes it is true.")
} else if ( TRUE ) {
print("This is never printed.")
} else {
print("When nothing above holds, this is printed.")
}
## [1] "Yes it is true."
Notice that when else if
exists, conditions are checked in order. The first matched branch will be executed and all the other dsicarded. This is why the second branch is never triggered in the above example even if the condition is TRUE
as well.
The condition in if
can not handle logical vectors with length more than one. In the following example, a warning will be issued with the result clearly explained.
if ( c(TRUE, FALSE) ) {
1 + 1
}
## Warning in if (c(TRUE, FALSE)) {: the condition has length > 1 and only the
## first element will be used
## [1] 2
This is why sometimes people think there does exist scalar variable in R. But the truth is that everything is in vector form.
There is a more functional way of using if
. Consider this example:
# set.seed(777)
cond <- sample(1:10, 1)
res <- if ( cond > 5 ) "foo" else "bar"
res
## [1] "bar"
The above code is stochastic due to sample
without a fixed seed
. Here one directly asign the result of an if
to a variable. Such syntax only works in functional language. It works because if
is a function, and a function should always return something–here evaluation of the last expression of if
is the returned value of it.
When the conditioning is more complicated, a recommended indent style may look like:
# set.seed(777)
cond <- sample(1:10, 1)
res <-
if ( cond > 5 ) {
message("cond is ", cond, ": go if")
"foo"
} else {
message("cond is ", cond, ": go else")
"bar"
}
## cond is 1: go else
res
## [1] "bar"
6.1.2 ifelse
There is a vectorized version of if
: the ifelse
function. Now it accepts a vector of any length:
vec_cond <- sample(1:10, 10, replace=TRUE) > 5
ifelse(sample(1:10, 10, replace=TRUE) > 5, 1:10, -(1:10))
## [1] 1 -2 -3 -4 -5 6 -7 -8 9 -10
Notice that the call to sample
results in a vector of length 10, the second argument is a vector when the condition holds, and the third for condition not holds. What happen if the vector lengths are different? Recycling will occur. See section 8.3 for more details about recycling.
6.1.3 Logical Operations
There are in general two types of logical operators in R. The single form (like &
, |
) and the double form (like &&
annd ||
). The former is vectorized, i.e., element-wise operator, and the latter is scalar-like: only the first element is processed. To see things clear:
c(TRUE, FALSE, FALSE) & -1:1
## [1] TRUE FALSE FALSE
c(TRUE, FALSE, FALSE) && -1:1
## [1] TRUE
Users who are familiar with other programming language may feel confused because in most general-purposed language only the double form is used as logical operator and the single form is used to perform bit-wise operation. In the R language this is simply not true. The single form serves both as bit-wise and also vectorized logical operator. Use xor
for vectorized logical AND bit-wise XOR operator.
One should also note that there is implicit casting (or coercion) happening in the above example. Numeric vectors are coerced into logical ones before a logical operation actually takes place.
6.2 Loop
6.2.1 repeat
To repeat a code chunk, use repeat
. Since there is no conditioning in repeat
, usually users must also specify a conditonal break
by if
to avoid an infinite loop.
cnt <- 0
repeat {
print("Hello World!")
cnt <- cnt + 1
if ( cnt >= 5 )
break
}
## [1] "Hello World!"
## [1] "Hello World!"
## [1] "Hello World!"
## [1] "Hello World!"
## [1] "Hello World!"
6.2.2 while
The while
loop is another common control flow available in most programming language. It accepts a condition and as long as the condition holds the code chunk will repeat. The condition is checked at the very begining of each run.
cnt <- 0
while ( cnt < 5 ) {
print("Hello while!")
cnt <- cnt + 1
}
## [1] "Hello while!"
## [1] "Hello while!"
## [1] "Hello while!"
## [1] "Hello while!"
## [1] "Hello while!"
There is no “until” loop in R. One can use repeat
to implement an “until” function. Also a while ( TRUE )
implementation can be readily replaced with repeat.
6.2.3 for
Another must-have loop structure is the for
loop. It implements iteration operations. The basic syntax of for
would look like:
for ( i in 1:5 )
print(i)
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
Remember that one can use if
functionally. That is, we cab asign the returned value of a if
to a variable. Is that also true for for
? Consider the following codes:
res <- for ( i in 1:10 ) i
res
## NULL
i
## [1] 10
Oops! the variable res
has a value of NULL
, not a value of the last evaluated i
(which shall be 10). Does this mean that for
is not a function? No. Every operation in R is a function call. if
is a function. for
is a function. But for
is a function that will return NULL
by default and that is not alterable. More interesting, for
actually returns NULL
invisibly, so one won’t get a NULL
printed to the console everytime when a for
is done.
Iterating over numbers is not all what for
can do. Basically a for
can iterate on any vector, atomic or recursive. This is powerful and is often ignored by beginners.
a <- rnorm(100)
b <- runif(100)
for ( l in list(a,b) )
print(mean(l))
## [1] -0.08564948
## [1] 0.4978582
All vectors can be iterated. What about a matrix
? Remember that a matrix
is simply an atomic vector
with a dim
attribute. So surely it can be iterated: (only for illustration purpose because such scenario is rarely practical)
for ( i in matrix(1:4,2,2) ) print(i)
## [1] 1
## [1] 2
## [1] 3
## [1] 4
for
is useful. But don’t use for
for large-sized iteration. The overhead of for
is usually quite high and since most basic operations in R are vectorzied, one should consider vectorizing in the first place when developing scripts. The following somewhat silly coding illustrates the basic difference in performance when vectorization is available:
tt <- vv <- kk <- integer(1e5)
system.time(for ( i in 1:length(tt) ) vv[i] <- tt[i] + 1)
## user system elapsed
## 0.146 0.006 0.155
system.time(kk <- tt + 1)
## user system elapsed
## 0 0 0
identical(vv, kk)
## [1] TRUE
To understand more, see the section 8.2.