Fine Print
R or RStudio, command prompt, git, etc., or crash/burn/blow up your computer, please be aware that I do not accept any responsibility.
Important Notes
Git.
Bitbucket in a remote public git repository, which can be found here.
RStudio using R and R Markdown.
R Markdown as this one.
R such as
R and related software
R basics
R objects
R packages
R programming
R functions
R is a sequentially interpreted, object-oriented programming language for statistical computing, data mining, web scraping, graphics and more.
R cannot handle two procedures at the same time. As a consequence, the code you write is read starting from the first line, then the second, and so on until the last line.
With R, you can perform simple calculations, vector and matrix operations, and data manipulation, create your own functions and procedures, and do almost anything you want with data in an easy and ordered way.
R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand in 1993.
R is named partly after the first names of the first two R authors and partly as a play on the name of the S programming language.
R is currently developed by the R Development Core Team and supported by The R Foundation.
The advantages of R are as follows:
R matches or exceeds most of the features of the currently available statistical packages.
R is open-source and completely free in the sense of monetary cost.
R packages.
With R, you can write code and save it for replication, debugging and modification. You cannot do that when using commands from menus or dialog boxes of some econometric programs.
R is very similar across methods. So, if you know other programming languages, your job will be relatively easy.
The disadvantages of R are as follows:
R has a very steep learning curve.
RStudio, which is the integrated development environment (IDE) for R.
R compels you to type in commands for every task. However, it is good practice to get used to typing your own commands from scratch.
RStudio is the best IDE available for R.
R easier to use.
R packages, and git.
The features of RStudio that I like most are
R help and documentation
R Project feature for easily managing multiple working directories
R Markdown
We will use R as our main programming software, but to make coding easier and more fun we will use RStudio as our IDE, which will use R in the background.
For RStudio preferences settings in detail, please see the R vs. RStudio section in the Installation_and_Software_Notes.html file.
RStudio preferences.
R and RStudio in terms of user functionality.
As an R programmer, the first place you should check is The R Project for Statistical Computing web page.
R, help guides and manuals, along with the latest distribution of R for different platforms (Windows, Mac, Linux) hosted by the Comprehensive R Archive Network (CRAN).
R distributions, contributed extensions, documentation for R, and binaries.
R.
R installation files can be found here. In the files section, you need to use the installation package matching your operating system.
R installation files can be found here.
Once the R software is installed, you can open it to see how it looks, but you don’t need to since we will be using RStudio.
Once R is installed, you need to download and install RStudio.
RStudio can be downloaded from here.
RStudio Desktop version, not the RStudio Server.
Once RStudio is installed, open the software and start exploring.
To open the R project directly, go to the main file directory of the workshop downloaded to your computer from the remote git repository.
Opening RStudio directly from the R mini BootCamp.Rproj file will set your working directory as the main file directory of the workshop, load some pre-defined paths and functions, and even install and load some R packages.
R version used to build this document.
R, your best friends will be
R and its extensions, please see
R code, please see
R cheat sheets, R manuals from the R Development Core Team, notes of the Data Science Specialization online class by Xing Su, and some important articles.
R basics such as
R objects such as vectors, factors, logicals, matrices, arrays, lists and data frames
RStudio is console and editor.
When you open RStudio, the console shows some basic information about the program’s version, license and citation information.
The > symbol (prompt) and a blinking cursor indicate that RStudio is waiting for some input.
> 10 + 5
#> [1] 15
>
> (2 * ((10 + 5 - 3) * 2 - 9)) / 2^1
#> [1] 15
Output lines are marked with the #> symbol.
Note the [1] at the beginning of each result.
[1] 15 tells us that the result is a one-dimensional array containing only the number 15.
The prompt symbol > will be dropped after this point.
R commands can be run in the editor, which allows you to save your code, modify it and run it whenever necessary.
15 + 5
#> [1] 20
((10 - 2)/4)^2
#> [1] 4
log(10) ## Takes the natural logarithm of the input.
#> [1] 2.3025851
log(exp(5)) ## Note the exponential function written as "exp()".
#> [1] 5
sin(pi/6) ## Sine function and π.
#> [1] 0.5
In R, the # sign is used to comment out your code, text, etc.
After the # sign, all the code, text, etc. will be ignored by R and printed as text.
Use # for a full comment line and use ## for a code line with a comment at the end.
Use # for sections and ## for subsections to ease the reading.
# This is a comment line which can be very long if you wish.
3 * 3 ## This is a code line with a comment at the end.
#> [1] 9
# This is a comment line for a section.
## This is a comment for the subsection.
R regards everything as objects.
R objects.
R objects such as vectors, factors, logicals, matrices, arrays, data frames, and lists.
Before going into the details of R objects, let’s see how to create a simple R object.
There are 3 approaches to create any R object, which are shown in the code below.
# Approach 1: try to use this approach.
x <- 6
print(x) ## Prints the object. I rarely use this function.
#> [1] 6
x ## Also prints the object.
#> [1] 6
z <- "R mini BootCamp" ## A character string. Note that strings in R are contained within double quotes.
z
#> [1] "R mini BootCamp"
r <- 1:10 ## ":" operator generates regular sequences.
r
#> [1] 1 2 3 4 5 6 7 8 9 10
# Approach 2: try to avoid this approach.
y = 4
y
#> [1] 4
w = "Go Wolfpack"
w
#> [1] "Go Wolfpack"
# Approach 3: try to avoid this approach.
assign("ncsu", "Go Wolfpack") ## Assigns the "Go Wolfpack" character string to the ncsu object.
ncsu
#> [1] "Go Wolfpack"
The R programming language is case sensitive.
For naming R objects, there is no agreed-upon naming convention as in other programming languages. However, in general I recommend following the coding style below.
Use new.object instead of new-object or new_object. (new-object is not a valid name at all, since R parses the hyphen as subtraction.)
A name starting with a number (e.g., 2b) is not accepted by R.
Avoid names that are already used by R functions.
conflicts(detail = TRUE)$.GlobalEnv checks if a user-created R object conflicts with built-in objects supplied from R packages.
R objects listed by that code should be removed in order to use the built-in objects with the same name. Otherwise, you might get unexpected results.
You can override an R object by creating a different R object with the same name.
x <- 6 ## With lower case name.
x
#> [1] 6
X <- "NCSU" ## With capital letter name.
X
#> [1] "NCSU"
new.object <- 2:6 ## Don't use new_object or new-object.
new.object
#> [1] 2 3 4 5 6
conflicts(detail = TRUE)$.GlobalEnv
#> [1] "income" "x" "y" "year" "lines" "data" "c"
y <- "R mini BootCamp" ## Creating an object.
y
#> [1] "R mini BootCamp"
y <- 10:15 ## Overriding it with different values.
y
#> [1] 10 11 12 13 14 15
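The advice above about removing conflicting objects can be sketched as follows. This is only an illustration under an assumption: the user-created object named pi is hypothetical and is chosen because it masks a built-in constant. The rm function removes the user-created object, after which the built-in value is found again.

```r
# Hypothetical example: a user-created object that masks a built-in one.
assign("pi", 3, envir = globalenv())       # masks base::pi in the global environment
masked <- get("pi", envir = globalenv())   # 3: the user-created value is found first

# Remove the user-created object so that the built-in constant is visible again.
rm("pi", envir = globalenv())
restored <- get("pi", envir = globalenv()) # 3.141593: the built-in value

masked
restored
```

In an interactive session the same thing is usually written simply as pi <- 3 followed by rm(pi); the explicit envir argument is used here only so that the sketch behaves the same wherever it is run.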
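R language.
An empty vector can be created with the vector function.
A vector with values can be created with the concatenate function c.
To get the length of a vector, the length function should be used.
Use class, str and some specific control functions to check the class and structure of vectors.
# Creating an empty vector.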
x <- vector("numeric", length = 10) ## Defines the class and length of a vector. You can use other classes that we will see in detail later.
x
#> [1] 0 0 0 0 0 0 0 0 0 0
# Vector with numeric value.
a <- c(5, 6, 7, 8, 9) ## Concatenate function, which creates a vector with numeric values.
a
#> [1] 5 6 7 8 9
length(a) ## Gives the length of a vector.
#> [1] 5
dim(a) ## There is no dimension for vectors.
#> NULL
is.numeric(a) ## Checks whether the vector is numeric.
#> [1] TRUE
is.double(a) ## Numeric class in R is also called "double". So you can use "is.double" function as well.
#> [1] TRUE
class(a) ## Gives the class of an object in a character string.
#> [1] "numeric"
str(a) ## Gives the details of object structure (class of the object and its values). Try to use it frequently, it is very useful.
#> num [1:5] 5 6 7 8 9
seq(from = 1, to = 10, by = 1) ## seq function generates regular sequences.
#> [1] 1 2 3 4 5 6 7 8 9 10
seq(from = 1, to = 10, by = 4)
#> [1] 1 5 9
rep(1, each = 4) ## Replicates each value 4 times.
#> [1] 1 1 1 1
rep(c(2:5), times = 3) ## Replicates the whole vector 3 times.
#> [1] 2 3 4 5 2 3 4 5 2 3 4 5
rep(c(2:5), each = 3) ## Replicates each value 3 times.
#> [1] 2 2 2 3 3 3 4 4 4 5 5 5
# Vector with integer values.
x <- c(1L, 2L, 3L, 4L) ## To create an integer in R use "L" after the numeric value.
x
#> [1] 1 2 3 4
is.integer(x) ## Checks whether the vector is integer.
#> [1] TRUE
class(x)
#> [1] "integer"
str(x)
#> int [1:4] 1 2 3 4
# Vector with complex values.
a <- c(1 + 0i, 2 + 4i) ## Vector with complex values.
a
#> [1] 1+0i 2+4i
is.complex(a) ## Checks whether the vector is complex.
#> [1] TRUE
class(a)
#> [1] "complex"
str(a)
#> cplx [1:2] 1+0i 2+4i
# Vector with character value.
a <- c("NCSU", "Wolfpack")
a
#> [1] "NCSU" "Wolfpack"
is.character(a) ## Checks whether the vector is character.
#> [1] TRUE
class(a)
#> [1] "character"
str(a)
#> chr [1:2] "NCSU" "Wolfpack"
# Vector with logical values.
x <- c(TRUE, FALSE) ## Logical vector. See Logicals section for more details.
x
#> [1] TRUE FALSE
is.logical(x) ## Checks whether the vector is logical.
#> [1] TRUE
class(x)
#> [1] "logical"
str(x)
#> logi [1:2] TRUE FALSE
# Vector with names.
a <- c(1:3)
str(a)
#> int [1:3] 1 2 3
attr(a, "names") <- c("First", "Second", "Third") ## A new attribute and description is added.
a ## Vector with names.
#> First Second Third
#> 1 2 3
str(a) ## Named int.
#> Named int [1:3] 1 2 3
#> - attr(*, "names")= chr [1:3] "First" "Second" "Third"
b <- c(First = 1, Second = 2, Third = 3, Fourth = 4, Fifth = 5)
b ## Vector with names.
#> First Second Third Fourth Fifth
#> 1 2 3 4 5
str(b) ## Named num.
#> Named num [1:5] 1 2 3 4 5
#> - attr(*, "names")= chr [1:5] "First" "Second" "Third" "Fourth" ...
Combining numeric and character values in the same vector creates a character vector.
Combining logical and character values in the same vector creates a character vector.
Combining logical and numeric values in the same vector creates a numeric vector.
A numeric vector is not an integer vector, but an integer vector is also numeric.
# Vector with numeric and character values.
a <- c(5, 6, 7, 8, "d") ## Note that the last element in vector a is a character value.
str(a) ## Note the class of vector a.
#> chr [1:5] "5" "6" "7" "8" "d"
# Vector with logical and character values.
b <- c("a", TRUE) ## Character vector.
str(b)
#> chr [1:2] "a" "TRUE"
# Vector with logical and numeric values.
x <- c(TRUE, 2) ## Numeric (TRUE will be converted into number 1). See Logicals section for more details.
str(x)
#> num [1:2] 1 2
str(c(FALSE, 2)) ## Numeric (FALSE will be converted into number 0).
#> num [1:2] 0 2
# Numeric vs. Integer values.
a <- c(1, 2)
is.integer(a)
#> [1] FALSE
is.numeric(a)
#> [1] TRUE
b <- c(1:2)
is.integer(b)
#> [1] TRUE
is.numeric(b)
#> [1] TRUE
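The coercion rules above are applied implicitly by c. As a small sketch, and assuming the standard as.* conversion functions (which the text has not introduced), the same conversions can also be requested explicitly. The object names mixed and converted are made up for this example.

```r
# Implicit coercion picks the more flexible type: logical < integer < numeric < character.
mixed <- c(TRUE, 2L, 3.5)
class(mixed)                 # "numeric"

# Explicit coercion with the as.* functions.
as.integer(c(TRUE, FALSE))   # 1 0
as.character(1:3)            # "1" "2" "3"

# Values that cannot be converted become NA (R also issues a warning,
# which is silenced here to keep the example quiet).
converted <- suppressWarnings(as.numeric(c("5", "6", "d")))
converted                    # 5 6 NA
```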
a <- c(5:15)
b <- c(10:20)
c <- c(25:30)
x <- c(b, a) ## Combining two vectors.
x
#> [1] 10 11 12 13 14 15 16 17 18 19 20 5 6 7 8 9 10 11 12 13 14 15
y <- c(a, b, c) ## Combining three vectors.
y
#> [1] 5 6 7 8 9 10 11 12 13 14 15 10 11 12 13 14 15 16 17 18 19 20 25
#> [24] 26 27 28 29 30
a <- c(1:10)
2 + a ## Vectorized operation.
#> [1] 3 4 5 6 7 8 9 10 11 12
1/a
#> [1] 1.00000000 0.50000000 0.33333333 0.25000000 0.20000000 0.16666667
#> [7] 0.14285714 0.12500000 0.11111111 0.10000000
If vectors of the same length are used in vectorized operations, then the length of the vector(s) is not important.
Vectors with the same length: a and b defined as below.
Vectors with different lengths: x and y defined as below.
# Vectors with same length.
a <- seq(from = 1, to = 10, by = 2)
a
#> [1] 1 3 5 7 9
b <- seq(from = 1, to = 15, by = 3)
b
#> [1] 1 4 7 10 13
a/b
#> [1] 1.00000000 0.75000000 0.71428571 0.70000000 0.69230769
# Vectors with different length.
x <- seq(from = 1, to = 10, by = 2)
x
#> [1] 1 3 5 7 9
y <- seq(from = 1, to = 10, by = 3)
y
#> [1] 1 4 7 10
x/y ## Note the last value in this vector.
#> Warning in x/y: longer object length is not a multiple of shorter object
#> length
#> [1] 1.00000000 0.75000000 0.71428571 0.70000000 9.00000000
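The last value in the result above comes from recycling: the shorter vector is repeated until it matches the longer one. The following sketch reproduces that step by hand with rep_len so the mechanism is visible; the names y.recycled, manual and automatic are chosen only for this illustration.

```r
x <- seq(from = 1, to = 10, by = 2)   # 1 3 5 7 9
y <- seq(from = 1, to = 10, by = 3)   # 1 4 7 10

# Repeat the shorter vector up to the length of the longer one.
y.recycled <- rep_len(y, length.out = length(x))   # 1 4 7 10 1

manual <- x / y.recycled
automatic <- suppressWarnings(x / y)   # same division, recycling warning silenced

identical(manual, automatic)           # TRUE
```

The fifth element of x (9) is divided by the first element of y (1), which is why the last value is 9.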
a <- rnorm(n = 10000, mean = 0, sd = 1) ## Random number generator for the normal distribution with the specified mean and standard deviation. This is standard normal distribution.
head(x = a, n = 5) ## Prints the first 5 elements of a vector.
#> [1] -1.20706575 0.27742924 1.08444118 -2.34569770 0.42912469
tail(x = a, n = 5) ## Prints the last 5 elements of a vector.
#> [1] 0.019739024 -2.126745287 -0.050222009 -0.238174080 0.776405308
mean(a) ## Mean.
#> [1] 0.006115893
var(a) ## Variance.
#> [1] 0.97521426
sum((a - mean(a))^2) / (length(a) - 1) ## Same as above.
#> [1] 0.97521426
sd(a) ## Standard deviation.
#> [1] 0.98752937
sqrt(var(a)) ## Same as above. sqrt function is for square root.
#> [1] 0.98752937
min(a) ## Minimum value.
#> [1] -3.3960635
max(a) ## Maximum value.
#> [1] 3.6181065
sum(a) ## Total of all elements in a vector.
#> [1] 61.15893
b <- c(seq(from = 10, to = 20, by = 4), seq(from = 10, to = 20, by = 2))
b
#> [1] 10 14 18 10 12 14 16 18 20
sqrt(b) ## Square root.
#> [1] 3.1622777 3.7416574 4.2426407 3.1622777 3.4641016 3.7416574 4.0000000
#> [8] 4.2426407 4.4721360
log(b)
#> [1] 2.3025851 2.6390573 2.8903718 2.3025851 2.4849066 2.6390573 2.7725887
#> [8] 2.8903718 2.9957323
sort(b, decreasing = FALSE, na.last = TRUE) ## Sorts the values of a vector in increasing order. na.last = TRUE puts the missing values at the end. We will see the missing values later.
#> [1] 10 10 12 14 14 16 18 18 20
unique(b) ## Gives you the unique values in a vector.
#> [1] 10 14 18 12 16 20
sort(unique(b)) ## Unique values are sorted.
#> [1] 10 12 14 16 18 20
a <- seq(from = 1, to = 200, by = 3)
a
#> [1] 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49
#> [18] 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100
#> [35] 103 106 109 112 115 118 121 124 127 130 133 136 139 142 145 148 151
#> [52] 154 157 160 163 166 169 172 175 178 181 184 187 190 193 196 199
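The bracketed numbers at the start of each printed line can be checked directly. This sketch assumes square-bracket indexing, which has not been introduced in the text so far.

```r
a <- seq(from = 1, to = 200, by = 3)

a[18]             # 52: the element labelled [18] in the printed output
a[c(1, 18, 35)]   # 1 52 103: first elements of the first three printed lines
length(a)         # 67
```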
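The bracketed number at the start of each output line is the index of the first element on that line. For example, [18] means that the 18th element in vector a is 52.
Factors are created with the factor function.
a <- c("yes", "yes", "no", "yes", "no") ## A character vector.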
str(a)
#> chr [1:5] "yes" "yes" "no" "yes" "no"
b <- factor(x = a) ## Creates the factor and gives you the "Levels".
b
#> [1] yes yes no yes no
#> Levels: no yes
is.factor(b) ## Checks whether the object is a factor.
#> [1] TRUE
class(b)
#> [1] "factor"
str(b) ## Note that the levels are automatically identified by alphabetic order.
#> Factor w/ 2 levels "no","yes": 2 2 1 2 1
attributes(b) ## Gives the object's attributes.
#> $levels
#> [1] "no" "yes"
#>
#> $class
#> [1] "factor"
levels(b) ## Gives the levels.
#> [1] "no" "yes"
table(b) ## Gives the number of observations in each level.
#> b
#> no yes
#> 2 3
unclass(b) ## Shows the factors in numbers (no:1, yes:2).
#> [1] 2 2 1 2 1
#> attr(,"levels")
#> [1] "no" "yes"
b <- factor(x = a, levels = c("yes", "no"))
b ## Note that the levels are identified with levels argument.
#> [1] yes yes no yes no
#> Levels: yes no
table(b) ## Gives the number of observations in each level.
#> b
#> yes no
#> 3 2
unclass(b) ## Shows the factors in numbers (yes:1, no:2).
#> [1] 1 1 2 1 2
#> attr(,"levels")
#> [1] "yes" "no"
attr(b, "levels") <- c("Aye", "Nay") ## Changing the levels by using the attr function.
b
#> [1] Aye Aye Nay Aye Nay
#> Levels: Aye Nay
# Factor with names
a <- c(1:3)
attr(a, "names") <- c("First", "Second", "Third") ## A new attribute and description is added.
b <- factor(x = a)
b ## Note that the names are kept and the levels are the unique values of the vector.
#> First Second Third
#> 1 2 3
#> Levels: 1 2 3
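Because a factor stores integer codes behind its levels, converting a factor with number-like levels back to numbers needs some care. The sketch below uses a made-up factor f to show the difference between the codes and the original values.

```r
f <- factor(c(10, 20, 10, 30))
levels(f)                               # "10" "20" "30"

codes <- as.numeric(f)                  # 1 2 1 3: the integer codes, not the values
values <- as.numeric(as.character(f))   # 10 20 10 30: the original values

codes
values
```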
You can also use the gl function for easily creating factors in R.
gl(n = 2, k = 4, labels = c("yes", "no")) ## Creates a factor object with 2 levels, each replicated 4 times (8 values in total).
#> [1] yes yes yes yes no no no no
#> Levels: yes no
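Numeric vectors can be divided into intervals and converted into factors with the cut function.
a <- c(1:15)
a.factor <- cut(x = a, breaks = c(min(a), 6, 12, max(a))) ## Note that the first value (1) is not included in any interval, since the intervals are open on the left by default.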
table(a.factor)
#> a.factor
#> (1,6] (6,12] (12,15]
#> 5 6 3
a.factor <- cut(x = a, breaks = c(min(a) - 1, 12, max(a))) ## Now all values are included.
table(a.factor)
#> a.factor
#> (0,12] (12,15]
#> 12 3
a.factor <- cut(x = a, breaks = c(6, 12, max(a)), include.lowest = TRUE) ## With include.lowest = TRUE, the lowest break (6) is included in the first interval. Values below 6 fall outside all intervals and become NA.
table(a.factor)
#> a.factor
#> [6,12] (12,15]
#> 7 3
a.factor <- cut(x = a, breaks = c(min(a) - 1, 6, 12, max(a)), labels = c("1st Group", "2nd Group", "3rd Group")) ## Note that we also defined the labels.
table(a.factor)
#> a.factor
#> 1st Group 2nd Group 3rd Group
#> 6 6 3
str(a.factor)
#> Factor w/ 3 levels "1st Group","2nd Group",..: 1 1 1 1 1 1 2 2 2 2 ...
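The examples above use the default intervals, which are open on the left and closed on the right. As a sketch of the alternative, the right argument of cut can flip this; the break at 16 and the name a.left are assumptions made only for this illustration.

```r
a <- 1:15

# right = FALSE makes the intervals closed on the left: [lower, upper).
a.left <- cut(x = a, breaks = c(1, 6, 12, 16), right = FALSE)

levels(a.left)    # "[1,6)" "[6,12)" "[12,16)"
counts <- table(a.left)
counts            # 5 6 4
anyNA(a.left)     # FALSE: every value falls into some interval
```

With left-closed intervals the minimum value is included without subtracting 1 from the first break, but the last break must then lie above the maximum.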
R language is logical objects.
Logicals in R are one-dimensional data structures like vectors.
They take only two values: TRUE and FALSE.
a <- TRUE
a
#> [1] TRUE
is.logical(a) ## Checks whether the vector is logical.
#> [1] TRUE
class(a)
#> [1] "logical"
str(a)
#> logi TRUE
b <- FALSE
str(b)
#> logi FALSE
x <- "TRUE"
str(x)
#> chr "TRUE"
y <- c(TRUE, FALSE, FALSE)
str(y)
#> logi [1:3] TRUE FALSE FALSE
The logical operators in R are equal (==), not equal (!=), and (&, &&) and or (|, ||).
The single forms of and and or (&, |) evaluate your code element by element in the same way as arithmetic operators.
The double forms of and and or (&&, ||) evaluate your code by examining only the first element of each vector. Evaluation proceeds only until the result is determined. In R 4.3.0 and later, supplying a vector longer than one to && or || is an error.
They are typically used in if clauses.
%in% can be used to logically check whether there is a match in the right object for the elements of the left object.
R.
36 == 36 ## Checks equality of two numeric objects.
#> [1] TRUE
6 * 6 == 30 + 6 ## Checks the equality of two separate calculations.
#> [1] TRUE
TRUE == FALSE ## Checks the equality of two logical objects.
#> [1] FALSE
"NCSU" == "ncsu" ## Checks equality of two character objects.
#> [1] FALSE
36 != 36 ## Checks non-equality.
#> [1] FALSE
36 != 6 ## Checks non-equality.
#> [1] TRUE
TRUE != FALSE
#> [1] TRUE
1 < 0 ## Smaller than.
#> [1] FALSE
1 > 0 ## Bigger than.
#> [1] TRUE
1 <= 0 ## Smaller than or equal to.
#> [1] FALSE
1 >= 0 ## Bigger than or equal to.
#> [1] TRUE
-2:2 >= 0 ## Elementwise evaluation. Note that the shorter object is recycled fully over the longer one since the longer object length is a multiple of the shorter object length.
#> [1] FALSE FALSE TRUE TRUE TRUE
(-2:2 >= 0) & (-2:2 <= 0) ## Elementwise evaluation. Note that the lengths of the two vectors are the same.
#> [1] FALSE FALSE TRUE FALSE FALSE
((-2:2) >= 0) && ((-2:2) <= 0) ## Only evaluates the first elements in each vector. In R 4.3.0 and later this call is an error, because && requires length-one arguments.
#> [1] FALSE
(-3:3 >= 0) & (-2:2 <= 0) ## Elementwise evaluation. Note that the shorter object cannot be recycled fully over the longer one since the longer object length is not a multiple of the shorter object length.
#> Warning in (-3:3 >= 0) & (-2:2 <= 0): longer object length is not a
#> multiple of shorter object length
#> [1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE
length(-3:3) %% length(-2:2) ## The remainder.
#> [1] 2
1:6 %in% 3:10 ## Each element of the left object is checked individually for a match with any of the right object's elements.
#> [1] FALSE FALSE TRUE TRUE TRUE TRUE
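When an elementwise comparison has to feed an if clause, one common way to get a single TRUE or FALSE is to reduce the logical vector with any or all. This is a sketch of that idea, not something the text above prescribes; the names is.zero, any.zero, all.zero and message.text are made up for the example.

```r
v <- -2:2
is.zero <- (v >= 0) & (v <= 0)   # elementwise: FALSE FALSE TRUE FALSE FALSE

any.zero <- any(is.zero)         # TRUE: at least one element is zero
all.zero <- all(is.zero)         # FALSE: not every element is zero

# A length-one logical is safe to use in an if clause.
if (any.zero) {
  message.text <- "v contains a zero"
} else {
  message.text <- "v contains no zero"
}
message.text
```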
R gives a numeric interpretation to logical objects.
FALSE is equivalent to scalar 0.
TRUE is equivalent to scalar 1.
a <- 0
class(a) ## Object "a" is a numeric object.
#> [1] "numeric"
a == FALSE ## But it is still considered as FALSE since its value is "0".
#> [1] TRUE
b <- 1
class(b) ## Object "b" is a numeric object.
#> [1] "numeric"
b == TRUE ## But it is still considered as TRUE since its value is "1".
#> [1] TRUE
x <- 2
class(x)
#> [1] "numeric"
x == TRUE
#> [1] FALSE
a <- c(1:4, 2:5)
a == 3 ## Note the TRUE values.
#> [1] FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
a <- c(5:15) ## First vector.
b <- c(10:20) ## Second vector.
x <- c(b, a) ## First and second vector are combined.
x
#> [1] 10 11 12 13 14 15 16 17 18 19 20 5 6 7 8 9 10 11 12 13 14 15
sort(unique(x)) ## Unique values are sorted.
#> [1] 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
sort(unique(x)) == sort(unique(c(a, b))) ## Checking if sorting the unique values worked. Note the order of the vectors (here it is c(a, b), but the x vector is defined as c(b, a)).
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [15] TRUE TRUE
sum(!(sort(unique(x)) == sort(unique(c(a, b))))) == 0 ## A quick control structure using the logical object and vectorized operations. So, there is no need to check it item by item.
#> [1] TRUE
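Since TRUE counts as 1 and FALSE as 0, logical vectors can be summed and averaged directly. The sketch below reuses the vector a from the comparison above; the names n.three, share.big and positions are chosen only for this illustration.

```r
a <- c(1:4, 2:5)

n.three <- sum(a == 3)       # 2: how many elements are equal to 3
share.big <- mean(a > 2)     # 0.625: share of elements greater than 2
positions <- which(a == 3)   # 3 6: positions of the TRUE values

n.three
share.big
positions
```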
TRUE or FALSE.
See if statements in the Control Structures section.
x <- TRUE
if (x == TRUE) { ## We will see the details of IF statements later.
print("My first IF statement.")
}
#> [1] "My first IF statement."
if (x) {
print("IF statement result is TRUE, so it will be printed.")
}
#> [1] "IF statement result is TRUE, so it will be printed."
if (!x) {
print("IF statement result is FALSE, so it won't be printed.")
} else {
print("IF statement result is TRUE, so it will be printed.")
}
#> [1] "IF statement result is TRUE, so it will be printed."
R.
Matrices have 2 dimensions, which are rows and columns.
A matrix can be created with the matrix function as shown below.
matrix(data = 1:6, nrow = 2, ncol = 3, byrow = FALSE) ## Default is to fill the matrix by column.
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
matrix(data = 1:6, nrow = 2) ## You can define the length of one dimension.
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
matrix(data = 1:2, nrow = 2, ncol = 3) ## If there are not enough elements, the data vector will be recycled to fill the whole matrix.
#> [,1] [,2] [,3]
#> [1,] 1 1 1
#> [2,] 2 2 2
matrix(0, 2, 3) ## Creates a zero matrix. Note that value 0 recycles.
#> [,1] [,2] [,3]
#> [1,] 0 0 0
#> [2,] 0 0 0
matrix(data = 1:6, nrow = 2, ncol = 3, byrow = TRUE) ## Filled by row.
#> [,1] [,2] [,3]
#> [1,] 1 2 3
#> [2,] 4 5 6
a <- matrix(data = 1:6, nrow = 2, ncol = 3, dimnames = list(c("row1", "row2"), c("col1", "col2", "col3"))) ## We can also define the dimension names. Note that you need to use the list() function. We will see the details of lists later.
a
#> col1 col2 col3
#> row1 1 3 5
#> row2 2 4 6
is.matrix(a) ## Checks whether the object is a matrix.
#> [1] TRUE
class(a) ## In R 4.0.0 and later this returns "matrix" "array".
#> [1] "matrix"
attributes(a) ## Gives the object's attributes.
#> $dim
#> [1] 2 3
#>
#> $dimnames
#> $dimnames[[1]]
#> [1] "row1" "row2"
#>
#> $dimnames[[2]]
#> [1] "col1" "col2" "col3"
str(a)
#> int [1:2, 1:3] 1 2 3 4 5 6
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:2] "row1" "row2"
#> ..$ : chr [1:3] "col1" "col2" "col3"
x <- c(1:6) ## You can also create matrix by first creating a vector then assigning its dimension.
dim(x) <- c(2, 3) ## Note that the matrix is filled by columns.
x
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
R.
R data frames.
x <- matrix(data = 9:4, nrow = 3, ncol = 2, dimnames = list(c("row1", "row2", "row3"), c("col1", "col2")))
x
#> col1 col2
#> row1 9 6
#> row2 8 5
#> row3 7 4
dim(x) ## Dimensions of a matrix.
#> [1] 3 2
dimnames(x) ## Dimension names.
#> [[1]]
#> [1] "row1" "row2" "row3"
#>
#> [[2]]
#> [1] "col1" "col2"
nrow(x) ## Number of rows.
#> [1] 3
ncol(x) ## Number of columns.
#> [1] 2
rownames(x) ## Gives the row names.
#> [1] "row1" "row2" "row3"
colnames(x) ## Gives the column names.
#> [1] "col1" "col2"
rowSums(x) ## Row sums.
#> row1 row2 row3
#> 15 13 11
colSums(x) ## Column sums.
#> col1 col2
#> 24 15
rowMeans(x) ## Row means.
#> row1 row2 row3
#> 7.5 6.5 5.5
colMeans(x) ## Column means.
#> col1 col2
#> 8 5
diag(x = 2, nrow = 3, ncol = 3) ## Extract or replace the diagonal of a matrix, or construct a diagonal matrix. Here it creates a diagonal matrix with values of 2 (use x = 1 for an identity matrix).
#> [,1] [,2] [,3]
#> [1,] 2 0 0
#> [2,] 0 2 0
#> [3,] 0 0 2
diag(x = 3, nrow = 2, ncol = 2) ## Creates a diagonal matrix with values of 3.
#> [,1] [,2]
#> [1,] 3 0
#> [2,] 0 3
diag(x = 3, nrow = 2, ncol = 3) ## One additional column.
#> [,1] [,2] [,3]
#> [1,] 3 0 0
#> [2,] 0 3 0
diag(x = 3, nrow = 4, ncol = 3) ## One additional row.
#> [,1] [,2] [,3]
#> [1,] 3 0 0
#> [2,] 0 3 0
#> [3,] 0 0 3
#> [4,] 0 0 0
y <- diag(x = 3, nrow = 5, ncol = 5)
diag(y) ## Extracts the diagonal.
#> [1] 3 3 3 3 3
z <- matrix(0 , nrow = 3, ncol = 3)
diag(z) <- 1:3 ## Assigns the given values to diagonal.
z
#> [,1] [,2] [,3]
#> [1,] 1 0 0
#> [2,] 0 2 0
#> [3,] 0 0 3
lower.tri(z, diag = FALSE) ## Gives the lower triangle of a matrix in logical values. You can use the result of this function to subset the lower triangle.
#> [,1] [,2] [,3]
#> [1,] FALSE FALSE FALSE
#> [2,] TRUE FALSE FALSE
#> [3,] TRUE TRUE FALSE
upper.tri(z, diag = FALSE) ## Gives the upper triangle of a matrix.
#> [,1] [,2] [,3]
#> [1,] FALSE TRUE TRUE
#> [2,] FALSE FALSE TRUE
#> [3,] FALSE FALSE FALSE
upper.tri(z, diag = TRUE) ## Diagonal is included.
#> [,1] [,2] [,3]
#> [1,] TRUE TRUE TRUE
#> [2,] FALSE TRUE TRUE
#> [3,] FALSE FALSE TRUE
R.
x <- matrix(data = 1:6, nrow = 2, ncol = 3)
y <- matrix(data = 6:11, nrow = 3, ncol = 2)
z <- matrix(data = 10:15, nrow = 2, ncol = 3)
w <- matrix(data = 10:13, nrow = 2, ncol = 2)
t(x) ## Transpose.
#> [,1] [,2]
#> [1,] 1 2
#> [2,] 3 4
#> [3,] 5 6
det(w) ## Determinant.
#> [1] -2
solve(w) ## Inverse.
#> [,1] [,2]
#> [1,] -6.5 6
#> [2,] 5.5 -5
eigen(w) ## Eigenvalues and eigenvectors.
#> $values
#> [1] 23.086630226 -0.086630226
#>
#> $vectors
#> [,1] [,2]
#> [1,] -0.67584466 -0.76549652
#> [2,] -0.73704409 0.64344003
x + 1 ## Scalar summation.
#> [,1] [,2] [,3]
#> [1,] 2 4 6
#> [2,] 3 5 7
x + z ## Matrix summation.
#> [,1] [,2] [,3]
#> [1,] 11 15 19
#> [2,] 13 17 21
x / z ## Division by element.
#> [,1] [,2] [,3]
#> [1,] 0.10000000 0.25000000 0.35714286
#> [2,] 0.18181818 0.30769231 0.40000000
2 * x ## Scalar multiplication.
#> [,1] [,2] [,3]
#> [1,] 2 6 10
#> [2,] 4 8 12
c(2, 10) * x ## Multiplies each row by the corresponding element of the vector (the vector is recycled down the columns).
#> [,1] [,2] [,3]
#> [1,] 2 6 10
#> [2,] 20 40 60
c(1, 2, 3) * y
#> [,1] [,2]
#> [1,] 6 9
#> [2,] 14 20
#> [3,] 24 33
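x %% z ## Elementwise modulo (remainder), not matrix multiplication. The matrix product uses the %*% operator and needs conformable dimensions.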
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
crossprod(x) ## Cross product.
#> [,1] [,2] [,3]
#> [1,] 5 11 17
#> [2,] 11 25 39
#> [3,] 17 39 61
kronecker(x, y) ## Kronecker product.
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 6 9 18 27 30 45
#> [2,] 7 10 21 30 35 50
#> [3,] 8 11 24 33 40 55
#> [4,] 12 18 24 36 36 54
#> [5,] 14 20 28 40 42 60
#> [6,] 16 22 32 44 48 66
c(1:5) %o% c(1:5) ## Outer product
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 2 3 4 5
#> [2,] 2 4 6 8 10
#> [3,] 3 6 9 12 15
#> [4,] 4 8 12 16 20
#> [5,] 5 10 15 20 25
matrix(1:5, 5, 1) %*% matrix(1:5, 1, 5) ## Same result as above.
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 2 3 4 5
#> [2,] 2 4 6 8 10
#> [3,] 3 6 9 12 15
#> [4,] 4 8 12 16 20
#> [5,] 5 10 15 20 25
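As a short sketch of the matrix product with the %*% operator, the matrices x, y and w defined earlier in this section can be multiplied and checked as follows. The name xy is made up for this example.

```r
x <- matrix(data = 1:6, nrow = 2, ncol = 3)
y <- matrix(data = 6:11, nrow = 3, ncol = 2)
w <- matrix(data = 10:13, nrow = 2, ncol = 2)

# (2 x 3) times (3 x 2) gives a 2 x 2 matrix.
xy <- x %*% y
xy   # first row: 67 94, second row: 88 124

# crossprod(x) is a shortcut for t(x) %*% x.
all.equal(crossprod(x), t(x) %*% x)   # TRUE

# A matrix times its inverse gives the identity matrix (up to rounding error).
all.equal(solve(w) %*% w, diag(2))    # TRUE
```

all.equal is used instead of == because the inverse is computed numerically and can differ from the exact result by a tiny rounding error.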
Matrices can be combined by rows and by columns with the rbind and cbind functions.
R employs recycling.
x <- matrix(data = 1:6, nrow = 2, ncol = 3)
y <- matrix(data = 6:11, nrow = 3, ncol = 2)
z <- matrix(data = 10:15, nrow = 2, ncol = 3)
rbind(x, z)
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
#> [3,] 10 12 14
#> [4,] 11 13 15
rbind(x, 1, 0) ## Recycling by columns.
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
#> [3,] 1 1 1
#> [4,] 0 0 0
cbind(x, z)
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1 3 5 10 12 14
#> [2,] 2 4 6 11 13 15
cbind(x, t(y))
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1 3 5 6 7 8
#> [2,] 2 4 6 9 10 11
cbind(x, 1, 0) ## Recycling by rows.
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 3 5 1 0
#> [2,] 2 4 6 1 0
An array can be created with the array function as shown below.
array(data = 1:6, dim = c(2, 3), dimnames = NULL) ## Two-dimensional array.
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
matrix(data = 1:6, nrow = 2, ncol = 3, dimnames = NULL) ## Same as above.
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
x <- array(data = 1:12, dim = c(3, 2, 2)) ## 3-dimensional arrays.
x
#> , , 1
#>
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
#>
#> , , 2
#>
#> [,1] [,2]
#> [1,] 7 10
#> [2,] 8 11
#> [3,] 9 12
dim(x)
#> [1] 3 2 2
is.array(x) ## Checks whether the object is an array.
#> [1] TRUE
class(x)
#> [1] "array"
str(x)
#> int [1:3, 1:2, 1:2] 1 2 3 4 5 6 7 8 9 10 ...
attributes(x) ## Gives the object's attributes.
#> $dim
#> [1] 3 2 2
y <- array(data = 1:12, dim = c(3, 2, 2), dimnames = list(c("Row.1", "Row.2", "Row.3"), c("Col.1", "Col.2"), c("Dim.1", "Dim.2"))) ## Dimension names should be in a list form. See the Lists section first and check this function again.
y
#> , , Dim.1
#>
#> Col.1 Col.2
#> Row.1 1 4
#> Row.2 2 5
#> Row.3 3 6
#>
#> , , Dim.2
#>
#> Col.1 Col.2
#> Row.1 7 10
#> Row.2 8 11
#> Row.3 9 12
str(y)
#> int [1:3, 1:2, 1:2] 1 2 3 4 5 6 7 8 9 10 ...
#> - attr(*, "dimnames")=List of 3
#> ..$ : chr [1:3] "Row.1" "Row.2" "Row.3"
#> ..$ : chr [1:2] "Col.1" "Col.2"
#> ..$ : chr [1:2] "Dim.1" "Dim.2"
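To see how the three dimensions of such an array fit together, here is a brief sketch that picks out single elements and layers. It assumes square-bracket indexing and the apply function, neither of which has been covered above; the name layer.sums is made up.

```r
x <- array(data = 1:12, dim = c(3, 2, 2))

x[2, 1, 2]      # 8: row 2, column 1, layer 2
dim(x[, , 1])   # 3 2: a single layer is an ordinary matrix

# Sum of all values within each layer (the third dimension).
layer.sums <- apply(x, MARGIN = 3, FUN = sum)
layer.sums      # 21 57
```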
A data frame consists of rows, each containing values in several fields which are called columns.
R.
R workspace as objects.
R and typing them one by one is not a good idea.
An empty data frame can be created with the data.frame() code.
my.data <- data.frame() ## Creates an empty data frame.
is.data.frame(my.data) ## Checks whether the object is a data frame.
#> [1] TRUE
class(my.data)
#> [1] "data.frame"
str(my.data)
#> 'data.frame': 0 obs. of 0 variables
A data frame can be created from scratch with the data.frame function.
sample.size <- 30 ## Defining the sample size for later use.
column.1 <- round(rnorm(n = sample.size, mean = 5, sd = 1), digits = 2) ## Random number generator for the normal distribution with the specified mean and standard deviation. ## round function rounds the numeric values.
column.2 <- sample(x = c(-50:50, NA), size = sample.size, replace = TRUE, prob = NULL) ## Sample function takes a sample of the specified size from the elements of x using either with or without replacement. Creates an integer class.
column.3 <- sample(x = c("NCSU", "CALS", "Economics"), size = sample.size, replace = TRUE, prob = NULL)
column.4 <- factor(sample(x = c("Yes", "No"), size = sample.size, replace = TRUE, prob = NULL))
column.5 <- sample(x = c(TRUE, FALSE), size = sample.size, replace = TRUE, prob = NULL)
my.data <- data.frame(Column.1 = column.1, Column.2 = column.2, Column.3 = column.3, Column.4 = column.4, Column.5 = column.5) ## Creating data frame from scratch.
my.data
is.data.frame(my.data)
#> [1] TRUE
class(my.data)
#> [1] "data.frame"
str(my.data) ## Note that Column.3 is a factor variable, but we wanted a character class.
#> 'data.frame': 30 obs. of 5 variables:
#> $ Column.1: num 3.18 5.63 5.52 5.14 6.46 4.51 2.88 4.87 4.57 5.09 ...
#> $ Column.2: int NA -21 -33 32 6 9 34 21 -42 -9 ...
#> $ Column.3: Factor w/ 3 levels "CALS","Economics",..: 1 2 2 2 3 2 2 1 2 1 ...
#> $ Column.4: Factor w/ 2 levels "No","Yes": 2 2 1 2 2 2 2 2 2 2 ...
#> $ Column.5: logi FALSE FALSE TRUE FALSE TRUE TRUE ...
attributes(my.data) ## Gives the object's attributes.
#> $names
#> [1] "Column.1" "Column.2" "Column.3" "Column.4" "Column.5"
#>
#> $row.names
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
#> [24] 24 25 26 27 28 29 30
#>
#> $class
#> [1] "data.frame"
data.frame’s default behavior turns character strings into factors (this applies to R versions before 4.0.0; since R 4.0.0 the default is stringsAsFactors = FALSE).
The stringsAsFactors = FALSE argument suppresses this behavior.
new.data <- data.frame(Column.1 = column.1, Column.2 = column.2, Column.3 = column.3, Column.4 = column.4, Column.5 = column.5, stringsAsFactors = FALSE) ## Creating data frame from scratch.
new.data
is.data.frame(new.data)
#> [1] TRUE
class(new.data)
#> [1] "data.frame"
str(new.data) ## Note that column.3 is character class which is what we wanted.
#> 'data.frame': 30 obs. of 5 variables:
#> $ Column.1: num 3.18 5.63 5.52 5.14 6.46 4.51 2.88 4.87 4.57 5.09 ...
#> $ Column.2: int NA -21 -33 32 6 9 34 21 -42 -9 ...
#> $ Column.3: chr "CALS" "Economics" "Economics" "Economics" ...
#> $ Column.4: Factor w/ 2 levels "No","Yes": 2 2 1 2 2 2 2 2 2 2 ...
#> $ Column.5: logi FALSE FALSE TRUE FALSE TRUE TRUE ...
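The effect of the stringsAsFactors argument can be checked on a small data frame by setting it explicitly both ways, which gives the same result in any R version. The names chars, df.chr and df.fct are made up for this sketch.

```r
chars <- c("NCSU", "CALS", "Economics")

df.chr <- data.frame(id = 1:3, name = chars, stringsAsFactors = FALSE)
df.fct <- data.frame(id = 1:3, name = chars, stringsAsFactors = TRUE)

sapply(df.chr, class)   # id: "integer", name: "character"
sapply(df.fct, class)   # id: "integer", name: "factor"
levels(df.fct$name)     # "CALS" "Economics" "NCSU"
```

Stating the argument explicitly is a simple way to keep code behaving the same across R versions with different defaults.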
R.
data.1 <- data.frame(c(1:5), c(6:10), c(11:15), c(16:20), stringsAsFactors = FALSE) ## Creating data frame from scratch without specific column names.
data.1
dim(data.1) ## Dimensions of a data frame.
#> [1] 5 4
dimnames(data.1) ## Dimension names.
#> [[1]]
#> [1] "1" "2" "3" "4" "5"
#>
#> [[2]]
#> [1] "c.1.5." "c.6.10." "c.11.15." "c.16.20."
nrow(data.1) ## Number of rows.
#> [1] 5
ncol(data.1) ## Number of columns.
#> [1] 4
rownames(data.1) ## Gives the row names.
#> [1] "1" "2" "3" "4" "5"
colnames(data.1) ## Gives the column names.
#> [1] "c.1.5." "c.6.10." "c.11.15." "c.16.20."
names(data.1) ## Gives the column names.
#> [1] "c.1.5." "c.6.10." "c.11.15." "c.16.20."
column.names <- paste("Column", ".", 1:ncol(data.1), sep = "") ## Creating the generic column names automatically. paste function pastes the supplied values with a given string using vectorized operations.
column.names <- paste0("Column", ".", 1:ncol(data.1)) ## Same result as above. paste0 function pastes the supplied values with nothing using vectorized operations.
column.names
#> [1] "Column.1" "Column.2" "Column.3" "Column.4"
colnames(data.1) <- column.names ## Assigns the column names to the data frame by using colnames.
# View(data.1) ## View the data frame in a new tab in an interactive R session.
head(x = data.1, n = 2) ## Prints the first 2 rows of a data frame.
tail(x = data.1, n = 2) ## Prints the last 2 rows of a data frame.
rowSums(data.1) ## Row sums.
#> [1] 34 38 42 46 50
colSums(data.1) ## Column sums.
#> Column.1 Column.2 Column.3 Column.4
#> 15 40 65 90
rowMeans(data.1) ## Row means.
#> [1] 8.5 9.5 10.5 11.5 12.5
colMeans(data.1) ## Column means.
#> Column.1 Column.2 Column.3 Column.4
#> 3 8 13 18
R
object, even another list.
R
is a list.
# List without names.
a <- list(1, "a", TRUE, 1 + 4i)
a
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] "a"
#>
#> [[3]]
#> [1] TRUE
#>
#> [[4]]
#> [1] 1+4i
is.list(a) ## Checks whether the object is a list.
#> [1] TRUE
class(a)
#> [1] "list"
str(a)
#> List of 4
#> $ : num 1
#> $ : chr "a"
#> $ : logi TRUE
#> $ : cplx 1+4i
# List with names.
a <- list(Numeric = 1, Character = "a", Logical = TRUE, Complex = 1 + 4i)
a
#> $Numeric
#> [1] 1
#>
#> $Character
#> [1] "a"
#>
#> $Logical
#> [1] TRUE
#>
#> $Complex
#> [1] 1+4i
names(a) ## Gives a character vector of all the names of objects in a list.
#> [1] "Numeric" "Character" "Logical" "Complex"
# List inside of a list.
a <- list(c(2:4), "k", TRUE, list(rep(1, 3), rep(2, 4)))
a
#> [[1]]
#> [1] 2 3 4
#>
#> [[2]]
#> [1] "k"
#>
#> [[3]]
#> [1] TRUE
#>
#> [[4]]
#> [[4]][[1]]
#> [1] 1 1 1
#>
#> [[4]][[2]]
#> [1] 2 2 2 2
is.list(a) ## Checks whether the object is a list.
#> [1] TRUE
str(a)
#> List of 4
#> $ : int [1:3] 2 3 4
#> $ : chr "k"
#> $ : logi TRUE
#> $ :List of 2
#> ..$ : num [1:3] 1 1 1
#> ..$ : num [1:4] 2 2 2 2
# List consists of different objects.
x <- c(1:2) ## Numeric vector.
y <- c("NCSU","Wolfpack", "Economics") ## Character vector.
z <- c(TRUE, FALSE, TRUE, FALSE, FALSE) ## Logical vector.
w <- factor(c("yes", "no", "no", "yes")) ## Factor vector.
v <- c(1 + 4i, 4 + 6i, 3 + 3i, 2 + 5i) ## Vector for complex values.
a <- matrix(data = 1:4, nrow = 2, ncol = 2, byrow = FALSE) ## Matrix.
b <- array(1:8, dim = c(2, 2, 2), dimnames = NULL) ## Array
data.1 <- data.frame(Column.1 = c(1:3), Column.2 = c(4:6), Column.3 = c(7:9), stringsAsFactors = FALSE)
my.list <- list(3, x, y, z, w, v, a, b, data.1) ## The list contains objects of different classes.
my.list
#> [[1]]
#> [1] 3
#>
#> [[2]]
#> [1] 1 2
#>
#> [[3]]
#> [1] "NCSU" "Wolfpack" "Economics"
#>
#> [[4]]
#> [1] TRUE FALSE TRUE FALSE FALSE
#>
#> [[5]]
#> [1] yes no no yes
#> Levels: no yes
#>
#> [[6]]
#> [1] 1+4i 4+6i 3+3i 2+5i
#>
#> [[7]]
#> [,1] [,2]
#> [1,] 1 3
#> [2,] 2 4
#>
#> [[8]]
#> , , 1
#>
#> [,1] [,2]
#> [1,] 1 3
#> [2,] 2 4
#>
#> , , 2
#>
#> [,1] [,2]
#> [1,] 5 7
#> [2,] 6 8
#>
#>
#> [[9]]
#> Column.1 Column.2 Column.3
#> 1 1 4 7
#> 2 2 5 8
#> 3 3 6 9
str(my.list)
#> List of 9
#> $ : num 3
#> $ : int [1:2] 1 2
#> $ : chr [1:3] "NCSU" "Wolfpack" "Economics"
#> $ : logi [1:5] TRUE FALSE TRUE FALSE FALSE
#> $ : Factor w/ 2 levels "no","yes": 2 1 1 2
#> $ : cplx [1:4] 1+4i 4+6i 3+3i ...
#> $ : int [1:2, 1:2] 1 2 3 4
#> $ : int [1:2, 1:2, 1:2] 1 2 3 4 5 6 7 8
#> $ :'data.frame': 3 obs. of 3 variables:
#> ..$ Column.1: int [1:3] 1 2 3
#> ..$ Column.2: int [1:3] 4 5 6
#> ..$ Column.3: int [1:3] 7 8 9
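For a long list, printing every element can be overwhelming. A compact summary of what each element holds may be easier to read. The sketch below uses a small made-up list (demo.list is a hypothetical name) rather than my.list from above.

```r
## Hypothetical list mixing several classes, including a nested list.
demo.list <- list(3, 1:2, c("NCSU", "Wolfpack"), factor(c("yes", "no")), list(TRUE, NA))

length(demo.list) ## Number of top-level elements.

## class() can return more than one string, so keep only the first one.
element.classes <- vapply(demo.list, function(el) class(el)[1], character(1))
element.classes

## lengths() gives the length of each top-level element.
element.lengths <- lengths(demo.list)
element.lengths
```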
RStudio
.RStudio
session.R
packages.R
software and R
packages.R
is always pointed at a directory on your computer file system.
R
objects, R
uses this pre-specified working directory as the base path for file operations.R
to start in a certain directory, you have to specify a path in your computer file system to be your default working directory.
RStudio > Preferences > General > Default working directory (when not in a project)
.R
to start the R
session on this pre-specified path.R
from there.
R
sessions.RStudio
Project feature.RStudio
Project feature, see RStudio
.R
console or editor, you can
R
interactive session.R
under any Windows operating system, backslashes in file paths must be doubled (\\) or replaced with forward slashes (/).
# R code chunk is not evaluated.
R.home() ## Gives you the home directory of R software itself.
getwd() ## Gives the current working directory.
my.current.dir <- getwd() ## Assigns the current working directory to an object.
setwd("Path of Working Directory") ## Sets the working directory to a new one. Note that this can be a relative path or a full path.
setwd(my.current.dir) ## Using the assigned object, setting the working directory.
setwd("~") ## Changes the working directory to home directory.
setwd("../") ## Double dots are used for moving up in the folder hierarchy.
setwd("./") ## A single dot represents the current directory itself.
setwd("/") ## Forward slash changes the working directory to the root.
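One way to avoid worrying about slashes is to build paths with file.path, and to restore the working directory after changing it. This is a minimal sketch; the folder and file names are hypothetical, and tempdir() is used only so the example does not depend on your own folders.

```r
## file.path joins path components with "/" on every platform, including Windows.
data.file <- file.path("project", "data", "input.csv") ## Hypothetical path.
data.file

## Temporarily switch the working directory, then restore it.
old.dir <- setwd(tempdir()) ## setwd returns the previous working directory.
getwd()
setwd(old.dir) ## Back to where we started.
```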
R
working environment and includes any user-created objects such as vectors, matrices, data frames, lists, and functions.R
console or editor, you can check the class, structure, and attributes of the R
objects saved in your workspace.
x <- "NSCU"
class(x) ## Gives the class of an object in a character string.
#> [1] "character"
str(x) ## Gives the details of object structure (class of the object and its values). Try to use it, very useful.
#> chr "NSCU"
attributes(x) ## This object does not have any attributes yet.
#> NULL
attr(x, "Awesomeness Level") <- "Top Notch" ## A new attribute and description is added.
attributes(x) ## Now it has a user-assigned attribute.
#> $`Awesomeness Level`
#> [1] "Top Notch"
structure(x, new.attribute = "This is a new attribute") ## Returns a new object with modified attributes.
#> [1] "NSCU"
#> attr(,"Awesomeness Level")
#> [1] "Top Notch"
#> attr(,"new.attribute")
#> [1] "This is a new attribute"
attributes(cars) ## cars is a dataset from the datasets package in R.
#> $names
#> [1] "speed" "dist"
#>
#> $row.names
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
#> [24] 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
#> [47] 47 48 49 50
#>
#> $class
#> [1] "data.frame"
class(cars)
#> [1] "data.frame"
str(cars)
#> 'data.frame': 50 obs. of 2 variables:
#> $ speed: num 4 4 7 7 8 9 10 10 10 11 ...
#> $ dist : num 2 10 4 22 16 10 18 26 34 17 ...
R
console or editor, you can
R
objects in a character vector.R
objects along with their structure.R
objects from your workspace.
# R code chunk is not evaluated.
ls() ## Shows the user created objects in your workspace.
objects() ## Shows the user created objects in your workspace.
ls.str() ## Shows the details of all objects in your workspace.
rm(x) ## Removes the object from your workspace.
rm(x, y) ## Removes multiple objects at the same time from your workspace. Note that rm(c(x, y)) fails because rm expects object names.
rm(list = ls()) ## Removes all objects from your workspace.
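When only some objects should be removed, the list argument of rm can be combined with the pattern argument of ls. The sketch below works inside a scratch environment (demo.env, a made-up name) so that your real workspace is not touched; the object names are hypothetical as well.

```r
demo.env <- new.env() ## A scratch environment standing in for the workspace.
assign("tmp.a", 1, envir = demo.env)
assign("tmp.b", 2, envir = demo.env)
assign("keep.me", 3, envir = demo.env)
ls(envir = demo.env)

## Remove every object whose name starts with "tmp".
rm(list = ls(envir = demo.env, pattern = "^tmp"), envir = demo.env)
ls(envir = demo.env) ## Only "keep.me" is left.
```

In an interactive session the same idea would be written as rm(list = ls(pattern = "^tmp")).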
R
console or editor, you can
option
settings of your R
session.option
settings of your R
session.R
session.q()
function to end your R
session.
# R code chunk is not evaluated.
help(options)
options() ## View current option settings.
options(digits = 3) ## Change an option setting. Number of digits to print on output.
history() ## Displays the last 25 commands.
history(max.show = Inf) ## Displays all previous commands.
q() ## Ends R session. You will be prompted to save the workspace.
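Changing an option affects the rest of the session, so it can be useful to keep the old value and restore it afterwards. A small sketch of that pattern, using the digits option shown above:

```r
old.opts <- options(digits = 3) ## options() returns the previous values invisibly.
print(pi) ## Prints 3.14 while digits = 3.
options(old.opts) ## Restores the previous setting.
getOption("digits") ## Same value as before the change.
```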
R
console or editor, you can perform file-related operations such as creating, deleting, and modifying files.
R
under any Windows operating system, backslashes in file paths must be doubled (\\) or replaced with forward slashes (/).
# R code chunk is not evaluated.
dir() ## Shows the files and folders in the working directory.
dir.create("./folder") ## Will create a directory if it doesn't exist.
file.exists("./RStudi_Setup.R") ## Checks whether the file exists.
file.remove("./file.csv") ## Deletes the file in the given path.
unlink("./data.R") ## Deletes the file(s) specified; use recursive = TRUE to delete a directory.
list.files("./") ## Lists the files in the given directory.
list.files(pattern = "\\.Rmd$") ## Lists the files whose names end in ".Rmd" in the working directory.
if (!dir.exists("./folder")) {
  dir.create("./folder") ## Checks if the folder exists; if not, creates the folder called "folder".
}
file.remove("./file.xlsx") ## Removes the file or folder in the given path.
unzip("./file.zip") ## Extract the file from a zip archive.
?files ## See the help file for more information such as renaming, appending and copying files.
fake.data <- rnorm(n = 1e6, mean = 0, sd = 1) ## Creates fake data for size checking.
object.size(fake.data) ## Gives the size of the R object in bytes.
print(object.size(fake.data), units = "Mb") ## Gives the size of the R object in MB.
file.info("../../R mini BootCamp.Rproj") ## Gives the file information of a file.
file.info("../../R mini BootCamp.Rproj")$size ## Size of the file.
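The chunk above is not evaluated because it refers to files that may not exist on your computer. The following self-contained sketch runs the same kind of operations inside a temporary folder; the folder and file names (demo_folder, demo.csv) are hypothetical.

```r
demo.dir <- file.path(tempdir(), "demo_folder") ## Hypothetical folder name.
if (!dir.exists(demo.dir)) dir.create(demo.dir) ## Create the folder only if it is missing.

demo.file <- file.path(demo.dir, "demo.csv")
write.csv(data.frame(a = 1:3), file = demo.file, row.names = FALSE)

file.exists(demo.file) ## TRUE
csv.files <- list.files(demo.dir, pattern = "\\.csv$") ## Files ending in ".csv".
csv.files
file.info(demo.file)$size ## Size of the file in bytes.

file.remove(demo.file) ## Deletes the file.
unlink(demo.dir, recursive = TRUE) ## Deletes the folder itself.
dir.exists(demo.dir) ## FALSE
```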
File > New File > R Script
and an empty script will appear.
.R
extension.source
function as shown below.
source
function executes all the code lines in the script but does not echo the commands in the console.
source
function.
# R code chunk is not evaluated.
source("my_script.R") ## Loads "my_script.R" file from the current working directory.
source("./my_script.R") ## Loads "my_script.R" file from the current working directory.
source("../A Folder/my_script.R") ## Loads "my_script.R" file from another folder. See File System section for more information about file paths in R.
source("FULL PATH of my_script.R file") ## Loads "my_script.R" with the full path.
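To see source in action without an existing script, the sketch below writes a tiny script to a temporary file and runs it. The script contents and the names square, result, and script.env are made up for this example; passing an environment to the local argument keeps the created objects out of your workspace.

```r
script.path <- tempfile(fileext = ".R")
writeLines(c("square <- function(v) v^2",
             "result <- square(4)"),
           con = script.path)

script.env <- new.env()
source(script.path, local = script.env) ## Objects are created in script.env.
ls(envir = script.env) ## "result" "square"
get("result", envir = script.env) ## 16
unlink(script.path) ## Clean up the temporary script.
```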
R
objects is by using .RData
file format.
.RData
file format allows you to save a R
object with its current state in your workspace..RData
file to your workspace, loads the R
object with its attributes when it was saved.R
object in RData
format, you should use save
and load
commands.load
function, loads the R
objects with the exact same name when they were saved.save.image()
command to save your current workspace (all R
objects) as a hidden .RData
file.
You can use saveRDS
and readRDS
functions for saving single R
object and loading it with a different name.R
objects is to use function pairs of dump
- source
and dput
- dget
functions.
dump
and dput
produce text representation of the R
objects and save it in a R
script file (.R
).R
code in a .R
file which can create the R
object in the future.source
and dget
reads code in a .R
file and create the R
object.
# R code chunk is not evaluated.
x <- rbinom(n = 10, size = 1, prob = 0.5) ## Random number generator for the binomial distribution with parameters size and prob. n = number of observations, size = number of trials, prob = probability of success on each trial.
y <- c("NCSU", "Wolfpack")
save(x, file = "./x_object.RData") ## Saves the given objects to the RData file.
save(x, y, file = "./xy_object.RData") ## You can save multiple objects in an RData file at the same time.
rm(x, y) ## Removes x and y objects.
load("./xy_object.RData") ## Loads the RData file. Note that it overwrites the existing x and y objects if they are still in your workspace.
save.image() ## Saves the current workspace as a .RData file. Note that it is saved as a hidden file.
saveRDS(object = x, file = "./x_object.rds") ## Saves a single R object to a file.
new.x <- readRDS(file = "./x_object.rds") ## Loads the x object as a new object.
new.x <- load("./x_object.RData") ## Does not rename x: load returns the names of the loaded objects, so new.x is the string "x".
dump(c("x", "y"), file = "./data.R") ## Dump can be used for multiple objects.
rm(x)
source("./data.R") ## Loads and runs the R code in the script.
dput(x, file = "./data.R") ## Dput can be used on single R objects.
rm(x)
new.x <- dget("./data.R") ## Loads and runs the R code in the script, then assigns a new name.
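The difference between the two pairs of functions can be checked with a small round trip. This is a sketch using temporary files, so nothing is written to your working directory; the names rds.path, rdata.path, and load.env are made up for the example.

```r
x <- c(1, 0, 1)

## saveRDS / readRDS: one object, which can be read back under any name.
rds.path <- tempfile(fileext = ".rds")
saveRDS(x, file = rds.path)
new.x <- readRDS(rds.path)
identical(x, new.x) ## TRUE

## save / load: objects come back under their original names.
rdata.path <- tempfile(fileext = ".RData")
save(x, file = rdata.path)
load.env <- new.env() ## Load into a separate environment to avoid overwriting x.
loaded.names <- load(rdata.path, envir = load.env)
loaded.names ## "x": load returns the names, not the objects.
identical(get("x", envir = load.env), x) ## TRUE

unlink(c(rds.path, rdata.path)) ## Clean up.
```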
R
, use help()
or ?
.RStudio
.RStudio
is very useful for learning what a function does, its arguments, detailed explanations, examples and more. You can even find related functions, which will help you learn even more functions and code.
# R code chunk is not evaluated.
help(lm) ## Opens the help page for "lm" function which is for fitting linear models.
?lm
?"lm"
??lm ## Gives the search results for word "lm".
??errorsarlm ## If the package that contains the function is not loaded, then you should use "??".
?":" ## Help for operator.
?"%in%"
help.start() ## Opens the main page for R Help.
help.search("covariance") ## Gives the search results for word "covariance".
RSiteSearch("vecm") ## Opens your browser and searches for "vecm" on http://search.r-project.org.
find("lm") ## Tells you what package the function is in.
apropos("lm") ## Returns a character vector giving the names of all objects in the search list that match your query.
args(lm) ## Presents the arguments of the function.
example(lm) ## Presents an example of the searched function.
demo(graphics) ## Gives a user-friendly interface to run some demonstration R scripts.
R
are extended through user-created packages, which allow specialized statistical techniques, graphical devices, import/export capabilities, reporting tools, and more.
R
packages are collections of R
functions, data, and compiled code in a well-defined format.R
code.
R
.R
is the thousands of user-written packages that solve specific problems in various disciplines.R
packages, see the following:
R
packages, you need to first install the package, which needs to be done just once.R
or RStudio.
R
packages.
# R code chunk is not evaluated.
install.packages("tidyr") ## Installs single package.
library("tidyr") ## Loads single package.
install.packages(c("RColorBrewer", "stringr")) ## Installs multiple packages.
lapply(c("RColorBrewer", "stringr"), library, character.only = TRUE) ## Loads multiple packages.
packageVersion("tidyr") ## Current Version of the package.
detach("package:RColorBrewer", unload = TRUE) ## Unloads package.
R
packages.
## R chunk is not evaluated.
# Check the CRAN mirror.
getOption("repos")
# Lists loaded packages in your global environment.
(.packages())
# Some packages need to be installed from the source.
install.packages("rgdal", type = "source")
# Installing a package from GitHub repository.
install.packages("devtools") ## devtools package is necessary to install packages from GitHub repositories.
devtools::install_github("tidyverse/ggplot2") ## user.name/package.name
library("ggplot2")
# devtools::install_github("hadley/devtools")
# Installing a package from bioconductor website.
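install.packages("BiocManager") ## The biocLite script has been retired; current Bioconductor releases use the BiocManager package instead.
BiocManager::install("rhdf5")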
library("rhdf5")
# Installing CRAN Task Views.
## CRAN Task Views (https://cran.r-project.org/web/views/) group packages by subject area, for example spatial analysis, econometrics, and graphics. To install these views automatically, the "ctv" package needs to be installed first; the views can then be installed or updated with the "install.views" or "update.views" functions (the latter first checks which packages are already installed and up to date).
install.packages("ctv") ## Installing the package needed to install views.
library("ctv") ## Loading the package.
available.views(repos = NULL) ## Gives you all the views available.
install.views("Econometrics") ## Installing the "Econometrics" view.
update.views("Econometrics") ## Updating the "Econometrics" view.
# Updating packages from the editor or console.
update.packages() ## Updates all packages from CRAN.
devtools::install_github("hrbrmstr/dtupdate") ## Installs the dtupdate package, which is used to update GitHub-sourced packages.
library("dtupdate")
github_update() ## See what packages are available to update.
# Some other tools with packages.
find.package("devtools") ## Shows you where (file location) the package is installed on your computer.
search() ## Displays the search path: the attached packages and environments.
utils::installed.packages() ## Displays all the packages that are installed in your computer.
available.packages() ## Displays all the R packages that are available.
head(rownames(available.packages()), 3) ## Shows the names of the first 3 packages which are available.
# Package help.
install.packages("sp") ## "sp" package is for spatial analysis.
library("sp")
vignette("sp") ## Opens the vignette for selected package if available.
R
functions, please see the Creating R Functions section.
# R code chunk is not evaluated.
Load.Install <- function(package_names) {
  is_installed <- function(mypkg) is.element(mypkg, utils::installed.packages()[ ,1])
  for (package_name in package_names) {
    if (!is_installed(package_name)) {
      utils::install.packages(package_name, dependencies = TRUE)
    }
    suppressMessages(library(package_name, character.only = TRUE, quietly = TRUE, verbose = FALSE))
  }
}
Load.Install(c("plyr", "dplyr", "tidyr", "sp"))
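When you only want to know whether packages are available, without installing or attaching anything, a lighter check is possible with requireNamespace. The helper below is a hypothetical sketch (the name packages.available is made up here), not a replacement for the function above.

```r
## Hypothetical helper: returns a named logical vector,
## TRUE where the package is installed and can be loaded.
packages.available <- function(package_names) {
  vapply(package_names, requireNamespace, logical(1), quietly = TRUE)
}

## "stats" and "utils" ship with R, so both should be available.
available <- packages.available(c("stats", "utils"))
available
```

A package that is not installed simply gives FALSE, so the result can be used to decide what still needs install.packages.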
R
software itself and R
packages, see the following code.
# Cite R software.
citation()
#>
#> To cite R in publications use:
#>
#> R Core Team (2017). R: A language and environment for
#> statistical computing. R Foundation for Statistical Computing,
#> Vienna, Austria. URL https://www.R-project.org/.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {R: A Language and Environment for Statistical Computing},
#> author = {{R Core Team}},
#> organization = {R Foundation for Statistical Computing},
#> address = {Vienna, Austria},
#> year = {2017},
#> url = {https://www.R-project.org/},
#> }
#>
#> We have invested a lot of time and effort in creating R, please
#> cite it when using it for data analysis. See also
#> 'citation("pkgname")' for citing R packages.
# Cite R packages.
citation("ggplot2")
#>
#> To cite ggplot2 in publications, please use:
#>
#> H. Wickham. ggplot2: Elegant Graphics for Data Analysis.
#> Springer-Verlag New York, 2009.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Book{,
#> author = {Hadley Wickham},
#> title = {ggplot2: Elegant Graphics for Data Analysis},
#> publisher = {Springer-Verlag New York},
#> year = {2009},
#> isbn = {978-0-387-98140-6},
#> url = {http://ggplot2.org},
#> }
Operator | Description |
---|---|
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
^ or ** | Exponentiation |
%% | Remainder (modulus) |
%/% | Integer quotient |
Operator | Description |
---|---|
< | Less than |
<= | Less than or equal to |
> | Greater than |
>= | Greater than or equal to |
== | Exactly equal to |
!= | Not equal to |
\| and \|\| | OR |
& and && | AND |
%in% | Left to right matching |
Operator | Description |
---|---|
: | Generates regular sequences |
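The operators in the tables above can be tried directly in the console. The values below are arbitrary and chosen only for illustration.

```r
## Arithmetic operators.
remainder <- 7 %% 2  ## 1: remainder of the division.
quotient <- 7 %/% 2  ## 3: integer quotient.
power <- 2 ^ 3       ## 8: exponentiation.

## Comparison operators work element by element.
c(1, 5, 9) > 4       ## FALSE TRUE TRUE

## & and | are element-wise; && and || compare single values.
c(TRUE, FALSE) & c(TRUE, TRUE) ## TRUE FALSE
TRUE && FALSE                  ## FALSE

## %in% checks whether each left-hand value appears on the right.
matched <- c(2, 7) %in% c(1, 2, 3) ## TRUE FALSE

## : generates a regular sequence.
1:5
```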
R
programming language such as
R
objectsR
, and moreR
is capable of handling missing values.R
appears as NA
.
NA
is not a string or a numeric value.is.na
function to logically check whether the object has missing values.complete.cases
function to logically check whether the object has non-missing values.is.na
function checks the missingness element by element."NA"
returns a character value. For missingness, NA
should be used.
x <- c(1:3, NA, 5:7, NA) ## Numeric vector.
x
#> [1] 1 2 3 NA 5 6 7 NA
str(x) ## Fourth value is missing.
#> int [1:8] 1 2 3 NA 5 6 7 NA
is.na(x) ## Checks the missing values.
#> [1] FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE
complete.cases(x) ## Checks the non-missing values.
#> [1] TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE
is.na(x) == !complete.cases(x) ## Same functions.
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
(!is.na(x)) == complete.cases(x) ## Same result. The parentheses are needed because ! has lower precedence than ==.
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
sum(is.na(x)) ## Gives the number of missing values.
#> [1] 2
any(is.na(x)) ## Is there a missing value?
#> [1] TRUE
y <- c("a", "b", "c", NA, "NA") ## Character vector.
y
#> [1] "a" "b" "c" NA "NA"
str(y)
#> chr [1:5] "a" "b" "c" NA "NA"
is.na(y) ## Note that "NA" is not missing, it is a character string with values "NA".
#> [1] FALSE FALSE FALSE TRUE FALSE
is.na
function checks the missingness element by element like in vectors.<NA>
rather than just NA
.
a <- factor(x = c("yes", NA, "no", "yes", NA)) ## Creates the factor and gives you the 'Levels'.
a
#> [1] yes <NA> no yes <NA>
#> Levels: no yes
str(a)
#> Factor w/ 2 levels "no","yes": 2 NA 1 2 NA
is.na(a)
#> [1] FALSE TRUE FALSE FALSE TRUE
is.na
function checks the missingness element by element like in vectors.
a <- c(TRUE, NA, FALSE, NA)
a
#> [1] TRUE NA FALSE NA
str(a)
#> logi [1:4] TRUE NA FALSE NA
is.na(a)
#> [1] FALSE TRUE FALSE TRUE
is.na
function checks the missingness element by element like in vectors.
is.na
function is a logical matrix with the same dimension attributes.
a <- rep(c(3, NA, 2), each = 2)
b <- matrix(data = a, nrow = 2, ncol = 3)
b
#> [,1] [,2] [,3]
#> [1,] 3 NA 2
#> [2,] 3 NA 2
str(b)
#> num [1:2, 1:3] 3 3 NA NA 2 2
is.na(b)
#> [,1] [,2] [,3]
#> [1,] FALSE TRUE FALSE
#> [2,] FALSE TRUE FALSE
na.check <- is.na(b)
class(na.check)
#> [1] "matrix"
str(na.check)
#> logi [1:2, 1:3] FALSE FALSE TRUE TRUE FALSE FALSE
is.na
function checks the missingness element by element like in vectors.
is.na
function is a logical array with the same dimension attributes.
a <- sample(x = c(1:3, NA), size = 12, replace = TRUE, prob = NULL) ## The sample function takes a sample of the specified size from the elements of x, with or without replacement. Here it returns an integer vector.
b <- array(data = a, dim = c(2, 3, 2)) ## Three-dimensional array.
b
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] 2 2 3
#> [2,] 3 3 2
#>
#> , , 2
#>
#> [,1] [,2] [,3]
#> [1,] 1 2 NA
#> [2,] 1 1 NA
str(b)
#> int [1:2, 1:3, 1:2] 2 3 2 3 3 2 1 1 2 1 ...
is.na(b)
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] FALSE FALSE FALSE
#> [2,] FALSE FALSE FALSE
#>
#> , , 2
#>
#> [,1] [,2] [,3]
#> [1,] FALSE FALSE TRUE
#> [2,] FALSE FALSE TRUE
na.check <- is.na(b)
class(na.check)
#> [1] "array"
str(na.check)
#> logi [1:2, 1:3, 1:2] FALSE FALSE FALSE FALSE FALSE FALSE ...
NA
can arise when you load a data set with empty cells.
x <- data.frame(c(NA, 1:3, NA), c(NA, 4, NA, 5:6), c(7:9, NA, NA), c(10:14), stringsAsFactors = FALSE) ## Creating data frame from scratch without specific column names.
colnames(x) <- paste0("Column", ".", 1:ncol(x)) ## Assigns the column names to the data frame by using colnames.
x
str(x)
#> 'data.frame': 5 obs. of 4 variables:
#> $ Column.1: int NA 1 2 3 NA
#> $ Column.2: num NA 4 NA 5 6
#> $ Column.3: int 7 8 9 NA NA
#> $ Column.4: int 10 11 12 13 14
is.na(x)
#> Column.1 Column.2 Column.3 Column.4
#> [1,] TRUE TRUE FALSE FALSE
#> [2,] FALSE FALSE FALSE FALSE
#> [3,] FALSE TRUE FALSE FALSE
#> [4,] FALSE FALSE TRUE FALSE
#> [5,] TRUE FALSE TRUE FALSE
sum(is.na(x)) ## Gives number of the missing values
#> [1] 6
any(is.na(x)) ## Is there a missing value?
#> [1] TRUE
colSums(is.na(x)) ## Missing values by columns.
#> Column.1 Column.2 Column.3 Column.4
#> 2 2 2 0
rowSums(is.na(x)) ## Missing values by rows.
#> [1] 2 0 1 1 2
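Once the missing values are located, a common next step is to keep only the complete rows. The sketch below uses a small made-up data frame (demo.df is a hypothetical name) so that the result is easy to verify by eye.

```r
## Hypothetical data frame with one missing value in each of the first two rows.
demo.df <- data.frame(Column.1 = c(NA, 1, 2),
                      Column.2 = c(4, NA, 6),
                      Column.3 = 7:9)

complete.cases(demo.df) ## FALSE FALSE TRUE: only the third row has no NA.

complete.rows <- demo.df[complete.cases(demo.df), ] ## Keep the complete rows.
complete.rows

nrow(na.omit(demo.df)) ## 1: na.omit drops the same rows.
```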
is.na
function checks the missingness element by element like in vectors.
NA
, then it is considered as missing.unlist
function.
a <- list(NA, c(2:4), "k", rep(NA, 2), list(rep(NA, 3)))
a
#> [[1]]
#> [1] NA
#>
#> [[2]]
#> [1] 2 3 4
#>
#> [[3]]
#> [1] "k"
#>
#> [[4]]
#> [1] NA NA
#>
#> [[5]]
#> [[5]][[1]]
#> [1] NA NA NA
is.na(a) ## Returns one logical result per top-level element (5 here). Only a single NA counts as missing: c(2:4), rep(NA, 2) and the nested list are each treated as one non-missing element.
#> [1] TRUE FALSE FALSE FALSE FALSE
str(a)
#> List of 5
#> $ : logi NA
#> $ : int [1:3] 2 3 4
#> $ : chr "k"
#> $ : logi [1:2] NA NA
#> $ :List of 1
#> ..$ : logi [1:3] NA NA NA
b <- unlist(a)
b
#> [1] NA "2" "3" "4" "k" NA NA NA NA NA
str(b)
#> chr [1:10] NA "2" "3" "4" "k" NA NA NA NA NA
is.na(b) ## Unlisting gives the correct result.
#> [1] TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
NA
when you try certain operations that are illegal or don’t make sense.NA
produces an NA
.
NA
always produces an NA
.R
functions, there is a way to exclude missing values in their calculations. You can use the following functions separately or as arguments in some R
functions. The below code shows how they are used as arguments in functions.
na.rm
: Remove the missing values.na.fail
: Stop if any missing values are encountered.na.omit
: Drop out any rows with missing values anywhere in them and forgets them forever.na.exclude
: Drop out rows with missing values, but keeps track of where they were.na.pass
: Take no action.NA
indicates a missing value, it is still considered a value by R
.
length
function to see that.
var(10) ## Variance of a single number, which returns NA.
#> [1] NA
sd(8) ## Standard deviation of a number which returns NA.
#> [1] NA
c(1, NA) == NA
#> [1] NA NA
NA + 1
#> [1] NA
NA * 2
#> [1] NA
x <- c(4, 5, NA)
x < 10
#> [1] TRUE TRUE NA
x + 2
#> [1] 6 7 NA
x * 2
#> [1] 8 10 NA
sum(x)
#> [1] NA
sum(x, na.rm = TRUE) ## Only the missing values are removed.
#> [1] 9
mean(x)
#> [1] NA
mean(x, na.rm = TRUE)
#> [1] 4.5
var(x)
#> [1] NA
var(x, na.rm = TRUE)
#> [1] 0.5
a <- rep(c(3, NA, 2), each = 2)
b <- matrix(data = a, nrow = 2, ncol = 3)
b
#> [,1] [,2] [,3]
#> [1,] 3 NA 2
#> [2,] 3 NA 2
sum(b, na.rm = TRUE) ## Only the missing values are removed.
#> [1] 10
sum(b, na.omit = TRUE) ## sum has no na.omit argument: TRUE is simply added as one more value and the NAs are kept, so the result is NA.
#> [1] NA
length(x)
#> [1] 3
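Unlike na.rm, which is an argument, na.omit and na.fail can also be called as functions on an object. A brief sketch with the same kind of vector as above:

```r
x <- c(4, 5, NA)

clean.x <- na.omit(x) ## Drops the missing value.
sum(clean.x)          ## 9
length(clean.x)       ## 2: the NA is no longer counted.
attr(clean.x, "na.action") ## Records the position (3) of the removed value.

## na.fail stops with an error when a missing value is present.
failed <- inherits(try(na.fail(x), silent = TRUE), "try-error")
failed ## TRUE
```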
0/0
, Inf - Inf
) are represented by NaN
(not a number).
NaN
is a NA
, but NA
is not a NaN
.is.nan
function to check whether a value is NaN
.
x <- c(NA, NaN) ## Numeric vector.
str(x)
#> num [1:2] NA NaN
is.na(x) ## Both NA and NaN are considered as NA.
#> [1] TRUE TRUE
is.nan(x) ## NA is not considered NaN; only the NaN element returns TRUE.
#> [1] FALSE TRUE
is.nan(0/0)
#> [1] TRUE
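It may help to see how NaN differs from the infinite values that division by zero produces. The values below are chosen only for illustration.

```r
1 / 0     ## Inf: dividing a positive number by zero gives infinity, not NaN.
-1 / 0    ## -Inf
0 / 0     ## NaN: the result is undefined.
Inf - Inf ## NaN

vals <- c(1, NA, NaN, Inf)
is.na(vals)     ## FALSE  TRUE  TRUE FALSE
is.nan(vals)    ## FALSE FALSE  TRUE FALSE
is.finite(vals) ##  TRUE FALSE FALSE FALSE
```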
R
object, but not all of them. In such cases the subsetting operators become handy.R
objects.R
, you can use three different subsetting operators to select an individual element or group of elements.
[ ]
.[[ ]]
.$
for simplifying subsetting.[ ]
and [[ ]]
you have to put the position number (index
) of the element into square brackets.
[index]
, [[index]]
, [index.1, index.2]
, [index.1, index.2, index.3]
and so on (depending on the dimensions of the R
object).[ ]
can be used with multiple indices but it is not possible for [[ ]]
.R
object, indexing starts from 1
.[ ]
and [[ ]]
, you need to specify the names with quotes.index
.[index]
for preserving subsetting.[[index]]
for simplifying subsetting.[index]
and [[index]]
does not matter.[[index]]
.[index.1][index.2]
.names
of an R
object, you should use the unname
function.
# Vector with numeric values.
a <- c(1:10)
a
#> [1] 1 2 3 4 5 6 7 8 9 10
a[3] ## Selecting the 3rd element. Preserving subsetting with unnamed vectors.
#> [1] 3
a[[3]] ## Use it for vectors with names. Simplifying subsetting with unnamed vectors. Same result as above.
#> [1] 3
a[c(2, 3, 5)] ## Selecting the 2nd, 3rd and the 5th elements.
#> [1] 2 3 5
a[1:3]
#> [1] 1 2 3
a[10:5]
#> [1] 10 9 8 7 6 5
a[3:length(a)]
#> [1] 3 4 5 6 7 8 9 10
a[c(seq(1, 10, 2))]
#> [1] 1 3 5 7 9
a[c(2, 3, 5, 6)][c(1, 2)] ## Subsetting two times.
#> [1] 2 3
a[c(2, 3, 5, 6)][c(1, 2)][1] ## Subsetting three times.
#> [1] 2
# Negative indexing
a <- c(1:10)
a[-1]
#> [1] 2 3 4 5 6 7 8 9 10
a[-c(1:4)]
#> [1] 5 6 7 8 9 10
a[-c(1, 4)]
#> [1] 2 3 5 6 7 8 9 10
# Vector with names.
b <- c(First = 1, Second = 2, Third = 3, Fourth = 4, Fifth = 5)
b
#> First Second Third Fourth Fifth
#> 1 2 3 4 5
b[1] ## Preserving subsetting with named vectors.
#> First
#> 1
b[[1]] ## Simplifying subsetting with named vectors. Compared to above result, they are not the same.
#> [1] 1
b["First"]
#> First
#> 1
b[["First"]] ## Simplifying subsetting.
#> [1] 1
b[[1]] ## Same as above.
#> [1] 1
b[c("First", "Third")]
#> First Third
#> 1 3
b[c(1, 3)] ## Same as above.
#> First Third
#> 1 3
# b[[c("First", "Third")]] ## Error. You cannot use multiple indices with "[[ ]]". Instead, use one of the commands below.
unname(b[c("First", "Third")]) ## You can use the "unname" function to drop the names.
#> [1] 1 3
c(b[["First"]], b[["Third"]])
#> [1] 1 3
b[c(names(b)[c(3, 5)])]
#> Third Fifth
#> 3 5
b[c(names(b)[-c(3, 5)])]
#> First Second Fourth
#> 1 2 4
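Besides positions and names, a logical vector can also be placed inside the square brackets, which selects the elements where the condition is TRUE. This goes slightly beyond the index-based examples above and is shown only as a short sketch.

```r
a <- c(1:10)
a[a > 5]        ## 6 7 8 9 10: elements greater than 5.
a[a %% 2 == 0]  ## 2 4 6 8 10: even elements.
which(a > 5)    ## 6 7 8 9 10: positions where the condition holds.

b <- c(First = 1, Second = 2, Third = 3)
b[b >= 2]       ## Names are preserved: Second and Third.
```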
drop = TRUE
argument to drop the unused levels.
a <- factor(x = c("yes", NA, "no", "yes", NA)) ## Creates the factor and gives you the 'Levels'.
a
#> [1] yes <NA> no yes <NA>
#> Levels: no yes
a[3] ## Selecting the 3rd element. Preserving subsetting with unnamed factors.
#> [1] no
#> Levels: no yes
a[[3]] ## Use it for factors with names. Simplifying subsetting with unnamed vectors. Same result as above.
#> [1] no
#> Levels: no yes
a[c(1:3)]
#> [1] yes <NA> no
#> Levels: no yes
a[1, drop = TRUE] ## Drops the unused levels.
#> [1] yes
#> Levels: yes
a <- factor(x = c(First = "yes", Second = NA, Third = "no", Fourth = "yes", Fifth = NA))
a
#> First Second Third Fourth Fifth
#> yes <NA> no yes <NA>
#> Levels: no yes
a[1] ## Preserving subsetting with named factors.
#> First
#> yes
#> Levels: no yes
a[[1]] ## Simplifying subsetting with named factors. Compared to above result, they are not the same.
#> [1] yes
#> Levels: no yes
unname(a[1]) ## Same as above.
#> [1] yes
#> Levels: no yes
a["First"] ## You can also use names.
#> First
#> yes
#> Levels: no yes
a[1, drop = TRUE] ## Preserving subsetting. Drops the unused levels.
#> First
#> yes
#> Levels: yes
a[[1]][ , drop = TRUE] ## Simplifying subsetting. Drops the unused levels.
#> [1] yes
#> Levels: yes
y <- sample(x = c(TRUE, FALSE, NA), size = 5, replace = TRUE)
y
#> [1] TRUE NA NA NA TRUE
y[1]
#> [1] TRUE
y[[1]]
#> [1] TRUE
y[c(2:4)]
#> [1] NA NA NA
# Named logicals are dropped since they are rarely used.
R
objects, you can use up to two indices for subsetting.[index.1, , drop = FALSE]
or [, index.1, drop = FALSE]
for preserving subsetting.[index.1, ]
or [, index.1]
for simplifying subsetting.index.1
is for rows, and index.2
is for columns.[index]
to subset a single element in a matrix.
[1, 1]
and continues as [2, 1]
, [nrow, 1]
, [1, 2]
… [nrow, ncol]
.unname
function to drop the names.
a <- matrix(data = 1:9, nrow = 3, ncol = 3, dimnames = list(c("row1", "row2", "row3"), c("col1", "col2", "col3")))
a
#> col1 col2 col3
#> row1 1 4 7
#> row2 2 5 8
#> row3 3 6 9
a[2] ## Gives the second element in a matrix. Note that elements are indexed column by column: the second element is in the first column, second row.
#> [1] 2
a[[2]] ## Same result.
#> [1] 2
a[1, 1] ## Simplifying subsetting. First row and first column.
#> [1] 1
a[1, 1, drop = FALSE] ## Preserving subsetting.
#> col1
#> row1 1
a[1:2, 1:2]
#> col1 col2
#> row1 1 4
#> row2 2 5
a["row1", "col1"] ## You can also use names with quotes.
#> [1] 1
a[ , 1] ## Simplifying subsetting. First column.
#> row1 row2 row3
#> 1 2 3
a[ , 1, drop = FALSE] ## Preserving subsetting.
#> col1
#> row1 1
#> row2 2
#> row3 3
unname(a[, 1])
#> [1] 1 2 3
a[, 2:3]
#> col2 col3
#> row1 4 7
#> row2 5 8
#> row3 6 9
a[1, ] ## Simplifying subsetting. First row.
#> col1 col2 col3
#> 1 4 7
a[1, , drop = FALSE] ## Preserving subsetting.
#> col1 col2 col3
#> row1 1 4 7
unname(a[1, ])
#> [1] 1 4 7
a[2:3, ]
#> col1 col2 col3
#> row2 2 5 8
#> row3 3 6 9
a[-1, ] ## Negative subsetting
#> col1 col2 col3
#> row2 2 5 8
#> row3 3 6 9
a[, -1]
#> col2 col3
#> row1 4 7
#> row2 5 8
#> row3 6 9
a[-c(2:3), -1]
#> col2 col3
#> 4 7
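Matrices can also be subset with logical conditions, either on the elements themselves or on a whole row or column. The matrix m below is a made-up example without dimension names.

```r
m <- matrix(1:9, nrow = 3, ncol = 3)

m[m > 6] ## 7 8 9: a logical matrix selects elements and returns a vector.

## Keep the rows whose first-column value is at least 2.
big.rows <- m[m[, 1] >= 2, ]
big.rows

## which with arr.ind = TRUE reports the row and column of a match.
pos <- which(m == 5, arr.ind = TRUE)
pos ## Row 2, column 2.
```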
index.1
is for rows, and index.2
is for columns.index.3
to index.K
indicates the other dimensions.
a <- array(data = 1:12, dim = c(3, 2, 2)) ## 3-dimensional array.
a
#> , , 1
#>
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
#>
#> , , 2
#>
#> [,1] [,2]
#> [1,] 7 10
#> [2,] 8 11
#> [3,] 9 12
a[12] ## Gives the 12th element.
#> [1] 12
a[[12]] ## Same result.
#> [1] 12
# a[1, 1] ## Incorrect number of dimensions
a[1, 1, 1] ## Simplifying subsetting.
#> [1] 1
a[1, 1, 1, drop = FALSE] ## Preserving subsetting.
#> , , 1
#>
#> [,1]
#> [1,] 1
a[1:2, 1:2, 1:2]
#> , , 1
#>
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#>
#> , , 2
#>
#> [,1] [,2]
#> [1,] 7 10
#> [2,] 8 11
a[1, , ] ## Simplifying subsetting.
#> [,1] [,2]
#> [1,] 1 7
#> [2,] 4 10
a[1, , , drop = FALSE] ## Preserving subsetting.
#> , , 1
#>
#> [,1] [,2]
#> [1,] 1 4
#>
#> , , 2
#>
#> [,1] [,2]
#> [1,] 7 10
a[1:2, , ]
#> , , 1
#>
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#>
#> , , 2
#>
#> [,1] [,2]
#> [1,] 7 10
#> [2,] 8 11
a[, 1, ] ## Simplifying subsetting.
#> [,1] [,2]
#> [1,] 1 7
#> [2,] 2 8
#> [3,] 3 9
a[, 1, , drop = FALSE] ## Preserving subsetting.
#> , , 1
#>
#> [,1]
#> [1,] 1
#> [2,] 2
#> [3,] 3
#>
#> , , 2
#>
#> [,1]
#> [1,] 7
#> [2,] 8
#> [3,] 9
a[, 1:2, ]
#> , , 1
#>
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
#>
#> , , 2
#>
#> [,1] [,2]
#> [1,] 7 10
#> [2,] 8 11
#> [3,] 9 12
a[ , , 1] ## Simplifying subsetting.
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
a[ , , 1, drop = FALSE] ## Preserving subsetting.
#> , , 1
#>
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
a[, , 1]
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
a[-1, , ] ## Negative subsetting
#> , , 1
#>
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 3 6
#>
#> , , 2
#>
#> [,1] [,2]
#> [1,] 8 11
#> [2,] 9 12
a[, -1, ]
#> [,1] [,2]
#> [1,] 4 10
#> [2,] 5 11
#> [3,] 6 12
a[-c(2:3), , -1]
#> [1] 7 10
For two-dimensional R objects, you can use up to two indices for subsetting.
Use [, index.2, drop = FALSE] or [index.2] for preserving subsetting.
Use [, index.2], [[index.2]] and $ with names for simplifying subsetting.
The $ subsetting operator is generally used to select a dimension of an R object by its name, with or without quotes.
Use [index.1, index.2] for subsetting rows and columns simultaneously.
index.1 is for rows, and index.2 is for columns.
Use the unname function to drop the names.
x <- data.frame(c(1:5), c(6:10), c(11:15), c(16:20), stringsAsFactors = FALSE) ## Creating data frame from scratch without specific column names.
colnames(x) <- paste0("Column", ".", 1:ncol(x)) ## Assigns the column names to the data frame by using colnames.
x
# Subsetting columns.
x[, 1] ## Simplifying subsetting for columns. Column 1 values only.
#> [1] 1 2 3 4 5
x[[1]] ## Same as above. Note that it subsets the columns only.
#> [1] 1 2 3 4 5
x[["Column.1"]]
#> [1] 1 2 3 4 5
x[, "Column.1"]
#> [1] 1 2 3 4 5
x["Column.1"]
x$Column.1 ## Same as above.
#> [1] 1 2 3 4 5
x$"Column.1" ## Same as above.
#> [1] 1 2 3 4 5
x[2:4, 1]
#> [1] 2 3 4
x[, 1, drop = FALSE] ## Preserving subsetting for columns. Column 1 values only.
x[1] ## Note that it subsets the columns only.
x[2:4, 1, drop = FALSE]
# Subsetting rows.
x[1, ] ## Structure of the data is preserved
unname(as.matrix(x[1, ])[1, ]) ## This is the simplified subsetting for rows. Note that the as.matrix function coerces the subsetted data frame into a matrix. We will see the details of coercion later.
#> [1] 1 6 11 16
# Subsetting row and columns.
x[1, 1]
#> [1] 1
x[2:4, 1]
#> [1] 2 3 4
x[c(1, 3), c(2, 4)]
# Negative subsetting
x[-1, -1] ## The first row and the first column are dropped.
x[-c(2:4), ] ## Row 2, 3, 4 are deleted.
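The difference between simplifying and preserving subsetting is easiest to see by checking the class of the result. This is a small sketch with an illustrative data frame named df (an assumption, not the x used above).

```r
df <- data.frame(Column.1 = 1:5, Column.2 = 6:10)
simplified <- df[, 1]                ## Simplifying: the column comes back as a vector.
preserved <- df[, 1, drop = FALSE]   ## Preserving: the result is still a data frame.
class(simplified)                    ## "integer"
class(preserved)                     ## "data.frame"
dim(preserved)                       ## 5 rows and 1 column.
is.null(dim(simplified))             ## TRUE, a plain vector has no dimensions.
```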
Use $ for simplifying subsetting with named lists.
a <- list(c(1:5), c("a", "b"), c(TRUE, FALSE), list(c(6:10), c("c, d")))
a
#> [[1]]
#> [1] 1 2 3 4 5
#>
#> [[2]]
#> [1] "a" "b"
#>
#> [[3]]
#> [1] TRUE FALSE
#>
#> [[4]]
#> [[4]][[1]]
#> [1] 6 7 8 9 10
#>
#> [[4]][[2]]
#> [1] "c, d"
a[[1]] ## Simplifying subsetting.
#> [1] 1 2 3 4 5
a[1] ## Preserving subsetting. Note that the result is still a list.
#> [[1]]
#> [1] 1 2 3 4 5
a[[2]][1] ## [[]] extracts the second element of the list. [] then subsets that element.
#> [1] "a"
a[[4]][[1]]
#> [1] 6 7 8 9 10
a[[4]][[1]][2]
#> [1] 7
a <- list(Numeric = c(1:3), Character = c("a", "b"), Logical = c(TRUE, FALSE))
a
#> $Numeric
#> [1] 1 2 3
#>
#> $Character
#> [1] "a" "b"
#>
#> $Logical
#> [1] TRUE FALSE
a[["Numeric"]]
#> [1] 1 2 3
a$Numeric
#> [1] 1 2 3
a$"Numeric"
#> [1] 1 2 3
a$Numeric[2]
#> [1] 2
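One detail worth checking when choosing between [, [[ and $ on a named list is what each returns for a full or an abbreviated name. The list lst below is an illustrative assumption; the partial matching shown for $ is standard base R behaviour for lists.

```r
lst <- list(Numeric = c(1:3), Character = c("a", "b"))
class(lst["Numeric"])    ## "list": preserving subsetting.
class(lst[["Numeric"]])  ## "integer": simplifying subsetting.
lst$Num                  ## $ matches names partially, so this returns 1 2 3.
lst[["Num"]]             ## [[ ]] matches names exactly, so this returns NULL.
```

Because of the partial matching, writing the full name with $ is the safer habit in scripts.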
R objects, and you need a lot of practice to become experienced in it.
For more on subsetting in R, please see the Subsetting section of Advanced R by Hadley Wickham.
In R, there is one other method for subsetting, which is called conditional subsetting.
R object with your condition.
The TRUE and FALSE values in the logical vector determine which elements are selected.
# Vectors
x <- c(1:10)
x
#> [1] 1 2 3 4 5 6 7 8 9 10
a <- x > 4 ## This is our condition.
x[a]
#> [1] 5 6 7 8 9 10
x[!a]
#> [1] 1 2 3 4
x[x < 2 | x > 8]
#> [1] 1 9 10
# Matrices
a <- matrix(data = 1:9, nrow = 3, ncol = 3, dimnames = list(c("row1", "row2", "row3"), c("col1", "col2", "col3")))
a
#> col1 col2 col3
#> row1 1 4 7
#> row2 2 5 8
#> row3 3 6 9
a > 4 ## Condition.
#> col1 col2 col3
#> row1 FALSE FALSE TRUE
#> row2 FALSE TRUE TRUE
#> row3 FALSE TRUE TRUE
a[a > 4] ## Condition is applied to all matrix elements.
#> [1] 5 6 7 8 9
b <- unname(a[, 2, drop = TRUE]) ## Second column.
b
#> [1] 4 5 6
b[b > 4] ## Condition on the second column.
#> [1] 5 6
# Data frames
x <- data.frame(c(1:5), c(6:10), c(11:15), c(16:20), stringsAsFactors = FALSE) ## Creating data frame from scratch without specific column names.
colnames(x) <- paste0("Column", ".", 1:ncol(x)) ## Assigns the column names to the data frame by using colnames.
x
x[x > 4] ## Condition is applied to all data frame elements.
#> [1] 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
x[x$Column.1 > 2, ]
x[x$Column.1 > 2 & x$Column.3 > 13, ]
x[x$Column.1 > 2 & x$Column.3 > 13, ]$Column.4
#> [1] 19 20
x[x$Column.1 > 2 & x$Column.3 > 13, c("Column.2", "Column.3")]
NA value(s) create a logical vector with NA value(s).
NA value(s) create unexpected results while performing conditional subsetting.
NA values.
# Vectors
x <- sample(x = c(1:10, rep(NA, 5)), size = 10, replace = TRUE, prob = NULL)
x
#> [1] 7 6 8 NA 10 2 10 NA 9 NA
a <- x > 4 ## When there is NA the condition produces NA.
a
#> [1] TRUE TRUE TRUE NA TRUE FALSE TRUE NA TRUE NA
b <- x[!is.na(x)] ## NA values are excluded.
b
#> [1] 7 6 8 10 2 10 9
b[b > 4] ## Values larger than 4.
#> [1] 7 6 8 10 10 9
x[which(x > 4)] ## You can use this directly. The which function omits the NA values automatically.
#> [1] 7 6 8 10 10 9
# Matrices
x <- sample(x = c(1:10, rep(NA, 5)), size = 9, replace = TRUE, prob = NULL)
a <- matrix(data = x, nrow = 3, ncol = 3, dimnames = list(c("row1", "row2", "row3"), c("col1", "col2", "col3")))
a
#> col1 col2 col3
#> row1 NA 10 9
#> row2 NA 9 10
#> row3 6 NA 4
a > 4 ## When there is NA the condition produces NA.
#> col1 col2 col3
#> row1 NA TRUE TRUE
#> row2 NA TRUE TRUE
#> row3 TRUE NA FALSE
a[which(a > 4)] ## Condition is applied to all matrix elements.
#> [1] 6 10 9 9 10
b <- unname(a[, 2, drop = TRUE]) ## Second column.
b
#> [1] 10 9 NA
b[which(b > 4)] ## Condition on the second column.
#> [1] 10 9
# Data frames
x <- data.frame(c(NA, 1:3, NA), c(NA, 4, 10, 5:6), c(7:9, NA, NA), c(10:14), stringsAsFactors = FALSE) ## Creating data frame from scratch without specific column names.
colnames(x) <- paste0("Column", ".", 1:ncol(x)) ## Assigns the column names to the data frame by using colnames.
x
x > 4 ## When there is NA the condition produces NA.
#> Column.1 Column.2 Column.3 Column.4
#> [1,] NA NA TRUE TRUE
#> [2,] FALSE FALSE TRUE TRUE
#> [3,] FALSE TRUE TRUE TRUE
#> [4,] FALSE TRUE NA TRUE
#> [5,] NA TRUE NA TRUE
is.na(x)
#> Column.1 Column.2 Column.3 Column.4
#> [1,] TRUE TRUE FALSE FALSE
#> [2,] FALSE FALSE FALSE FALSE
#> [3,] FALSE FALSE FALSE FALSE
#> [4,] FALSE FALSE TRUE FALSE
#> [5,] TRUE FALSE TRUE FALSE
complete.cases(x) ## Flags the rows in which all values are non-NA.
#> [1] FALSE TRUE TRUE FALSE FALSE
a <- x[complete.cases(x), ] ## Rows with non missing elements.
a ## Non-NA data frame.
a[a$Column.1 > 1, ] ## Apply the condition.
a[a$Column.1 > 1, c("Column.2", "Column.4")]
x[which(x$Column.1 > 1), ]
x[which(x$Column.1 > 1 & x$Column.3 > 7), ]
x[which(x$Column.1 > 2 & x$Column.4 > 11), ]$Column.4
#> [1] 13
x[which(x$Column.1 > 1 & x$Column.3 > 8), c("Column.2", "Column.3")]
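To see why NA values give unexpected results in conditional subsetting, compare the three approaches below on a small vector. The vector v is an illustrative assumption.

```r
v <- c(7, NA, 2, 10, NA)
v[v > 4]              ## The NA positions are kept: 7 NA 10 NA.
v[which(v > 4)]       ## which() drops the NA positions: 7 10.
v[!is.na(v) & v > 4]  ## Excluding NA explicitly gives the same result: 7 10.
```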
R objects.
# Vectors
x <- c(1:10)
x
#> [1] 1 2 3 4 5 6 7 8 9 10
x[x > 4] <- NA
x
#> [1] 1 2 3 4 NA NA NA NA NA NA
x[is.na(x)] <- 0
# Matrices
a <- matrix(data = 1:9, nrow = 3, ncol = 3, dimnames = list(c("row1", "row2", "row3"), c("col1", "col2", "col3")))
a
#> col1 col2 col3
#> row1 1 4 7
#> row2 2 5 8
#> row3 3 6 9
a[1, c(1:3)] <- NA
a
#> col1 col2 col3
#> row1 NA NA NA
#> row2 2 5 8
#> row3 3 6 9
# Data frames
x <- data.frame(c(1:5), c(6:10), c(11:15), c(16:20), stringsAsFactors = FALSE) ## Creating data frame from scratch without specific column names.
colnames(x) <- paste0("Column", ".", 1:ncol(x)) ## Assigns the column names to the data frame by using colnames.
x
x[3, 4] <- 10000
x
x[x$Column.1 > 2, ] <- NA
x
R programming language.
In R, you can convert the class of some objects into other classes by explicit coercion.
x <- c(0:6)
class(x) ## The class of x is integer.
#> [1] "integer"
as.numeric(x) ## Coerces x as a numeric.
#> [1] 0 1 2 3 4 5 6
as.character(x) ## Coerces x as a character.
#> [1] "0" "1" "2" "3" "4" "5" "6"
as.complex(x) ## Coerces x as a complex.
#> [1] 0+0i 1+0i 2+0i 3+0i 4+0i 5+0i 6+0i
as.factor(x) ## Coerces x as a factor.
#> [1] 0 1 2 3 4 5 6
#> Levels: 0 1 2 3 4 5 6
as.logical(x) ## Coerces x as a logical (0 is FALSE and every non-zero value is TRUE).
#> [1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
as.matrix(x) ## Coerces x as a matrix.
#> [,1]
#> [1,] 0
#> [2,] 1
#> [3,] 2
#> [4,] 3
#> [5,] 4
#> [6,] 5
#> [7,] 6
as.array(x) ## Coerces x as an array.
#> [1] 0 1 2 3 4 5 6
as.data.frame(x) ## Coerces x as a data frame.
as.list(x) ## Coerces x as a list.
#> [[1]]
#> [1] 0
#>
#> [[2]]
#> [1] 1
#>
#> [[3]]
#> [1] 2
#>
#> [[4]]
#> [1] 3
#>
#> [[5]]
#> [1] 4
#>
#> [[6]]
#> [1] 5
#>
#> [[7]]
#> [1] 6
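Explicit coercion does not always give the result you might expect. A common case, sketched below with an illustrative factor f, is that as.numeric on a factor returns the internal level codes rather than the labels.

```r
f <- factor(c("10", "20", "10"))
as.numeric(f)                        ## Level codes: 1 2 1.
as.numeric(as.character(f))          ## The labels as numbers: 10 20 10.
suppressWarnings(as.numeric("abc"))  ## Values that cannot be coerced become NA.
```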
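R has a special representation of dates and times.
Dates are represented by the Date class.
Dates are stored internally as the number of days since 1970-01-01.
Times are represented by the POSIXct or POSIXlt class.
Times are stored internally as the number of seconds since 1970-01-01.
d1 <- base::date() ## Current date and time.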
d1
#> [1] "Tue Feb 6 02:35:10 2018"
class(d1) ## Class is character
#> [1] "character"
d2 <- Sys.Date() ## System date.
d2
#> [1] "2018-02-06"
class(d2) ## Class is "Date".
#> [1] "Date"
d3 <- Sys.time() ## System time.
d3
#> [1] "2018-02-06 02:35:10.170 EST"
str(d3) ## Class is "POSIXct"
#> POSIXct[1:1], format: "2018-02-06 02:35:10.170"
In R, times are represented using the POSIXct or the POSIXlt class.
POSIXct is just a very large number under the hood; it is a useful class when you want to store times in something like a data frame.
POSIXlt is a list underneath and it stores a bunch of other useful information like the day of the week, day of the year, month and day of the month.
Times can be coerced back and forth using the as.POSIXlt or as.POSIXct functions.
x <- Sys.time()
x
#> [1] "2018-02-06 02:35:10.187 EST"
str(x) ## Class is "POSIXct"
#> POSIXct[1:1], format: "2018-02-06 02:35:10.187"
a <- as.POSIXlt(x) ## Coerced to as.POSIXlt.
str(a) ## Class is "POSIXlt"
#> POSIXlt[1:1], format: "2018-02-06 02:35:10.187"
names(unclass(a)) ## Gives the names, after it is unclassed.
#> [1] "sec" "min" "hour" "mday" "mon" "year" "wday"
#> [8] "yday" "isdst" "zone" "gmtoff"
a$sec
#> [1] 10.187761
a$yday
#> [1] 36
a$mday
#> [1] 6
as.POSIXct(a) ## Coerced to as.POSIXct.
#> [1] "2018-02-06 02:35:10.187 EST"
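A short sketch of what the two time classes store, using a fixed time in UTC so that the result does not depend on the system clock. The object names tm and lt are illustrative.

```r
tm <- as.POSIXct("2018-02-06 02:35:10", tz = "UTC")
as.numeric(tm)          ## Seconds since 1970-01-01 00:00:00 UTC: 1517884510.
lt <- as.POSIXlt(tm)    ## The same time stored as a list of components.
lt$year + 1900          ## Years are counted from 1900: 2018.
lt$mon + 1              ## Months are counted from 0: 2.
lt$wday                 ## Day of the week, Sunday is 0: 2 (Tuesday).
```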
as.Date function.
With the unclass function, Date class objects can be coerced to numeric objects that indicate the number of days since the first date (1970-01-01).
Date class.
x <- as.Date("1970-01-01")
str(x) ## Class is "Date".
#> Date[1:1], format: "1970-01-01"
unclass(x) ## Note that the starting date is 1970-01-01 for date class.
#> [1] 0
unclass(as.Date("1970-01-02"))
#> [1] 1
unclass(as.Date("1969-12-31")) ## Dates before the starting date are represented by negative numbers.
#> [1] -1
unclass(as.Date(Sys.Date())) ## Number of days since the first date.
#> [1] 17568
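as.Date(0, origin = "1970-01-01") ## The first date. The origin is given explicitly because older versions of R require it for numeric input.
#> [1] "1970-01-01"
as.Date(c(-2:2), origin = "1970-01-01")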
#> [1] "1969-12-30" "1969-12-31" "1970-01-01" "1970-01-02" "1970-01-03"
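Because Date objects are stored as day counts, arithmetic on them works directly. The following is a minimal sketch with illustrative names first.day and last.day.

```r
first.day <- as.Date("2018-01-01")
last.day <- as.Date("2018-02-06")
last.day - first.day                                 ## Time difference of 36 days.
as.numeric(last.day - first.day)                     ## 36
first.day + 30                                       ## "2018-01-31"
seq(from = first.day, by = "month", length.out = 3)  ## First day of three consecutive months.
```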
In R, there are a number of generic functions that work on dates and times.
format, strptime, and strftime functions.
%d: day of the month as a number (01-31).
%a: abbreviated weekday.
%A: unabbreviated weekday.
%m: month as a number (01-12).
%b: abbreviated month.
%B: unabbreviated month.
%y: two-digit year.
%Y: four-digit year.
%H: hours as decimal number (00-23).
%M: minute as decimal number (00-59).
%S: second as integer (00-61).
%T: equivalent to %H:%M:%S.
x <- Sys.Date() ## System date.
weekdays(x, abbreviate = FALSE) ## The weekday.
#> [1] "Tuesday"
months(x, abbreviate = FALSE) ## The month.
#> [1] "February"
julian(x) ## Gives the number of the days since the origin and the origin is given in the result.
#> [1] 17568
#> attr(,"origin")
#> [1] "1970-01-01"
format(x, "%a %b %d") ## Formats the date class object with the desired representation.
#> [1] "Tue Feb 06"
strftime(x, origin = "1970-01-01", tz = "UTC", format = "%B %d, %Y %H:%M") ## Similar to above.
#> [1] "February 06, 2018 00:00"
strftime(x, origin = "1970-01-01", tz = "UTC", format = "%A %B %Y") ## Similar to above.
#> [1] "Tuesday February 2018"
format(as.POSIXct("Feb 03, 2017 09:12 PM", tz = "UTC", format = "%b %d, %Y %I:%M %p"), "%Y%m%d%H%M")
#> [1] "201702032112"
date.string <- c("January 10, 2012 10:40", "December 9, 2011 09:10:00") ## A date string with a specific representation.
a <- strptime(date.string, format = "%B %d, %Y %H:%M") ## Note that the format needs match your date string format.
a
#> [1] "2012-01-10 10:40:00 EST" "2011-12-09 09:10:00 EST"
str(a)
#> POSIXlt[1:2], format: "2012-01-10 10:40:00" "2011-12-09 09:10:00"
as.Date(a)
#> [1] "2012-01-10" "2011-12-09"
# ?strptime ## Check the arg of strptime function.
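As a quick check that a format string describes your data, you can format a time and parse it back with the same string. This sketch uses only numeric conversion specifications so that it does not depend on the locale; the names stamp, txt and back are illustrative.

```r
stamp <- as.POSIXct("2012-01-10 10:40:00", tz = "UTC")
txt <- format(stamp, "%d/%m/%Y %H:%M")  ## "10/01/2012 10:40"
back <- as.POSIXct(txt, format = "%d/%m/%Y %H:%M", tz = "UTC")
back == stamp                           ## TRUE, nothing was lost in the round trip.
```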
R is very powerful in date and time operations.
year <- "2017"
d <- as.Date(paste0(year, "-01-01"))
tuesdays <- d + seq(by = 7, (2 - as.POSIXlt(d)$wday) %% 7, 364 + (months(d + 30 + 29) == "February")) ## Note that the day number starts on Sunday with 0 and ends on Saturday with 6.
tuesdays[tapply(seq_along(tuesdays), as.POSIXlt(tuesdays)$mon, max)]
#> [1] "2017-01-31" "2017-02-28" "2017-03-28" "2017-04-25" "2017-05-30"
#> [6] "2017-06-27" "2017-07-25" "2017-08-29" "2017-09-26" "2017-10-31"
#> [11] "2017-11-28" "2017-12-26"
d <- as.Date(paste0(year, "-01-01"))
saturdays <- d + seq(by = 7, (6 - as.POSIXlt(d)$wday) %% 7, 364 + (months(d + 30 + 29) == "February"))
saturdays[tapply(seq_along(saturdays), as.POSIXlt(saturdays)$mon, max)]
#> [1] "2017-01-28" "2017-02-25" "2017-03-25" "2017-04-29" "2017-05-27"
#> [6] "2017-06-24" "2017-07-29" "2017-08-26" "2017-09-30" "2017-10-28"
#> [11] "2017-11-25" "2017-12-30"
R functions.
R function.
R function, please see the Functions and Functional Programming sections of Advanced R by Hadley Wickham.
R is done by functions.
R functions
Functions are created with the function command.
Functions are stored as R objects just like anything else; they are R objects of class function.
R.
log(10) ## Takes the natural logarithm of the input.
#> [1] 2.3025851
is.function(log) ## Checks whether the object is a function.
#> [1] TRUE
log(exp(1)) ## Note that the exponential function is written as exp().
#> [1] 1
c(1, 2, 3, 4) ## The concatenate function, which creates a vector.
#> [1] 1 2 3 4
If you type the name of a function without parentheses and press enter, you can see its source code.
formals: the list of arguments which controls how you can call the function, which is shown in function(x, y).
body: the code inside the function, which is shown between the curly braces {}.
environment: the map of the location of the function’s variables.
When you print a function in R, it shows you these three important components. If the environment is not displayed, it means that the function was created in the global environment.
str(paste)
#> function (..., sep = " ", collapse = NULL)
formals(paste) ## Prints the arguments of a function.
#> $...
#>
#>
#> $sep
#> [1] " "
#>
#> $collapse
#> NULL
body(paste) ## Prints the body of a function.
#> .Internal(paste(list(...), sep, collapse))
environment(paste) ## Prints the environment of a function.
#> <environment: namespace:base>
getMethod("log") ## Shows the source code of one of the functions with the same name in the global environment. If this function does not work, try the below functions.
#> function (x, base = exp(1)) .Primitive("log")
str(log) ## The str function gives the structure of the function with its arguments. I rarely use it on functions, but it might be useful to reveal the full structure of a function.
#> function (x, base = exp(1))
getAnywhere(log) ## Shows the information of matching function names in all packages.
#> A single object matching 'log' was found
#> It was found in the following places
#> package:base
#> namespace:base
#> with value
#>
#> function (x, base = exp(1)) .Primitive("log")
getAnywhere(log)[1] ## Selecting the first function.
#> function (x, base = exp(1)) .Primitive("log")
getAnywhere(paste) ## Since there is only one function with the matching name, the function's source code is revealed immediately.
#> A single object matching 'paste' was found
#> It was found in the following places
#> package:base
#> namespace:base
#> with value
#>
#> function (..., sep = " ", collapse = NULL)
#> .Internal(paste(list(...), sep, collapse))
#> <bytecode: 0x1049ad070>
#> <environment: namespace:base>
getAnywhere(Head.Tail) ## This is a user-written R function. Note the curly braces, which enclose the body of the function.
#> A single object matching 'Head.Tail' was found
#> It was found in the following places
#> .GlobalEnv
#> with value
#>
#> function(x, Select) {
#> if (Select %% 1 != 0)
#> stop("Invalid Select. Please choose a whole number as Select.\n")
#>
#> rbind(head(x, Select), tail(x, Select))
#> }
formals(Head.Tail)
#> $x
#>
#>
#> $Select
body(Head.Tail)
#> {
#> if (Select%%1 != 0)
#> stop("Invalid Select. Please choose a whole number as Select.\n")
#> rbind(head(x, Select), tail(x, Select))
#> }
environment(Head.Tail)
#> <environment: R_GlobalEnv>
# edit(log) ## Use "edit" function to open the source code of a function in a small editor window in RStudio.
When an R function is called, it takes the information in the arguments, applies the code in the body, and then returns the value of the final expression (i.e., the return value) in the function.
R functions have named arguments.
The args and formals functions return all the formal arguments of a function with its usage.
Not every function call in R makes use of all the formal arguments.
In R functions, arguments can be missing or might have default values.
args(c) ## c is a primitive function; its values are passed through "...".
#> NULL
args(log) ## Has two arguments; base has a default value.
#> function (x, base = exp(1))
#> NULL
args(setdiff) ## Has two arguments.
#> function (x, y)
#> NULL
args("+") ## Has two arguments.
#> function (e1, e2)
#> NULL
args(mean) ## At least one argument. "..." means that some other arguments can be passed to other functions.
#> function (x, ...)
#> NULL
formals(mean)
#> $x
#>
#>
#> $...
args(nb2mat) ## Has 4 arguments and the last three have default values.
#> function (neighbours, glist = NULL, style = "W", zero.policy = NULL)
#> NULL
formals(nb2mat)
#> $neighbours
#>
#>
#> $glist
#> NULL
#>
#> $style
#> [1] "W"
#>
#> $zero.policy
#> NULL
R function arguments can be matched by argument order or by argument name.
a <- c(1, 2, 3, 4) ## Vector 1.
b <- c(1, 2, 5, 9) ## Vector 2.
# setdiff: Everything in "x" and not in "y".
getAnywhere(setdiff)[3]
#> standardGeneric for "setdiff" defined from package "base"
#>
#> function (x, y)
#> standardGeneric("setdiff")
#> <environment: 0x10e61d678>
#> Methods may be defined for arguments: x, y
#> Use showMethods("setdiff") for currently available ones.
str(setdiff)
#> Formal class 'standardGeneric' [package "methods"] with 8 slots
#> ..@ .Data :function (x, y)
#> ..@ generic : atomic [1:1] setdiff
#> .. ..- attr(*, "package")= chr "base"
#> ..@ package : chr "base"
#> ..@ group : list()
#> ..@ valueClass: chr(0)
#> ..@ signature : chr [1:2] "x" "y"
#> ..@ default :Formal class 'derivedDefaultMethod' [package "methods"] with 4 slots
#> .. .. ..@ .Data :function (x, y)
#> .. .. ..@ target :Formal class 'signature' [package "methods"] with 3 slots
#> .. .. .. .. ..@ .Data : chr "ANY"
#> .. .. .. .. ..@ names : chr "x"
#> .. .. .. .. ..@ package: chr "methods"
#> .. .. ..@ defined:Formal class 'signature' [package "methods"] with 3 slots
#> .. .. .. .. ..@ .Data : chr "ANY"
#> .. .. .. .. ..@ names : chr "x"
#> .. .. .. .. ..@ package: chr "methods"
#> .. .. ..@ generic: atomic [1:1] setdiff
#> .. .. .. ..- attr(*, "package")= chr "base"
#> ..@ skeleton : language (structure(function (x, y) { ...
setdiff(x = a, y = b) ## Matching by argument name.
#> [1] 3 4
setdiff(y = b, x = a) ## Matching by argument name, in a different order.
#> [1] 3 4
setdiff(a, b) ## Matching by argument order (no names).
#> [1] 3 4
setdiff(b, x = a) ## Mixed: x is matched by name, so b is matched to y.
#> [1] 3 4
setdiff(b, a)
#> [1] 5 9
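The same matching rules apply to functions you write yourself. The function power.sum below is a hypothetical example, not part of any package; the last call relies on R's partial matching of argument names, which works but is easy to misread.

```r
power.sum <- function(values, exponent = 2) {
  sum(values^exponent)
}
power.sum(1:3)                          ## Matching by order, default exponent: 14.
power.sum(exponent = 3, values = 1:3)   ## Matching by name in any order: 36.
power.sum(1:3, exp = 3)                 ## Partial name matching: exp matches exponent, 36.
```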
R functions.
Use the function command.
Wrap the body of the function in curly braces {} unless there is only one line of code.
# R function syntax.
function.syntax <- function(arguments) {
## Do something interesting
}
str(function.syntax)
#> function (arguments)
#> - attr(*, "srcref")=Class 'srcref' atomic [1:8] 2 20 4 1 20 1 2 4
#> .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x121ba53b8>
function.syntax(arguments) ## Calling the function.
#> NULL
R function with only one argument.
Use the return function to return the value you want to be printed.
# Function with no arguments.
myfunction <- function() {
x <- rnorm(100)
mean(x) ## The last expression will be returned.
}
myfunction()
#> [1] 0.047333324
formals(myfunction)
#> NULL
body(myfunction)
#> {
#> x <- rnorm(100)
#> mean(x)
#> }
environment(myfunction)
#> <environment: R_GlobalEnv>
# Function with one argument.
my.cube.func <- function(x) {
x^3
}
my.cube.func(2)
#> [1] 8
formals(my.cube.func)
#> $x
body(my.cube.func)
#> {
#> x^3
#> }
environment(my.cube.func)
#> <environment: R_GlobalEnv>
# Function with one argument and return function.
my.variance <- function(x) { ## Sample variance.
a <- (sum(x^2) - length(x) * mean(x)^2) / (length(x) - 1)
return(a) ## You can also use return function to return a specific value.
}
my.variance(c(1:5))
#> [1] 2.5
var(c(1:5)) ## Built-in R function gives the same result.
#> [1] 2.5
R has separate namespaces for functions and non-functions. Thus, it’s possible to have an object named c and a function named c. However, I do not recommend that.
# R code chunk is not evaluated.
# Naming a user created R function with a known built-in R function name.
c <- function(x) {
a <- x^x
return(a)
}
c(4)
class(c)
rm(c) ## The created c function is removed.
R function with some default values.
R class.
NULL.
The NULL value in R is mainly used to represent a list with zero length.
The NULL value is not a missing value.
# Function with default values.
sum.of.squares <- function(x, About = mean(x)) { ## Sum of squares.
x <- x[!is.na(x)]
a <- sum((x - About)^2)
return(a)
}
sum.of.squares(x = c(-2:2))
#> [1] 10
sum.of.squares(x = c(-2:2), About = 0)
#> [1] 10
sum.of.squares(x = c(-2:2), About = 1)
#> [1] 15
# Function with default values including NULL value.
my.function <- function(x, Power = 2, Addition = 10, Remove.NA = NULL) { ## Just a function.
a <- sum(x + Addition, na.rm = Remove.NA)^Power
return(a)
}
my.function(x = c(1:5), Power = 3, Addition = 1, Remove.NA = TRUE)
#> [1] 8000
my.function(x = c(1:5, NA), Power = 3, Addition = 1, Remove.NA = FALSE)
#> [1] NA
my.function(x = c(1:5, NA), Power = 3, Addition = 1, Remove.NA = NULL)
#> [1] 8000
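A common use of a NULL default is to mark an argument as optional and to compute its value inside the function when it is not supplied. The function center.values below is a hypothetical illustration of that pattern.

```r
center.values <- function(x, Center = NULL) {
  if (is.null(Center)) {  ## No value supplied, so fall back to the mean.
    Center <- mean(x)
  }
  x - Center
}
center.values(c(1, 2, 3))              ## -1 0 1
center.values(c(1, 2, 3), Center = 1)  ## 0 1 2
```

Testing with is.null keeps the default visible in the function signature while leaving the actual computation in the body.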
# First function.
my.square <- function(x) {
x <- x[!is.na(x)] ## NA values are subsetted.
a <- x^2
return(a)
}
# Second function.
my.cube <- function(x) {
x <- x[!is.na(x)] ## NA values are subsetted.
a <- x^3
return(a)
}
# The main function which nests the first and second functions.
sum.square.cube <- function(x) {
a <- sum(my.square(x))
b <- sum(my.cube(x))
c <- a + b
return(c)
}
sum.square.cube(1)
#> [1] 2
sum.square.cube(2)
#> [1] 12
sum.square.cube(c(1:5))
#> [1] 280
sum.square.cube(c(1:5, rep(NA, 3)))
#> [1] 280
Use the ls, environment, and get functions to list the objects and their values inside of a function.
# Function which creates functions as output.
make.power <- function(power) {
power.func <- function(base) {
return(base^power)
}
return(power.func)
}
make.power(2)
#> function(base) {
#> return(base^power)
#> }
#> <environment: 0x1270ca8e8>
# New function 1.
square.func <- make.power(2) ## Created a square function.
square.func(3) ## Takes the square of 3.
#> [1] 9
## New function 2.
cube.func <- make.power(3) ## Created a cube function.
cube.func(3) ## Takes the cube of 3.
#> [1] 27
# What's in a function's environment?
ls(environment(cube.func)) ## Gives the objects defined in the environment of cube.func.
#> [1] "power" "power.func"
get("power", environment(cube.func)) ## Gives the value of "power" in cube.func.
#> [1] 3
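Function factories such as make.power can also be combined with lapply to build several functions at once. This is a sketch with a hypothetical factory make.multiplier; force is called so that each function captures its own value regardless of R's lazy evaluation.

```r
make.multiplier <- function(factor) {
  force(factor)  ## Evaluate the argument now.
  function(value) {
    value * factor
  }
}
multipliers <- lapply(1:3, make.multiplier)    ## A list of three functions.
multipliers[[2]](10)                           ## 20
get("factor", environment(multipliers[[3]]))   ## 3
```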
R has two types of scoping: lexical scoping, implemented automatically at the language level, and dynamic scoping, used in select functions to save typing during interactive analysis.
If a function func.1 is defined within a function func.2, the variables in func.2 are visible in func.1.
Case 1.
func.1 and func.2 are created in the global environment.
# Case Study 1
y <- 10 ## A variable defined in global environment not inside of a function.
func.1 <- function(x) { ## Function 1.
x*y
}
func.2 <- function(x) { ## Function 2.
y <- 2 ## A variable defined in the environment of func.2 function.
y^2 + func.1(x)
}
## Which y values do func.1 and func.2 use?
func.2(3) ## Check the result.
#> [1] 34
### With lexical scoping, the value of y in func.1 is looked up in the environment in which the function was created, in this case the global environment, so the value of y is 10.
### Inside func.2, the value of y is 2, which is defined in the body of func.2.
Case 2.
func.2 is created in the global environment.
func.1 is created inside of func.2, which means it is created in func.2’s environment.
# Case Study 2
y <- 10 ## A variable defined in global environment not inside of a function.
func.2 <- function(x) { ## Function 2.
y <- 2 ## A variable defined in the environment of func.2 function.
func.1 <- function(x) { ## Function 1.
x*y
}
y^2 + func.1(x)
}
## Which y values do func.1 and func.2 use?
func.2(3) ## Check the result.
#> [1] 10
### This time since func.1 is created inside the func.2, the func.1 will use 2 as the y value.
### Func.2 also uses value 2 for y.
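A compact way to confirm the rule from both cases is that a function looks up free variables where it was created, not where it is called. The names show.y and wrapper are illustrative.

```r
y <- 10
show.y <- function() {
  y  ## y is a free variable here.
}
wrapper <- function() {
  y <- 2    ## Local to wrapper; invisible to show.y.
  show.y()
}
wrapper()  ## 10, the y from the environment where show.y was created.
```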
R
R program.
R are
if, else if and else: Execute the code if a condition is TRUE.
for: Executes a loop for a fixed number of times.
while: Executes a loop while a condition is TRUE.
repeat: Executes an infinite loop.
break: Breaks the execution of a loop.
next: Skips an iteration of a loop.
functions in R: For the details of functions, please see the R Functions section.
return: Exits a function and returns a given value. For the details, please see the My R Functions section.
?Control in R.
R allow the functions and programs to perform different calculations according to the value of a logical object.
In R, conditional statements are performed by using the if, else if, and else statements.
if statement and
else statement
else if statements, and end with else statement.
R evaluates conditional statements in the order they are written.
If the condition is TRUE, then R executes the code just below the conditional statement and ignores the rest of the conditional statements.
If the condition is FALSE, then R skips to the next conditional statement and repeats the previous process.
{}.
# R code chunk is not evaluated.
# Syntax 1
if(conditional.statement) {
## Executes the code if the conditional.statement is TRUE.
}
# Syntax 2
if(conditional.statement) {
## Executes the code if the conditional.statement is TRUE.
} else {
## Executes the code if the conditional.statement is FALSE.
}
# Syntax 3
if(conditional.statement.1) {
## Executes the code if the "conditional.statement.1" is TRUE.
} else if (conditional.statement.2) {
## Executes the code if the "conditional.statement.1" is FALSE but "conditional.statement.2" is TRUE.
} else {
## Executes the code if the "conditional.statement.1" and "conditional.statement.2" are FALSE.
}
# Syntax 4
if(conditional.statement.1) {
## Executes the code if the "conditional.statement.1" is TRUE.
if (conditional.statement.2) {
## Executes the code if "conditional.statement.1" and "conditional.statement.2" are TRUE.
if (conditional.statement.3) {
## Executes the code if "conditional.statement.1", "conditional.statement.2" and "conditional.statement.3" are TRUE.
}
}
}
if statement.
Use the message function to give informational notes to the reader.
Use the warning function to warn the coder about unusual results.
Use the stop function to stop the execution of conditional statements.
# Simple if (single) statement.
x <- -8
if (x < 0) {
print("Input is a negative number.")
}
#> [1] "Input is a negative number."
## Simple if (multiple) statements with message function.
x <- sample(x = c(-1000:1000), size = 1, replace = TRUE, prob = NULL)
if (x < 0) {
print("Input is a negative number.")
message(paste0("Your input is ", x)) ## You can use "message" function to give informational note on the console.
warning("Something unusual is going on.")
}
#> [1] "Input is a negative number."
#> Your input is -998
#> Warning: Something unusual is going on.
if (x > 0) {
print("Input is a positive number.")
message(paste0("Your input is ", x))
warning("Something unusual is going on.")
}
# Simple if (single) statement with stop function.
if (!("$" %in% letters)) { ## Use the "stop" function to end the execution when the condition is TRUE.
stop("Invalid letter.") ## Raises an error with this message.
}
#> Error in eval(expr, envir, enclos): Invalid letter.
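Inside your own functions, stop and warning are commonly used to validate the input. The function safe.sqrt below is a hypothetical sketch of that pattern.

```r
safe.sqrt <- function(x) {
  if (!is.numeric(x)) {
    stop("Input must be numeric.")  ## Ends the function with an error.
  }
  if (x < 0) {
    warning("Negative input; returning NA.")  ## Execution continues after a warning.
    return(NA_real_)
  }
  sqrt(x)
}
safe.sqrt(16)                    ## 4
suppressWarnings(safe.sqrt(-4))  ## NA
```

Calling safe.sqrt("a") stops with the error message instead of returning a value.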
else statement.
# Simple if and else statements.
x <- sample(x = c(-1000:1000), size = 1, replace = TRUE, prob = NULL)
if (x < 0) {
print(paste0("Input, ", x , ", is a negative number."))
} else {
print(paste0("Input, ", x , ", is a non-negative number."))
}
#> [1] "Input, 154, is a non-negative number."
# Simple if and else statement with value assigning inside the conditional statement.
a <- 10
if (a > 3) {
b <- 10 ## Creating a new variable.
} else {
b <- 0
}
b
#> [1] 10
# Simple if and else statement with value assigning.
x <- 50
y <- if (x == 50) {
0
} else {
1
}
y
#> [1] 0
else if statements and ends with an else statement.
# If, else if (single) and else statements.
x <- 5
if (x < 0) {
print("x is a negative number.")
} else if (x == 0) {
print("x is zero.")
} else {
print("x is a positive number.")
}
#> [1] "x is a positive number."
# If, else if (multiple) and else statements.
grade <- 100
if (grade < 70) {
print("Keep studying!!!")
} else if (grade < 80) {
print("Average")
} else if (grade < 90) {
print("Good")
} else if (grade < 100) {
print("Very Good")
} else {
print("Excellent")
}
#> [1] "Excellent"
# If-else statements are nested in if-else statements.
## This conditional statement yields the same answer.
grade <- 100
if (grade < 100) {
if (grade < 90) {
if (grade < 80) {
if (grade < 70) {
print("Keep studying!!!")
} else {
print("Average")
}
} else {
print("Good")
}
} else {
print("Very Good")
}
} else {
print("Excellent")
}
#> [1] "Excellent"
# If statements are nested in if statements.
## This conditional statement yields the same answer.
grade <- 100
if (grade < 100) {
if (grade < 90) {
if (grade < 80) {
if (grade < 70) {
print("Keep studying!!!")
}
if (grade >= 70) {
print("Average")
}
}
if (grade >= 80) {
print("Good")
}
}
if (grade >= 90) {
print("Very Good")
}
} else {
print("Excellent")
}
#> [1] "Excellent"
The if - else statement is not vectorized.
To vectorize if - else statements, you should use the ifelse function.
The ifelse function is a vectorized if statement which checks all the elements of a given R object individually.
If the condition is TRUE for an element, the ifelse function uses the first value and if not uses the second value.
# If else statement for vectorized objects.
x <- 1:10
if (x < 5) { ## In R 4.2.0 and later, a condition of length > 1 is an error rather than a warning.
x <- 0
}
#> Warning in if (x < 5) {: the condition has length > 1 and only the first
#> element will be used
x
#> [1] 0
# ifelse function for vectorized object.
x <- 1:10
y <- ifelse(x < 5, 0, 1)
y
#> [1] 0 0 0 0 1 1 1 1 1 1
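Calls to ifelse can be nested to reproduce a chain of else if statements for a whole vector at once. The sketch below applies the grading rule used earlier to an illustrative vector of grades.

```r
grades <- c(65, 75, 85, 95, 100)
grade.labels <- ifelse(grades < 70, "Keep studying!!!",
                ifelse(grades < 80, "Average",
                ifelse(grades < 90, "Good",
                ifelse(grades < 100, "Very Good", "Excellent"))))
grade.labels  ## One label per grade, in the same order as grades.
```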
In R, you can use three different loops.
for loop to execute the code for a fixed number of times.
while loop to execute the code as long as the tested condition is true.
repeat loop to execute the code for an infinite number of times.
Use the next and break statements to skip some iterations and to terminate the loop.
for loops take a counter variable, assign it successive values of a sequence or vector, and execute the loop for a fixed number of times.
In R, the general syntax of a for loop starts with the for statement and
i in x for some vector (or list) x, where the counter i takes iterative values of x,
{}.
x is a vector of n natural numbers.
i is a dummy variable, and can be called whatever you like, though it retains its value outside the loop.
# R code chunk is not evaluated.
# The general syntax of a for loop
for (counter in sequence) {
## Executes the code for each iteration of counter.
}
# The most common syntax of a for loop
for (i in x) {
## Executes the code for each iteration of counter i in x.
}
The examples below show how for loops work in R.
# Simple for loop.
## This loop takes the variable "i" and for each iteration of the loop 1, 2, 3, ..., 10, are assigned to it. After the last iteration the loop exits.
for (i in 1:10) { ## Counter is "i".
print(i)
}
#> [1] 1
#> [1] 2
#> [1] 3
#> [1] 4
#> [1] 5
#> [1] 6
#> [1] 7
#> [1] 8
#> [1] 9
#> [1] 10
# Simple for loop with a different counter.
x <- c("a", "b", "c")
for (NCSU in 1:length(x)) { ## Counter is "NCSU"
print(x[NCSU])
}
#> [1] "a"
#> [1] "b"
#> [1] "c"
NCSU
#> [1] 3
# Simple for loop with seq_along function.
sample.size <- sample(x = c(5:10), size = 1)
x <- sample(x = c(-10:10), size = sample.size, replace = TRUE, prob = NULL)
for (j in seq_along(x)) { ## Counter is "j"
print(x[j])
}
#> [1] -1
#> [1] 10
#> [1] 10
#> [1] 4
#> [1] 5
j
#> [1] 5
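One reason to prefer seq_along over 1:length(x) is the behavior on empty vectors. The sketch below uses made-up object names (empty, count.colon, count.seq) to count how many times each loop body runs when there is nothing to loop over.

```r
empty <- character(0) # A vector with no elements.

# 1:length(empty) is 1:0, i.e. the sequence 1, 0, so the body runs twice.
count.colon <- 0
for (i in 1:length(empty)) {
  count.colon <- count.colon + 1
}

# seq_along(empty) is integer(0), so the body never runs.
count.seq <- 0
for (i in seq_along(empty)) {
  count.seq <- count.seq + 1
}

c("1:length(x)" = count.colon, "seq_along(x)" = count.seq)
```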
for loops can be nested in other loops.
A deeply nested for loop is often very difficult to read and understand. Thus, be careful with nesting loops.
# Simple nested for loops.
x <- matrix(data = c(1:6), nrow = 2, ncol = 3)
x
#> [,1] [,2] [,3]
#> [1,] 1 3 5
#> [2,] 2 4 6
for (i in 1:nrow(x)) { ## Looping over rows.
for (j in 1:ncol(x)) { ## Looping over columns.
print(x[i, j])
}
}
#> [1] 1
#> [1] 3
#> [1] 5
#> [1] 2
#> [1] 4
#> [1] 6
i ## Number of rows.
#> [1] 2
j ## Number of columns.
#> [1] 3
The next statement can be used to skip an iteration of a loop.
The break statement can be used to terminate any loop.
# Next statement.
for (i in 1:7) {
if (i <= 5) {
next ## Skip the first 5 iterations.
}
print(i)
}
#> [1] 6
#> [1] 7
# Break statement.
for (i in 1:7) {
if (i > 5) {
break ## Terminates the loop on the 6th iteration.
}
print(i)
}
#> [1] 1
#> [1] 2
#> [1] 3
#> [1] 4
#> [1] 5
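The two statements can also be combined in one loop. As a small sketch with made-up names (total is not used elsewhere in this document), the loop below adds up odd numbers and stops before the running total would exceed 50.

```r
total <- 0
for (i in 1:100) {
  if (i %% 2 == 0) {
    next ## Skip even numbers.
  }
  if (total + i > 50) {
    break ## Stop before the total would exceed 50.
  }
  total <- total + i
}
total ## Sum of 1, 3, 5, 7, 9, 11, 13.
i ## The counter keeps the value it had when the loop was terminated.
```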
while loops are used to execute a code body repeatedly as long as the tested condition is true.
In R, the general syntax of a while loop starts with the while statement and a condition in parentheses.
If the condition is TRUE, then it executes the code body, which should be in curly braces {}, and tests the condition again.
# Simple while loop
count <- 0 ## "Count" variable initializes with 0.
while (count < 10) {
print(count)
count <- count + 1 ## Count variable is updated and the loop starts again.
}
#> [1] 0
#> [1] 1
#> [1] 2
#> [1] 3
#> [1] 4
#> [1] 5
#> [1] 6
#> [1] 7
#> [1] 8
#> [1] 9
# Sometimes there will be more than one condition in the test.
x <- 5
while (x >= 3 && x <= 10) { ## "&&" is the scalar logical AND, which suits a single condition.
print(x)
coin <- rbinom(n = 1, size = 1, prob = 0.5) ## Flips a fair coin. 0 means fail, 1 means success.
if (coin == 1) { ## random walk
x <- x + 1
} else {
x <- x - 1
}
}
#> [1] 5
#> [1] 6
#> [1] 7
#> [1] 8
#> [1] 9
#> [1] 10
#> [1] 9
#> [1] 10
#> [1] 9
#> [1] 10
#> [1] 9
#> [1] 10
#> [1] 9
#> [1] 10
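Because the random walk above depends on random coin flips, its output changes from run to run. The sketch below wraps the same idea in a function and fixes the seed so that the path can be reproduced. The function name random.walk, its arguments, and the max.steps safeguard are assumptions made for this illustration.

```r
# A hypothetical helper: walks until x leaves [lower, upper] or max.steps is reached.
random.walk <- function(start = 5, lower = 3, upper = 10, max.steps = 1000) {
  x <- start
  path <- x ## Stores every visited value, including the starting one.
  steps <- 0
  while (x >= lower && x <= upper && steps < max.steps) {
    coin <- rbinom(n = 1, size = 1, prob = 0.5) ## Flips a fair coin.
    if (coin == 1) {
      x <- x + 1
    } else {
      x <- x - 1
    }
    path <- c(path, x)
    steps <- steps + 1
  }
  path
}

set.seed(123) ## Same seed, same sequence of coin flips.
walk.1 <- random.walk()
set.seed(123)
walk.2 <- random.walk()
identical(walk.1, walk.2)
```

Unlike the loop above, the returned path also contains the final value that left the interval, since it is stored before the condition is tested again.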
repeat loops are used to execute a code body indefinitely.
A break statement is needed to terminate a repeat loop.
# Simple repeat loop.
x <- 1 ## Initial value.
repeat {
print(x)
if (x == 6) {
break ## If the condition is TRUE, then stop.
} else {
x <- x + 1 ## If the condition is FALSE, then run this code.
}
}
#> [1] 1
#> [1] 2
#> [1] 3
#> [1] 4
#> [1] 5
#> [1] 6
repeat loops are dangerous since there is no guarantee that they will stop.
Use repeat loops with care.
# R code chunk is not evaluated.
# Simple repeat loop which does not stop.
x <- 1 ## Initial value.
repeat {
print(x)
if (x > Inf) { ## No number is greater than Inf, so this condition is never TRUE.
break ## If the condition is TRUE, then stop.
} else {
x <- x + 1 ## If the condition is FALSE, then run this code.
}
}
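One common way to make a repeat loop safer is to add an iteration cap as a second stopping rule. The following is a minimal sketch of that idea; the names iterations and max.iterations and the doubling rule are chosen only for this example.

```r
x <- 1 ## Initial value.
iterations <- 0
max.iterations <- 1000 ## Safety cap: the loop cannot run longer than this.
repeat {
  iterations <- iterations + 1
  if (x > 1e6 || iterations >= max.iterations) {
    break ## Stop when the target is reached or the cap is hit.
  }
  x <- x * 2
}
x
iterations
```

Here the loop stops by itself once x exceeds one million, so the cap is never reached; it only matters if the main condition turns out to be unreachable.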
for loops are primarily useful for writing programs but are not particularly convenient when working interactively on the command line (coding in the console).
R provides several functions that implement looping in a compact form.
lapply: Loops over a list and evaluates a function on each element.
sapply: Same as lapply but tries to simplify the result.
apply: Applies a function over the margins of an array.
tapply: Applies a function over subsets of a vector.
mapply: Multivariate version of sapply.
lapply takes three arguments: X (a list), FUN (the name of the function), and ... (other arguments passed on to FUN).
If X is not a list, it will be coerced to a list using the as.list function.
lapply always returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
split is also useful, particularly in conjunction with lapply.
# lapply function with a list.
x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))
lapply(X = x, FUN = mean) ## Gives the mean of each element on a list.
#> $a
#> [1] 2.5
#>
#> $b
#> [1] -0.31572065
#>
#> $c
#> [1] 1.0130227
#>
#> $d
#> [1] 5.0642876
# lapply function with a numeric vector
x <- c(1:4)
lapply(x, runif) ## "runif" generates uniform random variables. Its first argument is the number of values to generate, so lapply returns runif(1), runif(2), and so on. "runif" has other arguments, but they keep their default values here: uniform between 0 and 1.
#> [[1]]
#> [1] 0.46511458
#>
#> [[2]]
#> [1] 0.50077008 0.97716812
#>
#> [[3]]
#> [1] 0.038445408 0.202400500 0.308886211
#>
#> [[4]]
#> [1] 0.73275655 0.31960197 0.64804636 0.66449284
# lapply function with a numeric vector and passing extra arguments on to the applied function.
x <- c(1:4)
lapply(x, runif, min = 0, max = 10) ## "min" and "max" arguments are passed on to the "runif" function.
#> [[1]]
#> [1] 1.8824985
#>
#> [[2]]
#> [1] 6.1161249 7.9408740
#>
#> [[3]]
#> [1] 9.25222353 0.53183536 7.33753542
#>
#> [[4]]
#> [1] 5.3858203 5.6780073 6.7352234 1.6439398
Anonymous functions are often used with lapply.
# lapply function with an anonymous function.
x <- list(a = matrix(1:8, 4, 2), b = matrix(1:12, 3, 4))
lapply(x, function(m) m[, 1]) ## An anonymous function for extracting the first column of each matrix. The function has no name and is defined only inside the lapply call, so it goes away after lapply is finished.
#> $a
#> [1] 1 2 3 4
#>
#> $b
#> [1] 1 2 3
The split function takes a vector or other object and splits it into groups determined by a factor or list of factors.
The split function is not a loop function, but it is very useful in conjunction with loop functions.
Using the split and lapply functions together does the same thing as the tapply function does. We will see the details of the tapply function later.
# Using split and lapply functions together.
x <- c(rnorm(5), runif(5), rnorm(5, 1))
f <- gl(n = 3, k = 5) ## Generates factor levels.
split(x, f)
#> $`1`
#> [1] 1.264886486 -0.030771014 0.879678098 -1.785263360 -1.334145395
#>
#> $`2`
#> [1] 0.67387138 0.88993253 0.43541792 0.20668692 0.77929156
#>
#> $`3`
#> [1] 1.34543996 1.54950796 1.08142072 -0.52866736 0.54534641
lapply(split(x, f), mean)
#> $`1`
#> [1] -0.20112304
#>
#> $`2`
#> [1] 0.59704006
#>
#> $`3`
#> [1] 0.79860954
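sapply tries to simplify the result of lapply if possible.
If the result is a list where every element has length 1, then a vector is returned.
If the result is a list where every element is a vector of the same length (> 1), a matrix is returned.
Otherwise, a list is returned.
# sapply function with a list.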
x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))
lapply(X = x, FUN = mean) ## List format.
#> $a
#> [1] 2.5
#>
#> $b
#> [1] -0.39509565
#>
#> $c
#> [1] 1.0345496
#>
#> $d
#> [1] 4.9814706
sapply(X = x, FUN = mean, simplify = FALSE) ## Same as lapply.
#> $a
#> [1] 2.5
#>
#> $b
#> [1] -0.39509565
#>
#> $c
#> [1] 1.0345496
#>
#> $d
#> [1] 4.9814706
sapply(X = x, FUN = mean) ## Vector format.
#> a b c d
#> 2.50000000 -0.39509565 1.03454962 4.98147062
mean(x) ## Note that mean function cannot handle list objects.
#> Warning in mean.default(x): argument is not numeric or logical: returning
#> NA
#> [1] NA
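Since the shape of the sapply result depends on what FUN returns, the base R function vapply can be used when the result type should be fixed in advance. The sketch below uses a small made-up list; the object names are chosen only for this illustration.

```r
x <- list(a = 1:4, b = c(2, 4, 6), c = 10)

# FUN.VALUE declares that every call must return exactly one number.
means <- vapply(X = x, FUN = mean, FUN.VALUE = numeric(1))
means

# With an empty input, sapply falls back to an empty list,
# while vapply still returns a vector of the declared type.
empty.sapply <- sapply(list(), length)
empty.vapply <- vapply(list(), length, FUN.VALUE = integer(1))
class(empty.sapply)
class(empty.vapply)
```

If a call returned something other than a single number, vapply would stop with an error instead of silently changing the shape of the output.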
apply is used to evaluate a function, often an anonymous one, over the margins of an array X.
If X is not an array but has a dimension attribute, apply attempts to coerce it to an array via the as.matrix function if it is two-dimensional (e.g., data frames) or via the as.array function.
MARGIN is an integer vector indicating which margins should be retained.
FUN is the function to be applied.
As with lapply and sapply, you can pass further arguments on to FUN.
apply can return different outputs.
If each call to FUN returns a vector of length n and if n > 1, then apply returns an array of dimension c(n, dim(X)[MARGIN]).
If n equals 1, apply returns a vector if MARGIN has length 1, and an array of dimension dim(X)[MARGIN] otherwise.
# apply function on a matrix which returns a vector.
x <- matrix(rnorm(200), 20, 10)
y <- apply(X = x, MARGIN = 2, FUN = mean) ## Means of columns.
y
#> [1] 0.1281966691 0.0547078269 -0.1353001075 -0.0749594454 0.3316059745
#> [6] 0.0032045774 -0.2222750870 0.3328635026 0.1081437661 -0.0403008476
class(y)
#> [1] "numeric"
str(y)
#> num [1:10] 0.1282 0.0547 -0.1353 -0.075 0.3316 ...
apply(x, 1, sum) ## Calculates the sum of each row.
#> [1] 1.876708309 -1.274228306 3.320315421 3.311382123 3.733855170
#> [6] 0.500747121 0.813166962 -6.339688707 2.166718702 -0.044351288
#> [11] 3.412550506 -1.443729958 2.260549728 -2.354827026 1.356863014
#> [16] -1.100915952 0.640174366 -1.862292631 0.745788668 -0.001049639
# apply function on a matrix which returns an array.
x <- matrix(rnorm(200), 20, 10)
y <- apply(X = x, MARGIN = 2, FUN = quantile, probs = c(0.25, 0.75)) ## Gives the first and the third quartiles of each column.
y
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> 25% -0.19656913 -0.57894620 -0.58522113 -0.14105238 -0.5541239 -0.74813513
#> 75% 0.90453170 0.53543041 0.64705173 0.85289385 1.2061409 0.48785013
#> [,7] [,8] [,9] [,10]
#> 25% -0.038155669 -1.1057168 -0.26771426 -0.60151437
#> 75% 0.774704143 1.0904228 0.75291972 0.35112934
class(y)
#> [1] "matrix"
str(y)
#> num [1:2, 1:10] -0.197 0.905 -0.579 0.535 -0.585 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:2] "25%" "75%"
#> ..$ : NULL
# apply function on an array which returns a matrix.
## Gives averages of an array in a matrix format.
x <- array(rnorm(2 * 2 * 10), c(2, 2, 10)) ## This array has 3 dimensions: 2 rows, 2 columns, and a 3rd dimension of size 10.
apply(x, c(1, 2), mean) ## Takes the mean over the 3rd dimension while keeping the 1st and 2nd dimensions. In other words, the 3rd dimension is collapsed, so the result is a 2x2 matrix of means.
#> [,1] [,2]
#> [1,] -0.025901501 -0.23329858
#> [2,] -0.015702859 0.45235361
rowMeans(x, dims = 2) ## Gives the same result as above. "dims = 2" means that the first 2 dimensions are preserved and the means are taken over the remaining one.
#> [,1] [,2]
#> [1,] -0.025901501 -0.23329858
#> [2,] -0.015702859 0.45235361
# apply function on an array which returns an array.
x <- array(rnorm(2 * 2 * 10), c(2, 2, 10))
apply(x, c(2, 3), mean) ## Means for the 2nd and the 3rd dimensions.
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.032033819 -0.22886939 1.15934540 -1.05876132 0.32213331
#> [2,] 0.076930345 0.29861014 0.39098475 0.55558119 -0.41934945
#> [,6] [,7] [,8] [,9] [,10]
#> [1,] 0.42438482 1.0297013 0.41704257 -0.088439608 -0.0419970
#> [2,] -0.97151164 2.8643074 -0.20719924 0.879466661 1.4425146
x <- matrix(rnorm(15), 3, 5)
apply(x, 1, sum) ## Same as "rowSums(x)" function.
#> [1] 1.15104386 0.52906972 1.09875316
rowSums(x)
#> [1] 1.15104386 0.52906972 1.09875316
apply(x, 1, mean) ## Same as "rowMeans(x)" function.
#> [1] 0.23020877 0.10581394 0.21975063
rowMeans(x)
#> [1] 0.23020877 0.10581394 0.21975063
apply(x, 2, sum) ## Same as "colSums(x)" function.
#> [1] 0.35154985 1.84621859 -0.74730656 0.61891238 0.70949248
colSums(x)
#> [1] 0.35154985 1.84621859 -0.74730656 0.61891238 0.70949248
apply(x, 2, mean) ## Same as "colMeans(x)" function.
#> [1] 0.11718328 0.61540620 -0.24910219 0.20630413 0.23649749
colMeans(x)
#> [1] 0.11718328 0.61540620 -0.24910219 0.20630413 0.23649749
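The description above mentions anonymous functions and the c(n, dim(X)[MARGIN]) rule, which can be checked on a small deterministic matrix. This is only a sketch; the names row.spread and row.ranges are made up for this example.

```r
x <- matrix(1:6, nrow = 2, ncol = 3)
x

# An anonymous function that returns one value per row: the result is a vector.
row.spread <- apply(x, 1, function(r) max(r) - min(r))
row.spread

# range returns n = 2 values per row, so the result has dimension c(2, dim(x)[1]).
# Note that each row of x becomes a column of the result.
row.ranges <- apply(x, 1, range)
row.ranges
dim(row.ranges)
```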
tapply is used to apply a function over subsets of a vector X, where each subset is given by a unique combination of the levels of certain factors.
X is typically a vector.
INDEX is a list of factors, each of the same length as X. Its elements are coerced to factors by the as.factor function.
FUN is the function to be applied.
If simplify is FALSE, then tapply returns a list; otherwise it returns an array whenever FUN returns a single value for each subset.
# tapply function (simple).
x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(n = 3, k = 10) ## Generates factor levels.
tapply(X = x, INDEX = f, FUN = mean) ## A factor level is assigned to each value in x in order.
#> 1 2 3
#> 0.096512364 0.492764916 1.358479278
tapply(x, f, range) ## Gives the min and max within the subset of x.
#> $`1`
#> [1] -1.4331798 1.0072600
#>
#> $`2`
#> [1] 0.0038729776 0.9113824219
#>
#> $`3`
#> [1] -0.88013112 2.81199630
tapply(x, f, mean, simplify = FALSE) ## The result is in a list.
#> $`1`
#> [1] 0.096512364
#>
#> $`2`
#> [1] 0.49276492
#>
#> $`3`
#> [1] 1.3584793
lapply(split(x, f), mean) ## Same as above.
#> $`1`
#> [1] 0.096512364
#>
#> $`2`
#> [1] 0.49276492
#>
#> $`3`
#> [1] 1.3584793
# tapply function (complex).
x <- c(rnorm(5), rnorm(5, 1), rnorm(5, 2), rnorm(5, 3)) ## Our values.
f1 <- factor(rep(1:2, each = 10)) ## First factor.
f2 <- factor(rep(rep(3:4, each = 5), times = 2)) ## Second factor.
f <- list(f1, f2) ## List of factors.
tapply(X = x, INDEX = f, FUN = mean)
#> 3 4
#> 1 0.13168967 0.97117196
#> 2 2.07530546 3.21447994
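Because the examples above use random numbers, the cell values are hard to verify by hand. The sketch below repeats the two-factor case with small made-up data (sales, region and year are hypothetical names) so that each cell can be checked directly.

```r
sales <- c(10, 20, 30, 40, 50, 60)
region <- factor(c("east", "east", "west", "west", "east", "west"))
year <- factor(c(2016, 2017, 2016, 2017, 2016, 2017))

# One cell per combination of region and year.
totals <- tapply(X = sales, INDEX = list(region, year), FUN = sum)
totals
totals["east", "2016"] ## 10 + 50
totals["west", "2017"] ## 40 + 60
```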
mapply is a multivariate version of the sapply function.
FUN is the function to apply.
... contains the arguments to apply over.
MoreArgs is a list of other arguments to FUN.
SIMPLIFY indicates whether the result should be simplified.
mapply applies FUN to the first elements of each ... argument, the second elements, the third elements, and so on. Arguments are recycled if necessary.
# mapply function (simple).
list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1)) ## The mapply call further below builds the same list.
#> [[1]]
#> [1] 1 1 1 1
#>
#> [[2]]
#> [1] 2 2 2
#>
#> [[3]]
#> [1] 3 3
#>
#> [[4]]
#> [1] 4
mapply(rep, 1:4, 4:1) ## mapply function takes the arguments in order: rep(1, 4), rep(2, 3), and so on.
#> [[1]]
#> [1] 1 1 1 1
#>
#> [[2]]
#> [1] 2 2 2
#>
#> [[3]]
#> [1] 3 3
#>
#> [[4]]
#> [1] 4
# mapply function (complex).
noise <- function(n, mean, sd) { ## A function for n, mean and sd.
rnorm(n, mean, sd)
}
noise(5, 1, 2)
#> [1] 5.77172396 6.75199012 0.33407791 1.98091386 5.99155326
noise(1:5, 1:5, 2) ## This does not give one sample per (n, mean) pair. rnorm treats a vector n as a request for length(n) values, so only 5 numbers are returned.
#> [1] -1.38768901 1.96256096 -0.68194949 5.54666240 4.82433828
mapply(noise, 1:5, 1:5, 2) ## With mapply, it will be vectorized.
#> [[1]]
#> [1] -0.045964864
#>
#> [[2]]
#> [1] 4.0227480 1.5404014
#>
#> [[3]]
#> [1] 2.27326866 3.36000422 0.53698866
#>
#> [[4]]
#> [1] -0.76509722 2.36799894 -0.33937480 3.08212633
#>
#> [[5]]
#> [1] 5.1811054 7.4741680 6.3593198 6.2523982 7.9988633
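The MoreArgs and SIMPLIFY arguments described above are not used in the examples so far. The following sketch shows them on deterministic inputs; the object names are made up for this illustration.

```r
# Element-wise sums: every call returns one value, so the result is simplified to a vector.
sums <- mapply(function(x, y) x + y, 1:3, 4:6)
sums

# MoreArgs holds arguments that stay the same in every call: here x = "a" for rep.
repeated <- mapply(rep, times = 1:3, MoreArgs = list(x = "a"))
repeated

# SIMPLIFY = FALSE always keeps the result as a list.
kept.as.list <- mapply(function(x, y) x + y, 1:3, 4:6, SIMPLIFY = FALSE)
class(kept.as.list)
```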
R functions ease our job in coding.
Suppose that, for a given income, you want to calculate the income tax and net income (income after tax) for single family households.
Suppose also that you want to do this for all employees (n = 10) of a company.
n <- 10 ## Number of employees.
all.incomes <- sample(x = c(0:(5*10^5)), size = n, replace = FALSE, prob = NULL) ## Random sample for income.
all.incomes ## Income values.
#> [1] 193763 295560 496736 334982 157274 394491 494968 207406 45176 55453
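The code below calculates these measures for one employee. For simplicity, the whole income is taxed at the single rate of the bracket it falls into. To pick a different employee, change the index in the all.incomes object below.
# Calculating for one employee.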
income <- all.incomes[1] ## Income for the first employee.
if (income <= 0) {
tax <- 0
} else if (income <= 9325) {
tax <- income * 0.1
} else if (income <= 37950) {
tax <- income * 0.15
} else if (income <= 91900) {
tax <- income * 0.25
} else if (income <= 191650) {
tax <- income * 0.28
} else if (income <= 416700) {
tax <- income * 0.33
} else if (income <= 418400) {
tax <- income * 0.35
} else {
tax <- income * 0.396
}
c("Income" = income, "Tax" = tax, "Net Income" = income - tax)
#> Income Tax Net Income
#> 193763.00 63941.79 129821.21
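The same calculation can be done with a for loop over a list of tax brackets. As before, change the index in the all.incomes object below to pick a different employee.
# Calculating for one employee.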
tax.brackets <- list(c(0, 0.1), c(9325, 0.15), c(37950, 0.25), c(91900, 0.28), c(191650, 0.33), c(416700, 0.35), c(418400, 0.396)) ## Tax brackets in a list.
income <- all.incomes[1]
for (i in 1:length(tax.brackets)) {
if (tax.brackets[[i]][1] < income) {
tax.rate <- tax.brackets[[i]][2]
tax <- income * tax.rate
}
}
c("Income" = income, "Tax Rate" = tax.rate, "Tax" = tax, "Net Income" = income - tax)
#> Income Tax Rate Tax Net Income
#> 193763.00 0.33 63941.79 129821.21
# Calculating for all employees.
tax.brackets <- list(c(0, 0.1), c(9325, 0.15), c(37950, 0.25), c(91900, 0.28), c(191650, 0.33), c(416700, 0.35), c(418400, 0.396)) ## Tax brackets in a list.
for (j in 1:length(all.incomes)) {
income <- all.incomes[j]
for (i in 1:length(tax.brackets)) {
if (tax.brackets[[i]][1] < income) {
tax.rate <- tax.brackets[[i]][2]
tax <- income * tax.rate
}
}
if (j == 1) {
results <- c(income, tax.rate, tax, income - tax)
} else {
temp <- c(income, tax.rate, tax, income - tax)
results <- rbind(results, temp)
}
}
results <- as.data.frame(results, row.names = paste("Employee", " ", 1:length(all.incomes)), stringsAsFactors = FALSE)
colnames(results) <- c("Income", "Tax Rate", "Tax", "Net Income")
results
Finally, we can write an R function which calculates these measures with any pre-specified tax brackets for all employees.
# Calculating with any pre-specified tax brackets for all employees.
## Pre-specified tax brackets in a list.
tax.brackets <- list(c(0, 0.1), c(9325, 0.15), c(37950, 0.25), c(91900, 0.28), c(191650, 0.33), c(416700, 0.35), c(418400, 0.396))
## Function with Income and Tax.Brackets options.
tax.func <- function(Income, Tax.Brackets) {
  ## Preallocate one row per employee, so that a single income also gives a 1 x 4 table.
  results <- matrix(NA_real_, nrow = length(Income), ncol = 4)
  for (j in seq_along(Income)) {
    tax.rate <- 0 ## Default when no bracket applies (e.g., zero income).
    for (i in seq_along(Tax.Brackets)) {
      if (Tax.Brackets[[i]][1] < Income[j]) {
        tax.rate <- Tax.Brackets[[i]][2] ## Keeps the rate of the highest bracket reached.
      }
    }
    tax <- Income[j] * tax.rate
    results[j, ] <- c(Income[j], tax.rate, tax, Income[j] - tax)
  }
  results <- as.data.frame(results, row.names = paste("Employee", seq_along(Income)))
  colnames(results) <- c("Income", "Tax Rate", "Tax", "Net Income")
  return(results)
}
tax.func(Income = all.incomes, Tax.Brackets = tax.brackets)
# Calculating with any pre-specified tax brackets for all employees.
## Pre-specified tax brackets in a list.
tax.brackets <- list(c(0, 0.1), c(50000, 0.25), c(100000, 0.30), c(150000, 0.35), c(200000, 0.40), c(300000, 0.45), c(400000, 0.50))
tax.func(Income = all.incomes, Tax.Brackets = tax.brackets)
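The same bracket lookup can also be written without explicit loops. The sketch below is one possible vectorized variant, not the method used above: the name tax.func.vectorized and the example.incomes vector are made up here, and the brackets are assumed to be sorted by their lower limits, as in the lists above.

```r
## Tax brackets in the same list format as above: c(lower limit, rate).
tax.brackets <- list(c(0, 0.1), c(9325, 0.15), c(37950, 0.25), c(91900, 0.28),
                     c(191650, 0.33), c(416700, 0.35), c(418400, 0.396))

## Hypothetical vectorized variant of the calculation.
tax.func.vectorized <- function(Income, Tax.Brackets) {
  lower <- sapply(Tax.Brackets, function(b) b[1]) ## Lower limit of each bracket.
  rates <- sapply(Tax.Brackets, function(b) b[2]) ## Rate of each bracket.
  ## For each income, count how many lower limits lie strictly below it.
  idx <- rowSums(outer(Income, lower, ">"))
  tax.rate <- c(0, rates)[idx + 1] ## A count of 0 (e.g., zero income) maps to a rate of 0.
  tax <- Income * tax.rate
  data.frame("Income" = Income, "Tax Rate" = tax.rate, "Tax" = tax,
             "Net Income" = Income - tax, check.names = FALSE)
}

example.incomes <- c(0, 9325, 9326, 193763, 500000)
vectorized.results <- tax.func.vectorized(Income = example.incomes, Tax.Brackets = tax.brackets)
vectorized.results
```

The strict comparison mirrors the `<` test in the loops above, so an income exactly equal to a lower limit stays in the bracket below it.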
Reproducible research refers to the idea that the ultimate product of academic research can be recreated by an independent investigator using the full computational environment utilized to produce the results in the paper, such as the original code and the original data.
The goal of reproducible research is to tie specific instructions to data analysis and experimental data so that the study can be recreated, better understood, and verified.
Reproducibility is important because it is the only thing that an investigator can guarantee about a study.
Although people use Replication and Reproducibility interchangeably, there is a distinction between them in the context of scientific verification.
Replication is done by independent people using new data and even new code.
Reproducibility is done by independent people using the same data, code, and computational environment.
There are R packages related to reproducible research techniques.
R Markdown allows you to create documents (PDF, beamer slides, markdown, and HTML) that serve as a neat record of your text and code together with its output (graphs, tables, etc.).
R Markdown is a wonderful tool for reproducible research.
To compile R Markdown files, knitr is necessary, which is an engine for dynamic report generation with R.
For more information about knitr, see the knitr page prepared by Yihui Xie.
R Markdown and knitr come pre-installed with RStudio, so there is no need for further action.
If you want to use LaTeX to generate reports in PDF via R Markdown, it is better to install the MacTeX distribution for Mac and the MiKTeX distribution for PC.
It is good practice to report the versions of R and RStudio that you used.
The sessionInfo() function provides this information. Even better is to install the devtools package and use devtools::session_info().
R.version.string ## Returns the R version in a string.
#> [1] "R version 3.3.3 (2017-03-06)"
sessionInfo() ## From utils package.
#> R version 3.3.3 (2017-03-06)
#> Platform: x86_64-apple-darwin13.4.0 (64-bit)
#> Running under: OS X Mavericks 10.9.5
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] gvlma_1.0.0.2 psych_1.7.8 pastecs_1.3-18
#> [4] boot_1.3-20 gapminder_0.3.0 car_2.1-6
#> [7] leaflet_1.1.0 spdep_0.7-4 spData_0.2.6.7
#> [10] Matrix_1.2-11 sp_1.2-5 zoo_1.8-0
#> [13] NCmisc_1.1.5 magrittr_1.5 rvest_0.3.2
#> [16] xml2_1.1.1 lubridate_1.7.1 dygraphs_1.1.1.4
#> [19] plotly_4.7.1 ggplot2_2.2.1 DT_0.2
#> [22] tibble_1.3.4 kableExtra_0.6.1 stargazer_5.2
#> [25] xtable_1.8-2 stringr_1.2.0 XLConnect_0.2-13
#> [28] XLConnectJars_0.2-13 ctv_0.8-3 knitr_1.17
#> [31] rmarkdown_1.8 devtools_1.13.4 openssl_0.9.9
#> [34] checkpoint_0.4.3
#>
#> loaded via a namespace (and not attached):
#> [1] nlme_3.1-131 pbkrtest_0.4-7 gmodels_2.16.2
#> [4] httr_1.3.1 rprojroot_1.2 tools_3.3.3
#> [7] backports_1.1.1 R6_2.2.2 lazyeval_0.2.1
#> [10] mgcv_1.8-17 colorspace_1.3-2 nnet_7.3-12
#> [13] withr_2.1.0 mnormt_1.5-5 quantreg_5.34
#> [16] SparseM_1.77 expm_0.999-2 scales_0.5.0
#> [19] readr_1.1.1 digest_0.6.12 foreign_0.8-69
#> [22] minqa_1.2.4 pkgconfig_2.0.1 htmltools_0.3.6
#> [25] lme4_1.1-14 highr_0.6 htmlwidgets_0.9
#> [28] rlang_0.1.4 rstudioapi_0.7 shiny_1.0.5
#> [31] bindr_0.1 jsonlite_1.5 crosstalk_1.0.0
#> [34] gtools_3.5.0 dplyr_0.7.4 Rcpp_0.12.14
#> [37] munsell_0.4.3 stringi_1.1.6 yaml_2.1.16
#> [40] MASS_7.3-47 plyr_1.8.4 grid_3.3.3
#> [43] parallel_3.3.3 gdata_2.18.0 deldir_0.1-14
#> [46] lattice_0.20-34 splines_3.3.3 hms_0.4.0
#> [49] LearnBayes_2.15 glue_1.2.0 evaluate_0.10.1
#> [52] data.table_1.10.4-3 nloptr_1.0.4 httpuv_1.3.5
#> [55] MatrixModels_0.4-1 gtable_0.2.0 purrr_0.2.4
#> [58] tidyr_0.7.2 assertthat_0.2.0 mime_0.5
#> [61] coda_0.19-1 viridisLite_0.2.0 rJava_0.9-9
#> [64] proftools_0.99-2 memoise_1.1.0 bindrcpp_0.2
devtools::session_info() ## From devtools package.
#> Session info -------------------------------------------------------------
#> setting value
#> version R version 3.3.3 (2017-03-06)
#> system x86_64, darwin13.4.0
#> ui RStudio (1.1.419)
#> language (EN)
#> collate en_US.UTF-8
#> tz America/New_York
#> date 2018-02-06
#> Packages -----------------------------------------------------------------
#> package * version date source
#> assertthat 0.2.0 2017-04-11 CRAN (R 3.3.2)
#> backports 1.1.1 2017-09-25 CRAN (R 3.3.2)
#> base * 3.3.3 2017-03-07 local
#> bindr 0.1 2016-11-13 CRAN (R 3.3.2)
#> bindrcpp 0.2 2017-06-17 CRAN (R 3.3.2)
#> boot * 1.3-20 2017-07-30 CRAN (R 3.3.2)
#> car * 2.1-6 2017-11-19 CRAN (R 3.3.2)
#> checkpoint * 0.4.3 2017-12-19 CRAN (R 3.3.2)
#> coda 0.19-1 2016-12-08 CRAN (R 3.3.2)
#> colorspace 1.3-2 2016-12-14 CRAN (R 3.3.2)
#> crosstalk 1.0.0 2016-12-21 CRAN (R 3.3.2)
#> ctv * 0.8-3 2017-10-07 CRAN (R 3.3.2)
#> data.table 1.10.4-3 2017-10-27 CRAN (R 3.3.2)
#> datasets * 3.3.3 2017-03-07 local
#> deldir 0.1-14 2017-04-22 CRAN (R 3.3.2)
#> devtools * 1.13.4 2017-11-09 CRAN (R 3.3.2)
#> digest 0.6.12 2017-01-27 CRAN (R 3.3.2)
#> dplyr 0.7.4 2017-09-28 CRAN (R 3.3.2)
#> DT * 0.2 2016-08-09 CRAN (R 3.3.0)
#> dygraphs * 1.1.1.4 2017-01-04 CRAN (R 3.3.2)
#> evaluate 0.10.1 2017-06-24 CRAN (R 3.3.2)
#> expm 0.999-2 2017-03-29 CRAN (R 3.3.2)
#> foreign 0.8-69 2017-06-21 CRAN (R 3.3.2)
#> gapminder * 0.3.0 2017-10-31 CRAN (R 3.3.2)
#> gdata 2.18.0 2017-06-06 CRAN (R 3.3.2)
#> ggplot2 * 2.2.1 2016-12-30 CRAN (R 3.3.2)
#> glue 1.2.0 2017-10-29 CRAN (R 3.3.2)
#> gmodels 2.16.2 2015-07-22 CRAN (R 3.3.0)
#> graphics * 3.3.3 2017-03-07 local
#> grDevices * 3.3.3 2017-03-07 local
#> grid 3.3.3 2017-03-07 local
#> gtable 0.2.0 2016-02-26 CRAN (R 3.3.0)
#> gtools 3.5.0 2015-05-29 CRAN (R 3.3.0)
#> gvlma * 1.0.0.2 2014-01-21 CRAN (R 3.3.0)
#> highr 0.6 2016-05-09 CRAN (R 3.3.0)
#> hms 0.4.0 2017-11-23 CRAN (R 3.3.2)
#> htmltools 0.3.6 2017-04-28 CRAN (R 3.3.2)
#> htmlwidgets 0.9 2017-07-10 CRAN (R 3.3.2)
#> httpuv 1.3.5 2017-07-04 CRAN (R 3.3.2)
#> httr 1.3.1 2017-08-20 CRAN (R 3.3.2)
#> jsonlite 1.5 2017-06-01 CRAN (R 3.3.2)
#> kableExtra * 0.6.1 2017-11-01 CRAN (R 3.3.2)
#> knitr * 1.17 2017-08-10 CRAN (R 3.3.2)
#> lattice 0.20-34 2016-09-06 CRAN (R 3.3.3)
#> lazyeval 0.2.1 2017-10-29 CRAN (R 3.3.2)
#> leaflet * 1.1.0 2017-02-21 CRAN (R 3.3.2)
#> LearnBayes 2.15 2014-05-29 CRAN (R 3.3.0)
#> lme4 1.1-14 2017-09-27 CRAN (R 3.3.2)
#> lubridate * 1.7.1 2017-11-03 CRAN (R 3.3.2)
#> magrittr * 1.5 2014-11-22 CRAN (R 3.3.0)
#> MASS 7.3-47 2017-04-21 CRAN (R 3.3.2)
#> Matrix * 1.2-11 2017-08-16 CRAN (R 3.3.2)
#> MatrixModels 0.4-1 2015-08-22 CRAN (R 3.3.0)
#> memoise 1.1.0 2017-04-21 CRAN (R 3.3.2)
#> methods * 3.3.3 2017-03-07 local
#> mgcv 1.8-17 2017-02-08 CRAN (R 3.3.3)
#> mime 0.5 2016-07-07 CRAN (R 3.3.0)
#> minqa 1.2.4 2014-10-09 CRAN (R 3.3.0)
#> mnormt 1.5-5 2016-10-15 CRAN (R 3.3.0)
#> munsell 0.4.3 2016-02-13 CRAN (R 3.3.0)
#> NCmisc * 1.1.5 2017-01-03 CRAN (R 3.3.2)
#> nlme 3.1-131 2017-02-06 CRAN (R 3.3.3)
#> nloptr 1.0.4 2014-08-04 CRAN (R 3.3.0)
#> nnet 7.3-12 2016-02-02 CRAN (R 3.3.3)
#> openssl * 0.9.9 2017-11-10 CRAN (R 3.3.2)
#> parallel 3.3.3 2017-03-07 local
#> pastecs * 1.3-18 2014-03-02 CRAN (R 3.3.0)
#> pbkrtest 0.4-7 2017-03-15 CRAN (R 3.3.2)
#> pkgconfig 2.0.1 2017-03-21 CRAN (R 3.3.2)
#> plotly * 4.7.1 2017-07-29 CRAN (R 3.3.2)
#> plyr 1.8.4 2016-06-08 CRAN (R 3.3.0)
#> proftools 0.99-2 2016-01-13 CRAN (R 3.3.0)
#> psych * 1.7.8 2017-09-09 CRAN (R 3.3.3)
#> purrr 0.2.4 2017-10-18 CRAN (R 3.3.2)
#> quantreg 5.34 2017-10-25 CRAN (R 3.3.2)
#> R6 2.2.2 2017-06-17 CRAN (R 3.3.2)
#> Rcpp 0.12.14 2017-11-23 CRAN (R 3.3.2)
#> readr 1.1.1 2017-05-16 CRAN (R 3.3.2)
#> rJava 0.9-9 2017-10-12 CRAN (R 3.3.2)
#> rlang 0.1.4 2017-11-05 CRAN (R 3.3.2)
#> rmarkdown * 1.8 2017-11-17 CRAN (R 3.3.2)
#> rprojroot 1.2 2017-01-16 CRAN (R 3.3.2)
#> rstudioapi 0.7 2017-09-07 CRAN (R 3.3.2)
#> rvest * 0.3.2 2016-06-17 CRAN (R 3.3.0)
#> scales 0.5.0 2017-08-24 CRAN (R 3.3.2)
#> shiny 1.0.5 2017-08-23 CRAN (R 3.3.2)
#> sp * 1.2-5 2017-06-29 CRAN (R 3.3.2)
#> SparseM 1.77 2017-04-23 CRAN (R 3.3.2)
#> spData * 0.2.6.7 2017-11-28 CRAN (R 3.3.2)
#> spdep * 0.7-4 2017-11-22 CRAN (R 3.3.2)
#> splines 3.3.3 2017-03-07 local
#> stargazer * 5.2 2015-07-14 CRAN (R 3.3.0)
#> stats * 3.3.3 2017-03-07 local
#> stringi 1.1.6 2017-11-17 CRAN (R 3.3.2)
#> stringr * 1.2.0 2017-02-18 CRAN (R 3.3.2)
#> tibble * 1.3.4 2017-08-22 CRAN (R 3.3.2)
#> tidyr 0.7.2 2017-10-16 CRAN (R 3.3.2)
#> tools 3.3.3 2017-03-07 local
#> utils * 3.3.3 2017-03-07 local
#> viridisLite 0.2.0 2017-03-24 CRAN (R 3.3.2)
#> withr 2.1.0 2017-11-01 CRAN (R 3.3.2)
#> XLConnect * 0.2-13 2017-05-14 CRAN (R 3.3.2)
#> XLConnectJars * 0.2-13 2017-05-14 CRAN (R 3.3.2)
#> xml2 * 1.1.1 2017-01-24 CRAN (R 3.3.2)
#> xtable * 1.8-2 2016-02-05 CRAN (R 3.3.0)
#> yaml 2.1.16 2017-12-12 CRAN (R 3.3.2)
#> zoo * 1.8-0 2017-04-12 CRAN (R 3.3.2)
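Two small habits help with reproducibility in practice: fixing the random seed and saving the session information next to the results. The sketch below illustrates both with base R only; the seed value and the object names (draw.1, draw.2, info.file) are arbitrary choices for this example, and a temporary file stands in for a real project file.

```r
# Fixing the seed makes random draws repeatable.
set.seed(2018)
draw.1 <- rnorm(3)
set.seed(2018)
draw.2 <- rnorm(3)
identical(draw.1, draw.2)

# Saving the session information to a text file (here, a temporary one).
info.file <- tempfile(fileext = ".txt")
writeLines(capture.output(sessionInfo()), info.file)
file.exists(info.file)
```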
The R functions used in this document are listed below in alphabetical order.
#> [1] "any" "apply" "args"
#> [4] "array" "as.array" "as.character"
#> [7] "as.complex" "as.data.frame" "as.Date"
#> [10] "as.factor" "as.list" "as.logical"
#> [13] "as.matrix" "as.numeric" "as.POSIXct"
#> [16] "as.POSIXlt" "assign" "attr"
#> [19] "attributes" "body" "c"
#> [22] "cbind" "citation" "class"
#> [25] "colMeans" "colnames" "colSums"
#> [28] "complete.cases" "conflicts" "crossprod"
#> [31] "cube.func" "cut" "data.frame"
#> [34] "date" "det" "diag"
#> [37] "dim" "dimnames" "eigen"
#> [40] "environment" "exp" "factor"
#> [43] "file.exists" "formals" "format"
#> [46] "func.1" "func.2" "function.syntax"
#> [49] "get" "getAnywhere" "getMethod"
#> [52] "gl" "head" "ifelse"
#> [55] "is.array" "is.character" "is.complex"
#> [58] "is.data.frame" "is.double" "is.factor"
#> [61] "is.function" "is.integer" "is.list"
#> [64] "is.logical" "is.matrix" "is.na"
#> [67] "is.nan" "is.numeric" "julian"
#> [70] "kable" "kronecker" "lapply"
#> [73] "length" "levels" "list"
#> [76] "log" "lower.tri" "ls"
#> [79] "make.power" "mapply" "matrix"
#> [82] "max" "mean" "message"
#> [85] "min" "months" "my.cube"
#> [88] "my.cube.func" "my.function" "my.square"
#> [91] "my.variance" "myfunction" "names"
#> [94] "ncol" "noise" "nrow"
#> [97] "paste" "paste0" "print"
#> [100] "rbind" "rbinom" "readRDS"
#> [103] "rep" "return" "rnorm"
#> [106] "round" "rowMeans" "rownames"
#> [109] "rowSums" "runif" "sample"
#> [112] "sapply" "sd" "seq"
#> [115] "seq_along" "session_info" "sessionInfo"
#> [118] "setdiff" "sin" "solve"
#> [121] "sort" "split" "sqrt"
#> [124] "square.func" "stop" "str"
#> [127] "strftime" "strptime" "structure"
#> [130] "sum" "sum.of.squares" "sum.square.cube"
#> [133] "Sys.Date" "Sys.time" "t"
#> [136] "table" "tail" "tapply"
#> [139] "tax.func" "unclass" "unique"
#> [142] "unlist" "unname" "upper.tri"
#> [145] "var" "vector" "warning"
#> [148] "weekdays" "which"