...
Functions
Functions can be described as "black boxes" that take an input and return an output. Built-in functions are supplied with R and are used by the user to make their work easier, e.g. mean(x), sum(x), sqrt(x), toupper(x).
Function | Description |
---|---|
abs(x) | absolute value |
sqrt(x) | square root |
ceiling(x) | ceiling(3.475) is 4 |
trunc(x) | trunc(5.99) is 5 |
round(x, digits=n) | round(3.475, digits=2) is 3.48 |
log(x) | natural logarithm |
log10(x) | common logarithm |
exp(x) | e^x |
strsplit(x, split) | Split the elements of character vector x at split. strsplit("abc", "") returns a list holding the 3-element vector "a", "b", "c" |
paste(..., sep="") | paste("x", 1:3, sep="") returns c("x1", "x2", "x3"); paste("x", 1:3, sep="M") returns c("xM1", "xM2", "xM3"); paste("Today is", date()) |
toupper(x) | Uppercase |
tolower(x) | Lowercase |
User-defined functions are sets of instructions that you want to use repeatedly: a piece of code written to carry out a specified task, created by the user to meet a specific requirement.
To understand computations in R, two slogans are helpful:
- Everything that exists is an object
- Everything that happens is a function call
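Both slogans can be checked at the console. The following is a minimal sketch using only base R; the names square, plus_result and index_result are made up for illustration.

```r
# Everything that exists is an object: a function can be inspected like data.
square <- function(x) x^2
class(square)        # "function"
is.function(square)  # TRUE

# Everything that happens is a function call: operators are functions too.
plus_result  <- `+`(1, 2)               # same as 1 + 2
index_result <- `[`(c(10, 20, 30), 2)   # same as c(10, 20, 30)[2]
plus_result
index_result
```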
syntax
function_name <- function(arg_1, arg_2, ...) {
  # function body
}
Example1:
# the sum of the squares of 2 numbers
sum_of_squares <- function(x,y) {
x^2 + y^2
}
sum_of_squares(3,4)
Example2:
pow <- function(x, y) {
# function to print x raised to the power y
result <- x^y
print(paste(x,"raised to the power", y, "is", result))
}
# Create a function with arguments.
new.function <- function(a = 3, b = 6) {
result <- a * b
print(result)
}
# Call the function without giving any argument.
new.function()
# Call the function, giving new values for the arguments.
new.function(9,5)
[1] 18
[1] 45
# Define the function and specify the exponent, second argument directly
# Sets default of exponent to 2 (just square)
MyThirdFun <- function(n, y = 2)
{
# Compute the power of n to the y
n^y
}
# Specify both args
MyThirdFun(2,3)
# Just specify the first arg
MyThirdFun(2)
# Specify no argument: error!
# MyThirdFun()
?round
print(round(9.523, d=2))
print(round(9.523, di=2))
print(round(9.523, dig=2))
[1] 9.52
[1] 9.52
[1] 9.52
testFun <- function(axb, bcd = 1, axdk) {
return(axb + axdk)
}
testFun(ax=2,ax=3)
Error in testFun(ax = 2, ax = 3): formal argument "axb" matched by multiple actual arguments
testFun(axb=2,ax = 3)
add <- function(x,y=1,z=2){
x+y
x+z
}
add(5)
add <- function(x,y=1,z=2){
x+y
x+z
return(x+y)
}
add(5)
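The two versions of add() above differ only in what they return. As a sketch of the rule (the function names last_value and early_exit are invented here): a function returns the value of the last expression it evaluates, unless return() is reached first.

```r
# Without return(): the value of the last evaluated expression is returned.
last_value <- function(x, y = 1, z = 2) {
  x + y   # computed, then discarded
  x + z   # last expression: this is the result
}

# With return(): the function exits immediately with that value.
early_exit <- function(x, y = 1, z = 2) {
  return(x + y)
  x + z   # never reached
}

last_value(5)   # 7
early_exit(5)   # 6
```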
fahr_to_kelvin <- function(temp) {
kelvin <- ((temp - 32) * (5 / 9)) + 273.15
return(kelvin)
}
# freezing point of water
fahr_to_kelvin(32)
# boiling point of water
fahr_to_kelvin(212)
pow <- function(x, y) {
# function to print x raised to the power y
result <- x^y
print(paste(x,"raised to the power", y, "is", result))
}
# 8 raised to the power 2 is 64
pow(8, 2)
# 8 raised to the power 2 is 64
pow(x = 8, y = 2)
# 8 raised to the power 2 is 64
pow(y = 2, x = 8)
[1] "8 raised to the power 2 is 64"
[1] "8 raised to the power 2 is 64"
[1] "8 raised to the power 2 is 64"
first class objects
fun <- function(a,b){
a^2
}
fun(2,x/0)
fun <- function(x){
10
}
fun("hello")
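Both calls above succeed because R evaluates arguments lazily: an argument that is never used inside the body is never evaluated. The sketch below, with illustrative helper names, shows this and also shows functions being stored and passed around like any other object.

```r
# b is never used, so stop() is never evaluated.
ignore_second <- function(a, b) a^2
lazy_result <- ignore_second(2, stop("never evaluated"))

# Functions are first-class: store them in a list and pass them as arguments.
apply_twice <- function(f, x) f(f(x))
ops <- list(double = function(x) x * 2, inc = function(x) x + 1)
twice_doubled <- apply_twice(ops$double, 3)   # (3 * 2) * 2
twice_inc     <- apply_twice(ops$inc, 3)      # (3 + 1) + 1
```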
...
# example to write the function using three dots
printDots <- function(...) {
myDots <- list(...)
paste(myDots)
}
printDots("how", "is", "your", "health")
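A common use of the three dots is to forward extra arguments to another function without listing them. A small sketch, assuming hypothetical wrappers mean_of() and count_dots():

```r
# Forward extra arguments untouched to another function.
mean_of <- function(x, ...) {
  mean(x, ...)   # e.g. na.rm = TRUE or trim = 0.1 pass straight through
}

values <- c(1, 2, NA, 4)
with_na    <- mean_of(values)                # NA, because na.rm defaults to FALSE
without_na <- mean_of(values, na.rm = TRUE)  # mean of 1, 2 and 4

# Count how many arguments arrived through the dots.
count_dots <- function(...) length(list(...))
n_args <- count_dots("how", "is", "your", "health")
```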
# anonymous function, defined and called in one step
(function(x) x*10)(5)
# define a named function
named <- function(x) x*10
# calling a named function
named(6)
#### Local variable: by value
x <- 5 # global assignment of x is not used by the fun
fun <- function(x){
#parameter x is passed by value (copy)
x <- x+1 # local variable is modified
print(x)
}
fun(1) # prints 2
print(x) # prints 5, since global x has not changed
[1] 2
[1] 5
Arguments can be passed by name, functionName(variable = value), as well as by order:
input_1 = 20
mySum <- function(input_1, input_2 = 10) {
output <- input_1 + input_2
return(output)
}
mySum(input_1 = 1, 3)
mySum(3)
Global assignment (<<-) of a variable inside a function
input_1 = 20
print(input_1)
mySum <- function(input_1a = input_1) { # the default value is taken from the global input_1
  input_1 <<- 25 # <<- assigns to the global input_1 from inside the function
  return(input_1)
}
mySum(input_1)
print(input_1)
[1] 20
[1] 25
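Note that `<<-` does not always target the global environment: it walks up the enclosing environments and modifies the first existing binding it finds, and only creates a global variable when none exists. A sketch of this with a hypothetical make_counter() closure:

```r
make_counter <- function() {
  count <- 0                 # lives in the environment that encloses increment()
  increment <- function() {
    count <<- count + 1      # modifies count in the enclosing environment
    count
  }
  increment
}

counter <- make_counter()
counter()
counter()
third <- counter()   # 3: the state survives between calls
```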
args(sample)
## function (x, size, replace = FALSE, prob = NULL)
## NULL
The terms x, size, replace, and prob are the function arguments.
First, notice that replace and prob have default values; that is, we do not need to specify these arguments unless we want to override the default values.
Second, notice the order of the arguments. If we enter the argument values in the same order as the argument list, we do not need to name the arguments.
dates <- 1:30
sample(dates, 16) # sample "size = 16"
## [1] 28 23 29 16 4 13 14 10 5 20 18 26 22 12 30 11
Third, if we enter the arguments out of order then we will get either an error message or an undesired result. Arguments entered out of their default order need to be named.
sample(16, dates) # undesired results; wanted "size = 16"
Error in sample.int(x, size, replace, prob): object 'dates' not found
Traceback:
1. sample(16, dates)
2. sample.int(x, size, replace, prob)
sample(size = 16, x = dates) # gives desired result
## [1] 9 28 17 8 21 18 5 14 24 27 16 25 2 6 30 4
Fourth, when we specify an argument we only need to type a sufficient number of letters so that R can uniquely identify it from the other arguments.
sample(s = 16, x = dates, r = TRUE) # sampling w/ replacement
## [1] 25 19 28 8 6 12 26 2 26 12 21 16 17 4 7 13
Fifth, argument values can be any valid R expression (including function calls) that evaluates to an appropriate value. In the following example, two inner sample() calls provide random values for the arguments of the outer sample() call.
sample(s = sample(1:36, 1), x = sample(1:10, 5), r=T)
## [1] 3 1 2 1 1 2 3 1 3 4 4 2 1 1 4 4 1 8 4 4 4 2 1 3 2 3 8 1 2 3
mtcarsFive <- mtcars[1:5,] # let's take 5 observations from mtcars for illustration
mtcarsFive
 | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
# one method
max(mtcarsFive[,1])
max(mtcarsFive[,2])
max(mtcarsFive[,3])
max(mtcarsFive[,4])
max(mtcarsFive[,5])
#...etc
# another method
for (i in seq_len(ncol(mtcarsFive))) {
  col <- mtcarsFive[, i]
  col_max <- max(col) # a name other than max avoids confusion with the function
  print(col_max)
}
[1] 22.8
[1] 8
[1] 360
[1] 175
[1] 3.9
[1] 3.44
[1] 19.44
[1] 1
[1] 1
[1] 4
[1] 4
length(mtcarsFive)
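Both methods above can be replaced by a single call from the apply family. A minimal sketch (the variable names mtcars_five and col_max are illustrative):

```r
mtcars_five <- mtcars[1:5, ]

# A data frame is a list of columns, so sapply() visits each column in turn
# and collects the results in a named vector.
col_max <- sapply(mtcars_five, max)
col_max["mpg"]   # 22.8
```

The result keeps the column names, which the loop version loses.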
load("data//sales.RData")
class(sales)
head(sales)
ID | Prod | Quant | Val | Insp |
---|---|---|---|---|
v1 | p1 | 182 | 1665 | unkn |
v2 | p1 | 3072 | 8780 | unkn |
v3 | p1 | 20393 | 76990 | unkn |
v4 | p1 | 112 | 1100 | unkn |
v3 | p1 | 6164 | 20260 | unkn |
v5 | p2 | 104 | 1155 | unkn |
set.seed(111)
m <- matrix(data=cbind(rnorm(30, 0), rnorm(30, 2), rnorm(30, 5)), nrow=30, ncol=3)
m
0.23522071 | -1.1132173 | 4.783571 |
-0.33073587 | 1.0586426 | 6.446478 |
-0.31162382 | 3.4002588 | 5.409710 |
-2.30234566 | 0.3795300 | 5.910917 |
-0.17087604 | -0.2659960 | 6.430358 |
0.14027823 | 3.1629936 | 4.618708 |
-1.49742666 | 1.8838450 | 5.202307 |
-1.01018842 | 2.3342560 | 4.193801 |
-0.94847560 | 1.3791419 | 5.294634 |
-0.49396222 | 0.6901551 | 6.404883 |
-0.17367413 | 0.8242740 | 6.023767 |
-0.40659878 | 0.8787845 | 5.476126 |
1.84563626 | 0.6380955 | 4.329670 |
0.39405411 | 2.4811246 | 5.159234 |
0.79752850 | 2.7419716 | 4.617285 |
-1.56666536 | 2.0278246 | 5.935763 |
-0.08585101 | 2.3313797 | 4.368468 |
-0.35913948 | 2.6441141 | 4.901694 |
-1.19360897 | 4.4856616 | 6.031985 |
0.36418674 | 3.9599817 | 5.387808 |
0.36166245 | 2.1916634 | 3.743871 |
0.34696437 | 3.5525443 | 4.213047 |
0.18973653 | 2.9142423 | 5.429812 |
-0.15957681 | 2.3586254 | 4.623584 |
0.32654924 | 2.1750956 | 3.783771 |
0.59825420 | 1.1527322 | 6.029279 |
-1.84153430 | 2.9782317 | 5.430397 |
2.71805560 | 3.8058683 | 3.754426 |
0.19124439 | 2.1229148 | 4.397272 |
-1.30129607 | 1.8702280 | 5.660069 |
# traverse row wise
apply(m, 1, mean)
# traverse column-wise, using our own (user-defined) function:
# count the negative values in each column
apply(m, 2, function(x) length(x[x<0]))
lapply(1:3, function(x) x^2)
# you can use unlist with lapply to get a vector
unlist(lapply(1:3, function(x) x^2))
#simplify2array and sapply
simplify2array(lapply(1:3, function(x) x^2))
# create another list
CAGO.list <- list(Diet1 = c(2,5,4,3,5,3), Diet2 =c(8,5,6,5,7,7), Diet3 =c(3,4,2,5,2,6) , Diet4 = c(2,2,3,2,5,2))
CAGO.list
lapply(CAGO.list, mean)
Does lapply() work for data frames or not?
CAGO.df <- as.data.frame(CAGO.list)
CAGO.df
Diet1 | Diet2 | Diet3 | Diet4 |
---|---|---|---|
2 | 8 | 3 | 2 |
5 | 5 | 4 | 2 |
4 | 6 | 2 | 3 |
3 | 5 | 5 | 2 |
5 | 7 | 2 | 5 |
3 | 7 | 6 | 2 |
lapply(CAGO.df, mean) # no margin is needed: a data frame is a list of columns, so the mean is computed column-wise
lapply()
Random <- c("This", "Is", "a", "Random", "Vector")
lapply(Random,nchar)
lapply(Random,toupper)
sapply(1:3, function(x) x^2)
print(sapply(1:3, function(x) x^2,simplify = FALSE))
[[1]]
[1] 1

[[2]]
[1] 4

[[3]]
[1] 9
sapply() on CAGO.list and CAGO.df
print(sapply(CAGO.list, mean)) # output as a vector
   Diet1    Diet2    Diet3    Diet4 
3.666667 6.333333 3.666667 2.666667 
print(sapply(CAGO.df, mean)) # output as a vector
   Diet1    Diet2    Diet3    Diet4 
3.666667 6.333333 3.666667 2.666667 
print(sapply(Random, nchar)) # each element is used as the name of its result
  This     Is      a Random Vector 
     4      2      1      6      6 
seqchar <- function(x){
seq(nchar(x))
}
print(sapply(Random, seqchar))
$This
[1] 1 2 3 4

$Is
[1] 1 2

$a
[1] 1

$Random
[1] 1 2 3 4 5 6

$Vector
[1] 1 2 3 4 5 6
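As the last result shows, sapply() quietly falls back to a list when the pieces have different lengths. When the shape of the result matters, vapply() lets you declare it up front. A sketch, with random_words standing in for the Random vector used above:

```r
random_words <- c("This", "Is", "a", "Random", "Vector")

# vapply() declares the expected result: one integer per element.
n_chars <- vapply(random_words, nchar, FUN.VALUE = integer(1))

# If the function returns something else, vapply() fails instead of
# silently switching to a list as sapply() does.
mismatch <- tryCatch(
  vapply(random_words, function(w) seq(nchar(w)), FUN.VALUE = integer(1)),
  error = function(e) "error"
)
```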
print(tapply(mtcars$wt,mtcars$cyl,mean))
       4        6        8 
2.285727 3.117143 3.999214 
print(tapply(iris$Sepal.Length,iris$Species,mean))
    setosa versicolor  virginica 
     5.006      5.936      6.588 
# Without simplification, tapply always returns a list.
tapply(iris$Sepal.Length,iris$Species,mean, simplify = FALSE)
# split function splits a vector into groups using a factor. Using split
# and then applying a function with lapply produces the same result as tapply
lapply(split(iris$Sepal.Length,iris$Species), mean) # Instead of tapply
When to use these functions: https://www.r-bloggers.com/2012/12/using-apply-sapply-lapply-in-r/
StackOverflow answer on sapply, vapply, tapply etc.: https://stackoverflow.com/questions/3505701/grouping-functions-tapply-by-aggregate-and-the-apply-family
tapply() can do
# ?tapply
# ?aggregate
iris.x <- subset(iris, select= -Species) # subsetting without Species
iris.s <- subset(iris, select= Species) # subsetting only Species
aggregate(iris.x, iris.s, mean)
Species | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
---|---|---|---|---|
setosa | 5.006 | 3.428 | 1.462 | 0.246 |
versicolor | 5.936 | 2.770 | 4.260 | 1.326 |
virginica | 6.588 | 2.974 | 5.552 | 2.026 |
aggregate(x = mtcars, by = list(mtcars$carb), FUN = median)
Group.1 | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 22.80 | 4 | 108.00 | 93 | 3.850 | 2.320 | 19.470 | 1.0 | 1 | 4.0 | 1 |
2 | 22.10 | 4 | 143.75 | 111 | 3.730 | 3.170 | 17.175 | 0.5 | 0 | 4.0 | 2 |
3 | 16.40 | 8 | 275.80 | 180 | 3.070 | 3.780 | 17.600 | 0.0 | 0 | 3.0 | 3 |
4 | 15.25 | 8 | 350.50 | 210 | 3.815 | 3.505 | 17.220 | 0.0 | 0 | 3.5 | 4 |
6 | 19.70 | 6 | 145.00 | 175 | 3.620 | 2.770 | 15.500 | 0.0 | 1 | 5.0 | 6 |
8 | 15.00 | 8 | 301.00 | 335 | 3.540 | 3.570 | 14.600 | 0.0 | 1 | 5.0 | 8 |
## example with character variables and NAs
testDF <- data.frame(v1 = c(1,3,5,7,8,3,5,NA,4,5,7,9),
v2 = c(11,33,55,77,88,33,55,NA,44,55,77,99) )
by1 <- c("red", "blue", 1, 2, NA, "big", 1, 2, "red", 1, NA, 12)
by2 <- c("wet", "dry", 99, 95, NA, "damp", 95, 99, "red", 99, NA, NA)
aggregate(x = testDF, by = list(by1, by2), FUN = "mean")
Group.1 | Group.2 | v1 | v2 |
---|---|---|---|
1 | 95 | 5 | 55 |
2 | 95 | 7 | 77 |
1 | 99 | 5 | 55 |
2 | 99 | NA | NA |
big | damp | 3 | 33 |
blue | dry | 3 | 33 |
red | red | 4 | 44 |
red | wet | 1 | 11 |
## Formulas, one ~ one, one ~ many, many ~ one, and many ~ many:
aggregate(weight ~ feed, data = chickwts, mean)
aggregate(breaks ~ wool + tension, data = warpbreaks, mean)
aggregate(cbind(Ozone, Temp) ~ Month, data = airquality, mean)
aggregate(cbind(ncases, ncontrols) ~ alcgp + tobgp, data = esoph, sum)
## Dot notation:
aggregate(. ~ Species, data = iris, mean)
aggregate(len ~ ., data = ToothGrowth, mean)
feed | weight |
---|---|
casein | 323.5833 |
horsebean | 160.2000 |
linseed | 218.7500 |
meatmeal | 276.9091 |
soybean | 246.4286 |
sunflower | 328.9167 |
wool | tension | breaks |
---|---|---|
A | L | 44.55556 |
B | L | 28.22222 |
A | M | 24.00000 |
B | M | 28.77778 |
A | H | 24.55556 |
B | H | 18.77778 |
Month | Ozone | Temp |
---|---|---|
5 | 23.61538 | 66.73077 |
6 | 29.44444 | 78.22222 |
7 | 59.11538 | 83.88462 |
8 | 59.96154 | 83.96154 |
9 | 31.44828 | 76.89655 |
alcgp | tobgp | ncases | ncontrols |
---|---|---|---|
0-39g/day | 0-9g/day | 9 | 261 |
40-79 | 0-9g/day | 34 | 179 |
80-119 | 0-9g/day | 19 | 61 |
120+ | 0-9g/day | 16 | 24 |
0-39g/day | 10-19 | 10 | 84 |
40-79 | 10-19 | 17 | 85 |
80-119 | 10-19 | 19 | 49 |
120+ | 10-19 | 12 | 18 |
0-39g/day | 20-29 | 5 | 42 |
40-79 | 20-29 | 15 | 62 |
80-119 | 20-29 | 6 | 16 |
120+ | 20-29 | 7 | 12 |
0-39g/day | 30+ | 5 | 28 |
40-79 | 30+ | 9 | 29 |
80-119 | 30+ | 7 | 12 |
120+ | 30+ | 10 | 13 |
Species | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
---|---|---|---|---|
setosa | 5.006 | 3.428 | 1.462 | 0.246 |
versicolor | 5.936 | 2.770 | 4.260 | 1.326 |
virginica | 6.588 | 2.974 | 5.552 | 2.026 |
supp | dose | len |
---|---|---|
OJ | 0.5 | 13.23 |
VC | 0.5 | 7.98 |
OJ | 1.0 | 22.70 |
VC | 1.0 | 16.77 |
OJ | 2.0 | 26.06 |
VC | 2.0 | 26.14 |
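df <- data.frame(ingr=c(2L, 6L, 7L), fctr=c(TRUE,FALSE,TRUE))
apply(df,2,class) # apply() coerces the data frame to a matrix first, so the column classes are lost
sapply(df,class) # sapply() visits each column as it is, so the classes are preserved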
l1 <- list(a = c(1:10), b = c(11:20))
l2 <- list(c = c(21:30), d = c(31:40))
# sum the corresponding elements of l1 and l2
print(mapply(sum, l1$a, l1$b, l2$c, l2$d))
[1] 64 68 72 76 80 84 88 92 96 100
print(mapply(sum, l1))
# sum(1:10) and sum(11:20)
  a   b 
 55 155 
print(mapply(sum, l1,l2))
# sum(1:10, 21:30) and sum(11:20, 31:40)
  a   b 
310 510 
print(mapply(sum, l1$a, l1$b))
[1] 12 14 16 18 20 22 24 26 28 30
# Map - A wrapper to mapply with SIMPLIFY = FALSE, so it is guaranteed to return a list
print(Map(sum, l1$a, l1$b, l2$c, l2$d))
[[1]] [1] 64 [[2]] [1] 68 [[3]] [1] 72 [[4]] [1] 76 [[5]] [1] 80 [[6]] [1] 84 [[7]] [1] 88 [[8]] [1] 92 [[9]] [1] 96 [[10]] [1] 100
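mapply() is not limited to sum(): it calls any function once per position, taking one element from each input. A brief sketch with rep() and paste(); the MoreArgs argument holds values that stay the same for every call.

```r
# Element-wise calls: rep(1, 3), rep(2, 2), rep(3, 1).
# The results differ in length, so a list comes back.
reps <- mapply(rep, 1:3, 3:1)

# MoreArgs holds arguments that are the same for every call.
labels <- mapply(paste, c("a", "b"), c("x", "y"), MoreArgs = list(sep = "-"))
```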
Examples:
x <- list(1:5, 6:10)
print(x)
[[1]]
[1] 1 2 3 4 5

[[2]]
[1]  6  7  8  9 10

print(do.call(rbind, x))
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
mylist <- list(vec1 = 1:5, vec2 = 6:10, vec3 = 11:15)
print(mylist)
$vec1
[1] 1 2 3 4 5

$vec2
[1]  6  7  8  9 10

$vec3
[1] 11 12 13 14 15

cbind(mylist) # does not bind the vectors: the list is treated as a single column
mylist | |
---|---|
vec1 | 1, 2, 3, 4, 5 |
vec2 | 6, 7, 8, 9, 10 |
vec3 | 11, 12, 13, 14, 15 |
do.call(cbind, mylist) # works
vec1 | vec2 | vec3 |
---|---|---|
1 | 6 | 11 |
2 | 7 | 12 |
3 | 8 | 13 |
4 | 9 | 14 |
5 | 10 | 15 |
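do.call() treats the list as the argument list of the call, so named list elements become named arguments. A minimal sketch with illustrative variable names:

```r
# Arguments collected in a list; named elements become named arguments.
paste_args <- list("a", "b", "c", sep = "-")
joined <- do.call(paste, paste_args)   # same as paste("a", "b", "c", sep = "-")

# The function can also be given by name as a string.
row_bound <- do.call("rbind", list(1:3, 4:6))
dim(row_bound)
```

This is useful when the number of arguments is only known at run time, for example when binding a list of data frames of unknown length.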
# apply the colMeans() function to the observations of the iris dataset, grouped by Species
b <- by(iris[, 1:4], iris$Species, colMeans) # exclude the last column, which holds the grouping factor
b
iris$Species: setosa
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
       5.006        3.428        1.462        0.246 
------------------------------------------------------------
iris$Species: versicolor
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
       5.936        2.770        4.260        1.326 
------------------------------------------------------------
iris$Species: virginica
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
       6.588        2.974        5.552        2.026 
with()
syntax: with(data, expression)
with(mtcars, mpg[cyl == 8 & disp > 350])
# is the same as, but nicer than
mtcars$mpg[mtcars$cyl == 8 & mtcars$disp > 350]
attach()
Use the detach() function to reverse the process.
x <- data.frame(Subtype= c("A","A","B","A","B","B","A","B","A","B"),
Gender= rep(c("m","f"),each=5),
Expression = c(-0.54,-0.8,-1.03,-0.41,-1.31,-0.66,-0.43,1.01,-1.15,0.14))
# There are 3 variables, "Expression", "Gender" and "Subtype". We can display the variables by:
x$Expression
print(x$Gender)
[1] m m m m m f f f f f
Levels: f m
attach(x)
Expression # no need to use the $ symbol here
Gender
The detach() function reverses the process:
detach(x)
Gender
# Error in eval(expr, envir, enclos): object 'Gender' not found
# Traceback:
x$Gender
Use the detach() function to unload a package from the current session (the counterpart of library()).
# Note: before running this, install and load the package with install.packages("cowsay") and library(cowsay)
detach("package:cowsay") # unload the package named cowsay from the current session
library(ggplot2) vs require(ggplot2)
if (!require(package)) install.packages('package')
library(package)
# will install "package" if it doesn't exist, and then load it.
OR
if(!require(DescTools)){install.packages("DescTools")}
which()
which(LETTERS == "R")
which(LETTERS == "J")
which((1:12)%%2 == 0) # which are even?
which(10:20 %% 3 ==0, arr.ind = TRUE)
mtcars[which(mtcars$mpg <= 18),]
# OR
mtcars[mtcars$mpg <= 18,]
 | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Duster 360 | 14.3 | 8 | 360.0 | 245 | 3.21 | 3.570 | 15.84 | 0 | 0 | 3 | 4 |
Merc 280C | 17.8 | 6 | 167.6 | 123 | 3.92 | 3.440 | 18.90 | 1 | 0 | 4 | 4 |
Merc 450SE | 16.4 | 8 | 275.8 | 180 | 3.07 | 4.070 | 17.40 | 0 | 0 | 3 | 3 |
Merc 450SL | 17.3 | 8 | 275.8 | 180 | 3.07 | 3.730 | 17.60 | 0 | 0 | 3 | 3 |
Merc 450SLC | 15.2 | 8 | 275.8 | 180 | 3.07 | 3.780 | 18.00 | 0 | 0 | 3 | 3 |
Cadillac Fleetwood | 10.4 | 8 | 472.0 | 205 | 2.93 | 5.250 | 17.98 | 0 | 0 | 3 | 4 |
Lincoln Continental | 10.4 | 8 | 460.0 | 215 | 3.00 | 5.424 | 17.82 | 0 | 0 | 3 | 4 |
Chrysler Imperial | 14.7 | 8 | 440.0 | 230 | 3.23 | 5.345 | 17.42 | 0 | 0 | 3 | 4 |
Dodge Challenger | 15.5 | 8 | 318.0 | 150 | 2.76 | 3.520 | 16.87 | 0 | 0 | 3 | 2 |
AMC Javelin | 15.2 | 8 | 304.0 | 150 | 3.15 | 3.435 | 17.30 | 0 | 0 | 3 | 2 |
Camaro Z28 | 13.3 | 8 | 350.0 | 245 | 3.73 | 3.840 | 15.41 | 0 | 0 | 3 | 4 |
Ford Pantera L | 15.8 | 8 | 351.0 | 264 | 4.22 | 3.170 | 14.50 | 0 | 1 | 5 | 4 |
Maserati Bora | 15.0 | 8 | 301.0 | 335 | 3.54 | 3.570 | 14.60 | 0 | 1 | 5 | 8 |
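Indexing with which() and indexing with the logical vector directly give the same rows here, but they differ when the comparison produces NA. A small sketch of that difference, plus the related helpers which.min() and which.max() (the vector x is invented for illustration):

```r
# Name of the car with the lowest and the highest mpg.
lowest  <- rownames(mtcars)[which.min(mtcars$mpg)]
highest <- rownames(mtcars)[which.max(mtcars$mpg)]

# Unlike logical indexing, which() drops NA comparisons.
x <- c(5, NA, 20, 12)
x[x > 10]          # keeps an NA entry
x[which(x > 10)]   # 20 12
```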
match()
a <- c("a","e","i","o","u")
b <- letters
match(a,b)
# %in% operator gives a logical vector
a %in% b
a <- c("a","#","i","@","u")
b <- letters
match(a,b) # returns NA where a value has no match
# %in% operator gives a logical vector
a %in% b
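Because match() returns positions, it can drive a simple table lookup. A sketch with an invented lookup table (codes and scores are illustrative values, not from the text above):

```r
# A small lookup table (values invented for illustration).
codes  <- c("a", "e", "i", "o", "u")
scores <- c(1, 5, 9, 15, 21)

query <- c("i", "#", "a")
pos <- match(query, codes)   # 3 NA 1
looked_up <- scores[pos]     # 9 NA 1: unmatched keys stay NA
```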
grep()
fields <- c("Age", "DateOnset", "Sex", "Admission.date", "dateofdeath", "Country")
grep("date", fields, ignore.case=TRUE)
grep("date", fields, ignore.case=TRUE, value=TRUE)
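grep() returns positions or values; its sibling grepl() returns a logical vector, which is often more convenient for subsetting. A short sketch on the same fields vector, also using a regular-expression anchor:

```r
fields <- c("Age", "DateOnset", "Sex", "Admission.date", "dateofdeath", "Country")

# grepl() returns a logical vector, handy for subsetting.
is_date <- grepl("date", fields, ignore.case = TRUE)
date_fields <- fields[is_date]

# Anchors narrow the match: only names that start with "date".
starts_with_date <- grep("^date", fields, ignore.case = TRUE, value = TRUE)
```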
print, plot, summary.
data <- data.frame(ornam=c("gold", "silver", "iron"), headW=c(3.5, 5, 12))
attributes(data)
methods(class="data.frame")
[1] $ $<- [ [[ [[<- [6] [<- aggregate anyDuplicated as.data.frame as.list [11] as.matrix by cbind coerce dim [16] dimnames dimnames<- droplevels duplicated edit [21] format formula head initialize is.na [26] Math merge na.exclude na.omit Ops [31] plot print prompt rbind row.names [36] row.names<- rowsum show slotsFromS3 split [41] split<- stack str subset summary [46] Summary t tail transform unique [51] unstack within see '?methods' for accessing help and source code
When you type the print function without parentheses:
print
function (x, ...)
UseMethod("print")
The UseMethod call in there is actually the function inside the print function. UseMethod is the way that R overloads its functions.
methods(print)
[1] print.acf* [2] print.AES* [3] print.anova* [4] print.aov* [5] print.aovlist* [6] print.ar* [7] print.Arima* [8] print.arima0* [9] print.AsIs [10] print.aspell* [11] print.aspell_inspect_context* [12] print.bibentry* [13] print.Bibtex* [14] print.browseVignettes* [15] print.by [16] print.changedFiles* [17] print.check_code_usage_in_package* [18] print.check_compiled_code* [19] print.check_demo_index* [20] print.check_depdef* [21] print.check_details* [22] print.check_details_changes* [23] print.check_doi_db* [24] print.check_dotInternal* [25] print.check_make_vars* [26] print.check_nonAPI_calls* [27] print.check_package_code_assign_to_globalenv* [28] print.check_package_code_attach* [29] print.check_package_code_data_into_globalenv* [30] print.check_package_code_startup_functions* [31] print.check_package_code_syntax* [32] print.check_package_code_unload_functions* [33] print.check_package_compact_datasets* [34] print.check_package_CRAN_incoming* [35] print.check_package_datasets* [36] print.check_package_depends* [37] print.check_package_description* [38] print.check_package_description_encoding* [39] print.check_package_license* [40] print.check_packages_in_dir* [41] print.check_packages_used* [42] print.check_po_files* [43] print.check_Rd_contents* [44] print.check_Rd_line_widths* [45] print.check_Rd_metadata* [46] print.check_Rd_xrefs* [47] print.check_RegSym_calls* [48] print.check_so_symbols* [49] print.check_T_and_F* [50] print.check_url_db* [51] print.check_vignette_index* [52] print.checkDocFiles* [53] print.checkDocStyle* [54] print.checkFF* [55] print.checkRd* [56] print.checkReplaceFuns* [57] print.checkS3methods* [58] print.checkTnF* [59] print.checkVignettes* [60] print.citation* [61] print.codoc* [62] print.codocClasses* [63] print.codocData* [64] print.colorConverter* [65] print.compactPDF* [66] print.condition [67] print.connection [68] print.CRAN_package_reverse_dependencies_and_views* [69] print.crayon* [70] print.data.frame [71] print.Date 
[72] print.default [73] print.dendrogram* [74] print.density* [75] print.difftime [76] print.dist* [77] print.Dlist [78] print.DLLInfo [79] print.DLLInfoList [80] print.DLLRegisteredRoutines [81] print.dummy_coef* [82] print.dummy_coef_list* [83] print.ecdf* [84] print.eigen [85] print.factanal* [86] print.factor [87] print.family* [88] print.fileSnapshot* [89] print.findLineNumResult* [90] print.formula* [91] print.fseq* [92] print.ftable* [93] print.function [94] print.getAnywhere* [95] print.glm* [96] print.hclust* [97] print.help_files_with_topic* [98] print.hexmode [99] print.HoltWinters* [100] print.hsearch* [101] print.hsearch_db* [102] print.htest* [103] print.infl* [104] print.integrate* [105] print.isoreg* [106] print.json* [107] print.kmeans* [108] print.Latex* [109] print.LaTeX* [110] print.libraryIQR [111] print.listof [112] print.lm* [113] print.loadings* [114] print.loess* [115] print.logLik* [116] print.ls_str* [117] print.medpolish* [118] print.MethodsFunction* [119] print.mtable* [120] print.NativeRoutineList [121] print.news_db* [122] print.nls* [123] print.noquote [124] print.numeric_version [125] print.object_size* [126] print.octmode [127] print.packageDescription* [128] print.packageInfo [129] print.packageIQR* [130] print.packageStatus* [131] print.pairwise.htest* [132] print.PDF_Array* [133] print.PDF_Dictionary* [134] print.pdf_doc* [135] print.pdf_fonts* [136] print.PDF_Indirect_Reference* [137] print.pdf_info* [138] print.PDF_Keyword* [139] print.PDF_Name* [140] print.PDF_Stream* [141] print.PDF_String* [142] print.person* [143] print.POSIXct [144] print.POSIXlt [145] print.power.htest* [146] print.ppr* [147] print.prcomp* [148] print.princomp* [149] print.proc_time [150] print.R6* [151] print.R6ClassGenerator* [152] print.raster* [153] print.Rd* [154] print.recordedplot* [155] print.restart [156] print.RGBcolorConverter* [157] print.rle [158] print.roman* [159] print.SavedPlots* [160] print.scalar* [161] print.sessionInfo* [162] 
print.simple.list [163] print.smooth.spline* [164] print.socket* [165] print.srcfile [166] print.srcref [167] print.stepfun* [168] print.stl* [169] print.StructTS* [170] print.subdir_tests* [171] print.summarize_CRAN_check_status* [172] print.summary.aov* [173] print.summary.aovlist* [174] print.summary.ecdf* [175] print.summary.glm* [176] print.summary.lm* [177] print.summary.loess* [178] print.summary.manova* [179] print.summary.nls* [180] print.summary.packageStatus* [181] print.summary.ppr* [182] print.summary.prcomp* [183] print.summary.princomp* [184] print.summary.table [185] print.summaryDefault [186] print.table [187] print.tables_aov* [188] print.terms* [189] print.ts* [190] print.tskernel* [191] print.TukeyHSD* [192] print.tukeyline* [193] print.tukeysmooth* [194] print.undoc* [195] print.vignette [196] print.warnings [197] print.weird [198] print.xgettext* [199] print.xngettext* [200] print.xtabs* see '?methods' for accessing help and source code
S3 methods dispatch
Generic functions are simple: the complete definition of plot() is a single line:
plot <- function (x, y, ...) UseMethod("plot")
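UseMethod looks for a method named after the class of the first argument, for example plot.x for an object of class x. If plot.x is not found, UseMethod repeats the search using the next class attribute, if any. If no class-specific function is found, it uses plot.default.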
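The search order can be sketched with a made-up generic. Everything named describe, dog, animal or rock below is hypothetical and only serves to show the fallback from a missing method to the next class and then to the default.

```r
# Hypothetical generic: describe() dispatches on the class of x.
describe <- function(x, ...) UseMethod("describe")

describe.default <- function(x, ...) "some object"
describe.animal  <- function(x, ...) "an animal"

dog  <- structure(list(), class = c("dog", "animal"))
rock <- structure(list(), class = "rock")

describe(dog)    # no describe.dog, so describe.animal is used
describe(rock)   # no describe.rock, so describe.default is used
```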
par(mfrow=c(1,2))
with(iris, plot(Sepal.Length, Species))
with(iris, plot(Species, Sepal.Length))
plot() dispatched to a scatter plot when the first argument was a continuous variable, but to a boxplot when the first argument was a factor. This is because R tries to work out what you want from the supplied arguments. In this case, it is the first argument that determines the method, not the two arguments together.
**Create a new print function that will be invoked on objects of class weird**
print.weird <- function(x, ...) { # ... keeps the signature compatible with the print generic
  print("Object of class weird")
  print(paste("Length:", length(x)))
  print(head(x))
  invisible(x) # print methods conventionally return their argument invisibly
}
# the function tells the user that the object is of class weird,
# then shows its length and its first 6 values using head()
boing <- 1:100
print(boing)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 [19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 [37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 [73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 [91] 91 92 93 94 95 96 97 98 99 100
# make boing an object of class weird
class(boing) <- "weird"
print(boing)
[1] "Object of class weird"
[1] "Length: 100"
[1] 1 2 3 4 5 6
Note that R is full of overloaded functions, that is, functions that behave differently depending on the class of their arguments.
How Generics Work: http://www.stat.umn.edu/geyer/3701/notes/generic.html
new()
S4 method dispatch
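S4 dispatch is complicated because S4 has two important features:
- Multiple inheritance, i.e. a class can have multiple parents
- Multiple dispatch, i.e. a generic can use multiple arguments to pick a method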
https://adv-r.hadley.nz/s4.html#method-dispatch-1
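As a rough sketch of S4 dispatch on more than one argument, the example below defines two toy classes and a generic whose method is chosen from the classes of both arguments. The classes Cat and Dog and the generic meet() are invented for illustration; only setClass(), setGeneric(), setMethod() and new() come from the methods package.

```r
library(methods)

setClass("Cat", representation(name = "character"))
setClass("Dog", representation(name = "character"))

# The generic dispatches on the classes of both x and y.
setGeneric("meet", function(x, y) standardGeneric("meet"))

setMethod("meet", signature("Cat", "Dog"), function(x, y) "hiss")
setMethod("meet", signature("Dog", "Cat"), function(x, y) "chase")
setMethod("meet", signature("ANY", "ANY"), function(x, y) "ignore")

tom <- new("Cat", name = "Tom")
rex <- new("Dog", name = "Rex")

meet(tom, rex)   # method for (Cat, Dog)
meet(rex, tom)   # method for (Dog, Cat)
meet(tom, tom)   # no (Cat, Cat) method, falls back to (ANY, ANY)
```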
Advanced R by Hadley Wickham's OO field guide: http://adv-r.had.co.nz/OO-essentials.html
(for easy to read, share, and verify)