Lab1: Introduction to R (and More!)


Getting Started with R

Click on the R icon

To quit R:

> q()

What does it ask you? What does this mean?

To get help in R, for example on the function rnorm:

> help(rnorm)

You can also go to Help on the R toolbar and select R Help.
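A few other ways to look things up from the command line can be sketched as follows. These are all base R functions; which ones are most useful here is a judgment call and not part of the original lab.

```r
# Shorthand for help(rnorm)
?rnorm

# Show just the argument list of a function
args(rnorm)

# List objects whose names start with "rnorm"
apropos("^rnorm")
```

The `?` form is the one you will probably type most often.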


Example 1: Let the games begin.

Generate a sample of 100 N(0,1) random variables.

> help(rnorm)

> rnorm(100)

What happened?

Now, let's try:

> temp <- rnorm(100)

What is in the object temp?

What is the length of the object temp?

Make an informative plot of temp.
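One possible way to explore temp is sketched below. The call to set.seed and the choice of a histogram and a normal Q-Q plot are assumptions made for illustration; other plots may be just as informative.

```r
# Assumption: fix the seed so the same 100 values are drawn each time
set.seed(1)
temp <- rnorm(100)

# How many values are stored in temp?
length(temp)

# Five-number summary plus the mean
summary(temp)

# Histogram of the sample
hist(temp, main = "100 draws from N(0,1)", xlab = "temp")

# Normal Q-Q plot with a reference line
qqnorm(temp)
qqline(temp)
```

Typing just temp at the prompt prints all 100 values.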


Example 2: The Nonparametric Bootstrap

Let's get our own bootstrap function by:

> source("http://www.rohan.sdsu.edu/~babailey/bridges13/bootstrap.r")

(Here is the function: bootstrap.r )

There is a help file available: bootstrap.help

Let's bootstrap the mean of some data. Let's make it simple: 1,2,3

> data <- c(1,2,3)

> results <- bootstrap(x=data,nboot=100,theta=mean)

Let's make a histogram of the 100 bootstrap means:

> hist(results$thetastar)

How could you construct a CI? One option is a 90% percentile interval:

> quantile(results$thetastar, c(0.05, 0.95))
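If the sourced bootstrap function is not available, the same idea can be sketched in base R. This is a minimal stand-in, not the code in bootstrap.r; the name boot_means and the seed are assumptions made for illustration.

```r
# Assumption: fix the seed so the resamples are reproducible
set.seed(42)

data <- c(1, 2, 3)
nboot <- 100

# Each replicate resamples the data with replacement and takes the mean
boot_means <- replicate(
  nboot,
  mean(sample(data, size = length(data), replace = TRUE))
)

# Histogram of the bootstrap means
hist(boot_means)

# 90% percentile interval: the 5th and 95th percentiles
quantile(boot_means, c(0.05, 0.95))
```

With only three data values there are very few distinct resample means, so the histogram will look lumpy.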


Example 3: Trees

Here is: Information on the South African Heart Disease Data

Let's get the South African Heart Disease Data into R:

> sahd <- read.table("http://statweb.stanford.edu/~tibs/ElemStatLearn/datasets/SAheart.data", sep=",", header=TRUE, row.names=1)
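To see what the sep, header, and row.names arguments do without needing a network connection, here is a small sketch. The three rows in toy_csv are made-up values for illustration only, and the object names toy_csv and toy are hypothetical.

```r
# Made-up comma-separated text with a header line and an ID in column 1
toy_csv <- "row.names,sbp,tobacco,famhist,chd
1,160,12.00,Present,1
2,144,0.01,Absent,1
3,118,0.08,Present,0"

# header = TRUE uses the first line as column names;
# row.names = 1 turns the first column into row names
toy <- read.table(text = toy_csv, sep = ",", header = TRUE, row.names = 1)

# Check the structure and size of what was read
str(toy)
dim(toy)

# Count the number of rows in each chd class
table(toy$chd)
```

The same checks (str, dim, table) are a quick way to confirm that sahd was read in correctly.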

We can make a scatterplot matrix by:

> pairs(sahd)

Before we grow a tree, we have to load the R package rpart:
Go to the toolbar and, under Packages, select Load Packages. Click on rpart in the list and load it.

Let's look at the help function:

> help(rpart)

Let's grow a tree and look at the tree diagram:

> sahdtree <- rpart(as.factor(chd)~., data=sahd)
> plot(sahdtree)
> text(sahdtree)
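The same steps can be tried on a dataset that ships with rpart, which avoids the download. This sketch uses the kyphosis data as a stand-in for sahd; the margin and use.n settings are choices made here for readability, not part of the lab.

```r
# Load rpart from the command line instead of the toolbar
library(rpart)

# Classification tree on the kyphosis data that comes with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Text summary of the splits
print(fit)

# Tree diagram; margin leaves room so the labels are not clipped
plot(fit, margin = 0.1)

# Label the nodes and show class counts at each leaf
text(fit, use.n = TRUE)

# Complexity parameter table with cross-validated error
printcp(fit)
```

The margin argument is worth trying on sahdtree as well if the labels run off the edge of the plot.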


Example 4: Random Forest

Before we grow a Random Forest, we have to load the R package randomForest:
Go to the toolbar and, under Packages, select Load Packages. Click on randomForest in the list and load it.

If it is not in the list, we'll have to go get it and load it.

Go to the toolbar and, under Packages, select Install Packages.
You will get a list of many CRAN mirror sites. Let's pick USA CA2.
Under Packages, scroll way down and select randomForest. Click OK.

We'll need to load the package:

> library(randomForest)
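The install-then-load steps can also be done from the command line. The helper name ensure_package and the choice of the cloud.r-project.org mirror are assumptions made for this sketch; any CRAN mirror should work.

```r
# Hypothetical helper: install a package only if it is missing,
# then report whether it is now available
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Returns TRUE if randomForest is installed (installing it first if needed)
ensure_package("randomForest")
```

After this returns TRUE, library(randomForest) loads the package as usual.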

Let's look at the help function:

> help(randomForest)

Let's grow a Random Forest:

> sahdrf <- randomForest(as.factor(chd)~., data=sahd, importance=TRUE)

If you get an error, then let's try:

> sahd$chd <- as.factor(sahd$chd)
> sahdrf <- randomForest(chd~., data=sahd, importance=TRUE)

Let's look at the output:

> print(sahdrf)

Did you grow enough trees?

> plot(sahdrf)
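One way to judge whether enough trees were grown is to look at how the out-of-bag (OOB) error changes with the number of trees. The sketch below uses the built-in iris data as a stand-in for sahd and assumes the randomForest package is installed; the seed and the tree counts inspected are arbitrary choices.

```r
library(randomForest)

# Assumption: fix the seed so the forest is reproducible
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 500)

# err.rate has one row per tree; the "OOB" column is the running OOB error
oob <- rf$err.rate[, "OOB"]

# OOB error after 50, 100, 250, and 500 trees
oob[c(50, 100, 250, 500)]

# Same information as a plot: error curves against number of trees
plot(rf)
```

If the curves have flattened out well before the last tree, adding more trees is unlikely to help much.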

Let's look at the importance of the variables:

> varImpPlot(sahdrf, type=1)
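The numbers behind the importance plot can be printed with the importance function. This sketch again uses the built-in iris data as a stand-in for sahd and assumes the randomForest package is installed; the name imp is hypothetical.

```r
library(randomForest)

# Assumption: fix the seed so the forest is reproducible
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, importance = TRUE)

# type = 1 is the permutation measure (mean decrease in accuracy)
imp <- importance(rf, type = 1)

# Sort the variables from most to least important
imp[order(imp[, 1], decreasing = TRUE), , drop = FALSE]

# The same measure as a dot chart
varImpPlot(rf, type = 1)
```

Note that type = 1 is only available when the forest was grown with importance=TRUE, as in the lab.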