Histogram with "other" value at the end to show how many values were cut off

Sometimes you have to cut your histogram off at some value, but you still want to tell the reader how many values were cut off. Here I simply add one extra bar at the right-hand side, labeled ">30", that counts the values beyond the cut.

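data(USArrests, "VADeaths")
data <- USArrests$Rape
limit <- 30 # cut-off; everything above it goes into the ">30" bar
h <- hist(data, breaks=seq(0,3000,1), plot=FALSE) # unit-width bins

subset <- h$counts[seq(1,limit)] # counts for the bins up to the limit
rest <- sum(h$counts[seq(limit+1,length(h$counts))]) # values that were cut off
data <- c(subset, rest)

labels <- c(seq(1,limit,1),paste(">",toString(limit), sep=""))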
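
The snippet above stops before anything is drawn. The following is a minimal, self-contained sketch of how the truncated counts and the overflow bar could be plotted with `barplot()`. The cut-off of 30 is taken from the ">30" label in the text; the variable names and the axis labels are assumptions, not the original author's code.

```r
# Sketch: cut a histogram at `limit` and add one overflow bar.
limit <- 30                       # assumed cut-off, matching the ">30" label
values <- USArrests$Rape          # built-in dataset, rape arrests per 100,000

# Unit-width bins; the upper bound only has to reach max(values)
h <- hist(values, breaks = seq(0, ceiling(max(values)), 1), plot = FALSE)

kept <- h$counts[seq_len(limit)]                          # bins up to the limit
rest <- sum(h$counts[seq(limit + 1, length(h$counts))])   # everything beyond it
bars <- c(kept, rest)
bar_labels <- c(seq_len(limit), paste0(">", limit))

# One bar per bin plus the final overflow bar
barplot(bars, names.arg = bar_labels, las = 2,
        xlab = "Rape arrests per 100,000", ylab = "Number of states")
```

Because the last bar is just the sum of the dropped bins, the bar heights still add up to the number of observations.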

Merging several files into a data.frame (the simplest way)

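# Assumes every file in ./data is tab-delimited and has the same columns
rawdata <- list.files("./data", full.names=TRUE) # full paths, so read.delim can find the files
dfr <- NULL
for (i in rawdata) {
  dfr <- rbind(dfr, read.delim(i)) # append this file's rows
}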

Two-colored symbols

The R plotting symbols pch=21:25 have borders and backgrounds. col gives the border color and bg the background color. (R version 2.8.x)

Example:

opar<-par(mar=c(0,0,2,0),oma=c(0,0,0,0))
plot(1:5,rep(1,5),col='blue',bg='red',pch=21:25,cex=4,
   lwd=2,axes=FALSE,xlim=c(0,6),ylim=c(0.5,1.5),
   main="two-colored symbols",xlab="")
text(1:5,rep(1,5),labels=21:25)
par(opar)

Parsing date strings

# Another version using location and substrings:

x <- as.POSIXct("2005-09-01 01:25:46.9")
print(x) # note: R didn't round up the seconds, and added the time zone

s <- format(x, "%Y-%m-%d %H:%M:%S") # fixed-width string, so the positions are stable
yr <- substring(s,1,4)
m <- substring(s,6,7)
d <- substring(s,9,10)
hour <- substring(s,12,13)
minute <- substring(s,15,16) # not "min", which would mask base::min
sec <- substring(s,18,19) # note: I decided to truncate the decimals,
# you don't have to

## This is simple: you only have to print the variable first to see
## how it is laid out.

Convert data from multivariate (wide) to univariate (stacked/tall/long) form

Our data in multivariate (wide) format:

my_data<- data.frame(
    id=c(1:50),
    depression1=rnorm(50),
    anxiety1=rnorm(50),
    depression2=rnorm(50),
    anxiety2=rnorm(50),
    depression3=rnorm(50),
    anxiety3=rnorm(50)
)
my_data
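
The conversion itself is not shown in this excerpt. As a hedged sketch, base R's `reshape()` is one way to stack such a data frame; the full post may use a different function. The column name `occasion` and the fixed seed are assumptions made for illustration.

```r
# Sketch: wide -> long with base R's reshape().
set.seed(1)  # assumed seed, only so the example is reproducible
my_data <- data.frame(
    id = 1:50,
    depression1 = rnorm(50), anxiety1 = rnorm(50),
    depression2 = rnorm(50), anxiety2 = rnorm(50),
    depression3 = rnorm(50), anxiety3 = rnorm(50)
)

long_data <- reshape(
    my_data,
    direction = "long",
    idvar     = "id",
    # one vector of wide column names per measure, in time order
    varying   = list(c("depression1", "depression2", "depression3"),
                     c("anxiety1", "anxiety2", "anxiety3")),
    v.names   = c("depression", "anxiety"),  # names of the stacked columns
    timevar   = "occasion",                  # hypothetical name for the time index
    times     = 1:3
)

head(long_data)
```

Each subject now contributes three rows, one per occasion, with one column per measure.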

Select time variables in a (multivariate/wide) data frame by name

This function will extract time labeled variables (or any variables with consistent naming) by name from a data frame in the order in which they appear in the data frame. This is especially useful in cases when one has a longitudinal or time-series data set where each row (subject) has many occasions of measurements for each measure. Code and examples after the jump.
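
The function itself is not reproduced in this excerpt. A minimal sketch of the idea, assuming the time-labeled variables share a stem followed by digits (for example `depression1`, `depression2`), might look like this. `select_by_name` and the toy data are hypothetical, not the original author's code.

```r
# Hypothetical helper: keep the columns whose names are `stem` followed by digits,
# in the order in which they appear in the data frame.
# Assumes `stem` contains no regular-expression metacharacters.
select_by_name <- function(df, stem) {
    pattern <- paste0("^", stem, "[0-9]+$")
    df[, grep(pattern, names(df)), drop = FALSE]
}

# Toy data for illustration only
wide <- data.frame(
    id = 1:3,
    depression1 = c(0.1, 0.2, 0.3),
    anxiety1    = c(1.1, 1.2, 1.3),
    depression2 = c(0.4, 0.5, 0.6),
    anxiety2    = c(1.4, 1.5, 1.6)
)

dep <- select_by_name(wide, "depression")
names(dep)
```

`grep()` returns the matching column indices in ascending order, which is what preserves the data frame's own column order.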

Create a symmetric matrix by specifying upper or lower triangle

The input is specified as the row-wise lower triangle or column-wise upper triangle.
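
For a symmetric matrix the two orderings describe the same vector, so one fill routine covers both. The sketch below assumes the diagonal is included in the input; `sym_from_triangle` is a hypothetical name, not the function from the full post.

```r
# Hypothetical helper: build a symmetric matrix from a vector holding the
# row-wise lower triangle (equivalently the column-wise upper triangle),
# diagonal included.
sym_from_triangle <- function(x) {
    # Solve n(n+1)/2 = length(x) for the matrix dimension n
    n <- (sqrt(8 * length(x) + 1) - 1) / 2
    stopifnot(n == round(n))
    m <- matrix(0, n, n)
    m[upper.tri(m, diag = TRUE)] <- x       # column-major fill of the upper triangle
    m[lower.tri(m)] <- t(m)[lower.tri(m)]   # mirror it into the lower triangle
    m
}

m <- sym_from_triangle(c(1, 2, 3, 4, 5, 6))
m
```

Here the input reads as rows `1`, `2 3`, `4 5 6` of the lower triangle, so the first column of the result is 1, 2, 4.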

Calling R from Python

This is an overview of RPy:

http://www.daimi.au.dk/~besen/TBiB2007/lecture-notes/rpy.html

While I've never used it, I can definitely see myself doing so, as I've recently become more and more of a Python programmer.

Convert data from univariate (stacked/tall/long) to multivariate (wide) form

The function and documentation are listed after the jump.
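
Since the function is not shown in this excerpt, here is a hedged sketch of the same conversion using base R's `reshape()`, which may differ from the approach in the full post. The toy data and the column names `occasion` and `depression` are assumptions for illustration.

```r
# Sketch: long -> wide with base R's reshape().
# Toy data for illustration only: 2 subjects, 3 occasions each
long_data <- data.frame(
    id         = rep(1:2, each = 3),
    occasion   = rep(1:3, times = 2),
    depression = c(0.5, 0.7, 0.9, 1.1, 1.3, 1.5)
)

wide_data <- reshape(
    long_data,
    direction = "wide",
    idvar     = "id",          # one output row per id
    timevar   = "occasion",    # values become the column suffixes
    v.names   = "depression",  # measure to spread across columns
    sep       = ""             # gives depression1, depression2, ...
)

wide_data
```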
