Lookups in R: The Wrong Way and the Right Way

I recently wrote a script that takes DBF exports of Cube matrices and prepares them for Biogeme. Â The main… well, only reason I did this in R was because I was considering using mlogit for model estimation. Â I ultimately decided to ‘go with what I know’ and changed course to use Biogeme. Mind you, the part of Stairway to Heaven applies: “There are two paths you can go by, but in the long run /Â There’s still time to change the road you’re on.”

The Wrong Way

I’ve changed my code already, so pardon that this is from memory. Â Also, these are snippets – I have a lot more code than this.

HSkimPk<-read.dbf("data/HSKIM_PK3.dbf")

for(rn in 1:nrow(TripsAll)){
HSkimPkRow<-subset(HSkimPk,I==TripsAll[rn,"PTAZ"] & J==TripsAll[rn,"ATAZ")
TripsAll$DA.IVT<-HSkimPkRow[,"V1"]
...
}

This took no less than 17 hours to complete for around 23,000 trip records and for values from 5 different tables * 2 time periods.

The Right Way

I (obviously) wanted something that wouldn't take forever, especially as I was working in Biogeme and seeing things that made me think that I wanted to change ONE LITTLE THING. Â This seems to always happen.

I took a different approach that by my calculations should be much quicker.

HSkimPk<-read.dbf("data/HSKIM_PK3.dbf")
HSkimPkD<-acast(HSkimPk,I ~ J,value.var="V2",drop=FALSE,fill=0)
HSkimPkT<-acast(HSkimPk,I ~ J,value.var="V1",drop=FALSE,fill=0)

for(rn in 1:nrow(TripsAll)){
if(I<=nrow(HSkimPkT) & J<=nrow(HSkimPkT)){
TripsAll[rn,"DA.IVT"]<-HSkimPkT[I,J]
}
}

Since this is currently running, my only metrics are to look at the time per 50 rows (in my real code, I have a line that outputs a timestamp every 50 rows), and it is taking about 0.27 seconds per record, compared to somewhere around 4.5 seconds per record. Â While not perfect, I'll take an estimated completion of 1.75 hours compared to 17 (update: 2 hours). Â However, I will say that Cube is faster in this regard and that I may not have the fastest R solution.

Tags: cube, lookup, matrix, R

Posted on Thursday, August 28th, 2014 By Andrew Rohne Comments are closed.

Comments from Other Sites

Comments are closed.

Lookups in R: The Wrong Way and the Right Way

The Wrong Way

The Right Way

Like this:

Comments from Other Sites

About this Blog

Associated Sites

Links

Pages

Categories

Subscribe

Tags

Social Networking

Recent Presentations

Photography

Lookups in R: The Wrong Way and the Right Way

The Wrong Way

The Right Way

Share this:

Like this:

Comments from Other Sites

About this Blog

Associated Sites

Links

Pages

Categories

Subscribe

Tags

Social Networking

Recent Presentations

Photography