Quick loan rate calculation for a large number of loans

I have a large data set (about 200 thousand rows), where each row is a loan. For each loan I have the amount, the number of payments, and the payment amount, and I am trying to compute its interest rate. R does not have a function to calculate this (at least not in base R, and I could not find one). It's not hard to write both NPV and IRR functions:

 Npv <- function(i, cf, t = seq(from = 0, by = 1, along.with = cf))
     sum(cf / (1 + i)^t)

 Irr <- function(cf) uniroot(Npv, c(0, 100000), cf = cf)$root

And you can just do

 rate <- Irr(c(-amt, rep(pmt, times = n)))

The problem is that I am trying to calculate rates for a large number of loans. Since uniroot is not vectorized, and since rep takes a surprising amount of time, this computation is slow. It can be done faster if you do the math and notice that you are really looking for the root of the following equation:

 zerome <- function(r) amt/pmt - (1 - 1/(1 + r)^n)/r

and then use that as the input to uniroot. On my computer, this takes about 20 seconds to run over my 200k data set.
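For concreteness, the per-loan loop described above looks roughly like this (a minimal sketch; the `loans` data frame and its values are made up for illustration):

```r
# Sketch of the per-loan solve: find the root of the reduced annuity
# equation amt/pmt = (1 - 1/(1+r)^n)/r with uniroot, one loan at a time.
# `loans` and its columns are hypothetical illustration data.
loans <- data.frame(amt = c(100000, 250000),
                    pmt = c(1000, 1500),
                    n   = c(360, 240))

solve.rate <- function(amt, pmt, n) {
    zerome <- function(r) amt/pmt - (1 - 1/(1 + r)^n)/r
    uniroot(zerome, c(1e-6, 1), tol = 1e-9)$root
}

loans$r <- mapply(solve.rate, loans$amt, loans$pmt, loans$n)
loans$r  # per-period rates
```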

The problem is that I'm doing some optimization, and this rate calculation is one step inside it, so I'm trying to speed it up even more.

I tried vectorizing, but since uniroot is not vectorized, I can't get any further that way. Is there any root-finding method that is vectorized?

thanks

1 answer

Instead of using a root finder, you could use a linear interpolator. You will need to build one interpolator for each value of n (the number of remaining payments). Each interpolator maps (1-1/(1+r)^n)/r to r. Of course, you will need to make the grid fine enough that it returns r to an acceptable accuracy. The nice thing about this approach is that linear interpolators are fast and vectorized: you can find the rates of all loans with the same number of remaining payments (n) in a single call to the corresponding interpolator.

Now some code that proves this is a viable solution:

First, we create the interpolators, one for each possible value of n:

 n.max <- 360L   # 30 years

 one.interpolator <- function(n) {
     r <- seq(from = 0.0001, to = 0.1500, by = 0.0001)
     y <- (1 - 1/(1 + r)^n)/r
     approxfun(y, r)
 }

 interpolators <- lapply(seq_len(n.max), one.interpolator)

Note that I used a precision of 1/100 of one percent (1 bp) for r.
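As a quick sanity check on that grid (my addition, not part of the original answer), one can compare an interpolated rate against a direct uniroot solve for a single loan:

```r
# Compare the interpolator's answer to a direct root solve for n = 360.
n <- 360L
r.grid <- seq(from = 0.0001, to = 0.1500, by = 0.0001)
interp <- approxfun((1 - 1/(1 + r.grid)^n)/r.grid, r.grid)

amt <- 100000; pmt <- 1000                     # illustration values
r.interp <- interp(amt/pmt)
r.exact  <- uniroot(function(r) amt/pmt - (1 - 1/(1 + r)^n)/r,
                    c(1e-4, 0.15), tol = 1e-9)$root
abs(r.interp - r.exact)                        # within 1 bp
```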

Then we create some fake data:

 n.loans <- 200000L
 n   <- sample(n.max, n.loans, replace = TRUE)
 amt <- 1000 * sample(100:500, n.loans, replace = TRUE)
 pmt <- amt / (n * (1 - runif(n.loans)))
 loans <- data.frame(n, amt, pmt)

Finally, we solve for r:

 library(plyr)
 system.time(ddply(loans, "n", transform, r = interpolators[[n[1]]](amt / pmt)))
 #    user  system elapsed
 #   2.684   0.423   3.084

It is fast. Note that some of the output rates are NA, but that is because my random inputs made no sense and would imply rates outside the [0, 15%] grid I chose. Your real data will not have this problem.
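If you'd rather not depend on plyr, the same grouping can be done in base R with split and lapply (my variant, not from the original answer; it rebuilds the same interpolators as above, with a smaller fake data set for brevity):

```r
# Base-R variant: build the interpolators, split the loans by n, and
# apply the matching interpolator to each group in one vectorized call.
one.interpolator <- function(n) {
    r <- seq(from = 0.0001, to = 0.1500, by = 0.0001)
    approxfun((1 - 1/(1 + r)^n)/r, r)
}
n.max <- 360L
interpolators <- lapply(seq_len(n.max), one.interpolator)

set.seed(42)                                   # fake data, as above
n.loans <- 10000L
n   <- sample(n.max, n.loans, replace = TRUE)
amt <- 1000 * sample(100:500, n.loans, replace = TRUE)
pmt <- amt / (n * (1 - runif(n.loans)))
loans <- data.frame(n, amt, pmt)

groups  <- split(loans, loans$n)
loans.r <- do.call(rbind, lapply(groups, function(g) {
    g$r <- interpolators[[g$n[1]]](g$amt / g$pmt)
    g
}))
```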

