I'm a complete R newbie, so please set the difficulty level of your answers accordingly.
I use the ROCR package in R to generate the plot data for ROC curves, and then use ggplot2 to draw the graph. Something like this:
    library(ggplot2)
    library(ROCR)

    # Read space-separated score/label pairs
    inputFile <- read.csv("path/to/file", header=FALSE, sep=" ",
                          colClasses=c('numeric','numeric'),
                          col.names=c('score','label'))

    predictions <- prediction(inputFile$score, inputFile$label)
    auc <- performance(predictions, measure="auc")@y.values[[1]]

    # ROC curve points: false positive rate (x) vs. true positive rate (y)
    rocData <- performance(predictions, "tpr", "fpr")
    rocDataFrame <- data.frame(x=rocData@x.values[[1]], y=rocData@y.values[[1]])

    # Draw the curve, then print the AUC in the bottom-right corner
    rocr.plot <- ggplot(data=rocDataFrame, aes(x=x, y=y)) + geom_path(size=1)
    rocr.plot <- rocr.plot + geom_text(aes(x=1, y=0, hjust=1, vjust=0, label=paste(sep="", "AUC = ", round(auc,4))), colour="black", size=4)
This works well for drawing a single ROC curve. However, what I would like to do is read in a whole directory of input files (one file per classifier test result) and make a multi-facet ggplot2 graph of all the ROC curves, while still printing the AUC score on each graph.
I would like to understand what the "right" R-style approach to this is. I am sure I can hack something together with one loop over all the files in the directory that creates a separate data frame for each, and then another loop that creates several graphs and somehow gets ggplot2 to display them all on the same surface. However, that does not let me use ggplot2's built-in faceting, which I believe is the right approach. I am not sure how to get my data into the proper form for faceting. Should I merge all my data frames into one, give each merged chunk a name (for example, the file name), and facet on that? If so, is there a library or recommended practice for doing this?
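To make the merged-data-frame idea concrete, here is a minimal sketch of what I have in mind, not a tested solution. It assumes every file in the directory has the same two-column score/label layout as above. The `readRoc` helper, the `file` column name, and the simulated files written to a temporary folder are all hypothetical stand-ins for my real data.

```r
library(ROCR)
library(ggplot2)

# Stand-in for the real results directory: two small simulated score/label
# files written to a temporary folder (made-up data, illustration only).
set.seed(1)
resultDir <- file.path(tempdir(), "roc_demo")
dir.create(resultDir, showWarnings = FALSE)
for (name in c("classifierA.txt", "classifierB.txt")) {
  label <- rep(c(0, 1), each = 50)
  score <- rnorm(100, mean = label)
  write.table(data.frame(score, label), file.path(resultDir, name),
              sep = " ", row.names = FALSE, col.names = FALSE)
}

# Hypothetical helper: read one file and return its ROC points,
# tagged with the file name so ggplot2 can facet on it.
readRoc <- function(path) {
  input <- read.csv(path, header = FALSE, sep = " ",
                    colClasses = c("numeric", "numeric"),
                    col.names = c("score", "label"))
  roc <- performance(prediction(input$score, input$label), "tpr", "fpr")
  data.frame(file = basename(path),
             x = roc@x.values[[1]],
             y = roc@y.values[[1]])
}

# One long data frame holding the curves from every file.
files <- list.files(resultDir, full.names = TRUE)
rocAll <- do.call(rbind, lapply(files, readRoc))

# One panel per file.
rocPlot <- ggplot(rocAll, aes(x = x, y = y)) +
  geom_path() +
  facet_wrap(~ file)
```

If this is roughly right, the explicit loops disappear: `lapply` builds one data frame per file, `rbind` stacks them, and `facet_wrap` splits the result back into panels by file name.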
Your suggestions are welcome. I'm still getting a feel for best practices in R, so I would rather get expert advice than just hack something up that looks more like the usual declarative programming languages I'm used to.
EDIT: The thing I'm least clear on is whether, when using ggplot2's built-in faceting features, I can print a custom string (the AUC score) on each plot it generates.
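For the per-panel text, one approach I have seen suggested is to give the text layer its own data frame with one row per panel and the same faceting column as the curves. Below is a minimal sketch under that assumption. The run names, the simulated scores, and the `file` and `label` column names are made up for illustration.

```r
library(ROCR)
library(ggplot2)

# Made-up scores for two hypothetical result files, for illustration only.
set.seed(2)
results <- lapply(c(runA = 1, runB = 2), function(shift) {
  label <- rep(c(0, 1), each = 50)
  data.frame(score = rnorm(100, mean = shift * label), label = label)
})

# ROC points for every run in one long data frame, tagged by run name.
rocAll <- do.call(rbind, lapply(names(results), function(run) {
  pred <- prediction(results[[run]]$score, results[[run]]$label)
  roc <- performance(pred, "tpr", "fpr")
  data.frame(file = run, x = roc@x.values[[1]], y = roc@y.values[[1]])
}))

# One row per run: the text to print in that run's panel.
aucAll <- do.call(rbind, lapply(names(results), function(run) {
  pred <- prediction(results[[run]]$score, results[[run]]$label)
  auc <- performance(pred, measure = "auc")@y.values[[1]]
  data.frame(file = run, label = paste0("AUC = ", round(auc, 4)))
}))

rocPlot <- ggplot(rocAll, aes(x = x, y = y)) +
  geom_path() +
  # The label layer uses its own data; the shared `file` column routes
  # each row to the matching panel.
  geom_text(data = aucAll, aes(x = 1, y = 0, label = label),
            hjust = 1, vjust = 0, size = 4) +
  facet_wrap(~ file)
```

Because `aucAll` has exactly one row per run, each panel gets a single label rather than one copy per curve point.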