Remove some of the text labels from tick marks in a ggplot bar plot

I am doing some research on defaulters and non-defaulters in banking. In that context, I am plotting their distributions relative to some score in a bar plot. The higher the score, the better the credit rating.

Since the number of defaults is very small compared to the number of non-defaults, plotting defaults and non-defaults on the same bar plot is not very informative, because you can hardly see the defaults. I therefore make a second bar plot based on the defaulters' scores only, but on the same interval scale as the full bar plot of both defaulters and non-defaulters. I then add vertical lines to the first bar plot indicating where the highest and lowest defaulter scores are located. The aim is to show where the distribution of defaulters fits into the overall distribution of both defaulters and non-defaulters.

The x axis easily becomes very "crowded". I would like to remove the text from some of the ticks, but keep all of the tick marks.

Below is the code that I use, with reproducible (seeded) random data in place of the real data.

The first bar plot has the tick text the way I would like it, but it is missing the tick marks that the second bar plot has. The second bar plot shows the "crowded" tick text.

    library(ggplot2)
    library(ggExtra)

    # NDS represents non-defaults and DS defaults on the same scale,
    # although here they are just some random normals for the sake of simplicity.
    set.seed(10)
    NDS <- rnorm(10000, sd = 1) - 2
    DS <- rnorm(100, sd = 2) - 5

    # Cutoffs are constructed such that intervals of size 0.3 contain all values
    # of NDS & DS
    minCutoff <- -9.3
    maxCutoff <- 2.1

    # Generate the actual interval "bins"
    NDS_CUT <- cut(NDS, breaks = seq(minCutoff, maxCutoff, by = 0.3))
    DS_CUT <- cut(DS, breaks = seq(minCutoff, maxCutoff, by = 0.3))

    # Manually generate where to put the vertical lines for min(DS) and max(DS)
    minDS_bar <- levels(cut(NDS, breaks = seq(minCutoff, maxCutoff, by = 0.3)))[1]
    maxDS_bar <- levels(cut(NDS, breaks = seq(minCutoff, maxCutoff, by = 0.3)))[32]

    # Generate data frame - seems stupid, but makes sense
    # when the "real" data is used :-)
    NDSdataframe <- cbind(as.data.frame(NDS_CUT), rep(factor("State-1"), length(NDS_CUT)))
    colnames(NDSdataframe) <- c("Score", "Action")
    DSdataframe <- cbind(as.data.frame(DS_CUT), rep(factor("State-2"), length(DS_CUT)))
    colnames(DSdataframe) <- c("Score", "Action")
    fulldataframe <- rbind(NDSdataframe, DSdataframe)
    attach(fulldataframe)

    # Plot the full distribution of NDS & DS with geom_vlines

    # Get the tick texts I want to show
    myLevels <- levels(cut(NDS, breaks = seq(minCutoff, maxCutoff, by = 0.3)))
    lengthMyLevels <- length(myLevels)
    myBreaks <- seq(1, lengthMyLevels, 3)
    chosenbreaks <- myLevels[myBreaks[1]]
    for (i in 2:length(myBreaks)) {
      chosenbreaks <- rbind(chosenbreaks, myLevels[myBreaks[i]])
    }

    # Generate the plot of both NDS & DS
    fullplot <- ggplot(fulldataframe, aes(Score, fill = factor(Action, levels = c("State-2", "State-1")))) +
      geom_bar(position = "stack") +
      opts(axis.text.x = theme_text(angle = 45, size = 8)) +
      opts(legend.position = "none") +
      xlab("Scoreinterval") +
      ylab("Antal pr. interval") +
      geom_vline(aes(xintercept = minDS_bar, colour = "red")) +
      geom_vline(aes(xintercept = maxDS_bar, colour = "red")) +
      scale_x_discrete("test", breaks = chosenbreaks)

    # Generate second dataframe for the plot of DS only
    DSdataframe2 <- cbind(na.omit(as.data.frame(DS_CUT)), rep(factor("Fallit"), length(na.omit(as.data.frame(DS_CUT)))))
    colnames(DSdataframe2) <- c("theScore", "theAction")

    # Calculate max value for the DS
    myMax <- max(table(DSdataframe2)) + 1
    attach(DSdataframe2)

    # Generate plot for the DS only
    subplot <- ggplot(fulldataframe, aes(theScore, fill = factor(theAction))) +
      geom_bar(position = "stack") +
      opts(axis.text.x = theme_text(angle = 45)) +
      opts(legend.position = "none") +
      ylim(0, myMax) +
      xlab("Scoreinterval") +
      ylab("Antal pr. interval")

    # Using the ggExtra package the two plots are aligned
    align.plots(fullplot, subplot)
    detach(DSdataframe2)
    detach(fulldataframe)

Any help is much appreciated!

Thanks,

Christian

r ggplot2
2 answers

If I understand correctly, you can simply specify empty text labels for every other tick:

    library(ggplot2)

    interleave <- function(x, y) {
      lx <- length(x)
      ly <- length(y)
      n <- max(lx, ly)
      as.vector(rbind(rep(x, length.out = n), rep(y, length.out = n)))
    }

    d <- data.frame(x = 1:10, y = rnorm(10))

    my_breaks <- seq(1, 10, by = 1)
    my_labs <- interleave(seq(1, 10, by = 2), "")

    qplot(x, y, data = d) +
      scale_x_continuous(breaks = my_breaks, labels = my_labs)
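The same idea should carry over to the discrete (binned) axis in the question. Below is a minimal sketch, assuming a reasonably recent ggplot2 in which `scale_x_discrete()` accepts a function for `labels`. The helper name `every_nth` is hypothetical (it is not part of ggplot2), and the data is just seeded random noise binned the same way as in the question.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): returns a labelling function
# that keeps every n-th label and blanks out the rest.
every_nth <- function(n) {
  function(labels) {
    keep <- (seq_along(labels) - 1) %% n == 0
    ifelse(keep, labels, "")
  }
}

# Seeded random scores, binned into intervals of width 0.3 as in the question
set.seed(10)
scores <- cut(rnorm(1000, sd = 1) - 2, breaks = seq(-9.3, 2.1, by = 0.3))
d <- data.frame(Score = scores)

p <- ggplot(d, aes(Score)) +
  geom_bar() +
  # drop = FALSE keeps empty bins on the axis; every tick mark is drawn,
  # but only every third one gets text
  scale_x_discrete(labels = every_nth(3), drop = FALSE) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8))

print(p)
```

Because the breaks themselves are left alone and only the labels are blanked, all tick marks stay on the axis, which is what distinguishes this from passing a reduced `breaks` vector.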


Here's another version based on @baptiste's answer that makes it easy to keep the even or odd labels, every third label, etc.

    library(ggplot2)
    library(gridExtra)

    ## helper function
    ## periodically replace orig with .fill
    label_fill <- function(orig, .offset=0, .mod=2, .fill=""){
        ## replace
        ii <- as.logical(
            ## offset==0 keeps first
            (1:length(orig)-1+.offset) %% .mod
        )
        orig[ii] <- .fill
        orig
    }

    ## data, labels
    nn <- 10
    my_dat <- data.frame(x=1:nn, y=rnorm(nn))
    my_breaks <- my_dat$x

    my_plot <- (
        ggplot(my_dat, aes(x,y))
        + geom_line()
        ## default: every other, start from 1
        ## by default, function takes breaks
        + scale_x_continuous(
            breaks=my_breaks,
            labels=label_fill
        )
        + theme_bw()
    )

    ## another form
    ## manually pass breaks
    every_third <- scale_x_continuous(
        breaks=my_breaks,
        labels=label_fill(my_breaks, .mod=3)
    )

    ## side-by-side
    grid.arrange(ncol=2,
        my_plot,
        ## every third with offset
        my_plot + every_third
    )
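To see which labels survive for a given `.mod` and `.offset`, the helper can be called directly on a vector of breaks. The sketch below repeats the helper (with `seq_along()` in place of `1:length()`, which should behave the same here) so that it runs on its own:

```r
# Same helper as above, repeated so this snippet is self-contained
label_fill <- function(orig, .offset = 0, .mod = 2, .fill = "") {
  # TRUE marks the positions to blank out; .offset = 0 keeps the first label
  ii <- as.logical((seq_along(orig) - 1 + .offset) %% .mod)
  orig[ii] <- .fill
  orig
}

label_fill(1:6)               # "1" ""  "3" ""  "5" ""
label_fill(1:6, .mod = 3)     # "1" ""  ""  "4" ""  ""
label_fill(1:6, .offset = 1)  # ""  "2" ""  "4" ""  "6"
```

Note that assigning `""` into a numeric vector coerces the whole result to character, which is fine for axis labels.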
