Final implementation - not finished, but heading in the right direction
Idea / Problem: You have a plot with many overlapping points and you want to replace them with a simple filled area, so that the plot renders and displays faster.
Possible implementation: Calculate the matrix of distances between all points and connect all points below a given distance.
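To make the "connect all points below a given distance" idea concrete, here is a tiny sketch on five made-up points. It assumes that single-linkage clustering cut at height h is an acceptable way to express "connected through a chain of neighbours closer than h"; the point coordinates and the threshold 0.15 are purely illustrative.

```r
# Five hypothetical points on a line: three close together, two close together
pts <- cbind(x = c(0, 0.1, 0.2, 1, 1.05), y = 0)

# Pairwise Euclidean distances between all points
d <- dist(pts)

# Single linkage merges clusters by their closest pair of points, so cutting
# the tree at h groups points that are linked by steps shorter than h
groups <- cutree(hclust(d, method = "single"), h = 0.15)
print(groups)
```

Points 1 to 3 end up together even though points 1 and 3 are 0.2 apart, because each is within 0.15 of point 2. That chaining is what turns a dense cloud into one connected area.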
Todo / Not completed: Currently, it only works for manually set distances, which depend on the size of the plotting area. I stopped here because the result did not satisfy my aesthetic sense.
Minimal example with intermediate plots
set.seed(074079089)
n.points <- 3000
mat <- matrix(rnorm(n.points*2, 0, 0.2), nrow=n.points, ncol=2)
colnames(mat) <- c("x", "y")

# Single-linkage clustering on the pairwise distances
d.mat <- dist(mat)
fit.mat <- hclust(d.mat, method = "single")

lims <- c(-1,1)
real.lims <- lims*1.1 ## ggplot expands the limits by roughly this much

# An attempt to estimate the point sizes; works for default pdfs, e.g. pdf("test.pdf")
cutsize <- sum(abs(real.lims))/100
groups <- cutree(fit.mat, h=cutsize) # cut tree at height cutsize
# plot(fit.mat) # display dendrogram
# draw dendrogram with red borders around the clusters
# rect.hclust(fit.mat, h=cutsize, border="red")

library(ggplot2)
df <- data.frame(mat)
df$groups <- groups
# guides(col="none") hides the legend (col=FALSE is deprecated in recent ggplot2)
plot00 <- ggplot(data=df, aes(x,y, col=factor(groups))) +
  geom_point() + guides(col="none") + xlim(lims) + ylim(lims) +
  ggtitle("Each color is a group")
pdf("plot00.pdf")
print(plot00)
dev.off()


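# Return the rows of df_0 that form its convex hull
find_hull <- function(df_0) {
  return(df_0[chull(df_0$x, df_0$y), ])
}

library(plyr)
# Group 0 appears to mark unconnected points; the step that assigns it is not shown here
single.points.df <- df[df$groups == 0 , ]
connected.points.df <- df[df$groups != 0 , ]
hulls <- ddply(connected.points.df, "groups", find_hull) # one hull per group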

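# plot02 is not defined in this excerpt; it is presumably the plot from an earlier step
plot03 <- plot02
# Add one semi-transparent polygon per group
for(grp in names(table(hulls$groups))) {
  plot03 <- plot03 + geom_polygon(data=hulls[hulls$groups==grp, ], aes(x,y), alpha=0.4)
}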
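The snippets above do not show how group 0 is assigned or how plot02 is built, so here is a self-contained sketch of how the pieces might fit together. It rests on assumptions that are not from the original post: groups with fewer than `min.size` points are relabelled as group 0 and drawn as plain points, the threshold `cutsize` is picked by hand, and the hulls are drawn in a single `geom_polygon` call via the `group` aesthetic instead of a loop. It also uses base R instead of plyr.

```r
library(ggplot2)

set.seed(1)
n.points <- 500
df <- data.frame(x = rnorm(n.points, 0, 0.2), y = rnorm(n.points, 0, 0.2))

# Single-linkage clustering: points chained by steps shorter than cutsize share a group
cutsize <- 0.022  # hypothetical threshold, tune it to the plotted point size
fit <- hclust(dist(df[, c("x", "y")]), method = "single")
df$groups <- cutree(fit, h = cutsize)

# Assumption: groups with fewer than min.size points are relabelled as group 0
min.size <- 3
small <- as.integer(names(which(table(df$groups) < min.size)))
df$groups[df$groups %in% small] <- 0

single.points.df <- df[df$groups == 0, ]
connected.points.df <- df[df$groups != 0, ]

# Convex hull of each remaining group
find_hull <- function(d) d[chull(d$x, d$y), ]
hulls <- do.call(rbind, lapply(split(connected.points.df,
                                     connected.points.df$groups), find_hull))

# Isolated points as points, connected groups as filled hulls
p <- ggplot() +
  geom_point(data = single.points.df, aes(x, y)) +
  geom_polygon(data = hulls, aes(x, y, group = groups), alpha = 0.4)
```

Calling `print(p)` draws the result. Because single linkage can chain a whole dense cloud into one group, the convex hull may cover more area than the points actually occupy, which is likely part of why the result can look unsatisfying.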

Starting question
I have a (maybe odd) question.
In some cases, I have thousands of points in my analysis. Plotting them takes the computer a long time because there are so many of them. Since many of these points overlap, I end up with a filled area (which is fine!). To save time and effort, it would be useful to simply fill in this area instead of drawing every single point.
I know that there are options such as heatmaps and so on, but that is not what I have in mind. My idea is something like:
Any ideas? Or is this simply not possible?
# Example code
# install.packages("ggplot2")
library(ggplot2)
n.points <- 10000
mat <- matrix(rexp(n.points*2), nrow=n.points, ncol=2)
colnames(mat) <- c("x", "y")
df <- data.frame(mat)
plot00 <- ggplot(df, aes(x=x, y=y)) +
  theme_bw() +            # white background, grey strips
  geom_point(shape=19)    # appearance of the points
print(plot00)


Edit:
Regarding density plots, as mentioned by fdetsch (how can I tag a name?), there are already some questions on this topic. But that is definitely not what I want. I know my request is a little strange, but densities sometimes make the plot busier than necessary.
Links to questions about densities:
Scatterplot with too many dots
High Density Charts