How to add a legend for a regional map with a legend describing related tags using ggplot2?

Question

How to add a legend for a regional map with a legend describing related tags using ggplot2?

SpatialPoly Data: SpatialData

Productivity Data: Yield Data

the code:

## Loading packages library(rgdal) library(plyr) library(maps) library(maptools) library(mapdata) library(ggplot2) library(RColorBrewer) library(foreign) library(sp) ## Loading shapefiles and .csv files #Morocco <- readOGR(dsn=".", layer="Morocco_adm0") MoroccoReg <- readOGR(dsn=".", layer="Morocco_adm1") MoroccoYield <- read.csv(file = "F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/RMaps_Morocco/Morocco_Yield.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE) ## Reorder the data in the shapefile based on the category variable "ID_1" and change to dataframe MoroccoReg <- MoroccoReg[order(MoroccoReg$ID_1), ] MoroccoReg.df <- fortify(MoroccoReg) ## Add the yield impacts column to shapefile from the .csv file by "ID_1" ## Note that in the .csv file, I just added the column "ID_1" to match it with the shapefile MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1') ## Check the structure and contents of shapefile attributes(MoroccoReg.df) ## Define new theme for map ## I have found this function on the website theme_map <- function (base_size = 12, base_family = "") { theme_gray(base_size = base_size, base_family = base_family) %+replace% theme( axis.line=element_blank(), axis.text.x=element_blank(), axis.text.y=element_blank(), axis.ticks=element_blank(), axis.ticks.length=unit(0.3, "lines"), axis.ticks.margin=unit(0.5, "lines"), axis.title.x=element_blank(), axis.title.y=element_blank(), legend.background=element_rect(fill="white", colour=NA), legend.key=element_rect(colour="white"), legend.key.size=unit(1.5, "lines"), legend.position="right", legend.text=element_text(size=rel(1.2)), legend.title=element_text(size=rel(1.4), face="bold", hjust=0), panel.background=element_blank(), panel.border=element_blank(), panel.grid.major=element_blank(), panel.grid.minor=element_blank(), panel.margin=unit(0, "lines"), plot.background=element_blank(), plot.margin=unit(c(1, 1, 0.5, 0.5), "lines"), plot.title=element_text(size=rel(1.8), face="bold", hjust=0.5), strip.background=element_rect(fill="grey90", colour="grey50"), strip.text.x=element_text(size=rel(0.8)), strip.text.y=element_text(size=rel(0.8), angle=-90) ) } ## Plotting MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group = group)) MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2)) MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linestyle = 2) #MoroccoRegMap <- MoroccoRegMap + scale_fill_gradient(low = "#CC0000",high = "#006600") MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600") MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect") MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() + theme_map() MoroccoRegMap1

Results:

Map

Question:

In Yield data, I have a column that describes the labels corresponding to each of the entries in column "ID_1". What I'm trying to achieve are two things:

1) build a map and add records of the variable "ID_1" in the form of marks on the map, thus defining each area;

2) generates a second legend, except for the one that captures the values in the data and which entries in "ID_1" and their corresponding description in the "Labels" column in the data frame.

I hope I have clearly formulated my question.

thanks.

+5

r map ggplot2

iouraich Dec 04 '13 at 14:49

source share

1 answer

jlhoward · Accepted Answer · 2013-12-07T22:15:41+0000

First, let me apologize for having returned so long - I missed your comment among all the others. Is that what you meant?

This was done with the following code. Before delving into the explanation, you should know that creating a legend is the least of your problems. Notice how different colors are on the two cards. The above code does not indicate CO2 changes to the correct regions. For example, according to MoroccoYields.csv , the largest change (improvement?) Was -0.205 in Region 4 , but on your map the largest (darkest red) is at the northeast tip of Morocco, which is actually l'Oriental (Region 6) . The explanation follows the code.

 ## Loading packages library(rgdal) library(plyr) library(maps) library(maptools) library(mapdata) library(ggplot2) library(RColorBrewer) library(foreign) library(sp) # get.centroids: function to extract polygon ID and centroid from shapefile get.centroids = function(x){ poly = MoroccoReg@polygons[[x]] ID = poly@ID centroid = as.numeric(poly@labpt) return(c(id=ID, long=centroid[1], lat=centroid[2])) } #setwd("Directory where shapefile and Yields are stored") ## Loading shapefiles and .csv files MoroccoReg <- readOGR(dsn=".", layer="Morocco_adm1") MoroccoYield <- read.csv(file = "Morocco_Yield.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE) MoroccoYield$ID_1 <- substr(MoroccoYield$ID_1,3,10) ## Reorder the data in the shapefile based on the category variable "ID_1" and change to dataframe MoroccoReg <- MoroccoReg[order(MoroccoReg$ID_1), ] MoroccoYield <- cbind(id=rownames(MoroccoReg@data),MoroccoYield) # build table of labels for annotation (legend). labs <- do.call(rbind,lapply(1:14,get.centroids)) labs <- merge(labs,MoroccoYield[,c("id","ID_1","Label")],by="id") labs[,2:3] <- sapply(labs[,2:3],function(x){as.numeric(as.character(x))}) labs$sort <- as.numeric(as.character(labs$ID_1)) labs <- labs[order(labs$sort),] MoroccoReg.df <- fortify(MoroccoReg) ## This does NOT work... ## Add the yield impacts column to shapefile from the .csv file by "ID_1" ## Note that in the .csv file, I just added the column "ID_1" to match it with the shapefile #MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1') ## Do it this way... MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id") ## Check the structure and contents of shapefile attributes(MoroccoReg.df) ## Plotting MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=id)) MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2)) MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linestyle = 2) MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600") MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect") MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map() MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1), size=4) MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14), label=paste(labs$ID_1,": ",labs$Label,sep=""), size=3, hjust=0) MoroccoRegMap1

Explanation:

First, when combining yield data with map regions: you use

 MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')

This does not work cbind(...) . cbind(...) just concatenates the arguments column by column. This is not a merge function. Thus, you had a MoroccoReg.df data frame with MoroccoReg.df lines (one line for each endpoint of the line on your map), and you combine it with MoroccoYield , which has 14 lines (1 for each region). So cbind(...) replicates these 14 lines 7,700 times to fill the cbind(...) lines that it needs. The expression by="ID_1" simply adds another column named by" with "ID_1" replicated head(MoroccoReg.df) times. Run the above expression and enter head(MoroccoReg.df) and find the last column.

So how to make a merge? There are a number of functions in R that should make this easy, but I couldn't get them to work. Here is what worked:

Each polygon in the shapefile has an identifier. In the data section of the shapefile there is also a field "ID_1", but they are different and are not related to each other. [BTW: the ID_1 field in the shapefile data section and the ID_1 field in your csv file were also different: the latter has a "TR" added to the region; so that would also have to be dealt with]. Reordering a shapefile using:

 MoroccoReg <- MoroccoReg[order(MoroccoReg$ID_1), ]

will change the order of polygons, but will not change their identifiers. It turns out that the polygon identifier matches the name of the line in the data section of the shapefile, so I added (using cbind(...) !) To your MoroccoYeild data MoroccoYeild .

 MoroccoYield <- cbind(id=rownames(MoroccoReg@data),MoroccoYield)

So now MoroccoYield has an id field that displays the polygon field and an ID_1 field that identifies the area. Now we can fortify(...) and merge(...) . merge(...) takes a by= argument.

 MoroccoReg.df <- fortify(MoroccoReg) MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id")

This adds all of your MoroccoYield columns to the corresponding MoroccoReg.df rows.

Legend Creation:

The obvious question is how to position tags? Ideally, we would put the region number from MoroccoYield$ID_1 in the center of gravity of each region and then create a legend that identifies Regions based on the MoroccoYield$Label .

So where to find centroids? They are stored in an obscure place in the polygonal section of the shapefile. In short, I created a utility function get.centroid(...) that extracts a centroid from a polygon. Then I applied this function to all polygons to create a centroid table with the corresponding polygon identifier. Then I combined this with the tags in MoroccoYield . This created a labs data frame that has the following columns:

 id: polygon ID long: centroid longitude lat: centroid latitude ID_1: region ID label: region name sort: a sortable (numeric) version of ID_1

Then add the following code to your ggplot ...

 ... MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=label.id), size=4) MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14), label=paste(labs$label.id,": ",labs$Label,sep=""), size=3, hjust=0)

... creates a map. There is no clean way I could find to do this with the official “ggplot legend”, so I had to use annotate(...) . Positioning annotations is a hack, but it seems to work.

Edit: In response to @ smailov83's comment, if you change the code to create ggplot for this ...

 MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=group)) MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2)) MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linestyle = 2) MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600") MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect") MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map() MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1, group=ID_1), size=4) MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14), label=paste(labs$ID_1,": ",labs$Label,sep=""), size=3, hjust=0)

... you get the following:

I believe the issue is resolved. The reason for the extra lines on your map was that ggplot should be grouped by the MoroccoReg.df$group column (so aes(..., group=group) not aes(...,group=id) ). However, when you do this, ggplot tries to group by "group" in all layers. In geom_text(...) , where we use a new, local dataset - labs data frame - there is no group column. To handle this, we must explicitly set group to something else in geom_text(...) . Bottom line: it seems to work.

How to add a legend for a regional map with a legend describing related tags using ggplot2?

More articles: