First, let me apologize for having returned so long - I missed your comment among all the others. Is that what you meant?

This was done with the following code. Before delving into the explanation, you should know that creating a legend is the least of your problems. Notice how different colors are on the two cards. The above code does not indicate CO2 changes to the correct regions. For example, according to MoroccoYields.csv , the largest change (improvement?) Was -0.205 in Region 4 , but on your map the largest (darkest red) is at the northeast tip of Morocco, which is actually l'Oriental (Region 6) . The explanation follows the code.
Explanation:
First, when combining yield data with map regions: you use
MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')
This does not work cbind(...) . cbind(...) just concatenates the arguments column by column. This is not a merge function. Thus, you had a MoroccoReg.df data frame with MoroccoReg.df lines (one line for each endpoint of the line on your map), and you combine it with MoroccoYield , which has 14 lines (1 for each region). So cbind(...) replicates these 14 lines 7,700 times to fill the cbind(...) lines that it needs. The expression by="ID_1" simply adds another column named by" with "ID_1" replicated head(MoroccoReg.df) times. Run the above expression and enter head(MoroccoReg.df) and find the last column.
So how to make a merge? There are a number of functions in R that should make this easy, but I couldn't get them to work. Here is what worked:
Each polygon in the shapefile has an identifier. In the data section of the shapefile there is also a field "ID_1", but they are different and are not related to each other. [BTW: the ID_1 field in the shapefile data section and the ID_1 field in your csv file were also different: the latter has a "TR" added to the region; so that would also have to be dealt with]. Reordering a shapefile using:
MoroccoReg <- MoroccoReg[order(MoroccoReg$ID_1), ]
will change the order of polygons, but will not change their identifiers. It turns out that the polygon identifier matches the name of the line in the data section of the shapefile, so I added (using cbind(...) !) To your MoroccoYeild data MoroccoYeild .
MoroccoYield <- cbind(id=rownames(MoroccoReg@data),MoroccoYield)
So now MoroccoYield has an id field that displays the polygon field and an ID_1 field that identifies the area. Now we can fortify(...) and merge(...) . merge(...) takes a by= argument.
MoroccoReg.df <- fortify(MoroccoReg) MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id")
This adds all of your MoroccoYield columns to the corresponding MoroccoReg.df rows.
Legend Creation:
The obvious question is how to position tags? Ideally, we would put the region number from MoroccoYield$ID_1 in the center of gravity of each region and then create a legend that identifies Regions based on the MoroccoYield$Label .
So where to find centroids? They are stored in an obscure place in the polygonal section of the shapefile. In short, I created a utility function get.centroid(...) that extracts a centroid from a polygon. Then I applied this function to all polygons to create a centroid table with the corresponding polygon identifier. Then I combined this with the tags in MoroccoYield . This created a labs data frame that has the following columns:
id: polygon ID long: centroid longitude lat: centroid latitude ID_1: region ID label: region name sort: a sortable (numeric) version of ID_1
Then add the following code to your ggplot ...
... MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=label.id), size=4) MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14), label=paste(labs$label.id,": ",labs$Label,sep=""), size=3, hjust=0)
... creates a map. There is no clean way I could find to do this with the official “ggplot legend”, so I had to use annotate(...) . Positioning annotations is a hack, but it seems to work.
Edit: In response to @ smailov83's comment, if you change the code to create ggplot for this ...
MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=group)) MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2)) MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linestyle = 2) MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600") MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect") MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map() MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1, group=ID_1), size=4) MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14), label=paste(labs$ID_1,": ",labs$Label,sep=""), size=3, hjust=0)
... you get the following:

I believe the issue is resolved. The reason for the extra lines on your map was that ggplot should be grouped by the MoroccoReg.df$group column (so aes(..., group=group) not aes(...,group=id) ). However, when you do this, ggplot tries to group by "group" in all layers. In geom_text(...) , where we use a new, local dataset - labs data frame - there is no group column. To handle this, we must explicitly set group to something else in geom_text(...) . Bottom line: it seems to work.