I implemented a differential evolution algorithm for the side project I was doing. Since the crossover step seemed to include many options for the parameters (for example, the probability of crosstalk), I decided to skip it and just use the mutation. The method seemed to work fine, but I'm not sure that I would get better performance if I introduced a crossover.
The main question: What is the motivation for introducing crossover into differential evolution? Can you provide an example of toys in which the crossover is introduced with a pure mutation?
My intuition is that the crossover will produce something like the following in two dimensions. Say we have two parent vectors (red). A uniform crossover can create a new test vector at one of the blue dots.

I am not sure why such a study would be useful. In fact, it looks like this could degrade performance if highly suitability solutions follow some linear trends. In the figure below, let's say that the red dots are the current population, and the optimal solution is in the lower right corner. The population moves down the valley, so the upper right and lower left corners create poor decisions. The upper left corner gives a โgoodโ but suboptimal solution. Pay attention to how a homogeneous crossover performs tests (in blue) that are orthogonal to the direction of improvement. I used the crossover probability 1 and neglect the mutation to illustrate my point (see Code). I assume that this situation may arise quite often in optimization problems, but something may be incomprehensible.

Note: In the above example, I implicitly assume that the population was accidentally initialized (evenly) through this space and began to converge to the correct solution along the central valley (upper left, lower right).
This toy example is convex, and therefore differential evolution will not even be a suitable technique. However, if this motive was built into the multimodal fitness landscape, it seems that the crossover can be harmful. While the crossover supports reconnaissance, which may be useful, I'm not sure why you decided to explore in this particular direction.
R code for the above example:
N = 50 x1 <- rnorm(N,mean=2,sd=0.5) x2 <- -x1+4+rnorm(N,mean=0,sd=0.1) plot(x1,x2,pch=21,col='red',bg='red',ylim=c(0,4),xlim=c(0,4)) x1_cx = list(rep(0, 50)) x2_cx = list(rep(0, 50)) for (i in 0:N) { x1_cx[i] <- x1[i] x2_cx[i] <- x2[sample(1:N,1)] } points(x1_cx,x2_cx,pch=4,col='blue',lwd=4)
Subsequent question: If a crossover is useful in certain situations, is there a reasonable approach to: a) determining whether your particular problem will benefit from a crossover, and b) how to configure crossover parameters to optimize the algorithm?
A related stack question (I'm looking for something more specific, like a toy example): What is the importance of the transition in the differential evolution algorithm?
A similar question, but not specific to differential evolution: Crossover efficiency in genetic algorithms