One of the most useful cases is floating-point calculation: it usually takes much longer than "normal" instructions, so it helps if the CPU can run a floating-point operation off to the side for several cycles while other, ordinary program instructions execute in the main ALU.
It can also help keep all pipelines busy. Some processors have multiple pipelines (for example, one specialized for branches, a pair specialized for arithmetic operations, and a pair for floating-point and SIMD instructions). Reordering instructions lets the processor keep all of those pipelines fed, rather than leaving some of them empty for several instructions, which speeds up program execution.
Even with a single pipeline, reordering instructions can help keep the pipeline full by moving independent work in between instructions that depend on each other's results, see http://en.wikipedia.org/wiki/Instruction_pipeline
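To make the single-pipeline case concrete, here is a rough toy model in Python. It is only a sketch, not how any real CPU works: the instruction names, register names and latencies are made up for illustration, at most one instruction issues per cycle, and each register is assumed to be written once, before it is read (so there is no register renaming or hazard handling beyond read-after-write).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str       # label only, for readability
    dest: str       # register written by this instruction
    srcs: tuple     # registers read by this instruction
    latency: int    # cycles from issue until the result is available (made up)


def simulate(program, out_of_order):
    """Return the cycle at which the last result becomes available.

    Toy model: one instruction issues per cycle. In-order issue may only
    look at the oldest pending instruction; out-of-order issue picks the
    oldest pending instruction whose inputs are ready.
    """
    produced = {i.dest for i in program}   # registers written by the program
    ready_at = {}                          # register -> cycle its value is ready
    pending = list(program)
    cycle = 0
    finish = 0

    def is_ready(reg):
        # Registers nobody writes are treated as initial inputs, ready at once.
        if reg not in produced:
            return True
        return reg in ready_at and ready_at[reg] <= cycle

    while pending:
        window = pending if out_of_order else pending[:1]
        for instr in window:
            if all(is_ready(s) for s in instr.srcs):
                ready_at[instr.dest] = cycle + instr.latency
                finish = max(finish, cycle + instr.latency)
                pending.remove(instr)
                break  # at most one issue per cycle
        cycle += 1
    return finish


# A slow floating-point divide, one instruction that depends on it,
# and some unrelated integer work that follows in program order.
PROGRAM = [
    Instr("fdiv", "f0", ("f1", "f2"), 10),
    Instr("fadd", "f3", ("f0", "f1"), 3),
    Instr("add1", "r1", ("r2", "r3"), 1),
    Instr("add2", "r4", ("r5", "r6"), 1),
    Instr("add3", "r7", ("r1", "r4"), 1),
    Instr("add4", "r8", ("r9", "r10"), 1),
    Instr("add5", "r11", ("r7", "r8"), 1),
]

if __name__ == "__main__":
    print("in-order:    ", simulate(PROGRAM, out_of_order=False))
    print("out-of-order:", simulate(PROGRAM, out_of_order=True))
```

In this model the in-order run stalls behind the fadd until the divide finishes and only then gets to the integer adds, while the out-of-order run does the integer adds during the divide. With these made-up latencies it finishes in 13 cycles instead of 16.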