Clearly, big-O doesn't really say anything about this issue. Assuming the algorithm you are using is a quick sort. It has an average run time:

So, if sorting, then merging, we get:
f1 = 1.39n * log (n) * 2 + 2n
merge and then sort:
f2 = n + 1.39 * 2n * log (2n)
The difference is
f2 - f1 = -n + 2.78n> 0
In general, if the sorting algorithm has complexity
C = k * nlog (n)
since k should usually be greater than 1 and it is unlikely to be somewhere around 0.5, sorting, then the merger will be faster if you assume that the cost of the merger does not exceed 2n.
source share