Std: sort vs inserting in std :: set

I am reading some line segments from cin. Each line segment is represented by a start and end point. 2D. X and Y.

The input is not sorted. This is random. (Update: but I need them to be sorted first by X and then by Y)

I can read in all segments, store them in a vector, and then call std :: sort. On the other hand, I can create an empty std :: set and insert each segment as it arrives. The set will automatically save the sorted order. Which of the two approaches is more effective?

Update: The total input size (number of segments) is known in advance.

+4
source share
4 answers

You should accurately evaluate the effectiveness of both approaches, but it is safe to assume that std::sort on std::vector faster than insertion into std::set due to locality effects and large constants hiding in the tree insertion algorithm. In addition, subsequent searches and iteration will be faster.

(However, std::set better suited to supporting a mixed series of inserts and deletions / searches / iterations. Vectoring is expensive because each insert will take an average time on average.)

+11
source

As a good rule of thumb, stricter guarantees are offered, the worse the performance you get.

The insert in std::set ensures that the sequence is sorted after each insert.

Insertion into std::vector and calling std::sort once after all insertions are completed ensures that the sequence is sorted as soon as all manipulations on vector are completed. This does not require the vector to be sorted during all intermediate inserts.

A std::vector also has better spatial locality and requires less memory allocations. Therefore, I would suggest that the vector approach will be faster, but if performance is important to you, then this is important enough for measurement.

If you do not want to measure what is faster in your case for your data sets with your code in then you don’t care what is faster.

+9
source

It really depends, but it is sure that std::set is for random insertions and deletions. In this case, you are just pasting. Go with std::vector . In addition, it is possible, more important, if you know in advance how many segments there are, you only need to select the vector only once, it will not redistribute the memory every time it doubles in size.

+4
source

Use a container that has the appropriate semantics for your needs. Efficiency usually follows automatically with this choice.

If you encounter performance bottlenecks, do some benchmarking.

+3
source

All Articles