Sort Multiple Vectors

Question

I have three vectors of the same size (~ 1 million elements):

    std::vector<std::wstring> name;
    std::vector<int> x;
    std::vector<int> y;

which can be considered as three "columns".

How can I sort the vector name from A to Z:

 std::sort(name.begin(), name.end())

while keeping the elements of x and y in the corresponding order?

Example:

    name    x    y           name    x    y
    BCD     7    9           ABC     4    3
    ZYX     1    4     =>    BCD     7    9
    ABC     4    3           ZYX     1    4

The good thing about using std::vector is that I can easily select or filter a few elements in a big vector just by keeping a list of indexes (for example: keep elements 12, 1872, 2834, 1831). I thought about using std::map, but I'm afraid it will not be as efficient for this: how would I store the list of elements to keep with a map?

c++ sorting vector stdvector

Basj Jul 12 '17 at 22:56

2 answers

It looks like you want a struct to keep the data together. For example:

    struct MyData {
        std::wstring name;
        int x;
        int y;
    };
    ...
    std::vector<MyData> data;

From there, you need a comparison function for the custom sort, so that you sort by the field you want:

    // Must be declared before the call to std::sort below.
    bool compareByName(const MyData& lhs, const MyData& rhs) {
        return lhs.name < rhs.name; // This can be whatever comparison you need
    }

    std::sort(data.begin(), data.end(), compareByName);

Josh Jul 12 '17 at 23:13

Justin · Accepted Answer · 2017-07-12 23:02

There are several possible ways to do this. The easiest way is to wrap name, x and y in a struct:

    struct Person {
        std::wstring name;
        int x;
        int y;
    };

Then you can have a std::vector<Person> people and sort it (assuming C++14):

    std::sort(people.begin(), people.end(), [](auto const& lhs, auto const& rhs) {
        return lhs.name < rhs.name;
    });
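To make the struct approach concrete, here is a minimal self-contained sketch. The wrapper name sortByName is hypothetical (it does not appear in the answer) and C++14 is assumed for the generic lambda.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// One record per "row": the three former columns now travel together.
struct Person {
    std::wstring name;
    int x;
    int y;
};

// Hypothetical helper (not from the original answer): sorts rows A -> Z by name.
// std::sort is not stable, so rows with equal names may end up in any relative order.
void sortByName(std::vector<Person>& people) {
    std::sort(people.begin(), people.end(),
              [](auto const& lhs, auto const& rhs) { return lhs.name < rhs.name; });
}
```

With the rows from the question (BCD 7 9, ZYX 1 4, ABC 4 3), calling sortByName(people) should leave them in the order ABC 4 3, BCD 7 9, ZYX 1 4, because each x and y moves together with its name.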

However, if you know that this will cause performance problems because fewer elements fit in the cache (that is, you frequently iterate over only x or only y, and you are in a very constrained environment such as high-performance gaming), I would suggest sorting only one vector. Unless you know what you are doing, you will need to benchmark both options.

Basically, keep a separate vector that tracks the ordering:

    std::vector<std::wstring> name;
    std::vector<int> x;
    std::vector<int> y;

    std::vector<std::size_t> ordering(name.size());
    std::iota(ordering.begin(), ordering.end(), 0);  // fills 0, 1, 2, ...; needs <numeric>
    std::sort(ordering.begin(), ordering.end(), [&](auto const& lhs, auto const& rhs) {
        return name[lhs] < name[rhs];
    });

Then you can simply iterate over ordering to traverse each parallel vector in the new order.
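As a rough sketch of that traversal, the following builds the index vector and then visits the three parallel vectors through it. The helper names orderingByName and forEachInNameOrder are hypothetical, introduced only for illustration, and the three vectors are assumed to have the same length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: returns the indices 0..n-1 arranged so that
// name[result[0]] <= name[result[1]] <= ... ; name itself is not modified.
std::vector<std::size_t> orderingByName(std::vector<std::wstring> const& name) {
    std::vector<std::size_t> ordering(name.size());
    std::iota(ordering.begin(), ordering.end(), std::size_t{0});
    std::sort(ordering.begin(), ordering.end(),
              [&](std::size_t lhs, std::size_t rhs) { return name[lhs] < name[rhs]; });
    return ordering;
}

// Hypothetical helper: calls visit(name[i], x[i], y[i]) for each row, in the
// order given by ordering. The parallel vectors stay untouched.
template <typename Visitor>
void forEachInNameOrder(std::vector<std::wstring> const& name,
                        std::vector<int> const& x,
                        std::vector<int> const& y,
                        std::vector<std::size_t> const& ordering,
                        Visitor visit) {
    assert(name.size() == x.size() && name.size() == y.size());
    for (std::size_t i : ordering) {
        visit(name[i], x[i], y[i]);
    }
}
```

Because only indices are moved, a saved list of row indexes (such as 12, 1872, 2834, 1831 from the question) still refers to the same rows after sorting.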

It is possible that the extra level of indirection will make this less efficient. For example, the CPU might think there is a data dependency where there is none. In addition, the extra data tracked in ordering could easily take up enough cache space to counteract the benefit of separating name, x and y. You need to know the specifications of your target architecture and profile.

If you want to keep iterating over the vectors in this new order, you should use the ordering vector to actually sort the other vectors, because access through ordering is effectively random. That random access would counteract the benefit of keeping the vectors separate (unless the vectors are small enough to fit in the cache anyway).

The easiest way to do this is to create a new vector:

    std::vector<std::wstring> newNames;
    newNames.reserve(name.size());

    for (auto i : ordering) {
        newNames.push_back(name[i]);
    }

Rebuilding the vectors like this is probably what you want to do if the sorting happens during initialization.
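The same loop works for every column, so it can be sketched once as a template. The name reordered is hypothetical, and the sketch assumes that ordering is a permutation of 0..n-1 with the same length as the vector being rebuilt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a copy of values rearranged so that
// result[k] == values[ordering[k]]. The input vector is left unchanged.
template <typename T>
std::vector<T> reordered(std::vector<T> const& values,
                         std::vector<std::size_t> const& ordering) {
    assert(values.size() == ordering.size());
    std::vector<T> result;
    result.reserve(values.size());
    for (std::size_t i : ordering) {
        result.push_back(values[i]);
    }
    return result;
}
```

Once ordering has been computed, each column can be rebuilt in turn, for example name = reordered(name, ordering); x = reordered(x, ordering); y = reordered(y, ordering);. This costs one temporary copy per column. An in-place permutation would avoid the copy but is more involved.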
