How to get median value from a sorted map

Question

I am using a std::map. Occasionally I need to perform an operation such as finding the median of all elements. For example, if I add

    1 "s"
    2 "sdf"
    3 "sdfb"
    4 "njw"
    5 "loo"

then the median is 3.

Is there any solution that does not iterate over more than half of the elements in the map?

+6

c++ stl

Raymond Aug 10 '10 at 6:04

10 answers

I think the answer is no. You can't just jump to the N/2-th element past the beginning, because std::map uses bidirectional iterators. You have to iterate through half of the elements in the map. If you had access to the underlying red-black tree implementation that is typically used for std::map, you might be able to get close, as in Dani's answer. However, you don't have access to that, since it is encapsulated as an implementation detail.

+8

A. Levy Aug 10 '10 at 6:19

Try:

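    typedef std::map<int,std::string> Data;
    Data data;
    // std::advance returns void and modifies its argument in place,
    // so the iterator has to be a named variable.
    Data::iterator median = data.begin();
    std::advance(median, data.size() / 2);

This works if size() is odd. I will let you work out how to do it when size() is even.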
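One possible way to handle the even case is sketched below. It assumes that for an even number of elements both middle keys are wanted; the function name median_keys is made up for this illustration. The walk is still linear, since std::advance on a map iterator takes size()/2 steps.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <utility>

typedef std::map<int, std::string> Data;

// Returns the middle key(s) of a non-empty map (illustrative helper).
// For an odd size both members of the pair hold the same key;
// for an even size they hold the lower and upper middle keys.
std::pair<int, int> median_keys(const Data& data) {
    assert(!data.empty());
    Data::const_iterator upper = data.begin();
    std::advance(upper, data.size() / 2);  // size()/2 increments
    if (data.size() % 2 == 1) {
        return std::make_pair(upper->first, upper->first);
    }
    Data::const_iterator lower = upper;
    --lower;  // bidirectional iterators can step back one element
    return std::make_pair(lower->first, upper->first);
}
```

With the five elements from the question this gives the key 3 twice; what to do with two different middle keys (pick one, average them) depends on the application.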

+4

Martin York Aug 10 '10 at 6:17

In a self-balancing binary tree (which I think std::map is), the root is a good approximation.
For an exact value, cache an iterator to the median together with a balance indicator: each time an element is added below the median, decrement the indicator, and each time one is added above, increment it. When the indicator reaches 2 or -2, move the median up or down one step and reset the indicator.
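A rough sketch of the cached-median idea, for insertion only. The wrapper class MedianMap and its members are hypothetical names for this illustration, not an existing library type. It relies on the fact that inserting into a std::map does not invalidate existing iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical insert-only wrapper that keeps an iterator to the median.
class MedianMap {
public:
    typedef std::map<int, std::string> Map;

    MedianMap() : balance_(0) {}

    void insert(int key, const std::string& value) {
        std::pair<Map::iterator, bool> res =
            map_.insert(std::make_pair(key, value));
        if (!res.second) return;      // key already present, nothing changed
        if (map_.size() == 1) {       // the first element is the median
            median_ = res.first;
            return;
        }
        // balance_ == (elements above median_) - (elements below median_)
        if (key > median_->first) ++balance_; else --balance_;
        if (balance_ == 2)  { ++median_; balance_ = 0; }
        if (balance_ == -2) { --median_; balance_ = 0; }
    }

    // Precondition: at least one element has been inserted.
    const Map::value_type& median() const {
        assert(!map_.empty());
        return *median_;
    }

    std::size_t size() const { return map_.size(); }

private:
    // Copying is disabled: a copied iterator would point into the source map.
    MedianMap(const MedianMap&);
    MedianMap& operator=(const MedianMap&);

    Map map_;
    Map::iterator median_;
    int balance_;
};
```

Each insert costs one map insertion plus at most one iterator step. With an even number of elements the cached iterator points at one of the two middle elements, and which one depends on the insertion order. Supporting erase would need the mirror-image bookkeeping plus care when the median element itself is removed.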

+2

Dani Aug 10 '10 at 6:13

If you can switch data structures, store the elements in a std::vector and sort it. That gives positional access to the middle element without iterating. (It may be surprising, but a sorted vector often outperforms a map because of locality. For lookups by sort key you can use binary search, and it will have the same performance as map anyway. See Scott Meyers's Effective STL.)
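A minimal sketch of the sorted-vector approach, assuming int keys and string values as in the question. The names Entry, key_less, median_entry and find_entry are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<int, std::string> Entry;

// Orders entries by key only.
bool key_less(const Entry& a, const Entry& b) { return a.first < b.first; }

// Median by position: constant time once the vector is sorted by key.
const Entry& median_entry(const std::vector<Entry>& sorted) {
    assert(!sorted.empty());
    return sorted[sorted.size() / 2];
}

// Lookup by key with binary search; returns a null pointer if the key is absent.
const Entry* find_entry(const std::vector<Entry>& sorted, int key) {
    std::vector<Entry>::const_iterator it = std::lower_bound(
        sorted.begin(), sorted.end(), Entry(key, std::string()), key_less);
    return (it != sorted.end() && it->first == key) ? &*it : 0;
}
```

The vector has to be sorted with the same comparison first, for example std::sort(v.begin(), v.end(), key_less). The trade-off is that inserting into the middle of a sorted vector is linear, so this fits best when the data is filled once and then queried.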

+2

Daniel Earwicker Aug 10 '10 at 6:17

If you know the map will be sorted, then get the element at floor(length / 2). If you're in a bit-twiddling mood, try (length >> 1).

+1

Zach Rattner Aug 10 '10 at 6:07

Since it sounds like insert and find are your two common operations, while median is rare, the simplest approach is to use the map and advance an iterator from m.begin() by m.size()/2 with std::advance, as originally proposed by David Rodriguez. This is linear time, but it is easy to understand, so I would only consider another approach if profiling shows that the median calls are too expensive relative to the rest of the work your application does.

+1

Mark B Aug 16 '10 at 14:38

For a sorted list, here it is in Java code, but I assume it is very easy to port to C++:

    if (input.length % 2 != 0) {
        // odd length: the single middle element
        return input[((input.length + 1) / 2 - 1)];
    } else {
        // even length: the mean of the two middle elements
        return 0.5d * (input[(input.length / 2 - 1)] + input[(input.length / 2 + 1) - 1]);
    }

0

InsertNickHere Aug 10 '10 at 6:19

There is the nth_element() algorithm for this :) It implements the partition step of quicksort, and you do not need your vector (or array) to be sorted. Its time complexity is O(n) on average (whereas sorting costs O(n log n)).
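A short sketch of how std::nth_element could be used for this, assuming the values live in a vector of int. The function name median_unsorted is illustrative. Note that std::nth_element needs random-access iterators, so it cannot be applied to a std::map directly; the keys would have to be copied into a vector first.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Takes the vector by value because std::nth_element reorders its input.
int median_unsorted(std::vector<int> values) {
    assert(!values.empty());
    std::vector<int>::iterator mid = values.begin() + values.size() / 2;
    // Afterwards *mid is the element that would be at this position if the
    // whole range were sorted; the rest is only partitioned around it.
    std::nth_element(values.begin(), mid, values.end());
    return *mid;
}
```

For an even number of elements this returns the upper of the two middle values.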

0

Master Yoda Aug 10 '10 at 6:38

I don't know of any way to get the median quickly from a plain STL map when the map is large. If your map is small, or you need the median only rarely, I think you should use the linear advance to n/2 anyway, for the sake of simplicity and standard conformance.

You can use the map to build a new container that offers the median: jethro suggested using two maps; building on that, a single map with a continuously updated median iterator might be better. These approaches have the drawback that every modifying operation carries extra work, and in jethro's case even read operations do.

A custom-written container would also do what you want, and possibly most efficiently, but at the price of custom code. You could try, as was suggested, to modify an existing STL map implementation. You could also look for existing implementations.

There is a super-efficient C implementation called Judy Arrays that offers most map functionality as well as random access. It works for integer, string, and byte-array keys.

0

Peter G. Aug 10 '10 at 9:12

jethro · Accepted Answer · 2010-08-10T07:15:22+0000

I think you can solve the problem by using two std::map objects: one for the smaller half of the items (mapL) and a second for the other half (mapU). When you have an insert operation, it will be one of these cases:

add an element to mapU and move the smallest element to mapL
add an element to mapL and move the largest element to mapU

If the maps have different sizes and you insert an element into the one with fewer elements, you skip the move step. The basic idea is that you keep the maps balanced, so the maximum size difference is 1 element. As far as I know, all the STL operations involved should run in O(log n) time. The smallest and largest element of a map can be accessed through an iterator. When you get a query for the middle position, just check the sizes of the maps and return the largest element of mapL or the smallest element of mapU.

The usage above covers insertion only. You can extend it to deleting items as well, but then you have to keep track of which map holds the item, or try to erase from both.

Here is my code using an example:

    #include <iostream>
    #include <map>
    #include <string>
    #include <utility>
    using namespace std;

    typedef pair<int,string> pis;
    typedef map<int,string>::iterator itis;

    map<int,string> Left;   // lower half of the keys
    map<int,string> Right;  // upper half of the keys

    // Iterator to the largest element; m must not be empty.
    itis get_last(map<int,string> &m){
        return (--m.end());
    }

    // Inserts into the matching half and rebalances.
    // Returns 1 if the element went into Left first, 2 if into Right.
    int add_element(int key, string val){
        if (Left.empty()){
            Left.insert(make_pair(key,val));
            return 1;
        }
        pis maxl = *get_last(Left);
        if (key <= maxl.first){
            Left.insert(make_pair(key,val));
            if (Left.size() > Right.size() + 1){
                itis to_rem = get_last(Left);
                pis cpy = *to_rem;
                Left.erase(to_rem);
                Right.insert(cpy);
            }
            return 1;
        } else {
            Right.insert(make_pair(key,val));
            if (Right.size() > Left.size()){
                itis to_rem = Right.begin();
                pis cpy = *to_rem;
                Right.erase(to_rem);
                Left.insert(cpy);  // use the copy: to_rem is invalid after erase
            }
            return 2;
        }
    }

    // Must only be called after at least one element has been added.
    pis get_mid(){
        size_t size = Left.size() + Right.size();
        if (Left.size() >= size / 2){
            return *(get_last(Left));
        }
        return *(Right.begin());
    }

    int main(){
        Left.clear();
        Right.clear();
        int key;
        string val;
        // Testing the extraction itself avoids processing a failed read at end of input.
        while (cin >> key >> val){
            add_element(key,val);
            pis mid = get_mid();
            cout << "mid " << mid.first << " " << mid.second << endl;
        }
    }
