Is it good to have a Java Comparator where the order can change dynamically?

I have a set of timestamped values ​​that I would like to place in a sorted set.

public class TimedValue { public Date time; public double value; public TimedValue(Date time, double value) { this.time = time; this.value = value; } } 

The business logic for sorting this set indicates that the values ​​should be ordered in descending order of value if it does not exceed 7 days than the newest.

So, as a test, I came up with the following code ...

 DateFormat dateFormatter = new SimpleDateFormat("MM/dd/yyyy"); TreeSet<TimedValue> mySet = new TreeSet<TimedValue>(new DateAwareComparator()); mySet.add(new TimedValue(dateFormatter.parse("01/01/2009"), 4.0 )); // too old mySet.add(new TimedValue(dateFormatter.parse("01/03/2009"), 3.0)); // Most relevant mySet.add(new TimedValue(dateFormatter.parse("01/09/2009"), 2.0)); 

As you can see, initially the first value is more relevant than the second, but after the final value is added to the set, the first value has expired and should be the least relevant.

My initial tests say this should work ... that TreeSet will dynamically reorder the entire list as more values ​​are added.

But even if I see it, I'm not sure I believe in it.

Will a sorted collection reorder the entire set as each item is added? Are there any problems using a sorted collection this way (e.g. performance)? Would it be better to manually sort the list after all the values ​​have been added (I assume it will)?



Subsequent: Many people (and even me to some extent) believe that a sorted collection does not support this "dynamic reordering" method. I believe that my initial test was "working" completely by accident. When I added more elements to the set, the "order" broke pretty quickly. Thanks for all the great answers, I reorganized my code to use the approaches suggested by many of you.

+4
source share
8 answers

I don’t see how your comparator can even detect a change if he does not remember the newest value that he is currently seeing - and this is similar to an approach that inevitably ends in tears.

I suggest you do something in the following lines:

  • Collect your data in an unordered set (or list)
  • Find the latest value
  • Create a comparator based on this value, so that all comparisons using this comparator will be fixed (i.e. it will never return another result based on the same input values ​​as the comparator itself is unchanged, although it depends on the originally set value in constructor)
  • Create a sorted collection using this comparator (whichever way it is better, depending on what you want to do with it)
+10
source

I would recommend this for several reasons:

  • Since this is basically a red-black tree behind the scenes (which does not have to be rebuilt from scratch each time you insert), you can easily get values ​​in the wrong part of the tree (invalidating most of the TreeSet API).
  • The behavior is not defined in the specification and, therefore, may change later, even if it works now.
  • In the future, when something is strangely mistaken in anything remotely relating to this code, you will waste time suspecting that this is the reason.

I would recommend either to recreate / resort to the TreeSet before searching for it, or (my preferences), iterating through the set before searching and deleting any of the objects that are too old. You could even if you wanted to swap some memory for speed, save the second list, sorted by date and supported by the same objects, so that all you would need to do to filter your TreeSet is to remove the objects from the TreeSet to time-based sorted list.

+4
source

I do not believe that JDK libraries or even third-party libraries are written to handle a comparator whose results are incompatible. I will not depend on this work. I would be more worried if your comparator can return unequal for two values ​​when called once and can return equal for the same two values ​​if called later.

Read the Comparator.compare() contract carefully. Does your comparator comply with these restrictions?

To develop, if your comparator returns that two values ​​are not equal when you call it once, but then returns that these two values ​​are equal, because a later value was added to the set and changed the output of the comparator, the definition of "Set" ( without duplicates) is canceled.

John Skeet advice in his answer is great advice and will avoid having to worry about these issues. Truly, if your comparator does not return values ​​corresponding to equals() , then you can have big problems. Regardless of whether the sorted set will be re-sorted every time you add something, I will not depend, but the worst thing that can happen when changing the order is your set, which will not be sorted.

+3
source

I am 99% sure that this will not work. If the value in Set suddenly changes its comparison behavior, it is possible (quite likely, in fact) that it will no longer be found; that is, set.contains(value) will return false , because the search algorithm will at some point perform the comparison and continue the incorrect subtree, because now this comparison returns a different result than when inserting the value.

+2
source

No, that will not work.

If you use comparable keys in a collection, the results of the comparison between the two keys should remain unchanged over time.

When storing keys in a binary tree, each fork in the path is selected as a result of the comparison operation. If the subsequent comparison returns a different result, a different plug will be used and the previously saved key will not be found.

+2
source

I think the unchanging nature of the comparator should be based on sorting, so if you agree throughout the sorting operation, you're fine (as long as none of the items cross the average sort on day 7).

However, you may need to make it more obvious that you are asking a TreeSet question, which I believe reuses information from previous varieties to save time when you add a new item, so this is a slightly special case, Jaavadocs TreeSet specifically refers to Comparator semantics, so you are probably not officially supported, but you will need to read the code to get an idea of ​​whether you are safe.

I think it would be better for you to do the full sort when you need to sort the data using once as “now” so that you do not risk jumping from this border if your view takes a long time to make it probable.

+1
source

It is possible that the record will change from <7 days to> 7 days in the middle of the sort, so what you do violates the rules for the comparator. Of course, this does not mean that it will not work: many things that are documented as “unpredictable” really work if you know exactly what is going on inside.

I think the answer to the tutorial: This is unreliable with built-in sorts. You will need to write your own sort function.

At least I would say that you cannot rely on TreeSet or any sorted structure, magically resorting to when dates collapse along the border. At best, this might work if you re-sort right before the display and don't rely on anything that stays true between updates.

In the worst case, inconsistent comparisons can greatly disrupt sorting. You are not sure that this will not lead you into an endless cycle or some other deadly black hole.

So, I would say: read the source code from Sun for any classes or functions that you plan to use, and see if you can understand what will happen. Testing is good, but there are potentially complex cases that are difficult to verify. The most obvious: what if, during sorting, the record moves across the date boundary? That is, he can look at the record once and say this & lt; 7, but the next time he sees her> 7. It could be bad, bad news.

One obvious trick that comes up with me: convert the date to age when adding an entry to the structure, not dynamically. Thus, he cannot change like that. If the structure will live for more than a few minutes, recalculate the age at a specific time, and then re-sort. I doubt that someone will say that your program is incorrect because you said that the recording was less than 7 days, when in fact it is 7 days, 0 hours, 0 minutes and 2 seconds. Even if someone noticed how accurate their watch is?

+1
source

As already noted, the comparator cannot do this for you, because transitivity is broken. Basically, in order to be able to sort the elements, you should be able to compare each of them (independently of the others), which obviously you cannot do. Thus, your script, in principle, will either not work, or will lead to a consistent result.

Perhaps something simpler will be good enough for you:

  • apply a simple comparator that uses the value as needed
  • and just remove from the list / collection all the items that are 7 days older than the newest. Basically, when a new item is added, you check to see if it is the newest, and if so, delete those that are 7 days older than that.

This will not work if you also remove the items from the list, in which case you will need to save everything that you deleted in a separate list (which, by the way, was sorted by date), and add them back to the original list in case MAX (date) less after deletion.

+1
source

All Articles