Do I have to stream several times or do all the calculations in one stream?

I have the following code:

    mostRecentMessageSentDate = messageInfoList
            .stream()
            .findFirst().orElse(new MessageInfo())
            .getSentDate();

    unprocessedMessagesCount = messageInfoList
            .stream()
            .filter(messageInfo -> messageInfo.getProcessedDate() == null)
            .count();

    hasAttachment = messageInfoList
            .stream()
            .anyMatch(messageInfo -> messageInfo.getAttachmentCount() > 0);

As you can see, I stream the same list 3 times because I want to find 3 different values. If I did this in a for-each loop, I would only have to loop once.

Is it better, performance-wise, to use a for loop so that I only loop once? I find streams more readable.

Edit: I did some tests:

    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;

    public class Main {

        public static void main(String[] args) {
            List<Integer> integerList = populateList();
            System.out.println("Stream time: " + timeStream(integerList));
            System.out.println("Loop time: " + timeLoop(integerList));
        }

        private static List<Integer> populateList() {
            return IntStream.range(0, 10000000)
                    .boxed()
                    .collect(Collectors.toList());
        }

        private static long timeStream(List<Integer> integerList) {
            long start = System.currentTimeMillis();

            Integer first = integerList
                    .stream()
                    .findFirst().orElse(0);

            long containsNumbersGreaterThan10000 = integerList
                    .stream()
                    .filter(i -> i > 10000)
                    .count();

            boolean has10000 = integerList
                    .stream()
                    .anyMatch(i -> i == 10000);

            long end = System.currentTimeMillis();

            System.out.println("first: " + first);
            System.out.println("containsNumbersGreaterThan10000: " + containsNumbersGreaterThan10000);
            System.out.println("has10000: " + has10000);
            return end - start;
        }

        private static long timeLoop(List<Integer> integerList) {
            long start = System.currentTimeMillis();

            Integer first = 0;
            boolean has10000 = false;
            int count = 0;
            long containsNumbersGreaterThan10000 = 0L;

            for (Integer i : integerList) {
                if (count == 0) {
                    first = i;
                }
                if (i > 10000) {
                    containsNumbersGreaterThan10000++;
                }
                if (!has10000 && i == 10000) {
                    has10000 = true;
                }
                count++;
            }

            long end = System.currentTimeMillis();

            System.out.println("first: " + first);
            System.out.println("containsNumbersGreaterThan10000: " + containsNumbersGreaterThan10000);
            System.out.println("has10000: " + has10000);
            return end - start;
        }
    }

And, as expected, the for loop is always faster than the streams:

    first: 0
    containsNumbersGreaterThan10000: 9989999
    has10000: true
    Stream time: 57
    first: 0
    containsNumbersGreaterThan10000: 9989999
    has10000: true
    Loop time: 38

But never significantly.

findFirst was probably a bad example because it returns right away without traversing the list, but I wanted to know whether it made a difference.

I was hoping to get a solution that allows multiple calculations from a single stream. IntSummaryStatistics does not do exactly what I want. I think I will follow @florian-schaetz's advice and favor readability over a marginal performance gain.

java java-stream
1 answer

You are not iterating over the whole collection three times.

    mostRecentMessageSentDate = messageInfoList
            .stream()
            .findFirst().orElse(new MessageInfo())
            .getSentDate();

The above checks whether there are any elements in the collection and returns a value depending on that. It does not need to go through the entire collection.

    unprocessedMessagesCount = messageInfoList
            .stream()
            .filter(messageInfo -> messageInfo.getProcessedDate() == null)
            .count();

The above has to filter for all elements without a processed date and count them, so it goes through the entire collection.

    hasAttachment = messageInfoList
            .stream()
            .anyMatch(messageInfo -> messageInfo.getAttachmentCount() > 0);

The above must go through the elements until it finds a message with an attachment.

So, of the three streams, only one has to go through the entire collection. In the worst case you iterate twice (the second stream and possibly the third).
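If you want to see this for yourself, here is a minimal sketch that counts how many elements each of the three kinds of pipeline actually pulls from the list. The class name `StreamVisitDemo`, the method `countVisits` and the integer predicates are made up for illustration; they stand in for the `MessageInfo` checks from the question. It assumes sequential streams.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the original question.
class StreamVisitDemo {

    // Returns the number of elements visited by findFirst, filter+count and anyMatch, in that order.
    static int[] countVisits(List<Integer> list) {
        AtomicInteger findFirstVisits = new AtomicInteger();
        AtomicInteger countVisits = new AtomicInteger();
        AtomicInteger anyMatchVisits = new AtomicInteger();

        // Short-circuits after the first element.
        list.stream()
                .peek(i -> findFirstVisits.incrementAndGet())
                .findFirst();

        // Has to look at every element to count the matches.
        list.stream()
                .peek(i -> countVisits.incrementAndGet())
                .filter(i -> i > 10)
                .count();

        // Stops as soon as one element matches.
        list.stream()
                .peek(i -> anyMatchVisits.incrementAndGet())
                .anyMatch(i -> i == 10);

        return new int[] {findFirstVisits.get(), countVisits.get(), anyMatchVisits.get()};
    }

    public static void main(String[] args) {
        List<Integer> list = IntStream.range(0, 100).boxed().collect(Collectors.toList());
        int[] visits = countVisits(list);
        System.out.println("findFirst visited: " + visits[0]);
        System.out.println("filter/count visited: " + visits[1]);
        System.out.println("anyMatch visited: " + visits[2]);
    }
}
```

For the integers 0 to 99 this should report 1, 100 and 11 visited elements, which matches the reasoning above: only the counting pipeline walks the whole list.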

It might be more efficient with a regular For-Each loop, but do you really need this? If your collection contains only a few objects, I would not optimize it.

However, using a traditional for-each loop, you can combine the last two streams:

    int unprocessedMessagesCount = 0;
    boolean hasAttachment = false;

    for (MessageInfo messageInfo : messageInfoList) {
        if (messageInfo.getProcessedDate() == null) {
            unprocessedMessagesCount++;
        }
        if (!hasAttachment && messageInfo.getAttachmentCount() > 0) {
            hasAttachment = true;
        }
    }

It really is up to you whether you think this is a better solution (I also find streams more readable). I see no way to combine the three streams into one, at least not in a more readable way.
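For completeness, if you are on Java 12 or newer, `Collectors.teeing` can feed one stream into two collectors, so the count and the attachment check can share a single pass. The sketch below is only an illustration under assumptions: `SinglePassDemo`, `Summary` and the stripped-down `MessageInfo` are hypothetical stand-ins, since the real `MessageInfo` class is not shown in the question. Note that this version always visits every element, whereas `anyMatch` can stop early, and I would not call it more readable than the separate streams.

```java
import java.util.Date;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical demo class; requires Java 12+ for Collectors.teeing.
class SinglePassDemo {

    // Minimal stand-in for the MessageInfo class from the question (an assumption).
    static class MessageInfo {
        private final Date processedDate;
        private final int attachmentCount;

        MessageInfo(Date processedDate, int attachmentCount) {
            this.processedDate = processedDate;
            this.attachmentCount = attachmentCount;
        }

        Date getProcessedDate() { return processedDate; }
        int getAttachmentCount() { return attachmentCount; }
    }

    // Simple holder for the two results computed in one pass.
    static class Summary {
        final long unprocessedMessagesCount;
        final boolean hasAttachment;

        Summary(long unprocessedMessagesCount, boolean hasAttachment) {
            this.unprocessedMessagesCount = unprocessedMessagesCount;
            this.hasAttachment = hasAttachment;
        }
    }

    static Summary summarize(List<MessageInfo> messageInfoList) {
        // Counts the messages that have no processed date.
        Collector<MessageInfo, ?, Long> unprocessedCounter =
                Collectors.filtering(m -> m.getProcessedDate() == null, Collectors.counting());
        // Counts the messages that have at least one attachment.
        Collector<MessageInfo, ?, Long> attachmentCounter =
                Collectors.filtering(m -> m.getAttachmentCount() > 0, Collectors.counting());

        // teeing passes every element to both collectors and merges the two results.
        return messageInfoList.stream().collect(Collectors.teeing(
                unprocessedCounter,
                attachmentCounter,
                (unprocessed, withAttachment) -> new Summary(unprocessed, withAttachment > 0)));
    }

    public static void main(String[] args) {
        List<MessageInfo> messages = List.of(
                new MessageInfo(null, 0),
                new MessageInfo(new Date(), 2),
                new MessageInfo(null, 0));
        Summary summary = summarize(messages);
        System.out.println("unprocessedMessagesCount: " + summary.unprocessedMessagesCount);
        System.out.println("hasAttachment: " + summary.hasAttachment);
    }
}
```

The extra holder class and the two named collectors are the price for the single pass, which is why the plain loop or the separate streams are usually the easier choice for small lists.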
