Small is beautiful, but is it just as fast?

I had an argument with a coworker about implementing a simple line parser. One version is "small": 10 lines of code, using C++ and stringstreams. The other is 70 lines of code, using switch statements and iterating over the string char by char. We ran both for more than 1 million iterations and measured the speed with the time command. It turns out the long and ugly approach is on average 1 second faster.

The problem. Input: the string

"v=spf1 mx include:_spf-a.microsoft.com include:_spf-b.microsoft.com include:_spf-c.microsoft.com include:_spf-ssg-a.microsoft.com ip4:131.107.115.212 ip4:131.107.115.215 ip4:131.107.115.214 ip4:205.248.106.64 ip4:205.248.106.30 ip4:205.248.106.32 ~all a:1.2.3.4"

Output: a map<string, list<string>> with all the values for each key, for example ip4 and include.

Example output of one iteration on the input line above:

key: a

1.2.3.4,

key: include

_spf-a.microsoft.com, _spf-b.microsoft.com, _spf-c.microsoft.com, _spf-ssg-a.microsoft.com,

key: ip4

131.107.115.212, 131.107.115.215, 131.107.115.214, 205.248.106.64, 205.248.106.30, 205.248.106.32,

The "small is beautiful" one:

  istringstream iss(input);
  map<string, list<string> > data;
  string item;
  string key;
  string value;
  size_t pos;
  while (iss.good()) {
      iss >> item;
      pos = item.find(":");
      key = item.substr(0, pos);
      // note: this also inserts tokens that contain no ':' at all,
      // the bug described in the update below
      data[key].push_back(item.substr(pos + 1));
  }

The second, faster approach:

  typedef enum { I, Include, IP, A, Other } State;
  State state = Other;
  string line = input;
  string value;
  map<string, list<string> > data;
  bool end = false;
  size_t pos = 0;
  while (pos < line.length()) {
      switch (state) {
      case Other:
          value.clear();
          switch (line[pos]) {
          case 'i': state = I; break;
          case 'a': state = A; break;
          default:
              while (line[pos] != ' ' && pos < line.length()) pos++;
          }
          pos++;
          break;
      case I:
          switch (line[pos]) {
          case 'p': state = IP; break;
          case 'n': state = Include; break;
          }
          pos++;
          break;
      case IP:
          pos += 2;   // skip "4:"
          for (; line[pos] != ' ' && pos < line.length(); pos++)
              value += line[pos];
          data["ip4"].push_back(value);
          state = Other;
          pos++;
          break;
      case Include:
          pos += 6;   // skip "clude:"
          for (; line[pos] != ' ' && pos < line.length(); pos++)
              value += line[pos];
          data["include"].push_back(value);
          state = Other;
          pos++;
          break;
      case A:
          if (line[pos] == ' ')
              data["a"].push_back("a");
          else {
              pos++;
              for (; line[pos] != ' ' && pos < line.length(); pos++)
                  value += line[pos];
          }
          data["a"].push_back(value);
          state = Other;
          pos++;
          break;
      }
  }

I really believe that "small is beautiful" is the way to go, and I don't like the longer code presented here, but it's hard for me to argue the point when that code runs faster.

Can you suggest ways to optimize or completely rewrite the small approach, so that it stays small and beautiful but also runs faster?

Update: added the definition and initialization of State. Context: the longer approach completes 1 million iterations on the same line in 15.2 seconds; the shorter code does it in 16.5 seconds on average.

Both versions were compiled with g++ -O3 (g++ 4.4) and run on an Intel(R) Core(TM)2 Duo CPU E8200 @ 2.66 GHz under Linux Mint 10.

The good side won this battle :) I found a small bug in the small program: it was also adding invalid values to the map, those tokens that had no colon ':' in them. After adding an if statement to check for the colon, the smaller code is faster, much faster. The new timings: "small and beautiful": 12.3 s, long and ugly: 15.2 s.
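For reference, a sketch of what the fixed "small" version could look like; this is a reconstruction based on the description above, not the author's exact diff:

  #include <sstream>
  #include <string>
  #include <map>
  #include <list>
  using namespace std;

  // Reconstructed fix (assumed shape): the added check skips tokens
  // without a colon, such as "mx" and "~all", instead of inserting them.
  map<string, list<string> > parse(const string& input) {
      istringstream iss(input);
      map<string, list<string> > data;
      string item;
      while (iss >> item) {            // also avoids the iss.good() pitfall
          size_t pos = item.find(':');
          if (pos != string::npos)
              data[item.substr(0, pos)].push_back(item.substr(pos + 1));
      }
      return data;
  }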

Small is beautiful :)

+6
c++ c coding-style parsing
11 answers

The fewer lines of code you have, the better. Don't add another 60 lines unless you really need them. If it's slow, profile. Then optimize. Don't optimize before you need to. If it works fine, leave it as it is. Adding more code will add more bugs. You don't want that. Keep it short. Really.

Read this wiki post.

"Premature optimization is the root of all evil" - Donald Knuth, a pretty smart guy.

You can write faster code by writing less of it, just written more sensibly. One way to help speed: do less.

To quote Raymond Chen:

"One of the questions I get is," My application starts up slowly. What are the super-secret tricks that you guys at Microsoft use to run your apps faster? “Answer:“ The super evil trick is to do fewer things. ”-“ Five things every Win32 programmer needs to know ”(September 16, 2005)

Also check out why GNU grep is fast.

+14

Less may not be faster. One example: bubble sort is very short, but it is O(n²). Quicksort and merge sort are longer and seem more complex, but they are O(n log n).

But, having said that... always make sure the code is readable, and if the logic is complicated, add good comments so that other people can follow it.

+16

The second version may be faster, but note that it is also tightly coupled to the content it reads; if that content changes, you have to modify that terrible mess of code. The first version is a little more forgiving of change and will probably need less maintenance when something does change. Both will choke if the data is garbage or malformed.

The real question to ask is: is an extra second important? Does it matter?

And yes, look for ways to optimize the small, readable version. You may lose a second, but you gain immediate clarity.

+4

If this is a single function and the performance gain brings a significant benefit, I would keep the ugly version. But, as mentioned above, document it well, and perhaps include the small and beautiful version in a comment so that people looking at the code understand the choice.

In this particular case the choice does not seem to affect the larger architecture; it is just the implementation of a "leaf" function. So if there is a good reason, I would not stay with the slow version just for the sake of code aesthetics.

I am glad that when I call the sort function of some library, it uses quicksort rather than a two-line, super-elegant but slow bubble sort :-)

+2

I really believe that "small is beautiful" is the way to go, and I don't like the longer code presented here, but it's hard for me to argue the point when that code runs faster.

No, it is not. Well, not unless when you say one is a second faster than the other you mean something like one takes 10 seconds and the other 11 seconds, as opposed to one taking 0.1 seconds and the other 1.1 seconds. And even then, if you only need to do the parse once when the program starts, it may not be worth worrying about.

Always prefer concise, understandable code over a longer, opaque but faster version, unless profiling shows a significant performance gain is really needed. Do not forget that programmer time costs money too: at 1 second saved per run, someone spending 10 minutes figuring out what the second piece of code does has already eaten the savings of 600 runs.

+2

One small optimization might be to avoid the map's operator[] and instead use find first to see whether the key is already in the map, using insert to add it when it is not (a sketch of this is below). The two substr calls could also be combined into one to avoid unnecessary copying.
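A minimal sketch of the find/insert idea, written as a hypothetical helper; whether it actually beats operator[] should be measured, since a miss now costs two lookups:

  #include <map>
  #include <list>
  #include <string>
  #include <utility>
  using namespace std;

  // Probe with find() and only insert() on a miss, instead of letting
  // operator[] default-construct an empty list for every new key.
  void add_value(map<string, list<string> >& data,
                 const string& key, const string& value) {
      map<string, list<string> >::iterator it = data.find(key);
      if (it == data.end())
          it = data.insert(make_pair(key, list<string>())).first;
      it->second.push_back(value);
  }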

Also, and I don't remember who said it first: it is easier to make correct code fast than to make fast code correct.

Also, Knuth's quote about premature optimization is often taken out of context; he also said:

"The traditional wisdom shared by many of today's software developers calls for ignoring efficiency in the small; but I think it's just an overreaction to the abuses they see practiced by petty-and-pound-stupid programmers who can't debug or maintain their own" optimized "programs"

+1

Consider the size of:

  • the language standards
  • the toolchains used
  • the knowledge needed to verify the code
  • the generated output
  • the compilation time

Note that these points apply to the language subsets used, if the C-style code is in fact compiled with the C++ compiler.

It is important to understand that the time needed to pick up this code and start developing on it, without the implicit magic of the various standard C++ library facilities used in the "small" version, may outweigh the time spent coding the "long" version.

Also, to comment on the speed difference: heavy use of libraries evens out differences in programming ability. To really squeeze out performance and "beauty", you need to master the lowest-level language applicable to the problem.

+1

In my experience, code written to check boundary conditions and validate input is very necessary; yes, it adds more lines to the code base, but it is essential for producing production-ready code.

+1

It's hard for me to understand what the second version does. For one thing, state did not seem to be initialized at the start.

It is also hard for me to figure out what the first one does when a token has no colon. Does it just add the token with no value?

The first one is generic; the second one knows which keys it is looking for.

If performance is everything and you are looking for specific keys, then you can optimize in various ways, for example by keeping a list of the names you are looking for and checking against it, as in the sketch below.
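A hypothetical sketch of that idea: a fixed whitelist of the key names of interest (the names here are assumptions taken from the sample input), checked with plain C string comparisons before any map work is done.

  #include <cstddef>
  #include <cstring>

  // Hypothetical whitelist of the keys we care about.
  static const char* const kWantedKeys[] = { "ip4", "include", "a" };
  static const std::size_t kNumWanted =
      sizeof(kWantedKeys) / sizeof(kWantedKeys[0]);

  // Returns true when key[0..len) exactly matches a whitelisted name.
  bool is_wanted_key(const char* key, std::size_t len) {
      for (std::size_t i = 0; i < kNumWanted; ++i)
          if (std::strlen(kWantedKeys[i]) == len &&
              std::memcmp(kWantedKeys[i], key, len) == 0)
              return true;
      return false;
  }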

However, before you go chasing performance: if this is something you read only once, as configuration, then 1 second slower over a million iterations is about a microsecond per run in reality, and you should not worry about it. The generality of the first version (not its line count) is what makes it better.

+1

I do not agree with the answers saying you should categorically choose the shorter, slower version; depending on the trade-off between the performance gain and improved user experience on one side, and readability and maintainability of the code on the other, the faster version may be preferable.

Trust your own opinion!

Regarding this particular case, I have the following notes:

  • The two versions do not actually do the same thing. The shorter one handles all possible key names, while the longer one supports only a few hard-coded keys.

  • I would guess (this should be profiled!) that most of the time in both algorithms is spent constructing and allocating the std::map and std::list nodes. Switching to another data structure that does not require an allocation per key and value is likely to speed up both implementations significantly; one possible direction is sketched after this list.
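A minimal sketch of that direction, assuming C++17 (std::string_view is far newer than the g++ 4.4 used in the question): values are views into the original line, so no string is allocated per token, and a flat vector replaces the node-based map and lists. Note the views are only valid while the input string stays alive.

  #include <cstddef>
  #include <string_view>
  #include <utility>
  #include <vector>

  typedef std::pair<std::string_view, std::vector<std::string_view> > Entry;

  std::vector<Entry> parse_flat(std::string_view line) {
      std::vector<Entry> data;
      std::size_t pos = 0;
      while (pos < line.size()) {
          // split on spaces without copying
          std::size_t end = line.find(' ', pos);
          if (end == std::string_view::npos) end = line.size();
          std::string_view item = line.substr(pos, end - pos);
          std::size_t colon = item.find(':');
          if (colon != std::string_view::npos) {
              std::string_view key = item.substr(0, colon);
              std::string_view value = item.substr(colon + 1);
              std::size_t i = 0;   // linear scan: only a handful of keys
              while (i < data.size() && data[i].first != key) ++i;
              if (i == data.size())
                  data.push_back(Entry(key, std::vector<std::string_view>()));
              data[i].second.push_back(value);
          }
          pos = end + 1;
      }
      return data;
  }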

+1

I would hack it together with a lexer generator. Looking at the input, I would guess that most of the difficulty is figuring out what the tokens mean. A simple machine-generated tokenizer plus a hand-written parser (I would guess it needs about 2 or 3 states) should come to about 20 lines of code:

 extern string payload;   // set by the lexer to the text of the last token

 Output Process() {
     Output output;
     string key = "";
     bool value = false;   // true when the next string token is a value
     while (!eof()) {
         switch (yylex()) {
         case tok_string:
             if (value) {
                 output[key].push_back(payload);
                 value = false;
                 key = "";
             } else {
                 key = payload;
             }
             break;
         case tok_colon:
             value = (key != "");
             break;
         }
     }
     return output;
 }
+1
