Should we hire someone who writes C in Perl?

Question

One of my colleagues recently interviewed some job candidates, and one of them said they had very good Perl experience.

Since my colleague did not know Perl, he asked me to critique some code written (off-site) by this potential hire, so I had a look and told him my concerns (the main one was that it originally had no comments, and it is not as though we gave them enough time).

However, the code works, so I am reluctant to say no without some more input. Another concern is that this code looks exactly how I would write it in C. It has been a while since I did any Perl (and I did not do much; I am more of a Python bod for quick scripts), but I seem to recall that it was a much more expressive language than what this guy used.

I am looking for input from real Perl coders, and suggestions for how the code could be improved (and why a Perl coder should know that method of improvement).

You can also wax lyrical about whether people who write one language in the style of a completely different language should (or should not) be hired. I am interested in your arguments, but this question is primarily a request for a critique of the code.

The spec was to successfully process a CSV file like the following and output the individual fields:

    User ID,Name , Level,Numeric ID
    pax, Pax Morgan ,admin,0
    gt," Turner, George " rubbish,user,1
    ms,"Mark \"X-Men\" Spencer","guest user",2
    ab,, "user","3"

The output was to be something like this (the potential hire's code actually outputs this):

    User ID,Name , Level,Numeric ID:
     [User ID]
     [Name]
     [Level]
     [Numeric ID]
    pax, Pax Morgan ,admin,0:
     [pax]
     [Pax Morgan]
     [admin]
     [0]
    gt," Turner, George " rubbish,user,1:
     [gt]
     [ Turner, George ]
     [user]
     [1]
    ms,"Mark \"X-Men\" Spencer","guest user",2:
     [ms]
     [Mark "X-Men" Spencer]
     [guest user]
     [2]
    ab,, "user","3":
     [ab]
     []
     [user]
     [3]

Here is the code they sent:

    #!/usr/bin/perl

    # Open file.

    open (IN, "qq.in") || die "Cannot open qq.in";

    # Process every line.

    while (<IN>) {
        chomp;
        $line = $_;
        print "$line:\n";

        # Process every field in line.

        while ($line ne "") {
            # Skip spaces and start with empty field.

            if (substr ($line,0,1) eq " ") {
                $line = substr ($line,1);
                next;
            }

            $field = "";
            $minlen = 0;

            # Detect quoted field or otherwise.

            if (substr ($line,0,1) eq "\"") {
                $line = substr ($line,1);
                $pastquote = 0;
                while ($line ne "") {
                    # Special handling for quotes (\\ and \").

                    if (length ($line) >= 2) {
                        if (substr ($line,0,2) eq "\\\"") {
                            $field = $field . "\"";
                            $line = substr ($line,2);
                            next;
                        }
                        if (substr ($line,0,2) eq "\\\\") {
                            $field = $field . "\\";
                            $line = substr ($line,2);
                            next;
                        }
                    }

                    # Detect closing quote.

                    if (($pastquote == 0) && (substr ($line,0,1) eq "\"")) {
                        $pastquote = 1;
                        $line = substr ($line,1);
                        $minlen = length ($field);
                        next;
                    }

                    # Only worry about comma if past closing quote.

                    if (($pastquote == 1) && (substr ($line,0,1) eq ",")) {
                        $line = substr ($line,1);
                        last;
                    }
                    $field = $field . substr ($line,0,1);
                    $line = substr ($line,1);
                }
            } else {
                while ($line ne "") {
                    if (substr ($line,0,1) eq ",") {
                        $line = substr ($line,1);
                        last;
                    }
                    if ($pastquote == 0) {
                        $field = $field . substr ($line,0,1);
                    }
                    $line = substr ($line,1);
                }
            }

            # Strip trailing space.

            while ($field ne "") {
                if (length ($field) == $minlen) {
                    last;
                }
                if (substr ($field,length ($field)-1,1) eq " ") {
                    $field = substr ($field,0, length ($field)-1);
                    next;
                }
                last;
            }

            print " [$field]\n";
        }
    }
    close (IN);

+55

perl

paxdiablo Jun 09 '09 at 6:20

23 answers

His code is a little verbose. Perl is all about modules, and avoiding them makes your life hard. Here is an equivalent of what you posted that I wrote in about two minutes:

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use Text::CSV;

    my $parser = Text::CSV->new({
        allow_whitespace   => 1,
        escape_char        => '\\',
        allow_loose_quotes => 1,
    });

    while(my $line = <>){
        $parser->parse($line) or die "Parse error: ". $parser->error_diag;
        my @row = $parser->fields;
        print $line;
        print "\t[$_]\n" for @row;
    }

+83

jrockway Jun 09 '09 at 12:25

I would argue that writing C in Perl is a much better situation than writing Perl in C. As is often brought up on the SO podcast, understanding C is a virtue that not all developers (even some good ones) have nowadays. Hire them, buy them a copy of Perl Best Practices, and you will be set. After Best Practices, give them a copy of Intermediate Perl and they could work out.

+43

Copas Jun 09 '09 at 6:51

It is not terribly idiomatic Perl, but it is not terrible Perl either (although it could be much more compact).

Two warning bells: the shebang line does not include '-w', and there is neither 'use strict;' nor 'use warnings;'. This is very old-style Perl; good Perl code uses both warnings and strict.

The use of old-style file handles is no longer recommended, but it is not automatically bad (it could be code written more than 10 years ago, perhaps).

The non-use of regular expressions is a bit more surprising. For example:

    # Process every field in line.

    while ($line ne "") {
        # Skip spaces and start with empty field.

        if (substr ($line,0,1) eq " ") {
            $line = substr ($line,1);
            next;
        }

This could be written:

    while ($line ne "") {
        $line =~ s/^\s+//;

This strips away all leading whitespace with a single regular expression, without making the code iterate around the loop. A good deal of the rest of the code would benefit from carefully written regular expressions too. They are a characteristically Perl idiom; it is surprising to see that they are not being used.

If efficiency was the stated concern (the reason for not using regular expressions), then the questions should be "did you measure it?" and "what efficiency are you discussing: the machine's or the programmer's?"

Working code counts. More or less idiomatic code is better.

Also, of course, there are the Text::CSV and Text::CSV_XS modules, which could be used to handle the CSV parsing. It would be interesting to ask whether they are aware of Perl modules.

There are also several notations for handling quotes within quoted fields. The code appears to assume that backslash-quote is appropriate; I believe Excel uses doubled-up quotes:

 "He said, ""Don't do it"", but they didn't listen"

This can be matched:

 $line =~ /^"([^"]|"")*"/;

With a little care, you could capture just the text between the enclosing quotes. You would still have to post-process the captured text to remove the embedded doubled-up quotes.

A non-quoted field would be matched by:

 $line =~ /^([^,]*)(?:,|$)/;

This is significantly shorter than the looping and substring code shown in the question.
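As a minimal sketch of how the two patterns could be combined, the following captures the text between the enclosing quotes and then post-processes the doubled-up quotes. The helper name `next_field` and the sample line are hypothetical, and the sketch assumes the Excel-style doubled-quote convention rather than the backslash escapes used in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: removes one field from the front of $$line_ref and
# returns its value. Assumes Excel-style "" escapes inside quoted fields.
sub next_field {
    my ($line_ref) = @_;

    # Quoted field: capture only the text between the enclosing quotes.
    if ($$line_ref =~ s/^\s*"((?:[^"]|"")*)"\s*(?:,|$)//) {
        my $field = $1;
        $field =~ s/""/"/g;    # collapse the doubled-up quotes
        return $field;
    }

    # Unquoted field: everything up to the next comma or the end of the line.
    $$line_ref =~ s/^([^,]*)(?:,|$)//;
    return $1;
}

my $line = q{"He said, ""Don't do it""",plain,last};
my @fields;
push @fields, next_field(\$line) while length $line;
print "[$_]\n" for @fields;
```

Like the code in the question, this loop stops as soon as the line is empty, so a trailing empty field after a final comma is not reported.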

Here is a version of the code, using the backslash-quote escape mechanism from the code in the question, that does the same job.

    #!/usr/bin/perl -w

    use strict;

    open (IN, "qq.in") || die "Cannot open qq.in";

    while (my $line = <IN>) {
        chomp $line;
        print "$line\n";

        while ($line ne "") {
            $line =~ s/^\s+//;
            my $field = "";
            if ($line =~ m/^"((?:[^"]|\\.)*)"([^,]*)(?:,|$)/) {
                # Quoted field
                $field = "$1$2";
                $line = substr($line, length($field)+2);
                $field =~ s/""/"/g;
            }
            elsif ($line =~ m/^([^,]*)(?:,|$)/) {
                # Unquoted field
                $field = "$1";
                $line = substr($line, length($field));
            }
            else {
                print "WTF?? ($line)\n";
            }
            $line =~ s/^,//;
            print " [$field]\n";
        }
    }
    close (IN);

This is under 30 non-blank, non-comment lines, compared with about 70 in the original. The original version is bigger than it needs to be by some margin. And I have not gone out of my way to reduce this code to the minimum possible.

+42

Jonathan Leffler Jun 09 '09 at 6:45

No use of strict or warnings, systematic use of substr instead of regexps, no use of modules. This is definitely not someone who has "very good Perl experience", at least not on real Perl projects. Like you, I suspect this is probably a C programmer with basic Perl knowledge.

That does not mean they cannot learn, especially if there are Perl people around. It does seem to mean that they overstated their qualifications for the job. A few more questions about exactly how they gained that very good Perl experience would be in order.

+30

mirod Jun 09 '09 at 9:16

I don't care whether he used regular expressions or not. I also don't care whether his Perl looks like C or not. The question that really matters is: is it good Perl? And I would say that it is not:

He did not use strict.
He did not enable warnings.
He uses the old two-argument form of open.
The "Open file" comment hurts and gives the impression that the code he usually writes contains no comments at all.
The code is hard to maintain.
Was he allowed to use CPAN modules? A good Perl programmer would look at that option first.
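For comparison, here is a minimal sketch of the first three points in that list: strict and warnings enabled, and the three-argument form of open with a lexical file handle. The sample data and the use of File::Temp are assumptions made only so that the sketch is self-contained; the real task would read qq.in.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small sample file so the sketch can run anywhere
# (an assumption for illustration; the real input would be qq.in).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "pax, Pax Morgan ,admin,0\n";
close $tmp_fh or die "Cannot close $path: $!";

# Three-argument open: the mode is kept separate from the file name,
# and $! reports why the open failed.
open my $in, '<', $path or die "Cannot open $path: $!";

my @lines;
while (my $line = <$in>) {
    chomp $line;
    push @lines, $line;
}
close $in;

print "$_\n" for @lines;
```

With strict in force, every variable has to be declared with `my`, which would have flagged the undeclared `$line`, `$field`, `$minlen` and `$pastquote` in the candidate's code.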

+27

innaM Jun 09 '09 at 9:25

I have to (sort of) disagree with most of the opinions voiced here.

Since the code in question can be expressed much more compactly and maintainably in idiomatic Perl, you really have to ask how much time the candidate spent developing this solution, and how much time someone halfway proficient in idiomatic Perl would have spent.

I think you will find that this coding style can be a huge waste of time (and therefore of the company's money).

I am not saying that every Perl programmer needs to grok the language, which, unfortunately, would be asking too much, but they should know enough not to spend a lot of time re-implementing core language features in their code again and again.

EDIT: Looking at the code again, I have to be more decisive: although the code looks very clean, it is actually terrible. I'm sorry. This is not Perl. Do you know the saying "you can program Fortran in any language"? Yes, you can. But you shouldn't.

+22

Konrad Rudolph Jun 09 '09 at 7:10

This is a case where you need to follow up with the programmer. Ask him why he wrote it this way.

Maybe there is a very good reason. Perhaps it needed to follow the same behavior as existing code, and so he did a line-by-line translation for full compatibility. If so, give him points for a decent explanation.

Or maybe he does not know Perl, so he learned it that day to answer the question. If so, give him points for fast and agile learning skills.

The only disqualifying comment might be "I always program Perl this way. I don't understand this regular expression stuff."

+13

SPWorley Jun 09 '09 at 6:51

Does it work? Did he write it in an acceptable amount of time? Do you find it maintainable?

If you can answer these three questions, you can cross the bridge of death ( * ).

+9

thijs Jun 09 '09 at 6:41

I would say that his code is an adequate solution. It works, right? And there is an advantage for maintainability in writing code "longhand" rather than in as few characters as you can.

Perl's motto is "There's more than one way to do it." Perl does not really get uptight about coding style the way some languages do (I like Python too, but you have to admit that people can get kind of snobby when evaluating whether code is "pythonic" or not).

+9

Bill Karwin Jun 09 '09 at 6:45

Recently, one of my colleagues interviewed some job candidates and one said that they have very good Perl experience.

If this person believes that he has very good Perl experience and writes Perl like this, he is probably a victim of the Dunning-Kruger effect.

So this is a no-hire.

+9

Unknown Jun 10 '09 at 0:40

I think the biggest problem is that he or she did not show any knowledge of regular expressions. And that is key in Perl.

The question is, can they learn? There is so much more to look for in a candidate beyond this piece of code.

+8

Artem Russakovskii Jun 09 '09 at 6:44

I would not accept the candidate. He or she is not comfortable with Perl idioms, which will lead to suboptimal code, lower productivity (all those unnecessary lines have to be written!) and an inability to read code written by an experienced Perl coder (who, of course, uses regular expressions and the like freely).

But it works ...

+5

Erich Kitzmueller Jun 09 '09 at 6:55

The opening block alone shows that he has missed the basics of Perl.

    while ($line ne "") {
        # Skip spaces and start with empty field.

        if (substr ($line,0,1) eq " ") {
            $line = substr ($line,1);
            next;
        }

This should at least be written with a regular expression to remove the leading whitespace. I like the answer from jrockway best; modules rock. Still, I would have used regular expressions for this, something like the following.

    #!/usr/bin/perl -w
    #
    # $Id$
    #

    use strict;

    open(FD, "< qq.in") || die "Failed to open file.";
    while (my $line = <FD>) {
        # Don't like chomp.
        $line =~ s/(\r|\n)//g;
        # ".*?[^\\\\]" = Match everything between quotations that doesn't end with
        # an escaped quotation, match lazy so we will match the shortest possible.
        # [^",]*? = Match strings that doesn't have any quotations.
        # If we combine the two above we can match strings that contains quotations
        # anywhere in the string (or doesn't contain quotations at all).
        # Put them together and match lazy again so we can match white-spaces
        # and don't include them in the result.
        my $match_field = '\s*((".*?[^\\\\]"|[^",]*?)*)\s*';
        if (not $line =~ /^$match_field,$match_field,$match_field,$match_field$/) {
            die "Invalid line: $line";
        }
        # Put values in nice variables so we don't have to deal with cryptic $N
        # (and can use $1 in replace).
        my ($user_id, $name, $level, $numeric_id) = ($1, $3, $5, $7);
        print "$line\n";
        for my $field ($user_id, $name, $level, $numeric_id) {
            # If the field starts with a quotation,
            # strip everything after the first unescaped quotation.
            $field =~ s/^"(.*?[^\\\\])".*/$1/g;
            # Now fix all escaped variables (not only quotations).
            $field =~ s/\\(.)/$1/g;
            print " [$field]\n";
        }
    }
    close FD;

+5

Johan Soderberg Jun 09 '09 at 15:09

Forgive this guy. I would not dare to parse CSV with a regex, although it can be done.

A DFA in structured code is more obvious than a regular expression here, and the DFA -> regular expression translation is nontrivial and prone to stupid mistakes.
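To illustrate what a DFA in structured code could look like, here is a minimal sketch of a hand-written state machine for one CSV line. The function name `split_csv_line` and the state names are hypothetical. It assumes backslash escapes inside quotes, as in the question, and it assumes that text following a closing quote is discarded, which matches the expected output shown in the question.

```perl
use strict;
use warnings;

# Hypothetical sketch: one explicit state per parsing situation.
# States: start (before a field), plain (unquoted field), quoted,
# escape (just saw a backslash inside quotes), closed (after closing quote).
sub split_csv_line {
    my ($line) = @_;
    my @fields;
    my $field = '';
    my $state = 'start';

    for my $ch (split //, $line) {
        if ($state eq 'start') {
            if    ($ch eq ' ') { next }                    # skip leading spaces
            elsif ($ch eq '"') { $state = 'quoted' }
            elsif ($ch eq ',') { push @fields, '' }        # empty field
            else               { $field = $ch; $state = 'plain' }
        }
        elsif ($state eq 'plain') {
            if ($ch eq ',') {
                $field =~ s/ +$//;                         # trim trailing spaces
                push @fields, $field;
                ($field, $state) = ('', 'start');
            }
            else { $field .= $ch }
        }
        elsif ($state eq 'quoted') {
            if    ($ch eq '\\') { $state = 'escape' }
            elsif ($ch eq '"')  { $state = 'closed' }
            else                { $field .= $ch }
        }
        elsif ($state eq 'escape') {
            $field .= $ch;                                 # keep escaped char as-is
            $state = 'quoted';
        }
        elsif ($state eq 'closed') {
            if ($ch eq ',') {
                push @fields, $field;
                ($field, $state) = ('', 'start');
            }
            # anything else after the closing quote is dropped
        }
    }

    # Flush the final field; a blank line yields no fields at all.
    $field =~ s/ +$// if $state eq 'plain';
    push @fields, $field if $state ne 'start' || @fields;
    return @fields;
}

my @row = split_csv_line(q{ms,"Mark \"X-Men\" Spencer","guest user",2});
print "[$_]\n" for @row;
```

Each transition is a plain comparison, so adding a rule (for example a different escape convention) means adding a branch to one state rather than reworking a pattern.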

+5

Joshua Jun 09 '09 at 15:19

Perhaps ask him to write more versions of the same code? When in doubt about hiring, ask the candidate more questions.

+3

sbidwai Jun 09 '09 at 6:51

The fact that he did not use a single regular expression in the code should make you ask him a lot of questions about why he wrote it this way.

Perhaps he is Jamie Zawinski, or a fan, and did not want to have more problems?

I am not necessarily saying that all of the parsing should be one huge unreadable CSV regular expression, such as ("([^"]*|"{2})*"(,|$))|"[^"]*"(,|$)|[^,]+(,|$)|(,) or one of the many similar regexes, but regular expressions should at least be used for traversing the lines, or instead of using substr().

+3

Vinko Vrsalovic Jun 09 '09 at 6:52

The code not only shows that the candidate does not really know Perl; all those lines that say $line = substr ($line,1) are horrible in any language. Try parsing a long line (say, a few thousand fields) with this kind of approach and you will see why. It is exactly the kind of problem Joel Spolsky discussed in this post.
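One way to avoid the repeated copying, sketched under the assumption of plain unquoted fields for brevity, is to let the regex engine remember its position in the string instead of rebuilding the remainder after every character. The function name `split_unquoted` is hypothetical.

```perl
use strict;
use warnings;

# Sketch only: split a line of simple unquoted fields without ever copying
# the rest of the string. With /g in scalar context, Perl remembers where
# the previous match ended (see pos), and \G anchors the next match there.
sub split_unquoted {
    my ($line) = @_;
    my @fields;
    while ($line =~ /\G([^,]*)(,|$)/g) {
        push @fields, $1;
        last if $2 eq '';    # matched the end of the line, not a comma
    }
    return @fields;
}

my $line   = join ',', 1 .. 5000;
my @fields = split_unquoted($line);
print scalar(@fields), " fields\n";
```

Each character is examined once, so the work grows linearly with the length of the line, whereas copying the remainder of the string at every step grows roughly quadratically.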

+3

itub Mar 01 '10 at

The obvious question might be: if you are not using Perl at your company in the first place, does it matter how good his Perl code is?

I'm not sure that the elegance of his Perl code says a lot about his skills in any language that you actually use.

+1

jalf Jun 09 '09 at 7:00

As a non-Perl programmer, I have to say that this is probably the most readable Perl I have ever read! :)

Hiring someone for something like a scripting language, which can be learned in a few days to a few weeks (if it is a decent scripting language!), seems extremely wrong in the first place.

Personally, I would probably have hired this person, for various reasons. The code is well structured and fairly well commented. Language features can easily be taught later.

+1

Serapth Jun 10 '09 at 0:26

The crucial point here, once the code works as expected, is naturally whether it is maintainable.

Do you understand it?
Would you feel comfortable fixing a bug in it?

Perl programs tend to look like the result of a cat accidentally walking across a keyboard. If this person knows how to write readable Perl code that suits the team, that is actually a good thing.

Then you might want to teach him regular expressions, but only carefully :-)

+1

Thorbjørn Ravn Andersen Dec 23 '09 at 2:00

The code looks clean and readable.

+1

luis.espinal 26 . '10 14:36

    open (IN, "csv.csv");
    while (<IN>) {
        #print $_;
        chomp;
        @array = split(/,/,$_);
        print "[User Id] = $array[0] [Name] = $array[1] [Level] = $array[2] [Numeric ID] = $array[3]\n";
    }

0

chopppen5 04 . '09 5:49

brian d foy · Accepted Answer · 2009-06-09 06:52

I advise people never to hire Perl programmers, or C programmers, or Java programmers, and so on. Just hire good people. The programmers I have hired to write Perl were also skilled in several other languages. I hired them because they were good programmers, and good programmers can deal with several languages.

Now, that code does look a lot like C, but I think it is fine Perl too. If you hire a good programmer with a little Perl practice under his belt, he will catch up just fine. People are complaining about the lack of regular expressions, which would make things simpler in ancillary areas, but I would not wish on anyone a regex solution for parsing that dirty CSV data. I would not want to read it or maintain it.

I often find the reverse problem more troublesome: hire a good programmer who writes good Perl code, but the rest of the team knows only the basics of Perl and cannot keep up. This has nothing to do with poor formatting or bad structure, just the level of skill with advanced topics (such as closures).

Things are getting a bit heated in this debate, so I think I should explain more about how I deal with this sort of thing. I do not see this as a regex/no-regex problem. I would not have written the code the way the candidate did, but that does not really matter.

I write quite a bit of crappy code too. On the first pass, I am usually thinking more about structure and process than about syntax. I go back later to tighten it up. That does not mean the candidate's code is any good, but for a first pass done for an interview I do not judge it too harshly. I do not know how much time he had to write it, and so on, so I do not judge it against something I would have had a long time to work on. Interview questions are always weird because you cannot do what you would really do for real work. I would probably fail a question about writing a CSV parser too if I had to start from scratch and do it in 15 minutes. In fact, I wasted more time than that today being an absolute idiot with some code.

I went to look at the code for Text::CSV_PP, the pure-Perl cousin of Text::CSV_XS. It uses regular expressions, but a lot of regular expressions that handle special cases, and in structure it is not that different from the code presented here. It is a lot of code, and it is complicated code that I hope I never have to look at again.

What I tend to have trouble with are interview answers that address only the given input. That is almost always the wrong thing to do in the real world, where you have to handle cases that you have not discovered yet and you need the flexibility to deal with future issues. I find that a lot of the answers on Stack Overflow have the same problem. The thought process behind a solution tells me more. People become skilled in a language more easily than they change how they think about things. I can teach people to write better Perl, but for the most part I cannot change their wetware. That comes from scars and experience.

Since I was not there to watch the candidate code the solution or to ask him follow-up questions, I will not speculate on why he wrote it the way he did. For some of the other solutions I have seen here, I could be just as harsh in an interview.

A career is a journey. I do not expect everyone to be a guru or to have the same experiences. If I write people off because they do not know some trick or idiom, I am not giving them the chance to continue their journey. The candidate's code will not win any prizes, but apparently it was enough to get him into the final three for consideration for an offer. The guy got up there and tried, did much better than a lot of the code I have seen in my life, and that is good enough for me.
