Using readline (aka <>) with select is wrong for two reasons: it is buffered, and it blocks.
Buffering is bad
More precisely, buffering in buffers that you cannot inspect is bad.
The system can do all the buffering it wants, because you can peek into its buffers using select.
Perl's IO system must not be allowed to do any buffering, because you cannot peek into its buffers.
Let's look at an example of what might happen using readline in a select loop.
- "abc\ndef\n" arrives at the handle.
- select notifies you that there is data to read.
- readline tries to read a chunk from the handle.
- "abc\ndef\n" is placed in Perl's buffer for the handle.
- readline returns "abc\n".
At this point, you call select again, and you want it to tell you that there is still something to read ("def\n"). However, select will report that there is nothing to read, since select is a system call and the data has already been read from the system. That means you have to wait for more data to arrive before you can read "def\n".
The following program illustrates this:
    use IO::Select qw( );
    use IO::Handle qw( );

    sub producer {
        my ($fh) = @_;
        for (;;) {
            print($fh time(), "\n") or die;
            print($fh time(), "\n") or die;
            sleep(3);
        }
    }

    sub consumer {
        my ($fh) = @_;
        my $sel = IO::Select->new($fh);
        while ($sel->can_read()) {
            my $got = <$fh>;
            last if !defined($got);
            chomp $got;
            print("It took ", (time()-$got), " seconds to get the msg\n");
        }
    }

    pipe(my $rfh, my $wfh) or die;
    $wfh->autoflush(1);
    fork() ? producer($wfh) : consumer($rfh);
Output:
    It took 0 seconds to get the msg
    It took 3 seconds to get the msg
    It took 0 seconds to get the msg
    It took 3 seconds to get the msg
    It took 0 seconds to get the msg
    ...
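The same effect can be observed without timing or fork. The following is a minimal sketch, not part of the original program: hidden_line_demo is a hypothetical name, and it assumes Perl's default buffered IO layer and a Unix-like system where select works on pipes. It writes two lines to a pipe, reads one with readline, and asks select (with a zero timeout) whether anything is left.

```perl
use strict;
use warnings;
use IO::Select qw( );
use IO::Handle qw( );

# Hypothetical helper: returns what select reports before and after a single
# buffered readline, along with the two lines read.
sub hidden_line_demo {
    pipe(my $rfh, my $wfh) or die "pipe: $!";
    $wfh->autoflush(1);

    # Both lines reach the pipe together.
    print($wfh "abc\ndef\n") or die "print: $!";

    my $sel = IO::Select->new($rfh);

    # Count of ready handles; a timeout of 0 means "poll, do not wait".
    my $ready_before = () = $sel->can_read(0);

    # readline pulls everything available into Perl's buffer, returns one line.
    my $first = <$rfh>;

    # The system's buffer is now empty, so select sees nothing to read.
    my $ready_after = () = $sel->can_read(0);

    # The second line is served from Perl's buffer, which select cannot see.
    my $second = <$rfh>;

    return ($ready_before, $first, $ready_after, $second);
}
```

Calling hidden_line_demo() should show select reporting one ready handle before the readline and none after it, even though "def\n" is still waiting in Perl's buffer.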
This can be fixed using unbuffered I/O:
    sub consumer {
        my ($fh) = @_;
        my $sel = IO::Select->new($fh);
        my $buf = '';
        while ($sel->can_read()) {
            # Append whatever is available to our own buffer; stop on EOF or error.
            sysread($fh, $buf, 64*1024, length($buf)) or last;
            # Extract every complete line currently in the buffer.
            while ($buf =~ s/^(.*)\n//) {
                my $got = $1;
                print("It took ", (time()-$got), " seconds to get the msg\n");
            }
        }
    }
Output:
    It took 0 seconds to get the msg
    It took 0 seconds to get the msg
    It took 0 seconds to get the msg
    It took 0 seconds to get the msg
    It took 0 seconds to get the msg
    It took 0 seconds to get the msg
    ...
Blocking is bad
Let's look at an example of what might happen using readline in a select loop.
- "abcdef" (with no newline yet) arrives at the handle.
- select notifies you that there is data to read.
- readline tries to read a chunk from the socket.
- "abcdef" is placed in Perl's buffer for the handle.
- readline has not received a newline, so it tries to read another chunk from the socket.
- There is no more data available, so it blocks.
This defeats the purpose of using select.
[Demo code forthcoming]
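A minimal sketch of such a demo, under stated assumptions: blocking_demo is a hypothetical name, the system is Unix-like, and alarm is used only so the demo can give up instead of hanging forever. A partial line is written to a pipe, select reports the handle as readable, and readline then waits for a newline that never comes.

```perl
use strict;
use warnings;
use IO::Select qw( );
use IO::Handle qw( );

# Hypothetical helper: returns (ready, outcome), where ready is the number of
# handles select reported and outcome says what readline did within $timeout
# seconds.
sub blocking_demo {
    my ($timeout) = @_;

    pipe(my $rfh, my $wfh) or die "pipe: $!";
    $wfh->autoflush(1);

    # A partial message: the newline has not been sent yet.
    print($wfh "abcdef") or die "print: $!";

    my $sel   = IO::Select->new($rfh);
    my $ready = () = $sel->can_read(0);   # select says there is data

    my $outcome = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($timeout);
        my $line = <$rfh>;                # waits for "\n"
        alarm(0);
        defined($line) ? "returned" : "eof";
    };
    alarm(0);

    if (!defined($outcome)) {
        die $@ if $@ ne "timeout\n";      # rethrow anything unexpected
        $outcome = "blocked";
    }

    return ($ready, $outcome);
}
```

With a one-second timeout, blocking_demo(1) should report that select saw a ready handle and that readline was still blocked when the alarm fired.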
Solution
You must implement a version of readline that does not block and that uses only buffers you can inspect. The second part is easy, because you can inspect the buffers you create.
- Create a buffer for each handle.
- When data arrives from a handle, read it, but no more. When data is waiting (as we know from select), sysread returns what is available without waiting for more to arrive. That makes sysread ideal for this task.
- Append the data read to the appropriate buffer.
- For each complete message in the buffer, extract it and process it.
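The buffer handling in the steps above can be sketched as two small helpers. The names fill_buffer and extract_lines are hypothetical, and the sketch assumes messages are newline-terminated lines, as in the earlier examples.

```perl
use strict;
use warnings;

# Hypothetical helper: one sysread, appended to the caller's buffer.
# Returns the number of bytes read, 0 at end of file, or undef on error.
sub fill_buffer {
    my ($fh, $buf_ref) = @_;
    return sysread($fh, $$buf_ref, 64*1024, length($$buf_ref));
}

# Hypothetical helper: removes every complete line from the buffer and
# returns them without their newlines. A trailing partial line is left in
# the buffer until the rest of it arrives.
sub extract_lines {
    my ($buf_ref) = @_;
    my @lines;
    while ($$buf_ref =~ s/^(.*)\n//) {
        push @lines, $1;
    }
    return @lines;
}
```

For example, with a buffer holding "abc\nde", extract_lines returns "abc" and leaves "de" behind; once "f\n" has been appended, the next call returns "def".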
Adding a handle:
    $select->add($fh);
    $clients{fileno($fh)} = { buf => '', ... };
select loop:
    while (my @ready = $select->can_read) {
        for my $fh (@ready) {
            my $client = $clients{fileno($fh)};
            our $buf; local *buf = \($client->{buf});   # alias $buf to this client's buffer
            ...
        }
    }
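One way such a loop could be completed is sketched below. This is an illustration under assumptions, not the original code: run_select_loop and its $on_line callback are hypothetical names, messages are assumed to be newline-terminated, and a handle is simply dropped on end of file or error (discarding any partial line still in its buffer).

```perl
use strict;
use warnings;
use IO::Select qw( );

# Hypothetical driver: reads newline-terminated messages from every handle in
# @$handles and calls $on_line->($fh, $line) for each complete one. Returns
# once every handle has reached end of file or failed.
sub run_select_loop {
    my ($handles, $on_line) = @_;

    my $select = IO::Select->new();
    my %clients;
    for my $fh (@$handles) {
        $select->add($fh);
        $clients{fileno($fh)} = { buf => '' };
    }

    # Stop when no handles remain; otherwise wait until at least one is ready.
    while ($select->count() && (my @ready = $select->can_read())) {
        for my $fh (@ready) {
            my $client = $clients{fileno($fh)};

            # One sysread per notification: it returns what is available
            # without waiting for more to arrive.
            my $rv = sysread($fh, $client->{buf}, 64*1024, length($client->{buf}));
            if (!$rv) {
                # undef means error, 0 means end of file; drop the handle.
                delete($clients{fileno($fh)});
                $select->remove($fh);
                next;
            }

            # Hand over every complete line now in this client's buffer.
            while ($client->{buf} =~ s/^(.*)\n//) {
                my $line = $1;
                $on_line->($fh, $line);
            }
        }
    }
}
```

A caller would pass an array reference of readable handles and a callback, for example run_select_loop([$rfh], sub { my ($fh, $line) = @_; print("$line\n") }).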
By the way, this is much easier to do using threads, and this doesn't even handle writers!