Stdin behaves differently when piping versus redirecting

I am trying to pass data to a program that does not accept input from stdin. To do this, I pass /dev/stdin as the file argument and then try to pipe my input in. I noticed that if I do this with the pipe character:

[ pkerp@comp ernwin]$ cat fess/structures/168d.pdb | MC-Annotate /dev/stdin 

I get no output. If, however, I do the same thing using the left angle bracket, it works fine:

    [ pkerp@plastilin ernwin]$ MC-Annotate /dev/stdin < fess/structures/168d.pdb
    Residue conformations -------------------------------------------
    A1 : G C3p_endo anti
    A2 : C C3p_endo anti
    A3 : G C3p_endo anti

My question is: what is the difference between these two operations, and why do they give different results? As a bonus question, is there a proper term for supplying input with the '<' character?

Update:

My best guess is that something inside the program performs a file seek. The answers below suggest that this has something to do with file pointers, but running the following small test program:

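    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s FILE\n", argv[0]);
            return 1;
        }

        FILE *f = fopen(argv[1], "r");
        if (f == NULL) {
            perror("fopen");
            return 1;
        }

        char line[128];

        printf("argv[1]: %s f: %d\n", argv[1], fileno(f));

        /* First pass: read until EOF. */
        while (fgets(line, sizeof(line), f)) {
            printf("line: %s\n", line);
        }

        /* Rewind and read again; this only works if f is seekable. */
        printf("rewinding\n");
        fseek(f, 0, SEEK_SET);

        while (fgets(line, sizeof(line), f)) {
            printf("line: %s\n", line);
        }

        fclose(f);
        return 0;
    }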

shows that everything behaves identically until fseek is called:

    [ pete@kat tmp]$ cat temp | ./a.out /dev/stdin
    argv[1]: /dev/stdin f: 3
    line: abcd
    rewinding
    ===================
    [ pete@kat tmp]$ ./a.out /dev/stdin < temp
    argv[1]: /dev/stdin f: 3
    line: abcd
    rewinding
    line: abcd

Using process substitution, as Christopher Neilan suggested, causes the program above to hang without even reading the input, which also seems a bit strange.

    [ pete@kat tmp]$ ./a.out /dev/stdin <( cat temp )
    argv[1]: /dev/stdin f: 3

The strace output confirms my suspicion that a seek is being attempted, which fails in the pipe version:

 _llseek(3, 0, 0xffffffffffd7c7c0, SEEK_CUR) = -1 ESPIPE (Illegal seek) 

And succeeds in the redirect version:

 _llseek(3, 0, [0], SEEK_CUR) = 0 
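
The same ESPIPE condition seen in the strace output can be probed from C before relying on fseek. The following is only a minimal sketch and not part of the original test program: the helper names fd_is_seekable and path_is_seekable are hypothetical, and it assumes a POSIX system where lseek on a pipe fails with ESPIPE.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd supports seeking, 0 if lseek
 * fails with ESPIPE (pipe, FIFO, or socket), and -1 on any other error. */
int fd_is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}

/* Hypothetical helper: opens path with fopen, as the test program does,
 * and reports whether a later fseek on that stream could succeed.
 * Returns -1 if the file cannot be opened. */
int path_is_seekable(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int result = fd_is_seekable(fileno(f));
    fclose(f);
    return result;
}
```

Called as path_is_seekable("/dev/stdin") from a main function, this should report 0 when the input arrives through a pipe and 1 when it is redirected from a regular file, in line with the two strace results above.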

The moral of the story: do not blindly substitute /dev/stdin for a file argument and pipe into it. It might work, but it might not.
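
For a program whose source you control, one way to make a rewind work no matter how the input arrives is to copy the stream into a temporary file before reading it. This is a sketch under assumptions, not something MC-Annotate is known to do: the function name spool_to_tmpfile is hypothetical, and it assumes the input is small enough to be copied to disk.

```c
#include <stdio.h>

/* Hypothetical helper: copies everything readable from `in` into an
 * anonymous temporary file and returns that file positioned at its start.
 * The result is a regular file, so fseek works on it even when `in`
 * was a pipe. Returns NULL on failure. */
FILE *spool_to_tmpfile(FILE *in)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return NULL;

    char buf[4096];
    size_t n;

    /* Copy in fixed-size chunks until EOF or a read error. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, tmp) != n) {
            fclose(tmp);
            return NULL;
        }
    }

    /* Fail on read errors, then flush and rewind the copy. */
    if (ferror(in) || fflush(tmp) != 0 || fseek(tmp, 0, SEEK_SET) != 0) {
        fclose(tmp);
        return NULL;
    }

    return tmp;
}
```

In the test program above, the stream returned by fopen could be passed through this helper and the two read loops run against the copy instead. It does not help with a closed program such as MC-Annotate, where the input has to be seekable from the outside.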

+7
3 answers

The problem lies in how the files are opened for reading.

/dev/stdin is not a real file; it is a symbolic link to the file that the current process uses as standard input. In a typical shell, it is linked to the terminal and is inherited by any process the shell launches. Keep in mind that MC-Annotate will only read from the file provided as an argument.

In the pipe example, /dev/stdin is a symbolic link to the file that MC-Annotate inherits as standard input: the terminal. It probably opens this file on a new descriptor (say 3, but it could be any value greater than 2). The pipe connects the standard output of cat to MC-Annotate's standard input (file descriptor 0), which MC-Annotate continues to ignore in favor of the file that it opened directly.

In the redirection example, the shell connects fess/structures/168d.pdb directly to file descriptor 0 before running MC-Annotate. When MC-Annotate starts up, it again tries to open /dev/stdin, which this time points to fess/structures/168d.pdb instead of the terminal.

So the answer comes down to which file /dev/stdin links to in the process that executes MC-Annotate; shell redirections are set up before the process starts, pipelines after the process starts.

Does this work?

 cat fess/structures/168d.pdb | MC-Annotate <( cat /dev/stdin ) 

A similar command

 echo foo | cat <( cat /dev/stdin ) 

seems to work, but I won't claim that the situations are identical.


[UPDATE: does not work. /dev/stdin is still a link to the terminal, not the pipe.]

This may provide a workaround. Now MC-Annotate inherits its standard input from a subshell, not the current shell, and the subshell has the output of cat as its standard input, not the terminal.

 cat fess/structures/168d.pdb | ( MC-Annotate /dev/stdin ) 

I think a simple command group would work as well:

 cat fess/structures/168d.pdb | { MC-Annotate /dev/stdin; } 
+1

There should be no functional difference between the two commands. Indeed, I cannot reproduce what you are seeing:

    #!/usr/bin/perl
    # test.pl
    # this is a test Perl script that will read from a filename passed on the command line, and print what it reads.

    use strict;
    use warnings;

    print $ARGV[0], " -> ", readlink( $ARGV[0] ), " -> ", readlink( readlink($ARGV[0]) ), "\n";

    open( my $fh, "<", $ARGV[0] ) or die "$!";

    while( defined(my $line = <$fh>) ){
        print "READ: $line";
    }

    close( $fh );

Running this in three ways:

    ( caneylan@faye.sn : tmp)$ cat input
    a
    b
    c
    d
    ( caneylan@faye.sn : tmp)$ ./test.pl /dev/stdin
    /dev/stdin -> /proc/self/fd/0 -> /dev/pts/0
    this is me typing into the terminal
    READ: this is me typing into the terminal
    ( caneylan@faye.sn : tmp)$ cat input | ./test.pl /dev/stdin
    /dev/stdin -> /proc/self/fd/0 -> pipe:[1708285]
    READ: a
    READ: b
    READ: c
    READ: d
    ( caneylan@faye.sn : tmp)$ ./test.pl /dev/stdin < input
    /dev/stdin -> /proc/self/fd/0 -> /tmp/input
    READ: a
    READ: b
    READ: c
    READ: d

First of all, note what /dev/stdin is:

    ( caneylan@faye.sn : tmp)$ ls -l /dev/stdin
    lrwxrwxrwx 1 root root 15 Apr 21 15:39 /dev/stdin -> /proc/self/fd/0
    ( caneylan@faye.sn : tmp)$ ls -l /proc/self
    lrwxrwxrwx 1 root root 0 May 10 09:44 /proc/self -> 27565

It is always a symbolic link to /proc/self/fd/0. /proc/self is itself a special link to the directory under /proc for the current process. Thus, /dev/stdin will always point to fd 0 of the current process. So when you run MC-Annotate (or, in my examples, test.pl), the file /dev/stdin will resolve to /proc/$pid/fd/0, where $pid is whatever the process ID of MC-Annotate happens to be. This is simply a consequence of how the symlink for /dev/stdin works.

So, as you can see above in my example, when you use a pipe ( | ), /proc/self/fd/0 will point to the read end of the pipe from cat, as set up by the shell. When you use redirection ( < ), /proc/self/fd/0 will point directly to the input file, as set up by the shell.
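
The pipe-versus-file distinction described above can also be observed from inside a program by calling fstat on the descriptor. This is a minimal sketch, assuming a POSIX system; the enum and the function names classify_fd, fd_kind_name and print_stdin_kind are hypothetical and chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical classification of what a file descriptor refers to. */
enum fd_kind {
    FD_KIND_ERROR = -1,
    FD_KIND_OTHER,
    FD_KIND_PIPE,     /* pipe or FIFO, as set up by `cmd | prog` */
    FD_KIND_REGULAR,  /* regular file, as set up by `prog < file` */
    FD_KIND_CHARDEV   /* character device, such as a terminal */
};

/* Classifies fd by the file type bits that fstat reports. */
enum fd_kind classify_fd(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return FD_KIND_ERROR;
    if (S_ISFIFO(st.st_mode))
        return FD_KIND_PIPE;
    if (S_ISREG(st.st_mode))
        return FD_KIND_REGULAR;
    if (S_ISCHR(st.st_mode))
        return FD_KIND_CHARDEV;
    return FD_KIND_OTHER;
}

/* Maps a classification to a printable label. */
const char *fd_kind_name(enum fd_kind kind)
{
    switch (kind) {
    case FD_KIND_PIPE:    return "pipe";
    case FD_KIND_REGULAR: return "regular file";
    case FD_KIND_CHARDEV: return "character device";
    case FD_KIND_OTHER:   return "other";
    default:              return "error";
    }
}

/* Prints what standard input (fd 0) of the current process refers to. */
void print_stdin_kind(void)
{
    printf("fd 0 is: %s\n", fd_kind_name(classify_fd(STDIN_FILENO)));
}
```

A program that calls print_stdin_kind from main could be run once behind a pipe and once with a redirection to see which kind of file the shell attached to fd 0 in each case.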

As for why you see this strange behavior, I would guess that MC-Annotate does some checks on the file type before opening it, sees that /dev/stdin points to a pipe instead of a regular file, and bails out. You can confirm this either by reading the source code for MC-Annotate or by using the strace command to see what happens inside.

Note that both of these methods are a bit roundabout in Bash. The accepted way to get the output of a process into a program that will only open a file name is to use process substitution:

 $ MC-Annotate <(cat fess/structures/168d.pdb) 

The <(...) construct yields a file descriptor path for the read end of a pipe that is fed by whatever ... is:

    ( caneylan@faye.sn : tmp)$ echo <(true | grep example | cat)
    /dev/fd/63
+1

Judging from this information about MC-Annotate, http://bioinfo.cipf.es/ddufour/doku.php?id=mc-annotate, the reason the pipe does not work is that MC-Annotate does not recognize the output of cat on a file as one of the .pdb types.

A pipe chains commands together, with the output of the first used as input for the next.

"<" ("less than", "left arrow", "left angle bracket") feeds the file into the command.

http://tldp.org/LDP/abs/html/io-redirection.html#IOREDIRECTIONREF2

0
