My program gets stuck; is a pipe file descriptor open when it shouldn't be?

I am writing a small shell that can read commands. When I run my program and type `cat file.txt > file2.txt`, it creates the file and then gets stuck at the line `if(execvp(structVariables->argv[0], argv) < 0)` (waiting for input/output?). If I press Ctrl+D to end the program, I can see in my folder that the file was created, but nothing was written to it. (dupPipe is meant for processing pipelines of commands; it is not used yet because of the problem described above.)

    if ((pid = fork()) < 0) {
        perror("fork error");
    } else if (pid > 0) { // Parent
        if (waitpid(pid, NULL, 0) < 0) {
            perror("waitpid error");
        }
    } else { // Child
        int flags = 0;
        if (structVariables->outfile != NULL) {
            flags = 1; // Write
            redirect(structVariables->outfile, flags, STDOUT_FILENO);
        }
        if (structVariables->infile != NULL) {
            flags = 2; // Read
            redirect(structVariables->infile, flags, STDIN_FILENO);
        }
        if (execvp(structVariables->argv[0], argv) < 0) {
            perror("execvp error");
            exit(EXIT_FAILURE);
        }
    }

The two functions that I use in my program are as follows, dupPipe and redirect:

    int dupPipe(int pip[2], int end, int destinfd)
    {
        if (end == READ_END) {
            dup2(pip[0], destinfd);
            close(pip[0]);
        } else if (end == WRITE_END) {
            dup2(pip[1], destinfd);
            close(pip[1]);
        }
        return destinfd;
    }

    int redirect(char *filename, int flags, int destinfd)
    {
        int newfd;
        if (flags == 1) {
            if (access(filename, F_OK) != -1) { // If file already exists
                errno = EEXIST;
                printf("Error: %s\n", strerror(errno));
                return -1;
            }
            newfd = open(filename, O_WRONLY | O_CREAT | O_EXCL,
                         S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);
            if (newfd == -1) {
                perror("Open for write failed");
                return -1;
            }
        } else if (flags == 2) {
            newfd = open(filename, O_RDONLY);
            if (newfd == -1) {
                perror("Open for read failed");
                return -1;
            }
        } else {
            return -1;
        }
        if (dup2(newfd, destinfd) == -1) {
            perror("dup2 failed");
            close(newfd);
            return -1;
        }
        if (newfd != destinfd) {
            close(newfd);
        }
        return destinfd;
    }
3 answers

It seems that you are trying to code a shell that runs commands read from input (if this is not the case, please edit your question, as that is not clear).

I'm not sure why you think pipes are involved in a command like `cat file.txt > file2.txt`, but they are not. Here is what happens under the hood when you enter `cat file.txt > file2.txt` into a shell like bash:

  • A child process is created where cat(1) will be launched.
  • The child process opens file2.txt for writing (more on this later).
  • If open(2) succeeds, the child process duplicates the newly opened file descriptor on stdout (so stdout will effectively point to the same entry in the file table as file2.txt ).
  • cat(1) is executed by calling one of the seven exec() functions. The file.txt argument is passed to cat(1), so cat(1) opens file.txt and copies its contents to stdout (which is redirected to file2.txt).
  • cat(1) finishes and exits, which closes and cleans up any open file descriptors. By the time cat(1) terminates, file2.txt is a copy of file.txt.
  • At the same time, the parent shell process waits for the child process to complete before printing the next prompt and waiting for more commands.
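The steps above can be sketched in a few lines of C. This is a minimal illustration, not the poster's code; the helper name run_cat_redirect and the hard-coded `cat` command are just for demonstration:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `cat infile` with stdout redirected to outfile, following the
 * fork / open / dup2 / exec steps described above.
 * Returns 0 on success, -1 on failure. */
int run_cat_redirect(const char *infile, const char *outfile)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                              /* child */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0)
            _exit(1);
        if (dup2(fd, STDOUT_FILENO) < 0)         /* stdout -> outfile */
            _exit(1);
        if (fd != STDOUT_FILENO)
            close(fd);                           /* drop the extra copy */
        execlp("cat", "cat", infile, (char *)NULL);
        _exit(127);                              /* reached only if exec fails */
    }
    int status;                                  /* parent waits for cat */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Note that only the child's stdout is touched; the parent shell's descriptors are untouched because fork() gives the child its own copy of the descriptor table.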

As you can see, pipes are not used to redirect I/O. Pipes are an interprocess communication mechanism used to feed one process's output to another process's input. There is only one process (cat) here, so why would you need pipes?
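To make the distinction concrete, here is a minimal sketch of what a pipe actually does: bytes written to the write end come out of the read end. Normally the two ends are held by two related processes; a single process does both here only for brevity:

```c
#include <string.h>
#include <unistd.h>

/* Write msg into a pipe and read it back into buf.
 * Returns 0 on success, -1 on failure. */
int pipe_roundtrip(const char *msg, char *buf, size_t buflen)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;
    ssize_t w = write(fds[1], msg, strlen(msg)); /* feed the write end */
    close(fds[1]);                               /* close it: reader sees EOF */
    ssize_t r = read(fds[0], buf, buflen - 1);   /* drain the read end */
    close(fds[0]);
    if (w < 0 || r < 0)
        return -1;
    buf[r] = '\0';
    return 0;
}
```

In a real pipeline the shell would fork after pipe(), so the writer (e.g. ls) holds fds[1] and the reader (e.g. grep) holds fds[0].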

This means that you must call redirect() with STDOUT_FILENO as destinfd (instead of a pipe descriptor) to redirect the output. Likewise, input redirection should call redirect() with STDIN_FILENO. These constants are defined in unistd.h, so be sure to include that header.

You will also probably want to exit() in the child if exec() fails; otherwise you will end up with 2 copies of the shell process running.

Last but not least, you should not make input and output redirection mutually exclusive. The user may well want both input and output redirection. So instead of an else if when doing I/O redirection, I would just use 2 independent ifs.

With that in mind, the main code you posted should look something like this:

    if ((pid = fork()) < 0) {
        perror("fork error");
    } else if (pid > 0) { // Parent
        if (waitpid(pid, NULL, 0) < 0) {
            perror("waitpid error");
        }
    } else { // Child
        int flags = 0;
        if (structVariables->outfile != NULL) {
            flags = 1; // Write
            // We need STDOUT_FILENO here
            redirect(structVariables->outfile, flags, STDOUT_FILENO);
        }
        if (structVariables->infile != NULL) {
            flags = 2; // Read
            // Similarly, we need STDIN_FILENO here
            redirect(structVariables->infile, flags, STDIN_FILENO);
        }
        // This line changed; see updated answer below
        if (execvp(structVariables->argv[0], structVariables->argv) < 0) {
            perror("execvp error");
            // Terminate
            exit(EXIT_FAILURE);
        }
    }

As mentioned in another answer, your redirect() function is subject to a race condition: in the window between checking for the file's existence and actually creating the file, another process can create it (this is called a TOCTTOU bug: time-of-check to time-of-use). You should use O_CREAT | O_EXCL for atomic existence testing and file creation.

Another problem is that you always close newfd. What if newfd and destinfd are equal for some reason? Then you would mistakenly close the file, because dup2(2) does essentially nothing if you pass it two equal file descriptors. Even if you think this can never happen, it is always good practice to check whether the duplicated fd differs from the original fd before closing the original.
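This edge case is easy to demonstrate. The sketch below (the function name is just for illustration) forces newfd == destinfd; dup2() succeeds as a no-op, and only the `newfd != destinfd` guard keeps the descriptor alive:

```c
#include <fcntl.h>
#include <unistd.h>

/* dup2(fd, fd) just returns fd without doing anything, so an unguarded
 * close(fd) afterwards would close the only copy of the descriptor.
 * Returns 1 if the descriptor is still usable after the guarded close,
 * 0 if it is not, -1 on setup error. */
int guarded_dup2_keeps_fd(void)
{
    int fd = open("/dev/null", O_WRONLY);
    if (fd < 0)
        return -1;
    int destinfd = fd;                      /* same descriptor on purpose */
    if (dup2(fd, destinfd) == -1)
        return -1;
    if (fd != destinfd)                     /* guard: skip the close here */
        close(fd);
    int ok = (write(destinfd, "x", 1) == 1); /* descriptor still valid? */
    close(destinfd);
    return ok;
}
```

Drop the `fd != destinfd` check and the write() fails with EBADF, because the close() silently destroyed the descriptor dup2() appeared to set up.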

Here is the code with these problems fixed:

    int redirect(char *filename, int flags, int destinfd)
    {
        int newfd;
        if (flags == 1) {
            newfd = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
            if (newfd == -1) {
                perror("Open for write failed");
                return -1;
            }
        } else if (flags == 2) {
            newfd = open(filename, O_RDONLY);
            if (newfd == -1) {
                perror("Open for read failed");
                return -1;
            }
        } else {
            return -1;
        }
        if (dup2(newfd, destinfd) == -1) {
            perror("dup2 failed");
            close(newfd);
            return -1;
        }
        if (newfd != destinfd)
            close(newfd);
        return destinfd;
    }

Consider replacing 0666 in the open(2) call above with S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH (be sure to include sys/stat.h and fcntl.h). You could use a #define to make it cleaner, but I still think spelling it out like this is better and more descriptive than hardcoding a magic number (this is subjective).

I will not comment on dupPipe() here, as it is not needed for this part of the question; I/O redirection is all you need. If you want to continue the discussion about pipes, feel free to edit the question or create another one.

UPDATE

Ok, now that I have had a look at the complete source code, I have a few more comments.

The reason cat(1) hangs is this line:

 if (execvp(structVariables->argv[0], argv) < 0) 

The second execvp(3) argument should be structVariables->argv, not argv, because argv here is the shell program's own argument array, which is (usually) empty. Passing an empty argument list to cat(1) makes it read from stdin rather than from a file, so it appears to freeze: it is waiting for you to type input. So go ahead and replace that line with:

 if (execvp(structVariables->argv[0], structVariables->argv) < 0) 

This solves one of your problems: things like cat < file.txt > file2.txt will work now (I tested it).

About pipe redirection

So now we need to deal with pipe redirection. Pipe redirection happens every time we see | on the command line. Let's look at an example to understand what happens under the hood when you type ls | grep "file.txt" | sort. It is important to understand these steps so that you can build an accurate mental model of how the system works; without that, you won't really understand the implementation:

  • The shell (usually) first splits the command line at the pipe symbols; your code does this too. This means that after parsing, the shell has collected enough information and the command line has been divided into 3 units (the ls, grep, and sort commands).
  • The shell forks and calls one of the seven exec() functions to start ls in a child. Now remember that a pipeline means the output of one program is the input of the next, so before exec()ing the shell must create a pipe. The child process that will run ls(1) calls dup2(2) before exec() to duplicate the pipe's write end onto stdout. Similarly, the parent process calls dup2(2) to duplicate the pipe's read end onto stdin. It is very important that you understand this step: since the parent duplicated the pipe's read end onto stdin, whatever we do next (for example, forking to execute more commands) will always read input from the pipe. So at this point ls(1) is writing to stdout, which is redirected to the pipe that is read by the parent shell process.

  • Now the shell will execute grep(1). Again, it forks a new process to execute grep(1). Remember that file descriptors are inherited across fork(), and that the parent shell process has stdin attached to the read end of the pipe connected to ls(1), so the new child process that is about to execute grep(1) will "automatically" read from that pipe! But wait, there is more! The shell knows that there is another process in the pipeline (the sort command), so before executing grep (and before forking) the shell creates another pipe to connect the output of grep(1) to the input of sort(1). Then it repeats the same steps: in the child, the pipe's write end is duplicated onto stdout; in the parent, the pipe's read end is duplicated onto stdin. Again, it is important to understand what is happening here: the process that is about to execute grep(1) already reads its input from the pipe connected to ls(1), and now it has its output connected to the pipe that will feed sort(1). Thus, grep(1) essentially reads from one pipe and writes to another. OTOH, the parent shell process duplicated the latest pipe's read end onto stdin, effectively "giving up" on reading the output of ls(1) (because grep(1) will take care of that anyway); instead, its input stream is now the output of grep(1).

  • Finally, the shell sees that sort(1) is the last command, so it just forks + execs sort(1). The results are written to stdout because we never changed stdout in the shell process itself, but the input is read from the pipe that connects grep(1) to sort(1), because of our actions in the previous step.

So how is this implemented?

Simple: as long as there are at least two commands left to process, we create a pipe and fork. In the child, we close the pipe's read end, duplicate the pipe's write end onto stdout, and call one of the seven exec() functions. In the parent, we close the pipe's write end and duplicate the pipe's read end onto stdin.

When there is only one command left to process, we simply fork + exec without creating a pipe.

To clarify one more detail: before starting the pipe(2) redirection dance, we need to save a reference to the shell's original stdin, since we (possibly) replace it many times along the way. If we don't save it, we lose the reference to the original stdin and can no longer read user input! In code, I usually do this with fcntl(2) and F_DUPFD_CLOEXEC (see man 2 fcntl) to ensure the descriptor is closed when a command is executed in a child process (it is generally bad practice to leave unused file descriptors open).

In addition, the shell process must wait(2) for the last process in the pipeline. If you think about it, it makes sense: pipes essentially synchronize the commands in the pipeline; the set of commands is complete only when the last command reads EOF from its pipe (that is, when all the data has flowed through the entire pipeline). If the shell did not wait for the last process, but instead waited for some process in the middle (or at the beginning) of the pipeline, it would return to the prompt too early and leave other commands still running in the background: not a smart move, as the user expects the shell to finish the current job before prompting again.

So... that is a lot of information, but it is very important that you understand it. With that said, here is the revised core code:

    int saved_stdin = fcntl(STDIN_FILENO, F_DUPFD_CLOEXEC, 0);
    if (saved_stdin < 0) {
        perror("Couldn't store stdin reference");
        break;
    }

    pid_t pid;
    int i;

    /* As long as there are at least two commands to process... */
    for (i = 0; i < n-1; i++) {
        /* We create a pipe to connect this command to the next command */
        int pipefds[2];
        if (pipe(pipefds) < 0) {
            perror("pipe(2) error");
            break;
        }
        /* Prepare execution on child process and make the parent read the
         * results from the pipe */
        if ((pid = fork()) < 0) {
            perror("fork(2) error");
            break;
        }
        if (pid > 0) {
            /* Parent needs to close the pipe's write end to make sure
             * we don't hang. Parent reads from the pipe's read end. */
            if (close(pipefds[1]) < 0) {
                perror("close(2) error");
                break;
            }
            if (dupPipe(pipefds, READ_END, STDIN_FILENO) < 0) {
                perror("dupPipe() error");
                break;
            }
        } else {
            int flags = 0;
            if (structVariables[i].outfile != NULL) {
                flags = 1; // Write
                if (redirect(structVariables[i].outfile, flags, STDOUT_FILENO) < 0) {
                    perror("redirect() error");
                    exit(EXIT_FAILURE);
                }
            }
            if (structVariables[i].infile != NULL) {
                flags = 2; // Read
                if (redirect(structVariables[i].infile, flags, STDIN_FILENO) < 0) {
                    perror("redirect() error");
                    exit(EXIT_FAILURE);
                }
            }
            /* Child writes to the pipe (which is read by the parent); the read
             * end doesn't have to be closed, but we close it for good practice */
            if (close(pipefds[0]) < 0) {
                perror("close(2) error");
                break;
            }
            if (dupPipe(pipefds, WRITE_END, STDOUT_FILENO) < 0) {
                perror("dupPipe() error");
                break;
            }
            if (execvp(structVariables[i].argv[0], structVariables[i].argv) < 0) {
                perror("execvp(3) error");
                exit(EXIT_FAILURE);
            }
        }
    }

    if (i != n-1) {
        /* Some error caused an early loop exit */
        break;
    }

    /* We don't need a pipe for the last command */
    if ((pid = fork()) < 0) {
        perror("fork(2) error on last command");
    }
    if (pid > 0) {
        /* Parent waits for the last command to execute */
        if (waitpid(pid, NULL, 0) < 0) {
            perror("waitpid(2) error");
        }
    } else {
        int flags = 0;
        /* Execute last command. This will read from the last pipe we set up */
        if (structVariables[i].outfile != NULL) {
            flags = 1; // Write
            if (redirect(structVariables[i].outfile, flags, STDOUT_FILENO) < 0) {
                perror("redirect() error");
                exit(EXIT_FAILURE);
            }
        }
        if (structVariables[i].infile != NULL) {
            flags = 2; // Read
            if (redirect(structVariables[i].infile, flags, STDIN_FILENO) < 0) {
                perror("redirect() error");
                exit(EXIT_FAILURE);
            }
        }
        if (execvp(structVariables[i].argv[0], structVariables[i].argv) < 0) {
            perror("execvp(3) error on last command");
            exit(EXIT_FAILURE);
        }
    }

    /* Finally, we need to restore the original stdin descriptor */
    if (dup2(saved_stdin, STDIN_FILENO) < 0) {
        perror("dup2(2) error when attempting to restore stdin");
        exit(EXIT_FAILURE);
    }
    if (close(saved_stdin) < 0) {
        perror("close(2) failed on saved_stdin");
    }

Some final notes on dupPipe() :

  • Both dup2(2) and close(2) can fail; you should probably check for this and act accordingly (for example, propagate the error up the call stack by returning -1).
  • Again, you should not blindly close the descriptor after duplicating it, because the source and destination descriptors may be the same.
  • You should verify that end is either READ_END or WRITE_END and return an error if it is not (returning destinfd no matter what can give the calling code a false sense of success).

Here is how I would improve it:

    int dupPipe(int pip[2], int end, int destinfd)
    {
        if (end != READ_END && end != WRITE_END)
            return -1;

        if (end == READ_END) {
            if (dup2(pip[0], destinfd) < 0)
                return -1;
            if (pip[0] != destinfd && close(pip[0]) < 0)
                return -1;
        } else if (end == WRITE_END) {
            if (dup2(pip[1], destinfd) < 0)
                return -1;
            if (pip[1] != destinfd && close(pip[1]) < 0)
                return -1;
        }
        return destinfd;
    }

Have fun with your shell!


execvp does not return unless an error occurs.

Consequently, the original program will (usually) not execute any code after the execvp() call.

The normal code sequence is:

    1) fork()
    2) if child, call execvp()
    3) if parent ....
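That sequence, sketched concretely in C (the helper name run_command is just for illustration):

```c
#include <sys/wait.h>
#include <unistd.h>

/* fork, exec in the child, wait in the parent. execvp() only returns on
 * error, so anything after it is the error path.
 * Returns the child's exit status, or -1 on error. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {              /* child: replace the image with the command */
        execvp(argv[0], argv);
        _exit(127);              /* reached only if execvp failed */
    }
    int status;                  /* parent: wait for the child to finish */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```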

You are using open() in redirect() incorrectly when flags == 1:

    if (flags == 1) {
        if (access(filename, F_OK) != -1) { // If file already exists
            errno = EEXIST;
            printf("Error: %s\n", strerror(errno));
            return -1;
        }
        newfd = open(filename, O_CREAT, O_WRONLY);
        if (newfd == -1) {
            perror("Open for write failed");
            return -1;
        }
    }

In newfd = open(filename, O_CREAT, O_WRONLY);, O_WRONLY is (incorrectly) passed as the mode argument of open() rather than as part of the flags. It should be:

    if (flags == 1) {
        if (access(filename, F_OK) != -1) { // If file already exists
            errno = EEXIST;
            printf("Error: %s\n", strerror(errno));
            return -1;
        }
        newfd = open(filename, O_CREAT | O_WRONLY, mode); // whatever mode you want, but remember umask
        if (newfd == -1) {
            perror("Open for write failed");
            return -1;
        }
    }

In addition, the check for the file's prior existence is racy: another program may create the file after the access() call and before the open(). Use open(filename, O_CREAT | O_EXCL, mode) to atomically create and open the file.
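A minimal sketch of the atomic version (the wrapper name create_exclusive is just for illustration): the existence check and the creation happen inside a single open() call, so no other process can slip in between them.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Atomically create and open filename for writing.
 * Returns the new fd, or -1 (with errno == EEXIST if it already existed). */
int create_exclusive(const char *filename, mode_t mode)
{
    /* O_EXCL makes open() itself fail when the file exists; no separate
     * access() check is needed, so there is no TOCTTOU window. */
    return open(filename, O_WRONLY | O_CREAT | O_EXCL, mode);
}
```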

