Self-built grep is much slower than the grep that ships with Linux

I am trying to understand why the grep I built myself is much slower than the one that comes with the system, and to find out which compiler options the system grep was built with.

OS version: CentOS version 5.3 (Final)

grep on the system:

   Version: grep (GNU grep) 2.5.1
   Size: 88896 bytes
   ldd output: 
  libpcre.so.0 => /lib64/libpcre.so.0 (0x0000003991800000)
  libc.so.6 => /lib64/libc.so.6 (0x0000003985a00000)
  /lib64/ld-linux-x86-64.so.2 (0x0000003984a00000)

grep built by me:

   Version: 2.5.1
   Size: 256437 bytes
   ldd output:
  libpcre.so.0 => /lib64/libpcre.so.0 (0x0000003991800000)
  libc.so.6 => /lib64/libc.so.6 (0x0000003985a00000)
  /lib64/ld-linux-x86-64.so.2 (0x0000003984a00000)

The system grep (330 ms) is much faster than the grep I built (22430 ms) when searching a large text file of list entries for a regular expression.

Below are the commands I used for timing.

 % time src/grep ".*asa.*" large_list.txt > /dev/null
 real 0m22.430s
 user 0m22.291s
 sys 0m0.080s

and

 % time bin/grep ".*asa.*" large_list.txt > /dev/null
 real 0m0.331s
 user 0m0.236s
 sys 0m0.081s

The system grep evidently uses some optimization options that make a huge difference in performance.

Can anybody tell me which options the system grep was built with?

Here is the compile command for one of the source files in my build:
gcc -DLIBDIR=\"/usr/local/lib\" -DHAVE_CONFIG_H -I. -I.. -I.. -I. -I../intl -g -O2 -MT xstrtol.o -MD -MP -MF .deps/xstrtol.Tpo -c -o xstrtol.o xstrtol.c

Output of ./configure:

 checking for a BSD-compatible install ... /usr/bin/install -c
 checking whether build environment is sane ... yes
 checking for a thread-safe mkdir -p ... /bin/mkdir -p
 checking for gawk ... gawk
 checking whether make sets $(MAKE) ... yes
 checking build system type ... x86_64-unknown-linux-gnu
 checking host system type ... x86_64-unknown-linux-gnu
 checking for gawk ... (cached) gawk
 checking for gcc ... gcc
 checking for C compiler default output file name ... a.out
 checking whether the C compiler works ... yes
 checking whether we are cross compiling ... no
 checking for suffix of executables ... 
 checking for suffix of object files ... o
 checking whether we are using the GNU C compiler ... yes
 checking whether gcc accepts -g ... yes
 checking for gcc option to accept ISO C89 ... none needed
 checking for style of include used by make ... GNU
 checking dependency style of gcc ... gcc3
 checking for a BSD-compatible install ... /usr/bin/install -c
 checking for ranlib ... ranlib
 checking for getconf ... getconf
 checking for CFLAGS value to request large file support ... 
 checking for LDFLAGS value to request large file support ... 
 checking for LIBS value to request large file support ... 
 checking for _FILE_OFFSET_BITS ... no
 checking for _LARGEFILE_SOURCE ... no
 checking for _LARGE_FILES ... no
 checking for function prototypes ... yes
 checking how to run the C preprocessor ... gcc -E
 checking for grep that handles long lines and -e ... /bin/grep
 checking for egrep ... /bin/grep -E
 checking for ANSI C header files ... yes
 checking for sys/types.h ... yes
 checking for sys/stat.h ... yes
 checking for stdlib.h ... yes
 checking for string.h ... yes
 checking for memory.h ... yes
 checking for strings.h ... yes
 checking for inttypes.h ... yes
 checking for stdint.h ... yes
 checking for unistd.h ... yes
 checking for string.h ... (cached) yes
 checking for size_t ... yes
 checking for ssize_t ... yes
 checking for an ANSI C-conforming const ... yes
 checking for inttypes.h ... yes
 checking for unsigned long long ... yes
 checking for ANSI C header files ... (cached) yes
 checking for string.h ... (cached) yes
 checking for stdlib.h ... (cached) yes
 checking sys/param.h usability ... yes
 checking sys/param.h presence ... yes
 checking for sys/param.h ... yes
 checking for memory.h ... (cached) yes
 checking for unistd.h ... (cached) yes
 checking libintl.h usability ... yes
 checking libintl.h presence ... yes
 checking for libintl.h ... yes
 checking wctype.h usability ... yes
 checking wctype.h presence ... yes
 checking for wctype.h ... yes
 checking wchar.h usability ... yes
 checking wchar.h presence ... yes
 checking for wchar.h ... yes
 checking for dirent.h that defines DIR ... yes
 checking for library containing opendir ... none required
 checking whether stat file-mode macros are broken ... no
 checking for working alloca.h ... yes
 checking for alloca ... yes
 checking whether closedir returns void ... no
 checking for stdlib.h ... (cached) yes
 checking for unistd.h ... (cached) yes
 checking for getpagesize ... yes
 checking for working mmap ... yes
 checking for btowc ... yes
 checking for isascii ... yes
 checking for iswctype ... yes
 checking for mbrlen ... yes
 checking for memmove ... yes
 checking for setmode ... no
 checking for strerror ... yes
 checking for wcrtomb ... yes
 checking for wcscoll ... yes
 checking for wctype ... yes
 checking whether mbrtowc and mbstate_t are properly declared ... yes
 checking for stdlib.h ... (cached) yes
 checking for mbstate_t ... yes
 checking for memchr ... yes
 checking for stpcpy ... yes
 checking for strtoul ... yes
 checking for atexit ... yes
 checking for fnmatch ... yes
 checking for stdlib.h ... (cached) yes
 checking whether <inttypes.h> defines strtoumax as a macro ... no
 checking for strtoumax ... yes
 checking whether strtoul is declared ... yes
 checking whether strtoull is declared ... yes
 checking for strerror in -lcposix ... no
 checking for inline ... inline
 checking for off_t ... yes
 checking whether we are using the GNU C Library 2.1 or newer ... yes
 checking argz.h usability ... yes
 checking argz.h presence ... yes
 checking for argz.h ... yes
 checking limits.h usability ... yes
 checking limits.h presence ... yes
 checking for limits.h ... yes
 checking locale.h usability ... yes
 checking locale.h presence ... yes
 checking for locale.h ... yes
 checking nl_types.h usability ... yes
 checking nl_types.h presence ... yes
 checking for nl_types.h ... yes
 checking malloc.h usability ... yes
 checking malloc.h presence ... yes
 checking for malloc.h ... yes
 checking stddef.h usability ... yes
 checking stddef.h presence ... yes
 checking for stddef.h ... yes
 checking for stdlib.h ... (cached) yes
 checking for string.h ... (cached) yes
 checking for unistd.h ... (cached) yes
 checking for sys/param.h ... (cached) yes
 checking for feof_unlocked ... yes
 checking for fgets_unlocked ... yes
 checking for getcwd ... yes
 checking for getegid ... yes
 checking for geteuid ... yes
 checking for getgid ... yes
 checking for getuid ... yes
 checking for mempcpy ... yes
 checking for munmap ... yes
 checking for putenv ... yes
 checking for setenv ... yes
 checking for setlocale ... yes
 checking for stpcpy ... (cached) yes
 checking for strchr ... yes
 checking for strcasecmp ... yes
 checking for strdup ... yes
 checking for strtoul ... (cached) yes
 checking for tsearch ... yes
 checking for __argz_count ... yes
 checking for __argz_stringify ... yes
 checking for __argz_next ... yes
 checking for iconv ... yes
 checking for iconv declaration ... 
          extern size_t iconv (iconv_t cd, char * * inbuf, size_t * inbytesleft, char * * outbuf, size_t * outbytesleft);
 checking for nl_langinfo and CODESET ... yes
 checking for LC_MESSAGES ... yes
 checking whether NLS is requested ... yes
 checking whether included gettext is requested ... no
 checking for libintl.h ... (cached) yes
 checking for GNU gettext in libc ... yes
 checking for dcgettext ... yes
 checking for msgfmt ... /usr/bin/msgfmt
 checking for gmsgfmt ... /usr/bin/msgfmt
 checking for xgettext ... /usr/bin/xgettext
 checking for bison ... bison
 checking version of bison ... 2.3, ok
 checking for catalogs to be installed ... af be bg ca cs da de el eo es et eu fi fr ga gl he hr hu id it ja ko ky lt nb nl pl pt pt_BR ro ru rw sk sl sr sv tr uk vi zh_TW
 checking for dos file convention ... no
 checking host system type ... (cached) x86_64-unknown-linux-gnu
 checking host system type ... (cached) x86_64-unknown-linux-gnu
 checking for DJGPP environment ... no
 checking for environ variable separator ... :
 checking for working re_compile_pattern ... yes
 checking for getopt_long ... yes
 configure: WARNING: Included lib/regex.c not used
 checking whether strerror_r is declared ... yes
 checking for strerror_r ... yes
 checking whether strerror_r returns char * ... no
 checking for strerror ... (cached) yes
 checking for strerror_r ... (cached) yes
 checking for vprintf ... yes
 checking for doprnt ... no
 checking for ANSI C header files ... (cached) yes
 checking for working malloc ... yes
 checking for working realloc ... yes
 checking for pcre_exec in -lpcre ... yes
 configure: creating ./config.status
 config.status: creating Makefile
 config.status: creating lib/Makefile
 config.status: creating lib/posix/Makefile
 config.status: creating src/Makefile
 config.status: creating tests/Makefile
 config.status: creating po/Makefile.in
 config.status: creating intl/Makefile
 config.status: WARNING: intl/Makefile.in seems to ignore the --datarootdir setting
 config.status: creating doc/Makefile
 config.status: creating m4/Makefile
 config.status: creating vms/Makefile
 config.status: creating bootstrap/Makefile
 config.status: creating config.h
 config.status: config.h is unchanged
 config.status: executing depfiles commands
 config.status: executing default-1 commands
 config.status: creating po/POTFILES
 config.status: creating po/Makefile
 config.status: executing stamp-h commands

Thanks,
Kumar

+7
gcc grep compiler-flags
5 answers

Why don't you just get the CentOS SRPM for the grep binary and compare its compilation options with yours? I would guess that is much more effective than having the whole StackOverflow community poke around blindly in the dark until someone hits on the answer.
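The SRPM route could look roughly like the sketch below. It assumes an RPM-based system with `rpm` installed and, for the commented-out download step, `yumdownloader` from the yum-utils package. The exact source package file name is not known here, hence the glob.

```shell
#!/bin/sh
# Sketch: find out which compiler flags the distribution uses for its packages.
# Assumes an RPM-based system; on anything else it only prints a notice.

if command -v rpm >/dev/null 2>&1; then
    # %{optflags} is the macro that spec files usually hand to the build as CFLAGS.
    optflags=$(rpm --eval '%{optflags}')
else
    optflags="(rpm not installed)"
fi
echo "distribution default CFLAGS: $optflags"

# To see the patches and configure options, fetch and unpack the source package
# (needs network access and yum-utils, so it is left commented out here):
#   yumdownloader --source grep
#   rpm -ivh grep-*.src.rpm
# Then read the %build section and the Patch lines of the installed grep.spec.
```

The Patch lines in the spec file are at least as interesting as the flags, since a distribution build can differ from the upstream tarball in code as well as in options.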

EDIT: Are you using a multibyte encoding? (Note: if you don't know what that means, the answer is probably "Yes", since UTF-8 has been the default on most Linux distributions for several years, and indeed Red Hat (and therefore CentOS) was the first to switch.)

In that case, GNU grep is dog slow. And this applies not only to GNU grep but to all GNU tools that do any kind of text processing. The FSF refuses to accept any patches that improve multibyte performance if those patches slow down fixed-width encodings. However, since any patch that improves performance for multibyte encodings must contain at least one if, it is practically impossible to write a patch that does not slow down fixed-width encodings by at least the overhead of that if. Thus, the UTF-8 performance of the GNU tools will continue to suck until the end of time.

In any case, most Linux distributions don't give a rat's ass about what the FSF thinks and patch GNU grep themselves. The Fedora Rawhide SRPM contains a patch called grep-2.5.3-egf-speedup.patch, which speeds up GNU grep on UTF-8 by several orders of magnitude. (Since this patch dates from 2005, I assume it is used on CentOS as well.) The patch is also used on Mac OS X, Debian, Ubuntu, ... The GNU grep as distributed by GNU is almost never actually used. Processing multibyte-encoded text will never be as fast as processing a fixed-width encoding, but it should at least be comparable, not 50x (or even 1500x, as some people report) slower.

There is also another patch called dfa-optional, which makes grep simply use the GNU libc regex engine instead of its own. That engine is not only much faster on UTF-8 but also has far fewer bugs.

So, try re-running your tests with export LC_ALL=POSIX. If that fixes your problem, you need to apply one of the two patches mentioned above.
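A minimal sketch of that check follows. The generated sample file and the `GREP` variable are placeholders invented for the example; point `GREP` at `src/grep` and swap in your own large_list.txt to test the real case.

```shell
#!/bin/sh
# Sketch: run the same search in the current locale and in the POSIX locale.
# GREP defaults to the grep on PATH; set GREP=src/grep to test your own build.
GREP=${GREP:-grep}

# Throwaway input: 20000 lines, every 10th one contains "asa".
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 20000; i++) print ((i % 10 == 0) ? "xx-asa-" i : "line-" i) }' > "$sample"

# -c prints the number of matching lines; both runs should agree.
count_locale=$("$GREP" -c '.*asa.*' "$sample")
count_posix=$(LC_ALL=POSIX "$GREP" -c '.*asa.*' "$sample")

echo "current locale: $count_locale matching lines"
echo "LC_ALL=POSIX:   $count_posix matching lines"

rm -f "$sample"
```

Put `time` in front of each grep call, as in the question, to compare the two runs. If your own build is only slow in the non-POSIX run, the multibyte code path is the likely culprit rather than the compiler flags.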

Additional information is also available in the following two RedHat reports:

The moral of the story: contrary to popular belief, Linux distributors do know what they are doing, at least sometimes. Don't second-guess them.

+10

You compiled with the -O2 flag. Why didn't you use the -O3 flag? The gcc documentation explains the available optimization options.

Using the Intel ICC compiler may also help improve performance, although that really depends on the application. In addition, it is not free.

Edit: I just saw the -g flag on your compile line. Remove it, since it includes debugging information, and that can cause a pretty serious performance hit.
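One way to try different flags is to set CFLAGS in the environment when running configure. The sketch below assumes it is run from the top of the unpacked grep source tree; the `cflags` and `built` variable names are only for illustration.

```shell
#!/bin/sh
# Sketch: rebuild with different compiler flags. Run from the top of the grep
# source tree; anywhere else it only prints a notice.
cflags="-O3"    # higher optimization level, no -g

if [ -x ./configure ]; then
    # configure picks up CFLAGS from the environment and writes it into the Makefiles.
    CFLAGS="$cflags" ./configure && make clean && make
    built=$?
else
    echo "no ./configure here; run this from the grep source directory"
    built=skipped
fi
```

Re-run the timing from the question against the new `src/grep` afterwards to see whether the flags make a measurable difference.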

+4

As others have mentioned, besides the -O options, it looks like you are building with debugging symbols (-g).

Debugging symbols usually increase the binary size and can reduce the performance of the binary. I would expect grep to be pretty stable, so you should not need debug symbols for it.

+1

What version of GCC are you using? IIRC, GCC 4 was a significant rewrite, and for some time afterwards some of the optimization code did not work well.

+1

With a performance gap this large, it is probably a difference in the algorithm or the code, not just a difference in compiler optimization level. What makes you suspect the compiler?

0