This is not hard to do in C (or Perl or Python) using any of the many md5 implementations. At its heart, md5 is a hash function that goes from a character vector to a character vector.
So you could just write an external program that reads your 3 million lines and feeds them one by one to the md5 implementation of your choice. That way you have one program startup rather than 3 million, and that alone will save you time.
FWIW, in one project I used the md5 implementation (in C) by Christophe Devine. OpenSSL has one too, and I'm sure CPAN has a number of them for Perl as well.
Edit: Well, I could not resist. The md5 implementation I mentioned is, for example, inside this little tarball. Take the md5.c file and replace the (#ifdef'ed out) main() at the bottom with this
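    int main( int argc, char *argv[] )
    {
        FILE *f;
        int j;
        md5_context ctx;
        unsigned char buf[1000];
        unsigned char md5sum[16];

        if( argc < 2 )
        {
            fprintf( stderr, "usage: %s <file>\n", argv[0] );
            return( 1 );
        }

        if( ! ( f = fopen( argv[1], "rb" ) ) )
        {
            perror( "fopen" );
            return( 1 );
        }

        /* one whitespace-delimited token per pass; %999s keeps it inside buf */
        while( fscanf( f, "%999s", (char *) buf ) == 1 )
        {
            md5_starts( &ctx );
            md5_update( &ctx, buf, (uint32) strlen( (char *) buf ) );
            md5_finish( &ctx, md5sum );

            /* print the 16-byte digest as 32 hex digits */
            for( j = 0; j < 16; j++ )
            {
                printf( "%02x", md5sum[j] );
            }

            printf( " <- %s\n", buf );
        }

        fclose( f );
        return( 0 );
    }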
Then build a simple standalone program from it, e.g. with
/tmp$ gcc -Wall -O3 -o simple_md5 simple_md5.c
and then you get the following:
# first, generate 300,000 numbers in a file (using 'little r', an R variant)
/tmp$ r -e'for (i in 1:300000) cat(i,"\n")' > foo.txt
So, about a second for 300,000 (short) lines.
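One caveat about the main() above: fscanf with %s reads whitespace-delimited tokens, not lines, so a line containing spaces would be hashed word by word. That makes no difference for the number file used here, but for general text a line-based reader is safer. A minimal sketch, where `strip_eol` and `read_line` are hypothetical helper names and not part of the md5 code:

```c
#include <stdio.h>
#include <string.h>

/* Cut the string at the first CR or LF and return the new length. */
size_t strip_eol(char *line)
{
    size_t n = strcspn(line, "\r\n");
    line[n] = '\0';
    return n;
}

/* Read one line into buf (at most cap-1 bytes) without its line ending.
   Returns 1 if something was read, 0 at end of file.
   A line longer than cap-1 bytes comes back in several pieces. */
int read_line(FILE *f, char *buf, size_t cap)
{
    if (fgets(buf, (int)cap, f) == NULL)
        return 0;
    strip_eol(buf);
    return 1;
}
```

The loop in main() would then start with `while( read_line( f, (char *) buf, sizeof buf ) )` and the md5 calls inside it stay as they are.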