Edit: Sorry for the error. It is a pity that I observed that using the variable my inside countit(x, q{}) is a big mistake. This line is evaluated inside the Benchmark module, and @str is empty there. This solution is not as fast as I imagined. See Correction below. I'm sorry again.
Perl can be fast:
use strict; use warnings; package LCP; sub LCP { return '' unless @_; return $_[0] if @_ == 1; my $i = 0; my $first = shift; my $min_length = length($first); foreach (@_) { $min_length = length($_) if length($_) < $min_length; } INDEX: foreach my $ch ( split //, $first ) { last INDEX unless $i < $min_length; foreach my $string (@_) { last INDEX if substr($string, $i, 1) ne $ch; } } continue { $i++ } return substr $first, 0, $i; }
Test suite:
#!/usr/bin/env perl use strict; use warnings; Test::LCP->runtests; package Test::LCP; use base 'Test::Class'; use Test::More; use Benchmark qw(:all :hireswallclock); sub test_use : Test(startup => 1) { use_ok('LCP'); } sub test_lcp : Test(6) { is( LCP::LCP(), '', 'Without parameters' ); is( LCP::LCP('abc'), 'abc', 'One parameter' ); is( LCP::LCP( 'abc', 'xyz' ), '', 'None of common prefix' ); is( LCP::LCP( 'abcdefgh', ('abcdefgh') x 15, 'abcdxyz' ), 'abcd', 'Some common prefix' ); my @str = map { chomp; $_ } <DATA>; is( LCP::LCP(@str), 'file:///home/gms8994/Music/', 'Test data prefix' ); is( LCP::LCP2(@str), 'file:///home/gms8994/Music/', 'Test data prefix by LCP2' ); my $t = countit( 1, sub{LCP::LCP(@str)} ); diag("LCP: ${\($t->iters)} iterations took ${\(timestr($t))}"); $t = countit( 1, sub{LCP::LCP2(@str)} ); diag("LCP2: ${\($t->iters)} iterations took ${\(timestr($t))}"); } __DATA__ file:///home/gms8994/Music/tATu/ file:///home/gms8994/Music/nina%20sky/ file:///home/gms8994/Music/A%20Perfect%20Circle/
Test suite result:
1..7 ok 1 - use LCP; ok 2 - Without parameters ok 3 - One parameter ok 4 - None of common prefix ok 5 - Some common prefix ok 6 - Test data prefix ok 7 - Test data prefix by LCP2
This means that a pure Perl solution using substr approximately 20% faster than the Roy solution in your test case, and one prefix detection takes about 50us. There is no need to use XS if your data or performance expectations are greater.
Hynek -Pichi- Vychodil Feb 01 '09 at 9:56 2009-02-01 09:56
source share