How can this be done in a more Perl way?

I am new to Perl, and for one of my homework assignments I came up with this solution:

#wordcount.pl FILE 
    # 

    #if no filename is given, print help and exit 
    if (length($ARGV[0]) < 1) 
    { 
           print "Usage is : words.pl word filename\n"; 
           exit; 
    } 

   my $file = $ARGV[0];          #filename given in commandline 

   open(FILE, $file);            #open the mentioned filename 
   while(<FILE>)                 #continue reading until the file ends 
    { 
           chomp; 
           tr/A-Z/a-z/;          #convert all upper case words to lower case 
           tr/.,:;!?"(){}//d;            #remove some common punctuation symbols 
           #We are creating a hash with the word as the key.  
           #Each time a word is encountered, its hash is incremented by 1. 
           #If the count for a word is 1, it is a new distinct word. 
           #We keep track of the number of words parsed so far. 
           #We also keep track of the no. of words of a particular length.  

          foreach $wd (split) 
          { 
                $count{$wd}++; 
                if ($count{$wd} == 1) 
                 { 
                       $dcount++; 
                 } 
                $wcount++; 
                $lcount{length($wd)}++; 
          } 
   } 

   #To print the distinct words and their frequency,  
   #we iterate over the hash containing the words and their count. 
   print "\nThe words and their frequency in the text is:\n"; 
   foreach $w (sort keys%count) 
   { 
         print "$w : $count{$w}\n"; 
   } 

   #For the word length and frequency we use the word length hash 
   print "The word length and frequency in the given text is:\n"; 
   foreach $w (sort keys%lcount) 
   { 
         print "$w : $lcount{$w}\n"; 
   } 

   print "There are $wcount words in the file.\n"; 
   print "There are $dcount distinct words in the file.\n"; 

   $ttratio = ($dcount/$wcount)*100;       #Calculating the type-token ratio. 

   print "The type-token ratio of the file is $ttratio.\n"; 

I included comments to explain what the code is doing. What I actually need is to count the words in a given text file. The output of the above program looks like this:

The words and their frequency in the text is: 
1949 : 1
a : 1
adopt : 1
all : 2
among : 1
and : 8
assembly : 1
assuring : 1
belief : 1
citizens : 1
constituent : 1
constitute : 1
.
.
.
The word length and frequency in the given text is:
1 : 1
10 : 5
11 : 2
12 : 2
2 : 15
3 : 18
There are 85 words in the file. 
There are 61 distinct words in the file. 
The type-token ratio of the file is 71.7647058823529. 

With the help of Google I was able to find a solution for my homework, but I think there must be smaller and more concise code that uses the real power of Perl. Can someone give me a solution in Perl with far fewer lines of code?

+5
2 answers

Here are some suggestions:

  • Turn on use strict and use warnings in your Perl scripts.

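  • Your argument validation does not test what it should: (1) whether there is exactly 1 item in @ARGV, and (2) whether that item is a valid file name.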

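  • Although every rule has exceptions, it is generally good practice to assign the return value of <> to a named variable rather than relying on $_. This matters most when the code inside the loop might use one of the Perl constructs that also rely on $_ (for example, map, grep, or postfix for loops).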

    while (my $line = <>){
        ...
    }
    
  • Perl provides a built-in function (lc) to lowercase strings.

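  • You are doing unnecessary computation inside the line-reading loop. If you simply build up a tally of words, you will have all the information you need. Also note that most of Perl's control structures (for, while, if, etc.) have a one-line postfix form, as illustrated here.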

    while (my $line = <>){
        ...
        $words{$_} ++ for split /\s+/, $line;
    }
    
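  • You can then use the word "tallies" to compute the other information you need. For example, the number of distinct words is simply the number of keys in the hash, and the total number of words is the sum of the hash values.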

  • The distribution of word lengths can be computed like this:

    my %lengths;
    $lengths{length $_} += $words{$_} for keys %words;
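
Putting the suggestions above together, one possible version might look like the sketch below. It is only an illustration, not the one right answer. The helper names check_args, tally_words and summarize are made up for this example, the punctuation list is copied from the question, and the demo reads from an in-memory string instead of a file named on the command line. It also assumes that "valid file name" means an existing plain file, and it uses split ' ' (which skips leading whitespace) in place of split /\s+/.

```perl
use strict;
use warnings;

# Hypothetical helper: true only for exactly one argument that names
# an existing plain file (an assumed reading of "valid file name").
sub check_args {
    my @args = @_;
    return @args == 1 && -f $args[0];
}

# Hypothetical helper: tally lower-cased words read from a filehandle.
# Returns a reference to a hash of word => count.
sub tally_words {
    my ($fh) = @_;
    my %words;
    while (my $line = <$fh>) {
        $line = lc $line;                  # built-in lower-casing
        $line =~ tr/.,:;!?"(){}//d;        # same punctuation set as the question
        $words{$_}++ for split ' ', $line; # ' ' also discards the trailing newline
    }
    return \%words;
}

# Hypothetical helper: derive every other figure from the tallies alone.
sub summarize {
    my ($words) = @_;
    my $distinct = keys %$words;           # number of keys = distinct words
    my $total = 0;
    $total += $_ for values %$words;       # sum of values = total words
    my %lengths;
    $lengths{length $_} += $words->{$_} for keys %$words;
    return ($distinct, $total, \%lengths);
}

# Demo on an in-memory filehandle; a real script would open $ARGV[0]
# after check_args(@ARGV) succeeds.
my $text = "The cat and the dog.\nThe end!\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $words = tally_words($fh);
close $fh;

my ($distinct, $total, $lengths) = summarize($words);

print "$_ : $words->{$_}\n" for sort keys %$words;
print "$_ : $lengths->{$_}\n" for sort { $a <=> $b } keys %$lengths;
printf "There are %d words, %d distinct; type-token ratio %.2f\n",
    $total, $distinct, $total ? 100 * $distinct / $total : 0;
```

The word lengths are sorted numerically here with { $a <=> $b }. The question's plain sort keys compares them as strings, which is why 10 appears before 2 in its output.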
    
+9

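Counting with a hash, as you do, is fine. What could be more elegant is to use Perl's regular-expression matching with the /g modifier to walk over the words in a line. The pattern \w+ matches a word.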

while( <FILE> )
{
    while( /(\w+)/g )
    {
        my $wd = lc( $1 );
        ...
    }
}
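
As a rough, self-contained sketch of the same idea, the loop could be filled in as follows. The sub name count_words and the sample text are invented for this example, and it uses a lexical filehandle on an in-memory string rather than the bareword FILE.

```perl
use strict;
use warnings;

# Hypothetical helper: count words by matching \w+ repeatedly with /g.
# Returns a reference to a hash of lower-cased word => count.
sub count_words {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        # In scalar context, each /g match resumes where the last one ended.
        while ($line =~ /(\w+)/g) {
            $count{ lc $1 }++;
        }
    }
    return \%count;
}

# Demo on an in-memory filehandle instead of a real file.
my $text = "Hello, world! Hello (again).\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $count = count_words($fh);
close $fh;

print "$_ : $count->{$_}\n" for sort keys %$count;
```

With this approach no separate punctuation-stripping step is needed, because \w+ never matches punctuation. The trade-off is that \w also matches digits and underscores, and a word such as "don't" is counted as two words, "don" and "t".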
+1
