How to count the words in a string in Ruby

I want to do something like this:

 def get_count(string)
   string.split(' ').count
 end

I think there may be a better way; String may have a built-in method for this.

+4
9 answers

count works here, but it is the more general method that can also take an argument or a block; for the plain number of elements you probably want length.

 def get_count(string)
   string.split(' ').length
 end

Edit: if your string is really long, building an array from it with any kind of split needs extra memory, so here is a way that avoids the array. It assumes words are separated by exactly one space:

 def get_count(string)
   # Start at 1 and add one for every space character.
   string.each_char.inject(1) { |m, c| c == ' ' ? m + 1 : m }
 end
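
If the input may contain runs of spaces, tabs, or newlines, counting single spaces over-counts. A minimal sketch of an array-free alternative follows, assuming a "word" is any run of non-whitespace characters; count_words_streaming is a hypothetical name, not something from the answer above.

```ruby
# Hypothetical helper: counts runs of non-whitespace without building an array.
def count_words_streaming(string)
  count = 0
  # Given a block, String#scan yields each match instead of collecting them.
  string.scan(/\S+/) { count += 1 }
  count
end

puts count_words_streaming("this  sentence\thas five words") # => 5
```

It returns 0 for an empty or all-whitespace string, unlike the space-counting version, which starts at 1.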
+4

If words are separated only by single spaces, just count the spaces.

 puts "this sentence has five words".count(' ')+1 # => 5 

If words may be separated by several spaces, line endings, tabs, a comma followed by a space, and so on, you can scan for word boundaries instead:

 puts "this, is.\tfour words".scan(/\b/).size / 2 # => 4
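
To make the boundary trick easier to reason about, here is a small sketch. It relies on every run of word characters contributing exactly two \b matches (one at its start, one at its end), which is why the total is halved. count_by_boundaries is a hypothetical helper name.

```ruby
# Hypothetical helper: each word yields a start and an end boundary,
# so the number of words is half the number of \b matches.
def count_by_boundaries(string)
  string.scan(/\b/).size / 2
end

puts count_by_boundaries("this, is.\tfour words") # => 4
puts count_by_boundaries("don't")                 # => 2 (the apostrophe splits it)
```

Note that an apostrophe is not a word character, so contractions count as two words with this approach.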
+3

I know this is an old question, but this may help someone who stumbles on it. Word counting is a difficult problem. What is a word? Are numbers and special characters considered words? Etc ...

I wrote words_counted for this purpose. It is a very flexible, customizable string analyzer. You can ask it to analyze any string to count words and word occurrences, and to exclude words or characters using regular expressions, strings and arrays.

 require "words_counted"

 counter = WordsCounted::Counter.new("Hello World!", exclude: "World")
 counter.word_count #=> 1
 counter.words #=> ["Hello"]

Etc ...

The documentation and full source are on GitHub.

+1

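Using a regex also copes with runs of several spaces (strip first, because leading whitespace would otherwise produce an empty first element):

 sentence.strip.split(/\s+/).size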
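
To see why leading whitespace matters when splitting on a whitespace regex, this short sketch compares it with split called without arguments, which ignores leading whitespace and treats runs of whitespace as one separator. The variable names are only for illustration.

```ruby
padded = "  two  words "

# Splitting on a whitespace regex keeps an empty string for the leading gap.
with_regex = padded.split(/\s+/)
p with_regex        # => ["", "two", "words"]

# Without arguments, split skips leading whitespace entirely.
without_args = padded.split
p without_args      # => ["two", "words"]

p padded.strip.split(/\s+/).size # => 2
```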
0

String has no built-in method that does exactly what you want. You can define a method in your own class, or extend the String class itself:

 def word_count(string)
   return 0 if string.empty?
   string.split.size
 end
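
For the second option, extending String itself, a minimal sketch could look like this. The method name word_count is an assumption chosen for illustration and is not part of Ruby's standard String API; reopening a core class affects the whole program, so treat it as a convenience rather than a recommendation.

```ruby
class String
  # Hypothetical method: counts whitespace-separated words.
  # split without arguments already returns [] for an empty string,
  # so no special case is needed.
  def word_count
    split.size
  end
end

puts "how many words here".word_count # => 4
puts "".word_count                    # => 0
```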
0

A regex that splits on any non-word character:

 string.split(/\W+/).size 

... although it counts a word containing an apostrophe as two words, so depending on how small a margin of error you need, you may want to write your own regular expression.
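
As one possible custom expression, the sketch below scans for runs of word characters and apostrophes instead of splitting on non-word characters. The character class is an assumption about what should count as part of a word; adjust it to your own definition.

```ruby
text = "don't stop believing"

# Splitting on non-word characters breaks the contraction in two.
naive = text.split(/\W+/).size
puts naive            # => 4

# Allowing apostrophes inside a word keeps "don't" together.
apostrophe_aware = text.scan(/[\w']+/).size
puts apostrophe_aware # => 3
```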

0

I recently discovered that String#count is more than an order of magnitude faster than splitting the string.

Unfortunately, String#count only accepts strings (sets of characters), not a regular expression. In addition, it counts two adjacent spaces as two separators rather than one, and you have to handle the other whitespace characters yourself.
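
A rough sketch of working within those limits: String#count treats its argument as a set of characters, so several whitespace characters can be listed at once, and String#squeeze can collapse repeated spaces before counting. This still assumes no leading or trailing whitespace; the sample strings are made up for illustration.

```ruby
text = "one two\tthree\nfour"

# Count any of space, tab, or newline, then add one for the last word.
mixed = text.count(" \t\n") + 1
puts mixed    # => 4

doubled = "one  two"

# Adjacent spaces are counted individually, which over-counts.
overcount = doubled.count(" ") + 1
puts overcount # => 3

# Collapsing runs of spaces first gives the expected result.
squeezed = doubled.squeeze(" ").count(" ") + 1
puts squeezed  # => 2
```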

0
 p " some word\nother\tword.word|word".strip.split(/\s+/).size #=> 4 
0

I would rather scan for runs of word characters:

 "Lorem Lorem Lorem".scan(/\w+/).size # => 3

If you need to match rock-and-roll as one word, you can do the following:

 "Lorem Lorem Lorem rock-and-roll".scan(/[\w-]+/).size # => 4
0