How to capitalize some words in a text file?

I have a text file that contains ordinary sentences. I was in a hurry when typing it, so I only capitalized the first letter of the first word of each sentence (as English grammar requires).

But now I think it would look better if every word had its first letter capitalized. Something like:

Every Word of This Sentence is Capitalized.

Note that in the sentence above, "of" and "is" are not capitalized. I want to skip words that are three letters long or shorter.

What should I do?

+4
5 answers

You should split the text into words and capitalize only those that are longer than three letters.

words.txt :

    each word of this sentence is capitalized
    some more words
    an other line

-

    import string

    with open('words.txt') as file:
        # List to store the capitalised lines.
        lines = []
        for line in file:
            # Split words by spaces.
            words = line.split(' ')
            for i, word in enumerate(words):
                if len(word.strip(string.punctuation + string.whitespace)) > 3:
                    # Capitalise and replace words longer than 3 (without punctuation).
                    words[i] = word.capitalize()
            # Join the capitalised words with spaces.
            lines.append(' '.join(words))
        # Join the capitalised lines.
        capitalised = ''.join(lines)

    # Optionally, write the capitalised words back to the file.
    with open('words.txt', 'w') as file:
        file.write(capitalised)
+3
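    # text_file is assumed to be the question's file, opened for reading.
    with open('words.txt') as text_file:
        for line in text_file:
            print(' '.join(word.title() if len(word) > 3 else word
                           for word in line.split()))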

Edit: To ignore punctuation when counting letters, replace len with the following function:

    def letterlen(s):
        return sum(c.isalpha() for c in s)
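As a rough sketch of how letterlen could be plugged into the one-liner above: the helper name capitalize_long_words and its min_len parameter are my own additions for illustration, not part of the answer.

```python
def letterlen(s):
    # Count only alphabetic characters, so punctuation is ignored.
    return sum(c.isalpha() for c in s)


def capitalize_long_words(line, min_len=3):
    # Hypothetical helper: capitalize words with more than min_len letters.
    return ' '.join(
        word.title() if letterlen(word) > min_len else word
        for word in line.split()
    )


print(capitalize_long_words("each word of this sentence is capitalized."))
# prints: Each Word of This Sentence is Capitalized.
```

Be aware that str.title() also capitalizes the letter after an apostrophe ("don't" becomes "Don'T"), so str.capitalize() may be the safer choice if the text contains contractions.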
+5

Take a look at NLTK.

Tokenize the text and capitalize each word. Words such as “if” and “out” are called “stop words.” If length is your only criterion, Stephen's answer is a good way to do this. If you want to look up stop words, there is a similar question on SO: How to remove stop words using nltk or python.

+4

What you really want is called a stop words list. In the absence of this list, you can create it yourself and do the following:

    skipWords = set("of is".split())
    punctuation = '.,<>{}][()\'"/\\ ?!@ #$%^&*'  # and any other punctuation that you want to strip out

    answer = ""
    with open('filepath') as f:
        for line in f:
            for word in line.split():
                for p in punctuation:
                    # You end up losing the punctuation in the output,
                    # but this is easy to fix if you really care about it.
                    word = word.replace(p, '')
                if word not in skipWords:
                    answer += word.title() + " "
                else:
                    answer += word + " "

    print(answer)  # or you can write it to a file continuously
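If losing the punctuation matters, one possible fix is to leave the text intact and let a regular expression pick out only the words. This is a minimal sketch, assuming the same "of"/"is" skip list. The names SKIP_WORDS and capitalize_except are hypothetical.

```python
import re

SKIP_WORDS = {"of", "is"}  # assumed stop-word list; extend as needed


def capitalize_except(text, skip_words=SKIP_WORDS):
    # Rewrite each word in place, so punctuation and spacing survive.
    def fix(match):
        word = match.group(0)
        if word.lower() in skip_words:
            return word
        # Uppercase only the first letter and keep the rest unchanged.
        return word[0].upper() + word[1:]

    # A word is a run of letters, optionally with internal apostrophes.
    return re.sub(r"[A-Za-z]+(?:'[A-Za-z]+)*", fix, text)


print(capitalize_except("every word of this sentence is capitalized, (mostly)!"))
# prints: Every Word of This Sentence is Capitalized, (Mostly)!
```

Because the substitution only touches the matched words, commas, brackets, and line breaks come out exactly as they went in.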
+1

You can add all the words from the text file to a list:

    words = []
    with open('textdocument.txt') as f:
        for line in f:
            # Split each line into words and collect them.
            words.extend(line.split())

Once you have all the words in the list, run a for loop that checks the length of each word and, if it is more than three, capitalizes its first letter:

    new_list = []
    for item in words:
        if len(item) > 3:
            # str.title() returns a new string, so store the result.
            new_list.append(item.title())
        else:
            # Words of three letters or fewer are added unchanged.
            new_list.append(item)

See whether that works for you.

0
