How to think in Python after working in C ++?

I am new to Python and trying to learn it by replicating the following C ++ function in python

// determines which words in a vector consist of the same letters // outputs the words with the same letters on the same line void equivalentWords(vector <string> words, ofstream & outFile) { outFile << "Equivalent words\n"; // checkedWord is parallel to the words vector. It is // used to make sure each word is only displayed once. vector <bool> checkedWord (words.size(), false); for(int i = 0; i < words.size(); i++) { if (!checkedWord[i]){ outFile << " "; for(int j = i; j < words.size(); j++){ if(equivalentWords(words[i], words[j], outFile)) { outFile << words[j] << " "; checkedWord[j] = true; } } outFile << "\n"; } } } 

In my python code (below), instead of the second vector, I have a list ("words") of line lists, a sorted list of characters in the previous line (since the lines are immutable), and bool (which says if another word is checked). However, I can’t figure out how to change the value when going through the list.

  for word, s_word, checked in words: if not checked: for word1, s_word1, checked1 in words: if s_word1 == s_word: checked1 = True # this doesn't work print word1, print "" 

Any help on this or thinking more about Python.

+7
c ++ python
source share
7 answers

Keeping things simple is O (N) complexity and should be sufficient if you don't have GB data words. Note that set() and dict() are basically a hashed index (free and inline!).

 index = {} for word, s_word in words: index[s_word] = index.get(s_word, []) + [word] for similar_words in index.values(): print ' '.join(similar_words) 

I don’t know what you are using it for, but you might be wondering that in python 2.7 a Counter was introduced into the collections module.

If you really want to save your algorithm and update the logical list (which you are not doing, because this algorithm will do inefficient double loops), you would do it like this:

 checked = [False] * len(words) for i, (word, word_s) in enumerate(words): if checked[i]: continue for j, (other, other_s) in enumerate(words[i:]): if word_s == other_s: print other, checked[i+j] = True print 
+5
source share

I usually like calling catchmeifyoutry, but I personally pulled it a bit further since

 for word in set(words): print word 

Change My answer is a shorter but functionally equivalent form of the original, pre-edited catchmeifyoutry answer.

+3
source share

I think the word you are looking for is Pythonic, here the pythonic code sample for what you associate defines words that are equivalent, where equivalence is determined by the same set of letters

 import collections def print_equivalent_words(words): eq_words = defaultdict(list) for word in words: eq_words["".join(sorted(set(word)))].append(word) for k,v in eq_words.items(): print(v) 
+3
source share

This is not the best algorithm to solve this problem (this is O (N ^ 2), not O (N)), but here is the pythonic version. The method I used is to replace your array of bits with a set containing the words that you have already seen.

 checked = set() for i, word in enumerate(words): if word in checked: continue to_output = [word] for word2 in words[i + 1:]: if equivalentWords(word, word2): to_output.append(word2) checked.add(word2) print ' '.join(to_output) 
+1
source share

Make words list of objects:

 class Word(object): def __init__(self, word, s_word, checked=False): self.word = word self.s_word = s_word self.checked = checked .... for word1 in words: if word1.s_word == word.s_word: word1.checked = True print word1.word print 
0
source share

Based on the comment:

 // determines which words in a vector consist of the same letters // outputs the words with the same letters on the same line 

I'm not quite sure if the source code works, and even if it is, I cannot say that I really like it. First of all, based on nesting cycles, it looks like complexity is O (N 2 ). Secondly, I cannot understand what he is doing well enough to be sure that he is really doing what is stated above (he uses the three-parameter equivalentWords overload, which seems to be absent, which complicates its statement).

Some of the Python solutions are much shorter and simpler - to the point that I'm sure they just don't work. The pair seems to be simply printing out unique words that (at least as I interpret it) are not intended at all.

Here is a C ++ version that interprets the requirements:

 #include <string> #include <set> #include <vector> #include <algorithm> #include <iostream> #include <map> std::string sort_word(std::string word) { std::sort(word.begin(), word.end()); return word; } namespace std { std::ostream & operator<<(std::ostream &os, std::pair<std::string, std::set<std::string> >const &words) { std::copy(words.second.begin(), words.second.end(), std::ostream_iterator<std::string>(os, "\t")); return os; } } void equivalentWords(std::vector<std::string> const &words, std::ostream &os) { typedef std::map<std::string, std::set<std::string> > word_list_t; word_list_t word_list; for (int i=0; i<words.size(); i++) word_list[sort_word(words[i])].insert(words[i]); std::copy(word_list.begin(), word_list.end(), std::ostream_iterator<word_list_t::value_type>(os, "\n")); } int main() { std::vector<std::string> input; std::copy(std::istream_iterator<std::string>(std::cin), std::istream_iterator<std::string>(), std::back_inserter(input)); equivalentWords(input, std::cout); return 0; } 

I think using this as a starting point for a version of Python is more likely to produce clean working results.

0
source share

I would not say that it is pythonic, but I am very proud of it.

 import itertools for _, to_output in itertools.groupby(sorted(words, key=sorted), sorted): print ' '.join(to_output) 
0
source share

All Articles