Sort a list of strings based on regular expression matching or something similar

I have a text file that looks something like this:

    random text random text, can be anything blabla %A blabla
    random text random text, can be anything blabla %D blabla
    random text random text, can be anything blabla blabla %F
    random text random text, can be anything blabla blabla
    random text random text, %C can be anything blabla blabla

When I read it in with readlines(), it becomes a list of lines. Now I want this list sorted by the letter after the %. So, after sorting, the above should look like this:

    random text random text, can be anything blabla %A blabla
    random text random text, %C can be anything blabla blabla
    random text random text, can be anything blabla %D blabla
    random text random text, can be anything blabla blabla %F
    random text random text, can be anything blabla blabla

Is there a good way to do this, or will I have to split each line into columns, move the letter into a specific column, and then sort with key=operator.itemgetter(col)?

thanks

4 answers
    In [1]: import re

    In [2]: def grp(pat, txt):
       ...:     r = re.search(pat, txt)
       ...:     # '&' sorts just after '%', so lines with no match end up last
       ...:     return r.group(0) if r else '&'
       ...:

    In [3]: y
    Out[3]:
    ['random text random text, can be anything blabla %A blabla',
     'random text random text, can be anything blabla %D blabla',
     'random text random text, can be anything blabla blabla %F',
     'random text random text, can be anything blabla blabla',
     'random text random text, %C can be anything blabla blabla']

    In [4]: y.sort(key=lambda l: grp(r"%\w", l))

    In [5]: y
    Out[5]:
    ['random text random text, can be anything blabla %A blabla',
     'random text random text, %C can be anything blabla blabla',
     'random text random text, can be anything blabla %D blabla',
     'random text random text, can be anything blabla blabla %F',
     'random text random text, can be anything blabla blabla']
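
Written as a plain script rather than an interactive session, the same idea might look like the sketch below. The helper name percent_key and the tuple key are assumptions of mine, not part of the answer: the tuple sorts unmatched lines last without relying on a sentinel character.

```python
import re

# Hypothetical helper name; not from the original answer.
def percent_key(line):
    """Sort key: lines with a %X marker first (by X), unmarked lines last."""
    m = re.search(r"%(\w)", line)
    # (False, letter) sorts before (True, ""), so unmatched lines go last.
    return (m is None, m.group(1) if m else "")

lines = [
    "random text random text, can be anything blabla %A blabla",
    "random text random text, can be anything blabla %D blabla",
    "random text random text, can be anything blabla blabla %F",
    "random text random text, can be anything blabla blabla",
    "random text random text, %C can be anything blabla blabla",
]

result = sorted(lines, key=percent_key)
for line in result:
    print(line)
```

Using sorted() leaves the original list untouched; call lines.sort(key=percent_key) instead to sort in place.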

How about this? Hope it helps.

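    def k(line):
        # Everything after the first '%' (empty string if there is none)
        v = line.partition("%")[2]
        # 'z' stands in for the maximum value, so unmarked lines sort last
        return v[0] if v else 'z'

    # Text mode, so each line is a str and partition("%") works on it
    with open('data.txt') as f:
        print(''.join(sorted(f, key=k)))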
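
The key function needs no explicit check for a missing % because str.partition always returns a three-element tuple. A small illustration, using made-up sample strings:

```python
# str.partition splits on the first occurrence of the separator and
# always returns (head, separator, tail).
with_marker = "blabla %D blabla".partition("%")
without_marker = "blabla blabla".partition("%")

print(with_marker)     # head, '%', and everything after the first '%'
print(without_marker)  # the whole string, then two empty strings

# The tail is empty when there is no '%', so the conditional falls back
# to the sentinel 'z' instead of raising IndexError on tail[0].
first_after = [t[2][0] if t[2] else "z" for t in (with_marker, without_marker)]
print(first_after)
```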

You can pass a custom key function to control how the strings are compared. With a lambda you can write it inline, for example:

    import re
    strings.sort(key=lambda s: re.sub(r".*%", "", s))

Calling re.sub(r".*%", "", s) removes everything up to and including the last percent sign on the line (the .* is greedy). So if the line contains a percent sign, the sort compares what comes after it; otherwise it compares the whole line.

Strictly speaking, this compares not just the letter following the percent sign but everything after it. If you want to use the letter, and only the letter, try this slightly longer line:

    strings.sort(key=lambda s: re.sub(r".*%(.).*", r"\1", s))
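
A line with more than one percent sign shows which part each pattern keeps. This is only a minimal sketch: the sample line is invented, and the non-greedy variant is an addition of mine rather than part of the answer.

```python
import re

line = "50% off blabla %C blabla"

# Greedy `.*` runs up to the last '%', so the key starts after it.
after_last = re.sub(r".*%", "", line)

# A non-greedy `.*?` anchored at the start stops at the first '%'.
after_first = re.sub(r"^.*?%", "", line, count=1)

# Letter-only key: keep just the character captured after the last '%'.
letter_only = re.sub(r".*%(.).*", r"\1", line)

print(repr(after_last))
print(repr(after_first))
print(repr(letter_only))
```

If the data can contain stray percent signs, it may be worth deciding up front whether the first or the last one should drive the sort.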

Here is a quick and dirty approach. Without knowing more about your sorting requirements, I don't know whether it meets your needs.

Suppose your list is stored in 'listoflines':

    listoflines.sort(key=lambda x: x[x.find('%'):])

Note that this sorts all lines without a % character by their final character, because find() returns -1 when nothing is found and x[-1:] is the last character of the line.
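
To make that caveat concrete, here is a small sketch with invented sample lines that keep their trailing newlines, as readlines() would return them. The function name from_percent is hypothetical; it simply wraps the same slice.

```python
lines = [
    "blabla %D blabla\n",
    "blabla blabla\n",
    "%A blabla\n",
]

def from_percent(x):
    # Same key as the one-liner: slice from the first '%' to the end.
    return x[x.find('%'):]

# str.find returns -1 when '%' is absent, so the slice becomes x[-1:],
# which is just the final character (here the newline).
keys = [from_percent(x) for x in lines]
print(keys)

# '\n' compares lower than '%', so the unmarked line sorts first here.
ordered = sorted(lines, key=from_percent)
print(ordered)
```

With input like this, lines without a marker end up at the top rather than at the bottom, which may or may not be what you want.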

