Search a list of strings for multiple keywords

I have two Python lists: one is a list of keywords and the other is a list of file names. I need to process the file names based on the keywords: match each file name against the keywords, then perform an operation that depends on which keyword matched.

Here is what I have:

    keywords = ["_CMD_","_COMM_","_RETRANSMIT_"]
    file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt']

    for f_name in file_list:
        for keyword in keywords:
            if keyword in f_name:
                #perform operation based on what keyword is matched
                pass
            else:
                #print an error
                pass

The problem is that, because the inner loop goes through every keyword, it prints an error for each keyword that does not match until it reaches the one that is in the file name, and then performs the operation. I want it to print the error only if none of the keywords are found in the file name.

I tried using any(), but it seems to stop checking files after it finds a match. For example, using

    for keyword in keywords:
        if any(keyword in f_name for f_name in file_list):
            print f_name
            print keyword

Returns

    2B_CMD_2015.txt
    _CMD_
    2B_CMD_2015.txt
    _RETRANSMIT_

This is not the result I expect.

Edit: I also tried using a regex, but I am not sure I am doing this correctly:

    import re

    for keyword in keywords:
        for item in wordlist:
            if re.search(keyword,item) is not None:
                print keyword
                print item
            else:
                print "nope"

Returns:

    nope
    nope
    nope
    _CMD_
    2B_CMD_2015.txt
    _CMD_
    2C_CMD_2015.txt
    nope
    nope
    nope
    _RETRANSMIT_
    _RETRANSMIT_2015.txt
    nope
    nope
    nope

Can anyone help me with this? I feel it shouldn't be that hard.

+4
5 answers

Use for-else instead of if-else:

    for f_name in file_list:
        for keyword in keywords:
            if keyword in f_name:
                print "Found keyword %s in name %s"%(keyword, f_name)
                break
        else:
            print "Found no keyword"

Pay attention to the level of indentation: the else block belongs to the for, not to the if. Also note that the if body must end with break, otherwise the else block of the for loop runs even after a match.
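
To make the break/else interaction easier to see in isolation, here is a minimal sketch that wraps the same loop in a helper. The function name first_keyword is made up for illustration, and the snippet uses Python 3 print() syntax rather than the Python 2 print statement used above.

```python
def first_keyword(f_name, keywords):
    """Return the first keyword found in f_name, or None if none match.

    first_keyword is a hypothetical helper, not part of the question's code.
    """
    for keyword in keywords:
        if keyword in f_name:
            found = keyword
            break  # leaving the loop via break skips the else block
    else:
        # Reached only when the loop ran to completion without a break.
        found = None
    return found


keywords = ["_CMD_", "_COMM_", "_RETRANSMIT_"]
file_list = ['2B_CMD_2015.txt', '2C_CMD_2015.txt', 'RETRANSMIT_2015.txt']

for f_name in file_list:
    keyword = first_keyword(f_name, keywords)
    if keyword is None:
        print("Found no keyword in name %s" % f_name)
    else:
        print("Found keyword %s in name %s" % (keyword, f_name))
```

With the lists from the question, the last file reports no keyword, because 'RETRANSMIT_2015.txt' has no leading underscore and so does not contain "_RETRANSMIT_".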

+3

for-else can help you. The else clause runs only if the inner for loop finishes without hitting break, that is, when no keyword matched. Note that this means only the first matching keyword is handled; the loop does not look for further matches.

    keywords = ["_CMD_","_COMM_","_RETRANSMIT_"]
    file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt']

    for f_name in file_list:
        for keyword in keywords:
            if keyword in f_name:
                #perform operation based on what keyword is matched
                break
        else:
            #print an error
            pass
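
If a file name may contain more than one keyword and every match should be handled, one possible variation is to collect all matches first and treat an empty result as the error case. This is only a sketch in Python 3 syntax; matching_keywords and results are names invented for the example.

```python
keywords = ["_CMD_", "_COMM_", "_RETRANSMIT_"]
file_list = ['2B_CMD_2015.txt', '2C_CMD_2015.txt', 'RETRANSMIT_2015.txt']


def matching_keywords(f_name, keywords):
    """Return every keyword contained in f_name (hypothetical helper)."""
    return [keyword for keyword in keywords if keyword in f_name]


# Map each file name to the list of keywords it contains.
results = {f_name: matching_keywords(f_name, keywords) for f_name in file_list}

for f_name in file_list:
    matches = results[f_name]
    if matches:
        for keyword in matches:
            # perform the operation for each matched keyword here
            print("Found keyword %s in name %s" % (keyword, f_name))
    else:
        print("Found no keyword in name %s" % f_name)
```

The trade-off is that every keyword is checked for every file, whereas the break version stops at the first hit.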
+1

The basic way to do this is to set a flag:

    for f_name in file_list:
        flag = False
        for keyword in keywords:
            if keyword in f_name:
                flag = True
                #perform operation based on what keyword is matched
        if not flag:
            #print an error
            pass
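
As a possible alternative to the explicit flag, the built-in next() with a default value can return the first matching keyword or None. Unlike the flag loop above, which visits every keyword, this sketch stops at the first match. It uses Python 3 syntax, and the matched and unmatched lists are only there to make the outcome visible.

```python
keywords = ["_CMD_", "_COMM_", "_RETRANSMIT_"]
file_list = ['2B_CMD_2015.txt', '2C_CMD_2015.txt', 'RETRANSMIT_2015.txt']

matched = []    # (file name, keyword) pairs, collected for illustration
unmatched = []  # file names in which no keyword was found

for f_name in file_list:
    # next() returns the first keyword contained in f_name,
    # or the default None when the generator yields nothing.
    keyword = next((kw for kw in keywords if kw in f_name), None)
    if keyword is None:
        unmatched.append(f_name)
        print("Found no keyword in name %s" % f_name)
    else:
        matched.append((f_name, keyword))
        print("Found keyword %s in name %s" % (keyword, f_name))
```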
+1

Filter the list using any, and then use it:

    keywords = ["_CMD_","_COMM_","_RETRANSMIT_"]
    file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt']

    filtered = [file_name for file_name in file_list if any(keyword in file_name for keyword in keywords)]

    if filtered:
        # do stuff with 'filtered'
        print("processing files...")
    else:
        print("error")

Example:

    >>> keywords = ["_CMD_","_COMM_","_RETRANSMIT_"]
    >>> file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt']
    >>> filtered = [file_name for file_name in file_list if any(keyword in file_name for keyword in keywords)
    ... ]
    >>> filtered
    ['2B_CMD_2015.txt', '2C_CMD_2015.txt']
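
The filter above reports an error only when no file at all matches. If an error is wanted for each file that matches no keyword, as the question describes, one way to sketch it is to split the list into two parts with the same any() test. The names matched and unmatched are assumptions made for this example.

```python
keywords = ["_CMD_", "_COMM_", "_RETRANSMIT_"]
file_list = ['2B_CMD_2015.txt', '2C_CMD_2015.txt', 'RETRANSMIT_2015.txt']

matched = []    # file names containing at least one keyword
unmatched = []  # file names containing none of the keywords

for file_name in file_list:
    # any() stops at the first keyword found in this one file name.
    if any(keyword in file_name for keyword in keywords):
        matched.append(file_name)
    else:
        unmatched.append(file_name)

for file_name in unmatched:
    print("error: no keyword in %s" % file_name)
```

Note that any() only says whether a keyword matched, not which one, so this suits the case where the same operation applies to every matched file.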
0

I suggest making keywords a list of tuples that associates each keyword with a handler. You can use the for..else construct to process files that do not match. Consider, for example:

    def handleCmd(fn):
        print "handleCmd: " + fn

    def handleComm(fn):
        print "handleComm: " + fn

    def handleRetransmit(fn):
        print "handleRetransmit: " + fn

    keywords = [
        ( "_CMD_", handleCmd ),
        ( "_COMM_", handleComm ),
        ( "RETRANSMIT_", handleRetransmit ),
    ]

    file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt','bogus.t>

    for fn in file_list:
        for kw, handle in keywords:
            if kw in fn:
                handle(fn)
                break
        else:
            print "OH NOE"

This will print:

    handleCmd: 2B_CMD_2015.txt
    handleCmd: 2C_CMD_2015.txt
    handleRetransmit: RETRANSMIT_2015.txt
    OH NOE
0
