How to remove punctuation from an item in a list and save it as a separate item in a list?

I am trying to condense the elements of one list into another list by dropping repeated words. I need punctuation saved as separate elements in the list, because otherwise "you" and "you;" are saved as separate items.

For example, the source list is

['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you;', 'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'is', 'a', 'former', 'American', 'President.']

and the condensed list I get now is

['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you;', 'ask', 'you', 'country!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'former', 'American', 'President.']

but I want it to have punctuation as separate elements in the list.

My intended output:

['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you', ';', 'ask', '!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'former', 'American', 'President', '.']

2 answers

You can do the splitting with a regex.

import re
a = ['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you;', 'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'is', 'a', 'former', 'American', 'President.']
# Join the items into one string, then match either a word
# (letters, digits, underscore, apostrophe) or one punctuation mark.
result = re.findall(r"[\w']+|[.,!?;]", ' '.join(a))
print(result)

Output

['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you', ';', 'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country', '!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'is', 'a', 'former', 'American', 'President', '.']
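
The output above still contains repeated words, while the question asks for each item only once. A minimal sketch of one way to add that step, assuming Python 3.7+ (where `dict` keeps insertion order); the names `words`, `tokens`, and `unique_tokens` are only illustrative, and the list is a shortened version of the one in the question:

```python
import re

words = ['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you;',
         'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country!']

# Split every item into word tokens and single punctuation tokens.
tokens = re.findall(r"[\w']+|[.,!?;]", ' '.join(words))

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops repeats while preserving first-seen order.
unique_tokens = list(dict.fromkeys(tokens))
print(unique_tokens)
```

Note that the comparison is case-sensitive, so 'Ask' and 'ask' are both kept, as in the intended output.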



Here is a version without regex that also removes the duplicates.

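def separate(mylist):
    """Split punctuation off each item, then drop duplicate items."""
    newlist = []
    for e in mylist:
        word = ''
        marks = []
        for c in e:
            if c.isalnum():
                word += c          # letters and digits stay in the word
            else:
                marks.append(c)    # every other character becomes its own item
        if word:                   # skip items that had no letters or digits
            newlist.append(word)
        newlist.extend(marks)      # marks always go after the word
    # Keep only the first occurrence of each item, preserving order.
    noduplicates = []
    for i in newlist:
        if i not in noduplicates:
            noduplicates.append(i)
    return noduplicates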
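
The `if i not in noduplicates` check above scans the whole result list for every item, which is fine for a short quote but gets slow for long lists. A sketch of the same order-preserving step with a set for the membership test; `unique_in_order` is a hypothetical helper name, not part of the answer above:

```python
def unique_in_order(items):
    """Return items with later duplicates dropped, keeping first-seen order."""
    seen = set()      # fast membership test
    result = []       # keeps the original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


tokens = ['you', ';', 'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country', '!']
print(unique_in_order(tokens))
# ['you', ';', 'ask', 'what', 'can', 'do', 'for', 'your', 'country', '!']
```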

