Make only certain parts of a string uppercase

For my project, I am parsing reddit comments and trying to convert them to uppercase. It works great with plain text, but with links it causes problems: the characters in the URL are uppercased too, which makes the link unusable. Links on reddit have a very simple syntax, namely [text](url). I would like the output to be [TEXT](url), keeping in mind that my comment is likely to contain other content as well.

I tried using regular expressions, but it doesn't work, although I think I'm pretty close:

import re

def make_upper(comment):
    return re.sub(r"\[([^ ]*)\]\(([^ ]*)\)", r"[\1]".upper() + r"(\2)", comment)

This regex is a bit messy (like all regexes;), but still readable to me, and I don't mind explaining it if necessary.

Input:

"blablabla (btw, blargh!) [text](url) blabla"

Desired output:

"BLABLABLA (BTW, BLARGH!) [TEXT](url) BLABLA"

1: Compile a regular expression that matches a markdown link and captures the link text and the URL as separate groups:

import re

MARKDOWN_LINK_REGEX = re.compile(r'\[(.*?)\]\((.*?)\)')
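
As a quick check of what this pattern captures, here is a small sketch. The sample strings are only illustrative; the point is that the non-greedy groups stop at the first closing bracket or parenthesis, so each link yields its own (text, url) pair.

```python
import re

MARKDOWN_LINK_REGEX = re.compile(r'\[(.*?)\]\((.*?)\)')

sample = 'there are [a few](Google.com) links [in](StackOverflow.com) this one'

# findall returns one (text, url) tuple per link, in order of appearance.
print(MARKDOWN_LINK_REGEX.findall(sample))
# [('a few', 'Google.com'), ('in', 'StackOverflow.com')]

# The earlier "(btw, blargh!)" is skipped because it does not follow a "[...]".
print(MARKDOWN_LINK_REGEX.findall('blablabla (btw, blargh!) [text](url) blabla'))
# [('text', 'url')]
```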

2: Uppercase the whole comment, then restore the original URL inside each link:

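def uppercase_comment(comment):
    # Uppercase everything first, then put each URL back as it was.
    result = comment.upper()
    for match in MARKDOWN_LINK_REGEX.finditer(comment):
        # The whole link as it now appears in result, e.g. "[A LINK](GOOGLE.COM)".
        upper_match = match.group(0).upper()
        # Swap the uppercased URL (group 2) for the original one.
        corrected_link = upper_match.replace(match.group(2).upper(),
                                             match.group(2))
        result = result.replace(upper_match, corrected_link)
    return result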

3: Try it out:

if __name__ == '__main__':
    print(uppercase_comment('this comment has zero links'))
    print(uppercase_comment('this comment has [a link at the end](Google.com)'))
    print(uppercase_comment('[there is a link](Google.com) at the beginning'))
    print(uppercase_comment('there is [a link](Google.com) in the middle'))
    print(uppercase_comment('there are [a few](Google.com) links [in](StackOverflow.com) this one'))
    print(uppercase_comment("doesn't matter (it shouldn't!) if there are [extra parens](Google.com)"))

THIS COMMENT HAS ZERO LINKS
THIS COMMENT HAS [A LINK AT THE END](Google.com)
[THERE IS A LINK](Google.com) AT THE BEGINNING
THERE IS [A LINK](Google.com) IN THE MIDDLE
THERE ARE [A FEW](Google.com) LINKS [IN](StackOverflow.com) THIS ONE
DOESN'T MATTER (IT SHOULDN'T!) IF THERE ARE [EXTRA PARENS](Google.com)

The code above targets Python 3.

A shorter variant of the same idea:

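import re

def make_upper(comment):
    # Collect the original URLs before uppercasing anything.
    native_links = re.findall(r"\[.*?\]\((.*?)\)", comment)
    result = comment.upper()
    for url in native_links:
        # Put each URL back in its original case.
        result = result.replace(url.upper(), url)
    return result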

The previous answers rely on a global search and replace. That breaks when the link's URL also appears elsewhere in the text. A guaranteed safe way to do this is to build a new string as you go.

For example, although this is not elegant, it does the trick:

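import re

# re.DOTALL lets "." cross newlines, so links in multi-line comments are found too.
MARKDOWN_LINK_REGEX = re.compile(r'(.*?)(\[.*?\])(\(.*?\))(.*)', re.DOTALL)

def uppercase_comment(text):
    result = ""
    while True:
        match = MARKDOWN_LINK_REGEX.match(text)
        if match:
            # Text before the link and the [text] part are uppercased ...
            result += match.group(1).upper() + match.group(2).upper()
            # ... while the (url) part is copied unchanged.
            result += match.group(3)
            # Continue with whatever follows the link.
            text = match.group(4)
        else:
            result += text.upper()
            return result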
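
The same "build it as you go" idea can also be sketched as a single pass with finditer, slicing the original comment between matches. This is only one possible variant, not the code from the answer above; the function name uppercase_outside_urls and the sample comment are made up for illustration.

```python
import re

# Same link syntax as above: [text](url), with text and URL captured separately.
LINK = re.compile(r'\[(.*?)\]\((.*?)\)')


def uppercase_outside_urls(comment):
    """Uppercase everything except the URL part of each [text](url) link."""
    pieces = []
    last_end = 0
    for match in LINK.finditer(comment):
        # Plain text between the previous link and this one.
        pieces.append(comment[last_end:match.start()].upper())
        # Link text is uppercased, the URL is copied through untouched.
        pieces.append('[' + match.group(1).upper() + '](' + match.group(2) + ')')
        last_end = match.end()
    # Whatever follows the last link (or the whole comment if there were none).
    pieces.append(comment[last_end:].upper())
    return ''.join(pieces)


# The URL also appears as plain text; only the copy inside the link keeps its case.
print(uppercase_outside_urls('see google.com or [a link](google.com)'))
# SEE GOOGLE.COM OR [A LINK](google.com)
```

Because each URL is copied by position rather than searched for by value, a second occurrence of the same text elsewhere in the comment is uppercased like any other plain text.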