Python: list of strings, changing character color if found (using xlsxwriter)

I have several lists that I write to different columns / rows of an Excel spreadsheet using xlsxwriter in Python 2.7. For one list of strings (DNA sequences), I want to find specific characters in each string ('a', 't', 'c', 'g'), change the color of each character individually, and then write the complete list of multi-colored strings into one column of the spreadsheet, one sequence per cell.

So far, the code I wrote is:

row = 1
col = 1
for i in (seqs):
    worksheet.write(row,1,i,green)
    for char in i:
        if i.__contains__("A") or i.__contains__("T") :
            worksheet.write(row,1,i[char],red)
row += 1

Here seqs is my list of sequences. I want A / T to be red and G / C to be green, with each complete sequence written to a single cell. I don't get any errors, but I end up with either the whole sequence written in green on each line, or a single character written in red. Is there any way to do this / make this code work?

1 answer

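You can do this using the XlsxWriter method write_rich_string(). It writes a single cell from alternating formats and string fragments, where each format applies to the fragment that follows it.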

Here is a small working example:

from xlsxwriter.workbook import Workbook

workbook = Workbook('sequences.xlsx')
worksheet = workbook.add_worksheet()

red = workbook.add_format({'color': 'red'})
green = workbook.add_format({'color': 'green'})

sequences = [
    'ACAAGATG',
    'CCATTGTC',
    'CCCCGGCC',
    'CCTGCTGC',
    'GCTGCTCT',
    'CGGGGCCA',
    'GGCCACCG',
]

worksheet.set_column('A:A', 40)

for row_num, sequence in enumerate(sequences):

    format_pairs = []

    # Get each DNA base character from the sequence.
    for base in sequence.upper():

        # Prefix each base with a format.
        if base == 'A' or base == 'T':
            format_pairs.extend((red, base))

        elif base == 'G' or base == 'C':
            format_pairs.extend((green, base))

        else:
            # Non base characters are unformatted.
            format_pairs.append(base)

    worksheet.write_rich_string(row_num, 0, *format_pairs)

workbook.close()

Output:

(Screenshot not preserved: column A of sequences.xlsx, one sequence per row, with A and T in red and G and C in green.)
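
As a possible variation on the example above (not part of the original answer), consecutive bases that share a color can be merged into one fragment, so a sequence like 'CCCCGGCC' needs a single format instead of eight. The helper names rich_fragments and write_sequence are hypothetical. One caveat this sketch assumes: write_rich_string() needs more than two format/string arguments, so a sequence that collapses to a single run is written with a plain write() call instead.

```python
from itertools import groupby


def colour_key(base):
    """Classify one base: 'AT', 'GC', or None for anything else."""
    if base in ('A', 'T'):
        return 'AT'
    if base in ('G', 'C'):
        return 'GC'
    return None


def rich_fragments(sequence, at_format, gc_format):
    """Build the alternating format/string list for one sequence.

    Consecutive bases with the same colour are merged into one run.
    Runs of non-base characters are added without a format.
    """
    formats = {'AT': at_format, 'GC': gc_format}
    fragments = []
    for key, run in groupby(sequence.upper(), key=colour_key):
        if key is not None:
            fragments.append(formats[key])
        fragments.append(''.join(run))
    return fragments


def write_sequence(worksheet, row, col, sequence, at_format, gc_format):
    """Write one sequence to a cell (hypothetical helper)."""
    fragments = rich_fragments(sequence, at_format, gc_format)
    if len(fragments) > 2:
        return worksheet.write_rich_string(row, col, *fragments)
    # Zero or one run: fall back to a plain write with an optional format.
    text = fragments[-1] if fragments else ''
    cell_format = fragments[0] if len(fragments) == 2 else None
    return worksheet.write(row, col, text, cell_format)
```

In the loop from the answer, the body would then reduce to write_sequence(worksheet, row_num, 0, sequence, red, green).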
