Regex for reordering array indices when an argument contains commas

To improve performance in some Fortran code, I would like to rearrange the indices of an array so that the 4th index moves to second place. For example, I want to change the following line

ts(l,i,j,k) = ts(l,i,j,k1(i,j)) 

to

 ts(l,k,i,j) = ts(l,k1(i,j),i,j) 

Note that this is just an example line; the indices are not always called i, j, k, l. I only know the name and rank of the array. Therefore I cannot simply split the four arguments at commas, since a single argument can itself be an array reference containing commas (in the case above, k1(i,j)). So my first idea

 sed -r 's/ts\(([^,]+),([^,]+),([^)]+),([^,]+)\)/ts\(\1,\4,\2,\3\)/g' *.F 

fails in this case (on the right-hand side of the line above) because it gives:

 ts(l,k,i,j) = ts(l,j),i,j,k1(i) 
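
The failure can be reproduced outside sed. The following is only a sketch using Python's re module with the same pattern; the variable names are made up for illustration, and Python's regex engine is not identical to sed's, although it behaves the same way for this pattern and input.

```python
import re

# Same naive pattern as in the sed command: it assumes no argument contains a comma.
naive = re.compile(r'ts\(([^,]+),([^,]+),([^)]+),([^,]+)\)')

line = 'ts(l,i,j,k) = ts(l,i,j,k1(i,j))'
result = naive.sub(r'ts(\1,\4,\2,\3)', line)

# The left-hand side is fine, but on the right-hand side the third group
# swallows "j,k1(i" and the fourth group ends up as "j)".
print(result)  # ts(l,k,i,j) = ts(l,j),i,j,k1(i)
```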

I need a regular expression that splits my array indices only at commas where at most one parenthesis is open. Can someone give me a hint on how to do this with sed / python / perl?

best wishes

2 answers

This should work if the brackets are not nested deeper than in your example:

 sed -r 's/ts\(((\([^()]*\)|[^(),])*),((\([^()]*\)|[^(),])*),((\([^()]*\)|[^(),])*),((\([^()]*\)|[^(),])*)\)/ts(\1,\7,\3,\5)/g' *.F 

Not that it's very pretty ...

Explanation:

 (            # Match and capture...
  (           # either
   \(         # an opening parenthesis
   [^()]*     # any number of non-parenthesis characters
   \)         # a closing parenthesis
  |           # or
   [^(),]     # a character besides parentheses or comma
  )*          # any number of times
 )            # End of capturing group
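
For readers who prefer Python to sed, the same idea can be sketched with the re module. This is an illustrative translation, not part of the original answer: the names ARG, PATTERN and reorder are hypothetical, the inner group is made non-capturing so the four arguments become groups 1 to 4, and the leading \b (so that an identifier merely ending in "ts" is not matched) is an added assumption.

```python
import re

# One argument: any run of "(...)" groups without further nesting,
# or single characters that are neither parentheses nor commas.
ARG = r'((?:\([^()]*\)|[^(),])*)'

# Four such arguments separated by commas inside ts(...).
PATTERN = re.compile(r'\bts\(' + ','.join([ARG] * 4) + r'\)')

def reorder(line):
    """Move the 4th index of every ts(...) reference to second place."""
    return PATTERN.sub(r'ts(\1,\4,\2,\3)', line)

print(reorder('ts(l,i,j,k) = ts(l,i,j,k1(i,j))'))
# ts(l,k,i,j) = ts(l,k1(i,j),i,j)
```

As with the sed version, a reference whose arguments nest parentheses more than one level deep does not match and is left unchanged.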

A direct regex may be a bit too complicated. If you have a scripting language available, try the following once you have found a line containing an access to the array (in Python):

    import re

    def getArguments(rhs):
        """Split the first parenthesized list in rhs on its top-level commas."""
        lvl = 0          # current parenthesis nesting depth
        argSplits = []   # positions of '(', top-level ',' and the closing ')'
        for i, c in enumerate(rhs):
            if c == '(':
                lvl += 1
                if lvl == 1:
                    argSplits.append(i)
            elif c == ')':
                lvl -= 1
                if lvl == 0:
                    argSplits.append(i)
                    break
                if lvl < 0:
                    raise ValueError('Parentheses do not match')
            if lvl == 1:
                if c == ',':
                    argSplits.append(i)
        args = []
        for i in range(len(argSplits)-1):
            args.append(rhs[argSplits[i]+1:argSplits[i+1]])
        return args

    line = r'ts(l,i,j,k) = ts(l,i,j,k1(i,j))'
    # get the right-hand side of the equation
    rhs = re.split('=', line)[1]
    # get arguments
    args = getArguments(rhs)
    # args = ['l', 'i', 'j', 'k1(i,j)']

    # try:
    line = r'ts(l,i,j,k) = ts(l,i,j,k1(i(am(crazy(!))i),j))'
    rhs = re.split('=', line)[1]
    # you get:
    # getArguments(rhs) --> ['l', 'i', 'j', 'k1(i(am(crazy(!))i),j)']

Once you have the list of arguments, you can simply reorder them when you put the string back together.
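
The reassembly step could look roughly like this. It is a sketch under stated assumptions rather than a finished tool: the helper names split_top_level and reorder_indices are hypothetical, the array name and the index order are passed in as parameters, and Fortran details such as case-insensitive names, a space before the parenthesis, continuation lines, comments and string literals are not handled.

```python
import re

def split_top_level(text, open_idx):
    """Return (args, close_idx) for the parenthesized list opening at open_idx."""
    depth = 0
    start = open_idx + 1
    args = []
    for i in range(open_idx, len(text)):
        c = text[i]
        if c == '(':
            depth += 1
        elif c == ')':
            depth -= 1
            if depth == 0:
                args.append(text[start:i])
                return args, i
        elif c == ',' and depth == 1:
            # Only commas directly inside the outer parentheses separate arguments.
            args.append(text[start:i])
            start = i + 1
    raise ValueError('Parentheses do not match')

def reorder_indices(line, name='ts', order=(0, 3, 1, 2)):
    """Rewrite every name(...) reference in line with its arguments permuted."""
    pattern = re.compile(r'\b' + re.escape(name) + r'\(')
    out = []
    pos = 0
    while True:
        m = pattern.search(line, pos)
        if m is None:
            out.append(line[pos:])
            break
        open_idx = m.end() - 1
        args, close_idx = split_top_level(line, open_idx)
        # Handle references nested inside the arguments first.
        args = [reorder_indices(a, name, order) for a in args]
        if len(args) == len(order):
            args = [args[k] for k in order]
        out.append(line[pos:open_idx + 1] + ','.join(args) + ')')
        pos = close_idx + 1
    return ''.join(out)

print(reorder_indices('ts(l,i,j,k) = ts(l,i,j,k1(i,j))'))
# ts(l,k,i,j) = ts(l,k1(i,j),i,j)
```

Because the splitting is done by counting parentheses instead of by a fixed regex, the nesting depth of the arguments does not matter; references with a different number of arguments than expected are left in their original order.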
