Custom String Parsing

When parsing an FTX (free text) string, I need to split it using + as a delimiter, but only when the + is not preceded by an escape character (say ?). Thus the string nika ?+ marry = love+sandra ?+ alex = love should be split into two strings: nika + marry = love and sandra + alex = love. Using String.Split('+') is clearly not enough. Can I achieve this somehow?

One way, it seems to me, is to replace the occurrences of ?+ with some unique character (or sequence of characters), say @#@, split using "+" as the separator, and then replace @#@ back with +. But that is unreliable and wrong in every way I can think of.

? is used as an escape character only in combination with : or +; in any other case it is treated as a regular character.

+5
2 answers

An awful regex for splitting it:

    string str = "nika ?+ marry = love??+sandra ???+ alex = love";
    string[] splitted = Regex.Split(str, @"(?<=(?:^|[^?])(?:\?\?)*)\+");

It splits on a + ( \+ ) that is preceded by the start of the string ( ^ ) or a character other than ? ( [^?] ), followed by an even number of ? ( (?:\?\?)* ). An even run of ? means every ? in it is itself escaped, so the + is a real delimiter. Non-capturing groups ( (?:) ) are used liberally because Regex.Split inserts the text matched by any capturing group into the result array.
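To illustrate why the groups are non-capturing, here is a minimal sketch of how Regex.Split treats capture groups. The class and method names (SplitGroupsDemo, WithCapture, WithoutCapture) are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitGroupsDemo
{
    // Capturing group: the matched delimiter is included in the result.
    public static string[] WithCapture(string input)
    {
        return Regex.Split(input, @"(\+)");
    }

    // Non-capturing group: the delimiter is dropped, as with a plain split.
    public static string[] WithoutCapture(string input)
    {
        return Regex.Split(input, @"(?:\+)");
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", WithCapture("a+b")));    // a|+|b
        Console.WriteLine(string.Join("|", WithoutCapture("a+b"))); // a|b
    }
}
```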

Please note that I do not unescape anything! So in the end ?+ remains ?+ .
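If unescaping is wanted as well, one possible follow-up step is a Regex.Replace on each part. This is only a sketch: it assumes, like the regex above, that ?? stands for a literal ? (which goes beyond the rule stated in the question), and the names FtxSplit and SplitAndUnescape are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class FtxSplit
{
    // Same lookbehind pattern as in the answer: a '+' preceded by an even run of '?'.
    private const string UnescapedPlus = @"(?<=(?:^|[^?])(?:\?\?)*)\+";

    public static string[] SplitAndUnescape(string input)
    {
        return Regex.Split(input, UnescapedPlus)
                    // Drop the escape character in front of '?', '+' or ':'.
                    // Matches are consumed left to right, so "???+" becomes "?+".
                    .Select(part => Regex.Replace(part, @"\?([?+:])", "$1"))
                    .ToArray();
    }

    public static void Main()
    {
        string str = "nika ?+ marry = love??+sandra ???+ alex = love";
        foreach (string part in SplitAndUnescape(str))
        {
            Console.WriteLine("'{0}'", part);
        }
        // 'nika + marry = love?'
        // 'sandra ?+ alex = love'
    }
}
```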

+3
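    using System;
    using System.Text.RegularExpressions;

    public class Program
    {
        public static void Main()
        {
            string s = "nika ?+ marry = love+sandra ?+ alex = love";

            // Split on every '+', escaped or not. "\\?{0}" matches zero '?' characters,
            // so the pattern is equivalent to "\\+".
            string[] result = Regex.Split(s, "\\?{0}\\+", RegexOptions.Multiline);

            // Rejoin with '\n' as a temporary separator (assumes the input contains no newlines).
            s = String.Join("\n", result);

            // A '?' directly before a separator marks an escaped '+': restore it.
            Regex rgx = new Regex("\\?\\n");
            s = rgx.Replace(s, "+");

            // The remaining newlines are the real delimiters.
            result = Regex.Split(s, "\\n", RegexOptions.Multiline);

            foreach (string match in result)
            {
                Console.WriteLine("'{0}'", match);
            }
        }
    }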

Outputs

    'nika + marry = love'
    'sandra + alex = love'

See https://dotnetfiddle.net/HkcQUw
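The same result can probably be obtained without a temporary separator by scanning the string once, which avoids the assumption that the input contains no newlines. The sketch below is not part of the fiddle above; EscapeAwareSplitter is a made-up name, and it only handles ?+ (a ?: pair is left untouched, on the assumption that it is dealt with in a later step).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class EscapeAwareSplitter
{
    // Splits on '+' unless it is directly preceded by '?'; "?+" becomes a literal '+'.
    public static List<string> Split(string input)
    {
        var parts = new List<string>();
        var current = new StringBuilder();

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '?' && i + 1 < input.Length && input[i + 1] == '+')
            {
                current.Append('+'); // escaped '+': keep it as text
                i++;                 // skip the '+' that was just consumed
            }
            else if (c == '+')
            {
                parts.Add(current.ToString()); // unescaped '+': end of this part
                current.Clear();
            }
            else
            {
                current.Append(c); // any other character, including a lone '?'
            }
        }

        parts.Add(current.ToString()); // last part after the final delimiter
        return parts;
    }

    public static void Main()
    {
        foreach (string part in Split("nika ?+ marry = love+sandra ?+ alex = love"))
        {
            Console.WriteLine("'{0}'", part);
        }
    }
}
```

Run on the sample string, this prints the same two lines as the output shown above.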

+1
