Regular expression for splitting long lines across multiple lines

I am not an expert in regular expressions, and today in my project I need to split a long line into several lines so that I can check whether the text fits the page height.

I need a C# regex that splits long lines into multiple lines separated by "\n" or "\r\n", each at most 150 characters long. If the 150th character falls in the middle of a word, the whole word should move to the next line.

Can anyone help me?

+7
5 answers

This is actually a fairly simple problem. Match any run of up to 150 characters followed by whitespace or the end of the input. Because regex quantifiers are greedy by default, each match extends as far as it can, which is exactly what you want. Replace each match with itself followed by a newline:

 .{0,150}(\s+|$) 

Replace

 $0\r\n 

See also: http://regexhero.net/tester/?id=75645133-1de2-4d8d-a29d-90fff8b2bab5
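For illustration, here is a minimal C# sketch of the same replace idea. The class name `LineWrapper` and the method `Wrap` are hypothetical, not part of the answer. The sketch changes the pattern slightly, and that change is an assumption rather than the answer's exact expression: it requires at least one character (`{1,n}`) so that no empty match is produced at the end of the input, and it captures the text without its trailing whitespace. It also assumes the input contains no existing line breaks and no single word longer than the width.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineWrapper
{
    // Hypothetical helper: wraps 'text' at whitespace so that each
    // output line holds at most 'width' characters.
    public static string Wrap(string text, int width = 150)
    {
        // Group 1 is the line content; the whitespace (or end of input)
        // that follows it is matched but not kept.
        string pattern = @"(.{1," + width + @"})(?:\s+|$)";

        // Every match becomes "<content>\r\n"; drop the final line break.
        return Regex.Replace(text, pattern, "$1\r\n").TrimEnd('\r', '\n');
    }
}
```

Calling `LineWrapper.Wrap(sourceString)` uses the 150-character limit from the question; a smaller width makes the behavior easier to inspect in a console.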

+7
 var regex = new Regex(@".{0,150}", RegexOptions.Multiline);
 var strings = regex.Replace(sourceString, "$0\r\n");
+1

Here you go:

 ^.{1,150}\n 

This matches the longest line of 1 to 150 characters at the start of the input, up to and including its newline.

0

If you just want to split a long string into 150 character strings, then I'm not sure why you need a regular expression:

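 // Requires: using System.Text;
 private string stringSplitter(string inString)
 {
     int lineLength = 150;
     StringBuilder sb = new StringBuilder();
     while (inString.Length > 0)
     {
         // The remainder already fits on one line: emit it and stop,
         // so a short final chunk is not split at its last space.
         if (inString.Length <= lineLength)
         {
             sb.AppendLine(inString);
             break;
         }
         // Look for the last space or newline within the first 150 characters.
         var lastGap = inString.Substring(0, lineLength).LastIndexOfAny(new char[] {' ', '\n'});
         if (lastGap == -1)
         {
             // No gap found: hard-break at the line length.
             sb.AppendLine(inString.Substring(0, lineLength));
             inString = inString.Substring(lineLength);
         }
         else
         {
             // Break at the gap and skip the gap character itself.
             sb.AppendLine(inString.Substring(0, lastGap));
             inString = inString.Substring(lastGap + 1);
         }
     }
     return sb.ToString();
 }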

edited to account for word breaks

0

This code should help. It checks the length of the current line. If the line is longer than maxLength (150), it starts at index maxLength and scans backwards for the first whitespace character (the OP describes a word as a run of non-whitespace characters). It then saves the text up to and including that character and starts over with the remainder, repeating until the remaining substring is no longer than maxLength characters. Finally, it joins all the pieces into the final string.

 // Requires: using System; using System.Collections.Generic;
 string line = "This is a really long run-on sentence that should go for longer than 150 characters and will need to be split into two lines, but only at a word boundary.";
 int maxLength = 150;
 string delimiter = "\r\n";
 List<string> lines = new List<string>();

 // As long as we still have more than 'maxLength' characters, keep splitting
 while (line.Length > maxLength)
 {
     // Fallback: hard-break at maxLength if no whitespace is found,
     // so a very long word cannot cause an endless loop.
     int splitAt = maxLength;

     // Starting at this character and going backwards, look for whitespace.
     for (int charIndex = maxLength; charIndex > 0; charIndex--)
     {
         if (char.IsWhiteSpace(line[charIndex]))
         {
             // Split the line after this character
             splitAt = charIndex + 1;
             break;
         }
     }

     // Save the piece and continue on with the remainder
     lines.Add(line.Substring(0, splitAt));
     line = line.Substring(splitAt);
 }
 lines.Add(line);

 // Join the list back together with delimiter ("\r\n") between each line
 string final = string.Join(delimiter, lines);

 // Check the results
 Console.WriteLine(final);

Note: If you run this code in a console application, you can change "maxLength" to a lower number so that the console itself does not wrap the output.

Note: This code does not account for tab characters. If the text also contains tabs, the situation becomes a little more complicated.

Update: I fixed a bug in which new lines began with a space.

0
