How to split a string and then join it back together?

I have the following string in C#.

"aaa,bbbb.ccc|dddd:eee" 

Then I split it on new char[] {',','.','|',':'} . How would I join the pieces back in the same order as before, with the same separator characters, so that the string has the same structure as before?

Example

    string s = "aaa,bbbb.ccc|dddd:eee";
    string[] s2 = s.Split(new char[] {',','.','|',':'});
    // now s2 = {"aaa", "bbbb", "ccc", "dddd", "eee"}
    // let's assume I did some operation, and
    // now s2 = {"xxx", "yyy", "zzz", "1111", "222"}
    s = s2.MagicJoin(~~~~~~); // I need this
    // now s = "xxx,yyy.zzz|1111:222";

EDIT

The char[] in the sample above is just a sample; in real data the separators will not necessarily appear in the same order, or even all appear in the same string.

EDIT

I just thought of using Regex.Split: first split on the char[] to get one string[], then split on everything that is not in the char[] to get another string[] (the separators), and then put the two back together. It may work, but I do not know how to code it.
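The idea in this edit can be sketched with a single Regex.Split call: when the pattern contains a capturing group, .NET includes the captured text in the result array, so pieces and separators come back interleaved. The names SplitJoinSketch and Rejoin are hypothetical, and the sketch assumes a non-empty separator list and a transformation that returns one string per piece.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitJoinSketch
{
    // Hypothetical helper: splits on the separators but keeps them,
    // transforms only the pieces in between, and concatenates the result.
    public static string Rejoin(string input, char[] separators, Func<string, string> transform)
    {
        // Alternation of escaped separators inside one capturing group, e.g. (,|\.|\||:)
        string pattern = "(" + string.Join("|", separators.Select(c => Regex.Escape(c.ToString()))) + ")";

        // Because of the capturing group, the separators are part of the result:
        // even indexes hold the pieces, odd indexes hold the separators.
        string[] parts = Regex.Split(input, pattern);

        return string.Concat(parts.Select((p, i) => i % 2 == 0 ? transform(p) : p));
    }
}
```

A call such as SplitJoinSketch.Rejoin(s, new[] {',', '.', '|', ':'}, p => p.ToUpper()) changes only the pieces and leaves every separator where it was.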

4 answers

This might be easier to do with the Regex class:

 input = Regex.Replace(input, @"[^,.|:]+", DoSomething); 

Where DoSomething is a method or lambda that converts a given item, for example:

 string DoSomething(Match m) { return m.Value.ToUpper(); } 

In this example, the output string for "aaa,bbbb.ccc|dddd:eee" will be "AAA,BBBB.CCC|DDDD:EEE".

If you use a lambda, you can easily keep state around, for example:

    int i = 0;
    Console.WriteLine(Regex.Replace("aaa,bbbb.ccc|dddd:eee", @"[^,.|:]+", _ => (++i).ToString()));

Outputs:

 1,2.3|4:5 

It depends on what kind of transformation you do with the elements.
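For the case in the question, where the pieces have already been replaced by a new array, the same Regex.Replace call can hand the new values back in order. This is a minimal sketch: the names ReplaceSketch and ReplacePieces are hypothetical, and it assumes the string has no empty pieces and that the array holds one entry per matched piece.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceSketch
{
    // Hypothetical helper: swaps each run of non-separator characters for the
    // next entry of 'replacements' and leaves the separators untouched.
    // Assumes replacements.Length equals the number of non-empty pieces.
    public static string ReplacePieces(string input, string[] replacements)
    {
        int i = 0;
        return Regex.Replace(input, @"[^,.|:]+", m => replacements[i++]);
    }
}
```

With the question's data, ReplaceSketch.ReplacePieces("aaa,bbbb.ccc|dddd:eee", new[] {"xxx", "yyy", "zzz", "1111", "222"}) plays the role of the MagicJoin the question asks for.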


Here you go. It works with any combination of separators in any order, and it also handles the case where a separator is not actually found in the string. It took me a while to come up with this and, now that I am posting it, it looks more complicated than any other answer!

Okay, I'll keep it here anyway.

    public static string SplitAndReJoin(string str, char[] delimiters, Func<string[], string[]> mutator)
    {
        // first thing to know is which of the delimiters are
        // actually in the string, and in what order
        // Using ToArray() here to get the total count of found delimiters
        var delimitersInOrder = (from ci in
                                     (from c in delimiters
                                      from i in FindIndexesOfAll(str, c)
                                      select new { c, i })
                                 orderby ci.i
                                 select ci.c).ToArray();

        if (delimitersInOrder.Length == 0)
            return str;

        // now split and mutate the string
        string[] strings = str.Split(delimiters);
        strings = mutator(strings);

        // now build a format string
        // note - this operation is much more complicated if you wish to use
        // StringSplitOptions.RemoveEmptyEntries
        string formatStr = string.Join("",
            delimitersInOrder.Select((c, i) => string.Format("{{{0}}}", i) + c));

        // deals with the 'perfect' split - i.e. there are always two values
        // either side of a delimiter
        if (strings.Length > delimitersInOrder.Length)
            formatStr += string.Format("{{{0}}}", strings.Length - 1);

        return string.Format(formatStr, strings);
    }

    public static IEnumerable<int> FindIndexesOfAll(string str, char c)
    {
        int startIndex = 0;
        int lastIndex = -1;
        while (true)
        {
            lastIndex = str.IndexOf(c, startIndex);
            if (lastIndex != -1)
            {
                yield return lastIndex;
                startIndex = lastIndex + 1;
            }
            else
                yield break;
        }
    }

And here is a test you can use to check it:

    [TestMethod]
    public void TestSplitAndReJoin()
    {
        // note - mutator does nothing
        Assert.AreEqual("a,b", SplitAndReJoin("a,b", ",".ToCharArray(), s => s));

        // insert a 'z' in front of every sub string.
        Assert.AreEqual("zaaa,zbbbb.zccc|zdddd:zeee",
            SplitAndReJoin("aaa,bbbb.ccc|dddd:eee", ",.|:".ToCharArray(),
                s => s.Select(ss => "z" + ss).ToArray()));

        // re-ordering of delimiters + mutate
        Assert.AreEqual("zaaa,zbbbb.zccc|zdddd:zeee",
            SplitAndReJoin("aaa,bbbb.ccc|dddd:eee", ":|.,".ToCharArray(),
                s => s.Select(ss => "z" + ss).ToArray()));

        // now how about leading or trailing results?
        Assert.AreEqual("a,", SplitAndReJoin("a,", ",".ToCharArray(), s => s));
        Assert.AreEqual(",b", SplitAndReJoin(",b", ",".ToCharArray(), s => s));
    }

Please note that I have assumed you need to do something with the elements of the array, to manipulate the individual strings before joining them together again; otherwise, presumably, you would just keep the original string!

The method builds a dynamic format string. There is no guarantee of efficiency :)
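If the dynamic format string is a concern, the same split-mutate-rejoin idea can be sketched by interleaving the pieces and the found delimiters directly in a StringBuilder. This is an alternative under stated assumptions, not the method above: the names InterleaveSketch and SplitAndReJoinSimple are hypothetical, and the mutator is assumed to return as many strings as it receives.

```csharp
using System;
using System.Linq;
using System.Text;

public static class InterleaveSketch
{
    // Hypothetical variant: no format string is built, so a delimiter such as
    // '{' or '}' needs no escaping.
    public static string SplitAndReJoinSimple(string str, char[] delimiters, Func<string[], string[]> mutator)
    {
        // The delimiters actually present in the string, in order of appearance.
        char[] found = str.Where(c => delimiters.Contains(c)).ToArray();

        // Split without options always yields found.Length + 1 pieces.
        string[] pieces = mutator(str.Split(delimiters));

        var sb = new StringBuilder();
        for (int i = 0; i < pieces.Length; i++)
        {
            sb.Append(pieces[i]);
            if (i < found.Length)
                sb.Append(found[i]); // the delimiter that followed this piece
        }
        return sb.ToString();
    }
}
```

The trade-off is one extra pass over the input to collect the delimiters, in exchange for skipping both the format string and the LINQ ordering query.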


Here's MagicSplit:

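    public IEnumerable<Tuple<string,char>> MagicSplit(string input, char[] split)
    {
        var buffer = new StringBuilder();
        foreach (var c in input)
        {
            if (split.Contains(c))
            {
                // a separator ends the current piece; emit the pair
                var result = buffer.ToString();
                buffer.Clear();
                yield return Tuple.Create(result, c);
            }
            else
            {
                buffer.Append(c);
            }
        }
        // The last piece has no separator after it; ' ' is a placeholder,
        // so joining the pairs again leaves one trailing space.
        yield return Tuple.Create(buffer.ToString(), ' ');
    }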

And two overloads of MagicJoin:

    public string MagicJoin(IEnumerable<Tuple<string,char>> split)
    {
        return split.Aggregate(new StringBuilder(),
            (sb, tup) => sb.Append(tup.Item1).Append(tup.Item2)).ToString();
    }

    public string MagicJoin(IEnumerable<string> strings, IEnumerable<char> chars)
    {
        return strings.Zip(chars, (s, c) => s + c.ToString())
                      .Aggregate(new StringBuilder(), (sb, s) => sb.Append(s))
                      .ToString();
    }

Usage:

    var s = "aaa,bbbb.ccc|dddd:eee";

    // simple
    var split = MagicSplit(s, new char[] {',','.','|',':'}).ToArray();
    var joined = MagicJoin(split);

    // if you want to change the strings
    var strings = split.Select(tup => tup.Item1).ToArray();
    var chars = split.Select(tup => tup.Item2).ToArray();
    strings[0] = "test";
    var joined2 = MagicJoin(strings, chars); // separate name: 'joined' is already declared above

How about this?

    var x = "aaa,bbbb.ccc|dddd:eee";
    var matches = Regex.Matches(x, "(?<Value>[^\\.,|\\:]+)(?<Separator>[\\.,|\\:]?)");
    var result = new StringBuilder();
    foreach (Match match in matches)
    {
        result.AppendFormat("{0}{1}", match.Groups["Value"], match.Groups["Separator"]);
    }
    Console.WriteLine(result.ToString());
    Console.ReadLine();

Or if you like LINQ (which I do):

    var x = "aaa,bbbb.ccc|dddd:eee";
    var matches = Regex.Matches(x, "(?<Value>[^\\.,|\\:]+)(?<Separator>[\\.,|\\:]?)");
    var reassembly = matches.Cast<Match>()
        .Aggregate(new StringBuilder(),
            (a, v) => a.AppendFormat("{0}{1}", v.Groups["Value"], v.Groups["Separator"]))
        .ToString();
    Console.WriteLine(reassembly);
    Console.ReadLine();

Needless to say, you could do something with the parts before reassembling, which I would suggest is the point of this exercise.
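As a sketch of what "doing something with the parts" could look like, the loop can pass each Value group through a transformation before appending it. The names MatchesSketch and Reassemble are hypothetical. One limitation carried over from the pattern: because Value requires at least one character, empty pieces (for example a leading separator) are not matched and are dropped from the output.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class MatchesSketch
{
    // Hypothetical helper: transforms every value, keeps every separator.
    public static string Reassemble(string input, Func<string, string> transform)
    {
        var matches = Regex.Matches(input, "(?<Value>[^.,|:]+)(?<Separator>[.,|:]?)");
        var result = new StringBuilder();
        foreach (Match match in matches)
        {
            result.Append(transform(match.Groups["Value"].Value))
                  .Append(match.Groups["Separator"].Value);
        }
        return result.ToString();
    }
}
```

For instance, MatchesSketch.Reassemble(x, v => v.ToUpper()) upper-cases the values while the separators stay as they were.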


Source: https://habr.com/ru/post/1411651/

