How to split a string and then join it again?

I have a string in C#, which I split on new char[] {',','.','|',':'}. How can I join the pieces back together in the same order as before, with the same delimiter characters in the same positions? That way the result has the same layout as the original string.


    string s = "aaa,bbbb.ccc|dddd:eee";
    string[] s2 = s.Split(new char[] {',','.','|',':'});
    // now s2 = {"aaa", "bbbb", "ccc", "dddd", "eee"}
    // let's assume I did some operation, and
    // now s2 = {"xxx", "yyy", "zzz", "1111", "222"}
    s = s2.MagicJoin(~~~~~~); // I need this
    // now s = "xxx,yyy.zzz|1111:222";


The char[] in the sample above is only an example; in real input the delimiters will not appear in the same order, and may not all be present at the same time.


I have been thinking about using Regex.Split: first split on the char[] to get one string[], then split on everything that is not in the char[] to get another string[] (the delimiters), and then put the two back together. It may work, but I do not know how to code it.

This might be easier to do with the Regex class:

    input = Regex.Replace(input, @"[^,.|:]+", DoSomething);

Where DoSomething is a method or lambda that transforms a matched item, for example:

    string DoSomething(Match m)
    {
        return m.Value.ToUpper();
    }

In this example, the output string for "aaa,bbbb.ccc|dddd:eee" will be "AAA,BBBB.CCC|DDDD:EEE".

If you use a lambda, you can easily keep state around, for example:

    int i = 0;
    Console.WriteLine(Regex.Replace("aaa,bbbb.ccc|dddd:eee", @"[^,.|:]+", _ => (++i).ToString()));
    // prints 1,2.3|4:5



Whether this fits depends on what kind of transformation you apply to the elements.
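For the case in the question, where the new values are already sitting in an array, the same Regex.Replace idea might be sketched like this. ReplaceTokens is a hypothetical helper name, and the sketch assumes the replacements are consumed strictly in order; if there are more tokens than replacements, the original token is kept.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ReplaceFromArray
{
    // Hypothetical helper: swaps each run of non-delimiter characters
    // for the next value taken from 'replacements', in order.
    public static string ReplaceTokens(string input, string[] replacements)
    {
        var queue = new Queue<string>(replacements);

        // The evaluator's return value is inserted literally.
        // Assumption: when the queue runs dry, the original token is kept.
        return Regex.Replace(input, @"[^,.|:]+",
            m => queue.Count > 0 ? queue.Dequeue() : m.Value);
    }
}
```

Called as ReplaceFromArray.ReplaceTokens("aaa,bbbb.ccc|dddd:eee", new[] { "xxx", "yyy", "zzz", "1111", "222" }), this returns "xxx,yyy.zzz|1111:222". One thing to watch: the pattern never matches an empty token, so for input like "a,,b" the number of matches is smaller than the number of elements string.Split would return.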


Here you go - this works for any combination of delimiters in any order, and it also handles the situation where a delimiter is not actually found in the string. It took me a while to come up with this and, now that I have posted it, it looks more complicated than any of the other answers!

Oh well, I'll leave it here anyway.

    public static string SplitAndReJoin(string str, char[] delimiters,
        Func<string[], string[]> mutator)
    {
        // first thing to know is which of the delimiters are
        // actually in the string, and in what order
        // Using ToArray() here to get the total count of found delimiters
        var delimitersInOrder = (from ci in
                                     (from c in delimiters
                                      from i in FindIndexesOfAll(str, c)
                                      select new { c, i })
                                 orderby ci.i
                                 select ci.c).ToArray();
        if (delimitersInOrder.Length == 0)
            return str;

        // now split and mutate the string
        string[] strings = str.Split(delimiters);
        strings = mutator(strings);

        // now build a format string
        // note - this operation is much more complicated if you wish to use
        // StringSplitOptions.RemoveEmptyEntries
        string formatStr = string.Join("",
            delimitersInOrder.Select((c, i) => string.Format("{{{0}}}", i) + c));

        // deals with the 'perfect' split - i.e. there are always two values
        // either side of a delimiter
        if (strings.Length > delimitersInOrder.Length)
            formatStr += string.Format("{{{0}}}", strings.Length - 1);

        return string.Format(formatStr, strings);
    }

    public static IEnumerable<int> FindIndexesOfAll(string str, char c)
    {
        int startIndex = 0;
        int lastIndex = -1;
        while (true)
        {
            lastIndex = str.IndexOf(c, startIndex);
            if (lastIndex != -1)
            {
                yield return lastIndex;
                startIndex = lastIndex + 1;
            }
            else
                yield break;
        }
    }

And here is a test you can use to check it:

    [TestMethod]
    public void TestSplitAndReJoin()
    {
        // note - mutator does nothing
        Assert.AreEqual("a,b", SplitAndReJoin("a,b", ",".ToCharArray(), s => s));

        // insert a 'z' in front of every sub string.
        Assert.AreEqual("zaaa,zbbbb.zccc|zdddd:zeee",
            SplitAndReJoin("aaa,bbbb.ccc|dddd:eee", ",.|:".ToCharArray(),
                s => s.Select(ss => "z" + ss).ToArray()));

        // re-ordering of delimiters + mutate
        Assert.AreEqual("zaaa,zbbbb.zccc|zdddd:zeee",
            SplitAndReJoin("aaa,bbbb.ccc|dddd:eee", ":|.,".ToCharArray(),
                s => s.Select(ss => "z" + ss).ToArray()));

        // now how about leading or trailing results?
        Assert.AreEqual("a,", SplitAndReJoin("a,", ",".ToCharArray(), s => s));
        Assert.AreEqual(",b", SplitAndReJoin(",b", ",".ToCharArray(), s => s));
    }

Please note that I have assumed you need to do something with the elements of the array, manipulating the individual strings before joining them together again - otherwise presumably you would just keep the original string!

The method builds a dynamic format string. No guarantees about efficiency :)
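To make the "dynamic format string" a bit more concrete, here is a rough sketch of just that step in isolation. BuildFormat is a hypothetical helper, not part of the code above, and it assumes there is always one more value than there are delimiters (which is what string.Split returns when StringSplitOptions.RemoveEmptyEntries is not used).

```csharp
using System;
using System.Linq;

public static class FormatStringDemo
{
    // Hypothetical helper: turns the delimiters, in the order they were found,
    // into a composite format string such as "{0},{1}.{2}|{3}:{4}".
    public static string BuildFormat(char[] delimitersInOrder)
    {
        // One "{i}" placeholder in front of each delimiter...
        string head = string.Concat(
            delimitersInOrder.Select((c, i) => "{" + i + "}" + c));

        // ...plus a final placeholder for the value after the last delimiter.
        return head + "{" + delimitersInOrder.Length + "}";
    }
}
```

For the delimiters ',', '.', '|', ':' this yields "{0},{1}.{2}|{3}:{4}", and passing that to string.Format together with the five mutated values gives "xxx,yyy.zzz|1111:222". Since the delimiters end up inside a format string, '{' or '}' as a delimiter would need escaping first; neither this sketch nor the method above handles that.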


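Here's MagicSplit:

    public IEnumerable<Tuple<string,char>> MagicSplit(string input, char[] split)
    {
        var buffer = new StringBuilder();
        foreach (var c in input)
        {
            if (split.Contains(c))
            {
                // emit the text collected so far together with the delimiter that ended it
                var result = buffer.ToString();
                buffer.Clear();
                yield return Tuple.Create(result, c);
            }
            else
            {
                buffer.Append(c);
            }
        }
        // the last segment has no delimiter after it; ' ' is only a placeholder,
        // so MagicJoin will append a trailing space for this tuple
        yield return Tuple.Create(buffer.ToString(), ' ');
    }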

And two flavours of MagicJoin:

    public string MagicJoin(IEnumerable<Tuple<string,char>> split)
    {
        return split.Aggregate(new StringBuilder(),
            (sb, tup) => sb.Append(tup.Item1).Append(tup.Item2)).ToString();
    }

    public string MagicJoin(IEnumerable<string> strings, IEnumerable<char> chars)
    {
        return strings.Zip(chars, (s, c) => s + c.ToString())
            .Aggregate(new StringBuilder(), (sb, s) => sb.Append(s)).ToString();
    }


Usage:

    var s = "aaa,bbbb.ccc|dddd:eee";

    // simple
    var split = MagicSplit(s, new char[] {',','.','|',':'}).ToArray();
    var joined = MagicJoin(split);

    // if you want to change the strings
    var strings = split.Select(tup => tup.Item1).ToArray();
    var chars = split.Select(tup => tup.Item2).ToArray();
    strings[0] = "test";
    var rejoined = MagicJoin(strings, chars);
    // note: both results end with the ' ' that MagicSplit emits for the last segment

How about this?

    var x = "aaa,bbbb.ccc|dddd:eee";
    var matches = Regex.Matches(x, "(?<Value>[^\\.,|\\:]+)(?<Separator>[\\.,|\\:]?)");
    var result = new StringBuilder();
    foreach (Match match in matches)
    {
        result.AppendFormat("{0}{1}", match.Groups["Value"], match.Groups["Separator"]);
    }
    Console.WriteLine(result.ToString());
    Console.ReadLine();

Or if you like LINQ (which I do):

    var x = "aaa,bbbb.ccc|dddd:eee";
    var matches = Regex.Matches(x, "(?<Value>[^\\.,|\\:]+)(?<Separator>[\\.,|\\:]?)");
    var reassembly = matches.Cast<Match>().Aggregate(new StringBuilder(),
        (a, v) => a.AppendFormat("{0}{1}", v.Groups["Value"], v.Groups["Separator"])).ToString();
    Console.WriteLine(reassembly);
    Console.ReadLine();

Needless to say, you could do something with the parts before reassembling, which I would suggest is the point of this exercise.
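As a sketch of "doing something with the parts before reassembling", one option is Regex.Split with a capturing group, which keeps the captured delimiters in the returned array. Transform is a hypothetical helper name, and the delimiter set is hard-coded to the one from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitKeepingDelimiters
{
    // Hypothetical helper: applies 'mutate' to every value and leaves
    // the delimiters exactly where they were.
    public static string Transform(string input, Func<string, string> mutate)
    {
        // Because the pattern is wrapped in a capturing group, Regex.Split
        // returns values and delimiters interleaved:
        // "aaa,bbbb.ccc" -> { "aaa", ",", "bbbb", ".", "ccc" }
        string[] parts = Regex.Split(input, @"([,.|:])");

        // Even indexes hold values, odd indexes hold delimiters.
        return string.Concat(
            parts.Select((p, i) => i % 2 == 0 ? mutate(p) : p));
    }
}
```

For example, SplitKeepingDelimiters.Transform("aaa,bbbb.ccc|dddd:eee", s => "z" + s) gives "zaaa,zbbbb.zccc|zdddd:zeee". Empty values (from leading, trailing, or adjacent delimiters) are passed to mutate as empty strings, so the mutator should be prepared for those.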



