Eric Lippert defies the comma, the best answer?

I wanted to bring this call to the attention of the stackoverflow community. Original problem and answers here . By the way, if you have not followed this before, you should try to read Eric's blog, this is pure wisdom.

Summary:

Write a function that accepts a non-zero IEnumerable and returns a string with the following characteristics:

  • If the sequence is empty, the resulting string is "{}".
  • If the sequence is a single "ABC" element, then the resulting string will be "{ABC}".
  • If the sequence is a sequence of two elements "ABC", "DEF", then the resulting string is "{ABC and DEF}".
  • If the sequence has more than two elements, for example, "ABC", "DEF", "G", "H", then the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)

As you can see even our own John Skeet (yes, it is well known that he can be in two places at the same time ) published a solution but his (IMHO) is not the most elegant, although you probably cannot defeat his performance.

What do you think? There are pretty good options. I really like one of the solutions, which includes selection and aggregate methods (from Fernando Nicolet). Linq is very effective and devotes some time to such complex tasks so that you learn a lot. I twisted it a bit to make it more realistic and understandable (using Count and avoiding Reverse):

public static string CommaQuibbling(IEnumerable<string> items) { int last = items.Count() - 1; Func<int, string> getSeparator = (i) => i == 0 ? string.Empty : (i == last ? " and " : ", "); string answer = string.Empty; return "{" + items.Select((s, i) => new { Index = i, Value = s }) .Aggregate(answer, (s, a) => s + getSeparator(a.Index) + a.Value) + "}"; } 
+22
c # programming-languages puzzle
Apr 25 '09 at 8:37
source share
24 answers

How about this approach? Purely cumulative - there is no reverse tracking and only repeats once. For raw performance, I'm not sure that you'll work better with LINQ, etc., No matter how “pretty” LINQ's answer can be.

 using System; using System.Collections.Generic; using System.Text; static class Program { public static string CommaQuibbling(IEnumerable<string> items) { StringBuilder sb = new StringBuilder('{'); using (var iter = items.GetEnumerator()) { if (iter.MoveNext()) { // first item can be appended directly sb.Append(iter.Current); if (iter.MoveNext()) { // more than one; only add each // term when we know there is another string lastItem = iter.Current; while (iter.MoveNext()) { // middle term; use ", " sb.Append(", ").Append(lastItem); lastItem = iter.Current; } // add the final term; since we are on at least the // second term, always use " and " sb.Append(" and ").Append(lastItem); } } } return sb.Append('}').ToString(); } static void Main() { Console.WriteLine(CommaQuibbling(new string[] { })); Console.WriteLine(CommaQuibbling(new string[] { "ABC" })); Console.WriteLine(CommaQuibbling(new string[] { "ABC", "DEF" })); Console.WriteLine(CommaQuibbling(new string[] { "ABC", "DEF", "G", "H" })); } } 
+28
Apr 25 '09 at 8:53
source share

Ineffective, but I think clearly.

 public static string CommaQuibbling(IEnumerable<string> items) { List<String> list = new List<String>(items); if (list.Count == 0) { return "{}"; } if (list.Count == 1) { return "{" + list[0] + "}"; } String[] initial = list.GetRange(0, list.Count - 1).ToArray(); return "{" + String.Join(", ", initial) + " and " + list[list.Count - 1] + "}"; } 

If I supported the code, I would prefer the smarter versions.

+33
Apr 25 '09 at 19:52
source share

If I did a lot with threads that needed first / last information, I would have an extension:

 [Flags] public enum StreamPosition { First = 1, Last = 2 } public static IEnumerable<R> MapWithPositions<T, R> (this IEnumerable<T> stream, Func<StreamPosition, T, R> map) { using (var enumerator = stream.GetEnumerator ()) { if (!enumerator.MoveNext ()) yield break ; var cur = enumerator.Current ; var flags = StreamPosition.First ; while (true) { if (!enumerator.MoveNext ()) flags |= StreamPosition.Last ; yield return map (flags, cur) ; if ((flags & StreamPosition.Last) != 0) yield break ; cur = enumerator.Current ; flags = 0 ; } } } 

Then the simplest (and not the fastest, which will require several more convenient extension methods) will be:

 public static string Quibble (IEnumerable<string> strings) { return "{" + String.Join ("", strings.MapWithPositions ((pos, item) => ( (pos & StreamPosition.First) != 0 ? "" : pos == StreamPosition.Last ? " and " : ", ") + item)) + "}" ; } 
+5
Apr 25 '09 at 10:04
source share

Here as Python one liner

 >>> f=lambda s:"{%s}"%", ".join(s)[::-1].replace(',','dna ',1)[::-1] >>> f([]) '{}' >>> f(["ABC"]) '{ABC}' >>> f(["ABC","DEF"]) '{ABC and DEF}' >>> f(["ABC","DEF","G","H"]) '{ABC, DEF, G and H}' 

This version may be easier to understand.

 >>> f=lambda s:"{%s}"%" and ".join(s).replace(' and',',',len(s)-2) >>> f([]) '{}' >>> f(["ABC"]) '{ABC}' >>> f(["ABC","DEF"]) '{ABC and DEF}' >>> f(["ABC","DEF","G","H"]) '{ABC, DEF, G and H}' 
+3
Oct 12 '09 at 0:44
source share

Here's a simple F # solution that only performs one iteration forward:

 let CommaQuibble items = let sb = System.Text.StringBuilder("{") // pp is 2 previous, p is previous let pp,p = items |> Seq.fold (fun (pp:string option,p) s -> if pp <> None then sb.Append(pp.Value).Append(", ") |> ignore (p, Some(s))) (None,None) if pp <> None then sb.Append(pp.Value).Append(" and ") |> ignore if p <> None then sb.Append(p.Value) |> ignore sb.Append("}").ToString() 

(EDIT: Turns out it's a lot like Skeet's.)

Security Code:

 let Test l = printfn "%s" (CommaQuibble l) Test [] Test ["ABC"] Test ["ABC";"DEF"] Test ["ABC";"DEF";"G"] Test ["ABC";"DEF";"G";"H"] Test ["ABC";null;"G";"H"] 
+2
Apr 25 '09 at 9:15
source share

Disclaimer I used this as an excuse to play with new technologies, so my solutions do not meet Eric's original requirements for clarity and ease of maintenance.

Enumerator's naive solution (I admit that the foreach option for this is excellent, as it does not require manual tinkering with the enumerator.)

 public static string NaiveConcatenate(IEnumerable<string> sequence) { StringBuilder sb = new StringBuilder(); sb.Append('{'); IEnumerator<string> enumerator = sequence.GetEnumerator(); if (enumerator.MoveNext()) { string a = enumerator.Current; if (!enumerator.MoveNext()) { sb.Append(a); } else { string b = enumerator.Current; while (enumerator.MoveNext()) { sb.Append(a); sb.Append(", "); a = b; b = enumerator.Current; } sb.AppendFormat("{0} and {1}", a, b); } } sb.Append('}'); return sb.ToString(); } 

LINQ Solution

 public static string ConcatenateWithLinq(IEnumerable<string> sequence) { return (from item in sequence select item) .Aggregate( new {sb = new StringBuilder("{"), a = (string) null, b = (string) null}, (s, x) => { if (sa != null) { s.sb.Append(sa); s.sb.Append(", "); } return new {s.sb, a = sb, b = x}; }, (s) => { if (sb != null) if (sa != null) s.sb.AppendFormat("{0} and {1}", sa, sb); else s.sb.Append(sb); s.sb.Append("}"); return s.sb.ToString(); }); } 

Solution with TPL

This solution uses a producer-consumer queue to supply an input sequence to the processor, while storing at least two elements buffered in the queue. As soon as the manufacturer reaches the end of the input sequence, the last two elements can be processed with special processing.

In retrospect, there is no reason for the consumer to work asynchronously, which would eliminate the need for a parallel queue, but, as I said earlier, I just used this as an excuse to play with new technologies :-)

 public static string ConcatenateWithTpl(IEnumerable<string> sequence) { var queue = new ConcurrentQueue<string>(); bool stop = false; var consumer = Future.Create( () => { var sb = new StringBuilder("{"); while (!stop || queue.Count > 2) { string s; if (queue.Count > 2 && queue.TryDequeue(out s)) sb.AppendFormat("{0}, ", s); } return sb; }); // Producer foreach (var item in sequence) queue.Enqueue(item); stop = true; StringBuilder result = consumer.Value; string a; string b; if (queue.TryDequeue(out a)) if (queue.TryDequeue(out b)) result.AppendFormat("{0} and {1}", a, b); else result.Append(a); result.Append("}"); return result.ToString(); } 

Refined tests for brevity.

+2
Apr 25 '09 at 10:10
source share

I am a fan of the serial comma: I eat, shoot and leave.

I constantly need to solve this problem and solved it in three languages ​​(although not in C #). I would adapt the following solution (in Lua , it does not wrap the answer in braces) by writing a concat method that works on any IEnumerable :

 function commafy(t, andword) andword = andword or 'and' local n = #t -- number of elements in the numeration if n == 1 then return t[1] elseif n == 2 then return concat { t[1], ' ', andword, ' ', t[2] } else local last = t[n] t[n] = andword .. ' ' .. t[n] local answer = concat(t, ', ') t[n] = last return answer end end 
+2
May 02 '09 at 17:30
source share

This is not brilliantly readable, but it scales to tens of millions of lines. I am developing on an old Pentium 4 workstation, and it is 1,000,000 lines of average length 8 in approximately 350 ms.

 public static string CreateLippertString(IEnumerable<string> strings) { char[] combinedString; char[] commaSeparator = new char[] { ',', ' ' }; char[] andSeparator = new char[] { ' ', 'A', 'N', 'D', ' ' }; int totalLength = 2; //'{' and '}' int numEntries = 0; int currentEntry = 0; int currentPosition = 0; int secondToLast; int last; int commaLength= commaSeparator.Length; int andLength = andSeparator.Length; int cbComma = commaLength * sizeof(char); int cbAnd = andLength * sizeof(char); //calculate the sum of the lengths of the strings foreach (string s in strings) { totalLength += s.Length; ++numEntries; } //add to the total length the length of the constant characters if (numEntries >= 2) totalLength += 5; // " AND " if (numEntries > 2) totalLength += (2 * (numEntries - 2)); // ", " between items //setup some meta-variables to help later secondToLast = numEntries - 2; last = numEntries - 1; //allocate the memory for the combined string combinedString = new char[totalLength]; //set the first character to { combinedString[0] = '{'; currentPosition = 1; if (numEntries > 0) { //now copy each string into its place foreach (string s in strings) { Buffer.BlockCopy(s.ToCharArray(), 0, combinedString, currentPosition * sizeof(char), s.Length * sizeof(char)); currentPosition += s.Length; if (currentEntry == secondToLast) { Buffer.BlockCopy(andSeparator, 0, combinedString, currentPosition * sizeof(char), cbAnd); currentPosition += andLength; } else if (currentEntry == last) { combinedString[currentPosition] = '}'; //set the last character to '}' break; //don't bother making that last call to the enumerator } else if (currentEntry < secondToLast) { Buffer.BlockCopy(commaSeparator, 0, combinedString, currentPosition * sizeof(char), cbComma); currentPosition += commaLength; } ++currentEntry; } } else { //set the last character to '}' combinedString[1] = '}'; } return new string(combinedString); } 
+2
May 22 '09 at 20:04
source share

Another option is to separate the logic of punctuation and iteration for the sake of clarity of code. And still think about performance.

It works on request with pure IEnumerable / string / and the lines in the list cannot be null.

 public static string Concat(IEnumerable<string> strings) { return "{" + strings.reduce("", (acc, prev, cur, next) => acc.Append(punctuation(prev, cur, next)).Append(cur)) + "}"; } private static string punctuation(string prev, string cur, string next) { if (null == prev || null == cur) return ""; if (null == next) return " and "; return ", "; } private static string reduce(this IEnumerable<string> strings, string acc, Func<StringBuilder, string, string, string, StringBuilder> func) { if (null == strings) return ""; var accumulatorBuilder = new StringBuilder(acc); string cur = null; string prev = null; foreach (var next in strings) { func(accumulatorBuilder, prev, cur, next); prev = cur; cur = next; } func(accumulatorBuilder, prev, cur, null); return accumulatorBuilder.ToString(); } 

F # certainly looks a lot better:

 let rec reduce list = match list with | [] -> "" | head::curr::[] -> head + " and " + curr | head::curr::tail -> head + ", " + curr :: tail |> reduce | head::[] -> head let concat list = "{" + (list |> reduce ) + "}" 
+2
09 Oct '09 at 9:31
source share

Late entry:

 public static string CommaQuibbling(IEnumerable<string> items) { string[] parts = items.ToArray(); StringBuilder result = new StringBuilder('{'); for (int i = 0; i < parts.Length; i++) { if (i > 0) result.Append(i == parts.Length - 1 ? " and " : ", "); result.Append(parts[i]); } return result.Append('}').ToString(); } 
+2
Mar 23 '10 at 5:53
source share
 public static string CommaQuibbling(IEnumerable<string> items) { int count = items.Count(); string answer = string.Empty; return "{" + (count==0) ? "" : ( items[0] + (count == 1 ? "" : items.Range(1,count-1). Aggregate(answer, (s,a)=> s += ", " + a) + items.Range(count-1,1). Aggregate(answer, (s,a)=> s += " AND " + a) ))+ "}"; } 

It is implemented as

 if count == 0 , then return empty, if count == 1 , then return only element, if count > 1 , then take two ranges, first 2nd element to 2nd last element last element 
+1
Apr 25 '09 at 8:56
source share

How to skip complex aggregation code and just clear the line after it is assembled?

 public static string CommaQuibbling(IEnumerable<string> items) { var aggregate = items.Aggregate<string, StringBuilder>( new StringBuilder(), (b,s) => b.AppendFormat(", {0}", s)); var trimmed = Regex.Replace(aggregate.ToString(), "^, ", string.Empty); return string.Format( "{{{0}}}", Regex.Replace(trimmed, ", (?<last>[^,]*)$", @" and ${last}")); } 

UPDATED: this will not work with comma strings as indicated in the comments. I tried some other options, but without specific rules about what the lines may contain, I will have real problems matching any possible last element with a regex, which makes this a good lesson for me regarding their limitations.

+1
Apr 25 '09 at 9:25
source share

Here's mine, but I understand it pretty much, like Mark, has some slight differences in the order of things, and I also added unit tests.

 using System; using NUnit.Framework; using NUnit.Framework.Extensions; using System.Collections.Generic; using System.Text; using NUnit.Framework.SyntaxHelpers; namespace StringChallengeProject { [TestFixture] public class StringChallenge { [RowTest] [Row(new String[] { }, "{}")] [Row(new[] { "ABC" }, "{ABC}")] [Row(new[] { "ABC", "DEF" }, "{ABC and DEF}")] [Row(new[] { "ABC", "DEF", "G", "H" }, "{ABC, DEF, G and H}")] public void Test(String[] input, String expectedOutput) { Assert.That(FormatString(input), Is.EqualTo(expectedOutput)); } //codesnippet:93458590-3182-11de-8c30-0800200c9a66 public static String FormatString(IEnumerable<String> input) { if (input == null) return "{}"; using (var iterator = input.GetEnumerator()) { // Guard-clause for empty source if (!iterator.MoveNext()) return "{}"; // Take care of first value var output = new StringBuilder(); output.Append('{').Append(iterator.Current); // Grab next if (iterator.MoveNext()) { // Grab the next value, but don't process it // we don't know whether to use comma or "and" // until we've grabbed the next after it as well String nextValue = iterator.Current; while (iterator.MoveNext()) { output.Append(", "); output.Append(nextValue); nextValue = iterator.Current; } output.Append(" and "); output.Append(nextValue); } output.Append('}'); return output.ToString(); } } } } 
+1
Apr 25 '09 at 10:19
source share

I really liked John, but this is because I really like how I approached this problem. Instead of special coding in two variables, I implemented them inside the FIFO queue.

This is strange because I only assumed that there would be 15 posts that all did exactly the same, but it looks like we were the only ones who did it that way. Oh, looking at these answers, Mark Gravel’s answer is pretty close to the method we used, but it uses two “loops” rather than sticking to the values.

But all these answers with LINQ and regex and array concatenation just seem crazy! :-)

+1
Apr 25 '09 at 11:04
source share

I do not think that using the old old array is a limitation. Here is my version using the array and extension method:

 public static string CommaQuibbling(IEnumerable<string> list) { string[] array = list.ToArray(); if (array.Length == 0) return string.Empty.PutCurlyBraces(); if (array.Length == 1) return array[0].PutCurlyBraces(); string allExceptLast = string.Join(", ", array, 0, array.Length - 1); string theLast = array[array.Length - 1]; return string.Format("{0} and {1}", allExceptLast, theLast) .PutCurlyBraces(); } public static string PutCurlyBraces(this string str) { return "{" + str + "}"; } 

I use an array because of the string.Join method and because if the ability to access the last element through the index. The extension method here is due to DRY.

I think that performance disruptions come from calls to list.ToArray() and string.Join , but all in one, I hope that part of the code is pleasant to read and support.

+1
Apr 25 '09 at 12:23
source share

I think Linq provides pretty readable code. This version processes a million "ABCs" in 0.89 seconds:

 using System.Collections.Generic; using System.Linq; namespace CommaQuibbling { internal class Translator { public string Translate(IEnumerable<string> items) { return "{" + Join(items) + "}"; } private static string Join(IEnumerable<string> items) { var leadingItems = LeadingItemsFrom(items); var lastItem = LastItemFrom(items); return JoinLeading(leadingItems) + lastItem; } private static IEnumerable<string> LeadingItemsFrom(IEnumerable<string> items) { return items.Reverse().Skip(1).Reverse(); } private static string LastItemFrom(IEnumerable<string> items) { return items.LastOrDefault(); } private static string JoinLeading(IEnumerable<string> items) { if (items.Any() == false) return ""; return string.Join(", ", items.ToArray()) + " and "; } } } 
+1
Apr 25 '09 at 19:26
source share

You can use foreach without LINQ, delegates, closures, lists or arrays and still have clear code. Use bool and string, for example:

 public static string CommaQuibbling(IEnumerable items) { StringBuilder sb = new StringBuilder("{"); bool empty = true; string prev = null; foreach (string s in items) { if (prev!=null) { if (!empty) sb.Append(", "); else empty = false; sb.Append(prev); } prev = s; } if (prev!=null) { if (!empty) sb.Append(" and "); sb.Append(prev); } return sb.Append('}').ToString(); } 
+1
Apr 27 '09 at 19:18
source share
 public static string CommaQuibbling(IEnumerable<string> items) { var itemArray = items.ToArray(); var commaSeparated = String.Join(", ", itemArray, 0, Math.Max(itemArray.Length - 1, 0)); if (commaSeparated.Length > 0) commaSeparated += " and "; return "{" + commaSeparated + itemArray.LastOrDefault() + "}"; } 
+1
Apr 27 '09 at 20:27
source share

Here is my view. The signature was slightly changed to make it more general. Using .NET 4 functions ( String.Join() with IEnumerable<T> ), otherwise it works with .NET 3.5. The goal was to use LINQ with significantly simplified logic.

 static string CommaQuibbling<T>(IEnumerable<T> items) { int count = items.Count(); var quibbled = items.Select((Item, index) => new { Item, Group = (count - index - 2) > 0}) .GroupBy(item => item.Group, item => item.Item) .Select(g => g.Key ? String.Join(", ", g) : String.Join(" and ", g)); return "{" + String.Join(", ", quibbled) + "}"; } 
+1
Oct 02 2018-10-10
source share

There are several answers other than C #, and the original message asked the answers in any language, so I thought that I would show another way to do this that none of the C # programmers seemed to touch: DSL!

 (defun quibble-comma (words) (format nil "~{~#[~;~a~;~a and ~a~:;~@{~a~#[~; and ~:;, ~]~}~]~}" words)) 

Insight will notice that Common Lisp does not have a built-in IEnumerable<T> , and therefore, FORMAT will only work on the correct list here. But if you made IEnumerable , you could of course extend FORMAT to work on this. (Does it have a clojure?)

Also, anyone who reads this one who has a taste (including Lisp programmers!) Is likely to be offended by the literal "~{~#[~;~a~;~a and ~a~:;~@{~a~#[~; and ~:;, ~]~}~]~}" there. I will not argue that FORMAT implements a good DSL, but I find it extremely useful to have several powerful DSLs to combine strings. Regex is a powerful DSL for breaking lines, and string.Format is a DSL (type) for concatenating strings, but it's stupidly weak.

I think everyone writes this all the time. Why the hell is there no built-in universal, tasteful DSL? I think the closest we have is Perl, maybe.

+1
02 Oct '10 at 20:43
source share

Just for fun, using the new Zip extension method from C # 4.0:

 private static string CommaQuibbling(IEnumerable<string> list) { IEnumerable<string> separators = GetSeparators(list.Count()); var finalList = list.Zip(separators, (w, s) => w + s); return string.Concat("{", string.Join(string.Empty, finalList), "}"); } private static IEnumerable<string> GetSeparators(int itemCount) { while (itemCount-- > 2) yield return ", "; if (itemCount == 1) yield return " and "; yield return string.Empty; } 
+1
Jan 25 2018-11-11T00:
source share
 return String.Concat( "{", input.Length > 2 ? String.Concat( String.Join(", ", input.Take(input.Length - 1)), " and ", input.Last()) : String.Join(" and ", input), "}"); 
+1
Jan 28 '11 at 14:50
source share

I tried using foreach. Please let me know your opinions.

 private static string CommaQuibble(IEnumerable<string> input) { var val = string.Concat(input.Process( p => p, p => string.Format(" and {0}", p), p => string.Format(", {0}", p))); return string.Format("{{{0}}}", val); } public static IEnumerable<T> Process<T>(this IEnumerable<T> input, Func<T, T> firstItemFunc, Func<T, T> lastItemFunc, Func<T, T> otherItemFunc) { //break on empty sequence if (!input.Any()) yield break; //return first elem var first = input.First(); yield return firstItemFunc(first); //break if there was only one elem var rest = input.Skip(1); if (!rest.Any()) yield break; //start looping the rest of the elements T prevItem = first; bool isFirstIteration = true; foreach (var item in rest) { if (isFirstIteration) isFirstIteration = false; else { yield return otherItemFunc(prevItem); } prevItem = item; } //last element yield return lastItemFunc(prevItem); } 
+1
Aug 15 '12 at 5:26
source share

Here are some solutions and test code written in Perl based on answers from http://blogs.perl.org/users/brian_d_foy/2013/10/comma-quibbling-in-perl.html .

 #!/usr/bin/perl use 5.14.0; use warnings; use strict; use Test::More qw{no_plan}; sub comma_quibbling1 { my (@words) = @_; return "" unless @words; return $words[0] if @words == 1; return join(", ", @words[0 .. $#words - 1]) . " and $words[-1]"; } sub comma_quibbling2 { return "" unless @_; my $last = pop @_; return $last unless @_; return join(", ", @_) . " and $last"; } is comma_quibbling1(qw{}), "", "1-0"; is comma_quibbling1(qw{one}), "one", "1-1"; is comma_quibbling1(qw{one two}), "one and two", "1-2"; is comma_quibbling1(qw{one two three}), "one, two and three", "1-3"; is comma_quibbling1(qw{one two three four}), "one, two, three and four", "1-4"; is comma_quibbling2(qw{}), "", "2-0"; is comma_quibbling2(qw{one}), "one", "2-1"; is comma_quibbling2(qw{one two}), "one and two", "2-2"; is comma_quibbling2(qw{one two three}), "one, two and three", "2-3"; is comma_quibbling2(qw{one two three four}), "one, two, three and four", "2-4"; 
+1
Oct 14 '13 at 8:45
source share



All Articles