Splitting a string into lists of words based on word length (C#)

I have a string of words separated by spaces. How can I split the string into lists of words based on word length?

Example:

" aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa " 

Output:

 List 1 = { aa, bb, cc}
 List 2 = { aaa, bbb, ccc}
 List 3 = { aaaa, bbbb, cccc}
5 answers

Edit: I am glad that my original answer helped the OP solve their problem. However, after thinking about the problem a little more, I have revised it, and I strongly recommend the new version over my previous solution, which I have left at the end of this answer.

Simple approach

 string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa ";
 var words = input.Trim().Split().Distinct();
 var lookup = words.ToLookup(word => word.Length);

Explanation

First, we trim the input to avoid empty elements caused by the leading and trailing spaces. Then we split the string into an array. If there can be several spaces between words, you need to use StringSplitOptions, as in Mark's answer.

After calling Distinct so that each word is included only once, we convert the words from an IEnumerable<string> to a Lookup<int, string>, where the word length is the key (int) and the words themselves are the values (string).

Wait, how is this possible? Don't we have several words for each key? Of course, but that is exactly what the Lookup class exists for:

Lookup<TKey, TElement> is a collection of keys, each of which maps to one or more values. A Lookup<TKey, TElement> resembles a Dictionary<TKey, TValue>. The difference is that a Dictionary maps keys to single values, whereas a Lookup maps keys to collections of values.

You can instantiate Lookup by calling ToLookup on an object that implements IEnumerable<T> .


Note
There is no public constructor to create a new instance of Lookup. Furthermore, Lookup objects are immutable, that is, you cannot add or remove elements or keys from a Lookup after it has been created.

word => word.Length is the key selector: it specifies that we want to index (or group, if you like) the Lookup by word length.
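To make the Lookup behaviour described above concrete, here is a minimal sketch. The names LookupDemo and BuildLookup are made up for illustration and do not appear in the original answer; the sketch also uses StringSplitOptions.RemoveEmptyEntries instead of Trim so that repeated spaces are handled too.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Hypothetical helper: groups the distinct words of the input by their length.
    public static ILookup<int, string> BuildLookup(string input)
    {
        return input
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries) // split on whitespace, drop empty entries
            .Distinct()
            .ToLookup(word => word.Length);
    }

    public static void Main()
    {
        var lookup = BuildLookup(" aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa ");

        Console.WriteLine(lookup.Count);       // 3 (the keys 2, 3 and 4)
        Console.WriteLine(lookup.Contains(3)); // True
        Console.WriteLine(lookup[5].Any());    // False: a missing key yields an empty sequence, not an exception
    }
}
```

Indexing a key that is not present returns an empty sequence rather than throwing, which is one practical difference from a Dictionary.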

Usage

Write all words to the console

(similar to the requested output)

 foreach (var grouping in lookup)
 {
     Console.WriteLine("{0}: {1}", grouping.Key, string.Join(", ", grouping));
 }

Output

 2: aa, bb, cc
 3: aaa, bbb, ccc
 4: aaaa, bbbb, cccc

Put all words of a certain length in a List

 List<String> list3 = lookup[3].ToList(); 

Ordering by key

(note that these calls return an IOrderedEnumerable<T> of groupings, so indexing by key, as in lookup[3], is no longer possible on the result)

 var orderedAscending = lookup.OrderBy(grouping => grouping.Key);
 var orderedDescending = lookup.OrderByDescending(grouping => grouping.Key);
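If keyed access is still needed after sorting, one possible approach is to copy the groups into a SortedDictionary. This is only a sketch under that assumption, not something the answer itself proposes; OrderedGroupsDemo and ToSorted are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedGroupsDemo
{
    // Hypothetical helper: copies a lookup into a dictionary that keeps its keys sorted.
    public static SortedDictionary<int, List<string>> ToSorted(ILookup<int, string> lookup)
    {
        var sorted = new SortedDictionary<int, List<string>>();
        foreach (var grouping in lookup)
        {
            sorted[grouping.Key] = grouping.ToList();
        }
        return sorted;
    }

    public static void Main()
    {
        var lookup = "cccc aa bbb".Split().ToLookup(word => word.Length);
        var sorted = ToSorted(lookup);

        // Iteration follows ascending key order: 2, 3, 4.
        foreach (var pair in sorted)
        {
            Console.WriteLine("{0}: {1}", pair.Key, string.Join(", ", pair.Value));
        }

        // The dictionary can still be indexed by key.
        Console.WriteLine(sorted[3][0]); // bbb
    }
}
```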

Original answer - please do not do this (poor performance, messy code):

 string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa ";
 Dictionary<int, string[]> results = new Dictionary<int, string[]>();
 var grouped = input.Trim().Split().Distinct()
     .GroupBy(s => s.Length)
     .OrderBy(g => g.Key); // or: OrderByDescending(g => g.Key);
 foreach (var grouping in grouped)
 {
     results.Add(grouping.Key, grouping.ToArray());
 }

You can use Where to find the elements that match a predicate (in this case, having the right length):

 string[] words = input.Split();
 List<string> twos = words.Where(s => s.Length == 2).ToList();
 List<string> threes = words.Where(s => s.Length == 3).ToList();
 List<string> fours = words.Where(s => s.Length == 4).ToList();

Alternatively, you can use GroupBy to immediately find all groups:

 var groups = words.GroupBy(s => s.Length); 
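As a rough illustration of how the result of GroupBy can be consumed, the sketch below turns each group into a line of text. GroupByDemo and Describe are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // Hypothetical helper: returns one "length: words" line per group.
    public static List<string> Describe(IEnumerable<string> words)
    {
        var lines = new List<string>();
        // Each IGrouping exposes its Key and can itself be enumerated for its members.
        foreach (IGrouping<int, string> group in words.GroupBy(s => s.Length))
        {
            lines.Add(group.Key + ": " + string.Join(", ", group));
        }
        return lines;
    }

    public static void Main()
    {
        // Groups come back in the order in which each key first appears in the source.
        foreach (string line in Describe(new[] { "bbb", "aa", "ccc", "bb" }))
        {
            Console.WriteLine(line);
        }
        // 3: bbb, ccc
        // 2: aa, bb
    }
}
```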

You can also use ToLookup so you can easily index to find all words of a certain length:

 var lookup = words.ToLookup(s => s.Length);
 foreach (var word in lookup[3])
 {
     Console.WriteLine(word);
 }

Result:

 aaa
 bbb
 ccc

See it working online: ideone


From your update, it looks like you want to remove empty strings and duplicate words. You can do the first using StringSplitOptions.RemoveEmptyEntries and the second using Distinct.

 var words = input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
     .Distinct();
 var lookup = words.ToLookup(s => s.Length);

Output:

 aa, bb, cc
 aaa, bbb, ccc
 aaaa, bbbb, cccc
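The difference between the splitting variants used in these answers is easy to overlook, so here is a small sketch comparing them on an input with leading, trailing and doubled spaces. SplitDemo and SplitWords are hypothetical names chosen for this example.

```csharp
using System;

public static class SplitDemo
{
    // Hypothetical helper: splits on whitespace and drops empty entries.
    public static string[] SplitWords(string input)
    {
        return input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string input = " aa  bb "; // leading space, doubled space, trailing space

        Console.WriteLine(input.Split().Length);        // 5: "", "aa", "", "bb", ""
        Console.WriteLine(input.Trim().Split().Length); // 3: "aa", "", "bb"
        Console.WriteLine(SplitWords(input).Length);    // 2: "aa", "bb"
    }
}
```

Trim only removes the empty entries at the ends; the empty string produced by the doubled space survives unless RemoveEmptyEntries is used.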

See it working online: ideone


First, declare a class that holds a word length together with a list of words:

 public class WordList
 {
     public int WordLength { get; set; }
     public List<string> Words { get; set; }
 }

Now we can create a list of word lists with

 string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
 string[] words = input.Trim().Split();
 List<WordList> list = words
     .GroupBy(w => w.Length)
     .OrderBy(group => group.Key)
     .Select(group => new WordList
     {
         WordLength = group.Key,
         Words = group.Distinct().OrderBy(s => s).ToList()
     })
     .ToList();

The lists are sorted by word length, and the words within each list are sorted alphabetically.


Result, for example:

 list[2].WordLength ==> 4
 list[2].Words[1] ==> "bbbb"

UPDATE

If you want, you can process the result right away, instead of putting it in the data structure

 string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
 var query = input
     .Trim()
     .Split()
     .GroupBy(w => w.Length)
     .OrderBy(group => group.Key);

 // Process the result here
 foreach (var group in query)
 {
     // group.Key ==> length of words
     foreach (string word in group.Distinct().OrderBy(w => w))
     {
         ...
     }
 }

You can use LINQ's GroupBy.

Edit: I now use LINQ to build the list of strings you wanted as output.

Edit 2: duplicate inputs now produce a single output, as in the edited question. This is just a Distinct call in LINQ.

 string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
 var list = input.Split(' ');
 var grouped = list.GroupBy(s => s.Length);
 foreach (var elem in grouped)
 {
     string header = "List " + elem.Key + ": ";
     // To keep duplicates, use this instead:
     // var line = elem.Aggregate((workingSentence, next) => next + ", " + workingSentence);
     // To list each word only once, use this:
     var line = elem.Distinct().Aggregate((workingSentence, next) => next + ", " + workingSentence);
     string full = header + line; // header already ends with a space
     Console.WriteLine(full);
 }

 // Output. Note the leading and trailing blanks in the input string:
 // they produce empty strings, which end up in the (empty) "List 0".
 // List 0:
 // List 2: cc, bb, aa
 // List 3: ccc, bbb, aaa
 // List 4: cccc, bbbb, aaaa

A somewhat longer solution, but it collects the result in a dictionary:

 using System;
 using System.Collections.Generic;
 using System.Linq;

 class Program
 {
     // Maps each word length to the words of that length, with keys kept in ascending order.
     static readonly SortedDictionary<int, List<string>> WordSortedDictionary =
         new SortedDictionary<int, List<string>>();

     public static void Main()
     {
         Print();
         Console.ReadKey();
     }

     private static void Print()
     {
         GetListOfWordsByLength();
         foreach (var list in WordSortedDictionary)
         {
             list.Value.ForEach(i => { Console.Write(i + ","); });
             Console.WriteLine();
         }
     }

     private static void GetListOfWordsByLength()
     {
         string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
         string[] inputSplitted = input.Split(' ');
         inputSplitted.ToList().ForEach(AddToList);
     }

     private static void AddToList(string s)
     {
         // Skip the empty strings produced by leading, trailing or repeated spaces.
         if (s.Length > 0)
         {
             if (WordSortedDictionary.ContainsKey(s.Length))
             {
                 List<string> list = WordSortedDictionary[s.Length];
                 list.Add(s);
             }
             else
             {
                 WordSortedDictionary.Add(s.Length, new List<string> { s });
             }
         }
     }
 }