Why doesn't LINQ's Union function remove duplicate entries?

I'm using VB.NET. I know that Union normally compares objects by reference, but in VB, strings are generally treated as if they were primitive data types.

Therefore, here is the problem:

Sub Main()
    Dim firstFile, secondFile As String(), resultingFile As New StringBuilder

    firstFile = My.Computer.FileSystem.ReadAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\1.txt").Split(vbNewLine)
    secondFile = My.Computer.FileSystem.ReadAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\2.txt").Split(vbNewLine)

    For Each line As String In firstFile.Union(secondFile)
        resultingFile.AppendLine(line)
    Next

    My.Computer.FileSystem.WriteAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\merged.txt", resultingFile.ToString, True)
End Sub

1.txt contains:

b
from
d
e

2.txt contains:
b
from
d
e
e
g
h
I
J

After running the code, I get:

b
from
d
e
b
e
g
h
I
J

Any suggestions for making the Union function behave like its mathematical counterpart?

+6
linq union
2 answers

LINQ's Union already does what you need. Check that your input files are clean (for example, one of the lines may contain a space before the newline), or Trim() each line after splitting.

 var list1 = new[] { "a", "s", "d" };
 var list2 = new[] { "d", "a", "f", "123" };
 var union = list1.Union(list2);
 union.Dump(); // this is a LINQPad method

In LINQPad, the result is { "a", "s", "d", "f", "123" }.
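
To illustrate the whitespace point, here is a minimal sketch of how stray line-ending characters can defeat Union and how trimming fixes it. The sample strings, the class name `TrimBeforeUnion`, and the helper `UnionTrimmed` are hypothetical, made up for this example. Whether the question's `Split(vbNewLine)` call actually ends up splitting on the carriage return alone is an assumption that depends on the framework version and compiler settings.

```csharp
using System;
using System.Linq;

public static class TrimBeforeUnion
{
    // Hypothetical helper: trims every line, then takes the set union.
    public static string[] UnionTrimmed(string[] a, string[] b) =>
        a.Select(line => line.Trim())
         .Union(b.Select(line => line.Trim()))
         .ToArray();

    public static void Main()
    {
        // Made-up file contents with Windows (CR LF) line endings.
        string first = "b\r\nd\r\ne";
        string second = "d\r\nb\r\ne\r\ng";

        // Splitting on '\r' alone leaves a leading '\n' on every line but the first.
        string[] firstLines = first.Split('\r');
        string[] secondLines = second.Split('\r');

        // "b" and "\nb" are different strings, so Union keeps both.
        Console.WriteLine(firstLines.Union(secondLines).Count()); // 6

        // After trimming, equal lines compare equal and duplicates disappear.
        Console.WriteLine(string.Join(",", UnionTrimmed(firstLines, secondLines))); // b,d,e,g
    }
}
```

Leftover line-ending characters are invisible when the merged file is opened in an editor, which is why the output can appear to contain duplicates.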

+16

I think you want the Distinct function. Call .Distinct() at the end of your LINQ statement:

 var distinctList = yourCombinedList.Distinct(); 

Like 'SELECT DISTINCT' in SQL :)
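
For comparison, here is a small sketch showing that Union already removes duplicates, so it yields the same set of elements as Concat followed by Distinct. It reuses the sample lists from the other answer. The class name `UnionVersusDistinct` and the helper `SameElements` are hypothetical. The last line suggests that Distinct on its own would not help if the lines differ by whitespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionVersusDistinct
{
    // Hypothetical helper: true when both approaches produce the same set of strings.
    public static bool SameElements(string[] a, string[] b)
    {
        var viaUnion = a.Union(b);
        var viaDistinct = a.Concat(b).Distinct();
        return new HashSet<string>(viaUnion).SetEquals(viaDistinct);
    }

    public static void Main()
    {
        var list1 = new[] { "a", "s", "d" };
        var list2 = new[] { "d", "a", "f", "123" };

        Console.WriteLine(list1.Union(list2).Count());             // 5
        Console.WriteLine(list1.Concat(list2).Distinct().Count()); // 5
        Console.WriteLine(SameElements(list1, list2));             // True

        // Strings that differ only by whitespace are still distinct values.
        Console.WriteLine(new[] { "e", "e " }.Distinct().Count()); // 2
    }
}
```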

+2