Removing duplicate rows from TStringList without sorting in Delphi

I know how to remove duplicate rows from a TStringList by setting Duplicates to dupIgnore on a sorted TStringList.

CallData := TStringList.Create;
CallData.Sorted := True;
CallData.Duplicates := dupIgnore;

But in my case, the lines must not be sorted.

Searching for duplicates with a FOR loop (or with IndexOf()) is very slow when a TStringList has hundreds of thousands of rows.

 if OpenDialog1.Execute then
  begin
    X := TStringList.Create;
    y := TStringList.Create;
    f := nil;
    g := nil;
    try
      // Read every line of the input file into X
      f := TStreamReader.Create(OpenDialog1.FileName, TEncoding.UTF8, True);
      while not f.EndOfStream do
      begin
        l := f.ReadLine;
        X.Add(l);
      end;

      g := TStreamWriter.Create('d:\logX.txt', True, TEncoding.UTF8);
      for I := 0 to X.Count - 1 do
      begin
        // IndexOf scans y linearly, so this loop is quadratic overall
        if y.IndexOf(X[I]) = -1 then
          y.Add(X[I]);
      end;

      for j := 0 to y.Count - 1 do
        g.WriteLine(y[j]);
    finally
      g.Free;
      f.Free;
      y.Free;
      X.Free;
    end;
  end;

Is there a better way?

+6
3 answers

Here is how I would approach this problem:

  • Create a dictionary keyed on string. The value type does not matter.
  • Iterate over the string list in reverse order.
  • For each line, check whether it is already in the dictionary.
  • If it is, delete it from the string list. Otherwise, add it to the dictionary.

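If there are many duplicates to remove, the repeated deletions from the string list will hurt performance. Each deletion shifts all later items down by one index. You can avoid this by copying the unique lines into a new list rather than deleting in place.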
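
A minimal sketch of the dictionary approach, in the variant that copies unique lines into a second list instead of deleting from the first. The names CopyUnique, Seen, Src and Dst are hypothetical, and the sample data is made up for illustration; it assumes a Delphi version that ships System.Generics.Collections.

```pascal
program DedupCopy;
{$APPTYPE CONSOLE}

uses
  System.Classes, System.Generics.Collections;

// Hypothetical helper: appends the first occurrence of each line in Source
// to Dest, preserving the original order.
procedure CopyUnique(Source, Dest: TStrings);
var
  Seen: TDictionary<string, Boolean>;
  Line: string;
begin
  Seen := TDictionary<string, Boolean>.Create;
  try
    Dest.BeginUpdate;
    try
      for Line in Source do
        if not Seen.ContainsKey(Line) then
        begin
          Seen.Add(Line, True);  // the value is irrelevant, only the key matters
          Dest.Add(Line);
        end;
    finally
      Dest.EndUpdate;
    end;
  finally
    Seen.Free;
  end;
end;

var
  Src, Dst: TStringList;
  I: Integer;
begin
  Src := TStringList.Create;
  try
    Dst := TStringList.Create;
    try
      // Made-up sample data
      Src.Add('b');
      Src.Add('a');
      Src.Add('b');
      Src.Add('c');
      Src.Add('a');
      CopyUnique(Src, Dst);
      for I := 0 to Dst.Count - 1 do
        Writeln(Dst[I]);
    finally
      Dst.Free;
    end;
  finally
    Src.Free;
  end;
end.
```

Note that a TDictionary with string keys compares them case-sensitively by default, whereas TStringList.IndexOf ignores case unless CaseSensitive is set, so the two approaches can disagree on lines that differ only in case.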

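Alternatively, you can work in place:

  • Create a dictionary keyed on string, as before. The value type does not matter.
  • Initialise a variable named Count to zero.
  • Iterate over the string list in forward order.
  • For each line, check whether it is already in the dictionary.
  • If it is, do nothing. Otherwise, add it to the dictionary, copy it to index Count of the list, and increment Count.
  • When the iteration is complete, truncate the list to Count items.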

The point of the dictionary is that a lookup is O(1), so the second algorithm runs in O(n) time overall.
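
The in-place variant with a Count variable could be sketched as follows. RemoveDuplicatesInPlace and Seen are hypothetical names, and the sketch assumes the list is unsorted and that any attached Objects do not need to be preserved, since only the strings are moved.

```pascal
program DedupInPlace;
{$APPTYPE CONSOLE}

uses
  System.Classes, System.Generics.Collections;

// Hypothetical helper: removes duplicates from an unsorted List in place,
// keeping the first occurrence of each line and the original order.
procedure RemoveDuplicatesInPlace(List: TStringList);
var
  Seen: TDictionary<string, Boolean>;
  I, Count: Integer;
begin
  Seen := TDictionary<string, Boolean>.Create(List.Count);
  try
    List.BeginUpdate;
    try
      Count := 0;
      for I := 0 to List.Count - 1 do
        if not Seen.ContainsKey(List[I]) then
        begin
          Seen.Add(List[I], True);
          // Compact unique lines towards the front of the list
          if Count <> I then
            List[Count] := List[I];
          Inc(Count);
        end;
      // Trim the tail; deleting the last item shifts nothing
      while List.Count > Count do
        List.Delete(List.Count - 1);
    finally
      List.EndUpdate;
    end;
  finally
    Seen.Free;
  end;
end;

var
  L: TStringList;
  I: Integer;
begin
  L := TStringList.Create;
  try
    // Made-up sample data
    L.Add('b');
    L.Add('a');
    L.Add('b');
    L.Add('c');
    L.Add('a');
    RemoveDuplicatesInPlace(L);
    for I := 0 to L.Count - 1 do
      Writeln(L[I]);
  finally
    L.Free;
  end;
end.
```

Each line is looked up once and written at most once, and the trailing deletions never shift other items, which is what keeps the whole pass linear.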

+6

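Another option is to keep a second, sorted list alongside the unsorted one and use it only to detect duplicates. For example: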

  y := TStringList.create;
  s := TStringList.create;
  s.Sorted := TRUE;
  s.Duplicates := dupIgnore;

  f := TStreamReader.create(OpenDialog1.FileName, TEncoding.UTF8, True);
  while not f.EndOfStream do
  begin
    l := f.ReadLine;
    s.Add(l);
    if s.Count > y.Count then y.Add(l);
  end;

  // etc.
+2
function CompareObjects(List: TStringList; Index1, Index2: Integer): Integer;
begin
  // Order entries by the line number stored in the Objects slot
  if Index1 = Index2 then
    Result := 0
  else if NativeInt(List.Objects[Index1]) < NativeInt(List.Objects[Index2]) then
    Result := -1
  else
    Result := 1;
end;

begin
  y := TStringList.Create;
  f := nil;
  try
    y.Sorted := True;
    y.Duplicates := dupIgnore;
    f := TStreamReader.Create('c:\106x\q47780823.bat');
    i := 0;
    while not f.EndOfStream do
    begin
      Inc(i);
      line := f.ReadLine;
      // Store the line number alongside the line
      y.AddObject(line, TObject(NativeInt(i)));
    end;
    // Restore the original order by sorting on the stored line numbers
    y.Sorted := False;
    y.CustomSort(CompareObjects);

    for i := 0 to y.Count - 1 do
      WriteLn(y[i]);
  finally
    f.Free;
    y.Free;
  end;
  ReadLn;
end.

I would track the line number (i) and attach it to each line by casting it to an object. Let the sorted list drop the duplicates as before, then restore the original order with a custom sort on the stored line numbers.

+1