PowerShell: import headerless CSV and remove partial duplicate rows

Question


I have a log file formatted as CSV without headers. The first column is effectively a unique identifier for each recorded problem. The same problem identifier may appear on several lines, each with different details. I would like to remove the rows whose first column duplicates an earlier row, because I do not need the other data at this time.

My knowledge of PowerShell is fairly basic, so I'm sure I'm missing something simple.

Sorry if this is a duplicate; I found answers to parts of this question, but not to the question as a whole.

So far, I have tried the following:

Import-Csv $outFile | % { Select-Object -Index 1 -Unique } | Out-File $outFile -Append

But this gives me an error:

Import-Csv : The member "LB" is already present.
At C:\Users\jnurczyk\Desktop\Scratch\POImport\getPOImport.ps1:6 char:1
+ Import-Csv $outFile | % {Select-Object -InputObject $_ -Index 1 -Unique} | Out ...
+ ~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Import-Csv], ExtendedTypeSystemException
    + FullyQualifiedErrorId : AlreadyPresentPSMemberInfoInternalCollectionAdd,Microsoft.PowerShell.Commands.ImportCsvCommand

+4

powershell csv

Joshua Nurczyk Dec 11 '13 at 17:36


3 answers

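The function below reads the first line of the file as the column names and renames any repeated name by appending a counter, so that Import-Csv no longer fails on duplicate members. For example, the names come out as: A, B, C, A1, D, A2, C1, and so on.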

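Function Import-CSVCustom ($csvTemp) {
    # Read only the first line of the file to get the raw column names.
    $StreamReader = New-Object System.IO.StreamReader -ArgumentList $csvTemp
    [array]$Headers = $StreamReader.ReadLine() -split "," | ForEach-Object { "$_".Trim() } | Where-Object { $_ }
    $StreamReader.Close()

    # Rename a repeated name by appending the number of times it has already occurred.
    $a = @{}
    $Headers = $Headers | ForEach-Object {
        if ($a.$_.Count) { "$_$($a.$_.Count)" } else { $_ }
        $a.$_ += @($_)
    }

    # With -Header, Import-Csv treats the first line of the file as a data row.
    Import-Csv $csvTemp -Header $Headers
}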
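
The renaming step can be tried on its own, without a file. This is only a sketch of the same idea, not the function above: the header list and the variable names ($rawHeaders, $seenCount, $renamed) are made up for illustration.

```powershell
# Hypothetical header list containing repeated names.
$rawHeaders = 'A', 'B', 'C', 'A', 'D', 'A', 'C'

# Maps each name to the number of times it has been seen so far.
$seenCount = @{}

$renamed = foreach ($name in $rawHeaders) {
    if ($seenCount.ContainsKey($name)) {
        # Repeated name: append the current count, then increase it.
        "$name$($seenCount[$name])"
        $seenCount[$name]++
    } else {
        # First occurrence: keep the name unchanged.
        $name
        $seenCount[$name] = 1
    }
}

$renamed -join ', '   # A, B, C, A1, D, A2, C1
```

The resulting list could then be passed to Import-Csv through -Header, as the function does.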

0

user3818571 10 . '16 23:22

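If the CSV columns correspond to a SQL staging table, a SQL script can generate the -Header argument from the table's column names: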

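-- Builds the text of a -Header argument from the column names of a staging table.
SELECT
        '-Header '
            -- STUFF(..., 1, 1, '') strips the leading comma from the concatenated list.
            + STUFF((SELECT
                    ',' + QUOTENAME(COLUMN_NAME, '"')
                    -- After every fifth column, emit a PowerShell line continuation (backtick + CRLF).
                    + CASE WHEN C.ORDINAL_POSITION % 5 = 0 THEN ' `' + CHAR(13) + CHAR(10) ELSE '' END
                FROM
                    INFORMATION_SCHEMA.COLUMNS C
                WHERE
                    TABLE_NAME = '<Staging Table Name>'
                -- Keep the names in table column order.
                ORDER BY
                    C.ORDINAL_POSITION
            FOR XML PATH (''), type).value('.', 'nvarchar(max)'), 1, 1, '')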

0

Mark Kram 11 . '17 16:24

Benjamin Hubbard · Accepted Answer · 2013-12-11T19:21:20+0000

Since your data has no headers, you need to supply them with the -Header parameter of Import-Csv. Then, to keep only the unique values of the first column, name that column in Select-Object together with -Unique:

Import-Csv $outFile -Header A,B,C | Select-Object -Unique A

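To be clear, the headers in my example are A, B, and C. This only works, though, if you know how many columns there are. If you specify too few headers, the remaining columns are dropped. If you specify too many, the extra headers become empty fields.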
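
Note that the one-liner above returns only column A. If whole rows are wanted and the column count is not known in advance, something like the following might work. It is a minimal sketch under several assumptions: the sample data, the file names, and the Col1, Col2, ... header names are invented for illustration, no quoted field contains a comma, and identifiers are compared case-sensitively.

```powershell
# Hypothetical sample data standing in for the headerless log file.
$sampleFile = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-log.csv'
Set-Content -Path $sampleFile -Value @(
    'P001,disk,first detail'
    'P002,network,only detail'
    'P001,disk,second detail'
)

# Count the fields in the first line and build placeholder headers (Col1, Col2, ...).
# Assumption: no quoted field contains a comma.
$firstLine = Get-Content -Path $sampleFile -TotalCount 1
$columnCount = ($firstLine -split ',').Count
$headers = 1..$columnCount | ForEach-Object { "Col$_" }

# HashSet.Add returns $true only the first time a value is added,
# so Where-Object passes just the first row for each identifier.
$seen = New-Object 'System.Collections.Generic.HashSet[string]'
$uniqueRows = Import-Csv -Path $sampleFile -Header $headers |
    Where-Object { $seen.Add($_.Col1) }

# Write the result to a separate file without a header row.
# ConvertTo-Csv quotes every field by default.
$dedupedFile = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-log-deduped.csv'
$uniqueRows |
    ConvertTo-Csv -NoTypeInformation |
    Select-Object -Skip 1 |
    Set-Content -Path $dedupedFile
```

Writing to a separate file avoids reading and appending to the same file within one pipeline, as the command in the question does.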
