Find and replace in a large file

I want to find a piece of text in a large XML file and replace it with other text. The file is about 50 GB. I want to do this from the command line. I am looking at PowerShell and want to know whether it can handle files this large. I would also like to know the syntax for escaping special characters in PowerShell. I'm new to PowerShell.

I'm currently trying something like this, but it doesn't like it:

    Get-Content C:\File1.xml | Foreach-Object {$_ -replace "xmlns:xsi=\"http:\/\/www\.w3\.org\/2001\/XMLSchema-instance\"", ""} | Set-Content C:\File1.xml

The text I want to replace is xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance", and the replacement is an empty string "".

Questions

  • Can PowerShell handle large files?
  • How do I call a PowerShell script from the command line?
  • What is the syntax for escaping special characters in PowerShell, and is there a list of PowerShell keywords?
  • I do not want the replacement done in memory; I would prefer streaming, so that it does not bring the server to its knees.
  • Are there any other approaches I could take (different tools or strategy)?

Thanks

+5
4 answers

I had a similar need (and a similar lack of PowerShell experience), but pieced together a complete answer from the other answers on this page and a bit more research.

I also wanted to avoid regular-expression processing, since I didn't need it: this is just a simple string replacement, but on a large file, so I didn't want the file loaded into memory.

Here is the command I used (line breaks added for readability):

Get-Content sourcefile.txt |
    Foreach-Object {$_.Replace('http://example.com', 'http://another.example.com')} |
    Set-Content result.txt

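To illustrate the difference between the literal `.Replace()` used above and the regex-based `-replace` from the question, here is a minimal sketch. The sample `$line` is made up for illustration; only the namespace attribute comes from the question.

```powershell
# Hypothetical sample line; the namespace attribute is the one from the question.
$line = '<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" id="1">'
$find = ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'

# String.Replace treats $find as literal text, so nothing needs escaping.
$literal = $line.Replace($find, '')

# -replace treats its pattern as a regex; [regex]::Escape makes the pattern literal.
$viaRegex = $line -replace [regex]::Escape($find), ''

$literal    # <root id="1">
$viaRegex   # <root id="1">
```

Note that `.Replace()` is case-sensitive while `-replace` is case-insensitive by default, so the two can differ on input with mixed casing.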

+10

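It doesn't like it because you can't read from a file and write back to the same file at the same time with Get-Content/Set-Content. Write to a temp file instead, then rename file1.xml to file1.xml.bak and rename the temp file to file1.xml.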

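  • Yes, PowerShell can handle large files. Line-by-line processing will be slow, though; -ReadCount 1000 may speed it up.
  • From which command line? From within PowerShell? If so, put the commands in a script and run it by path, e.g. .\myscript.ps1, or by full path with an argument, e.g. c:\users\joe\myscript.ps1 c:\temp\file1.xml.
  • Quoting works differently in PowerShell: the escape character is the backtick, not the backslash. Prefer single quotes unless the string has to expand variables, as in "$p1 $ps1". In this case (note: untested):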

    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'

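The temp-file, -ReadCount and rename steps described above could be put together roughly as follows. This is a sketch under assumptions, not a tested recipe for a 50 GB file: the function name `Remove-TextFromFile` and the `.tmp` suffix are made up for illustration.

```powershell
# Hypothetical helper: strips a literal string from every line of a file.
function Remove-TextFromFile {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Find
    )

    $temp = "$Path.tmp"   # assumed temp-file name

    # -ReadCount 1000 sends arrays of up to 1000 lines down the pipeline,
    # so the inner foreach handles each line of the batch.
    Get-Content -LiteralPath $Path -ReadCount 1000 |
        ForEach-Object { foreach ($line in $_) { $line.Replace($Find, '') } } |
        Set-Content -LiteralPath $temp

    # Keep the original as a backup, then move the result into place.
    Move-Item -LiteralPath $Path -Destination "$Path.bak"
    Move-Item -LiteralPath $temp -Destination $Path
}

# Example call (path taken from the question):
# Remove-TextFromFile -Path 'C:\File1.xml' -Find ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
```

Set-Content's default encoding differs between PowerShell versions, so an explicit -Encoding may be needed to keep the XML readable. The first Move-Item fails if a .bak file already exists.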
+4

Here is a function for this:

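Function ReplaceTextIn-File{
  Param(
    $infile,   # file to read
    $outfile,  # file to write; defaults to $infile
    $find,     # literal text to search for
    $replace   # replacement text
  )

  # With no output file given, replace the input file
  if( -Not $outfile)
  {
    $outfile = $infile
  }

  # Write to a temp file first, since a file can't be read and overwritten in the same pipeline
  $temp_out_file = "$outfile.temp"

  Get-Content $infile | Foreach-Object {$_.Replace($find, $replace)} | Set-Content $temp_out_file

  # Swap the temp file into place
  if( Test-Path $outfile)
  {
    Remove-Item $outfile
  }

  Move-Item $temp_out_file $outfile
}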

Usage:

ReplaceTextIn-File -infile "c:\input.txt" -find 'http://example.com' -replace 'http://another.example.com' 
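If the Get-Content pipeline turns out to be too slow on a file this size, one alternative (not taken from the answers here) is to use the .NET stream classes directly. A minimal sketch, assuming the output file is different from the input file; `Convert-FileText` is a made-up name:

```powershell
# Hypothetical helper: streams a file line by line through .NET readers/writers.
function Convert-FileText {
    param(
        [Parameter(Mandatory)] [string] $InFile,
        [Parameter(Mandatory)] [string] $OutFile,   # must not be the same file as $InFile
        [Parameter(Mandatory)] [string] $Find,
        [string] $Replace = ''
    )

    # .NET resolves relative paths against the process directory, not the
    # current PowerShell location, so full paths are safest here.
    $reader = New-Object System.IO.StreamReader($InFile)
    try {
        $writer = New-Object System.IO.StreamWriter($OutFile)
        try {
            # ReadLine returns $null at end of file; only one line is held at a time.
            while ($null -ne ($line = $reader.ReadLine())) {
                $writer.WriteLine($line.Replace($Find, $Replace))
            }
        }
        finally { $writer.Dispose() }
    }
    finally { $reader.Dispose() }
}

# Example call with hypothetical paths:
# Convert-FileText -InFile 'C:\input.txt' -OutFile 'C:\output.txt' -Find 'http://example.com' -Replace 'http://another.example.com'
```

WriteLine ends every line with the platform's newline, so line endings may change and the last line always gets a terminator.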
0

The escape character in PowerShell strings is the backtick (`), not the backslash (\). I would give an example, but the backtick is also used by the wiki markup. :(

The only thing you need to escape is the quotes; periods and the like should be fine.
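For reference, a short sketch of how the backtick and the two quote styles behave, using the string from the question. The variable names are only for illustration.

```powershell
# Inside double quotes, a backtick escapes the next character.
$a = "xmlns:xsi=`"http://www.w3.org/2001/XMLSchema-instance`""

# Doubling the quote also works inside a double-quoted string.
$b = "xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"""

# Single quotes are literal: no variable expansion and no backtick escapes.
$c = 'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'

$a -ceq $c   # True
$b -ceq $c   # True

# `$ stops a variable from being expanded inside double quotes.
$name = 'File1.xml'
$d = "`$name is $name"
$d           # $name is File1.xml
```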

-1