We have a directory of 3000+ HTML files that are being moved to a SharePoint site, and we need to clean up some of the data.
Specific situations:
- About 1/3 of the files contain an XML header, <?xml version="1.0" encoding="utf-8"?>, which SharePoint does not like. We plan to simply remove this header line.
- Each file has JavaScript options for "HOME" that point to one of two alternative home page links, foo1.htm or foo.htm. We want to change both to the absolute link http:\\sharepoint.site\home.aspx
- Each file also has a JavaScript link parameter "Show", which we just want to hide by changing it to ''.
Here is my function:
function scrubXMLHeader {
    # $backupGuidePath, $workScrubPath, $messageLog and Log are defined elsewhere in the script
    $sourcePath = $backupGuidePath
    $destinationPath = $workScrubPath
    $srcfiles = Get-ChildItem $sourcePath -Filter "*.htm"
    $srcfilecount = (Get-ChildItem $sourcePath).Count
    $selfilecount = $srcfiles.Count
    "Input From: $($sourcePath)" | Log $messageLog -echo
    " Output To: $($destinationPath)" | Log $messageLog -echo
    foreach ($file in $srcfiles)
    {
        # Join-Path with $file.Name supplies the separator between folder and file name
        $outfile = Join-Path $destinationPath $file.Name
        # Keep every line except the XML declaration
        $content = Get-Content $file.FullName | ? { $_ -notmatch "<\?xml[^>]+>" }
        Set-Content -Path $outfile -Force -Value $content
    }
}
I want to add the following two changes to each document:
-replace '("foo.htm", "", ">", "Home", "foo1.htm")', '("http:\\sharepoint.site\home.aspx", "", ">", "Home", "http:\\sharepoint.site\home.aspx")'
-replace 'addButton("show",BTN_TEXT,"Show","","","","",0,0,"","","");', ''
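One thing to keep in mind with these two changes: -replace treats its first argument as a regular expression, so the parentheses and dots in the patterns above would not be matched literally. A minimal sketch of two ways to get a literal replacement, assuming a made-up sample line ($line and the addHome name are hypothetical, not taken from the real files):

```powershell
# Hypothetical sample line standing in for the JavaScript found in each file.
$line = 'addHome("foo.htm", "", ">", "Home", "foo1.htm")'

# Literal text to find and its replacement, as given in the question.
$old = '("foo.htm", "", ">", "Home", "foo1.htm")'
$new = '("http:\\sharepoint.site\home.aspx", "", ">", "Home", "http:\\sharepoint.site\home.aspx")'

# Option 1: -replace expects a regex, so escape the literal text first.
$viaRegex = $line -replace [regex]::Escape($old), $new

# Option 2: String.Replace performs a plain, case-sensitive literal replacement.
$viaLiteral = $line.Replace($old, $new)
```

Both variables should end up holding the same rewritten line. The replacement text contains no $ characters, so it needs no extra escaping on the regex path.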
I would like to make these changes in the same open-edit-save/close pass over each file.
In PowerShell, can the -replace operations be applied to $content as part of the same statement?
Something like this?
$content = Get-Content $file.Fullname | ? {$_ -notmatch "<\?xml[^>]+>",
-replace '("foo.htm", "", ">", "Home", "foo1.htm")', '("http:\\sharepoint.site\home.aspx", "", ">", "Home", "http:\\sharepoint.site\home.aspx")',
-replace 'addButton("show",BTN_TEXT,"Show","","","","",0,0,"","","");', '' }
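Note that ? (Where-Object) only filters lines, so the -replace operations cannot sit inside its script block. They would need their own ForEach-Object stage. A rough sketch of that shape, assuming a few sample lines in place of the Get-Content output (the $lines array and the addHome name are hypothetical):

```powershell
# Hypothetical sample lines standing in for: Get-Content $file.FullName
$lines = @(
    '<?xml version="1.0" encoding="utf-8"?>',
    '<html>',
    'addButton("show",BTN_TEXT,"Show","","","","",0,0,"","","");',
    'addHome("foo.htm", "", ">", "Home", "foo1.htm")',
    '</html>'
)

# Literal strings from the question; they are regex-escaped before use.
$oldHome = '("foo.htm", "", ">", "Home", "foo1.htm")'
$newHome = '("http:\\sharepoint.site\home.aspx", "", ">", "Home", "http:\\sharepoint.site\home.aspx")'
$oldShow = 'addButton("show",BTN_TEXT,"Show","","","","",0,0,"","","");'

$content = $lines |
    Where-Object { $_ -notmatch '<\?xml[^>]+>' } |   # stage 1: drop the XML header line
    ForEach-Object {
        # stage 2: apply both replacements to each remaining line
        $line = $_ -replace [regex]::Escape($oldHome), $newHome
        $line -replace [regex]::Escape($oldShow), ''
    }
```

In the real function, $lines would be replaced by the Get-Content call, and $content would be passed to Set-Content as before. The "Show" line is left behind as an empty line rather than removed.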