See the questions I asked in the comment above.
Assuming you are using GNU sed, and that you are trying to add a closing / to your <img> and <input> tags to make them XML-compatible, replace the sed expression in your command with this one (run with -n, as in the test below), and it should do the trick:

    '1h;1!H;${;g;s/\(img\|input\)\( [^>]*[^/]\)>/\1\2\/>/g;p;}'
Here it is on a simple test file (the SO syntax highlighter does odd things to it):
    $ cat test.html
    This is an <img tag> without closing slash.
    Here is an <img tag /> with closing slash.
    This is an <input tag > without closing slash.
    And here one <input attrib="1"
    > that spans multiple lines.
    Finally one <input attrib="1" /> with closing slash.

    $ sed -n '1h;1!H;${;g;s/\(img\|input\)\( [^>]*[^/]\)>/\1\2\/>/g;p;}' test.html
    This is an <img tag/> without closing slash.
    Here is an <img tag /> with closing slash.
    This is an <input tag /> without closing slash.
    And here one <input attrib="1"
    /> that spans multiple lines.
    Finally one <input attrib="1" /> with closing slash.
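The GNU sed manual covers the syntax and explains how the hold-space buffering makes a multiline search/replace possible: 1h copies the first line into the hold space, 1!H appends every later line to it, and on the last line ($) g copies the accumulated text back into the pattern space, where the substitution runs once over the whole file before p prints the result.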
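If it helps to see the multiline search/replace outside of sed, here is a rough Python equivalent. It is only a sketch of the same idea: hold the whole file in one string, then run a single substitution over it. The function name close_void_tags and the sample text are made up for illustration; the pattern is the sed one rewritten in Python regex syntax.

```python
import re

# Same pattern as the sed expression, in Python regex syntax.
# A negated class such as [^>] also matches newlines, so a tag that is
# split across lines is matched once the whole text is in one string.
TAG_RE = re.compile(r"(img|input)( [^>]*[^/])>")


def close_void_tags(html: str) -> str:
    """Add a closing slash to img/input tags that do not already have one.

    Hypothetical helper name, used only for this illustration.
    """
    return TAG_RE.sub(r"\1\2/>", html)


# Made-up sample input: one unclosed tag, one tag split across lines,
# and one tag that is already closed and must be left alone.
sample = (
    "An <img tag> here.\n"
    'And <input attrib="1"\n'
    "> across lines.\n"
    "Done <img tag /> already.\n"
)
print(close_void_tags(sample), end="")
```

Like the sed version, this is plain pattern matching, not HTML parsing, so it only suits simple, predictable markup.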
Alternatively, you can use something like Tidy, which is designed to sanitize bad HTML. That is what I would do if I were doing anything more complex than a couple of simple search/replaces. The command-line parameters get complicated quickly, so it is usually better to write a script in your scripting language of choice (Python, Perl) that calls libtidy and sets whatever parameters you need.