Grepping in XML and other structured files

From WTFwiki

Revision as of 11:39, 17 November 2008

Sometimes you need to grep in an HTML file, an XML file, or some other kind of file that is not line-based. This is usually hard or painful with your standard grepping tools. Luckily there's a wonderful tool called sgrep which does exactly this sort of thing. You'll find it in apt if you're using Ubuntu, and probably in ports if you're using a BSD. If neither applies, you're being really difficult, but its homepage might be [1].

Here's an example to show how you might yank all the VirtualHost blocks out of an httpd.conf, skipping commented-out lines first:

grep -v '^#' httpd.conf | sgrep -i '"<virtualhost".."</virtualhost>"'

Yay!
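If you want to try the VirtualHost extraction without touching a real config, here is a minimal sketch. The file name sample-httpd.conf and its contents are made up for illustration, and the -o '%r\n' output style (print each matched region followed by a newline) is an assumption based on the sgrep manual, so check your version's man page if the output looks odd.

```shell
# Hypothetical sample config, not a real httpd.conf
cat > sample-httpd.conf <<'EOF'
# <VirtualHost *:80> this line is commented out
ServerName main.example.com
<VirtualHost *:80>
    ServerName a.example.com
</VirtualHost>
Listen 80
<VirtualHost *:80>
    ServerName b.example.com
</VirtualHost>
EOF

# Drop comment lines, then print every region that runs from an opening
# "<virtualhost" to the next "</virtualhost>", ignoring case (-i).
# -o '%r\n' is assumed to put a newline after each matched region.
grep -v '^#' sample-httpd.conf |
    sgrep -i -o '%r\n' '"<virtualhost".."</virtualhost>"'
```

Only the two VirtualHost blocks should come out; the ServerName and Listen lines outside them are not part of any matched region.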

Also here's a small tip for dealing with ugly XML files: libxml2 (not Expat) comes with a tool called xmllint which is able to reformat files thus:

xmllint --format - <uglymess.xml
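Reformatting also makes plain old grep usable again, since each element ends up on its own line. A small sketch, where the contents written to uglymess.xml are invented sample data:

```shell
# Hypothetical one-line XML file standing in for the real ugly mess
printf '<hosts><host name="a"/><host name="b"/></hosts>' > uglymess.xml

# Pretty-print it, then count the lines holding a <host .../> element.
# The trailing space in the pattern keeps <hosts> from matching.
xmllint --format - < uglymess.xml | grep -c '<host '
```

Without the xmllint step the whole document is a single line, so grep -c could never report more than 1.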

That'll be $5.