Grepping in XML and other structured files

From WTFwiki

Revision as of 10:44, 17 November 2008

Sometimes you need to grep in an HTML file, an XML file, or some other kind of file that isn't line-based. This is usually hard or painful with your standard grepping tools. Luckily there's a wonderful tool called sgrep which does exactly this sort of thing. You'll find it in apt if you're using Ubuntu, probably in ports if you're using a BSD, and if neither, you're being really difficult, but its homepage might be [1].

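Here's an example to show how you might yank all the VirtualHost blocks out of an httpd.conf. The grep strips comment lines first; sgrep then prints every region running from an opening tag to the next closing tag, ignoring case:

  grep -v '^#' httpd.conf | sgrep -i '"<virtualhost".."</virtualhost>"'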

Yay!
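If sgrep isn't installed, roughly the same selection can be sketched with plain awk. This only works because httpd.conf happens to keep each tag on its own line, so treat it as a stand-in and not as what sgrep does internally. The sample config below is made up for illustration.

```shell
# Hypothetical sample config, written to a temp file for illustration only.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# global settings
ServerName example.test
<VirtualHost *:80>
    DocumentRoot /var/www/a
</VirtualHost>
Listen 80
<virtualhost *:8080>
    DocumentRoot /var/www/b
</virtualhost>
EOF

# Drop comment lines, then print every line from an opening
# <VirtualHost ...> tag through its closing tag, ignoring case.
vhosts=$(grep -v '^#' "$conf" | awk '
    tolower($0) ~ /<virtualhost/    { inside = 1 }
    inside                          { print }
    tolower($0) ~ /<\/virtualhost>/ { inside = 0 }
')
printf '%s\n' "$vhosts"
rm -f "$conf"
```

Unlike sgrep, this falls over as soon as an opening and closing tag share a line, which is exactly the case the region-based tools are for.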

Also in apt I found a tool called xgrep (homepage probably http://wohlberg.net/public/software/xml/xgrep/). It's less neat than sgrep, but it can work better when you have well-formed XML files (which httpd.conf certainly isn't), because it lets you specify tag ancestry and such.
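To give a feel for what "specifying tag ancestry" buys you, here is a minimal sketch. xgrep's own option syntax isn't shown on this page, so the sketch swaps in xmllint's --xpath option as a stand-in, assuming your xmllint is recent enough to have it. The sample document and the variable names are made up.

```shell
# Hypothetical sample document for illustration only.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<doc><title>Top</title><section><title>Inner</title></section></doc>
EOF

# Select only the <title> whose parent is <section>;
# the top-level <title> is skipped because its ancestry differs.
if command -v xmllint >/dev/null 2>&1; then
    inner=$(xmllint --xpath '/doc/section/title/text()' "$xml")
    printf '%s\n' "$inner"
fi
rm -f "$xml"
```

A plain grep for "title" would hit both elements; the path expression is what tells them apart.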

And a small tip for dealing with ugly XML files: libxml2 comes with a tool called xmllint which can re-indent a file like this (the "-" tells it to read from standard input):

  xmllint --format - <uglymess.xml

That'll be $5.