Strip HTML tags from a string, Ruby edition
require 'hpricot'
page = Hpricot("<b>some marked up <i>text</i></b>")
puts page.to_plain_text
Interestingly, the Hpricot FAQ says:
Q: How do I strip all HTML tags from a page?
A: Use regex replace!
A2: The regex is ok, but will break in some cases, even with valid html. Try the to_plain_text or inner_text methods instead.
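To see what "will break in some cases, even with valid html" can look like, here is a minimal sketch using only core Ruby. The helper name strip_tags_naive and the pattern /<[^>]*>/ are assumptions chosen for illustration; the FAQ does not say which regex it has in mind. A quoted attribute value may legally contain ">", and that is enough to confuse this pattern:

```ruby
# Hypothetical helper (not part of Hpricot): a common naive regex approach
# that deletes everything from "<" up to the next ">".
def strip_tags_naive(html)
  html.gsub(/<[^>]*>/, '')
end

simple = "<b>some marked up <i>text</i></b>"
tricky = '<a title="a > b">link</a>'  # ">" inside a quoted attribute value

puts strip_tags_naive(simple)  # prints: some marked up text
puts strip_tags_naive(tricky)  # prints:  b">link  (part of the attribute leaks through)
```

In the second case the regex treats the ">" inside the title attribute as the end of the tag, so the rest of the attribute is left behind in the output. A parser-based method such as to_plain_text or inner_text avoids this because it works on the parsed document rather than on raw characters.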
