[thelist] Strip HTML Tags Regular Expression

Kevin lists at irubin.com
Wed Jul 3 14:06:01 CDT 2002


> Can someone please provide me with the Javascript regular
> expression that will strip out all HTML tags from a page.
>
> Thanks,
> Josh

Here is a quick Perl script that does that:

$html = qq{
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE></TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" >
HELLO
</BODY>
</HTML>
};
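# Remove anything between < and >. [^>] already matches newlines,
# so tags that span several lines are covered without the extra |\n.
$html =~ s/<[^>]*>//g;
# Remove the line breaks left behind.
$html =~ s/\n//g;
print $html;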

It prints:

HELLO
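
Since the question asked for JavaScript, here is roughly how the same two substitutions might look there. Treat it as a minimal sketch: it assumes the page is already in a string, and the function name stripTags is made up for this example. Like the Perl version it is naive, so a ">" inside an attribute value, or HTML comments and script blocks, will likely trip it up.

```javascript
// Hypothetical helper (not from the original post): removes anything that
// looks like <...>, then drops newlines, mirroring the two Perl substitutions.
function stripTags(html) {
  return html
    .replace(/<[^>]*>/g, "") // [^>] also matches newlines, so multi-line tags are covered
    .replace(/\n/g, "");     // same as the second Perl substitution
}

var page = '<HTML>\n<BODY BGCOLOR="#FFFFFF" >\nHELLO\n</BODY>\n</HTML>\n';
console.log(stripTags(page)); // prints HELLO
```

If you want to keep the line breaks in the text, drop the second replace() call.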

--Kevin
lists at irubin.com



