[thelist] Strip HTML Tags Regular Expression

Plunkett, Matt MPlunkett at MSA.com
Wed Jul 3 14:23:00 CDT 2002


Josh,

Be very careful trying to do this.  Recipe 20.6 in the Perl Cookbook
cautions against using the regular expression that several people have
offered, namely:

/<[^>]*>/

JavaScript regular expressions differ slightly from Perl's, but the
principle remains the same.

It will break on comments that include < or > characters, one-line script
segments, tags with line breaks, and so on.
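
To make a few of those failure cases concrete, here is a minimal sketch in JavaScript. The function name stripNaive and the sample strings are made up for illustration; only the pattern itself comes from this thread, with the g flag added so that every match is replaced.

```javascript
// The pattern from this thread, with the g flag so every match is replaced.
const naive = /<[^>]*>/g;

// Hypothetical helper name, used only for this illustration.
function stripNaive(html) {
  return html.replace(naive, "");
}

// A ">" inside a comment ends the match early, so comment text leaks out.
const comment = "a<!-- x > y -->b";
console.log(stripNaive(comment)); // prints: a y -->b

// A ">" inside a quoted attribute value does the same thing.
const attr = '<img alt="a > b">text';
console.log(stripNaive(attr)); // prints:  b">text

// The script tags are removed, but the script source is left in the text.
const script = "<script>alert(1)</script>done";
console.log(stripNaive(script)); // prints: alert(1)done
```

In each case the pattern stops at the first > it sees, because it has no idea whether that character closes a tag or just sits inside a comment, an attribute value, or a script.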

Basically, a regexp is not really the right tool for this problem.  I'm not
sure how you would get around it in JavaScript; in Perl there are a few
modules you can use instead.
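
For what it's worth, one possible direction in JavaScript is to walk the markup once and keep track of where you are, rather than pattern-match it. The following is only a rough sketch under that assumption, not a real HTML parser: stripTags is a made-up name, it does not decode entities such as &amp;, and a bare < in ordinary text will still be treated as the start of a tag.

```javascript
// Hypothetical helper: scans the markup once instead of pattern matching.
function stripTags(html) {
  let out = "";
  let i = 0;
  while (i < html.length) {
    const lt = html.indexOf("<", i);
    if (lt === -1) {
      // No more markup: keep the rest as text.
      out += html.slice(i);
      break;
    }
    out += html.slice(i, lt);

    if (html.startsWith("<!--", lt)) {
      // Comment: skip to the closing "-->", whatever it contains.
      const end = html.indexOf("-->", lt + 4);
      i = end === -1 ? html.length : end + 3;
      continue;
    }

    // Ordinary tag: find the closing ">" that is not inside quotes.
    let j = lt + 1;
    let quote = null;
    while (j < html.length) {
      const c = html[j];
      if (quote) {
        if (c === quote) quote = null;
      } else if (c === '"' || c === "'") {
        quote = c;
      } else if (c === ">") {
        break;
      }
      j++;
    }
    const tag = html.slice(lt, j + 1);
    i = j + 1;

    // Script and style bodies are code, not text: skip to the end tag.
    const m = /^<(script|style)\b/i.exec(tag);
    if (m) {
      const endTag = new RegExp("</" + m[1] + "\\s*>", "ig");
      endTag.lastIndex = i;
      const hit = endTag.exec(html);
      i = hit ? hit.index + hit[0].length : html.length;
    }
  }
  return out;
}

console.log(stripTags("a<!-- x > y -->b"));              // prints: ab
console.log(stripTags('<img alt="a > b">text'));         // prints: text
console.log(stripTags("<script>alert(1)</script>done")); // prints: done
```

Even this handles only the cases mentioned above. Anything beyond them is a good reason to reach for a real parser where one is available.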

HTH,
Matt

	-----Original Message-----
	From:	Feingold Josh S [SMTP:Josh.S.Feingold at irs.gov]
	Sent:	Wednesday, July 03, 2002 11:23 AM
	To:	'thelist at lists.evolt.org'
	Subject:	[thelist] Strip HTML Tags Regular Expression

	Can someone please provide me with the JavaScript regular
	expression that will strip out all HTML tags from a page?

	Thanks,
	Josh
