[thelist] Counting words on a site

matt newell matt at sweetillusions.org
Thu Mar 28 11:53:01 CST 2002


load the document in homesite:
select all / right click / selection / strip tags

that should leave you with just the text on the page, which you can then
paste into anything that has a word count.

i'm presently doing the same thing with a website going into german and
french. so far so good ... as long as you don't have a million pages to
work with.
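
if you do have a lot of pages, the same strip-tags-then-count idea can be
scripted. below is a rough sketch in python (not asp, and not how homesite
does it internally) using only the standard library. the names
TextExtractor and count_words are made up for this example, and splitting
on whitespace is only an estimate: it slightly over-counts when an inline
tag sits in the middle of a word.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping <script>/<style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []       # pieces of visible text, in document order
        self._skip_depth = 0   # > 0 while inside a <script> or <style> element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words(html):
    """Strip the tags from an HTML string and count whitespace-separated words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # Join with a space so text from adjacent block elements does not fuse.
    text = " ".join(parser.chunks)
    return len(text.split())


if __name__ == "__main__":
    sample = (
        "<html><head><title>Hallo</title>"
        "<style>p { color: red }</style></head>"
        "<body><p>Guten Tag, Welt!</p>"
        "<script>var x = 1;</script></body></html>"
    )
    print(count_words(sample))
```

to cover a whole site you would read each saved page from disk, call
count_words on it, and add up the results. note that the page title is
counted here too, since a translator would normally have to translate it.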

hth,

// matt



On Thu, 28 Mar 2002, Shaun Anderson wrote:

> I've been asked to see if I can come up with a way of counting the number of
> words on our German site.  They're going to send the English equivalent to
> be translated into Japanese, and are trying to budget for it.
>
> Has anybody ever tried to do this before?  It seems like it'll be pretty
> tricky because I'd need to count words after they had been parsed by the
> browser.  Then I was going to count every space that occurs by itself.  I
> know it's not perfect, but how off do you think it will be?
>
> So the problem becomes "How do I get the browser-parsed text only?"
>
> Could I use a regular expression of some kind?
>
> I'll be coding in ASP if it's possible.  If somebody knows of a program
> that'll do it I'd be happy to use that.
>
> Thanks,
> Shaun Anderson



