[thelist] Server Script Reading HTML

David Kaufman david at gigawatt.com
Tue Aug 20 17:15:01 CDT 2002


Chris <axe at suburbia.com.au> queried thusly:
>
> I have to code some server side scripts which have a requirement to
> read some web pages via HTTP and take action depending on what it
> finds. I am thinking perl would be the best way to go however I have
> not done anything like this with perl before, in fact I've only done
> this before with VB. The tricky part is that the script I code must
> be able to give a certain user agent as well as a specific referer.
>
> Can anyone point me in the right direction for help / tutorials for
> this kind of thing.

*pointing_to-> perl's most excellent LWP module:
http://search.cpan.org/author/GAAS/libwww-perl-5.65/lib/LWP.pm

and the relevant perl.com FAQ:
http://www.perldoc.com/perl5.6/pod/perlfaq9.html#How-do-I-fetch-an-HTML-file-

  # simplest solution

  use strict;
  use warnings;
  use LWP::UserAgent;
  use HTTP::Request;

  my $ua = LWP::UserAgent->new;
  $ua->agent('Mozilla/5.0');    # the User-Agent string to send
  my $request = HTTP::Request->new(
    'GET',
    'http://www.url.org/whutevuh'
  );
  my $response = $ua->request($request);

  if ($response->is_success) {
    print $response->content;
  } else {
    print "Bad news: " . $response->message . "\n";
  }

  # fin

now that wasn't so hard, was it ? :-)
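
since the question also asked for a specific referer, here is a minimal
sketch of one way to set it.  build_request is a made-up helper name (not
part of LWP), the referer URL is a placeholder, and the @ARGV check is only
there so the script sends nothing unless you ask it to:

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Build a GET request carrying a caller-chosen Referer header.
# (build_request is a hypothetical helper, not an LWP function.)
sub build_request {
    my ($url, $referer) = @_;
    my $request = HTTP::Request->new(GET => $url);
    $request->header(Referer => $referer);
    return $request;
}

# The agent string is set on the LWP::UserAgent object and goes out
# as the User-Agent header on the requests it sends.
my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/5.0');

my $request = build_request(
    'http://www.url.org/whutevuh',         # placeholder URL from above
    'http://www.example.com/some/page',    # placeholder referer
);

# Pass any command line argument to actually send the request.
if (@ARGV) {
    my $response = $ua->request($request);
    print $response->is_success
        ? $response->content
        : "Bad news: " . $response->status_line . "\n";
}
```

note the header really is spelled "Referer" (one r in the middle); that is
the spelling HTTP itself uses.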

LWP::Simple is even simpler, of course :-)  while the less simple interfaces
to LWP allow making POST requests, submitting Basic-Auth usernames and
passwords, accepting and returning cookies, making SSL requests, doing HTTP
uploads, converting the retrieved HTML into plaintext, and following
30x "Moved" style redirect responses.  but perl does try to keep
simple things simple, and LWP is a good example of that philosophy in
practice.
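
to make a few of those concrete, here is a rough sketch, not a definitive
recipe.  the host, realm, username, password and form field names are all
made up for illustration, fetch_simple is a hypothetical wrapper, and
nothing goes over the wire unless you pass a URL on the command line:

```perl
use strict;
use warnings;
use LWP::Simple ();                  # import nothing; call get() by full name
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# LWP::Simple: one call, returns the page body or undef on failure.
# (fetch_simple is a hypothetical wrapper name.)
sub fetch_simple {
    my ($url) = @_;
    return LWP::Simple::get($url);
}

# A fuller user agent: in-memory cookie jar plus Basic-Auth credentials.
my $ua = LWP::UserAgent->new(
    agent      => 'Mozilla/5.0',
    cookie_jar => {},                # accept and return cookies this session
);
# host:port, realm, username, password (all placeholders)
$ua->credentials('www.example.com:80', 'Members Only', 'someuser', 'secret');

# A form POST; the field names are invented for this example.
my $post = POST 'http://www.example.com/search',
                [ q => 'perl', lang => 'en' ];

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $page = fetch_simple($ARGV[0]);
    print defined $page ? $page : "Could not fetch $ARGV[0]\n";

    my $response = $ua->request($post);
    print $response->status_line, "\n";
}
```

the empty hash for cookie_jar asks LWP for a throwaway in-memory jar; hand
it an HTTP::Cookies object with a file name instead if the cookies need to
survive between runs.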

hth,

-dave

--
sig {
  # stripped by Morton AntiVirus SMTP filter ...it's
  # for your own protection -- you'll thank us later.
}




