[thelist] php parse

Anthony Ettinger apwebdesign at yahoo.com
Wed Sep 14 18:43:16 CDT 2005


CCCC ---- The Journal Opinion 8/18/05

Title: blah blah blah
Category:  blah1, blah2, blah3
Day: 1

long boring news story

CCCC

I would write a regex to parse out the content between
"CCCC ----" and "CCCC", perhaps starting with a split on
"CCCC ----" and then using a regex to pull the data into
their own variables, i.e. $title, $category, $day, $body.
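
Something along these lines, as a rough sketch only. It assumes the
feed is plain text laid out exactly like the sample above (no HTML or
<br> tags), that the labels are always "Title:", "Category:" and
"Day:" in that order, and that PHP 7 or later is available for the
type declarations. The function name parse_records() and the array
keys are made up for illustration; they are not from your script.

```php
<?php
// Hypothetical helper: split the feed on the record header, then pull
// each field out of the chunk with a single regex.
function parse_records(string $text): array
{
    // Normalise line endings so the pattern only has to handle "\n".
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    // Every record starts with "CCCC ----". The first chunk is whatever
    // comes before the first header, so throw it away.
    $chunks = explode("CCCC ----", $text);
    array_shift($chunks);

    // Header line, three labelled fields, then the body up to the
    // closing line that contains only "CCCC".
    $pattern = '/^[ \t]*(?P<source>[^\n]*)\n\s*'
             . 'Title:[ \t]*(?P<title>[^\n]*)\n'
             . 'Category:[ \t]*(?P<category>[^\n]*)\n'
             . 'Day:[ \t]*(?P<day>\d+)[ \t]*\n'
             . '(?P<body>.*?)\n[ \t]*CCCC[ \t]*(?:\n|$)/s';

    $records = [];
    foreach ($chunks as $chunk) {
        if (!preg_match($pattern, $chunk, $m)) {
            continue; // chunk does not follow the assumed layout
        }
        $records[] = [
            'source'   => trim($m['source']),
            'title'    => trim($m['title']),
            'category' => array_map('trim', explode(',', $m['category'])),
            'day'      => (int) $m['day'],
            'body'     => trim($m['body']),
        ];
    }
    return $records;
}

// Usage with a one-record sample in the format from the original post.
$sample = <<<TXT
CCCC ---- The Journal Opinion 8/18/05

Title: blah blah blah
Category:  blah1, blah2, blah3
Day: 1

long boring news story

CCCC
TXT;

foreach (parse_records($sample) as $record) {
    echo $record['title'], ' (day ', $record['day'], ")\n";
}
```

Splitting first means each regex only ever sees one record, so a
missing closing "CCCC" in one record should not swallow the ones after
it. Chunks that do not match are skipped silently here; you may prefer
to log them before inserting the rest into your table.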





--- Dan McCullough <dan.mccullough at gmail.com> wrote:

> I'm trying to build an import script to take a file
> that is formatted a certain way.  The beginning of
> each record is set up similarly.
> 
> CCCC ---- The Journal Opinion 8/18/05
> 
> Title: blah blah blah
> Category:  blah1, blah2, blah3
> Day: 1
> 
> long boring news story
> 
> CCCC
> 
> CCCC ---- The Journal Opinion 8/18/05
> 
> Title: blah blah blah
> Category:  blah1, blah2, blah3
> Day: 0
> 
> long boring news story
> 
> CCCC
> 
> so on and so forth.
> 
> I seem to be having trouble getting the content out
> on a consistent basis. I might get one record and
> that's it; otherwise I might get the whole page with
> only the top "CCCC ---- The Journal Opinion 8/18/05"
> missing.  The goal is to insert this into a table for
> use later. I have permission to use the content, so
> there's no issue with that.
> 
> So here is what I have.
> 
> CODE
> ++++++++++++++++++++++++++++++++++++++++++++++++
> <?php
> // import.php
> // 8/23/05
> include("config.php");
> include("functions.php");
> 
> $dataFile = @fopen("http://www.somedomain.com/newscontent.php", "r");
> 
> if ($dataFile) {
>     while (!feof($dataFile)) {
>         $buffer .= fgets($dataFile, 255);
>         $start = "CCCC ---- The Journal Opinion 8/18/05";
>         $end = "<br> <br>CCCC<br>";
>     }
>     fclose($dataFile);
>     $buffer = str_replace("\n", "<br>", $buffer);
>     $start_position = strpos($buffer, $start);
>     $end_position = strpos($buffer, $end) + strlen($end);
>     $length = $end_position - $start_position;
>     $buffer = substr($buffer, $start_position, $length);
>     echo $buffer;
> }
> 
> ?>
> CODE
> ++++++++++++++++++++++++++++++++++++++++++++++++
> 
> help?


Anthony Ettinger
ph: (408) 656-2473
blog: http://www.chovy.com
