[thelist] php parse

Dan McCullough dan.mccullough at gmail.com
Wed Sep 14 15:50:29 CDT 2005


I'm trying to build an import script that reads a file formatted in a
certain way.  Every record is laid out the same way:

CCCC ---- The Journal Opinion 8/18/05

Title: blah blah blah
Category:  blah1, blah2, blah3
Day: 1

long boring news story

CCCC

CCCC ---- The Journal Opinion 8/18/05

Title: blah blah blah
Category:  blah1, blah2, blah3
Day: 0

long boring news story

CCCC

so on and so forth.
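
To make the layout concrete, here is a rough sketch of how a single
record could be broken into fields once it has been isolated.  The
function name parseRecord() and the array keys are made up for
illustration, and it assumes the Title, Category and Day lines always
appear in the form shown above.

```php
<?php
// Sketch only: parseRecord() is a hypothetical helper, not part of my script.
// Assumes $record is the text between the opening "CCCC ---- ..." line
// and the closing "CCCC" line.
function parseRecord($record)
{
    $fields = array(
        'title'      => '',
        'categories' => array(),
        'day'        => null,
        'body'       => '',
    );

    if (preg_match('/^Title:[ \t]*(.*)$/m', $record, $m)) {
        $fields['title'] = trim($m[1]);
    }
    if (preg_match('/^Category:[ \t]*(.*)$/m', $record, $m)) {
        // "blah1, blah2, blah3" becomes array('blah1', 'blah2', 'blah3')
        $fields['categories'] = array_map('trim', explode(',', $m[1]));
    }
    if (preg_match('/^Day:[ \t]*(\d+)/m', $record, $m)) {
        $fields['day'] = (int) $m[1];
        // Everything after the first Day line is treated as the story text.
        $parts = preg_split('/^Day:.*$/m', $record, 2);
        $fields['body'] = trim($parts[1]);
    }
    return $fields;
}

$sample = "Title: blah blah blah\nCategory:  blah1, blah2, blah3\nDay: 1\n\nlong boring news story\n";
$fields = parseRecord($sample);
echo $fields['title'], "\n"; // prints "blah blah blah"
```

The array that comes back could then be handed to whatever does the
table insert.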

I'm having trouble getting the content out consistently.  Sometimes I
get one record and that's it; other times I get the whole page with
only the top "CCCC ---- The Journal Opinion 8/18/05" missing.  The goal
is to insert this into a table for use later.  I have permission to use
the content, so there's no issue with that.

So here is what I have.

CODE ++++++++++++++++++++++++++++++++++++++++++++++++
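<?php
// import.php
// 8/23/05
include("config.php");
include("functions.php");

// Markers for the first record.  $end assumes the blank line before
// the closing CCCC contains a single space.
$start = "CCCC ---- The Journal Opinion 8/18/05";
$end = "<br> <br>CCCC<br>";
$buffer = "";

$dataFile = @fopen("http://www.somedomain.com/newscontent.php", "r");

if ($dataFile) {
    // Read the whole remote page into $buffer.
    while (!feof($dataFile)) {
        $buffer .= fgets($dataFile, 255);
    }
    fclose($dataFile);

    $buffer = str_replace("\n", "<br>", $buffer);
    $start_position = strpos($buffer, $start);
    // Look for the end marker only after the start marker.
    $end_position = ($start_position === false)
        ? false
        : strpos($buffer, $end, $start_position);

    // strpos() returns false when a marker is missing, so check before cutting.
    if ($start_position !== false && $end_position !== false) {
        $length = $end_position + strlen($end) - $start_position;
        echo substr($buffer, $start_position, $length);
    } else {
        echo "Start or end marker not found.";
    }
}

?>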
CODE ++++++++++++++++++++++++++++++++++++++++++++++++
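
One alternative to the fixed start/end strings above, offered only as
a sketch: match every record with a regular expression instead of
cutting out a single substring.  extractRecords() and the array keys
are hypothetical names, and the pattern assumes each record opens with
a "CCCC ----" line and closes with a line holding only "CCCC".

```php
<?php
// Sketch only: extractRecords() is a hypothetical helper.
function extractRecords($text)
{
    // Normalize line endings so the pattern only has to deal with \n.
    $text = str_replace(array("\r\n", "\r"), "\n", $text);

    // m: ^ and $ match at line boundaries; s: . also matches newlines.
    // The lazy (.*?) stops at the first closing "CCCC" line after each header.
    $pattern = '/^CCCC ----[ \t]*([^\n]*?)[ \t]*\n(.*?)^CCCC[ \t]*$/ms';

    $records = array();
    if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $records[] = array(
                'source'  => $match[1],       // rest of the "CCCC ----" line
                'content' => trim($match[2]), // Title/Category/Day lines plus the story
            );
        }
    }
    return $records;
}

$sample = "CCCC ---- The Journal Opinion 8/18/05\n\nTitle: first\nDay: 1\n\nstory one\n\nCCCC\n\n"
        . "CCCC ---- The Journal Opinion 8/18/05\n\nTitle: second\nDay: 0\n\nstory two\n\nCCCC\n";
$records = extractRecords($sample);
echo count($records), "\n"; // prints 2
```

Because the pattern does not hard-code the paper name or date, it
should not depend on every record carrying the same header text, and a
record that is missing its closing CCCC line is simply skipped.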

help?

