[thelist] question about parsing large files

Jackson Yee jyee at vt.edu
Sun Aug 25 15:03:00 CDT 2002


"Cameron McCormick" <lawngnome007 at hotmail.com> wrote in message
news:F21k7IdIpjPmoExxBYh00017ba9 at hotmail.com...
> in one of my projects, I end up with large status reports (these are
> ~40,000 lines) I need to parse each days report against the previous
> days, to find out which items are no longer in the list.

If these status reports are mostly identical from one day to the next and
differ by only a few lines, then I would suggest the diff utility.
Programmers use it to see what has changed in a source file since the last
revision and, together with patch, to create source updates. It is designed
for exactly this case: comparing large text files that differ by only a few
lines. If you run diff with yesterday's report first and today's second, the
lines it marks with "<" appear only in yesterday's file, and those are the
items that are no longer in the list.

See http://www.opennc.org/onlinepubs/7908799/xcu/diff.html for details.
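
If you would rather do the comparison inside a script than shell out to
diff, here is a rough sketch of the same idea in Python using the standard
difflib module. It is only an illustration, not what diff itself does
internally: the function name removed_lines and the sample items are made up
for the example, and I am assuming each report has already been read into a
list of lines.

```python
from difflib import SequenceMatcher


def removed_lines(old_lines, new_lines):
    """Return lines from old_lines that a line-by-line diff reports as gone.

    Hypothetical helper for illustration; both arguments are lists of
    strings, one string per report line.
    """
    # autojunk=False turns off difflib's heuristic that treats very
    # frequent lines as junk once a sequence has 200 or more items.
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    removed = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "delete" means the old lines have no counterpart in the new
        # report; "replace" means they were swapped for different lines.
        if tag in ("delete", "replace"):
            removed.extend(old_lines[i1:i2])
    return removed


# Made-up sample data standing in for two days of reports.
yesterday = ["item A", "item B", "item C", "item D"]
today = ["item A", "item C", "item E"]
print(removed_lines(yesterday, today))
```

Like diff, this compares the files in order, so an item that merely moved
to a different position may be reported as removed. If the order of the
lines does not matter, sorting both reports first avoids that.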

--
Regards,
Jackson Yee
jyee at vt.edu
http://www.jacksonyee.com/




