[thelist] SQL: finding "new" records in a table

Judah McAuley judah at wiredotter.com
Tue Sep 4 11:34:54 CDT 2007


I'm trying to do the heavy lifting at the database level. When I have 
the csv file, I transfer it over the network to another machine. The 
files aren't huge, so I'm not terribly worried about that. But then I 
have to read the csv file in and loop over it, checking to see if each 
is a new appointment (do an insert) or an existing appointment (do an 
update). I'd like to minimize the size of the file I'm looping over for 
efficiency sake and I'd also like to minimize the number of updates I'm 
doing.

The issue with the updates on the destination system is pretty much the 
same issue as on the originating system. I'm going to have some rows 
that are identical for all, say, 12 columns, to what is in the database. 
Those rows don't need an update. But I will have some rows that have one 
or more columns that differ. Those need an update.

What's the most efficient way to take care of that? Right now I'm just 
doing an update if the appointment exists in the destination database. 
That's not bringing the system to its knees or anything. But as it 
grows, I'm worried that too many updates to the same table will cause 
contention locks and slow the system down, so I'd like to minimize the 
number starting now before it gets bad.

Judah

Fred Jones wrote:
>> There isn't anything like an Is_changed flag on the appointments table 
>> and I don't have the option of modifying the table structure or the 
>> process that adds/updates appointments.
> 
> If you can't access the add/update procedure and you can't access the 
> table structure, then indeed the solution which I would imagine would be 
> to compare the last CVS dump with the current data.
> 
> Question is if that will actually save you anything--what are you trying 
> to save? Disk space? CPU cycles? Or just to eliminate redundant CVS 
> data, in which case it makes sense.
> 
> Fred





More information about the thelist mailing list