[thelist] CF: List Manipulations
Frank
framar at interlog.com
Fri Oct 19 17:00:24 CDT 2001
>: I've created a little utility to import a tab/return/comma delimited
>: text file into a database. I've got two questions.
>:
>: 1) I want to make it a little more robust, so that if the list is
>: somehow inconsistent that it returns an error. Can someone suggest
>: how I may go about verifying the consistency of a list?
Clarification on the above. The utility serves to allow the user to
upload a text file that has been exported by something like an
emailer or spreadsheet. The reason I'm doing this is that the target
user is rather unsophisticated.
> 1. Does each delimited item in the import file represent
> a single record?
> 2. Does each row of data in the import file represent a
> record where each otherwise delimited item represents a
> field in that record?
Yes. One list item is one email address, to be imported into a
table where each email address is one row.
> 3. Are any values quoted?
That would be determined by what may export it. I would think that
comma delimited would, but it's not a given.
> 4. If so, and the field and/or record delimiters appear
> within the quotes, are they included within the field?
No. An email cannot contain commas, returns or tabs, so that would
preclude them.
> 5. If row=record, is there a set number of fields that
> you expect in each row?
One field, one row.
> 6. Are you able to determine ahead of time the number of
> expected records?
No, that's variable.
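
Given the answers above (one address per item, values possibly quoted, no fixed count), a consistency check could be sketched roughly like this. It is Python rather than CF, only to show the logic. The name `check_items` and the rules it applies (exactly one '@', no whitespace inside an item) are my own assumptions, not a complete test for a valid address.

```python
def check_items(items):
    """Return (clean, errors).

    clean  -- items that passed, with surrounding whitespace/quotes removed
    errors -- (position, raw item, reason) tuples for items that failed
    """
    clean, errors = [], []
    for pos, raw in enumerate(items, start=1):
        # Exporters may or may not wrap values in quotes, so strip them.
        item = raw.strip().strip('"\'')
        if not item:
            errors.append((pos, raw, "empty item"))
        elif item.count("@") != 1:
            errors.append((pos, raw, "expected exactly one @"))
        elif any(ch.isspace() for ch in item):
            errors.append((pos, raw, "whitespace inside item"))
        else:
            clean.append(item)
    return clean, errors
```

If `errors` comes back non-empty, the import can be refused and the offending positions shown to the user instead of being written to the table.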
> : 2) I've been trying to figure out how to get my app to identify
> : what delimits a list. Imagine I have a list of emails, delimited
> : by commas. From a machine point of view '@', '.' or ',' might be
> : the delimiters. Now if one of the emails happens to be missing the
> : '@' or the '.', one can see how quickly a problem could arise.
What I mean by this more specifically: A human can by sight recognize
an email, even if it's malformed. But how on earth does one go about
instructing a machine to determine that pattern when there is a
possibility of its being malformed? If it was a given that all
emails would be perfect I could simply use a simple regex like this:
[A-Za-z0-9]+@[A-Za-z0-9]+\.[A-Za-z]+
They might not be, though, and someone might have abc@domain with no
.* suffix. That might cause the next email to appear as
abc@domaindef@domain.com, and thus I would have a corrupt record.
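
To illustrate the problem, here is the same simple pattern (with the dot escaped) applied to a whole item instead of being searched for inside it. This is a Python sketch; `EMAIL_RE` and `looks_like_email` are names made up for the example, and the pattern is deliberately as naive as the one above, so it rejects perfectly good addresses with dots or hyphens in them.

```python
import re

# The simple pattern from above, with the dot escaped so it only matches '.'
EMAIL_RE = re.compile(r"[A-Za-z0-9]+@[A-Za-z0-9]+\.[A-Za-z]+")

def looks_like_email(item):
    # fullmatch requires the WHOLE item to fit the pattern,
    # so a run-together item is rejected rather than partly matched.
    return EMAIL_RE.fullmatch(item) is not None
```

With this, both `abc@domain` and `abc@domaindef@domain.com` are rejected. An unanchored `EMAIL_RE.search(...)` on the run-together item would instead pick out `domaindef@domain.com`, which is exactly the kind of silent corruption I'm worried about.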
This is what I mean by ensuring that the list data is consistent. So
I would like to be able to figure out how the list is delimited
without the necessity for the user to know, and then to ensure that
what I import is consistent.
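
One way this might be approached, sketched in Python rather than CF: since an address can't contain a comma, tab or return, try each of those as the delimiter and keep whichever yields the most plausible-looking items. Everything here is an assumption made for illustration: the names `CANDIDATES`, `plausible` and `split_list`, the candidate set, and the "exactly one '@'" rule. On a tie the first candidate wins.

```python
CANDIDATES = [",", "\t", "\n"]  # comma, tab, return


def plausible(item):
    # One '@', no leftover delimiter characters, no spaces.
    return (item.count("@") == 1
            and not any(c in item for c in CANDIDATES)
            and " " not in item)


def split_list(text):
    """Guess the delimiter; return (delimiter, good_items, bad_items)."""
    # Normalise Windows/Mac line endings so a "return" is always "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n").strip()
    best = None
    for delim in CANDIDATES:
        items = [part.strip().strip('"') for part in text.split(delim)]
        items = [item for item in items if item]
        good = [i for i in items if plausible(i)]
        bad = [i for i in items if not plausible(i)]
        # Keep the candidate that produces the most plausible items.
        if best is None or len(good) > len(best[1]):
            best = (delim, good, bad)
    return best
```

A non-empty `bad_items` means the list is inconsistent, so the utility can return an error without the user ever having to say how the file was delimited.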
--
Our best destiny, as planetary cohabitants, is the development
of what has been called "species consciousness" - something over
and above nationalisms, blocs, religions, ethnicities.
Frank Marion Framar Studios
frank at framarstudios.com http://www.framarstudios.com