[thelist] character encoding nightmare
anthony at baratta.com
Fri Mar 17 00:31:26 UTC 2017
If you are not concerned about being able to search the data, encoding everything with Base64 before you put it into the database might help.
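A rough sketch of the Base64 idea, assuming the text arrives as a UTF-8 PHP string. The variable names and the sample text are made up for illustration, and the actual insert/select calls are left out:

```php
<?php
// Hypothetical form value: a curly apostrophe (UTF-8 bytes E2 80 99) plus simple HTML.
$raw = "I don\xE2\x80\x99t see <b>how</b> it\xE2\x80\x99d help";

// Encode before the INSERT. The result is plain ASCII, so neither the
// connection charset nor the column collation has anything to convert.
$stored = base64_encode($raw);

// Decode after the SELECT to get the original bytes back.
// Strict mode (second argument) returns false on invalid input.
$restored = base64_decode($stored, true);

var_dump($restored === $raw);
```

The trade-off is that the stored value is about a third larger than the original and cannot be searched with LIKE or full-text queries.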
But it sounds like you need to look at the PHP version of each service, and the DB setup / versions too. Something is out of whack there. Comparing the base PHP config (php.ini) files might help: see if something in there is causing the insert into the second DB to be converted an extra time.
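One way to make that comparison concrete is to run the same small script on both servers and compare the output. This is only a sketch, and which settings actually matter for your setup is a guess on my part:

```php
<?php
// Run this on both servers and compare the output line by line.
$report = [
    'php_version'     => PHP_VERSION,
    // Since PHP 5.6, htmlentities() and html_entity_decode() use this
    // charset when no encoding argument is passed.
    'default_charset' => ini_get('default_charset'),
    'mbstring_loaded' => extension_loaded('mbstring') ? 'yes' : 'no',
];

foreach ($report as $name => $value) {
    echo $name, ': ', $value, "\n";
}
```

The PHP version matters because the default encoding of htmlentities() changed from ISO-8859-1 to UTF-8 in PHP 5.4, so two servers on either side of that release can treat the same string differently. If the databases are MySQL (an assumption, the thread does not say), running SHOW VARIABLES LIKE 'character_set%' on each gives the matching comparison on the DB side.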
If you can provide an example of a raw string, the result of each of your encoding and conversion steps, and what the final text looks like when pulled back out of the DB, we might be able to spot the hiccup from the changes being made.
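Something like the following would produce that kind of report. It is a sketch with a made-up sample string, and it assumes the data is UTF-8; swap in a real string that comes back garbled and add a stage for whatever your scripts actually do:

```php
<?php
// Hypothetical sample: a curly apostrophe (UTF-8 bytes E2 80 99) plus a simple tag.
$raw = "I don\xE2\x80\x99t see <b>how</b>";

$stages = ['raw' => $raw];
$stages['encoded'] = htmlentities($raw, ENT_QUOTES, 'UTF-8');
// A second pass, to show what an "extra" conversion looks like.
$stages['encoded_twice'] = htmlentities($stages['encoded'], ENT_QUOTES, 'UTF-8');
$stages['decoded'] = html_entity_decode($stages['encoded'], ENT_QUOTES, 'UTF-8');

foreach ($stages as $label => $text) {
    // bin2hex() shows the actual bytes, which survive email and terminals intact.
    echo $label, ': ', $text, "\n  hex: ", bin2hex($text), "\n";
}
```

In the double-encoded stage the entities themselves get escaped, so the apostrophe shows up as &amp;rsquo; instead of &rsquo;. Seeing that pattern in what comes back out of the second DB would point to an extra htmlentities() pass somewhere.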
> From: "Garth Hagerman" <hagerman at mcn.org>
> To: thelist at lists.evolt.org
> Date: 03/16/17 16:59
> Subject: Re: [thelist] character encoding nightmare
> >Are you sending the second page the "sanitized" strings, or the raw strings from the form post? Are you running the htmlentities() function on the second page?
> I did not know about the sanitize_string filter. The base setup for this operation just sends the same entity-ized string to DB2 that was inserted into DB1. I don’t see how it’d help. There are some simple html tags in these strings, so my scripts de-entityize the < and >. If I understand the sanitize filter correctly, I’d lose them entirely.
> I’ve tried a number of variations to no avail. I’ve tried sending the raw strings to site2 and entity-izing them there. Same results.
> I’ve used html_entity_decode() on the string on site2, and then re-entity-ized it. Same results.