[thelist] XSL, UTF and MySQL

Fred Jones fredthejonester at gmail.com
Sun Mar 2 12:34:16 CST 2008

I am using third-party code (this Drupal module: 
http://drupal.org/project/import_html ) to process HTML and insert the 
relevant parts into a MySQL DB. The code uses an XSL transformation to 
process the HTML files and extract the contents of the STYLE and BODY tags. This all 
works well.

The issue is that I need to surround the STYLE and BODY code with STYLE 
and DIV tags to make the HTML usable. I tried to insert <style> in the 
XSL file but none of these work:


The first generates an error because it's read as a tag. The other two are 
taken literally, and not parsed. I put each of the above texts inside an 
<xsl:text> element.
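
One likely explanation for the error, sketched under assumptions: an XSL stylesheet is itself an XML document, and <xsl:text> may hold only character data, so a bare <style> inside it breaks well-formedness. A matched <style>...</style> pair written directly in the template, outside <xsl:text>, is a literal result element, which XSLT 1.0 copies to the output. The Python sketch below only checks whether two stylesheet variants parse as XML; it does not run the transformation. Both stylesheets, including their select expressions, are hypothetical and not taken from the import_html module.

```python
import xml.etree.ElementTree as ET

# Hypothetical stylesheet: a bare <style> inside <xsl:text>, never closed.
BROKEN = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:text><style></xsl:text>
    <xsl:value-of select="//style"/>
  </xsl:template>
</xsl:stylesheet>"""

# Hypothetical stylesheet: <style> and <div> as literal result elements
# that wrap the extracted content.
FIXED = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <style type="text/css"><xsl:value-of select="//style"/></style>
    <div><xsl:copy-of select="//body/node()"/></div>
  </xsl:template>
</xsl:stylesheet>"""


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("broken parses:", is_well_formed(BROKEN))
    print("fixed parses:", is_well_formed(FIXED))
```

If the wrapper tags really must be emitted as text, XSLT 1.0 also allows disable-output-escaping="yes" on <xsl:text>, but support for it depends on the processor and on how the result is serialized.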

I couldn't figure out how to get the output to be <style>, so instead I 
wrote a PHP script to look through the DB, extract the code, do a 
preg_replace on a placeholder (__STYLE__, for example) to turn it into 
<style>, and then update the DB. This actually works fine.

The only PROBLEM is that there are UTF characters (Hebrew to be 
specific) in the data and when I run my PHP script, it converts them to 
question marks. I have PHP 5 and MySQL 5 on my Win 2K system.
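
A common cause of this symptom, though nothing here confirms it is the actual cause, is that the connection between PHP and MySQL uses a single-byte character set such as latin1 while the stored data is UTF-8. Hebrew letters have no latin1 equivalent, so a lossy conversion substitutes "?" for each one. The Python sketch below reproduces only that conversion; the function name and the sample string are made up for illustration.

```python
# Hypothetical sample: the Hebrew word "shalom", written with escapes.
HEBREW = "\u05e9\u05dc\u05d5\u05dd"


def through_connection(text, charset):
    """Simulate passing text through a connection that converts it to
    `charset`, replacing every unrepresentable character with '?'."""
    return text.encode(charset, errors="replace").decode(charset)


if __name__ == "__main__":
    sample = "<style>" + HEBREW + "</style>"
    # latin-1 cannot represent Hebrew letters, so they are lost.
    print(through_connection(sample, "latin-1"))
    # UTF-8 can represent them, so the text survives unchanged.
    print(through_connection(sample, "utf-8") == sample)
```

If this is what is happening, the usual remedy is to set the connection character set right after connecting and before any read or write, for example by sending the SQL statement SET NAMES utf8, or by calling mysql_set_charset() or mysqli_set_charset() where the PHP version provides them.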

Can anyone suggest a way either to get <style> to come out of the XSL or 
to update the DB via PHP without losing my UTF chars?

