Wednesday, June 9, 2010

migrating from latin1 to utf8 in mysql : part 2

As promised the second part of this article.

to locate the double encoded characters, we made use of an user defined library LIB_MYSQLUDF_PREG

this function allows you to return the position where a certain pattern occurs.
let me explain with this example :

SELECT * FROM Address WHERE
MOD(preg_position('(C{1}[2-9]{1}[8-9A-B]{1}[0-9A-F]{1})',hex(line6)),2) = 1;

this will return us the records where the starting position of a hex string which starts with C2 till C9 and as a second part from 80 till BF is uneven.

so the string ß => hex c39f would be filtered out, because indeed the condition is met

So far how to locate the different cases the next part will tell how we did the whole migration to minimize the downtime.

Monday, June 7, 2010

migrating from latin1 to utf8 in mysql : part 1

Some time ago we had to migrate a latin1 mysql 5.0.37 database to it's utf8 counterpart, due to the addition of new languages to the application (russian, polish, ...)

In the latin1 database we had 3 cases :
1. legal latin1 characters
2. html entities generated by the browser
3. double encoded characters which infact are utf8 characters :
eg ß => if you put this in hex you'll see that this is c3 9f => this
converts to the german ß in utf8

the problem was on how to locate those occurences :
* the first case was not a problem since mysqldump would convert them properly
* the second case was not difficult we used the the function of regexp to locate
these eg. name REGEXP '&#[0-9]+;'
* the third case was much more difficult to find i'll dedicate a 2nd post to this
later

how to enable tracing in OWB 11g R2

It took me a while to figure out what went wrong in my previous post
Sometimes the gui doesn't tell you everything and you need to enable tracing. in OWB 11gR2 this is done by creating a file in the admin directory of the owb installation.


in the $ORACLE_HOME/owb/bin/admin directory you must create following file

owbclient.logging.properties
the contents of this file should look like this :

console.messageFormat={1,time}: {2}
file.messageFormat=[{3} ({4})] {1,time}: <{0}> {2}
handlers=java.util.logging.FileHandler,java.util.logging.ConsoleHandler
#for windows
java.util.logging.FileHandler.pattern=c:/temp/owbclient.log
# name the file whatever you want

#for unix/linux
java.util.logging.FileHandler.pattern=/temp/owbclient.log

# Excessive debug, do not use day to day
.level = DEBUG_ALL
java.util.logging.FileHandler.level=ALL
java.util.logging.ConsoleHandler.level=DEBUG_ALL

Friday, June 4, 2010

connecting to mysql from warehouse builder




I found an excellent article of David Allan explaining how to connect to a mysql database from Oracle Warehouse builder 11G r2.
His solution worked like charm however when I tried to display data out of the imported tables,Owb returned me an error.
When I started to digg deeper and enabled tracing for the owb client I noticed that OWB put double quotes around the query :
Sql Stmt =SELECT * FROM "Address"
mysql didn't appreciate this. I fixed this by adding following to the connection URL of the location :


jdbc:mysql://10.1.1.100/mysql_db?capitalizeTypeNames=FALSE&sessionVariables=sql_mode=ANSI_QUOTES

after which the error dissappeared and I could see the data in the table

there are more parameters that could be given to the open source jdbc driver i used

my first post

Hello all,

welcome to my blog.
I already a long time wanted to start with a blog now with the advent of migrating from mysql to oracle and the implementation of a DWH i finally found some interesting subjects to talk about.

The goal is to talk about Warehouse Builder, mysql, oracle and Oracle BI.