Wednesday, April 30, 2003

Working with PostgreSQL

I'm working on migrating a SQL Server database to PostgreSQL. So far it has gone smooth, I did run into two issues today however.

I wrote some scripts that generate the DDL for PostgreSQL from my SQL Server metadata, the scripts also grab all the data, and create INSERT statements (I should probably be using COPY statements). So I have a 5MB file that contains the insert statements, and there was an error somewhere in there. I managed to figure out (after a lot of grep's) which line the error was on, it wasn't obvious to psql because it was a termination issue with a single quote. Here's what the query looked like:

INSERT INTO someTable (someColumn)
VALUES ('This is the content \')
Single quotes are escaped in PostgreSQL with either another single quote ('') or with a backslash. Turns out one of the columns in one of the tables ended with a \ so PostgreSQL just escaped the ending '

The next problem I ran into was with the bit datatype. I use a bit in SQL Server to represent booleans, so I assumed that in PostgreSQL a bit would have the same semantics. The bit datatype in PostgreSQL can actually hold a string of bits, not just 0,1, or NULL like in SQL Server. So a query like this:

SELECT * FROM news WHERE archived = 0
Won't work on PostgreSQL server if archived is of the bit datatype. You would have to do something like this:
SELECT * FROM news WHERE archived = B'0'
Which cast's the string '0' to a binary bit value.

I'd rather not change any SQL code in my applications, so I will be mapping my bit types to PostgreSQL's boolean type. It's actually a handy type that allows for more than just 0/1. You can use the following values for true:

And the following values for false:
This is nice because it corresponds to ColdFusion's boolean datatype.

Another way to resolve this problem would be to use a numeric type, such as integer. Many developers prefer to use integer datatypes over a bit or boolean, because performance is better. Using an integer does require more storage space however.

UPDATE the boolean type requires single quotes around 1 or 0, so in order to keep SQL code consistant I will be using the integer datatype.