This project uses Python to migrate a SQL Server database to PostgreSQL.
There are some differences between SQL Server and PostgreSQL; the following are the decisions I made while transferring. In order to use Windows authentication for SQL Server, this code must run on a Windows computer.
- Schema name. In SQL Server the default schema is dbo, while in PostgreSQL it is public, so I create tables in public in PostgreSQL from dbo in SQL Server. This has a side effect: you may run into issues when creating foreign keys.
- Table and column name style. In PostgreSQL, identifiers are case sensitive only when double-quoted; an unquoted name is folded to lower case. That means "Table_Name" in double quotes is not the same as unquoted Table_Name, but unquoted Table_Name is the same as table_name. It is therefore safe to convert SQL Server names to lower case. The same applies to column names; the PostgreSQL convention is lower case with underscores.
- NUL bytes. When exporting with bcp from SQL Server, empty varchar or varbinary values are exported as NUL (0x00), which PostgreSQL cannot process, so the NUL bytes must be replaced. I replace them with 0 for binary columns and a space (' ') for varchar columns.
- CSV encoding. SQL Server's bcp does not encode its output as CSV, so you need some logic to encode strings yourself, for example by selecting each column and checking whether it contains special characters.
- Password for psql. When using psql from PostgreSQL, you have to set the PGPASSWORD variable in your user's environment to the PostgreSQL user's password, because psql does not accept a password on the command line.
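The name-lowercasing rule above can be sketched as a small helper; the function name is illustrative:

```python
# Sketch: fold a SQL Server identifier to PostgreSQL's unquoted form by
# stripping [brackets] and lowercasing, matching how PostgreSQL folds
# unquoted names.
def to_pg_identifier(name: str) -> str:
    return name.strip("[]").lower()
```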
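A minimal sketch of the NUL replacement, assuming a character-mode (bcp -c) export file where every stray 0x00 byte is an empty varchar to be replaced with a space; the paths and the replacement byte are placeholders:

```python
# Sketch: strip NUL (0x00) bytes from a bcp-exported file before feeding it
# to PostgreSQL, which rejects NUL in text data.
# Assumption: the file was exported in character mode (bcp -c).
def scrub_nul(src_path: str, dst_path: str, replacement: bytes = b" ") -> int:
    """Replace every 0x00 byte and return how many were replaced."""
    with open(src_path, "rb") as src:
        data = src.read()
    count = data.count(b"\x00")
    with open(dst_path, "wb") as dst:
        dst.write(data.replace(b"\x00", replacement))
    return count
```

For varbinary columns you would pass `replacement=b"0"` instead, matching the substitution described above.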
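One way to implement the CSV encoding logic is to generate a SELECT whose character columns are wrapped in quoting expressions on the SQL Server side. This is only a sketch; the function name, the type list, and the caller-supplied column metadata are assumptions:

```python
# Sketch: build a T-SQL SELECT that CSV-quotes character columns by doubling
# embedded double quotes and wrapping the value in quotes. Column names and
# types are supplied by the caller; the table name is used as-is.
def csv_safe_select(table, columns):
    dq = '"'
    char_types = {"char", "nchar", "varchar", "nvarchar", "text", "ntext"}
    exprs = []
    for name, sqltype in columns:
        if sqltype.lower() in char_types:
            # Emits: '"' + REPLACE(col, '"', '""') + '"'
            exprs.append(
                f"'{dq}' + REPLACE([{name}], '{dq}', '{dq}{dq}') + '{dq}' AS [{name}]")
        else:
            exprs.append(f"[{name}]")
    return "SELECT " + ", ".join(exprs) + f" FROM {table}"
```

The generated query can then be given to bcp with the queryout option instead of exporting the raw table.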
Prerequisites:
- Install the Python package sqlalchemy
- Install the PostgreSQL client on the Windows server.
- Make sure bcp.exe (SQL Server export) and psql.exe (PostgreSQL import) are on the Windows PATH.
If the database is not large, you can copy the data directly using SQLAlchemy alone. If it is a little larger, you can use bcp and psql on the same Windows computer. I had to use bcp because I was transferring 4.8 TB: I exported the large tables with bcp to CSV files on a USB drive and copied them to a Macintosh; after that, I started psql and used \copy to batch-import the data.
Overall, I did this in the following steps:
1. Create all tables in the target
2. Copy all tables directly with SQL (SQLAlchemy)
3. Copy data using bcp and then psql
4. bcp output, data only
5. After the data is copied, add all indexes
6. After the data is copied, similar to 5, but generate a single SQL file that adds all indexes and foreign keys
7. Check that both tables have the same row count
8. Drop all tables (if you want to redo the process)
0. Exit
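The psql import step, with PGPASSWORD set as noted earlier, might be driven from Python like this sketch; the host, user, database, and file paths are placeholders:

```python
# Sketch: invoke psql's client-side \copy for a CSV file, passing the
# password through the PGPASSWORD environment variable, since psql does not
# accept a password on the command line. Connection details are placeholders.
import os
import subprocess

def build_psql_copy_cmd(table: str, csv_path: str) -> list[str]:
    """Build the psql command line for a client-side \\copy import."""
    return [
        "psql",
        "-h", "localhost",   # assumption: PostgreSQL host
        "-U", "postgres",    # assumption: PostgreSQL user
        "-d", "targetdb",    # assumption: target database
        "-c", f"\\copy {table} FROM '{csv_path}' WITH (FORMAT csv)",
    ]

def run_psql_copy(table: str, csv_path: str, password: str) -> None:
    env = dict(os.environ, PGPASSWORD=password)  # psql reads PGPASSWORD
    subprocess.run(build_psql_copy_cmd(table, csv_path), env=env, check=True)
```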