Using ora2pg to do a kind of backup

Disclaimer: ora2pg is an amazing tool, but is not supposed to be used as a backup tool!
In this article I’m going to show you how I decided to implement a kind of Oracle-to-PostgreSQL backup by means of ora2pg.

It all started as a simple need: migrate an Oracle database to PostgreSQL to do some experiments.
Therefore I fired up an ora2pg project, and started from there in order to do the migration.
End of the story.
But then I was asked to migrate again the same database, because in the meantime something changed.
And then again, and again.
I’m not saying I was asked to keep the database synchronized, but to sometime load an updated amount of data (and structures) from Oracle to PostgreSQL.
As lazy as I am, after a couple of request I was producing a simple shell script to automate the job, at least about running ora2pg. Yes, this could be less trivial than you think, since ora2pg relies on the Oracle instaclient to be installed (with all the environment set), and Perl to be ready with all the DBD::Oracle, DBI and other stuff in the right place. And this is a little complicated on my machines because I tend to experiment, and so I have a lot of different stuff installed, so I have to fire up the right Perl, with the right modules, and the right environment (I do use perlbrew, in the case you are wondering). In other words, there was some setup work necessary before I could run ora2pg, and that was a perfect candidate for a real shell script.
Then, the number of the databases to do this work on became two, and this was a call for a parametric script…you get the point!
Last but not least, I was not sure about when the migration would happen and when I was asked to load a new bunch of stuff into PostgreSQL, and since my memory is lazier than me, I not always do remember all the required steps to load the extracted part of ora2pg into our beloved database.
And therefore I decided to write a simple shell script to allow me to:

extract data from a customizable Oracle database, assuming ora2pg project is configured;
place the data and structures in a well defined space on my storage;
create a compact and clear psql script to load the extraction into PostgreSQL (yes, I know ora2pg can do this automagically with a PostgreSQL connection, but I have to do it offline).

Let’s start first from how I do add new databases to my script:

export ORACLE_HOME=/opt/oracle/instantclient_18_3
source ~/perl5/perlbrew/etc/bashrc
DATE_DIR=`date '+%Y-%m-%d'`
BACKUP_ROOT=/backup
ORACLE_PG_TEMPALTE="my_oracle_template"

do_ora2pg ora-srv ORADB1 /backup/ora2pg/ORADB1
do_ora2pg ora-srv ORADB2  /backup/ora2pg/ORADB2

The initial part is used to set up Perl and Oracle Instant Client.
The do_ora2pg are the lines that define a single extraction; the arguments to the do_ora2pg shell function are:

Oracle host name;
Oracle schema to which I need to connect;
path to the ora2pg project.

What does the do_ora2pg shell function do?
Here it is:

do_ora2pg()
{
    local SERVER_NAME=$1
    local ORACLE_SCHEMA=$2
    local ORACLE_PROJECT_FOLDER=$3

    local BACKUP_DIR="${BACKUP_ROOT}/${SERVER_NAME}/${DATE_DIR}/ora2pg/${ORACLE_SCHEMA}"
    local PG_DATABASE=$( echo $ORACLE_SCHEMA | tr '[:upper:]' '[:lower:]' )

    if [ ! -d "$BACKUP_DIR" ]; then
        mkdir -p "$BACKUP_DIR"
    else
        rm $BACKUP_DIR/*.sql > /dev/null 2>&1
    fi

    echo -e "\n{ $ORACLE_SCHEMA }\n\t=> PostgreSQL Dump in [$BACKUP_DIR]\n"

    cd $BACKUP_DIR


    TYPES="TABLE VIEW MVIEW INSERT SEQUENCE FUNCTION PROCEDURE TRIGGER"
    counter=1

    cat <<EOF > all.sql
-- Automatic PostgreSQL reload from Oracle
-- $ORACLE_SCHEMA
-- $BACKUP_DIR

\set ON_ERROR_STOP 1
\set QUIET         1

\echo Reload of Oracle schema $PG_DATABASE

DROP DATABASE IF EXISTS $PG_DATABASE;
CREATE DATABASE $PG_DATABASE WITH TEMPLATE $ORACLE_PG_TEMPLATE
\c $PG_DATABASE


CREATE EXTENSION IF NOT EXISTS orafce;


\echo Connected to $PG_DATABASE
\echo Starting the loading batch
\echo

EOF

    for t in $TYPES
    do
        output=$(printf "%02d" $counter)-$t.sql
        echo "$t => $output "
        echo -e "\n\\\echo Batch to load : $output ..." >> all.sql
        ora2pg --c ${ORACLE_PROJECT_FOLDER}/config/ora2pg.conf  -t $t -o $output >> ora2pg.log 2>&1
        if [ $? -eq 0 ]; then
            echo "OK"
            echo -e "\\\i $output" >> all.sql
        else
            echo "KO"
            echo  -e "\\\echo NOT LOADING!" >> all.sql
        fi

        counter=$(( counter + 1 ))
    done

    
}

The function initially creates a BACKUP_DIR that is named after a well defined root, and after the date the backup is took on (I assume to do no more than one per day). The idea is that the backup directory will result in something like /backup/ora-srv/2021-06-11/db1. After a quick check about the existance or not of the backup directory, the script creates a file named all.sql in such directory, placing some psql directives into such file.
Then the script executes ora2pg for the objects I care about, producing a different file name suffix for every kind of invocation, for example 01-TABLES for table structures (schema). If the dump of the objects type is fine, the \i inclusion of that file is placed into all.sql, otherwise an alert is inserted.
The result all.sql is a file like the following:

-- Automatic PostgreSQL reload from Oracle
-- ORADB1
-- /backup/DATI/ora-srv/2021-06-10/ora2pg/ORADB1

\set ON_ERROR_STOP 1
\set QUIET         1

\echo Reload of Oracle schema db1

DROP DATABASE IF EXISTS db1;
CREATE DATABASE db1 WITH TEMPLATE my_oracle_template;
\c db1


CREATE EXTENSION IF NOT EXISTS orafce;


\echo Connected to db1
\echo Starting the loading batch
\echo


\echo Batch to load : 01-TABLE.sql ...
\i 01-TABLE.sql

\echo Batch to load : 02-VIEW.sql ...
\i 02-VIEW.sql

\echo Batch to load : 03-MVIEW.sql ...
\i 03-MVIEW.sql

\echo Batch to load : 04-INSERT.sql ...
\i 04-INSERT.sql

\echo Batch to load : 05-SEQUENCE.sql ...
\i 05-SEQUENCE.sql

\echo Batch to load : 06-FUNCTION.sql ...
\i 06-FUNCTION.sql

\echo Batch to load : 07-PROCEDURE.sql ...
\i 07-PROCEDURE.sql

\echo Batch to load : 08-TRIGGER.sql ...
\i 08-TRIGGER.sql

Therefore, the only thing I have to do when I want to migrate the Oracle content into PostgreSQL, is to launch a command like:

% psql -U luca template1 < all.sql

and wait. This is something easy enough for me to remember even if I have not sleep well!
I’ve experimented with this for a few weeks now, and it is something that is really useful to my use case.
Please note that I create the extension orafce in the reloaded database, because we do use some functions that are dumped and reloaded well by this extension. For that reason, the database on the PostgreSQL side is created by means of a specific template that have the extension already installed.

Conclusions

ora2pg is an amazing tool, that can be used and abused in different ways including doing backups!
I’m sure there are smarter ways to achieve my same aim, and I will report back if I learn about them, so please let me know if you have suggestions!

The article Using ora2pg to do a kind of backup has been posted by Luca Ferrari on June 11, 2021

Tags: postgresql , planet-postgresql-org