
Heterogeneous Database Sync

Hi,

As part of an Oracle to PostgreSQL migration, I needed to implement a trigger on Oracle that syncs its data to PostgreSQL. I tried it with a simple table, as shown below, which will hopefully be helpful to others.

Follow this link to configure the heterogeneous dblink to PostgreSQL.
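For context, a DG4ODBC link to PostgreSQL is typically wired up in three places. This is a minimal sketch, assuming an ODBC DSN named "pgdsn" and a 64-bit unixODBC driver manager; the DSN, paths, link name, and credentials are placeholders for your environment:

```
# $ORACLE_HOME/hs/admin/initpglink.ora
HS_FDS_CONNECT_INFO = pgdsn
HS_FDS_SHAREABLE_NAME = /usr/lib64/libodbc.so

# listener.ora: add a SID_DESC with SID_NAME = pglink and PROGRAM = dg4odbc
# tnsnames.ora: add an entry named pglink pointing at that listener, with (HS = OK)

# Then, on the Oracle side (placeholder credentials):
#   CREATE DATABASE LINK pglink CONNECT TO "pguser" IDENTIFIED BY "pgpass" USING 'pglink';
```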

I believe the approach below works effectively for Oracle tables that have a primary key.
If a table has no primary key, the generated UPDATE and DELETE statements can match multiple rows in PostgreSQL, which leads to correctness and performance issues.

ORACLE
CREATE TABLE test(t INT PRIMARY KEY);

CREATE OR REPLACE TRIGGER testref AFTER INSERT OR UPDATE OR DELETE ON test
FOR EACH ROW
DECLARE
  PRAGMA AUTONOMOUS_TRANSACTION;
  c NUMBER;
  n NUMBER;
BEGIN
  -- Open a pass-through cursor on the remote PostgreSQL side.
  c := DBMS_HS_PASSTHROUGH.OPEN_CURSOR@pglink;
  IF INSERTING THEN
    DBMS_HS_PASSTHROUGH.PARSE@pglink(c, 'INSERT INTO test VALUES('||:NEW.t||')');
    n := DBMS_HS_PASSTHROUGH.EXECUTE_NON_QUERY@pglink(c);
  ELSIF DELETING THEN
    DBMS_HS_PASSTHROUGH.PARSE@pglink(c, 'DELETE FROM test WHERE t='||:OLD.t);
    n := DBMS_HS_PASSTHROUGH.EXECUTE_NON_QUERY@pglink(c);
  ELSIF UPDATING THEN
    DBMS_HS_PASSTHROUGH.PARSE@pglink(c, 'UPDATE test SET t='||:NEW.t||' WHERE t='||:OLD.t);
    n := DBMS_HS_PASSTHROUGH.EXECUTE_NON_QUERY@pglink(c);
  END IF;
  DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@pglink(c);
  -- Commit the autonomous transaction, which also commits on the remote side.
  COMMIT;
END;
/
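The statement the trigger sends over the link for each DML event can be sketched as follows. Python is used purely for illustration, and the build_stmt helper is hypothetical, not part of Oracle's API; it just mirrors the trigger's string concatenation:

```python
# Mirror of the trigger's branching: build the SQL text that would be
# parsed through DBMS_HS_PASSTHROUGH for each DML event.
def build_stmt(event, old=None, new=None):
    if event == "INSERT":
        return f"INSERT INTO test VALUES({new})"
    elif event == "DELETE":
        return f"DELETE FROM test WHERE t={old}"
    elif event == "UPDATE":
        # The old primary-key value locates the remote row to change.
        return f"UPDATE test SET t={new} WHERE t={old}"
    raise ValueError(f"unknown event: {event}")

print(build_stmt("UPDATE", old=1, new=10))
# → UPDATE test SET t=10 WHERE t=1
```

Note that concatenating values into SQL text like this is only safe for numeric columns; string columns would need quoting, or bind variables via DBMS_HS_PASSTHROUGH.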
SQL> INSERT INTO test VALUES(1);
SQL> INSERT INTO test VALUES(2);
SQL> INSERT INTO test VALUES(3);


SQL> DELETE FROM test WHERE t=2;

SQL> UPDATE test SET t=10 WHERE t=1;
SQL> COMMIT;
Commit complete.

PostgreSQL

CREATE TABLE test(t INT PRIMARY KEY);
postgres=# SELECT * FROM test;
 t
----
  3
 10
(2 rows)

NOTE
Oracle's DG4ODBC does not support two-phase commit with PostgreSQL. In the case above, if we issue ROLLBACK in Oracle, the already-pushed data remains in PostgreSQL, which leads to inconsistency. To work around this, I believe we have to copy the data asynchronously by scheduling a job in Oracle, as below.

CREATE SEQUENCE test_seq ORDER;
CREATE TABLE test_backup(SEQ INT, EVT VARCHAR(2), t INT, PRIMARY KEY(SEQ, EVT, t));
-- Trigger that records changes to the test table into test_backup.
CREATE OR REPLACE TRIGGER testref AFTER INSERT OR UPDATE OR DELETE ON test
FOR EACH ROW
BEGIN
  IF INSERTING THEN
    INSERT INTO test_backup VALUES(test_seq.nextval, 'I', :NEW.t);
  ELSIF DELETING THEN
    INSERT INTO test_backup VALUES(test_seq.nextval, 'D', :OLD.t);
  ELSIF UPDATING THEN
    -- An update is logged as a delete of the old value ('UD')
    -- followed by an insert of the new value ('UI').
    INSERT INTO test_backup VALUES(test_seq.nextval, 'UD', :OLD.t);
    INSERT INTO test_backup VALUES(test_seq.nextval, 'UI', :NEW.t);
  END IF;
END;
/

-- Procedure that syncs the test_backup change log to the PostgreSQL test table.
CREATE OR REPLACE PROCEDURE syncoratopg
IS
  PRAGMA AUTONOMOUS_TRANSACTION;
  c NUMBER;
  n NUMBER;
  TYPE syncid IS TABLE OF INTEGER;
  syncnum syncid := syncid();
  cnt INT := 0;
BEGIN
  -- Replay the change log on PostgreSQL in sequence order,
  -- remembering which entries were applied.
  FOR rec IN (SELECT * FROM test_backup ORDER BY seq ASC)
  LOOP
    syncnum.EXTEND;
    cnt := cnt + 1;
    syncnum(cnt) := rec.seq;
    c := DBMS_HS_PASSTHROUGH.OPEN_CURSOR@topg;
    IF rec.evt = 'I' OR rec.evt = 'UI' THEN
      DBMS_HS_PASSTHROUGH.PARSE@topg(c, 'INSERT INTO test VALUES('||rec.t||')');
    ELSIF rec.evt = 'D' OR rec.evt = 'UD' THEN
      DBMS_HS_PASSTHROUGH.PARSE@topg(c, 'DELETE FROM test WHERE t='||rec.t);
    END IF;
    n := DBMS_HS_PASSTHROUGH.EXECUTE_NON_QUERY@topg(c);
    DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@topg(c);
  END LOOP;
  COMMIT;

  -- Purge the change-log entries that were applied above.
  FOR i IN 1 .. syncnum.COUNT
  LOOP
    DELETE FROM test_backup WHERE seq = syncnum(i);
  END LOOP;
  COMMIT;
EXCEPTION
  WHEN OTHERS THEN
    RAISE_APPLICATION_ERROR(-20000, 'Fatal error: '||SQLERRM);
END;
/

-- Schedule a job in Oracle that calls the above sync procedure every minute.
BEGIN
  DBMS_JOB.isubmit(
    job       => 113,
    what      => 'BEGIN syncoratopg; END;',
    next_date => SYSDATE,
    interval  => 'SYSDATE + 1/(24*60) /* 1 minute */');
  COMMIT;
END;
/
-- Inserting Sample Data In Oracle
INSERT INTO test VALUES(-1);
INSERT INTO test VALUES(-2);
INSERT INTO test VALUES(-3);

UPDATE test SET t=-20 WHERE t=-2;
DELETE FROM test WHERE t=-1;
INSERT INTO test VALUES(-2);
COMMIT;

-- PostgreSQL
postgres=# SELECT * FROM test;
 t
----
  -3
 -20
  -2
(3 rows)
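The change-log semantics can be sanity-checked with a small model: each DML becomes one or two log events (an UPDATE becomes 'UD' then 'UI'), and replaying them strictly in SEQ order must reproduce the final table state. This is a Python sketch of that reasoning, not the production code; the PostgreSQL table is modelled as a plain set:

```python
# Model the trigger: turn a list of DML operations into (seq, evt, t) events.
def log_events(ops):
    events, seq = [], 0
    for op in ops:
        kind = op[0]
        if kind == "INSERT":
            seq += 1; events.append((seq, "I", op[1]))
        elif kind == "DELETE":
            seq += 1; events.append((seq, "D", op[1]))
        elif kind == "UPDATE":          # op = ("UPDATE", old, new)
            seq += 1; events.append((seq, "UD", op[1]))
            seq += 1; events.append((seq, "UI", op[2]))
    return events

# Model the sync job: apply the events in SEQ order to the remote "table".
def replay(events):
    table = set()
    for _, evt, t in sorted(events):
        if evt in ("I", "UI"):
            table.add(t)
        else:                           # 'D' or 'UD'
            table.discard(t)
    return table

# The sample workload from the Oracle session above.
ops = [("INSERT", -1), ("INSERT", -2), ("INSERT", -3),
       ("UPDATE", -2, -20), ("DELETE", -1), ("INSERT", -2)]
print(sorted(replay(log_events(ops))))
# → [-20, -3, -2]
```

The result matches the three rows (-3, -20, -2) seen in the PostgreSQL output above. If the events were applied out of SEQ order, the 'UD'/'UI' pair could invert and leave the wrong value, which is why the sequence is created with ORDER.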

As usual, your inputs are welcome.

 --Dinesh Kumar

Comments

  1. Hmm...can't test for myself, but I wonder what'll happen if instead of COMMIT, you issue a ROLLBACK? Since you're using pragma autonomous_transaction, which usually means a "logical bug" in the code, you'll end with rows at PG side and no rows at Oracle. Is this intentionally in your case?

  2. This comment has been removed by the author.

Yes, you are right. I used the PRAGMA since DG4ODBC does not support two-phase commit with the PostgreSQL database. I am not sure whether this is the behavior with other databases as well; in my case, I could not verify whether two-phase commit works with other databases. Anyhow, I used the AFTER DML event, which guarantees that the event was successful from the application side. I forgot to add this point as a note in my blog, and will do it now.

    Thanks for pointing this.

  4. Afaik, two phase commit is supported (and reserved) for non-free Oracle transparent gateways that you need to license (TG for MSSQL, Sybase, Informix),....so no luck for ODBC and PostgreSQL.

  5. Autonomous transactions from a trigger is a big red flag screaming "my design is broken". Ever heard of "restart" or read Tom Kyte ranting about that in 2008 : http://www.oracle.com/technetwork/issue-archive/2008/08-sep/o58asktom-101055.html ?

Regarding the update from Oracle to PG: the other way around, we could also let PostgreSQL query the "database change log" you called "test_backup" through a foreign data wrapper (e.g. http://laurenz.github.io/oracle_fdw/ ). Either way, one should handle the occasional situation when one of the 2 databases is offline.

    That said, you have a big bug : you haven't declared the sequence "ordered". By default ( http://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_6015.htm#SQLRF01314 ) they are unordered, which could lead to subtle bugs if the modifications are applied out-of-order in the postgres database...

  6. I completely agree with you.

    Regarding AUTONOMOUS: It's the one which violates the ACID properties. If we want the SYNC replication, then i believe, we have to go with this. I have also included ASYNC methodology without using AUTONOMOUS in triggers.

    Regarding Oracle_fdw: Sure, we can do that. But, i have just tried the ways what we have in Oracle.

    Regarding ORDER: I think, Numeric sequence always guarantees it's ORDER. I am not sure how that works in RAC, but let me change the DDL statement.

    Thanks for your inputs.


