Skip to main content

N-Node Mutlimaster Replication With Bucardo...!

Our team recently got  a problem, which is to solve the N-Node multi master replication in PostgreSQL.

We all know that, there are some other db engines like Postgres-XC which works in this way. But, we don't have any tool available in PostgreSQL, except Bucardo.

Bucardo is the nice solution for 2-Nodes. Is there a way we can exceed this limitation from 2 to N..?

As an initial step on this, I have done with 3 Nodes, which I believe, we can extend this upto N. { I might be wrong here.}

Please follow the below steps to set up the 1 - 1 multi master replication.

1. Follow the below steps to get all the pre-requisites for the Bucardo.

yum install perl-DBIx-Safe
or
apt-get install libdbix-safe-perl

Install the below components from CPAN.

DBI
DBD::Pg
Test::Simple
boolean (Bucardo 5.0 and higher)

Download the latest tarball from here.

tar xvfz Bucardo-4.4.8.tar.gz
cd Bucardo-4.4.8
perl Makefile.PL
make
sudo make install
2. We need to create plperl extension in db. For this, download the required active perl to set up or simply do as below.
apt-get install postgresql-plperl-9.3
or
yum install postgresql-plperl-9.3

Make a copy of Old $PGLIBPATH/plperl.so
Move the new plperl.so to $PGLIBPATH/plperl.so
3. plperl extension.
Create extension plperl;
4. Create 3 databases like "node1", "node2", "node3".
CREATE DATABASE node1;
CREATE DATABASE node2;
CREATE DATABASE node3;
5. Execute below statements on 2 databases (node1, node2).
CREATE TABLE test1(t INT PRIMARY KEY);
INSERT INTO test1 VALUES(generate_series(1, 10));
6. Install Bucardo catalog database using the below command.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl install
Postgres version is: 9.3
Attempting to create and populate the bucardo database and schema
Database creation is complete
Connecting to database 'bucardo' as user 'bucardo'
Updated configuration setting "piddir"
Installation is now complete.
7. Adding databases, those will be part of mutli-master replication.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add db db1 dbname="node1"
Added database "db1"
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add db db2 dbname="node2"
Added database "db2"
8. Creating a herd of db1 tables.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add all tables db=db1 herd=db1_herd
Creating herd: db1_herd
New tables added: 1
Already added: 0
9. Creating a sync of this herd from db1->db2.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add sync db1_to_db2 type=pushdelta source=db1_herd targetdb=db2
Added sync "db1_to_db2"
10. Creating a herd of db2 tables.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add all tables db=db2 herd=db2_herd
Creating herd: db2_herd
New tables added: 1
Already added: 0
11. Creating a sync of this herd from db2->db1.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add sync db2_to_db1 type=pushdelta source=db2_herd targetdb=db1
Added sync "db2_to_db2"
12. Start the Bucardo process. Default log location for the Bucardo instance is SYSLOG.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl start
Checking for existing processes
Removing /tmp/fullstopbucardo
Starting Bucardo

postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ps -ef|grep bucardo
postgres 2740 1128 0 21:26 ? 00:00:00 postgres: bucardo bucardo [local] idle
postgres 2743 1128 0 21:26 ? 00:00:00 postgres: bucardo node1 [local] idle
13. Check the DML activities between these two nodes.
node1=# INSERT INTO test1 VALUES(-1);
INSERT 0 1
node1=# \c node2
node2=# SELECT * FROM test1 WHERE t=-1;
 t
---
-1
(1 row)

node2=# INSERT INTO test1 VALUES(-2)
INSERT 0 1
node2=# \c node1
node1=# SELECT * FROM test1 WHERE t=-2;
 t
---
-2
(1 row)
14. Check the status from the below command.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl status
Days back: 3  User: bucardo  Database: bucardo  PID of Bucardo MCP: 2739
Name       Type  State PID  Last_good Time  I/U/D Last_bad Time
==========+=====+=====+====+=========+=====+=====+========+====
db1_to_db2| P   |idle |2746|7m49s    |0s   |1/0/0|unknown |    
db2_to_db1| P   |idle |2745|6m11s    |0s   |1/0/0|unknown |    
 
It seems everything is perfect, and data is replicating from node1 to node2 and vice-versa. Now, let's try to add a new node, "node3" as part of this mutli-master replication.

15. Add new replication database to Bucardo.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add db db3 dbname="node3"
Added database "db3"
16. Dump and restore of test1 table from either node1 or node2.

17. Create a herd of db3 tables
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add all tables db=db3 herd=db3_herd
Creating herd: db3_herd
New tables added: 1
Already added: 0
18. Create a new sync from db2->db3, db3->db2.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add sync db2_to_db3 type=pushdelta source=db2_herd targetdb=db3
Added sync "db2_to_db3"
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add sync db3_to_db2 type=pushdelta source=db3_herd targetdb=db2
Added sync "db3_to_db2"

19. Restart the Bucardo to check for the latest node status.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl stop
Creating /tmp/fullstopbucardo ... Done

postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl start
Checking for existing processes
Removing /tmp/fullstopbucardo
Starting Bucardo
 

postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl status
Days back: 3  User: bucardo  Database: bucardo  PID of Bucardo MCP: 2553
Name       Type  State PID  Last_good Time  I/U/D Last_bad Time
==========+=====+=====+====+=========+=====+=====+========+====
db1_to_db2| P   |idle |2563|1s       |0s   |0/0/0|unknown |    
db2_to_db1| P   |idle |2560|1s       |0s   |0/0/0|unknown |    
db2_to_db3| P   |idle |2561|0s       |0s   |1/0/1|unknown |    
db3_to_db2| P   |idle |2562|0s       |0s   |0/0/0|unknown |    
20. Here is the small hack, which makes you to embed this db3 into the multi master replication. As Burcardo replication system works based on triggers, we need to make all the triggers what it created on nodes to "ENABLE ALWAYS" for each table.
Node1
-=-=-
node1=# \d test1
     Table "public.test1"
 Column |  Type   | Modifiers 
--------+---------+-----------
 t      | integer | not null
Indexes:
    "test1_pkey" PRIMARY KEY, btree (t)
Triggers:
    bucardo_add_delta AFTER INSERT OR DELETE OR UPDATE ON test1 FOR EACH ROW EXECUTE PROCEDURE bucardo.bucardo_add_delta_t()
    bucardo_triggerkick_db1_to_db2 AFTER INSERT OR DELETE OR UPDATE OR TRUNCATE ON test1 FOR EACH STATEMENT EXECUTE PROCEDURE bucardo.bucardo_triggerkick_db1_to_db2()

node1=# ALTER TABLE test1 ENABLE ALWAYS TRIGGER bucardo_add_delta;
ALTER TABLE
node1=# ALTER TABLE test1 ENABLE ALWAYS TRIGGER bucardo_triggerkick_db1_to_db2;
ALTER TABLE

Do "ENABLE ALWAYS" these two triggers on this table. Similarly, do the same activity on all nodes.

Testing Multi-Master replication.
node3=# INSERT INTO test1 VALUES(-1010);
INSERT 0 1
node3=# \c node2
node2=# SELECT * FROM test1 WHERE t=-1010;
   t   
-------
 -1010
(1 row)

node2=# \c node1
node1=# SELECT * FROM test1 WHERE t=-1010;
   t   
-------
 -1010
(1 row)


node2=# INSERT INTO test1 VALUES(-2020);  
INSERT 0 1
node2=# \c node3
node3=# SELECT * FROM test1 WHERE t=-2020;
   t   
-------
 -2020
(1 row)

node3=# \c node1
node1=# SELECT * FROM test1 WHERE t=-2020;
   t   
-------
 -2020
(1 row)
Seems Bucardo is doing 3 node mulit master replication. Thanks to Bucardo Team. :)

--Dinesh Kumar

Comments

  1. Hi thanks for nice tutorial. i have tried to setup this and it worked. but i have some problem while testing this with real scenario.

    1) it takes to long time to write records in another node and it cause duplication error sometimes.
    ERROR: duplicate key value violates unique constraint "ip_map_id_pkey"

    2) i have almost 100 tables so i need to alter triggers for each table manually or is there any way to make this process faster?

    Please suggest.

    ReplyDelete
  2. Hi,

    As Bucardo is a Asynchronous multi-master replication, where i believe we can get this kind of row conflicts.

    Sorry, i didn't get your question, about "alter triggers". Could you elaborate it please.

    ReplyDelete

Post a Comment

Popular posts from this blog

Parallel Operations With pl/pgSQL

Hi, I am pretty sure that, there will be a right heading for this post. For now, i am going with this. If you could suggest me proper heading, i will update it :-) OK. let me explain the situation. Then will let you know what i am trying to do here, and how i did it. Situation here is, We have a table, which we need to run update on “R” no.of records. The update query is using some joins to get the desired result, and do update the table.  To process these “R” no.of records, it is taking “H” no.of hours. That too, it’s giving load on the production server. So, we planned to run this UPDATE as batch process.  Per a batch process, we took “N” no.or records. To process this batch UPDATE, it is taking “S” no.of seconds. With the above batch process, production server is pretty stable, and doing great. So, we planned to run these Batch updates parallel.  I mean, “K” sessions, running different record UPDATEs. Of-course, we can also increase the Batch size here.  But

How To Send E-Mail From PostgreSQL

Hi , If you want to send E-Mails from PostgreSQL, then use the below Python 3.2 Script as below. I have used ActivePython 3.2 with PostgreSQL 9.1 for sending E-Mails from PostgreSQL. If you want to configure the Python 3.2 with PostgreSQL 9.1 then, please refer the below steps. http://manojadinesh.blogspot.in/2012/06/fatal-python-error-pyinitialize-unable.html Once, your Python 3.2 successful then follow the below steps to send an e-mail. Step 1 ===== postgres=# CREATE OR REPLACE FUNCTION public.send_email(_from Text,_password Text,smtp Text,port INT,receiver text, subject text, send_message text) RETURNS TEXT  LANGUAGE plpython3u AS $function$ import smtplib sender = _from receivers = receiver message = ("From: %s\nTo: %s\nSubject: %s\n\n %s"  % (_from,receiver,subject,send_message)) try:   smtpObj = smtplib.SMTP(smtp,port)   smtpObj.starttls()   smtpObj.login(_from, _password)   smtpObj.sendmail(sender, receivers,message)   print ('Successf

::Pipelined in Oracle as well in PostgreSQL::

Pipelined Table Functions:- [ORACLE] =========================== If you want to return multiple rows to the calling environment, then piplined table functions is prefred. It will increase the dbperformance as well. Ex:- Step 1: ----------- CREATE TABLE EMP(EMPNO INT,ENAME VARCHAR2(10),SAL INT); Step 2: ----------- Insert sample data. Step 3: ----------- Create an object for the row type casting. CREATE OR REPLACE TYPE emp_row AS OBJECT ( empno INT, ename VARCHAR2(20), SAL INT ); Step 4: ----------- Create a Return Type for the pipelined function. CREATE OR REPLACE TYPE emp_table_type AS TABLE OF emp_row; Step 5: ----------- CREATE OR REPLACE FUNCTION emp_pipe_function RETURN emp_table_type PIPELINED IS BEGIN FOR rec in (select * from emp) LOOP PIPE ROW (emp_row(rec.empno,rec.ename,rec.sal)); END LOOP; RETURN; END; Step 6: ---------- SQL> select * from table(emp_pipe_function); EMPNO ENAME SAL ---------- ----