Skip to main content

N-Node Mutlimaster Replication With Bucardo...!

Our team recently got  a problem, which is to solve the N-Node multi master replication in PostgreSQL.

We all know that, there are some other db engines like Postgres-XC which works in this way. But, we don't have any tool available in PostgreSQL, except Bucardo.

Bucardo is the nice solution for 2-Nodes. Is there a way we can exceed this limitation from 2 to N..?

As an initial step on this, I have done with 3 Nodes, which I believe, we can extend this upto N. { I might be wrong here.}

Please follow the below steps to set up the 1 - 1 multi master replication.

1. Follow the below steps to get all the pre-requisites for the Bucardo.

yum install perl-DBIx-Safe
or
apt-get install libdbix-safe-perl

Install the below components from CPAN.

DBI
DBD::Pg
Test::Simple
boolean (Bucardo 5.0 and higher)

Download the latest tarball from here.

tar xvfz Bucardo-4.4.8.tar.gz
cd Bucardo-4.4.8
perl Makefile.PL
make
sudo make install
2. We need to create plperl extension in db. For this, download the required active perl to set up or simply do as below.
apt-get install postgresql-plperl-9.3
or
yum install postgresql-plperl-9.3

Make a copy of Old $PGLIBPATH/plperl.so
Move the new plperl.so to $PGLIBPATH/plperl.so
3. plperl extension.
Create extension plperl;
4. Create 3 databases like "node1", "node2", "node3".
CREATE DATABASE node1;
CREATE DATABASE node2;
CREATE DATABASE node3;
5. Execute below statements on 2 databases (node1, node2).
CREATE TABLE test1(t INT PRIMARY KEY);
INSERT INTO test1 VALUES(generate_series(1, 10));
6. Install Bucardo catalog database using the below command.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl install
Postgres version is: 9.3
Attempting to create and populate the bucardo database and schema
Database creation is complete
Connecting to database 'bucardo' as user 'bucardo'
Updated configuration setting "piddir"
Installation is now complete.
7. Adding databases, those will be part of mutli-master replication.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add db db1 dbname="node1"
Added database "db1"
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add db db2 dbname="node2"
Added database "db2"
8. Creating a herd of db1 tables.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add all tables db=db1 herd=db1_herd
Creating herd: db1_herd
New tables added: 1
Already added: 0
9. Creating a sync of this herd from db1->db2.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add sync db1_to_db2 type=pushdelta source=db1_herd targetdb=db2
Added sync "db1_to_db2"
10. Creating a herd of db2 tables.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add all tables db=db2 herd=db2_herd
Creating herd: db2_herd
New tables added: 1
Already added: 0
11. Creating a sync of this herd from db2->db1.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add sync db2_to_db1 type=pushdelta source=db2_herd targetdb=db1
Added sync "db2_to_db2"
12. Start the Bucardo process. Default log location for the Bucardo instance is SYSLOG.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl start
Checking for existing processes
Removing /tmp/fullstopbucardo
Starting Bucardo

postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ps -ef|grep bucardo
postgres 2740 1128 0 21:26 ? 00:00:00 postgres: bucardo bucardo [local] idle
postgres 2743 1128 0 21:26 ? 00:00:00 postgres: bucardo node1 [local] idle
13. Check the DML activities between these two nodes.
node1=# INSERT INTO test1 VALUES(-1);
INSERT 0 1
node1=# \c node2
node2=# SELECT * FROM test1 WHERE t=-1;
 t
---
-1
(1 row)

node2=# INSERT INTO test1 VALUES(-2)
INSERT 0 1
node2=# \c node1
node1=# SELECT * FROM test1 WHERE t=-2;
 t
---
-2
(1 row)
14. Check the status from the below command.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl status
Days back: 3  User: bucardo  Database: bucardo  PID of Bucardo MCP: 2739
Name       Type  State PID  Last_good Time  I/U/D Last_bad Time
==========+=====+=====+====+=========+=====+=====+========+====
db1_to_db2| P   |idle |2746|7m49s    |0s   |1/0/0|unknown |    
db2_to_db1| P   |idle |2745|6m11s    |0s   |1/0/0|unknown |    
 
It seems everything is perfect, and data is replicating from node1 to node2 and vice-versa. Now, let's try to add a new node, "node3" as part of this mutli-master replication.

15. Add new replication database to Bucardo.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add db db3 dbname="node3"
Added database "db3"
16. Dump and restore of test1 table from either node1 or node2.

17. Create a herd of db3 tables
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add all tables db=db3 herd=db3_herd
Creating herd: db3_herd
New tables added: 1
Already added: 0
18. Create a new sync from db2->db3, db3->db2.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add sync db2_to_db3 type=pushdelta source=db2_herd targetdb=db3
Added sync "db2_to_db3"
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl add sync db3_to_db2 type=pushdelta source=db3_herd targetdb=db2
Added sync "db3_to_db2"

19. Restart the Bucardo to check for the latest node status.
postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl stop
Creating /tmp/fullstopbucardo ... Done

postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl start
Checking for existing processes
Removing /tmp/fullstopbucardo
Starting Bucardo
 

postgres@dinesh-VirtualBox:~/Bucardo-4.5.0$ ./bucardo_ctl status
Days back: 3  User: bucardo  Database: bucardo  PID of Bucardo MCP: 2553
Name       Type  State PID  Last_good Time  I/U/D Last_bad Time
==========+=====+=====+====+=========+=====+=====+========+====
db1_to_db2| P   |idle |2563|1s       |0s   |0/0/0|unknown |    
db2_to_db1| P   |idle |2560|1s       |0s   |0/0/0|unknown |    
db2_to_db3| P   |idle |2561|0s       |0s   |1/0/1|unknown |    
db3_to_db2| P   |idle |2562|0s       |0s   |0/0/0|unknown |    
20. Here is the small hack, which makes you to embed this db3 into the multi master replication. As Burcardo replication system works based on triggers, we need to make all the triggers what it created on nodes to "ENABLE ALWAYS" for each table.
Node1
-=-=-
node1=# \d test1
     Table "public.test1"
 Column |  Type   | Modifiers 
--------+---------+-----------
 t      | integer | not null
Indexes:
    "test1_pkey" PRIMARY KEY, btree (t)
Triggers:
    bucardo_add_delta AFTER INSERT OR DELETE OR UPDATE ON test1 FOR EACH ROW EXECUTE PROCEDURE bucardo.bucardo_add_delta_t()
    bucardo_triggerkick_db1_to_db2 AFTER INSERT OR DELETE OR UPDATE OR TRUNCATE ON test1 FOR EACH STATEMENT EXECUTE PROCEDURE bucardo.bucardo_triggerkick_db1_to_db2()

node1=# ALTER TABLE test1 ENABLE ALWAYS TRIGGER bucardo_add_delta;
ALTER TABLE
node1=# ALTER TABLE test1 ENABLE ALWAYS TRIGGER bucardo_triggerkick_db1_to_db2;
ALTER TABLE

Do "ENABLE ALWAYS" these two triggers on this table. Similarly, do the same activity on all nodes.

Testing Multi-Master replication.
node3=# INSERT INTO test1 VALUES(-1010);
INSERT 0 1
node3=# \c node2
node2=# SELECT * FROM test1 WHERE t=-1010;
   t   
-------
 -1010
(1 row)

node2=# \c node1
node1=# SELECT * FROM test1 WHERE t=-1010;
   t   
-------
 -1010
(1 row)


node2=# INSERT INTO test1 VALUES(-2020);  
INSERT 0 1
node2=# \c node3
node3=# SELECT * FROM test1 WHERE t=-2020;
   t   
-------
 -2020
(1 row)

node3=# \c node1
node1=# SELECT * FROM test1 WHERE t=-2020;
   t   
-------
 -2020
(1 row)
Seems Bucardo is doing 3 node mulit master replication. Thanks to Bucardo Team. :)

--Dinesh Kumar

Comments

  1. Hi thanks for nice tutorial. i have tried to setup this and it worked. but i have some problem while testing this with real scenario.

    1) it takes to long time to write records in another node and it cause duplication error sometimes.
    ERROR: duplicate key value violates unique constraint "ip_map_id_pkey"

    2) i have almost 100 tables so i need to alter triggers for each table manually or is there any way to make this process faster?

    Please suggest.

    ReplyDelete
  2. Hi,

    As Bucardo is a Asynchronous multi-master replication, where i believe we can get this kind of row conflicts.

    Sorry, i didn't get your question, about "alter triggers". Could you elaborate it please.

    ReplyDelete

Post a Comment

Popular posts from this blog

Pgpool Configuration & Failback

I would like to share the pgpool configuration, and it's failback mechanism in this post.

Hope it will be helpful to you in creating pgpool and it's failback setup.

Pgpool Installation & Configuration

1. Download the pgpool from below link(Latest version is 3.2.1).
    http://www.pgpool.net/mediawiki/index.php/Downloads


2. Untart the pgpool-II-3.2.1.tar.gz and goto pgpool-II-3.2.1 directory.

3. Install the pgpool by executing the below commands:

./configure ­­prefix=/opt/PostgreSQL92/ ­­--with­-pgsql­-includedir=/opt/PostgreSQL92/include/ --with­-pgsql­-libdir=/opt/PostgreSQL92/lib/ make make install 4. You can see the pgpool files in /opt/PostgreSQL92/bin location.
/opt/PostgreSQL92/bin $ ls clusterdb   droplang  pcp_attach_node  pcp_proc_count pcp_systemdb_info  pg_controldata  pgpool pg_test_fsync pltcl_loadmod  reindexdb createdb    dropuser  pcp_detach_node  pcp_proc_info createlang  ecpg      pcp_node_count   pcp_promote_node oid2name  pcp_pool_status  pcp_stop_pgpool  …

pgBucket v1.0 is ready

pgBucket v1.0 pgBucket v1.0 (concurrent job scheduler for PostgreSQL) is released. This version is more stable and fixed the issues which was observed in the previous beta releases.
Highlights of this tool are Schedule OS/DB level jobsCron style syntax {Schedule up to seconds}On fly job modificationsInstant daemon status by retrieving live job queue, job hashEnough cli options to deal with all the configured/scheduled job Here is the URL for the pgBucket build/usage instructions. https://bitbucket.org/dineshopenscg/pgbucket
I hope this tool will be helpful for the PostgreSQL users to get things done in the scheduled time. Note: This tool requires c++11{gcc version >= 4.9.3} to compile.
--Dinesh

pgBucket - A new concurrent job scheduler

Hi All,

I'm so excited to announce about my first contribution tool for postgresql. I have been working with PostgreSQL from 2011 and I'm really impressed with such a nice database.

I started few projects in last 2 years like pgHawk[A beautiful report generator for Openwatch] , pgOwlt [CUI monitoring. It is still under development, incase you are interested to see what it is, attaching the image here for you ],


pgBucket [Which I'm gonna talk about] and learned a lot and lot about PostgreSQL/Linux internals.

Using pgBucket we can schedule jobs easily and we can also maintain them using it's CLI options. We can update/insert/delete jobs at online. And here is its architecture which gives you a basic idea about how it works.


Yeah, I know there are other good job schedulers available for PostgreSQL. I haven't tested them and not comparing them with this, as I implemented it in my way.
Features are: OS/DB jobsCron style sytaxOnline job modificationsRequired cli options