High availability database (PostgreSQL) with Docker Swarm

I am new to Docker and have been learning it for quite some time. I work with the LAPP (Linux, Apache, PHP and PostgreSQL) stack, and I have always used a monolithic architecture where everything is put together on one single server / VPS.
After learning about Docker my thinking changed quite a bit, and now I am trying to containerize the stack. When I came to the DB, I thought that if I ran Docker in swarm mode, the DB would be replicated or synced automatically on my worker node when I scaled it up, but in fact it is not. Here is the .yml file I am using to reproduce the scenario.
version: "3"
services:
  db:
    image: postgres:9.5
    container_name: db
    restart: always
    tty: true
    ports:
      - "5432:5432"
    environment:
      POSTGRES_PASSWORD: mydocker_pass
      POSTGRES_USER: mydocker_user
      POSTGRES_DB: mydocker
    volumes:
      - db-data:/var/lib/postgresql/data
volumes:
  db-data:
At first I just created a single container on my swarm manager, but then I scaled the service up with docker service scale db_stack_db=2 and it worked perfectly: a new container is now running on my worker node. The worker node also has a volume with the same name as the one on my manager node, but sadly, when I write something into the DB on the manager node it doesn't show up on the worker node, and vice versa, even though they have the same volume name, the same pre-created database (mydocker), the same DB user (mydocker_user) and the same DB password (mydocker_pass). In other words, the instance on the worker node is not syncing with the one on the manager node.
If anyone here has experienced this kind of scenario, please share your thoughts with me: what is the best practice in swarm mode? Should I keep the DB on a single node only? Because if I scale the DB up and a replica runs on the worker node, is that useful at all and does it help with the load on the DB service, given that the manager and worker nodes do not share the same data?
Regards.
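For reference, a minimal sketch (not part of the original post) of how the database service could be pinned to a single node with a swarm placement constraint, so that scaling the stack cannot start a second, unsynchronised copy on a worker. The constraint value is an assumption; real high availability would still need PostgreSQL-level replication (e.g. streaming replication or a tool such as Patroni), which Docker Swarm does not provide by itself.

version: "3"
services:
  db:
    image: postgres:9.5
    ports:
      - "5432:5432"
    environment:
      POSTGRES_PASSWORD: mydocker_pass
      POSTGRES_USER: mydocker_user
      POSTGRES_DB: mydocker
    volumes:
      - db-data:/var/lib/postgresql/data
    deploy:
      replicas: 1                    # keep a single PostgreSQL instance
      placement:
        constraints:
          - node.role == manager     # assumption: pin the DB to the manager node
volumes:
  db-data: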

Related

There is an 8-hour difference between two TDengine clients

We have a TDengine application with more than one client, including Docker, Linux, and Windows.
I'm using interval(1d), but the time on Windows and Docker differs by 8 hours. I checked the timezone settings; they are 'Asia/Shanghai' and Beijing Time. I can't tell what the problem might be.
I believe that when data is imported, the time is converted based on the timezone settings of the client, and the server just stores the UTC timestamp value. So when you query the data from another client, the returned timestamps are converted based on that client's timezone setting too. Be careful when you query data by timestamp: make sure you account for the time offset between the two different clients. It is better to keep the client timezone settings consistent to avoid unexpected results.
Be careful to set the TZ environment variable when you want to use a TDengine database in Docker or in a Kubernetes cluster.
Like this:
docker run --name tdengine -d -e TZ=Asia/Shanghai tdengine/tdengine
It is recommended to use docker-compose to manage the runtime configuration of the TDengine container:
version: "3.7"
networks:
  td:
    external: true
services:
  tdengine:
    image: tdengine/tdengine:2.4.0.16
    networks:
      - td
    environment:
      TZ: Asia/Shanghai
      TAOS_FQDN: localhost
If you use TDengine on Kubernetes, check the documentation here: https://taosdata.github.io/TDengine-Operator/en/2.2-tdengine-with-helm.html
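For the Kubernetes case, setting TZ goes through the container's env list. A minimal sketch (the pod name is arbitrary; the image tag is taken from the compose example above):

apiVersion: v1
kind: Pod
metadata:
  name: tdengine
spec:
  containers:
    - name: tdengine
      image: tdengine/tdengine:2.4.0.16
      env:
        - name: TZ                 # container timezone
          value: Asia/Shanghai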

Does the master database need to be on the same host where SymmetricDS runs?

This is the configuration of the master node:
engine.name=master
db.driver=com.mysql.jdbc.Driver
db.url=jdbc:mysql://192.168.1.55:3306/master-db?useSSL=false
db.user=root
db.password=password
registration.url=
sync.url=http://192.168.1.55:31415/sync/master-db
group.id=master
external.id=0
# Don't muddy the waters with purge logging
job.purge.period.time.ms=7200000
# This is how often the routing job will be run in milliseconds
job.routing.period.time.ms=5000
# This is how often the push job will be run.
job.push.period.time.ms=5000
# This is how often the pull job will be run.
job.pull.period.time.ms=5000
# Kick off initial load
initial.load.create.first=true
This is the configuration of the child node:
engine.name=italian-restaurant
db.driver=com.mysql.jdbc.Driver
db.url=jdbc:mysql://192.168.1.5:3306/italian_restaurant_db?useSSL=false
db.user=root
db.password=password
registration.url=
sync.url=http://192.168.1.55:31415/sync/child-db
group.id=restaurants
external.id=1
# Don't muddy the waters with purge logging
job.purge.period.time.ms=7200000
# This is how often the routing job will be run in milliseconds
job.routing.period.time.ms=5000
# This is how often the push job will be run.
job.push.period.time.ms=5000
# This is how often the pull job will be run.
job.pull.period.time.ms=5000
# Kick off initial load
initial.load.create.first=true
All of this works fine, but if I change the host IP of the master DB in the master properties to another IP (because I have the database in the cloud), the connection to the master DB in the cloud works fine, all the SymmetricDS tables are created and the default configuration is loaded, but the registration of the nodes does not work.
It throws the warning "Registration was not open".
This only happens when the master database is not on the same host where SymmetricDS runs.
Thanks, I hope for your answers.
There is no requirement for SymmetricDS to be on the same host as the database. I would have expected your scenario to work exactly the same as with the local database.
In master.properties, did you only change the IP address in the db.url?
On a side note, it is usually a good idea to have your SymmetricDS instance on the same network with good bandwidth to your database for optimal performance (as JDBC can be chatty).
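For illustration, assuming a hypothetical cloud database host of db.example.com, the only property that should need to change in master.properties is the JDBC URL; sync.url keeps pointing at the host where SymmetricDS itself runs:

db.url=jdbc:mysql://db.example.com:3306/master-db?useSSL=false
sync.url=http://192.168.1.55:31415/sync/master-db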

Connecting to multiple databases on Kubernetes

I have a cluster with multiple databases. Applications on my cluster can access each database through its ClusterIP Service. For security reasons, I do not want to expose these databases publicly using a NodePort or a LoadBalancer.
What I would like to do is upload a web based database client to Kubernetes and expose this client as a service, so that the database can be accessed.
Is something like this possible?
Personal opinions on a "web based database client" and the security concerns aside:
What you are trying to achieve seems to be proxying your databases through a web app.
This would go like this:
NodePort/LB --> [WebApp] --> (DB1 ClusterIP:Port)--[DB1]
                         \--> (DB2 ClusterIP:Port)--[DB2]
                         \--> (DB3 ClusterIP:Port)--[DB3]
You just have to define a NodePort/LB Service to expose your WebApp publicly, and a ClusterIP Service for each database you want to be able to reach. As long as the WebApp runs in the same cluster, it will be able to connect to your internal databases, while they remain unreachable from outside the Kubernetes cluster.
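A minimal sketch of the two kinds of Service (names, ports and label selectors are assumptions): the web client is exposed via a NodePort, while the database stays on a ClusterIP and is therefore only reachable from inside the cluster.

apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  type: NodePort                 # publicly reachable on every node's IP
  selector:
    app: webapp
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080
---
apiVersion: v1
kind: Service
metadata:
  name: db1
spec:
  type: ClusterIP                # internal-only, the default Service type
  selector:
    app: db1
  ports:
    - port: 5432
      targetPort: 5432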
You would need to check in a registry whether a Docker image for the web based client you want exists. If it does, you would deploy it as a pod and expose that pod so it can be accessed from your browser.

MongoDB replica set without restarting the database

I have a MongoDB database running on one server. This is its configuration file:
# mongod.conf
# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0
  ssl:
    mode: requireSSL
    PEMKeyFile: /etc/ssl/mongo.pem

#processManagement:

#security:
security:
  authorization: enabled

#operationProfiling:

#replication:

#sharding:

## Enterprise-Only Options:

#auditLog:

#snmp:

setParameter:
  failIndexKeyTooLong: false
I have created a service to launch MongoDB each time the server starts or whenever the database goes down.
This configuration has been working so far.
Now I have cloned this server to another one. The configuration is identical except for the server IP and the server domain.
This new server is working too, but I would like to connect both databases so that the new database stays synchronized with the first one, as in a master-slave configuration.
I think this is the typical case for a MongoDB replica set with 2 databases, but I'm not very experienced with databases, and after reading lots of documentation I still don't really understand how to do it.
For example, it seems that every option requires turning off the master database before starting the synchronization, but in my case the master database is in a production environment, so I would like to avoid that. Is there any way to configure the replica set without having to restart the master MongoDB instance?
I've checked the reference for the replication options in the configuration file too, but I don't know how to use them.
In conclusion, is there any tutorial on how to create a replica set with 2 MongoDB databases, ideally without having to restart the master database (which is in a production environment)?
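For context, a minimal sketch of the pieces the question refers to; the replica set name rs0 and the hostnames are assumptions. The replication block goes where the commented-out #replication: line sits in the configuration file above, and the set is then initiated once from the mongo shell:

# in mongod.conf, replacing the commented-out "#replication:" line
replication:
  replSetName: rs0

# then, from the mongo shell on the first server, once both
# mongod instances are running with the same replSetName:
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "server1.example.com:27017" },
    { _id: 1, host: "server2.example.com:27017" }
  ]
})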

Remote Postgres Database Heroku Connection is slow from Digital Ocean Instance

I am using Apache2 and PHP 5.6.12. I decided to host my database remotely at Heroku (using PostgreSQL 9.4) and keep my server at Digital Ocean.
In my Yii 1 framework application, the connection configuration I have added is the following:
'db' => array(
    'connectionString' => 'pgsql:host=ec2-XX-XX-XX-XX.compute-1.amazonaws.com;port=6372;dbname=dddqXXXXX;sslmode=require',
    'emulatePrepare' => true,
    'username' => 'XXXX4dcXXXX',
    'password' => 'XXXXXXXXXc34XXXXXXX123',
    'charset' => 'utf8',
),
The connection is successful, but remote access makes even simple queries slow on my server at Digital Ocean. I read on Heroku's site that SSL mode has to be enabled for remote access, so I did that, and I still can't figure out why the database connection is slow. It can take up to 5 seconds. I tried with a locally installed PostgreSQL server and everything runs as expected. I am not sure how to solve this; otherwise I will have to move away from Heroku and do it the traditional way, which would be very depressing. I hope someone can help me.
Here is my phpinfo() output for pgsql (screenshot omitted).
Are there any settings that need to be changed in Apache2 or PHP to speed up remote Heroku database access?
I was unable to ping the Heroku Postgres server as advised by Richard (Heroku has blocked pinging). It was obvious that the connection between the Digital Ocean server and the Heroku Postgres server was slow, so I emailed Heroku directly to ask for their advice.
Heroku's Solution:
They explained that applications connecting from far outside the Heroku platform experience significant initial connection latency, and that this latency is the big problem.
The application has to establish a TCP connection, which the Postgres protocol then upgrades to an SSL connection. This takes quite a few packets and introduces a lot of latency, particularly if the app creates a new connection for each query or page load.
Heroku recommended configuring the app to use something like the heroku-pgbouncer connection pool, which uses pgbouncer and stunnel to provide a configurable connection pool for the app endpoints.
That recommendation sounded too expensive and too challenging for me to deal with.
My solution: use Database Labs
I found another Postgres-as-a-service provider called Database Labs. They allow users to select the data-center region for better performance. Database Labs has an easy backend management platform and a friendly support team. The backend offers only minimal functionality, which I can understand since they started in 2014.
However, after migrating to their service, the performance of my web page improved remarkably. The connection behaves like any standard connection, without the need for SSL. I am posting my solution for the benefit of others who face a similar problem.
Heroku is definitely a good provider if you host your application on Heroku and use their database service. But if you are a Digital Ocean user, I recommend using Database Labs; it saves a lot of time.
There isn't really a question here exactly, so this answer is more a guide to how to test the situation.
If you don't know enough to run a packet trace, you probably want to make sure your servers are all on the same network. However, try logging in to your Digital Ocean server and just ping the Heroku one. Repeat for www.google.com and compare the times. That's assuming the Heroku server responds to pings.
You should be able to connect with "psql -h ...". Then you can run "SELECT count(*) FROM <table>", then "SELECT * FROM <table> LIMIT 10000", then with "LIMIT 20000". That will let you figure out how much time is spent just transferring data versus running the query.
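As a concrete way to run that comparison (host, database, user and table names below are placeholders), psql's \timing command prints the elapsed time of every statement, which makes it easy to separate round-trip latency from data-transfer time:

psql "host=ec2-XX-XX-XX-XX.compute-1.amazonaws.com port=6372 dbname=mydb user=myuser sslmode=require"

-- inside the psql session:
\timing on                            -- print elapsed time for each statement
SELECT 1;                             -- pure round-trip latency
SELECT count(*) FROM my_table;        -- query time, almost no data transferred
SELECT * FROM my_table LIMIT 10000;   -- adds data-transfer time on top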
It might just be that the connection between your servers is very slow. Can't say without testing.
