Questions about the CreateCluster tool in the H2 database

I have a couple of questions about the behavior of H2's CreateCluster tool.
If I create a cluster specifying source A and target B, is H2 going to keep B in sync with A? In other words, is a master-slave relationship maintained between the two?
Let's imagine that databases A, B, and C belong to the same cluster. What happens if two different transactions are executed on A and B simultaneously? Does H2 elect a leader in the cluster to make sure there is a unique execution order for all databases in the cluster?
If H2 elects a leader, what happens if this leader disappears? Is there an automatic failover mechanism? Is a new leader automatically elected? Can I still use the cluster in the meantime?
If I create a cluster with source A -> target B, then source B -> target C, then source C -> target D, will D get statements to execute from C, C from B, and B from A? Or will B, C, and D all get statements from A (or from the elected leader)? In other words, do we have a chain or a star organization?

See the cluster documentation on the H2 web site.
There is no master / slave, no leader, and no connection between the cluster nodes. Instead, each client connects to both cluster nodes and executes the statements on both.
Each client executes all statements on all cluster nodes, in the same order. Each client has a list of cluster nodes, and each cluster node keeps the list as well. The clients verify the list is the same.
There is no leader. The failover mechanism is: if a client loses the connection to one of the cluster nodes, it removes that node from its list and tells the remaining cluster nodes to remove it from their lists as well.
This will just expand the list, so you get A, B, C, D. Each client will then execute all update statements on each cluster node directly; there is no chain between the nodes.
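For reference, a sketch of what that setup typically looks like (the host names, ports, database path, and credentials below are placeholders, not from the question; see the H2 clustering documentation for the exact options):
java org.h2.tools.CreateCluster -urlSource jdbc:h2:tcp://localhost:9101/~/test -urlTarget jdbc:h2:tcp://localhost:9102/~/test -user sa -serverList localhost:9101,localhost:9102
This copies the database from the source node to the target node and stores the server list on both. A client then connects with a URL that lists both nodes, for example
jdbc:h2:tcp://localhost:9101,localhost:9102/~/test
and executes every update statement against both of them.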

Related

Is TDengine able to change the replica number of a super table (or child table) online?

Is TDengine able to change the replica number of a super table (or child table) online? For example, change the replica number from 3 to 5. Will the data be copied to the new replicas automatically?
In TDengine you can change the number of replicas using the following command:
ALTER DATABASE db_name REPLICA X;
The "x" represent the number of replica. However the X is an integer among[1,3], beside you also need to ensure the replica must less or equal to the number of dnodes.
Using the command below to check the num of dnode in tdengine.
show dnodes;
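For example, assuming a database named db1 (the name is made up) and a cluster whose show dnodes output lists at least three dnodes:
show dnodes;
ALTER DATABASE db1 REPLICA 3;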

Citus "... is a metadata node, but is out of sync HINT: If the node is up, wait until metadata gets synced to it and try again."

I've got a Citus (v10.1) sharded PostgreSQL (v13) cluster with 4 nodes. The master node's address is 10.0.0.2 and the rest go up to 10.0.0.5. When trying to manage my sharded table, I got this error:
ERROR: 10.0.0.5:5432 is a metadata node, but is out of sync
HINT: If the node is up, wait until metadata gets synced to it and try again.
I've been waiting. After 30 minutes or more, I literally did drop schema ... cascade; drop extension citus cascade; and after re-importing the data and creating the shard I got the same error message once more and can't get past it.
Some additional info:
Another thing that might be an actual hint: I cannot distribute my function through create_distributed_function(), because it says it is in a deadlock state and the transaction cannot be committed.
I've checked the idle processes; nothing out of the ordinary.
I created the shard like this:
SELECT create_distributed_table('test_table', 'id');
SELECT alter_distributed_table('test_table', shard_count:=128, cascade_to_colocated:=true);
There are no Google search results regarding this subject.
EDIT 1:
I did bombard my shard with a huge number of requests (20k-200k hits per second) to a function that does an insert/update, or a delete if a specific argument is set.
This is a rather strange error. It might be the case that you hit the issue in https://github.com/citusdata/citus/issues/4721
Do you have column defaults that are generated by sequences? If so, consider using bigserial types for these columns.
If that does not help, you can disable metadata syncing with
SELECT stop_metadata_sync_to_node('10.0.0.5', '5432');
optionally followed by
SELECT start_metadata_sync_to_node('10.0.0.5', '5432');
to stop waiting for metadata syncing and, optionally, retry the metadata sync from scratch.
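A minimal sketch of the bigserial suggestion above, assuming an id column that was previously backed by an explicit sequence (the column list is made up):
CREATE TABLE test_table (
    id bigserial PRIMARY KEY,  -- bigserial instead of an int column with DEFAULT nextval('some_seq')
    payload text
);
SELECT create_distributed_table('test_table', 'id');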

Avoid deadlocks among data reads in SQL Server

We have an issue with deadlocks in SQL Server where no explicit locks are involved, and we would like to know how to get around them.
Some relevant background info: Our application is quite old and large. We recently began to remove some concurrency-hindering issues, and in doing so we ran into these deadlocks on SQL Server. We do not have the resources to tackle each select statement within the application, but are looking for a more general approach at the configuration level.
We can reduce one exemplary problem as follows: Basically, we have two entities, EntityA and EntityB. Both are mapped to individual tables in the SQL Server schema. Between both entities there is an m:n relation, mapped within the database via an AToB table, which contains some additional context (i.e. there can be more than one entry within AToB for the same A and B).
During one business operation, new instances of A and B are inserted into the database, along with multiple entries in the AToB table. Within the same transaction, at a later point, all this data is read again (without FOR UPDATE). When this operation is executed in parallel, deadlocks occur. These deadlocks are linked to the AToB table.
Say we have A1 and B1, linked via A1B1_1 and A1B1_2, and A2 and B2, linked via A2B2_1 and A2B2_2.
My guess is the following happens:
t1 -> INSERT A1
t1 -> INSERT B1
t1 -> INSERT A1B1_1 (PAGE1)
t2 -> INSERT A2
t2 -> INSERT B2
t2 -> INSERT A2B2_1 (PAGE1)
t1 -> INSERT A1B1_2 (PAGE2)
t2 -> INSERT A2B2_2 (PAGE2)
t1 -> SELECT * FROM AToB WHERE AToB.A=A1
t2 -> SELECT * FROM AToB WHERE AToB.A=A2
Now, during the concurrent reads on the AToB table, t1 holds a lock on PAGE1 and waits for PAGE2, while t2 holds a lock on PAGE2 and waits for PAGE1, resulting in a deadlock.
First question: Is this a plausible explanation for the deadlock to occur?
Second question: During my research I got the impression that an index on AToB.A might force SQL Server to lock fewer entries in the table (possibly even taking row locks instead of page locks). Is this right?
Third question: I further got the impression that this problem might be solved by snapshot isolation. Is that right?
We tried this approach; however, it led us into the next circle of hell:
During the business transaction, at one point a business identifier is assigned to A. This comes from a separate table and it must be unique among the As. There is no way to assign this via a database sequence. Our solution is to assign this identifier via a select/update on a fourth table, Identifier. This is done via a FOR UPDATE statement. With snapshot isolation, this FOR UPDATE lock is ignored during acquisition and only leads to an optimistic locking exception during commit. This leads us to the
Fourth question: When using snapshot isolation, is it possible to still have special transactions that run with pessimistic locking, or is it possible to tell SQL Server that some tables are excluded from optimistic locking?
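A hedged sketch of the index from the second question and the snapshot settings from the third (the database name, schema, index name, and the columns of the Identifier table are made up; the UPDLOCK hint at the end is one common way to keep a pessimistic lock on a single table even when row versioning is enabled):
-- Second question: an index on the lookup column lets the SELECTs seek
-- instead of scan, so they touch and lock fewer rows/pages of AToB.
CREATE INDEX IX_AToB_A ON dbo.AToB (A);
-- Third question: row versioning at the database level; readers get a
-- consistent snapshot instead of blocking on writers' locks.
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;
-- Fourth question: a statement can still take a pessimistic update lock
-- explicitly on the Identifier table:
SELECT last_value FROM dbo.Identifier WITH (UPDLOCK) WHERE id_kind = 'A';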

SQL Server CDC: Track additional column after the fact

If CDC has been set up on a table tracking only columns A, D, and E instead of the entire table, is it possible to add a column Z to the source table and then add column Z to the list of tracked columns for CDC? Is it possible to do this without losing CDC data?
I've looked around, and the only examples I find are for tracking the entire table, not for cherry-picking columns. I'm hoping for a way to update the table schema without losing CDC history, and without doing the whole copy-CDC-to-a-temp-table-and-back process.
Do you always have to create a new instance of CDC for schema changes?
SQL Server 2012
This is exactly why CDC allows for two capture instances on a given table. The idea is this:
You have a running instance tracking columns A, B, and C
You now want to start tracking column D
So you set up a second capture instance to track A, B, C, and D.
You then process everything from the original capture instance and note the last LSN that you process
Using the fn_cdc_increment_lsn() function, you set the starting point for processing the new capture instance
Once you're up and running on the new instance, you can drop the old one
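In T-SQL those steps might look roughly like this (schema, table, and capture-instance names are assumptions, and the LSN handling is only sketched):
-- Second capture instance that also tracks the new column Z:
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'MyTable',
    @role_name = NULL,
    @capture_instance = N'dbo_MyTable_v2',
    @captured_column_list = N'A,D,E,Z';
-- Drain the old instance, note the last LSN you processed, then continue
-- the new instance from the next LSN:
DECLARE @last_processed_lsn binary(10);  -- set this to the LSN you noted
DECLARE @from_lsn binary(10) = sys.fn_cdc_increment_lsn(@last_processed_lsn);
DECLARE @to_lsn binary(10) = sys.fn_cdc_get_max_lsn();
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_MyTable_v2(@from_lsn, @to_lsn, N'all');
-- Once consumers are on the new instance, drop the old one:
EXEC sys.sp_cdc_disable_table
    @source_schema = N'dbo',
    @source_name = N'MyTable',
    @capture_instance = N'dbo_MyTable';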
Of course, if you're using CDC as a history of all changes ever made to the table... that's not what it was designed for. CDC was meant to be an ETL facilitator, that is, capturing changes to data so that they can be consumed and then ultimately discarded from the CDC system. If you're using it as a historical system of record, I'd suggest setting up a "dumb" ETL, meaning a straight copy out of the CDC tables into a user table. Once you do that, you can implement the above.

Access distributed mnesia database from different nodes

I have a mnesia database containing different tables.
I want to be able to access the tables from different Linux terminals.
I have a function called add_record, which takes a few parameters, say name and id. I want to be able to call add_record on node1 and on node2, but I want to be updating the same table from different locations.
I read a few sources and the only thing I found out was that I should use net_adm:ping(node2), but somehow I can't access the data in the table.
I assume that you probably meant a replicated table. Suppose your mnesia table is on the node nodea@127.0.0.1, started with -setcookie mycookie. Whether the table is replicated on another node or not, if you want to access its records from another terminal, you have to use Erlang in that terminal as well: create a node, connect it to the node that has the table (make sure they all use the same cookie), and then call a function on the remote node. Let's say you want to use a function add_record in the module mydatabase.erl on the node nodea@127.0.0.1, which holds the mnesia table. Open a Linux terminal and enter the following:
$ erl -name remote@127.0.0.1 -setcookie mycookie
Eshell V5.8.4 (abort with ^G)
1> N = 'nodea@127.0.0.1'.
'nodea@127.0.0.1'
2> net_adm:ping(N).
pong
3> rpc:call(N,mydatabase,add_record,[RECORD]).
{atomic,ok}
4>
With the rpc module you can call any function on a remote node, as long as the two nodes are connected and share the same cookie. Start by calling this on the remote node:
rpc:call('nodea@127.0.0.1',mnesia,info,[]).
It should display the mnesia info of the remote node in your terminal. I suggest you first go through this lecture: Distributed Erlang Programming; then you will be able to see how replicated mnesia tables are managed. Go through that entire tutorial.
