How should I design multiple identical applications that update one database?

I'm managing an online book store website. For the sake of high availability I've set up two Tomcat instances running the website application; they run exactly the same program and share the same database, which is located on another server.
My question is: how can I avoid conflicts or dirty data when the two applications perform the same updates/inserts on the database at the same time?
For example: update t_sale set total='${num}' where category='cs'. If two processes execute this SQL simultaneously, data could be lost.

If by "database" you mean a well-designed schema running on an RDBMS such as Oracle, DB2, or SQL Server, then the database itself will prevent what you call "conflicts" by locking parts of the database during each update transaction.
You can prevent "dirty data" from getting into the database by adding features such as check constraints and primary/foreign key constraints in the database itself.
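To make that concrete (this example is not from the original answer; the column names and constants are assumptions), the lost-update risk in the statement above can be avoided by making the update relative, or by optimistic locking with a version column, while constraints keep dirty data out:

    -- Relative update: both processes can run this concurrently without
    -- losing each other's change, because each adds its own delta.
    UPDATE t_sale SET total = total + 5 WHERE category = 'cs';

    -- Optimistic locking: check a (hypothetical) version column and retry
    -- the read/update if no row was affected.
    UPDATE t_sale
    SET total = 120, version = version + 1
    WHERE category = 'cs' AND version = 7;

    -- Constraints in the schema itself keep dirty data out:
    ALTER TABLE t_sale ADD CONSTRAINT chk_total_nonnegative CHECK (total >= 0);
    ALTER TABLE t_sale ADD CONSTRAINT fk_sale_category
        FOREIGN KEY (category) REFERENCES t_category (code);

Either way the database serializes the two concurrent statements; the relative form simply makes the final result independent of which one commits first.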

Related

db replication vs mirroring

Can anyone explain the differences between a replicated DB and a mirrored DB server?
I have huge reports to run. I want to use a secondary database server to run my reports so I can offload resources from the primary server.
Should I setup a replication server or a mirrored server and why?
For your requirements, replication is the way to go (assuming you're talking about transactional replication). As stated before, mirroring will "mirror" the whole database, but you won't be able to query it unless you create snapshots from it.
The good point of replication is that you can select which objects you will use and you can also filter them, and since the DB will be open you can delete info if it's not required (just be careful, as this can lead to problems maintaining the replication itself), or create specific indexes for the reports which are not needed in "production". I maintained this kind of solution for a long time with no issues.
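For example, an index that exists only on the replica used for reporting might look like this (hypothetical table and column names, SQL Server syntax):

    -- Created on the subscriber only; the publisher/"production" copy
    -- never carries this index.
    CREATE INDEX ix_sale_report_category_date
        ON dbo.t_sale (category, sale_date)
        INCLUDE (total);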
(Assuming you are referring to Transactional Replication)
The biggest differences are: 1) Replication operates on an object-by-object basis whereas mirroring operates on an entire database. 2) You can't query a mirrored database directly - you have to create snapshots based on the mirrored copy.
In my opinion, mirroring is easier to maintain, but the constant creation of snapshots may prove to be a hassle.
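For reference, querying a mirror goes through a database snapshot created with plain DDL; the sketch below assumes SQL Server database mirroring and uses hypothetical names and paths:

    -- A point-in-time snapshot of the mirrored database that reports can query.
    CREATE DATABASE Sales_Reporting_Snapshot
    ON ( NAME = Sales_Data,                           -- logical data file name of the mirror
         FILENAME = 'D:\Snapshots\Sales_Reporting.ss' )
    AS SNAPSHOT OF Sales;

Each snapshot is read-only and static, which is why the constant re-creation of snapshots mentioned above can become a hassle.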
As mentioned here
Database mirroring and database replication are two high data availability techniques for database servers. In replication, data and database objects are copied and distributed from one database to another. It reduces the load from the original database server, and all the servers on which the database was copied are as active as the master server. On the other hand, database mirroring creates copies of a database in two different server instances (principal and mirror). These mirror copies work as standby copies and are not always active like in the case of data replication.
This question can also be helpful, or have a look at the MS documentation.

What is the DBA function/tool called for keeping many remote DBs synchronized with a main DB

I am a front-end developer being asked to fulfil some DBA tasks. Uncharted waters.
My client has 10 remote (off network) data collection terminals hosting a PostgreSQL application. My task is to take the .backup or .sql files those terminals generate and add them to the main DB. The schema for all of these DBs will match. But the merge operation will lead to many duplicates. I am looking for a tool that can add a backup file to an existing DB, filter out duplicates, and provide a report on the merge.
Is there a term for this kind of operation in the DBA domain?
Is this function normally built into basic DB admin suites (e.g. pgAdmin III), are enterprise-level tools required, or is this something that can be done on the command-line easily enough?
Update
PostgreSQL articles on DB replication here and glossary.
You can't "merge a bunch of tables" but you could use Slony to replicate child tables (i.e. one partition per location) back to a master db.
This is not an out of the box solution but with something like Bucardo or Slony it can be done, albeit with a fair bit of work and added maintenance.
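As a rough sketch of the merge step itself (separate from the Slony/Bucardo route): if each terminal's .backup file is first restored into a staging schema with pg_restore, duplicates can be filtered with ON CONFLICT (PostgreSQL 9.5+, and it requires a unique constraint on the key columns); every table and column name below is an assumption:

    -- Insert only the staged rows the main DB doesn't already have.
    INSERT INTO public.readings (terminal_id, reading_id, taken_at, value)
    SELECT terminal_id, reading_id, taken_at, value
    FROM staging.readings
    ON CONFLICT (terminal_id, reading_id) DO NOTHING;

    -- A minimal merge report: how many staged rows were already present.
    SELECT count(*)            AS staged_rows,
           count(p.reading_id) AS duplicates_skipped
    FROM staging.readings s
    LEFT JOIN public.readings p USING (terminal_id, reading_id);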

Client-side Replication for SQL Server?

I'd like to have some degree of fault tolerance / redundancy with my SQL Server Express database. I know that if I upgrade to a pricier version of SQL Server, I can get "Replication" built in. But I'm wondering if anyone has experience in managing replication on the client side. As in, from my application:
Every time I need to create, update or delete records from the database -- issue the statement to all n servers directly from the client side
Every time I need to read, I can do so from one representative server (other schemes seem possible here, too).
It seems like this logic could potentially be added directly to my Linq-To-SQL Data Context.
Any thoughts?
Every time I need to create, update or delete records from the database -- issue the statement to all n servers directly from the client side
Recipe for disaster.
Are you going to have a distributed transaction, or just let some of the servers fail? If you have a distributed transaction, what do you do if a server goes offline for a while?
This type of thing can only work if you do it at a server-side data-portal layer where application servers take in your requests and are aware of your database farm. At that point, you're better off just using a higher grade of SQL Server.
I have managed replication from an in-house client. My database model worked on an insert-only mode for all transactions, and insert-update for lookup data. Deletes were not allowed.
I had a central table that everything was related to. I added a date-time stamp field to this table, which defaulted to NULL. I took data from this table and all related tables into a staging area, did a BCP out, cleaned up the staging tables on the receiver side, did a BCP in to the staging tables, performed data validation, and then inserted the data.
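A hedged sketch of the receiving end of that pipeline (the BCP steps themselves run outside SQL; every object and column name here is an assumption):

    -- After BCP has loaded staging.Orders on the receiver, move across only
    -- the rows the target doesn't already have (insert-only model, no deletes).
    BEGIN TRANSACTION;

    INSERT INTO dbo.Orders (OrderId, CustomerId, OrderDate, Total)
    SELECT s.OrderId, s.CustomerId, s.OrderDate, s.Total
    FROM staging.Orders AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Orders AS o WHERE o.OrderId = s.OrderId);

    COMMIT TRANSACTION;

    -- On the sending side, rows still carrying a NULL date-time stamp are the
    -- ones to extract next; they get stamped once they have been shipped.
    UPDATE dbo.Orders SET SentToReplicaAt = GETDATE()
    WHERE SentToReplicaAt IS NULL;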
For some basic fault tolerance, you can schedule a regular backup.
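For instance (hypothetical database name and path; on SQL Server Express there is no SQL Agent, so the schedule would come from something like Windows Task Scheduler running sqlcmd):

    -- Full backup that a scheduled task can run nightly.
    BACKUP DATABASE MyAppDb
    TO DISK = 'D:\Backups\MyAppDb_full.bak'
    WITH INIT, CHECKSUM;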

Is SQL replication the answer?

For our application (a desktop app in .NET), we want to have 2 databases in 2 different remote places (different countries). Is it possible to use replication to keep the data in sync in both databases while the application changes data? What other strategies can be used? Should the sync happen instantaneously or at a scheduled time? What if we decide to keep one database 'readonly'?
thanks
You need to go back to your requirements I think.
Does data need to be shared between two sites?
Can both sites update the same data?
What's the minimum acceptable time for an update in one location to be visible in another?
Do you need failover/disaster recovery capability?
Do you actually need two databases? (e.g. is it for capacity, for failover, or simply because the network link between the two sites is slow? etc.)
Any other requirements around data access/visibility?
Real-time replication is one solution, an overnight extract-transform-load process could be another. It really depends on your requirements.
I think the readonly question is key. If one database is readonly then you can use mirroring to sync them, assuming you have a steady connection.
What is the bandwidth and reliability of connection between the sites?
If updates are happening at both locations (on the same data) then Merge Replication is a possibility. It's really designed for mobile apps where users in the field have some subset of the data and conflicts may need to be resolved at replication time.
High level explanation of the various replication types in SQL Server including the new Sync Framework in SQL Server 2008 can be found here: http://msdn.microsoft.com/en-us/library/ms151198.aspx
-Krip

SQL Server 2005 One-way Replication

In the business I work for, we are discussing methods to reduce the read load on our primary database.
One option that has been suggested is to have live one-way replication from our primary database to a slave database. Applications would then read from the slave database and write directly to the primary database. So...
Application Reads From Slave
Application Writes to Primary
Primary Updates Slave Automatically
What are the major pros and cons for this method?
A few cons:
2 points of failure
Application logic will have to take into account the delay between writing something and then reading it, since it won't be available immediately from the secondary database
A strategy I have used is to send key reporting data to a secondary database nightly, de-normalizing it on the way, so that beefy queries can run on that database instead of locking up tables and stealing resources from the OLTP server. I'm not using any formal data warehousing or replication tools, rather I identify problem queries that are Ok without up-to-the-minute data and create data structures on the secondary server specifically for those queries.
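A sketch of that kind of nightly de-normalised load (all object names are hypothetical; the point is that the secondary carries purpose-built structures rather than a full replica):

    -- Rebuild a wide reporting table on the secondary server from last
    -- night's OLTP extract, so heavy queries avoid joins and never touch
    -- the primary.
    TRUNCATE TABLE report.DailySalesByCategory;

    INSERT INTO report.DailySalesByCategory (sale_date, category_name, order_count, total)
    SELECT o.OrderDate, c.Name, COUNT(*), SUM(o.Total)
    FROM extract.Orders AS o
    JOIN extract.Categories AS c ON c.CategoryId = o.CategoryId
    GROUP BY o.OrderDate, c.Name;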
There are definitely pros to the "replicate everything" approach:
You can run any ad-hoc query on the secondary, since it has all of your data
If your primary server dies, you can re-purpose the secondary quickly to take over
We are using one-way replication, but not from the same application. Our applications read from and write to the master database, the data gets synchronized to the replica database, and the reporting tools use this replica.
We don't want our application to read from a different database, so in this scenario I would suggest using filegroups and partitioning on the master database. Using filegroups (especially on different drives) and partitioning of files and indexes can help performance a lot.
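As an illustration of that suggestion (SQL Server syntax; the database, filegroup, and table names are all assumptions):

    -- Put older data on a separate filegroup, ideally on a different drive.
    ALTER DATABASE BookStore ADD FILEGROUP FG_Archive;
    ALTER DATABASE BookStore ADD FILE
        ( NAME = BookStore_Archive,
          FILENAME = 'E:\Data\BookStore_Archive.ndf' )
        TO FILEGROUP FG_Archive;

    -- Partition a large table by date: two boundaries give three partitions.
    CREATE PARTITION FUNCTION pf_SaleDate (date)
        AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01');
    CREATE PARTITION SCHEME ps_SaleDate
        AS PARTITION pf_SaleDate TO (FG_Archive, FG_Archive, [PRIMARY]);

    CREATE TABLE dbo.t_sale_history (
        sale_id   int   NOT NULL,
        sale_date date  NOT NULL,
        total     money NOT NULL
    ) ON ps_SaleDate (sale_date);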
