Inserting data into a table concurrently - Hibernate - database

I have an application that uses Hibernate to insert data into a table.
The database is SQL Server, and the application itself is deployed on Tomcat 6.
To insert data into the DB table I am using BasicDataSource with minimal configuration for the Tomcat connection pool (e.g. maxActive=150, maxIdle=10, ...).
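For reference, the pool setup looks roughly like the sketch below (illustrative values only; the driver class, URL and credentials are placeholders, and maxActive/maxIdle are the Commons DBCP 1.x property names):

    import org.apache.commons.dbcp.BasicDataSource;

    // Illustrative pool configuration; driver, URL and credentials are placeholders.
    public class PoolConfig {
        public static BasicDataSource createDataSource() {
            BasicDataSource ds = new BasicDataSource();
            ds.setDriverClassName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
            ds.setUrl("jdbc:sqlserver://dbhost:1433;databaseName=appdb"); // hypothetical server/database
            ds.setUsername("app_user");
            ds.setPassword("secret");
            ds.setMaxActive(150); // maximum connections handed out at once
            ds.setMaxIdle(10);    // idle connections kept in the pool
            return ds;
        }
    }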
The problem now is that I want to add concurrency to the application. As part of this, I am making concurrent calls to the business-layer method that calls the DAO-level methods that perform the DB inserts. This results in the error below:
Exception occurred java.util.concurrent.ExecutionException: org.hibernate.HibernateException:
Illegal attempt to associate a collection with two open sessions
When I monitor the database, I see that multiple connections are being opened but are not being closed.
I am not sure how to proceed further to debug/fix this. Any pointers would be helpful.

If Hibernate is telling you:
Illegal attempt to associate a collection with two open sessions
Basically you are opening two sessions, each with its own transaction, and you are trying to associate an entity (or its collection) that is attached to one session with another session. Yes, concurrency is your major problem here, but it can be tackled if you design your application to handle sessions carefully.
The stack trace will tell you which methods are causing the exception. Check how long your units of work keep a session open, try to shorten them, and make sure your sessions are always closed after use.
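As a minimal sketch (class and entity names are illustrative, not from your code), each DAO method should open its own session, do its work in one transaction, and always close the session in a finally block:

    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.Transaction;

    // Sketch only: OrderDao, Order and the injected SessionFactory are
    // placeholders for your own DAO and entity classes.
    public class OrderDao {
        private final SessionFactory sessionFactory;

        public OrderDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        public void saveOrder(Order order) {
            Session session = sessionFactory.openSession(); // one new session per unit of work
            Transaction tx = null;
            try {
                tx = session.beginTransaction();
                session.saveOrUpdate(order); // the entity and its collections belong to this session only
                tx.commit();
            } catch (RuntimeException e) {
                if (tx != null) tx.rollback();
                throw e;
            } finally {
                session.close(); // always close, even on failure, so the connection returns to the pool
            }
        }
    }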
An application built on Hibernate can follow various session-management patterns.
You need the session-per-request pattern. In this model, a request from the client is sent to the server, where the Hibernate persistence layer runs. A new Hibernate Session is opened, and all database operations are executed in this unit of work. On completion of the work, and once the response for the client has been prepared, the session is flushed and closed. Use a single database transaction to serve the client's request, starting and committing it when you open and close the Session. The relationship between the two is one-to-one, and this model is a perfect fit for many applications.
Do not use the anti-patterns session-per-user-session or session-per-application.
The Transactions and Concurrency chapter of the Hibernate documentation gives an in-depth analysis and examples.
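A common way to implement session-per-request in a servlet container such as Tomcat is a filter that starts the transaction when the request arrives and commits (or rolls back) when it finishes. A hedged sketch, assuming hibernate.current_session_context_class=thread and a hypothetical HibernateUtil holder for the SessionFactory:

    import java.io.IOException;
    import javax.servlet.*;
    import org.hibernate.SessionFactory;

    // Sketch of the session-per-request pattern; HibernateUtil and the
    // thread-bound session context are assumptions, not from the question.
    public class SessionPerRequestFilter implements Filter {

        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                throws IOException, ServletException {
            SessionFactory sf = HibernateUtil.getSessionFactory();
            try {
                sf.getCurrentSession().beginTransaction();   // one session + one transaction per request
                chain.doFilter(request, response);           // business layer and DAOs reuse getCurrentSession()
                sf.getCurrentSession().getTransaction().commit(); // committing also closes the current session
            } catch (RuntimeException e) {
                if (sf.getCurrentSession().getTransaction().isActive()) {
                    sf.getCurrentSession().getTransaction().rollback();
                }
                throw e;
            }
        }

        public void init(FilterConfig filterConfig) {}
        public void destroy() {}
    }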

Related

How should I design multiple instances of the same application updating one database?

I'm managing an online book store website. For the sake of high availability I've set up two Tomcat instances running the website application; they are exactly the same program, and they share the same database, which is located on another server.
My question is: how can I avoid conflicts or dirty data when the two applications perform the same updates/inserts against the database at the same time?
For example: update t_sale set total='${num}' where category='cs'. If two processes execute the above SQL simultaneously, it could cause data loss.
If by "database" you are talking about a well designed schema that is running on an RDBMS such as Oracle, DB2, or SQL Server, then the database itself will prevent what you call "conflicts" by locking parts of the database during each update transaction.
You can prevent "dirty data" from getting into the database by adding features such as check constraints and primary/foreign key relationships in the database itself.
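For the specific statement in the question, one way to let the database's own locking do the work is to make the update atomic, computing the new total inside the UPDATE instead of reading it into the application, changing it, and writing it back. A JDBC sketch, with the assumption that the intent is to add num to the existing total rather than overwrite it:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Sketch only: t_sale and its columns come from the question; the "add to
    // total" semantics (rather than overwrite) is an assumption about the intent.
    public class SaleDao {
        private final DataSource dataSource;

        public SaleDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public int addToTotal(String category, int num) throws SQLException {
            String sql = "UPDATE t_sale SET total = total + ? WHERE category = ?";
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setInt(1, num);
                ps.setString(2, category);
                // The row lock taken by the UPDATE makes concurrent increments safe;
                // neither instance can lose the other's change.
                return ps.executeUpdate();
            }
        }
    }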

Load balancer and multiple instance of database design

The current single application server can handle about 5000 concurrent requests. However, the user base will be in the millions, and I may need two application servers to handle the requests.
So the plan is to put a load balancer in front, in the hope of handling over 10000 concurrent requests. However, the data for every user is stored in one single database. If I move to two or more servers, should I do the following?
Having two instances of databases
Real-time sync between two database
Is this correct?
If so, will the sync process lower the performance of the servers, since database replication seems costly?
Thank you.
You probably want to think of your service in "tiers". In this instance, you've got two tiers; the application tier and the database tier.
Typically, your application tier is going to be considerably easier to scale horizontally (i.e. by adding more application servers behind a load balancer) than your database tier.
With that in mind, the best approach is probably to overprovision your database (i.e. put it on its own, meaty server) and have your application servers all connect to that same database. Depending on the database software you're using, you could also look at using read replicas (AWS docs) to reduce the strain on your database.
You can also look at caching via Memcached / Redis to reduce the amount of load you're placing on the database.
So, tl;dr: put your DB on its own big server, and spread your application code across many small servers, all connecting to that same DB server.
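As a rough illustration of the caching idea, a read-through lookup checks the cache first and only falls back to the database on a miss. The in-process map below is only a stand-in for Memcached/Redis, and the users table and column names are hypothetical:

    import java.sql.*;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import javax.sql.DataSource;

    // Read-through cache sketch. In production the Map would be replaced by a
    // Memcached/Redis client with a TTL; everything here is illustrative.
    public class UserNameCache {
        private final Map<Long, String> cache = new ConcurrentHashMap<>();
        private final DataSource dataSource;

        public UserNameCache(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public String getUserName(long userId) throws SQLException {
            String cached = cache.get(userId);
            if (cached != null) {
                return cached; // cache hit: no database round trip
            }
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement("SELECT name FROM users WHERE id = ?")) {
                ps.setLong(1, userId);
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next()) {
                        return null;
                    }
                    String name = rs.getString(1);
                    cache.put(userId, name); // populate the cache on a miss
                    return name;
                }
            }
        }
    }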
The best option could be synchronizing a standby node with data from the active node; this is a cost-effective solution since it is achievable with an open-source relational database (e.g. MariaDB).
Do not store computable results and statistics that can easily be derived at run time; this helps reduce the data size.
If historical data is not urgently needed for queries, it can be written to text files in a format that is easy to import back into the database (e.g. .csv).
Data objects that are updated very often can be kept in an in-memory store as key/value pairs; use a scheduled task to perform batch updates/inserts to the relational database for persistence (a sketch follows this list).
Implement retry logic for the database batch-update tasks to handle DB downtime or network errors.
Consider writing data to the relational database as serialized objects.
Cache configuration data from the database in memory, refreshing the changed parts either periodically or via an API.
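A sketch of the "in-memory key/value store plus scheduled batch flush with retry" ideas above; the counters map, t_counter table, and retry policy are all assumptions made for illustration:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import javax.sql.DataSource;

    // Frequently updated counters are kept in memory and flushed to the relational
    // database in one batch on a schedule, with a simple retry to ride out DB
    // downtime or network errors. All names are illustrative.
    public class CounterFlusher {
        private final Map<String, Long> counters = new ConcurrentHashMap<>();
        private final DataSource dataSource;
        private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        public CounterFlusher(DataSource dataSource) {
            this.dataSource = dataSource;
            scheduler.scheduleAtFixedRate(this::flushWithRetry, 1, 1, TimeUnit.MINUTES);
        }

        public void increment(String key) {
            counters.merge(key, 1L, Long::sum); // hot-path update stays in memory
        }

        private void flushWithRetry() {
            for (int attempt = 1; attempt <= 3; attempt++) {
                try {
                    flush();
                    return;
                } catch (Exception e) {
                    // DB down or network error: back off and try again; give up after 3 attempts
                    try { Thread.sleep(5_000L * attempt); } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }

        private void flush() throws Exception {
            // Take a snapshot of the current deltas; merge them back if the flush fails.
            Map<String, Long> snapshot = new ConcurrentHashMap<>();
            for (String key : counters.keySet()) {
                Long delta = counters.remove(key);
                if (delta != null) snapshot.put(key, delta);
            }
            if (snapshot.isEmpty()) return;
            String sql = "UPDATE t_counter SET value = value + ? WHERE name = ?"; // hypothetical table
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(sql)) {
                con.setAutoCommit(false);
                for (Map.Entry<String, Long> e : snapshot.entrySet()) {
                    ps.setLong(1, e.getValue());
                    ps.setString(2, e.getKey());
                    ps.addBatch();
                }
                ps.executeBatch();
                con.commit(); // one batch transaction for the whole flush
            } catch (Exception e) {
                snapshot.forEach((k, v) -> counters.merge(k, v, Long::sum)); // put deltas back for the next attempt
                throw e;
            }
        }
    }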

Keep a transaction open on SQL Server with connection closed

On SQL Server, is it possible to begin a transaction but intentionally orphan it from an open connection yet keep it from rolling back?
The use case is a REST service.
I'd like to be able to link a series of HTTP requests so that they work under one transaction. This can be done if the service is stateful, i.e. a single REST API server holds a connection open (mapping an HTTP header value to a named connection), but that is a flawed idea in a non-sticky farm of web servers.
If the DB supported the notion of something like named/leased transactions, kinda like a named mutex, this could be done.
I appreciate there are other RESTful designs for atomic data mutations.
Thanks.
No. A transaction lives and dies with the session it's created in, and a session lives and dies with its connection. You can keep a transaction open for as long as you like -- but only by also keeping the session, and thereby the connection open. If the session is closed before the transaction commits, it automatically rolls back. Which is a good thing, in general, because transactions tend to use pessimistic locking. You don't want to keep those locks around for longer than necessary.
While there is such a thing as a distributed transaction that you can enlist in even if the current connection did not begin the transaction, this will still not do what you want for the scenario of multiple distributed nodes performing actions in succession to complete a transaction on one database. Specifically, you'd still need to have one "master" node to keep the transaction alive and decide it should finally commit now, and you need a way to make nodes aware of the transaction so they can enlist. I don't recommend you actually go this way, as it's much more complicated than tailoring a solution to your specific scenario (typically, accumulating modifications in their own table and committing them as a batch when they're complete, which can be done in one transaction).
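A hedged sketch of the "accumulate modifications and commit them as a batch" idea: each HTTP request only inserts a pending-change row keyed by a client-supplied batch id, and the final request applies the whole batch inside one short database transaction. The pending_change and account tables are hypothetical:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Sketch only: the tables and the batch-id scheme are assumptions used to
    // illustrate the pattern, not part of the original question.
    public class BatchedChanges {
        private final DataSource dataSource;

        public BatchedChanges(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        // Called by each intermediate REST request: record the change, no long-lived transaction.
        public void stageChange(String batchId, long accountId, long amount) throws SQLException {
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO pending_change (batch_id, account_id, amount) VALUES (?, ?, ?)")) {
                ps.setString(1, batchId);
                ps.setLong(2, accountId);
                ps.setLong(3, amount);
                ps.executeUpdate();
            }
        }

        // Called by the final REST request: apply the whole batch in one short transaction.
        public void commitBatch(String batchId) throws SQLException {
            try (Connection con = dataSource.getConnection()) {
                con.setAutoCommit(false);
                try (PreparedStatement apply = con.prepareStatement(
                         "UPDATE a SET a.balance = a.balance + p.amount "
                       + "FROM account a JOIN pending_change p ON p.account_id = a.id "
                       + "WHERE p.batch_id = ?");
                     PreparedStatement clean = con.prepareStatement(
                         "DELETE FROM pending_change WHERE batch_id = ?")) {
                    apply.setString(1, batchId);
                    apply.executeUpdate();
                    clean.setString(1, batchId);
                    clean.executeUpdate();
                    con.commit(); // all staged changes become visible atomically
                } catch (SQLException e) {
                    con.rollback();
                    throw e;
                }
            }
        }
    }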
You could use a queue-oriented design, where the application simply adds to a queue while a SQL Server Agent job pops the queue and executes the work.
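The application side of that design can be as small as the sketch below: the request is recorded in a queue table and the call returns immediately, while a SQL Server Agent job polls the table and executes the pending items. The work_queue table and its columns are hypothetical:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Enqueue side of a queue-table design; table and column names are illustrative.
    public class WorkQueue {
        private final DataSource dataSource;

        public WorkQueue(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public void enqueue(String payload) throws SQLException {
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO work_queue (payload, status, queued_at) VALUES (?, 'PENDING', GETDATE())")) {
                ps.setString(1, payload);
                ps.executeUpdate(); // fast insert; the heavy work happens later in the Agent job
            }
        }
    }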

Best practice to recover a CRUD statement to a linked server if the connection is lost

I am looking for the best practice for the following scenario.
We have a CRM in our company. When an employee updates the record of a company, a trigger fires a stored procedure that runs a CRUD statement against the linked server hosting the SQL DB of our website.
Question:
What happens when the connection is lost in the middle of the CRUD and the SQL DB of the website did not get updated? What would be the best way to have the SQL statement processed again when the connection is back?
I read about Service Broker or Transactional Replication. Is one of these more appropriate for that situation?
The configuration:
Local: SQL server 2008 R2
Website: SQL server 2008
Here's one way of handling it, assuming that the CRUD statement isn't modal, in the sense that you have to give a response from the linked server to the user before anything else can happen:
The trigger stores, in a local table, all the meta-information you need to run the CRUD statement on the linked server.
A job runs every n minutes that reads the table, attempts to do the CRUD statements stored in the table, and flags them as done if the linked server returns any kind of success message. The ones that aren't successful stay as they are in the table until the next time the job runs.
If the call failed in the middle of the trigger, the work would still be inside the original transaction, so the data would not be written to either the CRM database or the web database. There is also a potential performance problem: the SQL Server data-modification query wouldn't return control to the client until both the local and the remote change had completed. The latter wouldn't be a problem if the query were executed asynchronously, but fire-and-forget isn't a good pattern for writing data.
Service Broker would allow you to write the modification into a message as binary data and take care of ensuring that it is delivered in order and properly processed at the remote end. Performance would not be so bad, because the insert into the queue is designed to complete quickly, returning control to the trigger and allowing the original CRM query to complete.
However, it is quite a bit to set up. Even using Service Broker for simple tasks on a local server takes a fair amount of setup, partly because it is designed to handle secure, multicast, reliable, ordered conversations, so it needs a few layers to work. Once it is all there, it is very reliable and does a lot of the work you would otherwise have to do yourself to set up this sort of distributed conversation.
I have used it in the past to create a callback system from a website: a customer enters their number on the site and requests a callback; this is sent via Service Broker over a VPN from the web server to the back-office server, and a client application waits for calls on the Service Broker queue. It worked very efficiently once it was set up.

Can someone tell me if SQL Server Service Broker is needed for this scenario?

My first ever question on Stack Overflow, so please go easy. I have a long-running Windows application that continually processes SQL Server commands. I also have a web front end that users occasionally use to update the same DB. I've noticed that sometimes (depending on what the Windows application is processing at the time), if a user submits something to the DB, I receive out-of-memory exceptions on the server. I realise I need to dig around a bit more and optimise the code. However, I cannot afford for the server to go down, and I expect that in the future I'll be allowing more and more users on the front end. What I really need is a system that will queue the users' requests (they are not time critical) and process them when the DB is ready.
I'm using SQL 2012 express.
Is SQL Server Service Broker the best solution? I've also looked into MSMQ.
If so, can someone point me in the right direction? It would be appreciated. In my searching I'm just finding a lot of things it does that I don't think I need.
Cheers
It depends on where you're doing the persistence work and/or the calculations. If you're doing the hard work in your Windows application, then using a Service Broker queue won't be worthwhile: all you would be doing is receiving a message from the Service Broker queue in your Windows application, doing your calculations and/or queries there, and then persisting the results to the database. Since your database is already under memory pressure, this seems like an unnecessary extra load, as you could just as easily queue and retrieve the message with MSMQ (or any other queueing technology).
If, however, you are doing all the work in the database and your Windows application just acts as a marshalling service, e.g. taking the request and handing it off to a stored procedure for actioning, then Service Broker queues may be worth using: because they already operate within the context of the database, they can be very efficient at persisting and querying data.
You would also want to take failure modes into account, depending on whether or not you can afford to lose any messages. To ensure message persistence in MSMQ you have to use transactional messaging. Service Broker is more efficient at transactional queue processing than MSMQ (because it has transaction support built in, unlike MSMQ, which has to use DTC and so adds overhead), but if your volume of messages is low, this may not be an issue.
