We have MongoDB hosted as a Kubernetes Pod.
There are no replicas.
We have a collection (c1) that is not indexed.
We have a scenario where:
Backend app 1 has a thread that initiates an update query on multiple documents. This single thread performs the update every 2 seconds.
Backend app 2 has a thread that initiates a select query on multiple documents; there can be multiple such threads. The update query is performed as a transaction.
The two backend apps each have their own MongoDB driver with a database connection pool.
Problem
Select query performance impacts backend app 2, delaying its responses to the UI (frontend).
Does the update query create a performance delay for the select query?
Related
Querying from a view into a temp table can insert 800K records in < 30 seconds. However, querying from the view to my app across the network takes 6 minutes. Does the server build the dataset and then send it, releasing any locks acquired after the dataset is built? Or are the locks held for that entire 6 minutes?
Does the server build the dataset and then send it, releasing any locks acquired after the dataset is built?
If you're using READ COMMITTED SNAPSHOT or are in SNAPSHOT isolation, then there are no row or page locks in the first place.
Past that, it depends on whether it's a streaming query plan or not. With a streaming plan, SQL Server may be reading from the tables slowly, at the pace the results are consumed across the network.
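For reference, a minimal sketch of enabling those modes; the database name is a placeholder:

-- Ordinary reads use row versioning instead of shared locks:
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- Let sessions opt in to SNAPSHOT isolation explicitly:
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;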
I've set up two SQL DBs on Azure with geo-replication. The primary is in Brazil and a secondary is in West Europe.
Similarly, I have two web apps running the same web API: a Brazilian web app that reads from and writes to the Brazilian DB, and a European web app that reads from the European DB and writes to the Brazilian DB.
When I test response times on read-only queries with Postman from Europe, I first notice that on a first "cold" call the European web app is twice as fast as the Brazilian one. However, on immediate subsequent calls, response times on the Brazilian web app are 10% of the initial "cold" call, whereas response times on the European web app remain the same. I also notice that after a few minutes of inactivity, results are back to the "cold" case.
So:
1. Why do query response times drop in Brazil?
2. Whatever the answer to 1 is, why doesn't it happen in Europe?
3. Why doesn't the response-time improvement from 1 last after a few minutes of inactivity?
Note that both web apps and DB are created as copy/paste (except geo-replication) from each other in an Azure ARM json file.
Both web apps are alwaysOn.
Thank you.
UPDATE
Actually, there are several parts in action in what I see as an end user: the web apps and the DBs. I wrote this question thinking the issue was around the DBs and geo-replication; however, after trying @Alberto's script (see below) I couldn't see any differences in wait times when querying Brazil or Europe, so the problem may be on the web apps. I don't know how to further analyse/test that.
UPDATE 2
This may (or may not) be related to the Query Store. I asked a new, more specific question on that subject.
UPDATE 3
Queries on the secondary database are not slower. My question was based on false conclusions. I won't delete it, as others took time to answer it, and I thank them.
I was comparing query response times through REST calls to a web API running EF queries on a SQL Server DB. As REST calls to the web API located in the region querying the DB replica are slower than REST calls to the same web API deployed in another region targeting the primary DB, I concluded the problem was on the DB side. However, when I run the queries in SSMS directly, bypassing the web API, I observe almost no difference in response times between the primary and the replica DB.
I still have a problem, but it's not the one raised in this question.
On Azure SQL Database your database's memory utilization may be dynamically reduced after some minutes of inactivity, and in this behavior Azure SQL differs from SQL Server on-premises. If you run a query two or three times, it then starts to execute faster again.
If you examine the query execution plan and its wait stats, you may find a wait named MEMORY_ALLOCATION_EXT for queries executing after the memory allocation has been shrunk by the Azure SQL Database service. Databases with a lot of activity and query execution may not see their memory allocation reduced. For detailed information on my part, please read this StackOverflow thread.
Take into consideration also that both databases should have the same service tier assigned.
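As a quick check (a small sketch; run it in each database), you can confirm the assigned tier from within Azure SQL itself:

SELECT DATABASEPROPERTYEX(DB_NAME(), 'Edition')          AS [edition],
       DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjective') AS [service_objective];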
Use the script below to determine query waits and see what the difference is, in terms of waits, between both regions.
DROP TABLE IF EXISTS #before;
SELECT [wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms],
[signal_wait_time_ms]
INTO #before
FROM sys.[dm_db_wait_stats];
-- Execute test query here
SELECT *
FROM [dbo].[YourTestQuery];
-- Finish test query
DROP TABLE IF EXISTS #after;
SELECT [wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms],
[signal_wait_time_ms]
INTO #after
FROM sys.[dm_db_wait_stats];
-- Show accumulated wait time
SELECT [a].[wait_type], ([a].[wait_time_ms] - [b].[wait_time_ms]) AS [wait_time]
FROM [#after] AS [a]
INNER JOIN [#before] AS [b] ON
[a].[wait_type] = [b].[wait_type]
ORDER BY ([a].[wait_time_ms] - [b].[wait_time_ms]) DESC;
Is it normal to have 1 database, on a DB server, that is used by a frontend (web) server, but then also have a third server doing an UPDATE in that database?
I want the frontend server to send queries to a DB table to check if an action is "done".
SELECT status FROM table WHERE id = '...';
But that action will only be "done" if this third server sends an UPDATE to that table, and updates the status.
UPDATE table SET status = 'done' WHERE id = '...';
So two different servers (frontend and backend) will need to communicate with the DB. Is that potentially problematic? Is there a 'cleaner' solution?
It is common to do this. What you need to consider is the isolation level of the transactions.
As long as it is ‘Read Committed’ or above, those simple queries are safe.
If more complex or multiple queries need to be executed, then the Repeatable Read or Serializable level should be considered, as in the sketch below.
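For instance, a minimal sketch of opting a session into a stricter level before a read-then-act sequence (using the hypothetical table and column names from the question):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
    -- The row read here stays protected against concurrent UPDATEs
    -- until this transaction commits.
    SELECT [status] FROM [table] WHERE [id] = @id;
    -- ... act on the status ...
COMMIT TRANSACTION;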
In our legacy architecture, we have an MS SQL Server database. This database stores all the sensor information on a near-real-time basis; on average it receives 100 records per second. In order to get complete information about the sensor events, we need to join 2 to 3 tables in the database.
Sample Query:
SELECT SOMETHING
FROM TABLE1 AS tab1
INNER JOIN TABLE2 AS tab2 ON tab1.UpdateID = tab2.ID
INNER JOIN TABLE3 AS tab3 ON tab1.TagID = tab3.ID
WHERE tab2.UpdateTime > ${lastExtractUnixTime}
Our requirement is to capture the data changes of the above query every 1 minute and post the records to Kafka.
Temporarily I am doing CDC using Spark Core JDBC: processing the records, sending them to Kafka, and maintaining the CDC state, along with ${lastExtractUnixTime}, in HBase as a Phoenix table. The job is scheduled with a 1-minute batch interval. A sketch of this watermark pattern is shown below.
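Conceptually, each extraction cycle looks like this (a hedged T-SQL sketch; the state table and job name are hypothetical, and here the watermark is kept in SQL Server rather than in Phoenix):

-- Hypothetical state table holding the last extraction watermark.
CREATE TABLE dbo.ExtractState (
    JobName     VARCHAR(100) PRIMARY KEY,
    LastExtract BIGINT NOT NULL  -- unix time of the last successful extract
);

-- Each cycle: read the watermark, pull the delta, then advance it.
DECLARE @lastExtractUnixTime BIGINT =
    (SELECT LastExtract FROM dbo.ExtractState WHERE JobName = 'sensor-cdc');

SELECT SOMETHING
FROM TABLE1 AS tab1
INNER JOIN TABLE2 AS tab2 ON tab1.UpdateID = tab2.ID
INNER JOIN TABLE3 AS tab3 ON tab1.TagID = tab3.ID
WHERE tab2.UpdateTime > @lastExtractUnixTime;

UPDATE dbo.ExtractState
SET LastExtract = DATEDIFF(SECOND, '1970-01-01', GETUTCDATE())
WHERE JobName = 'sensor-cdc';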
As a long-term solution, we are planning to use Apache NiFi to do the CDC and post the information to Kafka; Spark Streaming will read the messages from Kafka, apply some business logic, and send the enriched data to another Kafka topic. I can't find a suitable processor that would help me dynamically pass the ${lastExtractUnixTime} into the SQL and get the delta records every 1 or 2 minutes.
Please suggest how this can be accomplished using Apache NiFi.
I need to sync DB tables (upload to the remote DB first, then download to the mobile device) between a remote DB and a mobile device, which may insert/update/delete rows in multiple tables.
The remote DB performs other operations based on the uploaded sync data. When the sync continues on to download data to the mobile device, the remote DB is still performing the previous tasks, and this leads to sync failure: something like a critical section, where both the sync and the DB operations want access to the remote database. How do I solve this issue? Is it possible to sync the DB and operate on the same DB at the same time?
I am using a SQL Server 2008 DB and MobiLink sync.
Edit:
Operations I do, in sequence:
1. An iPhone is loaded with an application which uses MobiLink to sync data.
2. SYNC means an UPLOAD (from the device to the remote DB) followed by a DOWNLOAD (from the remote DB to the device).
3. The remote DB is the consolidated DB; the device DB is an UltraLite DB.
4. The remote DB has some triggers that fire when certain tables are updated.
5. An UPLOAD from the device to the remote DB fires those triggers when the sync upload finishes.
6. The very next moment the UPLOAD finishes, the DOWNLOAD to the device starts.
7. At exactly the same moment, those DB triggers fire.
8. Now a deadlock occurs between the DB sync (the DOWNLOAD) and the trigger operations (which include UPDATE queries).
9. The sync fails with an error saying it cannot access some tables.
I did a lot of workarounds and Googling, and came out with a simple(?!) solution for the problem
(though the exact problem cannot be solved at this point; I tried my best).
Keep track of all clients who do a sync (a kind of user-details record).
Create a SQL job schedule which contains all the operations to be performed when a user syncs.
Announce a "maintenance period" every day to execute the SQL job's tasks against the saved user/client sync details. A sketch of this tracking pattern follows below.
Keeping track of client details on every sync is costlier, but much needed!
The remote consolidated DB is "completely updated" only after the maintenance period.
Any approaches better than this would be appreciated! All suggestions are welcome!
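For illustration, a minimal sketch of the tracking table and the deferred maintenance job (all names hypothetical):

-- Hypothetical log of client syncs whose heavy processing is deferred.
CREATE TABLE dbo.SyncLog (
    SyncID    INT IDENTITY PRIMARY KEY,
    ClientID  VARCHAR(50) NOT NULL,
    SyncedAt  DATETIME NOT NULL DEFAULT GETUTCDATE(),
    Processed BIT NOT NULL DEFAULT 0
);

-- The maintenance-period job picks up unprocessed syncs, runs the
-- operations the triggers used to perform, then marks them done.
DECLARE @cutoff INT = (SELECT MAX(SyncID) FROM dbo.SyncLog WHERE Processed = 0);

-- ... perform the deferred per-client updates for SyncID <= @cutoff ...

UPDATE dbo.SyncLog
SET Processed = 1
WHERE Processed = 0 AND SyncID <= @cutoff;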
My understanding of your system is the following:
The mobile application sends an UPDATE statement to the SQL Server DB.
There is an ON UPDATE trigger that updates around 30 tables (= at least 30 UPDATE statements in the trigger + 1 main UPDATE statement).
The UPDATE is executed in a single transaction. This transaction ends when the trigger completes all its updates.
The mobile application does not wait for the UPDATE to finish and sends multiple SELECT statements to get data from the database.
These SELECT statements query the same tables the trigger above is updating.
Blocking and deadlocks occur on some query for some user, as the trigger does not complete its updates before the selects run and keeps locks on the tables. (You can confirm the blocking with the diagnostic sketch below.)
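A hedged diagnostic sketch: while the problem is occurring, this standard DMV query lists the sessions that are blocked and who is blocking them.

-- Currently blocked requests and their blockers.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       t.text AS current_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;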
When optimizing, we are trying to make our processes easier on the computer: achieve the same result in fewer iterations and use fewer resources, or resources that are more available/less overloaded.
My suggestions for your design:
Use parametrized stored procedures (SPs). Every time SQL Server receives a statement, it creates an execution plan. For 1 UPDATE statement with a trigger, the DB needs at least 31 execution plans. This happens in a busy production environment for every connection, every time the app updates the DB. It is a big waste.
How would SPs help reduce blocking?
Right now you have 1 transaction for 31 queries, where locks are issued against all the tables involved and held until the transaction commits. With SPs you'll have 31 small transactions, and only 1-2 tables will be locked at a time, as in the sketch below.
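A minimal sketch of one such small parametrized procedure (table and column names are hypothetical):

-- One small unit of work in its own short transaction, so the locks on
-- dbo.SensorReadings are released as soon as it commits.
CREATE PROCEDURE dbo.UpdateSensorStatus
    @SensorID INT,
    @Status   VARCHAR(20)
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;
        UPDATE dbo.SensorReadings
        SET [Status] = @Status
        WHERE SensorID = @SensorID;
    COMMIT TRANSACTION;
END;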
Another question I would like to address: how to do asynchronous updates to your database?
There is a feature in SQL Server called Service Broker. It allows you to process a message queue (rows in a queue table) automatically: it monitors the queue, takes messages from it, does the processing you specify, and deletes the processed messages from the queue.
For example, you save the parameters for your SPs as messages, and Service Broker executes the SPs with those parameters.
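A minimal Service Broker sketch of that idea (all object names hypothetical; the actual processing is elided):

-- Queue plumbing: message type, contract, queue, and service.
CREATE MESSAGE TYPE UpdateRequest VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT UpdateContract (UpdateRequest SENT BY INITIATOR);
CREATE QUEUE UpdateQueue;
CREATE SERVICE UpdateService ON QUEUE UpdateQueue (UpdateContract);
GO

-- Activation procedure: picks one message and processes it.
CREATE PROCEDURE dbo.ProcessUpdateQueue
AS
BEGIN
    DECLARE @handle UNIQUEIDENTIFIER, @body XML;
    -- Take the next message off the queue (wait up to 5 seconds).
    WAITFOR (
        RECEIVE TOP (1)
            @handle = conversation_handle,
            @body   = CAST(message_body AS XML)
        FROM UpdateQueue
    ), TIMEOUT 5000;
    IF @handle IS NOT NULL
    BEGIN
        -- ... call the appropriate SP with parameters taken from @body ...
        END CONVERSATION @handle;
    END
END;
GO

-- Let the queue run the procedure automatically as messages arrive.
ALTER QUEUE UpdateQueue WITH ACTIVATION (
    STATUS = ON,
    PROCEDURE_NAME = dbo.ProcessUpdateQueue,
    MAX_QUEUE_READERS = 1,
    EXECUTE AS OWNER
);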