Spectre/Meltdown slowing down Delphi service - sql-server

I have a problem with the Spectre/Meltdown patches from Windows (released around Q1 last year). When they are active, my Delphi REST service slows down by a factor of about 15 (so a request that normally takes 1 second takes about 15 seconds with the patches enabled). I have traced the slowdown to the database connection. The translation from parameters to the SQL text, after they have all been set, takes very long, and then the execution on the database itself also takes much longer than usual. At first I worked around it by cutting the SQL statement down to a couple of rows, and it got faster, so more rows mean much more time: roughly, each additional row in an update/insert statement adds 0.2-0.3 seconds to the transaction. As far as I can tell, select statements are fine.
After I got the same issue on other requests, and since the application is still in development, I turned off the patches and everything got a lot faster. Now the administrator insists that the patches be turned on, and the problem is back.
Has anybody experienced something like this, or is there a possibility to exclude an application from being targeted by the patches? The strange thing is, I also have a client/server application that uses the same business logic. That application is also slowed down, but only by roughly a factor of 2. That's the part I don't quite understand: with the same functions, it takes much longer from within the service than from the client/server application.
Ah yes, I am using Devart for the database connection, and it's a MS SQL Server (2016). The service and the client/server application are written in Delphi XE7 (I am now trying to update to XE 10.2, hoping that this will help).
Thanks
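One way to narrow this down from outside the Delphi stack is to time the same kind of parameterised multi-row INSERT with a growing row count against the same server. If the per-row cost stays flat here but grows inside the service, the penalty sits in the client-side parameter handling rather than on the server. A minimal sketch in Python with pyodbc; the connection string and the throwaway temp table are placeholders:

    import time
    import pyodbc

    CONN_STR = "DSN=example"   # placeholder; point at the same SQL Server 2016

    conn = pyodbc.connect(CONN_STR)
    cur = conn.cursor()
    cur.execute("CREATE TABLE #spectre_test (a INT, b VARCHAR(50))")

    for rows in (1, 10, 25, 50, 100):
        # Multi-row parameterised INSERT, mimicking the service's statements.
        sql = ("INSERT INTO #spectre_test (a, b) VALUES "
               + ", ".join(["(?, ?)"] * rows))
        params = [v for i in range(rows) for v in (i, f"row{i}")]

        t0 = time.perf_counter()
        cur.execute(sql, params)
        conn.commit()
        elapsed = time.perf_counter() - t0
        print(f"{rows:4d} rows: {elapsed:.3f}s  ({elapsed / rows * 1000:.1f} ms/row)")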

Related

Interbase 2020 crashes/loops

We use InterBase 2020 as our production DB with UTF8 (approx. 250 simultaneous users). With this database we have two main problems that we have not been able to solve.
In the past we had a problem with an older UDF that crashed our database because it was not ready for Unicode string operations. As a result we changed to Unicode-compatible versions.
For the last few years we sometimes get a 'hiccup' (as we call it): every client loses its connection and the guardian restarts the server. The clients can reconnect without us doing anything.
The second problem is that sometimes InterBase does not crash, but everyone loses the connection and it is not possible to reconnect (from a client or from IBExpert, for example). In this case we have to restart the whole server.
These problems occur irregularly. Most of the time it starts with a hiccup; some time later (maybe two to ten hours), the second problem arrives and we have to restart our database. If we are lucky we only need to restart the server 2-3 times; on a bad day the second problem returns again and again (for example every 30 minutes) and we have to restart more often.
We have not yet been able to locate the problem. It doesn't matter whether users are connected to the database or it is just idling over the weekend; it also often happens when nobody is connected.
Even the server logs haven't given us hints that helped so far.
-We minimized UDF use as far as possible, changed to newer UDFs that support Unicode, etc.
-Functions that (AFAIK) can crash the server are guarded so that they don't receive invalid datetimes, for example.
-We regularly update the database server to the newest version.
-We also updated the client DLLs.
-We also updated the connection components (IBDAC) and Delphi 11.1.
-We wrote an exception tracker into our client software (unfortunately it only ever reports the connection-lost error).
-We regularly check active transactions for anything that hangs/loops or blocks snapshot creation (see the sketch below this post).
Do you have any information that we could use to solve our problems? Is there any possibility to get more info out of the log files (are other log levels possible)? We don't want to log every procedure call if not necessary, but if there are no other options we will have to.
Thanks for your help!
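For the "regularly check active transactions" point: InterBase exposes temporary monitoring tables (TMP$ATTACHMENTS, TMP$TRANSACTIONS, TMP$STATEMENTS), and a periodic dump of them around a hiccup can show which attachment or transaction was active when it happened. A rough sketch in Python over an ODBC DSN (the DSN is a placeholder; any InterBase client that can run plain SELECTs works the same way):

    import pyodbc

    conn = pyodbc.connect("DSN=interbase_prod")   # placeholder ODBC DSN
    cur = conn.cursor()

    # Dump InterBase's temporary monitoring tables; run this on a schedule and
    # keep the output, so the state just before a hiccup can be reconstructed.
    for table in ("TMP$ATTACHMENTS", "TMP$TRANSACTIONS", "TMP$STATEMENTS"):
        cur.execute(f"SELECT * FROM {table}")
        cols = [d[0] for d in cur.description]
        print(f"=== {table} ===")
        for row in cur.fetchall():
            print(dict(zip(cols, row)))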
Matze,
I suggest you log a Case with our Support team at Embarcadero (https://www.embarcadero.com/support). They will work with you to understand the specifics of the crash, get relevant details (and Performance Monitoring information) from you, and help us work on a resolution (if not addressed already in our latest update).
We have addressed a few corner cases (and other crash reports) in many updates over the past couple years in InterBase 2020, and are eager to get to the bottom of this issue as well. You can see some of the resolved crash reports at https://docwiki.embarcadero.com/InterBase/2020/en/Resolved_Defects
Supporting 250 simultaneous users is not the problem, but understanding how the use cases are running into any potential system resource limits is important.
You do mention that you have the latest updates to InterBase 2020, but I do not see a build number in your message. You can get the most recent update build (14.4.0.804) of the server (if on Windows) from https://my.embarcadero.com/#downloadDetail/1383

How can I check if the system time of the DB server is correct?

I got a bug case from the service desk that turned out to be caused by different system times on the application server (JBoss) and the DB (Oracle) server. As a result, the timeout calculations were wrong.
It doesn't happen often, but for the future it would be better if the app server could raise an alarm about a bad clock on the DB server before it leads to deeper problems.
Of course, I can simply run
Select CURRENT_TIMESTAMP
and compare the result against the local time. But sending the query and receiving its result may itself take a noticeable amount of time, so I could classify a good clock as bad, or vice versa.
I can also measure the time from sending the query to the return of the result. But that only works correctly on a good network without lag, and if the clock on the DB server is off, it is quite likely that the network around the DB server is not OK either. Queues on the DB server can make the sending and receiving legs take noticeably different amounts of time.
What is the best way you know to check the time on the DB server?
Limitations:
-precision of 5 seconds
-false alarms < 10%
-to be optimized (minimized): missed alarms
Maybe I am reinventing the wheel and JBoss and/or Oracle already have a tool for this? (I could not find one.)
Have a program on the app server record the local time, then query the database time (CURRENT_TIMESTAMP) and record the local time again after the query returns.
Confirm that the DB time lies between the two app-server times (with whatever tolerance you need). You can include a separate check on how long the DB response took, but it should be trivial.
If the environment is some form of VM, issues are most likely to arise when the VM is started or resumed from a pause. A clock can also run fast or slow, so recording the times lets you look for trends in either direction and take preemptive action.
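A minimal sketch of that bracketing check, in Python for brevity (the same logic drops straight into a JBoss-side Java job). The connection setup is a placeholder, and depending on the driver and session settings you may want SYSTIMESTAMP and explicit time-zone handling instead of CURRENT_TIMESTAMP:

    import datetime
    import cx_Oracle  # placeholder driver; any DB-API connection works the same

    TOLERANCE = datetime.timedelta(seconds=5)   # precision limit from the question

    def db_clock_ok(conn):
        """Return True/False for the DB clock, or None if inconclusive."""
        before = datetime.datetime.now()
        cur = conn.cursor()
        cur.execute("SELECT CURRENT_TIMESTAMP FROM dual")
        db_time = cur.fetchone()[0]
        after = datetime.datetime.now()

        # Normalise an aware timestamp to naive local time so it compares cleanly.
        if db_time.tzinfo is not None:
            db_time = db_time.astimezone().replace(tzinfo=None)

        # If the round trip itself exceeded the tolerance, the bracketing window
        # is too wide to judge: report 'inconclusive' rather than a false alarm.
        if after - before > TOLERANCE:
            return None

        # The DB clock passes if its reading lies inside the local-time window,
        # widened by the allowed tolerance on both sides.
        return before - TOLERANCE <= db_time <= after + TOLERANCE

    conn = cx_Oracle.connect("user/password@dbhost/service")  # placeholder
    print(db_clock_ok(conn))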

How to avoid Mono ADO.NET serving wrong results to requests after one times out?

Targeting .NET Framework version 4, I'm seeing a possible concurrency issue in the SQL Server ADO.NET implementation on Mono 4.2.2 that manifests when queries are cancelled or time out on the client, using the SqlCommand.ExecuteReader API.
To reproduce the issue seen in the field:
I start 3 new timed tasks concurrently every second; each runs 3-5 relatively small queries that complete and return (all using SqlCommand.ExecuteReader). This runs as expected.
Then I add a long-running query to the test run, set to execute every 65 seconds but to be cancelled after 60 seconds. The query would take longer than 60 seconds to complete, so it gets cancelled every time (using SqlCommand.Cancel()).
After running for several minutes, suddenly most attempts to iterate the SqlDataReader fail because the expected fields are not present on the returned rows, so when the data layer tries to access them by name, an exception is thrown.
Adding logging code to print the fields on each row indicates that they come from another query in the test run, one that is either running concurrently or ran very recently.
Once this problem occurs for one query, it happens very frequently indeed; in fact most queries fail. In the field, even services handling only 5 or so queries a minute were returning the wrong recordsets for most queries.
Restarting the process fixes the problem.
FYI
A new connection, command and reader object are instantiated per query, and are used within their own 'using' blocks.
Default connection pooling for ADO.NET is being used
Most connections are to the same DB; there is a separate connection made to another DB on another server once per task run, but that one always completes successfully.
The code is mature and in production use on Windows .NET Framework systems without issue, and running the same tests on Windows .NET Framework cannot reproduce the problem, so it appears to be Mono-specific.
Has anyone else seen this and can tell me what I might be doing wrong? Would simply disabling connection pooling be a (temporary) workaround for this problem?
Following further testing, explicitly disabling connection pooling does in fact work around this issue, but of course it comes with additional overhead, especially for applications with a high frequency of queries.
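For reference, pooling is switched off with the Pooling=false connection-string keyword in ADO.NET. Independent of the workaround, a cheap belt-and-braces check against this kind of cross-wiring is to tag every query with a unique literal column and verify it on each row before trusting the result set. A minimal sketch, in Python with pyodbc for brevity (the idea is language-neutral, and the wrapper assumes a plain SELECT with no ORDER BY):

    import uuid

    def verified_query(cur, sql):
        # Wrap the SELECT with a per-call tag column. If the driver ever hands
        # back rows belonging to a different statement (the cross-wiring seen
        # above), the tag will not match and we fail loudly instead of serving
        # someone else's data.
        tag = str(uuid.uuid4())
        cur.execute(f"SELECT '{tag}' AS result_tag, q.* FROM ({sql}) AS q")
        rows = cur.fetchall()
        if any(row.result_tag != tag for row in rows):
            raise RuntimeError("result set does not belong to this query")
        return rows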

How to decrease the response time when dealing with SQL Server remotely?

I have created a VB.NET application that uses a SQL Server database at a remote location over the internet.
There are 10 VB.NET clients working at the same time.
The problem is the delay that occurs when inserting a new row or retrieving rows from the database: the form appears to freeze for a while whenever it talks to the database. I don't want to use a background worker to overcome the freezing.
I want to eliminate that delay, or at least decrease it as much as possible.
Any tips, advice or information are welcome. Thanks in advance.
Well, 2 problems:
The form appears to be freezing for a while when it deals with the database, I don't want to use a background worker
to overcome the freeze problem.
Vanity, arrogance and reality rarely mix. ANY operation that takes more than a SHORT time (0.1-0.5 seconds) SHOULD run async; it is the only way to keep the UI responsive. Regardless of what the issue is, if an operation CAN take longer or goes over the internet, decouple it from the UI.
But:
The problem is the delay that occurs when inserting a new row or retrieving rows from the database,
So, what IS the problem? Seriously. Is this a latency problem (too many round trips - work on more efficient SQL and batch, so you don't send 20 questions and wait for a result after each) or is the server overloaded? It is not clear from the question whether this really is a latency issue.
At the end:
I want to eliminate that delay time
Pray to whatever god you believe in to change the laws of physics (mostly the speed of light), or to your local physicist to finally make quantum teleportation workable at low cost. Packets take time to travel; there is no way to change that.
Check whether you use too many round trips. NEVER (!) talk to a remote SQL Server with raw SQL - put in a web service and make it fit the application, possibly even down to a 1:1 match with your screens, so you can ask for data and send updates in ONE round trip, not a dozen. When we did something similar 12 years ago with our custom ORM in .NET, we used a data access layer that accepted multiple queries in one run and returned multiple result sets for them - so a form with 10 drop-downs could ask for all 10 data sets in ONE round trip. If a request takes 0.1 seconds of internet latency, this saves 0.9 seconds. We had a form with about 100 (!) round trips (building a tree) and got it down to fewer than 5 - going from "that takes a while" to "wow". Plus it WAS async, sorry.
Then realize that moving a lot of data is SLOW unless you have an instant high-bandwidth connection.
This is exactly what async was made for - if you have transfer-time or latency issues that cannot be optimized away and you still refuse to use async, you will go on delivering a crappy experience.
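To make the one-round-trip idea concrete: SQL Server will execute a batch of several SELECTs sent as one command and return one result set per statement. A minimal sketch in Python with pyodbc (connection string and table names are placeholders; the same pattern works from ADO.NET with SqlDataReader.NextResult):

    import pyodbc

    CONN_STR = "DSN=example"   # placeholder; point at your SQL Server

    conn = pyodbc.connect(CONN_STR)
    cur = conn.cursor()

    # Several lookup queries sent as ONE batch = ONE network round trip,
    # instead of one round trip per drop-down.
    cur.execute("""
        SELECT id, name FROM countries;
        SELECT id, name FROM currencies;
        SELECT id, name FROM languages;
    """)

    dropdowns = []
    while True:
        dropdowns.append(cur.fetchall())   # result set of the current SELECT
        if not cur.nextset():              # advance to the next result set
            break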
You can execute the SQL call asynchronously and let Microsoft deal with the background process.
http://msdn.microsoft.com/en-us/library/7szdt0kc.aspx
Please note, this does not decrease the response time from the SQL server, for that you'll have to try to improve your network speed or increase the performance of your SQL statements.
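The general pattern, independent of language: run the query on a worker thread and hand the result back to the UI when it completes. A sketch in Python (connection string and query are placeholders; the callback wiring differs per UI framework):

    from concurrent.futures import ThreadPoolExecutor
    import pyodbc

    CONN_STR = "DSN=example"   # placeholder connection string
    pool = ThreadPoolExecutor(max_workers=4)

    def fetch_orders(customer_id):
        # Each call opens its own connection: connections should not be shared
        # across threads, and connection pooling makes reopening cheap.
        conn = pyodbc.connect(CONN_STR)
        try:
            cur = conn.cursor()
            cur.execute("SELECT * FROM orders WHERE customer_id = ?", customer_id)
            return cur.fetchall()
        finally:
            conn.close()

    # The UI thread submits the call and stays responsive; in real UI code you
    # would attach a completion callback instead of blocking on .result().
    future = pool.submit(fetch_orders, 42)
    rows = future.result()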
There are a few things you could potentially do to speed things up; however, it is difficult to say without seeing the code.
If you are using generic inserts, start using stored procedures (see the sketch after this list).
If you are closing the connection after every command then... well, don't. Establishing a connection is typically one of the more 'expensive' operations.
Increase the pipe between the two.
Add an index.
Investigate your SQL Server; perhaps it is not set up in a preferred manner.
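For the stored-procedure point above, the call shape through ODBC looks like this, sketched in Python with pyodbc (usp_InsertOrder and its parameters are made up for illustration):

    import pyodbc

    conn = pyodbc.connect("DSN=example")   # placeholder connection string
    cur = conn.cursor()

    # Hypothetical procedure usp_InsertOrder(@customer_id INT, @amount MONEY):
    # one precompiled, server-side unit of work instead of ad-hoc INSERT text.
    cur.execute("{CALL usp_InsertOrder (?, ?)}", 42, 19.99)
    conn.commit()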

Classic ASP Bottlenecks

I have 2 websites connecting to the same instance of MSSQL via classic ASP. Both websites are similar in nature and run similar queries.
One website chokes up every once in a while, while the other website is fine. This leads me to believe MSSQL is not the problem, otherwise I would think the bottleneck would occur in both websites simultaneously.
I've been trying to use Performance Monitor in Windows Server 2008 to locate the problem, but since everything is in aggregate form, it's hard to find the offending ASP page.
So I am looking for some troubleshooting tips...
Is there a simple way to check all recent ASP pages and see how long each ran for?
Is there a simple way to see live page requests as they happen?
I basically need to track down this offending code, but I am having a hard time seeing what is happening in real time through IIS.
If you use "W3C Extended Logging" as the log mode for your IIS log files, you can switch on a "time-taken" column, which gives you the execution time of each ASP request in milliseconds (by default, this column is disabled).
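To rank pages by time-taken straight from such a log file, a quick sketch in Python (the log file name is a placeholder; assumes the default space-separated W3C format with a #Fields header line):

    from collections import defaultdict

    times_by_page = defaultdict(list)
    fields = []

    with open("u_ex100101.log") as log:         # placeholder IIS log file
        for line in log:
            if line.startswith("#Fields:"):
                fields = line.split()[1:]       # column names for data lines
                continue
            if line.startswith("#") or not fields:
                continue                        # skip other comment lines
            row = dict(zip(fields, line.split()))
            if "cs-uri-stem" in row and "time-taken" in row:
                times_by_page[row["cs-uri-stem"]].append(int(row["time-taken"]))

    # Slowest pages by average execution time (time-taken is in milliseconds).
    ranked = sorted(times_by_page.items(),
                    key=lambda kv: sum(kv[1]) / len(kv[1]), reverse=True)
    for page, times in ranked[:20]:
        print(f"{page:60} avg {sum(times) / len(times):8.0f} ms  ({len(times)} hits)")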
You may find that something in one application is taking a lock in the database (e.g. through a transaction) and then not releasing it, which causes the other app to time out.
Check your code for transactions and make sure they are being closed, and possibly consider setting up tracing on the SQL Server to log deadlocks.
Your best bet is to run SQL Server Profiler to see what procedure or SQL may be taking a long time to execute. You can also use Process Monitor to see any pages that may be taking a long time to finish execution, and finally don't forget to check your IIS logs.
Hope that helps
