sql waits for sp/rpc/stmt completed and other events - sql-server

In SQL Server Extended Events (XE), it would be great if the sp/rpc/stmt completed events could include wait types such as IO and network waits. We can already see reads, CPU, and duration. If we could also get the other resource waits, we would have a good idea of why SQL Server is slow in scenarios where duration/CPU is high and reads are low.

You can actually track the wait statistics of a query, but the other way around: by tracking the waits themselves and recording which query caused them. Take a look at the following snippet:
CREATE EVENT SESSION [Wait Statistics] ON SERVER
ADD EVENT sqlos.wait_info
(
ACTION(sqlserver.database_name,sqlserver.session_id,sqlserver.sql_text)
WHERE (
opcode = N'End'
AND duration > 0
)
),
ADD EVENT sqlos.wait_info_external
(
ACTION(sqlserver.database_name,sqlserver.session_id,sqlserver.sql_text)
WHERE (
opcode = N'End'
AND duration > 0
)
)
Here we capture the end of every wait that has a duration greater than 0. We capture the end because at that point SQL Server knows the duration of the wait, so we can output it in the result. In the ACTION part we retrieve the database name, the text of the query that caused the wait, and the session id of the query.
Beware, though. Tracking wait statistics through Extended Events (rather than through sys.dm_os_wait_stats, which collects aggregate data) can generate a ton of data overwhelmingly fast. Should you choose this method, define very carefully which wait types you want to track and the duration above which a wait becomes a problem for you.
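Once the events are captured, they still have to be rolled up per query to answer the original question. The following is only a minimal Python sketch of that roll-up, assuming the session output has already been exported into plain records. The keys session_id, wait_type and duration_ms, the sample values, and the function name summarize_waits are all hypothetical and not part of the session definition above.

```python
from collections import defaultdict


def summarize_waits(events, min_duration_ms=0):
    """Sum wait durations per (session_id, wait_type).

    Waits at or below min_duration_ms are skipped, mirroring the
    `duration > 0` predicate used in the event session.
    """
    totals = defaultdict(int)
    for event in events:
        if event["duration_ms"] <= min_duration_ms:
            continue
        key = (event["session_id"], event["wait_type"])
        totals[key] += event["duration_ms"]
    return dict(totals)


# Hypothetical records, shaped like rows exported from the session.
sample_events = [
    {"session_id": 57, "wait_type": "PAGEIOLATCH_SH", "duration_ms": 12},
    {"session_id": 57, "wait_type": "ASYNC_NETWORK_IO", "duration_ms": 40},
    {"session_id": 57, "wait_type": "PAGEIOLATCH_SH", "duration_ms": 8},
    {"session_id": 61, "wait_type": "WRITELOG", "duration_ms": 3},
]

for key, total in sorted(summarize_waits(sample_events).items()):
    print(key, total)
```

Raising min_duration_ms is the client-side equivalent of tightening the duration predicate, although filtering in the session itself is what actually reduces the volume of collected data.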

Related

Azure SQL DB, Extended Events using Ring Buffer stop on their own after a while - why?

I run a fairly basic Extended Events session on Azure SQL DB, using a ring buffer target, that looks for errors with severity over 10, based on a post Brent Ozar made. However, the session stops on its own, and I'm not sure why. I suspect it's filling up, even though all the documentation says it's FIFO and will drop the oldest events. Am I missing something? Do I need to set something differently? Thanks!
Update: weirdly, I made a much smaller test, with max_memory = 200 and a much bigger amount of data in the failing code, to try to force it to stop or die, but I can see it looping as I expected. So I'm still confused about why it's stopping, but it doesn't seem to be because it's filling up.
CREATE EVENT SESSION
severity_10plus_errors_XE
ON database
ADD EVENT sqlserver.error_reported
(
ACTION(sqlserver.client_app_name,sqlserver.client_hostname,sqlserver.database_id,sqlserver.sql_text,sqlserver.tsql_stack,sqlserver.username)
--ACTION (sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.database_id, sqlserver.username)
WHERE ([severity]> 10)
)
ADD TARGET
package0.ring_buffer
(SET
max_memory = 10000 ) -- Units of KB.
WITH (MAX_DISPATCH_LATENCY = 60SECONDS)
GO
ALTER EVENT SESSION severity_10plus_errors_XE
ON DATABASE
STATE = START;
GO

Flink CEP cannot get correct results on a unioned table

I use Flink SQL and CEP to recognize some really simple patterns. However, I found a weird thing (likely a bug). I have two example tables password_change and transfer as below.
transfer
transid,accountnumber,sortcode,value,channel,eventtime,eventtype
1,123,1,100,ONL,2020-01-01T01:00:01Z,transfer
3,123,1,100,ONL,2020-01-01T01:00:02Z,transfer
4,123,1,200,ONL,2020-01-01T01:00:03Z,transfer
5,456,1,200,ONL,2020-01-01T01:00:04Z,transfer
password_change
accountnumber,channel,eventtime,eventtype
123,ONL,2020-01-01T01:00:05Z,password_change
456,ONL,2020-01-01T01:00:06Z,password_change
123,ONL,2020-01-01T01:00:08Z,password_change
123,ONL,2020-01-01T01:00:09Z,password_change
Here are my SQL queries.
First create a temporary view event as
(SELECT accountnumber,rowtime,eventtype FROM password_change WHERE channel='ONL')
UNION ALL
(SELECT accountnumber,rowtime, eventtype FROM transfer WHERE channel = 'ONL' )
The rowtime column is the event time, extracted directly from the original eventtime column, with a periodic bounded watermark of 1 second.
Then output the query result of
SELECT * FROM `event`
MATCH_RECOGNIZE (
PARTITION BY accountnumber
ORDER BY rowtime
MEASURES
transfer.eventtype AS event_type,
transfer.rowtime AS transfer_time
ONE ROW PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (transfer password_change ) WITHIN INTERVAL '5' SECOND
DEFINE
password_change AS eventtype='password_change',
transfer AS eventtype='transfer'
)
It should output
123,transfer,2020-01-01T01:00:03Z
456,transfer,2020-01-01T01:00:04Z
But I got nothing when running Flink 1.11.1 (also no output for 1.10.1).
What's more, if I change the pattern to only password_change, it still outputs nothing, but if I change the pattern to transfer, it outputs several rows, though not all of the transfer rows. If I swap the event times of the two tables, so that the password changes happen first, then the pattern password_change outputs several rows while transfer outputs none.
On the other hand, if I extract those columns from the two tables, merge them into one table manually, and then emit them into Flink, the result is correct.
I searched and tried a lot to get it right, including changing the SQL statement, watermark, buffer timeout and so on, but nothing helped. I hope someone here can help. Thanks.
10/10/2020 update:
I use Kafka as the table source. tEnv is the StreamTableEnvironment.
Kafka kafka=new Kafka()
.version("universal")
.property("bootstrap.servers", "localhost:9092");
tEnv.connect(
kafka.topic("transfer")
).withFormat(
new Json()
.failOnMissingField(true)
).withSchema(
new Schema()
.field("rowtime",DataTypes.TIMESTAMP(3))
.rowtime(new Rowtime()
.timestampsFromField("eventtime")
.watermarksPeriodicBounded(1000)
)
.field("channel",DataTypes.STRING())
.field("eventtype",DataTypes.STRING())
.field("transid",DataTypes.STRING())
.field("accountnumber",DataTypes.STRING())
.field("value",DataTypes.DECIMAL(38,18))
).createTemporaryTable("transfer");
tEnv.connect(
kafka.topic("pchange")
).withFormat(
new Json()
.failOnMissingField(true)
).withSchema(
new Schema()
.field("rowtime",DataTypes.TIMESTAMP(3))
.rowtime(new Rowtime()
.timestampsFromField("eventtime")
.watermarksPeriodicBounded(1000)
)
.field("channel",DataTypes.STRING())
.field("accountnumber",DataTypes.STRING())
.field("eventtype",DataTypes.STRING())
).createTemporaryTable("password_change");
Thanks to @Dawid Wysakowicz's answer. To confirm it, I added 4,123,1,200,ONL,2020-01-01T01:00:10Z,transfer to the end of the transfer table, and the output became correct, which means it really is a watermark problem.
So now the question is how to fix it. Since a user will not change his/her password frequently, the time gap between these two tables is unavoidable. I just need the UNION ALL table to behave the same way as the table I merged manually.
Update Nov. 4th 2020:
WatermarkStrategy with idle sources may help.
Most likely the problem is somewhere around watermark generation in conjunction with the UNION ALL operator. Could you share how you create the two tables including how you define the time attributes and what are the connectors? It could let me confirm my suspicions.
I think the problem is that one of the sources stops emitting Watermarks. If the transfer table (or the table with lower timestamps) does not finish and produces no more records, it emits no further Watermarks. After emitting the fourth row it will emit Watermark = 3 (4 - 1 second). The Watermark of a union of inputs is the smallest of the input Watermarks. Therefore the first table holds the Watermark at 3, so you see no progress for the original query, while you do see some records emitted for the table with the smaller timestamps.
If you manually merge the two tables, you have just a single input with a single source of Watermarks, so it progresses further and you see some results.
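The arithmetic behind this explanation can be checked with a small simulation. This is only a plain Python model of the rule described above (combined watermark = minimum over the inputs), not Flink code. Event times are written as seconds after 01:00:00, and the idle_sources parameter is a hypothetical stand-in for marking a source as idle, which is roughly what the "WatermarkStrategy with idle sources" update refers to.

```python
BOUND_S = 1  # bounded out-of-orderness, matching watermarksPeriodicBounded(1000)


def source_watermark(event_times_s):
    """Watermark of one source: highest event time seen minus the bound."""
    return max(event_times_s) - BOUND_S


def union_watermark(watermarks, idle_sources=()):
    """Watermark of a union: the minimum over all inputs not marked idle."""
    active = [wm for name, wm in watermarks.items() if name not in idle_sources]
    return min(active)


watermarks = {
    "transfer": source_watermark([1, 2, 3, 4]),
    "password_change": source_watermark([5, 6, 8, 9]),
}

# transfer stops at watermark 3, so the union is held at 3 and the
# password_change rows at 5 s and later are never considered complete.
stalled = union_watermark(watermarks)

# If transfer is treated as idle, only password_change drives the watermark.
advanced = union_watermark(watermarks, idle_sources={"transfer"})

print(stalled, advanced)
```

Appending a transfer row at 10 s, as in the question's update, lifts the transfer watermark to 9 and the union watermark to 8, which is consistent with the output becoming correct.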

How to fix System.LimitException: Apex CPU time limit exceeded caused by workflows with email alert?

I'm trying to execute a batch test method on 100 records and get a CPU runtime limit error.
I placed the Limits.getCpuTime() method in my code and noticed that my code without the workflow segment takes 3148 ms to complete. However, when I activate two workflows that each send an email to one user, I get the CPU runtime limit error. In total my process without those two workflows takes around 10 seconds to complete, while with them activated it takes around 20 seconds.
@IsTest
static void returnIncClientAddress(){
//Select Required Records
User incidentClient = [SELECT Id FROM User WHERE Username = 'bbaggins@shire.qa.com' LIMIT 1];
BMCServiceDesk__Category__c category = [SELECT Id FROM BMCServiceDesk__Category__c WHERE Name = 'TestCategory'];
BMCServiceDesk__BMC_BaseElement__c service = [SELECT ID FROM BMCServiceDesk__BMC_BaseElement__c WHERE Name = 'TestService'];
BMCServiceDesk__BMC_BaseElement__c serviceOffering = [SELECT ID FROM BMCServiceDesk__BMC_BaseElement__c WHERE Name = 'TestServiceOffering'];
//Create Incidents
List<BMCServiceDesk__Incident__c> incidents = new List<BMCServiceDesk__Incident__c>();
for(integer i = 0; i < 100; i++){
BMCServiceDesk__Incident__c incident = new BMCServiceDesk__Incident__c(
BMCServiceDesk__FKClient__c = incidentClient.ID,
BMCServiceDesk__FKCategory__c = category.ID,
BMCServiceDesk__FKServiceOffering__c = serviceOffering.ID,
BMCServiceDesk__FKBusinessService__c = service.ID,
BMCServiceDesk__FKStatus__c = awaiting_for_handling
);
incidents.add(incident);
}
test.startTest();
insert incidents;
test.stopTest();
}
I expected the email workflows and alerts to be processed in batch and sent without being so expensive in CPU time, but it seems that Salesforce takes a lot of time both checking the workflows rules and executing on them when needed. The majority of the process' time seems to be spent on sending the workflows' emails (which it doesn't actually do because it's a test method).
There's not much you can do to control the execution time of Workflow Rules. You could try converting them into Apex and benchmarking to see whether that results in improvement in time consumed, but I suspect the real solution is that you're going to have to dial down your bulk test.
The CPU limit for a transaction is 10 seconds. If your unit test code is already taking approximately 10 seconds to complete without Workflows (I'm not sure exactly what bounds your 3148 ms and 10 s refer to), you've really got only two choices:
Make the sum total of automation running on insert of this object faster;
Reduce the quantity of data you're processing in this unit test.
It's not clear what you're actually testing here, but if it's an Apex trigger, you should make sure that it's properly bulkified and does not consume unnecessary CPU time, including through trigger recursion. Reviewing the call stack in your logs (or simply adding System.debug() statements) may help with that.
Lastly - make sure you write assertions in your test method. Test methods without assertions are close to worthless.
Are there triggers on BMCServiceDesk__Incident__c or on objects modified by the Workflow? Triggers on updates could possibly cause the code to execute multiple times in the same execution context, causing you to hit the CPU limit. Consider preventing re-entry into triggers, or adding a check so that triggers only run when specific criteria are met.
Otherwise, consider refactoring the code, if possible, so that the work is done within a single loop, because loops, and especially nested loops, drive up your CPU usage. Workflows on their own usually don't drive up CPU time unless triggers are executed because of workflow updates.

increase performance of a linq query using contains

I have a WinForms app where I have a Telerik dropdownchecklist that lets the user select a group of state names.
I'm using EF, and the database is stored in Azure SQL.
The code then hits a table of about 17,000 records and filters the results to only include states that are checked.
This works fine. I want to update a count on the screen whenever they change the list box.
This is the code, in the itemCheckChanged event:
var states = stateDropDownList.CheckedItems.Select(i => i.Value.ToString()).ToList();
var filteredStops = (from stop in aDb.Stop_address_details where states.Contains(stop.Stop_state) select stop).ToArray();
ExportInfo_tb.Text = "Current Stop Count: " + filteredStops.Count();
It works, but it is slow.
I tried to load everything into an in-memory variable and then query that instead of the database, but I can't seem to figure out how to do that.
Any suggestions?
Improvement:
I picked up a noticeable improvement by limiting the amount of data coming down by:
var filteredStops = (from stop in aDb.Stop_address_details where states.Contains(stop.Stop_state) select stop.Stop_state).ToList();
And better yet --
int count = (from stop in aDb.Stop_address_details where
states.Contains(stop.Stop_state)
select stop).Count();
ExportInfo_tb.Text = "Current Stop Count: " + count.ToString();
The performance of your query actually has nothing to do with Contains in this case. Contains is pretty performant. The problem, as you picked up on in your third solution, is that you are pulling far more data over the network than required.
In your first solution you are pulling back all of the rows from the server with the matching stop state and performing the count locally. This is the worst possible approach. You are pulling back data just to count it and you are pulling back far more data than you need.
In your second solution you limited the data coming back to a single field which is why the performance improved. This could have resulted in a significant improvement if your table is really wide. The problem with this is that you are still pulling back all the data just to count it locally.
In your third solution EF will translate the .Count() method into a query that performs the count for you. So the count will happen on the server and the only data returned is a single value; the result of count. Since network latency CAN often be (but is not always) the longest step when performing a query, returning less data can often result in significant gains in query speed.
The query translation of your final solution should look something like this:
SELECT COUNT(*) AS [value]
FROM [Stop_address_details] AS [t0]
WHERE [t0].[Stop_state] IN (@p0)
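The difference between counting locally and counting on the server can be reproduced outside EF. The sketch below is only an illustration using Python's built-in sqlite3 module with an in-memory database, not the asker's Azure SQL setup. The Stop_id and Stop_address columns and the generated rows are made up for the example.

```python
import sqlite3

# In-memory stand-in for the real table; only Stop_state comes from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Stop_address_details ("
    "Stop_id INTEGER PRIMARY KEY, Stop_state TEXT, Stop_address TEXT)"
)
rows = [(i, ("TX", "CA", "NY")[i % 3], f"{i} Main St") for i in range(17000)]
conn.executemany("INSERT INTO Stop_address_details VALUES (?, ?, ?)", rows)

states = ["TX", "NY"]  # the checked items
placeholders = ", ".join("?" for _ in states)

# First approach: pull every matching row to the client, then count there.
fetched = conn.execute(
    f"SELECT * FROM Stop_address_details WHERE Stop_state IN ({placeholders})",
    states,
).fetchall()
local_count = len(fetched)

# Third approach: let the database count and return a single value.
(server_count,) = conn.execute(
    f"SELECT COUNT(*) FROM Stop_address_details WHERE Stop_state IN ({placeholders})",
    states,
).fetchone()

print(local_count, server_count)
```

Both counts agree, but the first query transfers every matching row while the second transfers one integer. Against a remote database that transfer is where the time goes.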

Analysis Services stored procedure performance

I'm writing a stored procedure in .NET to do some complex calculations that can't be written easily in pure MDX. The first problem I'm having is how to retrieve a set of data in a tabular form to pass to my calculation.
My code so far is written below. I would have thought that after we retrieve our value at position ***1, we would have all the data in memory to interact with. However, it seems that at position ***2, a Query Subcube request is issued to the storage engine for each and every day in our range. This is devastating to performance.
Is there something I'm doing wrong? Is there another method I can call to evaluate the set all at once?
// First get the date range that we'd like to calculate over.
// (These values are constant here for example only)
DateTime date = new DateTime(2012, 4, 1);
int dateFrom = KeyFromDate(date.AddDays(-360));
int dateTo = KeyFromDate(date);
string dateRange = string.Format(
"[Date].[Date].&[{0}]:[Date].[Date].&[{1}]",
dateFrom,
dateTo
);
Expression expression = new Expression(dateRange + "*[Measures].[My Measure]");
MDXValue value = expression.CalculateMdxObject(null); // ***1
foreach (var tuple in value.ToSet().Tuples)
{
int tupleValue = MDXValue.FromTuple(tuple).ToInt32(); // ***2
}
Run SQL Profiler, connect to Analysis Services, and on the "Events Selection" tab check "Show all events" and select "Get Data From Aggregation", "Get Data From Cache", "Query Subcube" and "Query Subcube Verbose".
First read this document http://www.microsoft.com/en-us/download/details.aspx?id=17303 (see page 18) in order to understand how "Query Subcube Verbose" works.
Then, in Visual Studio (where you're debugging your procedure), step over line ***1 in debug mode
and see in SQL Profiler what is queried in the verbose events: which measure group and which attributes.
Then step over ***2 and check the verbose events in SQL Profiler again to see what is queried.
I believe that the set of attributes is different, so it may be that at ***1 it uses some aggregation, while at ***2, when "value" is present in the tuple, there is no aggregation for this set of attributes. So instead of doing one "read from aggregations" it does "read from measure group cache" several times.
I can't be more exact because I don't have your cube. Try to find this out from the "Query Subcube Verbose" events, and try using BIDS Helper to create the necessary aggregations manually (with the specific set of attributes). It may help.
