I am moving some complicated reporting sprocs to a centralized server, and run time went from 5 seconds to 30+ seconds.
I am validating what takes so long via:
print '04 NWA Raw data Numeric'
print datediff(ss, @now, getdate())
set @now = GETDATE()
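(For context, a minimal sketch of how this timing pattern is presumably set up; the declaration below is an assumption, it is not shown in the original snippet:)
DECLARE @now datetime
SET @now = GETDATE()   -- reset after each print so every block is timed separately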
I am attempting to pull locally only what this report needs, with these queries:
1,355 rows in 10 seconds:
select *
into #nwaDump
from [Phoenix].[NWA].dbo.QISDataNumeric
where rowguid in (
select rowguid from [Phoenix].[NWA].[dbo].[QISDataText] nd
where nd.DataValue in ( '41310291 ' )
)
249 rows in 28 seconds
select *
into #nwaText
from [Phoenix].[NWA].[dbo].[QISDataText] td
where td.DataValue in ( '41310291 ' )
The same two queries run on the other server return in under 1 second.
Any ideas?
You can try using OPENQUERY for this, since it should apply the filters on the linked server and then pull only the matching rows back to your server:
SELECT *
INTO #nwaText
FROM OPENQUERY(Phoenix,'SELECT * FROM [NWA].[dbo].[QISDataText]
WHERE DataValue in ( ''41310291 '' )')
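The same approach should also work for the #nwaDump query; a sketch, assuming the remote server is SQL Server so the rowguid subquery can run entirely on Phoenix:
SELECT *
INTO #nwaDump
FROM OPENQUERY(Phoenix, 'SELECT n.*
    FROM [NWA].[dbo].[QISDataNumeric] n
    WHERE n.rowguid IN (SELECT t.rowguid
                        FROM [NWA].[dbo].[QISDataText] t
                        WHERE t.DataValue IN ( ''41310291 '' ))')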
I have created a materialized view using this query:
CREATE MATERIALIZED VIEW db.top_ids_mv
(
    `date_time` DateTime,
    `id` String,
    `total` UInt64
)
ENGINE = SummingMergeTree
ORDER BY (date_time, id)
SETTINGS index_granularity = 8192
POPULATE AS
SELECT
    toDateTime((intDiv(toUInt32(date_time), 60 * 60) * 60) * 60) AS date_time,
    id AS id,
    count(*) AS count
FROM db.table
WHERE type = 'user'
GROUP BY date_time, id
My table contains almost 18 billion records. I inserted my old data using POPULATE, but newly inserted data is not getting written into this materialized view. I have created many other views and they work fine, but this one is causing issues.
This is what I am receiving in the logs:
2021.09.23 19:54:54.424457 [ 137949 ] {5b0f3c32-2900-4ce4-996d-b9696bd38568} <Trace> PushingToViewsBlockOutputStream: Pushing (sequentially) from db.table (15229c91-c202-4809-9522-9c91c2028809) to db.top_ids_mv (0cedb783-bf17-42eb-8ced-b783bf1742eb) took 0 ms.
One thing I noticed is that it reports 0 ms. I think that is wrong, because the query must take some time.
Thanks. Any help would be appreciated
SummingMergeTree does not store rows with metrics == 0.
total UInt64 <----> count(*) AS count -- the names do not match. Your materialized view inserts 0 into total, and count goes nowhere.
Both behaviors are expected and were implemented that way on purpose.
https://den-crane.github.io/Everything_you_should_know_about_materialized_views_commented.pdf
...
SELECT
toDateTime((intDiv(toUInt32(date_time), 60 * 60) * 60) * 60) AS date_time,
id AS id,
count(*) AS total --<<<<<------
FROM
db.table
...
For query performance and better data compression I would use:
ENGINE = SummingMergeTree
ORDER BY ( id, date_time )  -- order by id, then time
Also try codecs
`date_time` DateTime CODEC(Delta, LZ4),
`id` LowCardinality(String),
`total` UInt64 CODEC(T64, LZ4)
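Putting those suggestions together, a sketch of how the recreated view could look (the hour-rounding expression and WHERE clause are taken from the original definition; the alias fix, ORDER BY, and codecs are the changes suggested above):
CREATE MATERIALIZED VIEW db.top_ids_mv
(
    `date_time` DateTime CODEC(Delta, LZ4),
    `id` LowCardinality(String),
    `total` UInt64 CODEC(T64, LZ4)
)
ENGINE = SummingMergeTree
ORDER BY (id, date_time)
SETTINGS index_granularity = 8192
POPULATE AS
SELECT
    toDateTime((intDiv(toUInt32(date_time), 60 * 60) * 60) * 60) AS date_time,
    id AS id,
    count(*) AS total    -- alias now matches the view column, so the counts are actually summed
FROM db.table
WHERE type = 'user'
GROUP BY date_time, id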
SQL Server 2008 R2 Enterprise
I have a database with 3 tables for which I am keeping a retention period of 15 days. This is a logging database that is very active, about 500 GB in size, and it grows about 30 GB a day unless purged. I can't seem to get caught up on one of the tables and I am falling behind. This table has 220 million rows and needs around 10-12 million rows purged nightly; I am currently 30 million rows behind on the purge. I can only run this purge at night due to the volume of incoming inserts competing for table locks. I have confirmed that everything is indexed correctly and have run Brent Ozar's sp_BlitzIndex just to confirm that. Is there any way to optimize what I am doing below? I am running the same purge steps for each table.
1. Drop and create 3 purge tables: Purge_Log, Purge_SLogHeader and Purge_SLogMessage.
2. Insert rows into the purge tables (takes 5 minutes per table):
Insert Into Purge_Log
Select ID from ServiceLog
where startTime < dateadd(day, -15, getdate())
--****************************************************
Insert into Purge_SLogMessage
select serviceLogId from ServiceLogMessage
where serviceLogId in ( select id from ServiceLog
                        where startTime < dateadd(day, -15, getdate()) )
--****************************************************
Insert into Purge_SLogHeader
Select serviceLogId from ServiceLogHeader
where serviceLogId in ( select id from ServiceLog
                        where startTime < dateadd(day, -15, getdate()) )
After that is inserted, I run the following (with the appropriate differences for each table):
SET ROWCOUNT 1000
delete_more:
delete from ServiceLog
where Id in ( select Id from Purge_Log)
IF @@ROWCOUNT > 0 GOTO delete_more
SET ROWCOUNT 0
Basically, does anyone see a way to make this procedure run faster, or a different way to go about it? I've made the queries as simple as possible, with only one subquery. I've also tried a join, and the execution plan says the time to complete is the same that way. Any guidance would be appreciated.
You can use this technique for all the tables: collect the IDs first into a temporary table to avoid scanning the original (very large) table again and again. I hope it will work well for all of them:
DECLARE @del_query VARCHAR(MAX)
/*
Taking IDs from ServiceLog table instead of Purge_Log because Purge_Log may have more data than expected because of frequent purging
*/
IF OBJECT_ID('tempdb..#tmp_log_ids') IS NOT NULL DROP TABLE #tmp_log_ids
SELECT ID INTO #tmp_log_ids FROM ServiceLog WHERE startTime < DATEADD(DAY, -15, GETDATE())

SET @del_query ='
DELETE TOP(100000) sl
FROM ServiceLog sl
INNER JOIN #tmp_log_ids t ON t.id = sl.id'

WHILE 1 = 1
BEGIN
    EXEC(@del_query + ' option(maxdop 5) ')
    IF @@ROWCOUNT < 100000 BREAK;
END

SET @del_query ='
DELETE TOP(100000) sl
FROM ServiceLogMessage sl
INNER JOIN #tmp_log_ids t ON t.id = sl.serviceLogId'

WHILE 1 = 1
BEGIN
    EXEC(@del_query + ' option(maxdop 5) ')
    IF @@ROWCOUNT < 100000 BREAK;
END

SET @del_query ='
DELETE TOP(100000) sl
FROM ServiceLogHeader sl
INNER JOIN #tmp_log_ids t ON t.id = sl.serviceLogId'

WHILE 1 = 1
BEGIN
    EXEC(@del_query + ' option(maxdop 5) ')
    IF @@ROWCOUNT < 100000 BREAK;
END
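One extra tweak worth trying (my assumption, not part of the answer above): give the temp table a clustered index before starting the delete loops, so each batched DELETE joins index-to-index instead of scanning the unsorted heap repeatedly.
-- Hypothetical addition, run right after the SELECT ... INTO #tmp_log_ids
CREATE CLUSTERED INDEX IX_tmp_log_ids ON #tmp_log_ids (ID)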
I have inherited this table and am trying to optimize the queries. I am stuck on one query. Here is the table information:
RaterName - varchar(24) - name of the rater
TimeTaken - varchar(12) - is stored as 00:10:14:8
Year - char(4) - is stored as 2014
I need:
1. A distinct list of raters, with the total count, sum(TimeTaken), and avg(TimeTaken) per rater (for a given year)
2. sum(TimeTaken) and avg(TimeTaken) across all the raters (for a given year)
Here is the query that I have come up with for #1... I would like the sum and avg to be like hh:mm:ss. How can I do this?
SELECT
[RaterName]
, count(*) as TotalRatings
, SUM((DATEPART(hh,convert(datetime, timetaken, 101))*60)+DATEPART(mi,convert(datetime, timetaken, 101))+(DATEPART(ss,convert(datetime, timetaken, 101))/(60.0)))/60.0 as TotalTimeTaken
, AVG((DATEPART(hh,convert(datetime, timetaken, 101))*60)+DATEPART(mi,convert(datetime, timetaken, 101))+(DATEPART(ss,convert(datetime, timetaken, 101))/(60.0)))/60.0 as AverageTimeTaken
FROM
[dbo].[rating]
WHERE
year = '2014'
GROUP BY
RaterName
ORDER BY
RaterName
Output:
RaterName   TotalRatings   TotalTimeTaken   AverageTimeTaken
=============================================================
Rater1      257            21.113609        0.082154
Rater2      747            41.546106        0.055617
Rater3      767            59.257218        0.077258
Rater4      581            37.154163        0.063948
Can I incorporate #2 in this query, or should I write a second query and drop the GROUP BY from it?
On the front end, I am using C#.
WITH data ( raterName, timeTaken )
AS (
SELECT raterName,
DATEDIFF(MILLISECOND, CAST('00:00' AS TIME),
CAST(timeTaken AS TIME))
FROM rating
WHERE CAST([year] AS INT) = 2014
)
SELECT raterName, COUNT(*) AS totalRatings,
SUM(timeTaken) AS totalTimeTaken, avg(timeTaken) AS averageTimeTaken
FROM data
GROUP BY raterName
ORDER BY raterName;
PS: If you don't want milliseconds, you can change that to SECOND or MINUTE.
EDIT: On your C# front end you can convert the milliseconds or seconds to a TimeSpan, which gives you the hh:mm:ss format when you call ToString, i.e.:
var ttt = TimeSpan.FromSeconds(totalTimeTaken).ToString();
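For the asker's item #2 (totals and averages across all raters), one option on SQL Server 2008+ is to keep the same CTE and use GROUP BY ROLLUP, which appends a grand-total row with raterName = NULL, so a second query isn't needed; a sketch:
SELECT raterName, COUNT(*) AS totalRatings,
       SUM(timeTaken) AS totalTimeTaken, AVG(timeTaken) AS averageTimeTaken
FROM data
GROUP BY ROLLUP (raterName)
ORDER BY raterName;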
From a table I want to select the first 4 rows after the first one. I had this working in MySQL as follows:
SELECT * FROM `yp_playlist` LIMIT 1, 4;
I have done some research on the SQL Server version of this query and came up with the following, but it keeps resulting in an error that has me clueless for now.
SELECT id, entry
FROM (
SELECT id, entry, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM playlist
) AS MyDerivedTable
WHERE MyDerivedTable.RowNum BETWEEN 0 AND 10
This is the error:
There was an error parsing the query. [ Token line number = 3, Token line offset = 36, Token in error = OVER ]
With SQL Server Compact 4.0 you can use:
SELECT * FROM [Orders] ORDER BY [Order Date] OFFSET 1 ROWS
FETCH NEXT 4 ROWS ONLY;
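Applied to the asker's playlist table (assuming id gives the order you want, and that your version supports OFFSET/FETCH), that becomes:
SELECT id, entry
FROM playlist
ORDER BY id
OFFSET 1 ROWS FETCH NEXT 4 ROWS ONLY;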
SELECT TOP 10 *
FROM ( SELECT TOP 10 id, entry
       FROM playlist
       ORDER BY id ) AS MyDerivedTable
One way is to set ROWCOUNT, e.g.:
set rowcount 4
Then order your data so the rows you want are at the top.
I am trying to execute the following query:
SELECT *
FROM OPENQUERY
(
CLP,
'
SELECT *
FROM ORACLE_TABLE
WHERE [UPDATEDATE] > ''1900-01-01 12:00 AM''
'
)
This query works fine when I remove the date criterion, but as soon as I add it back the query no longer works. I can't figure out what I am missing.
Try removing the square brackets around UPDATEDATE and adding a date conversion:
SELECT *
FROM OPENQUERY
(CLP,
'
SELECT *
FROM ORACLE_TABLE
WHERE
UPDATEDATE > to_date(''1900-01-01 12:00'',''yyyy-mm-dd hh:mi'')
'
)
or, with AM:
SELECT *
FROM OPENQUERY
(CLP,
'
SELECT *
FROM ORACLE_TABLE
WHERE
UPDATEDATE > to_date(''1900-01-01 12:00 AM'',''yyyy-mm-dd hh:mi am'')
'
)
Use:
SELECT *
FROM OPENQUERY
(CLP, 'SELECT * FROM ORACLE_TABLE WHERE trunc(UPDATEDATE) > ''01-JAN-1900''')
All dates with no time component default to 12:00 AM (00:00 hrs) in Oracle.
You can also use to_timestamp(UPDATEDATE), but for this to work the column should be of a timestamp type (i.e. actually contain a time component, otherwise it will always show 12 AM). You can also use to_char(UPDATEDATE, 'YYYY-MM-DD HH:MI AM').
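If the cutoff date needs to come from a variable, another option (my assumption, not from the answers above) is the pass-through EXECUTE ... AT syntax, which, unlike OPENQUERY, accepts parameters; it requires the 'rpc out' option to be enabled on the linked server. A sketch:
DECLARE @cutoff datetime
SET @cutoff = '1900-01-01 00:00'
-- The ? placeholder is replaced by the parameter value when the query runs on the Oracle side
EXEC ('SELECT * FROM ORACLE_TABLE WHERE UPDATEDATE > ?', @cutoff) AT CLP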