I currently have a dataset I am getting from a Microsoft SQL query. It returns Date, Server name, and call duration in milliseconds. My question is a multi-part one.
Is there a simpler way to break out these aggregated metrics by machine without repeating the following X number of times?
SELECT
  $__timeGroup(date, '5m') as time,
  server,
  max(value) as max,
  min(value) as min
FROM
  rawData
WHERE
  $__timeFilter(date)
  and server = 'Server1'
GROUP BY
  $__timeGroup(date, '5m', 0), server
ORDER BY
  1
Also, is there a way to filter all of these metrics by the server column, so that every metric is shown for one particular server?
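One possible direction, sketched here as an assumption rather than a verified setup: since the query already selects and groups by server, the per-server repetition only comes from the hard-coded server = 'Server1' filter. Replacing it with a multi-value dashboard variable (named $server here, e.g. defined as SELECT DISTINCT server FROM rawData) would let one query cover every machine and also narrow down to a single one, provided Grafana treats the server column as the series name the way its SQL data sources usually do:
SELECT
  $__timeGroup(date, '5m') as time,
  server,
  max(value) as max,
  min(value) as min
FROM
  rawData
WHERE
  $__timeFilter(date)
  -- assumed multi-value template variable; Grafana expands it to a quoted list
  and server IN ($server)
GROUP BY
  $__timeGroup(date, '5m', 0), server
ORDER BY
  1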
Is it possible to make MSSQL Management Studio produce a query that will reproduce a result set that you found earlier, using the best possible way to recreate it?
Maybe there is a way to tell the database which rows it shall return, instead of having it look for the correct rows via the WHERE conditions? So once you have found the rows, you don't have to search again?
So what I thought is: When you place a condition like
Where col1 = 10
The DB will check row 1's col1 for the value 10, then row 2's col1, and so on.
So it is searching, which takes time. Whereas if you could just make a statement that directly asks for the specific row, you would be faster?
I mean, you don't need to search for the columns either: you just say give me col1 or col2 or whatever.
The short answer is: NO
is it possible to make MSSQL Management Studio produce a query that will reproduce a result set, that you found prior
SQL Server Management Studio does not store your queries or the result set which the queries return. It is simply a client application which passes the queries to the database server and presents the result which the server returns.
On the other hand, SQL Server does store all the queries which you execute (for some time, depending on multiple parameters). Using the following query you can get the last queries which were executed by the server:
SELECT
    execquery.last_execution_time AS [Date Time],
    execsql.text AS [Script]
FROM sys.dm_exec_query_stats AS execquery
CROSS APPLY sys.dm_exec_sql_text(execquery.sql_handle) AS execsql
ORDER BY execquery.last_execution_time DESC
GO
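If you only need to find a particular statement again, for example one that touched a specific table, a filtered variant of the same DMV query can help; the search term below is just a placeholder:
SELECT TOP (50)
    execquery.last_execution_time AS [Date Time],
    execsql.text AS [Script]
FROM sys.dm_exec_query_stats AS execquery
CROSS APPLY sys.dm_exec_sql_text(execquery.sql_handle) AS execsql
WHERE execsql.text LIKE '%YourTableName%'   -- placeholder: text to look for
ORDER BY execquery.last_execution_time DESC
GO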
... use the best way possible to recreate it?
When you execute a query, the server and SSMS might provide some alerts and recommendations about the query, which can help us build a better query, but neither SQL Server nor SQL Server Management Studio will build a better query for you based on the result set of a previous query.
This is why we have DBAs.
I have a streaming input, say stock price data (including multiple stocks), and I want to do a ranking by price every 1 minute. The ranking is based on all stocks' latest prices and needs to sort all of them, whether or not they were updated in the previous minute. I tried to use ORDER BY in Flink streaming SQL.
I failed to implement my logic and I am confused about two parts:
Why can ORDER BY only use a time attribute as the primary sort key and only support ASC? How can I implement an ORDER BY on another field such as price?
What does the SQL below (from the Flink documentation) mean? There is no window, so I assume the SQL will be executed immediately for each incoming order; in that case, it looks meaningless to sort a single element.
[Update]: When I read the code of ProcTimeSortProcessFunction.scala, it seems that Flink sorts the elements received during the next one millisecond.
SELECT *
FROM Orders
ORDER BY orderTime
Finally, is there a way to implement my logic in SQL?
ORDER BY in streaming queries is difficult to compute because we don't want to update the whole result whenever we have to emit a row that belongs at the beginning of the result table. Therefore, we only support ORDER BY on a time attribute, where we can guarantee that the results have (roughly) increasing timestamps.
In the future (Flink 1.6 or later), we will also support some queries like ORDER BY x ASC LIMIT 10, which will result in an updating table that contains the records with the 10 smallest x values.
Anyway, you cannot (easily) compute a top-k ranking per minute using a GROUP BY tumbling window. GROUP BY queries aggregate the records of a group (or of a window, in the case of GROUP BY TUMBLE(rtime, INTERVAL '1' MINUTE)) into a single record. So there won't be multiple records per minute, just one.
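For illustration only (the Stocks table and its symbol, price, and rtime columns are assumed, not taken from the question): a tumbling-window GROUP BY such as the following collapses each minute into a single row per symbol, which is why it cannot produce a full per-minute ranking of all stocks.
SELECT
  symbol,
  TUMBLE_END(rtime, INTERVAL '1' MINUTE) AS minute_end,
  MAX(price) AS max_price
FROM Stocks
GROUP BY
  symbol,
  TUMBLE(rtime, INTERVAL '1' MINUTE)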
If you'd like a query to compute top-10 on field a per minute you would need a query similar to this one:
SELECT a, b, c
FROM (
  SELECT
    a, b, c,
    RANK() OVER (PARTITION BY CEIL(t TO MINUTE) ORDER BY a) as rank
  FROM yourTable)
WHERE rank <= 10
However, such queries are not yet supported by Flink (version 1.4) because the time attribute is used in the PARTITION BY clause and not the ORDER BY clause of the OVER window.
I have a very large query which returns about 800,000 rows on a monthly basis.
I have been asked to run it for the year, but obviously this will not all fit into a single Excel file. Unfortunately, the details are necessary for the analysis being undertaken by the business analyst, so I can't roll it up to aggregates.
SELECT * FROM foo WHERE foo.start_date >= 20150101 AND foo.start_date < 20150201
I was wondering: is it possible in Teradata to set up a loop that runs the first query, increments the time period by a month, exports the result as a text file, and then runs the query again?
I would be using Teradata SQL Assistant for this, with its export functionality.
I could run it for the year and split it in MS Access afterwards, but I would just like to see if there is a way to implement it directly in Teradata.
Thank you for your time
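One possible direction, offered only as a sketch and not part of the original thread: drive the monthly exports from a BTEQ script instead of SQL Assistant, repeating an .EXPORT block per month (credentials, paths, and file names below are placeholders):
.LOGON tdpid/your_user,your_password
.EXPORT REPORT FILE = foo_2015_01.txt
SELECT * FROM foo WHERE foo.start_date >= 20150101 AND foo.start_date < 20150201;
.EXPORT RESET
.EXPORT REPORT FILE = foo_2015_02.txt
SELECT * FROM foo WHERE foo.start_date >= 20150201 AND foo.start_date < 20150301;
.EXPORT RESET
.LOGOFF
The twelve blocks could be generated by hand or by a small script outside BTEQ.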
I have SSAS and SSRS 2008 R2. The end goal is to get a report with daily MarketValue for each Portfolio and Security combination. MarketValue has a SCOPE calculation to select the last existing value on the Date dimension. If the SCOPE is removed, the query still takes 6 minutes to complete, and with the SCOPE statement it times out after 1 hour. Here is my query:
SELECT
NON EMPTY
{[Measures].[MarketValue]} ON COLUMNS
,NON EMPTY
{
[Portfolio].[PortfolioName].[PortfolioName].ALLMEMBERS*
[Effective Date].[Effective Date].[Effective Date].ALLMEMBERS*
[Security].[Symbol].[Symbol].ALLMEMBERS
}
DIMENSION PROPERTIES
MEMBER_CAPTION
,MEMBER_UNIQUE_NAME
ON ROWS
FROM EzeDM
WHERE
(
[AsOn Date].[AsOn Date].&[2014-06-17T06:32:41.97]
,[GoldenCopy].[Frequency].&[Daily]
,[GoldenCopy].[GoldenCopyType].&[CitcoPLExposure]
,[GoldenCopy].[PointInTime].&[EOP]
,[GoldenCopy].[PositionType].&[Trade-Date]
);
The SCOPE statement I have for MarketValue Measure is
SCOPE([Effective Date].[Effective Date].MEMBERS);
    THIS = Tail(
               (EXISTING [Effective Date].[Effective Date].MEMBERS),
               1
           ).Item(0);
END SCOPE;
The Security dimension has around 4K members, the Portfolio dimension has around 100, and the EffectiveDate dimension has around 400.
If I remove EffectiveDate from the cross join, the query takes less than 2 seconds.
So far I have tried different combinations and found that the slowness is due to the cross join between dimensions with many members. But then I am thinking: are 4,000 members in a dimension actually a lot? People must have done the same kind of reporting efficiently, right?
Is this due to the SCOPE calculation? If so, why does it only get slower when EffectiveDate is in the cross join?
Appreciate any help.
EDIT:1
Adding some more details about the current environment, in case that helps:
We do not have the Enterprise edition, and currently we have no plans to ask our clients to upgrade to it.
The Security dimension has around 40 attributes, but 2 of them will always have data and at most "up to 6" may have any data. I am not sure whether an attribute that is not used in the MDX query still affects query performance, regardless of whether it has data or not.
After reading the Chris Webb blog on MDX query improvements, I noticed that the following property is True for ALL attributes in ALL dimensions:
"AttributeHierarchyEnabled = True"
For testing, I have set it to False for all attributes except the ones I am currently using.
I do not have any aggregations defined on the cube, so I started building aggregations using the "Design Aggregations" wizard. After that, I profiled the same reporting query and didn't see any hit for the "Get Data From Aggregation" event.
So currently I am working on preparing/testing Usage Based Aggregations.
EDIT:2
So I created the query log table with 50% logging sampling, ran 15-20 different reporting queries the client is expected to run, and saw some data in the log table. I used the Usage Based Aggregation wizard and let SSAS work out the estimated row counts.
Strangely, it did not generate any aggregations.
I also tried the approach of changing the AggregateFunction property to LastChild, as Frank suggested, and it worked great, but then I realized I cannot use the LastChild value of MarketValue across all dimensions: it is additive across the Security dimension but not across Time.
I would assume that getting rid of the whole SCOPE statement and instead setting the AggregateFunction property of the measures to LastChild or LastNonEmpty would speed up the calculation. This would require [Effective Date] to be the first dimension tagged as time, and you need SQL Server Enterprise edition for these AggregateFunctions to be available.
I am currently in the process of revamping my company's management system to run a little leaner in terms of network traffic. Right now I'm trying to figure out an effective way to query only the records that have been modified (by any user) since the last time I asked.
When the application starts it loads the job information and caches it locally like the following: SELECT * FROM jobs.
I am writing out the date/time a record was modified, à la UPDATE jobs SET Widgets=@Widgets, LastModified=GETDATE() WHERE JobID=@JobID.
When any user requests the list of jobs, I query all records that have been modified since the last time I requested the list, like the following: SELECT * FROM jobs WHERE LastModified>=@LastRequested, and store the date/time of the request to pass in as @LastRequested when the user asks again. In theory this returns only the records that have been modified since the last request.
The issues I'm running into are that the user's date/time is not quite in sync with the server's date/time, and also the server load from querying an un-indexed date/time column. Is there a more effective system than querying by date/time information?
I don't know that I would rely on date/time, since it is external to SQL Server.
If you have an identity column, I would use it together with a tracking table (UserId, LastQueryDateTime, LastIdRetrieved).
Every time you query the base table, insert a new row for the user (or update it if one exists) with the max id into this table. The query should also read the row from this table to get LastIdRetrieved and use that in the WHERE clause.
All this could be eliminated if all of your code chose to insert GETDATE() from SQL Server instead of from the client machines, but that task is pretty labor-intensive.
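A minimal sketch of that bookkeeping (all object and variable names here are illustrative, not taken from the original post; @UserId is assumed to be passed in by the application):
-- Tracking table: one row per user with the last id that user has already retrieved.
CREATE TABLE dbo.UserSyncState (
    UserId            INT      NOT NULL PRIMARY KEY,
    LastQueryDateTime DATETIME NOT NULL,
    LastIdRetrieved   INT      NOT NULL
);

-- On each request: read the user's high-water mark and fetch only newer rows ...
DECLARE @lastId INT =
    (SELECT LastIdRetrieved FROM dbo.UserSyncState WHERE UserId = @UserId);

SELECT *
FROM dbo.Jobs
WHERE JobID > ISNULL(@lastId, 0);

-- ... then record the new maximum for the next call (update, or insert if missing).
DECLARE @maxId INT = (SELECT MAX(JobID) FROM dbo.Jobs);

UPDATE dbo.UserSyncState
SET LastIdRetrieved = @maxId, LastQueryDateTime = GETDATE()
WHERE UserId = @UserId;

IF @@ROWCOUNT = 0
    INSERT dbo.UserSyncState (UserId, LastQueryDateTime, LastIdRetrieved)
    VALUES (@UserId, GETDATE(), @maxId);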
The easiest solution seems to be to settle on one clock as the leading one.
One way would be to settle on the server time. After updating the row, store the value returned by SELECT LastModified FROM jobs WHERE JobID = @JobID on the client side. That way, the client can effectively query using only the server time as a reference.
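As a sketch of that idea (the OUTPUT clause here is my own shortcut, not something the answer prescribes), the server-side timestamp can even be returned in the same round trip as the update:
-- Update the row and hand the server-side timestamp straight back to the client.
UPDATE jobs
SET Widgets = @Widgets,
    LastModified = GETDATE()
OUTPUT inserted.LastModified
WHERE JobID = @JobID;

-- Later requests then compare server time against server time only.
SELECT *
FROM jobs
WHERE LastModified >= @LastModifiedFromServer;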
Use an update sequence number (USN) much like Active Directory and DNS use to keep track of the objects that have changed since their last replication. Pick a number to start with, and each time a record in the Jobs table is inserted or modified, write the most recent USN. Keep track of the USN when the last Select query was executed, and you then always know what records were altered since the last query. For example...
Set LastQryUSN = 100
Update Jobs Set USN=101, ...
Update Jobs Set USN=102, ...
Insert Jobs (USN, ...) Values (103, ...)
Select * From Jobs Where USN > LastQryUSN
Set LastQryUSN = 103
Update Jobs Set USN=104
Insert Jobs (USN, ...) Values (105, ...)
Select * From Jobs Where USN > LastQryUSN
Set LastQryUSN = 105
... and so on
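One way this could be wired up in T-SQL, sketched under the assumption that a SEQUENCE object is acceptable (names are illustrative):
-- A SEQUENCE supplies the ever-increasing USN.
CREATE SEQUENCE dbo.JobsUSN AS BIGINT START WITH 100 INCREMENT BY 1;

-- New rows get a USN automatically ...
ALTER TABLE dbo.Jobs ADD USN BIGINT NOT NULL
    CONSTRAINT DF_Jobs_USN DEFAULT (NEXT VALUE FOR dbo.JobsUSN);

-- ... and every update must bump it explicitly.
UPDATE dbo.Jobs
SET Widgets = @Widgets,
    USN     = NEXT VALUE FOR dbo.JobsUSN
WHERE JobID = @JobID;

-- The client remembers the highest USN it has seen and asks only for newer rows.
SELECT *
FROM dbo.Jobs
WHERE USN > @LastQryUSN;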
When you get the Jobs, get the server time too:
DECLARE @now DATETIME = GETUTCDATE();
SELECT @now AS [ServerTime], * FROM Jobs WHERE Modified >= @LastModified;
The first time, you pass in a minimum date as @LastModified. On each subsequent call, you pass in the ServerTime returned by the previous call. This way the client time is taken out of the equation.
The answer to the server load is, I hope, obvious: add an index on the Modified column.
And one more piece of advice: never use local time, not even on the server. Always use UTC times, and store UTC time in Modified. As it is right now, your program is completely screwed twice a year, when daylight saving time starts or ends.
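Both points as a quick sketch (the index name is arbitrary):
-- Index the column the incremental query filters on ...
CREATE NONCLUSTERED INDEX IX_Jobs_Modified ON Jobs (Modified);

-- ... and stamp rows with UTC rather than local server time.
UPDATE Jobs
SET Widgets = @Widgets,
    Modified = GETUTCDATE()
WHERE JobID = @JobID;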
Current versions of SQL Server have change tracking, which you can use for exactly that. Just enable change tracking on the tables you want to track.
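For reference, a sketch of what enabling and consuming change tracking might look like (database name, retention settings, and the starting version are placeholders):
-- Enable change tracking at the database level ...
ALTER DATABASE MyAppDb
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);

-- ... and on the table itself (a primary key is required).
ALTER TABLE dbo.Jobs
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = OFF);

-- The client stores the version returned here after each sync ...
SELECT CHANGE_TRACKING_CURRENT_VERSION();

-- ... and passes it back to get only the rows changed since then.
DECLARE @last_sync_version BIGINT = 0;  -- value remembered from the previous call

SELECT j.*
FROM CHANGETABLE(CHANGES dbo.Jobs, @last_sync_version) AS ct
JOIN dbo.Jobs AS j
    ON j.JobID = ct.JobID;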