Question about Crystal Reports + SQL Server stored procedure with GROUP BY

I'm writing a report in Crystal Reports XI Developer that runs a stored procedure in a SQL Server 2005 database. The record set returns a summary from a log table, grouped by day and hour.
Currently my query looks something like this:
SELECT
sum(colA) as "Total 1",
day(convert(smalldatetime, convert(float, Timestamp) / 1440 - 1)) as "Date",
datepart(hh, convert(smalldatetime, convert(float, Timestamp) / 1440 - 1)) as "Hour"
etc...
GROUP BY
Day, Hour
Ignore the date insanity, I think the system designers were drinking heavily when they worked out how to store their dates.
My problem is this: since there are not always records for each hour of the day, I end up with gaps, which is understandable, but I'd like Crystal to be able to report on the entire 24 hours regardless of whether there is data or not.
I know I can change this by putting the entire query in a WHILE loop (within the stored procedure) and doing queries on the individual hours, but something inside me says that one query is better than 24.
I'd like to know if there's a way to have Crystal iterate through the hours of the day as opposed to iterating through the rows in the table as it normally does.
Alternatively, is there a way to format the query so that it includes the empty hour rows without killing my database server?

Here's how I solved this problem:
Create a local table in your SQL-Server. Call it "LU_Hours".
This table will have one integer field (called "Hours") with 24 rows. The values should be 0 through 23, to match what DATEPART returns for the hour.
Right join this onto your existing query.
You might need to tweak this to make sure the NULLs for empty hours are handled to your satisfaction.
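For illustration, here is a minimal sketch of that lookup-table approach on SQL Server 2005; the log table name and its columns (colA, Timestamp) are placeholders taken from the question:
-- Build the hour lookup table once (0-23, matching DATEPART's hour values)
CREATE TABLE dbo.LU_Hours (Hours int NOT NULL PRIMARY KEY);

DECLARE @h int;
SET @h = 0;
WHILE @h <= 23
BEGIN
    INSERT INTO dbo.LU_Hours (Hours) VALUES (@h);
    SET @h = @h + 1;
END

-- Drive the summary from LU_Hours so every hour appears, with or without data
-- (a LEFT JOIN from LU_Hours is equivalent to RIGHT JOINing it onto the existing query)
SELECT lu.Hours AS "Hour",
       ISNULL(SUM(t.colA), 0) AS "Total 1"
FROM dbo.LU_Hours lu
LEFT JOIN [your log table] t
       ON lu.Hours = DATEPART(hh, CONVERT(smalldatetime, CONVERT(float, t.Timestamp) / 1440 - 1))
GROUP BY lu.Hours
ORDER BY lu.Hours;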

You could use a WITH clause to create the 24 hours, then OUTER JOIN it.
WITH hours AS (
SELECT 0 AS hour
UNION
SELECT 1 AS hour
...
SELECT 23 AS hour
)
SELECT *
FROM hours h
LEFT OUTER JOIN [your table] x ON h.hour = datepart(hh, convert(smalldatetime, convert(float, x.Timestamp) / 1440 - 1))
This SQL would need to be added to a Command.
Group as necessary in the SQL or the report.
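If writing out 24 UNIONed SELECTs feels unwieldy, a recursive CTE (available in SQL Server 2005) can generate the same hour list; a quick sketch:
-- Generates the values 0 through 23; substitute this for the long UNION list above
;WITH hours AS (
    SELECT 0 AS hour
    UNION ALL
    SELECT hour + 1 FROM hours WHERE hour < 23
)
SELECT hour FROM hours;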

Related

SSRS: Why am I getting this aggregation?

I've recently discovered that SSRS is doing a bizarre aggregation, and I really don't understand why. In this report I'm building, as with other SQL queries I've written, I have a tendency to take preliminary results from an initial query, throw them into a temp table, and then run another query that joins on that temp table to get the 'final' results I need to display. Here's an example:
--1. This query fetches all available rows based on the day (must be last day of month)
SELECT DISTINCT Salesperson ,c.Cust_Alias ,cost ,eomonth(CreateDate) createdate ,FaxNumber
INTO #equip
FROM PDICompany_2049_01.dbo.Customers c
JOIN PDICompany_2049_01.dbo.Customer_Locations cl ON c.Cust_Key = cl.CustLoc_Cust_Key
JOIN ricocustom..Equipment_OLD e ON e.FaxNumber = c.Cust_ID + '/' + cl.CustLoc_ID
JOIN PDICompany_2049_01.dbo.Charges ch ON ch.Chg_CustLoc_Key = cl.CustLoc_Key
WHERE Salesperson = @Salesperson
AND ch.Chg_Balance = 0
--2. This query fetches first result set, but filters further for matching date variable
SELECT DISTINCT (cost) EquipCost ,Salesperson ,DATEPART(YEAR, CreateDate) YEAR
,DATEPART(MONTH, CreateDate) MONTH ,Cust_Alias ,FaxNumber
INTO #equipcost
FROM #equip
WHERE Salesperson = @Salesperson
AND DATEPART(MONTH, CreateDate) = DATEPART(MONTH, @Start)
AND DATEPART(year, CreateDate) = DATEPART(year, @Start)
ORDER BY Cust_Alias
--3. Finally, getting sum of the EquipCost, with other KPI's, to put into my final result set
SELECT sum(EquipCost) EquipCost ,Salesperson ,YEAR ,MONTH ,Cust_Alias
INTO #temp_equipcost
FROM #equipcost
GROUP BY Salesperson ,year ,month ,Cust_Alias
Now I am aware that, in hindsight, I could easily have reduced this to 2 queries instead of 3 (and I have since gotten my results into a single query). But that's where I'm looking for the answer. In my GUI report, one row was showing 180 for EquipCost, but my query was showing 60. It wasn't until I collapsed the query into a single statement (as opposed to the 3 steps) that the report also displayed 60; the query itself returns 60 either way.
I actually had this happen in another query as well, where I had 2 temp table result sets, but when I condensed it into one, my GUI report worked as expected.
Any ideas why using multiple temp tables would affect my results in the GUI report in SQL Report Builder (NOT USING VB HERE!), while the same SQL query in SSMS works as expected? And to be clear, condensing the query as described is the only change I made; the GUI report in Report Builder is extremely basic, so there is nothing unusual going on with grouping, expressions, etc.
My best guess is that you accidentally had a situation where the temp tables were not properly cleared (or were populated more than once). As an alternative to temp tables, you could use table variables. Equally, you could use a single query against the production tables -- using CTEs if you want it to "feel" like 3 separate queries.
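As a rough sketch of that single-query route, the three steps above could be folded into one statement with a CTE. The table and column names come from the question, the parameter names are assumed to be @Salesperson and @Start, and the DISTINCT/grain handling may need adjusting to reproduce the intended totals:
;WITH equip AS (
    -- Step 1 (formerly #equip): the distinct base rows
    SELECT DISTINCT Salesperson, c.Cust_Alias, cost, EOMONTH(CreateDate) AS CreateDate, FaxNumber
    FROM PDICompany_2049_01.dbo.Customers c
    JOIN PDICompany_2049_01.dbo.Customer_Locations cl ON c.Cust_Key = cl.CustLoc_Cust_Key
    JOIN ricocustom..Equipment_OLD e ON e.FaxNumber = c.Cust_ID + '/' + cl.CustLoc_ID
    JOIN PDICompany_2049_01.dbo.Charges ch ON ch.Chg_CustLoc_Key = cl.CustLoc_Key
    WHERE Salesperson = @Salesperson
      AND ch.Chg_Balance = 0
)
-- Steps 2 and 3 (formerly #equipcost / #temp_equipcost): filter to @Start's month and aggregate
SELECT SUM(cost) AS EquipCost,
       Salesperson,
       DATEPART(YEAR, CreateDate) AS [Year],
       DATEPART(MONTH, CreateDate) AS [Month],
       Cust_Alias
FROM equip
WHERE DATEPART(MONTH, CreateDate) = DATEPART(MONTH, @Start)
  AND DATEPART(YEAR, CreateDate) = DATEPART(YEAR, @Start)
GROUP BY Salesperson, DATEPART(YEAR, CreateDate), DATEPART(MONTH, CreateDate), Cust_Alias;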

SQL Server Stored Procedure get nearest available date to parameter

I have a table of database size information. The data is collected daily. However, some days are missed for various reasons. Additionally, we have databases which come and go over time, or whose size does not get recorded for a day or two. This all leads to very inconsistent data collection regarding dates. I want to construct a SQL procedure which will generate a percentage of change between any two dates (1 week, monthly, quarterly, etc.) for ALL databases. The problem is what to do if a chosen date is missing (no rows for that date, or no row for one or more databases on that date). What I want to be able to do is get the nearest available date for each database for each of the two dates (begin and end).
For instance, if database Mydb has these recording dates:
2015-05-03
2015-05-04
2015-05-05
2015-05-08
2015-05-09
2015-05-10
2015-05-11
2015-05-12
2015-05-14
and I want to compare 2015-05-06 with 2015-05-14
The 2015-05-07 date is missing so I would want to use the next available date which is 2015-05-08. Keep in mind, MyOtherDB may only be missing the 2015-05-06 date but have available the 2015-05-07 date. So, for MyOtherDb I would be using 2015-05-07 for my comparison.
Is there a way to proceduralize this with SQL WITHOUT using a CURSOR?
You're reading too much into this; simply use BETWEEN in your WHERE clause with the two parameters.
In your example, if you perform the query:
SELECT * FROM DATABASE_AUDIT WHERE DATE BETWEEN param1 /*2015-05-06*/ and param2 /*2015-05-14*/
It will give you the desired results.
select (b.dbsize - a.dbsize) / a.dbsize * 100 dbSizeChangePercent from
( select top 1 * from dbAudit where auditDate = (select min(auditDate) from dbAudit where auditDate between '01/01/2015' and '01/07/2015')) a
cross join
(select top 1 * from dbAudit where auditDate = (select max(auditDate) from dbAudit where auditDate between '01/01/2015' and '01/07/2015')) b
The TOP 1 can be replaced by a GROUP BY. This assumes only one database audit row per day.
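To illustrate the "nearest available date per database" part of the question, here is one possible cursor-free sketch using CROSS APPLY (a technique not shown in the answers above). The schema (dbAudit with dbName, auditDate, and dbsize columns) is an assumption, so adjust names to fit:
DECLARE @BeginDate date = '2015-05-06',
        @EndDate   date = '2015-05-14';

SELECT d.dbName,
       (e.dbsize - b.dbsize) * 100.0 / b.dbsize AS dbSizeChangePercent
FROM (SELECT DISTINCT dbName FROM dbAudit) d
CROSS APPLY (SELECT TOP 1 a.dbsize           -- nearest recorded size on/after the begin date
             FROM dbAudit a
             WHERE a.dbName = d.dbName AND a.auditDate >= @BeginDate
             ORDER BY a.auditDate) b
CROSS APPLY (SELECT TOP 1 a.dbsize           -- nearest recorded size on/before the end date
             FROM dbAudit a
             WHERE a.dbName = d.dbName AND a.auditDate <= @EndDate
             ORDER BY a.auditDate DESC) e;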

Why SQL query with short date range runs slow, while a long range runs quickly?

Platform: SQL Server 2008 R2
I have a reasonably complex SQL query that queries two distinct databases and collects a report based on the date range of the records. Searching through over 3,000,000 rows for a date range of, say, 2 months is nearly instantaneous. However, searching a short date range of, say, 7 days takes nearly two minutes. For the life of me, I cannot understand why this would be. Here's the query below:
;with cte_assets as (
select a.f_locationid, a.f_locationparent, 0 as [lev], convert(varchar(30), '0_' + convert(varchar(10), f_locationid)) lineage
from [db_assets].[dbo].[tb_locations] a where f_locationID = '366' UNION ALL
select a.f_locationid
,a.f_locationparent
,c.[lev] + 1
,convert(varchar(30), lineage + '_' + convert(varchar(10), a.f_locationid))
from cte_assets c
join [db_assets].[dbo].[tb_locations] a
on a.f_locationparent = c.f_locationID
),
cte_a AS
(
select f_assetnetbiosname as 'Computer Name'
from cte_assets c
JOIN [db_assets].[dbo].[tb_assets] ass on ass.f_assetlocation = c.f_locationID
)
select apps.f_applicationname, apps.f_applicationID, sum(f_runtime/60) [RunTime]
from cte_a c
JOIN [db_reports].[dbo].[tb_applicationusage] ss on ss.f_computername = c.[Computer Name]
JOIN [db_reports].[dbo].[tb_applications] apps
ON ss.f_application = apps.f_applicationID
WHERE ss.f_runtime IS NOT NULL AND f_starttime BETWEEN '1/26/2015 10:55:03 AM' AND '2/12/2015 10:55:03 AM'
group by apps.f_applicationname, ss.f_application, apps.f_applicationID
ORDER BY RunTime DESC
The final WHERE clause (3rd-to-last line) is where the date range is specified. The date range shown in the query of BETWEEN '1/26/2015 10:55:03 AM' AND '2/12/2015 10:55:03 AM' works quickly without issues. If we change the query to, for example, BETWEEN '1/27/2015 10:55:03 AM' AND '2/12/2015 10:55:03 AM' (just a day later) it takes over two minutes to run. I have absolutely no idea why a short range would cause the query to run more slowly. Any assistance is appreciated.
Thanks,
Beems
I apologize, I should have Googled this problem more thoroughly first. I found the answer on another StackOverflow post regarding the need to update statistics:
SQL query takes longer time when date range is smaller?
Using the MSDN documentation on UPDATE STATISTICS (not the article linked in that post, but this one):
Updates query optimization statistics on a table or indexed view. By default, the query optimizer already updates statistics as necessary to improve the query plan; in some cases you can improve query performance by using UPDATE STATISTICS or the stored procedure sp_updatestats to update statistics more frequently than the default updates.
Updating statistics ensures that queries compile with up-to-date statistics. However, updating statistics causes queries to recompile. We recommend not updating statistics too frequently because there is a performance tradeoff between improving query plans and the time it takes to recompile queries. The specific tradeoffs depend on your application. UPDATE STATISTICS can use tempdb to sort the sample of rows for building statistics.
I ran the following:
USE db_reports;
GO
UPDATE STATISTICS tb_applicationusage;
GO
2 seconds later the update was complete, and my short-term recent queries now run quickly.
Thanks,
Beems

SQL Server: Selecting a single datapoint every hour or every day from a table

Hi, I have several tables in SQL Server that record data every 30 seconds. Obviously, after a while that gets a bit bulky. What I'd like to do is select one data point per hour over the past 24 hours and put that into a separate table for fast queries; so I was thinking once every hour over a 24-hour period, twice a day over a week-long period, and once a day over a month-long period. I have a datetime recorded on every data point we have.
I'd like something like this for the once an hour over 24 hours
Select * from MiscTBL Where Date >= (( Currentdatetime - 24 hh )) group by hh
thank you for any advice
Also, I'm using SQL Server Management Studio. It would be great if this were an automatically updating process, so that I had separate tables I could use for faster queries of data over shorter, pre-quantified time periods.
Something like this would return 1 sample per hour:
select *
from ActivityLog
where id in
(select max(id) maxID
from ActivityLog
where activityDateTime between @startDateTime and @endDateTime
group by DATEPART(hour, activityDateTime))
I would use this concept to build a stored proc that moved the data around and then I would schedule it to run as often as needed using a SQL Agent job.
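As a sketch of that idea, a stored procedure along these lines could be scheduled with a SQL Agent job. The summary table name (HourlySummary) is an assumption, and its columns are assumed to mirror ActivityLog:
CREATE PROCEDURE dbo.usp_SampleHourly
    @startDateTime datetime,
    @endDateTime   datetime
AS
BEGIN
    SET NOCOUNT ON;

    -- Copy the last row of each hour in the window (expected to span at most 24 hours)
    -- into the summary table
    INSERT INTO dbo.HourlySummary
    SELECT *
    FROM dbo.ActivityLog
    WHERE id IN (SELECT MAX(id)
                 FROM dbo.ActivityLog
                 WHERE activityDateTime BETWEEN @startDateTime AND @endDateTime
                 GROUP BY DATEPART(hour, activityDateTime));
END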

MS Access : Average and Total Calculation in Single Query

INTRODUCTION TO DATABASE TABLE BEING USED -
I am working on a “Stock Market Prices” based Database Table. My table has got the data for the following FIELDS –
ID
SYMBOL
OPEN
HIGH
LOW
CLOSE
VOLUME
VOLUME CHANGE
VOLUME CHANGE %
OPEN_INT
SECTOR
TIMESTAMP
New data gets added to the table daily “Monday to Friday”, based on the stock market price changes for that day. The current requirement is based on the VOLUME field, which shows the volume traded for a particular stock on a daily basis.
REQUIREMENT –
To get the Average and Total Volume for the last 10, 15, and 30 days respectively.
METHOD USED CURRENTLY -
I created these 9 SEPARATE QUERIES in order to get my desired results –
First, I created these 3 queries to pull out the most recent 10, 15, and 30 dates from the current table:
qryLast10DaysStored
qryLast15DaysStored
qryLast30DaysStored
Then I have created these 3 queries for getting the respective AVERAGES:
qrySymbolAvgVolume10Days
qrySymbolAvgVolume15Days
qrySymbolAvgVolume30Days
And then I have created these 3 queries for getting the respective TOTALS:
qrySymbolTotalVolume10Days
qrySymbolTotalVolume15Days
qrySymbolTotalVolume30Days
PROBLEM BEING FACED WITH CURRENT METHOD -
Now, my problem is that I have ended up with so many different queries, whereas I wanted to get the output into One Single Query, as shown in this snapshot of an Excel sheet:
http://i49.tinypic.com/256tgcp.png
SOLUTION NEEDED -
Is there some way by which I can get these required fields into ONE SINGLE QUERY, so that I do not have to look into multiple places for the required fields? Can someone please tell me how to get all these separate queries into one -
A) Either by taking out or moving the results from these separate individual queries to one.
B) Or by making a new query which calculates all these fields within itself, so that these separate individual queries are no longer needed. This would be a better solution I think.
One Clarification about Dates –
Some might wonder why I used the method of taking the TOP 10, 15, and 30 rows to get the last 10, 15, and 30 date values. Why didn't I just use the PC date for these values? Or use something like -
("VOLUME","tbl-B", "TimeStamp BETWEEN Date() - 10 AND Date()")
The answer is that I require my query to "read" the date from the "TIMESTAMP" field, and then perform its calculations accordingly for the LAST / MOST RECENT "10 days, 15 days, 30 days" FOR WHICH DATA IS AVAILABLE IN THE TABLE, REGARDLESS OF WHAT THE CURRENT DATE IS. It should not depend upon the current date in any way.
If there is a better method or more efficient way to create these queries, please enlighten me.
You have separate queries to compute 10DayTotalVolume and 10DayAvgVolume. I suspect you can compute both in one query, qry10DayVolumes.
SELECT
b.SYMBOL,
Sum(b.VOLUME) AS 10DayTotalVolume,
Avg(b.VOLUME) AS 10DayAvgVolume
FROM
[tbl-B] AS b INNER JOIN
qryLast10DaysStored AS q
ON b.TIMESTAMP = q.TIMESTAMP
GROUP BY b.SYMBOL;
However, that makes me wonder whether 10DayAvgVolume can ever be anything other than 10DayTotalVolume / 10.
Similar considerations apply to the 15 and 30 day values.
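For reference, here is a sketch of the 15-day counterpart, assuming qryLast15DaysStored returns the 15 most recent TIMESTAMP values; the 30-day query would follow the same pattern:
SELECT
    b.SYMBOL,
    Sum(b.VOLUME) AS [15DayTotalVolume],
    Avg(b.VOLUME) AS [15DayAvgVolume]
FROM
    [tbl-B] AS b INNER JOIN
    qryLast15DaysStored AS q
    ON b.TIMESTAMP = q.TIMESTAMP
GROUP BY b.SYMBOL;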
Ultimately, I think you want something based on a starting point like this:
SELECT
q10.SYMBOL,
q10.[10DayTotalVolume],
q10.[10DayAvgVolume],
q15.[15DayTotalVolume],
q15.[15DayAvgVolume],
q30.[30DayTotalVolume],
q30.[30DayAvgVolume]
FROM
(qry10DayVolumes AS q10
INNER JOIN qry15DayVolumes AS q15
ON q10.SYMBOL = q15.SYMBOL)
INNER JOIN qry30DayVolumes AS q30
ON q10.SYMBOL = q30.SYMBOL;
That assumes you have created qry15DayVolumes and qry30DayVolumes following the approach I suggested for qry10DayVolumes.
If you want to cut down the number of queries, you could use subqueries for each of the qry??DayVolumes saved queries, but try it this way first to make sure the logic is correct.
In that second query above, there can be a problem due to field names which start with digits. Enclose those names in square brackets or re-alias them in qry10DayVolumes, qry15DayVolumes, and qry30DayVolumes using alias names which begin with letters instead of digits.
I tested the query as written above with the "2nd Upload.mdb" you uploaded, and it ran without error from Access 2007. Here is the first row of the result set from that query:
SYMBOL 10DayTotalVolume 10DayAvgVolume 15DayTotalVolume 15DayAvgVolume 30DayTotalVolume 30DayAvgVolume
ACC-1 42909 4290.9 54892 3659.46666666667 89669 2988.96666666667
Access doesn't support most advanced SQL syntax and clauses, so this is a bit of a hack, but it works, and it is fast on your small sample. You're basically running 3 queries, but the UNION clauses allow you to combine them into one:
select
Symbol,
sum([10DayTotalVol]) as 10DayTotalV,
sum([10DayAvgVol]) as 10DayAvgV,
sum([15DayTotalVol]) as 15DayTotalV,
sum([15DayAvgVol]) as 15DayAvgV,
sum([30DayTotalVol]) as 30DayTotalV,
sum([30DayAvgVol]) as 30DayAvgV
from (
select
Symbol,
sum(volume) as 10DayTotalVol, avg(volume) as 10DayAvgVol,
0 as 15DayTotalVol, 0 as 15DayAvgVol,
0 as 30DayTotalVol, 0 as 30DayAvgVol
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 10 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
UNION
select
Symbol,
0, 0,
sum(volume), avg(volume),
0, 0
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 15 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
UNION
select
Symbol,
0, 0,
0, 0,
sum(volume), avg(volume)
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 30 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
) s
group by
Symbol
