Windowing function: Finding max value from LAG() - sql-server

I'm currently working my way through the exam study book, Querying Microsoft SQL Server 2012. I've been learning SQL over the last few months and I am currently looking over windowing functions. I came to this application question and it got me thinking about another question, which I'll list below:
So in the columns diffprev and diffnext it only lists the difference between the previous and the next value. How could I list the maximum difference between subsequent values across all of the rows (partitioned by custid)? So just scanning the table, I see that in custid 1's history, the greatest difference between subsequent rows is $548. Then for custid 2, the greatest difference is $390.95. I could see these values appearing in a maxdiff column across all the rows pertaining to the partition.
Thank you for aiding my studying!

If you're just looking for the value, this should work:
with cte as (
select custid, val - lag(val)
over (partition by custid order by orderdate, orderid) as diffprev
from Sales.OrderValues
)
select custid, max(abs(diffprev)) as maxdiff
from cte
group by custid;
If you want the details of the rows that attain that maximum, it'll be a bit more work.
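As a rough sketch of that extra work: window functions can't be nested in T-SQL, so one option is to compute the difference in a first CTE and take the per-customer maximum over it in a second. This assumes Sales.OrderValues has the custid, orderid, orderdate and val columns used above; the CTE names and the maxdiff alias are illustrative only.

```sql
-- Sketch only: column names are taken from the query above; CTE names are made up.
with diffs as (
    select custid, orderid, orderdate, val,
           val - lag(val) over (partition by custid
                                order by orderdate, orderid) as diffprev
    from Sales.OrderValues
),
with_max as (
    -- max() ignores the NULL diffprev on each customer's first order
    select *,
           max(abs(diffprev)) over (partition by custid) as maxdiff
    from diffs
)
select custid, orderid, orderdate, val, diffprev, maxdiff
from with_max
-- Uncomment to keep only the rows that attain the maximum:
-- where abs(diffprev) = maxdiff
order by custid, orderdate, orderid;
```

Without the filter, maxdiff repeats on every row of the partition, which is what the question describes. With the filter, only the rows where the largest jump occurred remain.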
Bonus tip - pictures of text are the worst. You're more likely to get help if the people helping don't need to type your code out. Even better though would be a fully functioning example (complete with table definitions and sample data) so we can verify against your data!

Related

Is there a way to sum an entire quantity in SQL with unique values

I am trying to get a total of both the ItemDetail.Quantity column and the ItemDetail.NetPrice column. For the sake of example, say the quantities listed for the individual items are 5, 2, and 4. Is there a way to display the quantity as 11 for a single ItemGroup.ItemGroupName?
The query I am using is listed below:
select Location.LocationName, ItemDetail.DOB, SUM (ItemDetail.Quantity) as "Quantity",
ItemGroup.ItemGroupName, SUM (ItemDetail.NetPrice)
from ItemDetail
Join ItemGroupMember
on ItemDetail.ItemID = ItemGroupMember.ItemID
Join ItemGroup
on ItemGroupMember.ItemGroupID = ItemGroup.ItemGroupID
Join Location
on ItemDetail.LocationID = Location.LocationID
Inner Join Item
on ItemDetail.ItemID = Item.ItemID
where ItemGroup.ItemGroupID = '78' and DOB = '11/20/2019'
GROUP BY Location.LocationName, ItemDetail.DOB, Item.ItemName,
ItemDetail.NetPrice, ItemGroup.ItemGroupName
If you are using SQL Server 2012 or later, you can use a windowed sum (SUM ... OVER) to display the details and the aggregates in the same query, for example:
SUM(SalesYTD) OVER (ORDER BY DATEPART(yy, ModifiedDate))
Link: https://learn.microsoft.com/en-us/sql/t-sql/functions/sum-transact-sql?view=sql-server-ver15
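Applied to the tables in the question, the windowed-sum idea might look roughly like this. It is a sketch, not a tested query: the table and column names come from the question, while the GroupQuantity and GroupNetPrice aliases and the choice to partition by ItemGroupName are assumptions.

```sql
-- Sketch only: every detail row is kept, and the group totals repeat on each row.
select Location.LocationName,
       ItemDetail.DOB,
       ItemGroup.ItemGroupName,
       ItemDetail.Quantity,
       ItemDetail.NetPrice,
       sum(ItemDetail.Quantity) over (partition by ItemGroup.ItemGroupName) as GroupQuantity,
       sum(ItemDetail.NetPrice) over (partition by ItemGroup.ItemGroupName) as GroupNetPrice
from ItemDetail
join ItemGroupMember on ItemDetail.ItemID = ItemGroupMember.ItemID
join ItemGroup on ItemGroupMember.ItemGroupID = ItemGroup.ItemGroupID
join Location on ItemDetail.LocationID = Location.LocationID
where ItemGroup.ItemGroupID = '78'
  and ItemDetail.DOB = '11/20/2019';
```

With the example quantities 5, 2 and 4, each of the three rows would show its own Quantity alongside a GroupQuantity of 11.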
We can't be certain without seeing sample data. But I suspect you need to remove some fields from your GROUP BY clause -- probably Item.ItemName and ItemDetail.NetPrice.
Generally, you won't GROUP BY a column that you are applying an aggregate function to in the SELECT -- as in SUM(ItemDetail.NetPrice). And it is not very common, in my experience, to GROUP BY columns that aren't included in the SELECT list - as you are doing with Item.ItemName.
I think you need to go back to basics and read about what GROUP BY does.
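To make the suggestion concrete, here is a sketch of the question's query with Item.ItemName and ItemDetail.NetPrice dropped from the GROUP BY. Since no sample data was posted, treat it as an illustration rather than a verified fix. The NetPrice alias is an addition, and the join to Item is kept only so the same rows qualify as in the original.

```sql
-- Sketch only: group by exactly the non-aggregated columns in the SELECT list.
select Location.LocationName,
       ItemDetail.DOB,
       ItemGroup.ItemGroupName,
       sum(ItemDetail.Quantity) as Quantity,
       sum(ItemDetail.NetPrice) as NetPrice
from ItemDetail
join ItemGroupMember on ItemDetail.ItemID = ItemGroupMember.ItemID
join ItemGroup on ItemGroupMember.ItemGroupID = ItemGroup.ItemGroupID
join Location on ItemDetail.LocationID = Location.LocationID
join Item on ItemDetail.ItemID = Item.ItemID  -- kept from the original query
where ItemGroup.ItemGroupID = '78'
  and ItemDetail.DOB = '11/20/2019'
group by Location.LocationName, ItemDetail.DOB, ItemGroup.ItemGroupName;
```

This returns one row per location, date and item group. The example quantities 5, 2 and 4 would therefore collapse into a single row with Quantity 11, provided they share the same location and date.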
First of all welcome to the overflow...
Second: The answer is going to be "It depends"
Any time you aggregate data you will need to Group by the other fields in the query, and you have that in the query. The gotcha is what happens when data is spread across multiple locations.
My suggestion is to rethink your problem and see if you really need these other fields in the query. This will depend on what the person using the data really wants to know.
Do they need to know how many of item X there are, or do they really need to know that item X is spread out over three sites?
You might find you are better off with two smaller queries.

SQL column update to solve data quality issue

I am unable to figure out how to write 'smart code' for this.
In this case I would like the end result for the first two cases to be:
product_cat_name
A_SEE
A_BEE
The business rule is that one product_cat_name can belong to only one group, but due to data quality issues we sometimes have a product_cat_name belonging to two different groups. As a special case, in that situation we would like to append the group to the product_cat_name so that product_cat_name becomes unique.
It sounds so simple yet I am cracking my head over this.
Any help much appreciated.
Something like this:
with names as (
-- [group] is bracketed because GROUP is a reserved word
select qry.prod_cat_nm, qry.prod_cat_nm + qry.[group] as new_nm
from (query that joins 3 tables together) as qry
join
(select prod_cat_nm, count(distinct [group]) as group_cnt
from (query that joins 3 tables together) as x
group by
prod_cat_nm
having count(distinct [group]) > 1) as dups
on dups.prod_cat_nm = qry.prod_cat_nm
)
SELECT prod_cat_nm, STRING_AGG(new_nm, '') WITHIN GROUP (ORDER BY new_nm ASC) AS new_prod_cat_nm
FROM names
GROUP BY prod_cat_nm;
I've used STRING_AGG() (SQL Server 2017 and later) here as it's the shortest to write, but you could easily change this to use a recursive CTE or FOR XML PATH.
It is simple if you break it down into small pieces.
You need to UPDATE the table obviously, and change the value of product_cat_name. That's easy.
The new value should be group + product_cat_name. That's easy.
You only want to do this when a product_cat_name is associated with more than one group. That's probably the tricky part, but it can also be broken down into small pieces that are easy.
You need to identify which product_cat_names have more than one group. That's easy. GROUP BY product_cat_name HAVING COUNT(DISTINCT Group) > 1.
Now you need to use that to limit your UPDATE to only those product_cat_names. That's easy. WHERE product_cat_name IN (Subquery using above logic to get PCNs that have more than one Group).
All easy steps. Put them together and you've got your solution.
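Put together, the steps above could be sketched as a single UPDATE. The table name dbo.product_cat is hypothetical (the question does not name its table), both columns are assumed to be character types, and the '_' separator is a guess. Swap the operands if the group should come first.

```sql
-- Sketch only: dbo.product_cat is a hypothetical table with one row per
-- (product_cat_name, [group]) pair; substitute your own table.
update pc
set product_cat_name = pc.product_cat_name + '_' + pc.[group]  -- '_' is an assumed separator
from dbo.product_cat as pc
where pc.product_cat_name in (
    -- names that are associated with more than one group
    select product_cat_name
    from dbo.product_cat
    group by product_cat_name
    having count(distinct [group]) > 1
);
```

It is worth running the inner SELECT on its own first to confirm that it returns only the names you expect to change.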

RunningDifference(x) for multiple x values

My table is tbl_data(event_time, monitor_id, type, event_date, status), and my query is:
select status, sum(runningDifference(event_time)) as delta
from (
    SELECT status, event_date, event_time
    FROM tbl_data
    WHERE event_date >= '2018-05-01' AND monitor_id = 3
    ORDER BY event_time ASC
)
group by status
The result is:
status delta
1 4665465
2 965
This query gives me the right answer for a single monitor_id. Now I need the same result for multiple monitor_id values.
How can I achieve this in a single query?
Usually this is achieved with conditional expressions, such as SELECT ..., if(monitor_id = 1, status, NULL) AS status1, ..., followed by your aggregate function, which, as you might know, skips NULL values. But I did some testing, and it turns out that because of ClickHouse internals, runningDifference() can't distinguish columns that originate from the same source, although it distinguishes columns from different sources just fine. It is a bug.
I opened an issue on Github: https://github.com/yandex/ClickHouse/issues/2590
UPDATE: Devs reacted incredibly fast and with the latest source from master you can get what you want with the strategy I described. See the issue for code example.

MDX Query: Trying to bring back calculated years

I have a piece of a report which currently works fine with hard-coded years. I am trying to make the years dynamic, so that the report shows the current and previous year without my having to update it every year. This is a simple thing to do in SQL, but I'm having a much harder time figuring it out in MDX, which I am still learning. The row in question is [Date].[Year].&[2013]:[Date].[Year].&[2014]. Here is my current query:
SELECT {
[Measures].[Users],
[Measures].[Sessions]
} ON COLUMNS,
(
{[User Type].[Description].&[Customer], [User Type].[Description].&[Vendor]},
[Date].[Year Month].[Year Month],
[Date].[Month Name].[Month Name],
[Date].[Year].&[2013]:[Date].[Year].&[2014]
) ON ROWS
FROM [My Cube]
Thanks for any help.
You can't use a range inside a tuple. You first need to create a member.
Put this in before the SELECT clause:
WITH Member [Date].[Periods] as Aggregate( [Date].[Year].&[2013]:[Date].[Year].&[2014])
and then replace the range in your tuple by [Date].[Periods]

maximum rows per value vs typical rows per value [teradata]

I've been reading about data demographics in Teradata and came across these two terms. It is mentioned that the two go hand in hand when making a good index choice, but I can't seem to understand exactly what the difference between the two values is.
Can anyone explain to me the exact difference between the two. Examples on how the values are derived would be really helpful.
I'm thinking both values will come from this query:
sel <columnname>, count(*)
from <tablename>
group by <columnname>
Here are the definitions of the two terms, btw.
Maximum Rows/Value: number of rows for the most-often-occurring value in the column.
Typical Rows/Value: number of rows for a typical value in the column.
Any inputs will be much appreciated.
Thank you.
Here is my understanding of Maximum Rows/Value vs Typical Rows/Value.
Suppose you run this query (SQL Fiddle link: http://sqlfiddle.com/#!4/27641/13/0):
SELECT MAX (COUNT ("sometext")) max_row_per_value
FROM table1
GROUP BY id
And here is the result
MAX_ROW_PER_VALUE
7
In this case, when you look at id=1, there are 7 records for that value, being the maximum rows/value.
The typical rows/value is what I consider the AVG(), like this:
SELECT AVG (COUNT ("sometext")) typical_row_per_value
FROM table1
GROUP BY id
Result
TYPICAL_ROW_PER_VALUE
4.5
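Nesting aggregates as MAX(COUNT(...)) is Oracle syntax, and many other engines reject it. A more portable sketch counts rows per value in a derived table first and then aggregates those counts. It uses table1 and id from the example above. The rows_per_value and per_value names are made up, and treating the average as the "typical" figure simply follows the interpretation in this answer.

```sql
-- Sketch only: one row per distinct id with its row count, then aggregate the counts.
select max(rows_per_value) as max_rows_per_value,
       avg(cast(rows_per_value as decimal(18, 2))) as avg_rows_per_value
from (
    select id, count(*) as rows_per_value
    from table1
    group by id
) as per_value;
```

The cast avoids integer averaging on engines where AVG of an integer column returns an integer.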
