I am fairly new to SQL and I can't figure out what to do here.
I have a database of financial data with one column being [Days]. I need to add a new column that assigns a category based on the bucket the number of days falls into (0, 1-30, 30-60, etc.).
In excel this would look like this =IF(A1>90,"90-120",IF(A1>60,"60-90".......)
The final database should look like this:
Days | Category
29 | 0-30
91 | 90-120
0 | 0
.
.
.
Thanks in advance.
You can use case:
select days,
(case when days > 90 then '90-120' -- should this be >= ?
when days > 60 then '60-90' -- should this be >= ?
. . .
end) as Category
from t;
Complete SQL:
select Days,
(case when days > 90 then '91-120'
      when days > 60 then '61-90'
      when days > 30 then '31-60'
      when days > 0 then '1-30'
      else '0'
 end) as Category
from t;
Here is another way using IIF if you are using SQL Server 2012+:
CREATE TABLE Numbers (Number INT );
INSERT INTO Numbers VALUES
(1),
(0),
(15),
(29),
(32),
(54),
(59),
(60),
(63),
(89),
(90),
(140);
SELECT IIF(Number BETWEEN 90 AND 120, '90-120',
IIF(Number BETWEEN 60 AND 89, '60-90',
IIF(Number BETWEEN 30 AND 59 , '30-60' ,
IIF(Number BETWEEN 1 AND 29, '1-30' ,
IIF(Number = 0, '0', 'OutRange'))))) AS Category
FROM Numbers;
Try this:
create table #tmp ([Days] int)
insert into #tmp values (29)
insert into #tmp values (91)
insert into #tmp values (0)
insert into #tmp values (65)
SELECT
    CASE WHEN [Days] = 0 THEN CONVERT(VARCHAR(15), 0)
         -- integer division: [Days]/30*30 rounds down to the nearest multiple of 30
         ELSE CONVERT(VARCHAR(15), [Days]/30*30) + '-' + CONVERT(VARCHAR(15), ([Days]/30*30) + 30)
    END AS Category
FROM #tmp
drop table #tmp
The same integer-division trick can build the label directly from the Numbers table above:
select *, number/30 as Bucket, ltrim(number/30*30)+'-'+ltrim((number/30+1)*30) as Category from Numbers
+--------+--------+----------+
| Number | Bucket | Category |
+--------+--------+----------+
|      1 |      0 | 0-30     |
|      0 |      0 | 0-30     |
|     15 |      0 | 0-30     |
|     29 |      0 | 0-30     |
|     32 |      1 | 30-60    |
|     54 |      1 | 30-60    |
|     59 |      1 | 30-60    |
|     60 |      2 | 60-90    |
|     63 |      2 | 60-90    |
+--------+--------+----------+
One solution to your dilemma may be to add a new column to your table using a SQL Server feature known as a "Computed Column Specification".
A Computed Column Specification is a method whereby a database column's value is calculated when the row is inserted or updated. That value can optionally also be persisted in the database, so that no calculation has to be performed at query time (only on the INSERT or UPDATE).
I like this solution because you don't have to do any special calculations upon querying the data. You'll pull the new column data with a simple SELECT.
You didn't list specifics, so let's suppose that your database table is named [FinancialData], and that it has defined in it a column named [Days] that is of some numeric type (int, smallint, tinyint, decimal, float, money, numeric, or real).
You can add the computed column as follows:
ALTER TABLE [FinancialData] ADD
Category AS (CASE WHEN [Days] >= 90 THEN '90-120'
WHEN [Days] >= 60 THEN '60-90'
WHEN [Days] >= 30 THEN '30-60'
WHEN [Days] >= 1 THEN '1-30'
WHEN [Days] = 0 THEN '0'
END) PERSISTED;
Note the word "PERSISTED" in the SQL statement above. This is what causes the database table to actually store the calculated value in the database table when the [Days] column is inserted or changed. If you don't want to store the value, simply leave out the word "PERSISTED".
When the computed column is added to the table by executing the SQL statement above, values will be computed and stored for all existing rows in the table. When inserting a new row into the table, do not supply a value for the new [Category] column. This is because a) it won't work, and b) that column's value will be computed from the [Days] column value.
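For example, an insert might look like this (a sketch assuming [Days] is the only other column you need to supply; your real table will have more):
-- [Category] is not listed: SQL Server computes it from [Days] automatically
INSERT INTO [FinancialData] ([Days]) VALUES (29);   -- Category becomes '1-30'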
To retrieve data from the new column, you simply list that column in the SELECT statement (or use *):
SELECT [Days], [Category]
FROM [FinancialData];
A couple of caveats to note: 1) This is SQL Server specific. Most other database engines have no support for this feature. 2) You didn't state whether the [Days] column is nullable - if so, this solution will have to be modified to support that.
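For caveat 2, if [Days] is nullable, one hedged option is to give NULL its own branch in the computed column definition (the 'Unknown' label is only a placeholder):
ALTER TABLE [FinancialData] ADD
Category AS (CASE WHEN [Days] IS NULL THEN 'Unknown'
                  WHEN [Days] >= 90 THEN '90-120'
                  WHEN [Days] >= 60 THEN '60-90'
                  WHEN [Days] >= 30 THEN '30-60'
                  WHEN [Days] >= 1 THEN '1-30'
                  WHEN [Days] = 0 THEN '0'
             END) PERSISTED;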
Related
I recently changed my DB from MS Access to SQL Server Express. Access is a wonderful small-scale DB for a SINGLE user, with very simple VBA functionality which I miss in SQL Server!
In my old Access DB I have an [Account] table with a sub procedure that updates a field in all rows of the table with the result of this expression:
[SortOrder] = [AccountNumber] * (10 ^ (Len(MaximumAccountNumber) - Len([AccountNumber])))
where MaximumAccountNumber is a variable representing the max AccountNumber in the table.
I have been searching for a solution for many days, but no example gives me an idea of how to use a value from a column in the SAME row to calculate the result for another column in that row, and so on for all the rows in the table, as in the following VBA code:
Do while Not rst.EOF
rst.Edit
rst![Field1] = rst![Field2] * (10 ^ (Len(MaximumAccountNumber) - Len(rst![Field2])))
rst.Update
rst.MoveNext
Loop
How can I implement such an update efficiently in SQL Server T-SQL without using a cursor, given that the row count in the table could reach more than 100,000?
I want to do this by creating a stored procedure that I can fire (from a trigger) after every insert of a new account, to recalculate the SortOrder of all the rows in the table, as in the following:
CREATE PROCEDURE [dbo].[SortingOrder]
    @MaxOrder Numeric(38,0) = 0,
    @Digits int = 0
AS
BEGIN
    SET @MaxOrder = (SELECT MAX([AccNumber]) FROM Account);
    SET @Digits = LEN(@MaxOrder);

    UPDATE dbo.Account
    SET [SortOrder] = [AccNumber] * POWER(10, @Digits - LEN([AccNumber]));
END
GO
As in this sample table [Account]:
AccID AccNumber SortOrder
----- --------- ---------
023 23 2300
054 243 2430
153 5434 5434
But when a new record is inserted, I want the SortOrder of all the rows to be updated to a number with the same digit count, based on 10 to the power of (length of the max AccNumber minus length of AccNumber), as in the following:
AccID AccNumber SortOrder
----- --------- ---------
023 23 230000000
054 243 243000000
153 5434 543400000
233 432345625 432345625
Try this:
Table Schema:
CREATE TABLE Account(AccID INT,AccNumber BIGINT,SortOrder BIGINT)
INSERT INTO Account VALUES(23,23,23)
INSERT INTO Account VALUES(54,254,254)
INSERT INTO Account VALUES(125,25487,25487)
T-SQL Query:
DECLARE @MaxValLen INT
SELECT @MaxValLen = LEN(MAX(AccNumber)) FROM Account
UPDATE Account
SET SortOrder = AccNumber * POWER(10,@MaxValLen - LEN(AccNumber))
Output:
| AccID | AccNumber | SortOrder |
|-------|-----------|-----------|
| 23 | 23 | 23000 |
| 54 | 254 | 25400 |
| 125 | 25487 | 25487 |
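Since the question mentions firing this after every insert, one option is to wrap the same set-based update in an AFTER INSERT trigger. This is only a sketch (the trigger name is made up), and note that it rewrites SortOrder for the whole table on every insert:
CREATE TRIGGER trg_Account_SortOrder ON Account
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @MaxValLen INT;
    SELECT @MaxValLen = LEN(MAX(AccNumber)) FROM Account;

    -- same set-based update as above: pad every AccNumber to the width of the longest one
    UPDATE Account
    SET SortOrder = AccNumber * POWER(10, @MaxValLen - LEN(AccNumber));
END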
This is my first post here. I'm still a novice SQL user at this point though I've been using it for several years now. I am trying to find a solution to the following problem and am looking for some advice, as simple as possible, please.
I have this 'recordTable' with the following columns related to transactions: 'personID', 'recordID', 'item', 'txDate' and 'daySupply'. The recordID is the primary key. Almost every personID should have many distinct recordID's with distinct txDate's.
My focus is on one particular 'item' for all of 2017. It's expected that once the item's daySupply has elapsed for a recordID, we would see a newer recordID for that person with a more recent txDate, somewhere between five days before and five days after the end of the daySupply.
What I'm trying to uncover is the number of distinct recordID's for which there wasn't an expected new recordID during this ten-day window. I think this is probably very simple to solve, but I am having a lot of difficulty trying to create a query for it, let alone explain it to someone.
My thought thus far is to create two temp tables. The first temp table stores all of the records associated with the desired items and I'm just storing the personID, recordID and txDate columns. The second temp table has the personID, recordID and the two derived columns from the txDate and daySupply; these would represent the five days before and five days after.
I am trying to find some way to determine the number of recordID's from the first table that don't have expected refills for that personID in the second. I thought a simple EXCEPT would do this, but I don't think there's any way of getting around a recursive-type statement to answer this, and I have never gotten comfortable with recursive queries.
I searched Stackoverflow and elsewhere but couldn't come up with an answer to this one. I would really appreciate some help from some more clever data folks. Here is the code so far. Thanks everyone!
CREATE TABLE #temp1 (personID VARCHAR(20), recordID VARCHAR(10), txDate DATE)

CREATE TABLE #temp2 (personID VARCHAR(20), recordID VARCHAR(10), startDate DATE, endDate DATE)

INSERT INTO #temp1
SELECT [personID], [recordID], txDate
FROM recordTable
WHERE item = 'desiredItem'
  AND txDate > '12/31/16'
  AND txDate < '1/1/18';

INSERT INTO #temp2
SELECT [personID], [recordID],
       DATEADD(DAY, daySupply - 5, txDate),
       DATEADD(DAY, daySupply + 5, txDate)
FROM recordTable
WHERE item = 'desiredItem'
  AND txDate > '12/31/16'
  AND txDate < '1/1/18';
I agree with mypetlion that you could have been more concise with your question, but I think I can figure out what you are asking.
SQL Window Functions to the rescue!
Here's the basic idea...
CREATE TABLE #fills(
personid INT,
recordid INT,
item NVARCHAR(MAX),
filldate DATE,
dayssupply INT
);
INSERT #fills
VALUES (1, 1, 'item', '1/1/2018', 30),
(1, 2, 'item', '2/1/2018', 30),
(1, 3, 'item', '3/1/2018', 30),
(1, 4, 'item', '5/1/2018', 30),
(1, 5, 'item', '6/1/2018', 30)
;
SELECT *,
ABS(
DATEDIFF(
DAY,
LAG(DATEADD(DAY, dayssupply, filldate)) OVER (PARTITION BY personid, item ORDER BY filldate),
filldate
)
) AS gap
FROM #fills
ORDER BY filldate;
... outputs ...
+----------+----------+------+------------+------------+------+
| personid | recordid | item | filldate | dayssupply | gap |
+----------+----------+------+------------+------------+------+
| 1 | 1 | item | 2018-01-01 | 30 | NULL |
| 1 | 2 | item | 2018-02-01 | 30 | 1 |
| 1 | 3 | item | 2018-03-01 | 30 | 2 |
| 1 | 4 | item | 2018-05-01 | 30 | 31 |
| 1 | 5 | item | 2018-06-01 | 30 | 1 |
+----------+----------+------+------------+------------+------+
You can insert the results into a temp table and pull out only the ones you want (gap > 5), or use the query above as a CTE and pull out the results without the temp table.
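For example, a sketch of the CTE variant, filtering to gaps larger than 5 days and counting the distinct records (names follow the #fills example above):
WITH gaps AS (
    SELECT *,
           ABS(DATEDIFF(DAY,
                        LAG(DATEADD(DAY, dayssupply, filldate))
                            OVER (PARTITION BY personid, item ORDER BY filldate),
                        filldate)) AS gap
    FROM #fills
)
SELECT COUNT(DISTINCT recordid) AS records_without_expected_refill
FROM gaps
WHERE gap > 5;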
This could be stated as follows: "Given a set of orders, return a subset for which there is no order within +/- 5 days of the expected resupply date (defined as txDate + DaysSupply)."
This can be solved simply with NOT EXISTS. Define the range of orders you wish to examine, and this query will find the subset of those orders for which there is no resupply order (NOT EXISTS) within 5 days of either side of the expected resupply date (txDate + daysSupply).
SELECT
gappedOrder.personID
, gappedOrder.recordID
, gappedOrder.item
, gappedOrder.txDate
, gappedOrder.daysSupply
FROM
recordTable as gappedOrder
WHERE
gappedOrder.item = 'desiredItem'
AND gappedOrder.txDate > '12/31/16'
AND gappedOrder.txDate < '1/1/18'
--order not refilled within date range tolerance
AND NOT EXISTS
(
SELECT
1
FROM
recordTable AS refilledOrder
WHERE
refilledOrder.personID = gappedOrder.personID
AND refilledOrder.item = gappedOrder.item
--5 days prior to (txDate + daysSupply)
AND refilledOrder.txDate >= DATEADD(day, -5, DATEADD(day, gappedOrder.daysSupply, gappedOrder.txDate))
--5 days after (txDate + daysSupply)
AND refilledOrder.txDate <= DATEADD(day, 5, DATEADD(day, gappedOrder.daysSupply, gappedOrder.txDate))
);
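And if, as the question asks, you only need the number of distinct recordIDs rather than the rows themselves, the same predicate can be wrapped in a COUNT (a sketch):
SELECT COUNT(DISTINCT gappedOrder.recordID) AS gappedOrderCount
FROM recordTable AS gappedOrder
WHERE gappedOrder.item = 'desiredItem'
  AND gappedOrder.txDate > '12/31/16'
  AND gappedOrder.txDate < '1/1/18'
  AND NOT EXISTS
  (
      SELECT 1
      FROM recordTable AS refilledOrder
      WHERE refilledOrder.personID = gappedOrder.personID
        AND refilledOrder.item = gappedOrder.item
        AND refilledOrder.txDate >= DATEADD(day, -5, DATEADD(day, gappedOrder.daysSupply, gappedOrder.txDate))
        AND refilledOrder.txDate <= DATEADD(day, 5, DATEADD(day, gappedOrder.daysSupply, gappedOrder.txDate))
  );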
Azure SQL Server - we have a table like this:
MyTable:
ID Source ArticleText
-- ------ -----------
1 100 <nvarchar(max) field with unstructured text from media articles>
2 145 "
3 866 "
4 232 "
ID column is the primary key and auto-increments on INSERTS.
I run this query to find the records with the largest data size in the ArticleText column:
SELECT TOP 500
ID, Source, DATALENGTH(ArticleText)/1048576 AS Size_in_MB
FROM
MyTable
ORDER BY
DATALENGTH(ArticleText) DESC
We are finding that for many reasons both technical and practical, the data in the ArticleText column is just too big in certain records. The above query allows me to look at a range of sizes for our largest records, which I'll need to know for what I'm trying to formulate here.
The feat I need to accomplish is: for all existing records in this table, take any record whose ArticleText DATALENGTH is greater than some threshold and break it into several records, where each resulting record contains the same value in the Source column but holds a smaller chunk of the ArticleText data.
How would one achieve this if the exact requirement were, say: take all records whose ArticleText DATALENGTH is greater than 10MB and break each into 3 records, where the resulting records share the same Source value but the ArticleText data is separated into three chunks.
In essence, we would need to divide the DATALENGTH by 3 and apply the first 1/3 of the text data to the first record, 2nd 1/3 to the 2nd record, and the 3rd 1/3 to the third record.
Is this even possible in SQL Server?
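For the simple fixed-thirds idea described above, a minimal sketch that cuts on raw character positions (ignoring word boundaries) could look like the following; the answer below instead splits on spaces and handles an arbitrary number of parts:
-- Hypothetical preview query: show each oversized ArticleText as three roughly equal character chunks.
-- LEN() counts characters; DATALENGTH() counts bytes (2 per character for NVARCHAR).
SELECT ID,
       Source,
       SUBSTRING(ArticleText, 1, LEN(ArticleText) / 3)                          AS Part1,
       SUBSTRING(ArticleText, LEN(ArticleText) / 3 + 1, LEN(ArticleText) / 3)   AS Part2,
       SUBSTRING(ArticleText, 2 * (LEN(ArticleText) / 3) + 1, LEN(ArticleText)) AS Part3
FROM MyTable
WHERE DATALENGTH(ArticleText) / 1048576 > 10;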
You can use the following code to create a side table with the needed data:
CREATE TABLE #mockup (ID INT IDENTITY, [Source] INT, ArticleText NVARCHAR(MAX));
INSERT INTO #mockup([Source],ArticleText) VALUES
(100,'This is a very long text with many many words and it is still longer and longer and longer, and even longer and longer and longer')
,(200,'A short text')
,(300,'A medium text, just long enough to need a second part');
DECLARE @partSize INT=50;
WITH recCTE AS
(
SELECT ID,[Source]
,1 AS FragmentIndex
,A.Pos
,CASE WHEN A.Pos>0 THEN LEFT(ArticleText,A.Pos) ELSE ArticleText END AS Fragment
,CASE WHEN A.Pos>0 THEN SUBSTRING(ArticleText,A.Pos+2,DATALENGTH(ArticleText)/2) END AS RestString
FROM #mockup
CROSS APPLY(SELECT CASE WHEN DATALENGTH(ArticleText)/2 > @partSize
THEN @partSize - CHARINDEX(' ',REVERSE(LEFT(ArticleText,@partSize)))
ELSE -1 END AS Pos) A
UNION ALL
SELECT r.ID,r.[Source]
,r.FragmentIndex+1
,A.Pos
,CASE WHEN A.Pos>0 THEN LEFT(r.RestString,A.Pos) ELSE r.RestString END
,CASE WHEN A.Pos>0 THEN SUBSTRING(r.RestString,A.Pos+2,DATALENGTH(r.RestString)/2) END AS RestString
FROM recCTE r
CROSS APPLY(SELECT CASE WHEN DATALENGTH(r.RestString)/2 > @partSize
THEN @partSize - CHARINDEX(' ',REVERSE(LEFT(r.RestString,@partSize)))
ELSE -1 END AS Pos) A
WHERE DATALENGTH(r.RestString)>0
)
SELECT ID,[Source],FragmentIndex,Fragment
FROM recCTE
ORDER BY [Source],FragmentIndex;
GO
DROP TABLE #mockup
The result
+----+--------+---------------+---------------------------------------------------+
| ID | Source | FragmentIndex | Fragment |
+----+--------+---------------+---------------------------------------------------+
| 1 | 100 | 1 | This is a very long text with many many words and |
+----+--------+---------------+---------------------------------------------------+
| 1 | 100 | 2 | it is still longer and longer and longer, and |
+----+--------+---------------+---------------------------------------------------+
| 1 | 100 | 3 | even longer and longer and longer |
+----+--------+---------------+---------------------------------------------------+
| 2 | 200 | 1 | A short text |
+----+--------+---------------+---------------------------------------------------+
| 3 | 300 | 1 | A medium text, just long enough to need a second |
+----+--------+---------------+---------------------------------------------------+
| 3 | 300 | 2 | part |
+----+--------+---------------+---------------------------------------------------+
Now you have to update the existing row with the value at FragmentIndex=1, and insert the values with FragmentIndex>1 as new rows. Do this sorted by FragmentIndex and your IDENTITY ID column will reflect the correct order.
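A rough sketch of that last step, assuming the CTE output was first captured into a temp table (for example by replacing the final SELECT above with SELECT ID,[Source],FragmentIndex,Fragment INTO #fragments FROM recCTE):
-- Overwrite each original row with its first fragment...
UPDATE m
SET m.ArticleText = f.Fragment
FROM #mockup AS m
JOIN #fragments AS f
  ON f.ID = m.ID AND f.FragmentIndex = 1;

-- ...then add the remaining fragments as new rows; the IDENTITY column assigns new IDs.
INSERT INTO #mockup([Source], ArticleText)
SELECT f.[Source], f.Fragment
FROM #fragments AS f
WHERE f.FragmentIndex > 1
ORDER BY f.ID, f.FragmentIndex;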
Let's say I have a table with 3 columns (a, b, c) with following values:
+---+------+---+
| a | b | c |
+---+------+---+
| 1 | 5 | 1 |
| 1 | NULL | 1 |
| 2 | NULL | 0 |
| 2 | NULL | 0 |
| 3 | NULL | 5 |
| 3 | NULL | 5 |
+---+------+---+
My desired output: 3
I want to select only those distinct values from column a for which every single occurrence of that value has NULL in column b, given that the value in c is not 0. Therefore from my desired output, "1" won't come in because there is a "5" in column b even though there is a NULL for the 2nd occurrence of "1". And "2" won't come in because the value of c is 0.
The query that I'm using currently which is not working:
SELECT a FROM tab WHERE c!=0 GROUP BY a HAVING COUNT(b) = 0
You can do this using a HAVING clause:
SELECT a
FROM tbl
GROUP BY a
HAVING
SUM(CASE
WHEN b IS NOT NULL OR c = 0 THEN 1
ELSE 0 END
) = 0
I think this is the having clause that you want:
select a
from tab t
group by a
having count(case when c <> 0 then b end) = 0 and
max(c) > 0
This assumes that c is non-negative.
However, it is not entirely clear why "2" doesn't meet your condition. There are no rows where "c" is not zero. Hence, all such rows have NULL values.
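If c could be negative, a hedged variation is to test directly for the presence of any non-zero c instead of relying on max(c) > 0:
select a
from tab t
group by a
having count(case when c <> 0 then b end) = 0 and
       sum(case when c <> 0 then 1 else 0 end) > 0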
DECLARE @Table TABLE (
A INT
,B INT
,C INT
)
INSERT INTO @Table SELECT 1,5,1
INSERT INTO @Table SELECT 1,NULL,1
INSERT INTO @Table SELECT 2,NULL,0
INSERT INTO @Table SELECT 2,NULL,0
INSERT INTO @Table SELECT 3,NULL,5
INSERT INTO @Table SELECT 3,NULL,5
SELECT
a,max(b) [MaxB],max(C) [MaxC]
FROM @Table
GROUP BY A
HAVING max(b) IS NULL AND ISNULL(max(C),1)<>0
Although you've got 3 answers already, I decided to contribute my 2c...
The query from Ghost comes out most efficient when I check in SQL Server Query Analyzer; however, I suspect that if your data set changes, Ghost's query may not return exactly what you require based on what you've written.
I think the query below is what you're looking for at the lowest execution cost in SQL, basing this on your written requirements as opposed to the data example you've provided. (Note: this query's performance is similar to Felix's and Gordon's answers; however, I haven't included a conditional "case" statement in my having clause.)
SELECT DISTINCT(a) FROM intTable
GROUP BY a
HAVING SUM(ISNULL(b,0))=0 AND SUM(c)<>0
Hope this helps!
I have this query through an ODBC connection in Excel for a refreshable report with data for every 4 weeks. I need to show the dates in each of the 4 weeks even if there is no data for that day, because this data is then linked to a graph. Is there a way to do this?
Thanks.
Select b.INV_DT, sum( a.ORD_QTY) as Ordered, sum( a.SHIPPED_QTY) as Shipped
from fct_dly_invoice_detail a, fct_dly_invoice_header b, dim_invoice_customer c
where a.INV_HDR_SK = b.INV_HDR_SK
and b.DIM_INV_CUST_SK = c.DIM_INV_CUST_SK
and a.SRC_SYS_CD = 'ABC'
and a.NDC_NBR is not null
and b.inv_dt between CURRENT_DATE - 16 and CURRENT_DATE
and b.store_nbr in (2851, 2963, 3249, 3385, 3447, 3591, 3727, 4065, 4102, 4289, 4376, 4793, 5209, 5266, 5312, 5453, 5569, 5575, 5892, 6534, 6571, 7110, 9057, 9262, 9652, 9742, 10373, 12392, 12739, 13870
)
group by 1
The general purpose solution to this is to create a date dimension table, and then perform an outer join to that date dimension table on the INV_DT column.
There are tons of good resources you can search for on creating a good date dimension table, so I'll just create a quick and dirty (and trivial) example here. I highly recommend some research in that area if you'll be doing a lot of BI/reporting.
If our table we want to report from looks like this:
Table "TABLEZ"
Attribute | Type | Modifier | Default Value
-----------+--------+----------+---------------
AMOUNT | BIGINT | |
INV_DT | DATE | |
Distributed on random: (round-robin)
select * from tablez order by inv_dt
AMOUNT | INV_DT
--------+------------
1 | 2015-04-04
1 | 2015-04-04
1 | 2015-04-06
1 | 2015-04-06
(4 rows)
and our report looks like this:
SELECT inv_dt,
SUM(amount)
FROM tablez
WHERE inv_dt BETWEEN CURRENT_DATE - 5 AND CURRENT_DATE
GROUP BY inv_dt;
INV_DT | SUM
------------+-----
2015-04-04 | 2
2015-04-06 | 2
(2 rows)
We can create a date dimension table that contains a row for every date (or at least 1024 days in the past and 1024 days in the future, using the _v_vector_idx view in this example).
create table date_dim (date_dt date);
insert into date_dim select current_date - idx from _v_vector_idx;
insert into date_dim select current_date + idx +1 from _v_vector_idx;
Then our query would look like this:
SELECT d.date_dt,
SUM(amount)
FROM tablez a
RIGHT OUTER JOIN date_dim d
ON a.inv_dt = d.date_dt
WHERE d.date_dt BETWEEN CURRENT_DATE -5 AND CURRENT_DATE
GROUP BY d.date_dt;
DATE_DT | SUM
------------+-----
2015-04-01 |
2015-04-02 |
2015-04-03 |
2015-04-04 | 2
2015-04-05 |
2015-04-06 | 2
(6 rows)
If you actually needed a zero value instead of a NULL for the days where you had no data, you could use a COALESCE or NVL like this:
SELECT d.date_dt,
COALESCE(SUM(amount),0)
FROM tablez a
RIGHT OUTER JOIN date_dim d
ON a.inv_dt = d.date_dt
WHERE d.date_dt BETWEEN CURRENT_DATE -5 AND CURRENT_DATE
GROUP BY d.date_dt;
DATE_DT | COALESCE
------------+----------
2015-04-01 | 0
2015-04-02 | 0
2015-04-03 | 0
2015-04-04 | 2
2015-04-05 | 0
2015-04-06 | 2
(6 rows)
I agree with @ScottMcG that you need to get the list of dates. However, if you are in a situation where you aren't allowed to create a table, you can simplify things. All you need is a table that has at least 28 rows. Using your example, this should work.
select date_list.dt_nm, nvl(results.Ordered,0) as Ordered, nvl(results.Shipped,0) as Shipped
from
(select row_number() over(order by sub.arb_nbr)+ (current_date -28) as dt_nm
from (select rowid as arb_nbr
from fct_dly_invoice_detail b
limit 28) sub ) date_list left outer join
( Select b.INV_DT, sum( a.ORD_QTY) as Ordered, sum( a.SHIPPED_QTY) as Shipped
from fct_dly_invoice_detail a inner join
fct_dly_invoice_header b
on a.INV_HDR_SK = b.INV_HDR_SK
and a.SRC_SYS_CD = 'ABC'
and a.NDC_NBR is not null
and b.inv_dt between CURRENT_DATE - 16 and CURRENT_DATE
and b.store_nbr in (2851, 2963, 3249, 3385, 3447, 3591, 3727, 4065, 4102, 4289, 4376, 4793, 5209, 5266, 5312, 5453, 5569, 5575, 5892, 6534, 6571, 7110, 9057, 9262, 9652, 9742, 10373, 12392, 12739, 13870)
inner join
dim_invoice_customer c
on b.DIM_INV_CUST_SK = c.DIM_INV_CUST_SK
group by 1 ) results
on date_list.dt_nm = results.inv_dt