I am using SQL Server 2010.
I have a table in the database with records as shown below:
Id | EmpName | JoinDate                | ResignedDate
---+---------+-------------------------+-------------
1  | Govind  | 2014-04-02 00:00:00.000 | 2014-04-02
2  | Aravind | 2014-04-05 00:00:00.000 | 2014-04-05
3  | Aravind | 2014-04-07 00:00:00.000 | 2014-04-10
4  | Aravind | 2014-04-10 00:00:00.000 | 2014-04-11
5  | Aravind | 2014-04-14 00:00:00.000 | 2014-04-16
Now I want to display the difference between the two dates (JoinDate, ResignedDate) and, for each distinct difference, how many records have it.
Sample output:
DateDifferent Count
------------- -----
0 2
1 1
2 1
3 1
Here is my sample query:
entityManager.createNativeQuery(SELECT
DATEDIFF(day, e.joinedDate , e.resignedDate),
COUNT(DATEDIFF(day, e.joinedDate , e.resignedDate)))
FROM
Employee e
GROUP BY
DATEDIFF(e.joinedDate , e.resignedDate) ORDER BY (DATEDIFF(e.joinedDate , e.resignedDate)));
This query works well in the MSSQL query browser, but when I use it as a JPA native query (Java code) it does not work.
Can anyone help me?
SQL Fiddle
MS SQL Server 2008 Schema Setup:
CREATE TABLE Employee
([Id] int, [EmpName] varchar(7), [JoinDate] datetime, [ResignedDate] datetime)
;
INSERT INTO Employee
([Id], [EmpName], [JoinDate], [ResignedDate])
VALUES
(1, 'Govind', '2014-04-02 00:00:00', '2014-04-02 00:00:00'),
(2, 'Aravind', '2014-04-05 00:00:00', '2014-04-05 00:00:00'),
(3, 'Aravind', '2014-04-07 00:00:00', '2014-04-10 00:00:00'),
(4, 'Aravind', '2014-04-10 00:00:00', '2014-04-11 00:00:00'),
(5, 'Aravind', '2014-04-14 00:00:00', '2014-04-16 00:00:00')
;
Query 1:
SELECT
DATEDIFF(DAY, JoinDate, ResignedDate) AS DateDifferent
, COUNT(DATEDIFF(DAY, JoinDate, ResignedDate)) as FrequencyOf
FROM Employee
GROUP BY DATEDIFF(DAY, JoinDate, ResignedDate)
ORDER BY DateDifferent
Note! You may use column aliases (e.g. DateDifferent) in the ORDER BY clause
Results:
| DateDifferent | FrequencyOf |
|---------------|-------------|
| 0 | 2 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
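The grouped date-difference count can be sketched outside SQL Server as well. Below is a minimal Python/SQLite illustration of the same technique (an assumption on my part: SQLite has no DATEDIFF, so a julianday() difference stands in for DATEDIFF(DAY, ...)); table and column names mirror the question's sample data:

```python
import sqlite3

# In-memory SQLite database loaded with the question's sample rows.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Employee (Id INT, EmpName TEXT, JoinDate TEXT, ResignedDate TEXT);
INSERT INTO Employee VALUES
 (1, 'Govind',  '2014-04-02', '2014-04-02'),
 (2, 'Aravind', '2014-04-05', '2014-04-05'),
 (3, 'Aravind', '2014-04-07', '2014-04-10'),
 (4, 'Aravind', '2014-04-10', '2014-04-11'),
 (5, 'Aravind', '2014-04-14', '2014-04-16');
""")

# julianday() difference plays the role of DATEDIFF(DAY, JoinDate, ResignedDate);
# grouping and ordering by the column alias works here too.
rows = con.execute("""
SELECT CAST(julianday(ResignedDate) - julianday(JoinDate) AS INT) AS DateDifferent,
       COUNT(*) AS FrequencyOf
FROM Employee
GROUP BY DateDifferent
ORDER BY DateDifferent
""").fetchall()
print(rows)  # [(0, 2), (1, 1), (2, 1), (3, 1)]
```

Note COUNT(*) is used instead of counting the DATEDIFF expression again; within a group the expression is constant, so both counts agree.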
Related
I have a question about Snowflake.
I have implemented a query in SQL Server using PATINDEX and need to get the same result in Snowflake.
In SQL Server:
CREATE TABLE [dbo].[proddetails](
[Filename] [varchar](50) NULL,
[pid] [int] NULL
)
INSERT [dbo].[proddetails] ([Filename], [pid]) VALUES (N'cinthol_20200108.csv', 1)
INSERT [dbo].[proddetails] ([Filename], [pid]) VALUES (N'pencame_20220309_1.csv', 2)
INSERT [dbo].[proddetails] ([Filename], [pid]) VALUES (N'prodct_20220403.csv', 3)
INSERT [dbo].[proddetails] ([Filename], [pid]) VALUES (N'jain_rav_pan_20220109_1.csv', 4)
Based on the above data, I want output like below.
In SQL Server:
select pid,filename,substring(filename,0,patindex('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',filename)) filename_U
, cast(cast (substring(filename,patindex('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',filename),8) as varchar(8)) as date) filedate_U
FROM [test].[dbo].[proddetails]
pid | filename                    | filename_U    | filedate_U
1   | cinthol_20200108.csv        | cinthol_      | 2020-01-08
2   | pencame_20220309_1.csv      | pencame_      | 2022-03-09
3   | prodct_20220403.csv         | prodct_       | 2022-04-03
4   | jain_rav_pan_20220109_1.csv | jain_rav_pan_ | 2022-01-09
In Snowflake I tried the following:
select pid,filename,substring(filename,0,regexp_instr('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',filename)) filename_U
, cast(cast (substring(filename,regexp_instr('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',filename),8) as varchar(8)) as date) filedate_U
FROM proddetails
but the above query does not give the exact result.
Could you please tell me how to write a query to achieve this task in Snowflake?
Can you try this one?
select pid,filename,
regexp_substr(filename, '(.*)[0-9]{8}', 1,1,'e') filename_U,
to_date( regexp_substr(filename, '[0-9]{8}'), 'YYYYMMDD') filedate_U
FROM proddetails;
+-----+-----------------------------+---------------+--------------+
| PID | FILENAME | FILENAME_U | FILEDATE_U |
+-----+-----------------------------+---------------+--------------+
| 1 | cinthol_20200108.csv | cinthol_ | 2020-01-08 |
| 2 | pencame_20220309_1.csv | pencame_ | 2022-03-09 |
| 3 | prodct_20220403.csv | prodct_ | 2022-04-03 |
| 4 | jain_rav_pan_20220109_1.csv | jain_rav_pan_ | 2022-01-09 |
+-----+-----------------------------+---------------+--------------+
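For comparison, the same extraction logic can be sketched in plain Python (split_name is a hypothetical helper, not part of either database): find the first 8-digit run, keep everything before it as the prefix, and parse the run as YYYYMMDD.

```python
import re
from datetime import datetime

def split_name(filename):
    """Split 'prefix_YYYYMMDD...' into (prefix, date).

    Assumes every filename contains an 8-digit date run, as in the
    question's sample data.
    """
    m = re.search(r"[0-9]{8}", filename)
    prefix = filename[:m.start()]
    filedate = datetime.strptime(m.group(), "%Y%m%d").date()
    return prefix, filedate

print(split_name("jain_rav_pan_20220109_1.csv"))
```

The key difference from the question's attempt: the pattern goes first and the string second in Snowflake's REGEXP functions (they were swapped), and regex has no need for the LIKE-style `%` wildcards.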
I am tracking data in my SCD table, as shown in the image below, using an SSIS package.
I need to add a new column, "ColumnUpdated" (as depicted above), which represents which columns were updated between transaction N and N-1. This can be achieved with a cursor, but I am looking for suggestions on doing it more efficiently. Would it be possible to perform this within SCD or with any other built-in SQL Server function?
Adding the script:
Create table SCDtest
(
id int ,
empid int ,
Deptid varchar(10),
Ename varchar(50),
DeptName varchar(50),
city varchar(50),
startdate datetime,
Enddate datetime ,
ColumnUpdated varchar(500)
)
Insert into SCDtest values (1, 1, 'D1', 'Mike', 'Account', 'Atlanta', '7/31/2020', '8/3/2020','' )
Insert into SCDtest values (2, 2, 'D2', 'Roy', 'IT', 'New York', '7/31/2020', '8/5/2020','' )
Insert into SCDtest values (3, 1, 'D1', 'Ross', 'Account', 'Atlanta', '8/4/2020', '8/7/2020','' )
Insert into SCDtest values (4, 2, 'D2', 'Roy', 'IT', 'Los angeles', '8/5/2020',NULL ,'' )
Insert into SCDtest values (5, 1, 'D1', 'John', 'Marketing', 'Boston', '8/8/2020', NULL,'')
Thank you
Honestly, I don't really know why you need this functionality, as you can very easily just look at the two rows to see any changes on the off chance that you actually need to see them. I've never needed a ColumnUpdated-type value, and I don't think the processing required to generate one and the storage to hold the data are worth having it.
That said, here is one way you can calculate the desired output from your given test data. Ideally you would do this more efficiently as part of the ETL process that updates the rows as they come in, rather than all at once, though that obviously requires information about your ETL that you haven't included in your question:
Query
declare #SCDtest table(id int,empid int,Deptid varchar(10),Ename varchar(50),DeptName varchar(50),city varchar(50),startdate datetime,Enddate datetime);
Insert into #SCDtest values(1, 1, 'D1', 'Mike', 'Account', 'Atlanta', '7/31/2020', '8/3/2020'),(2, 2, 'D2', 'Roy', 'IT', 'New York', '7/31/2020', '8/5/2020'),(3, 1, 'D1', 'Ross', 'Account', 'Atlanta', '8/4/2020', '8/7/2020'),(4, 2, 'D2', 'Roy', 'IT', 'Los angeles', '8/5/2020',NULL),(5, 1, 'D1', 'John', 'Marketing', 'Boston', '8/8/2020', NULL);
with l as
(
select *
,lag(id,1) over (partition by empid order by id) as l
from #SCDtest
)
select l.id
,l.empid
,l.Deptid
,l.Ename
,l.DeptName
,l.city
,l.startdate
,l.Enddate
,stuff(concat(case when l.Deptid <> t.Deptid then ', Deptid' end
,case when l.Ename <> t.Ename then ', Ename' end
,case when l.DeptName <> t.DeptName then ', DeptName' end
,case when l.city <> t.city then ', city' end
)
,1,2,''
) as ColumnUpdated
from l
left join #SCDtest as t
on l.l = t.id
order by l.empid
,l.startdate;
Output
+----+-------+--------+-------+-----------+-------------+-------------------------+-------------------------+-----------------------+
| id | empid | Deptid | Ename | DeptName | city | startdate | Enddate | ColumnUpdated |
+----+-------+--------+-------+-----------+-------------+-------------------------+-------------------------+-----------------------+
| 1 | 1 | D1 | Mike | Account | Atlanta | 2020-07-31 00:00:00.000 | 2020-08-03 00:00:00.000 | NULL |
| 3 | 1 | D1 | Ross | Account | Atlanta | 2020-08-04 00:00:00.000 | 2020-08-07 00:00:00.000 | Ename |
| 5 | 1 | D1 | John | Marketing | Boston | 2020-08-08 00:00:00.000 | NULL | Ename, DeptName, city |
| 2 | 2 | D2 | Roy | IT | New York | 2020-07-31 00:00:00.000 | 2020-08-05 00:00:00.000 | NULL |
| 4 | 2 | D2 | Roy | IT | Los angeles | 2020-08-05 00:00:00.000 | NULL | city |
+----+-------+--------+-------+-----------+-------------+-------------------------+-------------------------+-----------------------+
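The comparison that the LAG-style self-join performs can also be sketched in plain Python (column_updated is a hypothetical helper; it assumes rows arrive sorted per empid by id, as in the test data above):

```python
# Rows mirror the test data: (id, empid, Deptid, Ename, DeptName, city),
# already ordered per empid so each row's predecessor is the previous
# version of that employee.
rows = [
    (1, 1, 'D1', 'Mike', 'Account',   'Atlanta'),
    (3, 1, 'D1', 'Ross', 'Account',   'Atlanta'),
    (5, 1, 'D1', 'John', 'Marketing', 'Boston'),
    (2, 2, 'D2', 'Roy',  'IT',        'New York'),
    (4, 2, 'D2', 'Roy',  'IT',        'Los angeles'),
]
tracked = ('Deptid', 'Ename', 'DeptName', 'city')

def column_updated(rows):
    prev = {}   # empid -> previous version's tracked values
    out = {}    # id -> comma-separated list of changed columns (or None)
    for rid, empid, *values in rows:
        if empid in prev:
            changed = [col for col, new, old in zip(tracked, values, prev[empid])
                       if new != old]
            out[rid] = ', '.join(changed) or None
        else:
            out[rid] = None   # first version of this employee: nothing to compare
        prev[empid] = values
    return out

print(column_updated(rows))
```

This is the same shape as the SQL: the `prev` dictionary stands in for LAG over a partition by empid, and the list comprehension stands in for the chain of CASE expressions.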
I currently have a stored procedure that returns someone's total earnings and checks whether the current amount is outside the expected input.
The total earnings are input manually, and sometimes people insert the wrong amount, like putting in an extra 0 or forgetting a digit.
The stored procedure calculates the earnings since the last input. Some values are expected, and the stored procedure flags whether a row is outside the expected range (for example, if the value is negative).
What I need is to "ignore" those inputs and calculate based on the next row without an error. For example, in the query below:
DECLARE #Table AS TABLE (
[Date] DATETIME,
[Name] VARCHAR(100),
[TotalEarnings] DECIMAL(10,2),
[PartialEarnings] DECIMAL(10,2),
[ConsecultiveErrors] INT
);
INSERT INTO #Table VALUES ('20180510 00:00:00', 'John', 1000.00, NULL, 0);
INSERT INTO #Table VALUES ('20180509 00:00:00', 'John', 9000.00, -8000.00, 3);
INSERT INTO #Table VALUES ('20180508 00:00:00', 'John', 80.00, 8920.00, 2);
INSERT INTO #Table VALUES ('20180507 00:00:00', 'John', 700.00, -720.00, 1);
INSERT INTO #Table VALUES ('20180506 00:00:00', 'John', 600.00, 100.00, 0);
INSERT INTO #Table VALUES ('20180505 00:00:00', 'John', 5000.00, -4400.00, 2);
INSERT INTO #Table VALUES ('20180504 00:00:00', 'John', 400.00, 4600.00, 1);
INSERT INTO #Table VALUES ('20180503 00:00:00', 'John', 300.00, 100.00, 0);
INSERT INTO #Table VALUES ('20180502 00:00:00', 'John', 20.00, 180.00, 2);
INSERT INTO #Table VALUES ('20180501 00:00:00', 'John', 100.00, -80.00, 1);
SELECT
[t].[Date],
[t].[Name],
[t].[TotalEarnings],
[t].[PartialEarnings],
[t].[ConsecultiveErrors]
FROM
#Table AS [t]
I would need to calculate the average from the difference between the row from day 1 until the first day without an error (the 3rd), giving $100 to the 1st and 2nd day each.
How can I achieve that?
[EDIT1] The last column shows the number of consecutive rows with errors. They use that because the error is calculated using the partial value, and the partial value is calculated using a LAG/LEAD function, so one wrong input can actually cause two or more rows to be considered "wrong".
Your description is slightly confusing, though I think this does what you need it to.
The CTE uses a conditional running sum to group the rows from ConsecutiveErrors = 1 to ConsecutiveErrors = 0 when sorted in DateValue order (don't use reserved words such as Date for object names). It also provides the TotalEarnings as a negative for the first row, a positive for the last row, and 0 otherwise; the sum of these gives the difference between the first and last rows in the ErrorGroup.
From this, you can then either return the correct PartialEarnings value from the ConsecutiveErrors = 0 row, or the average of the EarningsDifference across the number of rows within the ErrorGroup minus 1 (i.e., the number of erroneous rows within the group):
declare #t as table (
DateValue datetime,
Name varchar(100),
TotalEarnings decimal(10,2),
PartialEarnings decimal(10,2),
ConsecutiveErrors int
);
insert into #t values ('20180510 00:00:00', 'John', 1000.00, NULL, 0);
insert into #t values ('20180509 00:00:00', 'John', 9000.00, -8000.00, 3);
insert into #t values ('20180508 00:00:00', 'John', 80.00, 8920.00, 2);
insert into #t values ('20180507 00:00:00', 'John', 700.00, -720.00, 1);
insert into #t values ('20180506 00:00:00', 'John', 600.00, 100.00, 0);
insert into #t values ('20180505 00:00:00', 'John', 5000.00, -4400.00, 2);
insert into #t values ('20180504 00:00:00', 'John', 400.00, 4600.00, 1);
insert into #t values ('20180503 00:00:00', 'John', 300.00, 100.00, 0);
insert into #t values ('20180502 00:00:00', 'John', 20.00, 180.00, 2);
insert into #t values ('20180501 00:00:00', 'John', 100.00, -80.00, 1);
with g as
(
select DateValue
,Name
,TotalEarnings
,PartialEarnings
,ConsecutiveErrors
,sum(case when ConsecutiveErrors = 1 then 1 else 0 end) over (order by DateValue) as ErrorGroup
,case when ConsecutiveErrors = 1 then -TotalEarnings
when ConsecutiveErrors = 0 then TotalEarnings
else 0
end as EarningsDifference
from #t
)
select DateValue
,Name
,TotalEarnings
,PartialEarnings
,ConsecutiveErrors
,ErrorGroup
,case when ConsecutiveErrors = 0
then PartialEarnings
else sum(EarningsDifference) over (partition by ErrorGroup)
/ (count(EarningsDifference) over (partition by ErrorGroup)-1)
end as AverageEarnings
from g
order by DateValue
Output:
+-------------------------+------+---------------+-----------------+-------------------+------------+-----------------+
| DateValue | Name | TotalEarnings | PartialEarnings | ConsecutiveErrors | ErrorGroup | AverageEarnings |
+-------------------------+------+---------------+-----------------+-------------------+------------+-----------------+
| 2018-05-01 00:00:00.000 | John | 100.00 | -80.00 | 1 | 1 | 100.000000 |
| 2018-05-02 00:00:00.000 | John | 20.00 | 180.00 | 2 | 1 | 100.000000 |
| 2018-05-03 00:00:00.000 | John | 300.00 | 100.00 | 0 | 1 | 100.000000 |
| 2018-05-04 00:00:00.000 | John | 400.00 | 4600.00 | 1 | 2 | 100.000000 |
| 2018-05-05 00:00:00.000 | John | 5000.00 | -4400.00 | 2 | 2 | 100.000000 |
| 2018-05-06 00:00:00.000 | John | 600.00 | 100.00 | 0 | 2 | 100.000000 |
| 2018-05-07 00:00:00.000 | John | 700.00 | -720.00 | 1 | 3 | 100.000000 |
| 2018-05-08 00:00:00.000 | John | 80.00 | 8920.00 | 2 | 3 | 100.000000 |
| 2018-05-09 00:00:00.000 | John | 9000.00 | -8000.00 | 3 | 3 | 100.000000 |
| 2018-05-10 00:00:00.000 | John | 1000.00 | NULL | 0 | 3 | NULL |
+-------------------------+------+---------------+-----------------+-------------------+------------+-----------------+
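The grouping logic can be sketched in plain Python (an illustration, assuming the rows are already sorted by date as in the test data): a new group opens at every ConsecutiveErrors = 1 row, and each group's average is the difference between its last and first TotalEarnings divided by the number of erroneous rows.

```python
from itertools import groupby

# (DateValue, TotalEarnings, ConsecutiveErrors), sorted by date.
rows = [
    ('2018-05-01',  100.00, 1),
    ('2018-05-02',   20.00, 2),
    ('2018-05-03',  300.00, 0),
    ('2018-05-04',  400.00, 1),
    ('2018-05-05', 5000.00, 2),
    ('2018-05-06',  600.00, 0),
    ('2018-05-07',  700.00, 1),
    ('2018-05-08',   80.00, 2),
    ('2018-05-09', 9000.00, 3),
    ('2018-05-10', 1000.00, 0),
]

# Conditional running sum: every ConsecutiveErrors = 1 row opens a group.
group = 0
labeled = []
for day, total, errors in rows:
    if errors == 1:
        group += 1
    labeled.append((group, total))

# Average per group: (last TotalEarnings - first) / (rows in group - 1).
avg = {}
for g, members in groupby(labeled, key=lambda pair: pair[0]):
    totals = [total for _, total in members]
    avg[g] = (totals[-1] - totals[0]) / (len(totals) - 1)
print(avg)  # {1: 100.0, 2: 100.0, 3: 100.0}
```

As in the SQL, the erroneous per-row partials are never used; only the endpoints of each group matter, which is why a single bad input spread over several rows still averages out correctly.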
I have a question about SQL Server.
Table name: Emp
Id | Pid | Firstname | LastName | Level
1  | 101 | Ram       | Kumar    | 3
1  | 100 | Ravi      | Kumar    | 2
2  | 101 | Jaid      | Balu     | 10
1  | 100 | Hari      | Babu     | 5
1  | 103 | nani      | Jai      | 44
1  | 103 | Nani      | Balu     | 10
3  | 103 | bani      | lalu     | 20
Here I need to retrieve unique records based on the Id and Pid columns; records that have duplicates need to be skipped.
Finally, I want output like below:
Id | Pid | Firstname | LastName | Level
1  | 101 | Ram       | Kumar    | 3
2  | 101 | Jaid      | Balu     | 10
3  | 103 | bani      | lalu     | 20
I found the duplicate records with the query below:
select id, pid, count(*) from emp group by id, pid having count(*) >= 2
This query finds the duplicated records; those records need to be skipped to produce the output.
Please tell me how to write a query to achieve this task in SQL Server.
Since your output is based on unique Id and Pid combinations that do not have any duplicate values, you can use COUNT with a partition to achieve your desired result.
SQL Fiddle
Sample Data
CREATE TABLE Emp
([Id] int, [Pid] int, [Firstname] varchar(4), [LastName] varchar(5), [Level] int);
INSERT INTO Emp
([Id], [Pid], [Firstname], [LastName], [Level])
VALUES
(1, 101, 'Ram', 'Kumar', 3),
(1, 100, 'Ravi', 'Kumar', 2),
(2, 101, 'Jaid', 'Balu', 10),
(1, 100, 'Hari', 'Babu', 5),
(1, 103, 'nani', 'Jai', 44),
(1, 103, 'Nani', 'Balu', 10),
(3, 103, 'bani', 'lalu', 20);
Query
SELECT *
FROM
(
SELECT *,rn = COUNT(*) OVER(PARTITION BY ID,PID)
FROM Emp
) Emp
WHERE rn = 1
Output
| Id | Pid | Firstname | LastName | Level |
|----|-----|-----------|----------|-------|
| 1 | 101 | Ram | Kumar | 3 |
| 2 | 101 | Jaid | Balu | 10 |
| 3 | 103 | bani | lalu | 20 |
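The same "keep only singletons" filter can be sketched in plain Python with a Counter (names and data mirror the question; this illustrates the logic, not the SQL Server execution plan):

```python
from collections import Counter

# (Id, Pid, Firstname, LastName, Level) rows from the question.
rows = [
    (1, 101, 'Ram',  'Kumar', 3),
    (1, 100, 'Ravi', 'Kumar', 2),
    (2, 101, 'Jaid', 'Balu', 10),
    (1, 100, 'Hari', 'Babu', 5),
    (1, 103, 'nani', 'Jai', 44),
    (1, 103, 'Nani', 'Balu', 10),
    (3, 103, 'bani', 'lalu', 20),
]

# Count occurrences of each (Id, Pid) pair, like COUNT(*) OVER
# (PARTITION BY Id, Pid), then keep only pairs that occur exactly once.
counts = Counter((r[0], r[1]) for r in rows)
unique = [r for r in rows if counts[(r[0], r[1])] == 1]
print(unique)
```

Note this is a count filter, not DISTINCT: duplicated pairs are dropped entirely rather than collapsed to one representative row.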
I have this table and data
CREATE TABLE #transactions (
[transactionId] [int] NOT NULL,
[accountId] [int] NOT NULL,
[dt] [datetime] NOT NULL,
[balance] [smallmoney] NOT NULL,
CONSTRAINT [PK_transactions_1] PRIMARY KEY CLUSTERED
( [transactionId] ASC)
)
INSERT #transactions ([transactionId], [accountId], [dt], [balance]) VALUES
(1, 1, CAST(0x0000A13900107AC0 AS DateTime), 123.0000),
(2, 1, CAST(0x0000A13900107AC0 AS DateTime), 192.0000),
(3, 1, CAST(0x0000A13A00107AC0 AS DateTime), 178.0000),
(4, 2, CAST(0x0000A13B00107AC0 AS DateTime), 78.0000),
(5, 2, CAST(0x0000A13D011D1860 AS DateTime), 99.0000),
(6, 2, CAST(0x0000A13F00000000 AS DateTime), 97.0000),
(7, 1, CAST(0x0000A13D0141E640 AS DateTime), 201.0000),
(8, 3, CAST(0x0000A1420094DD60 AS DateTime), 4000.0000),
(9, 3, CAST(0x0000A14300956A00 AS DateTime), 4100.0000),
(10, 3, CAST(0x0000A14700000000 AS DateTime), 4200.0000),
(11, 2, CAST(0x0000A14B00B84BB0 AS DateTime), 110.0000)
I need two queries.
1. For each transaction, return the most recent balance for each account, with an extra column containing the SUM of every account's latest balance at that point in time.
2. The same as 1, but grouped by date without the time portion: the latest balance of each account at the end of each day (where there is a transaction in any account), SUMed together as in 1.
The data above is sample data that I just made up, but my real table has hundreds of rows and ten accounts (which may increase soon). Each account has a unique accountId. It seems quite a tricky piece of SQL.
EXAMPLE
For 1. I need a result like this:
+---------------+-----------+-------------------------+---------+-------------+
| transactionId | accountId | dt | balance | sumBalances |
+---------------+-----------+-------------------------+---------+-------------+
| 1 | 1 | 2013-01-01 01:00:00.000 | 123 | 123 |
| 2 | 1 | 2013-01-01 01:00:00.000 | 192 | 192 |
| 3 | 1 | 2013-01-02 01:00:00.000 | 178 | 178 |
| 4 | 2 | 2013-01-03 01:00:00.000 | 78 | 256 |
| 5 | 2 | 2013-01-05 17:18:00.000 | 99 | 277 |
| 7 | 1 | 2013-01-05 19:32:00.000 | 201 | 300 |
| 6 | 2 | 2013-01-07 00:00:00.000 | 97 | 298 |
| 8 | 3 | 2013-01-10 09:02:00.000 | 4000 | 4298 |
| 9 | 3 | 2013-01-11 09:04:00.000 | 4100 | 4398 |
| 10 | 3 | 2013-01-15 00:00:00.000 | 4200 | 4498 |
| 11 | 2 | 2013-01-19 11:11:00.000 | 110 | 4511 |
+---------------+-----------+-------------------------+---------+-------------+
So, for transactionId 8, I take the latest balance for each account in turn and then sum them: accountId 1 is 201, accountId 2 is 97, and accountId 3 is 4000, so the result for transactionId 8 will be 201 + 97 + 4000 = 4298. When calculating, the set must be ordered by dt.
For 2. I need this
+------------+-------------+
| date | sumBalances |
+------------+-------------+
| 01/01/2013 | 192 |
| 02/01/2013 | 178 |
| 03/01/2013 | 256 |
| 05/01/2013 | 300 |
| 07/01/2013 | 298 |
| 10/01/2013 | 4298 |
| 11/01/2013 | 4398 |
| 15/01/2013 | 4498 |
| 19/01/2013 | 4511 |
+------------+-------------+
So on date 15/01/2013 the latest balance for each account in turn (1, 2, 3) is 201, 97, 4200, so the result for that date would be 201 + 97 + 4200 = 4498.
This gives your first desired resultset (SQL Fiddle)
WITH T
AS (SELECT *,
balance -
isnull(lag(balance) OVER (PARTITION BY accountId
ORDER BY dt, transactionId), 0) AS B
FROM #transactions)
SELECT transactionId,
accountId,
dt,
balance,
SUM(B) OVER (ORDER BY dt, transactionId ROWS UNBOUNDED PRECEDING) AS sumBalances
FROM T
ORDER BY dt;
It subtracts the previous balance of the account from the current balance to get the net difference, then calculates a running total of those differences.
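That net-difference idea can be sketched in plain Python (an illustration with the question's balances, assuming the rows are already sorted by dt and transactionId; a dictionary stands in for LAG over the accountId partition):

```python
# (transactionId, accountId, balance), sorted by dt then transactionId.
txns = [
    (1, 1, 123), (2, 1, 192), (3, 1, 178),
    (4, 2, 78),  (5, 2, 99),  (7, 1, 201),
    (6, 2, 97),  (8, 3, 4000), (9, 3, 4100),
    (10, 3, 4200), (11, 2, 110),
]

latest = {}   # accountId -> most recent balance seen so far
running = 0   # running total of per-account net differences
sums = []
for tid, acct, bal in txns:
    running += bal - latest.get(acct, 0)   # net change for this account
    latest[acct] = bal
    sums.append((tid, running))
print(sums[-1])  # (11, 4511)
```

Each step adds only what changed for one account, so the running total always equals the sum of every account's latest balance, matching the sumBalances column in the expected result.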
And that can be used as a base for your second result
WITH T1
AS (SELECT *,
balance -
isnull(lag(balance) OVER (PARTITION BY accountId
ORDER BY dt, transactionId), 0) AS B
FROM #transactions),
T2 AS (
SELECT transactionId,
accountId,
dt,
balance,
ROW_NUMBER() OVER (PARTITION BY CAST(dt AS DATE) ORDER BY dt DESC, transactionId DESC) AS RN,
SUM(B) OVER (ORDER BY dt, transactionId ROWS UNBOUNDED PRECEDING) AS sumBalances
FROM T1)
SELECT CAST(dt AS DATE) AS [date], sumBalances
FROM T2
WHERE RN=1
ORDER BY [date];
Part 1
; WITH a AS (
SELECT *, r = ROW_NUMBER()OVER(PARTITION BY accountId ORDER BY dt)
FROM #transactions t
)
, b AS (
SELECT t.*
, transamount = t.balance - ISNULL(t0.balance,0)
FROM a t
LEFT JOIN a t0 ON t0.accountId = t.accountId AND t0.r + 1 = t.r
)
SELECT transactionId, accountId, dt, balance
, sumBalance = SUM(transamount)OVER(ORDER BY dt, transactionId)
FROM b
ORDER BY dt
Part 2
; WITH a AS (
SELECT *, r = ROW_NUMBER()OVER(PARTITION BY accountId ORDER BY dt)
FROM #transactions t
)
, b AS (
SELECT t.*
, transamount = t.balance - ISNULL(t0.balance,0)
FROM a t
LEFT JOIN a t0 ON t0.accountId = t.accountId AND t0.r + 1 = t.r
)
, c AS (
SELECT transactionId, accountId, dt, balance
, sumBalance = SUM(transamount)OVER(ORDER BY CAST(dt AS DATE))
, r1 = ROW_NUMBER()OVER(PARTITION BY accountId, CAST(dt AS DATE) ORDER BY dt DESC)
FROM b
)
SELECT dt = CAST(dt AS DATE)
, sumBalance
FROM c
WHERE r1 = 1
ORDER BY CAST(dt AS DATE)
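The daily variant both queries implement ("latest total per calendar date") can be sketched the same way in plain Python (dates taken from the question's expected output; the last transaction of each day overwrites that day's total):

```python
# (accountId, date, balance), sorted by dt then transactionId.
txns = [
    (1, '2013-01-01', 123), (1, '2013-01-01', 192), (1, '2013-01-02', 178),
    (2, '2013-01-03', 78),  (2, '2013-01-05', 99),  (1, '2013-01-05', 201),
    (2, '2013-01-07', 97),  (3, '2013-01-10', 4000), (3, '2013-01-11', 4100),
    (3, '2013-01-15', 4200), (2, '2013-01-19', 110),
]

latest = {}   # accountId -> most recent balance
daily = {}    # date -> sum of latest balances after that day's last txn
for acct, day, bal in txns:
    latest[acct] = bal
    daily[day] = sum(latest.values())   # last write per day wins
print(daily['2013-01-15'])  # 4498
```

Overwriting `daily[day]` on every transaction is the dictionary analogue of the ROW_NUMBER ... WHERE RN = 1 filter: only the final total of each day survives.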