Snowflake split into two columns - snowflake-cloud-data-platform


I have a question about Snowflake.
I implemented a query in SQL Server using PATINDEX and need to get the same result in Snowflake.
In SQL Server:
CREATE TABLE [dbo].[proddetails](
[Filename] [varchar](50) NULL,
[pid] [int] NULL
)
INSERT [dbo].[proddetails] ([Filename], [pid]) VALUES (N'cinthol_20200108.csv', 1)
INSERT [dbo].[proddetails] ([Filename], [pid]) VALUES (N'pencame_20220309_1.csv', 2)
INSERT [dbo].[proddetails] ([Filename], [pid]) VALUES (N'prodct_20220403.csv', 3)
INSERT [dbo].[proddetails] ([Filename], [pid]) VALUES (N'jain_rav_pan_20220109_1.csv', 4)
Based on the above data, I want output like the following.
In SQL Server:
select pid,filename,substring(filename,0,patindex('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',filename)) filename_U
, cast(cast (substring(filename,patindex('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',filename),8) as varchar(8)) as date) filedate_U
FROM [test].[dbo].[proddetails]
pid |filename |filename_U |filedate_U
1 |cinthol_20200108.csv |cinthol_ |2020-01-08
2 |pencame_20220309_1.csv |pencame_ |2022-03-09
3 |prodct_20220403.csv |prodct_ |2022-04-03
4 |jain_rav_pan_20220109_1.csv |jain_rav_pan_|2022-01-09
In Snowflake I tried the following:
select pid,filename,substring(filename,0,regexp_instr('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',filename)) filename_U
, cast(cast (substring(filename,regexp_instr('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',filename),8) as varchar(8)) as date) filedate_U
FROM proddetails
But the above query does not give the expected result.
Could you please tell me how to write a query to achieve this in Snowflake?

Can you try this one?
select pid,filename,
regexp_substr(filename, '(.*)[0-9]{8}', 1,1,'e') filename_U,
to_date( regexp_substr(filename, '[0-9]{8}'), 'YYYYMMDD') filedate_U
FROM proddetails;
+-----+-----------------------------+---------------+--------------+
| PID | FILENAME                    | FILENAME_U    | FILEDATE_U   |
+-----+-----------------------------+---------------+--------------+
|   1 | cinthol_20200108.csv        | cinthol_      | 2020-01-08   |
|   2 | pencame_20220309_1.csv      | pencame_      | 2022-03-09   |
|   3 | prodct_20220403.csv         | prodct_       | 2022-04-03   |
|   4 | jain_rav_pan_20220109_1.csv | jain_rav_pan_ | 2022-01-09   |
+-----+-----------------------------+---------------+--------------+
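A position-based variant that stays closer to the PATINDEX logic may also work. This is only a sketch, assuming every filename contains exactly one run of 8 digits. REGEXP_INSTR takes the subject string first and the pattern second, the pattern is a regular expression (so the % wildcards from PATINDEX are not needed), and SUBSTR positions start at 1.

```sql
-- Sketch: find the position of the first 8-digit run, then cut around it.
SELECT pid,
       filename,
       -- everything before the 8-digit run
       SUBSTR(filename, 1, REGEXP_INSTR(filename, '[0-9]{8}') - 1) AS filename_u,
       -- the 8 digits themselves, parsed as a date
       TO_DATE(SUBSTR(filename, REGEXP_INSTR(filename, '[0-9]{8}'), 8), 'YYYYMMDD') AS filedate_u
FROM proddetails;
```

For a filename without an 8-digit run, REGEXP_INSTR returns 0 and the date conversion would likely fail, so such rows would need separate handling (TRY_TO_DATE may be a safer choice there).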

Related

Separate comma values into individual values

I need to separate columns in SQL Server
Table: columnsseparates
CREATE TABLE [dbo].[columnsseparates](
[id] [varchar](50) NULL,
[name] [varchar](500) NULL
)
INSERT [dbo].[columnsseparates] ([id], [name]) VALUES (N'1,2,3,4', N'abc,xyz,mn')
GO
INSERT [dbo].[columnsseparates] ([id], [name]) VALUES (N'4,5,6', N'xy,yz')
GO
INSERT [dbo].[columnsseparates] ([id], [name]) VALUES (N'7,100', N'yy')
INSERT [dbo].[columnsseparates] ([id], [name]) VALUES (N'101', N'oo,yy')
GO
Based on the above data, I want output like below:
id | Name
1 |abc
2 |xyz
3 |mn
4 |null
4 |xy
5 |yz
6 |null
7 |yy
100 |null
101 |oo
null |yy
How to achieve this task in SQL Server?

Storing non-atomic values in a column is a sign that the schema should be normalised.
Naive approach using PARSENAME (up to 4 comma-separated values):
SELECT DISTINCT s.id, s.name
FROM [dbo].[columnsseparates]
CROSS APPLY(SELECT REVERSE(REPLACE(id,',','.')) id,REVERSE(REPLACE(name, ',','.')) name) sub
CROSS APPLY(VALUES (REVERSE(PARSENAME(sub.id,1)), REVERSE(PARSENAME(sub.name,1))),
(REVERSE(PARSENAME(sub.id,2)), REVERSE(PARSENAME(sub.name,2))),
(REVERSE(PARSENAME(sub.id,3)), REVERSE(PARSENAME(sub.name,3))),
(REVERSE(PARSENAME(sub.id,4)), REVERSE(PARSENAME(sub.name,4)))
) AS s(id, name)
ORDER BY s.id;
Output:
+------+------+
| id | name |
+------+------+
| | |
| | yy |
| 1 | abc |
| 100 | |
| 101 | oo |
| 2 | xyz |
| 3 | mn |
| 4 | |
| 4 | xy |
| 5 | yz |
| 6 | |
| 7 | yy |
+------+------+

If you have more than 4 values, then you'll need to use a string splitter that can return the ordinal position. I use DelimitedSplit8K_LEAD here:
WITH Ids AS(
SELECT cs.id,
cs.name,
DS.ItemNumber,
DS.Item
FROM dbo.columnsseparates cs
CROSS APPLY dbo.DelimitedSplit8K_LEAD (cs.id,',') DS),
Names AS (
SELECT cs.id,
cs.name,
DS.ItemNumber,
DS.Item
FROM dbo.columnsseparates cs
CROSS APPLY dbo.DelimitedSplit8K_LEAD (cs.[name],',') DS)
SELECT I.Item AS ID,
N.Item AS [Name]
FROM Ids I
FULL OUTER JOIN Names N ON I.id = N.id
AND I.ItemNumber = N.ItemNumber
ORDER BY CASE WHEN I.Item IS NULL THEN 1 ELSE 0 END,
TRY_CONVERT(int,I.Item);
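On SQL Server 2022 or Azure SQL Database, the built-in STRING_SPLIT accepts a third argument that enables an ordinal column, so a custom splitter may not be needed. A sketch under that assumption (the src_id and src_name aliases are only illustrative):

```sql
WITH Ids AS (
    SELECT cs.id AS src_id, cs.name AS src_name, s.ordinal, s.value
    FROM dbo.columnsseparates cs
    CROSS APPLY STRING_SPLIT(cs.id, ',', 1) AS s      -- 1 = also return the ordinal column
),
Names AS (
    SELECT cs.id AS src_id, cs.name AS src_name, s.ordinal, s.value
    FROM dbo.columnsseparates cs
    CROSS APPLY STRING_SPLIT(cs.name, ',', 1) AS s
)
SELECT I.value AS ID,
       N.value AS [Name]
FROM Ids I
FULL OUTER JOIN Names N
    ON  I.src_id   = N.src_id      -- same source row
    AND I.src_name = N.src_name
    AND I.ordinal  = N.ordinal     -- same position in the list
ORDER BY CASE WHEN I.value IS NULL THEN 1 ELSE 0 END,
         TRY_CONVERT(int, I.value);
```

Joining on both source columns is a precaution in case the same id string appears in more than one row.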

sql query for non-existing entries

I have inherited a website and its corresponding database (SQL Server). The website uses stored procedures to pull data from the database. One of these stored procedures contains a pivot, and the pivot is taking over 4 hours to run. This is currently unacceptable. I am looking for help in replacing the pivot with standard SQL queries because I assume that will be faster and provide better performance.
Here is the pivot in question:
SELECT *
FROM (
SELECT ac.AID
,ac.CatName AS t
,convert(INT, ac.Code) AS c
FROM categories AS ac
) AS s
Pivot(Sum(c) FOR t IN (
[tob]
,[ecit]
,[tobwcom]
,[rnorm]
,[raddict]
,[rpolicy]
,[ryouth]
,[rhealth]
,…
)) AS p;
And the results of the pivot
| AID | tob | ecit | tobwcom | rnorm |
|-----------|-----------|------------|---------------|-------------|
| 1 | 1 | NULL | NULL | 0 |
| 2 | 1 | NULL | NULL | 1 |
| 3 | 1 | NULL | NULL | 0 |
| 4 | 1 | NULL | NULL | 0 |
| 5 | 1 | NULL | NULL | 0 |
| 6 | 1 | NULL | NULL | 1 |
Here’s the source table categories and some sample data:
CREATE TABLE categories(
ArticleID INTEGER NOT NULL
,ThemeID INTEGER NOT NULL
,ThemeName VARCHAR(7) NOT NULL
,Code BIT NOT NULL
,CreatedTime VARCHAR(7) NOT NULL
);
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (1,1,'tob',1,'57:30.7');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (1,2,'ecig',1,'03:58.3');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (1,5,'rnorm',0,'42:56.5');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (2,1,'tob',1,'57:30.7');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (2,2,'ecig',0,'03:58.3');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (2,5,'rnorm',1,'42:56.5');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (2,6,'raddict',0,'42:59.8');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (3,1,'tob',1,'57:30.7');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (3,2,'ecig',0,'03:58.3');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (3,5,'rnorm',0,'42:56.5');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (21,1,'tob',1,'57:30.7');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (21,2,'ecig',0,'03:58.3');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (21,5,'rnorm',0,'42:56.5');
INSERT INTO categories(ArticleID,ThemeID,ThemeName,Code,CreatedTime) VALUES (21,6,'raddict',0,'42:59.8');
And here’s the table containing the category names – (mytable for now)
CREATE TABLE mytable(
CatID INTEGER NOT NULL PRIMARY KEY
,CatName VARCHAR(7) NOT NULL
,CreatedTime DATETIME NOT NULL
);
INSERT INTO mytable(CatID,CatName,CreatedTime) VALUES (1,'tob','2015-03-12 10:07:54.173');
INSERT INTO mytable(CatID,CatName,CreatedTime) VALUES (2,'ecig','2015-05-18 11:48:16.297');
INSERT INTO mytable(CatID,CatName,CreatedTime) VALUES (4,'tobwcom','2015-06-19 11:12:01.537');
INSERT INTO mytable(CatID,CatName,CreatedTime) VALUES (5,'rnorm','2015-06-22 14:24:02.317');
INSERT INTO mytable(CatID,CatName,CreatedTime) VALUES (6,'raddict','2015-06-22 14:24:13.957');
INSERT INTO mytable(CatID,CatName,CreatedTime) VALUES (7,'ecit','2015-06-22 14:26:18.437');
What I need is a way to perform the pivot’s ability to find the non-existing data in categories. The output would be something like:
| AID | tob | ecit | tobwcom | rnorm |
|-----------|-----------|------------|---------------|-------------|
| 1 | 1 | NULL | NULL | 0 |
| 2 | 1 | NULL | NULL | 1 |
Or the list of AIDs and the CatNames that don’t have any values. Such as:
| AID | CatName |
|-----|---------|
| 1 | ecit |
| 1 | tobwcom |
| 2 | ecit |
| 2 | tobwcom |
I have tried
select distinct(AID) FROM [categories]
where [CatName] not in ( 'ecit', 'tobwcom')
but the numbers from this query don't seem to add up; however, this could be an error on my part.

Not sure if it would be fast enough for such a huge table. But for the second expected result, something like this could help to find the missing combinations.
select a.ArticleID, c.CatName
from #myarticles a
cross join #mycategories c
left join categories ca on (ca.ArticleID = a.ArticleID and ca.ThemeID = c.CatID)
where ca.ArticleID is null;
Note that this method benefits from a combined primary key index on (ArticleID, ThemeID)
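The #myarticles and #mycategories temp tables are not shown above. One possible setup, assuming they hold the distinct article IDs from categories and the category list from mytable, together with the composite key mentioned in the note (the constraint name PK_categories is hypothetical):

```sql
-- Assumed contents of the two temp tables used by the query above
SELECT DISTINCT ArticleID
INTO #myarticles
FROM categories;

SELECT CatID, CatName
INTO #mycategories
FROM mytable;

-- Composite key so the lookup on (ArticleID, ThemeID) can use an index seek
ALTER TABLE categories
    ADD CONSTRAINT PK_categories PRIMARY KEY (ArticleID, ThemeID);
```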
As an alternative, the LEFT JOIN with a NULL check can be changed to a NOT EXISTS.
select a.ArticleID, c.CatName
from #myarticles a
join #mycategories c on c.CatID between 1 and 7
where NOT EXISTS
(
select 1
from categories ca
where ca.ArticleID = a.ArticleID
and ca.ThemeID = c.CatID
);
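For the first expected result (the wide layout), conditional aggregation is a common alternative to PIVOT. A sketch using the column names from the CREATE TABLE statement (ArticleID, ThemeName) rather than the AID and CatName used in the original pivot. It is not guaranteed to be faster, so it is worth comparing the execution plans:

```sql
SELECT ArticleID,
       -- Code is a BIT column, so it is converted to int before summing
       SUM(CASE WHEN ThemeName = 'tob'     THEN CONVERT(int, Code) END) AS tob,
       SUM(CASE WHEN ThemeName = 'ecit'    THEN CONVERT(int, Code) END) AS ecit,
       SUM(CASE WHEN ThemeName = 'tobwcom' THEN CONVERT(int, Code) END) AS tobwcom,
       SUM(CASE WHEN ThemeName = 'rnorm'   THEN CONVERT(int, Code) END) AS rnorm
FROM categories
GROUP BY ArticleID;
```

Categories that have no row for an article come back as NULL, as they do with the pivot.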

Troubleshooting to implement SQL Server trigger

I have this table called InspectionsReview:
CREATE TABLE InspectionsReview
(
ID int NOT NULL AUTO_INCREMENT,
InspectionItemId int,
SiteId int,
ObjectId int,
DateReview DATETIME,
PRIMARY KEY (ID)
);
Here is how the table looks:
+----+------------------+--------+-----------+--------------+
| ID | InspectionItemId | SiteId | ObjectId | DateReview |
+----+------------------+--------+-----------+--------------+
| 1 | 3 | 3 | 3045 | 20-05-2016 |
| 2 | 5 | 45 | 3025 | 01-03-2016 |
| 3 | 4 | 63 | 3098 | 05-05-2016 |
| 4 | 5 | 5 | 3041 | 03-04-2016 |
| 5 | 3 | 97 | 3092 | 22-02-2016 |
| 6 | 1 | 22 | 3086 | 24-11-2016 |
| 7 | 9 | 24 | 3085 | 15-12-2016 |
+----+------------------+--------+-----------+--------------+
I need to write a trigger that checks, before a new row is inserted into the table, whether the table already has a row whose 'ObjectId' and 'DateReview' values equal those of the row being inserted. If so, I need to get the ID of the existing row and put it into a trigger variable called duplicate.
For example, if new row that has to be inserted is:
INSERT INTO InspectionsReview (InspectionItemId, SiteId, ObjectId, DateReview)
VALUES (4, 63, 3098, '05-05-2016');
The duplicate variable in the SQL Server trigger must be equal to 3.
This is because the row in the InspectionsReview table where ID = 3 has the same ObjectId and DateReview values as the new row to be inserted. How can I implement this?

With the extra assumption that you want to log all the duplicates to a different table, my solution would be to create an AFTER trigger that checks for the duplicate and inserts it into your logging table.
Of course, whether this is the solution depends on whether my extra assumption is valid.
Here is my logging table.
CREATE TABLE dbo.InspectionsReviewLog (
ID int
, ObjectID int
, DateReview DATETIME
, duplicate int
);
Here is the trigger (pretty straightforward with the extra assumption)
CREATE TRIGGER tr_InspectionsReview
ON dbo.InspectionsReview
AFTER INSERT
AS
BEGIN
DECLARE @tableVar TABLE(
ID int
, ObjectID int
, DateReview DATETIME
);
INSERT INTO @tableVar (ID, ObjectID, DateReview)
SELECT DISTINCT inserted.ID, inserted.ObjectID, inserted.DateReview
FROM inserted
JOIN dbo.InspectionsReview ir ON inserted.ObjectID=ir.ObjectID AND inserted.DateReview=ir.DateReview AND inserted.ID <> ir.ID;
INSERT INTO dbo.InspectionsReviewLog (ID, ObjectID, DateReview, duplicate)
SELECT ID, ObjectID, DateReview, 3
FROM
@tableVar;
END;
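The trigger above writes the constant 3 into the duplicate column, which matches only the example row. A variant that records the ID of the existing matching row instead might look like this. It is a sketch: the trigger name and the choice of MIN(ir.ID) when several rows match are assumptions.

```sql
CREATE TRIGGER tr_InspectionsReview_LogDuplicate   -- hypothetical name
ON dbo.InspectionsReview
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- For each inserted row, log the lowest ID of another row
    -- that has the same ObjectID and DateReview.
    INSERT INTO dbo.InspectionsReviewLog (ID, ObjectID, DateReview, duplicate)
    SELECT i.ID, i.ObjectID, i.DateReview, MIN(ir.ID)
    FROM inserted i
    JOIN dbo.InspectionsReview ir
        ON  ir.ObjectID   = i.ObjectID
        AND ir.DateReview = i.DateReview
        AND ir.ID        <> i.ID
    GROUP BY i.ID, i.ObjectID, i.DateReview;
END;
```

Rows inserted together in one statement that duplicate each other would also be logged by this version.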

Merge tables having different columns (SQL Server)

I have 2 tables with details as follows
Table 1
Name | City | Employee_Id
-----------------
Raj | CA | A2345
Diya | IL | A1234
Max | PL | A2321
Anna | TX | A1222
Luke | DC | A5643
Table 2
Name | City | Employee_Id | Phone | Age
---------------------------------------
Raj | CA | A2345 | 4094 | 25
Diya | IL | A1234 | 4055 | 19
Max | PL | A2321 | 4076 | 23
As you can see, Employee_Id is the common column in both tables. I want to copy all the entries present in table 1 into table 2.
Raj, Diya and Max are already present in Table 2, so it should skip those 3 entries and not create duplicates in table 2. Anna and Luke are not present in table 2, so they should be added as new rows.
The SQL should be able to merge these 2 tables and ignore the rows which are already present. The final table 2 must be similar to this.
Table 2
Name | City | Employee_Id | Phone | Age
---------------------------------------
Raj | CA | A2345 | 4094 | 25
Diya | IL | A1234 | 4055 | 19
Max | PL | A2321 | 4076 | 23
Anna | TX | A1222 | |
Luke | DC | A5643 | |
Is there a way I could achieve this? I am pretty new to SQL, so any inputs would be of great help. I read about merge and update feature but I guess merge is in Transact-SQL. Also read about joins but could not find a way to crack this.

Demo Setup
CREATE TABLE Table1
([Name] varchar(4), [City] varchar(2), [Employee_Id] varchar(5));
INSERT INTO Table1
([Name], [City], [Employee_Id])
VALUES
('Raj', 'FL', 'A2345'),
('Diya', 'IL', 'A1234'),
('Max', 'PL', 'A2321'),
('Anna', 'TX', 'A1222'),
('Luke', 'DC', 'A5643');
CREATE TABLE Table2
([Name] varchar(4), [City] varchar(2), [Employee_Id] varchar(5), [Phone] int, [Age] int);
INSERT INTO Table2
([Name], [City], [Employee_Id], [Phone], [Age])
VALUES
('Raj', 'CA', 'A2345', 4094, 25),
('Diya', 'IL', 'A1234', 4055, 19),
('Max', 'PL', 'A2321', 4076, 23);
MERGE QUERY
MERGE Table2 AS target
USING Table1 AS source
ON (target.[Employee_Id] = source.[Employee_Id])
WHEN MATCHED THEN
UPDATE SET [Name] = source.[Name],
[City] = source.[City]
WHEN NOT MATCHED THEN
INSERT ([Name], [City], [Employee_Id], [Phone], [Age])
VALUES (source.[Name], source.[City], source.[Employee_Id], NULL, NULL);
SELECT *
FROM Table2

SQL Server get two days difference and days count from date range

I am using SQL Server 2010.
I have a table in the database with records as shown below :
Id | EmpName | JoinDate | ResignedDate
---+----------+-------------------------+--------------
1 | Govind | 2014-04-02 00:00:00.000 | 2014-04-02
2 | Aravind | 2014-04-05 00:00:00.000 | 2014-04-05
3 | Aravind | 2014-04-07 00:00:00.000 | 2014-04-10
4 | Aravind | 2014-04-10 00:00:00.000 | 2014-04-11
5 | Aravind | 2014-04-14 00:00:00.000 | 2014-04-16
Now I want to display the difference in days between the two dates (JoinDate, ResignedDate) and how many rows have each difference.
Sample output:
DateDifferent Count
------------- -----
0 2
1 1
2 1
3 1
Here is my sample query:
entityManager.createNativeQuery(SELECT
DATEDIFF(day, e.joinedDate , e.resignedDate),
COUNT(DATEDIFF(day, e.joinedDate , e.resignedDate)))
FROM
Employee e
GROUP BY
DATEDIFF(e.joinedDate , e.resignedDate) ORDER BY (DATEDIFF(e.joinedDate , e.resignedDate)));
This query works well in the MSSQL query browser, but when I use it as a JPA native query (Java code) it does not work.
Can anyone help me?

MS SQL Server 2008 Schema Setup:
CREATE TABLE Employee
([Id] int, [EmpName] varchar(7), [JoinDate] datetime, [ResignedDate] datetime)
;
INSERT INTO Employee
([Id], [EmpName], [JoinDate], [ResignedDate])
VALUES
(1, 'Govind', '2014-04-02 00:00:00', '2014-04-02 00:00:00'),
(2, 'Aravind', '2014-04-05 00:00:00', '2014-04-05 00:00:00'),
(3, 'Aravind', '2014-04-07 00:00:00', '2014-04-10 00:00:00'),
(4, 'Aravind', '2014-04-10 00:00:00', '2014-04-11 00:00:00'),
(5, 'Aravind', '2014-04-14 00:00:00', '2014-04-16 00:00:00')
;
Query 1:
SELECT
DATEDIFF(DAY, JoinDate, ResignedDate) AS DateDifferent
, COUNT(DATEDIFF(DAY, JoinDate, ResignedDate)) as FrequencyOf
FROM Employee
GROUP BY DATEDIFF(DAY, JoinDate, ResignedDate)
ORDER BY DateDifferent
Note! You may use column aliases (e.g. DateDifferent) in the ORDER BY clause
Results:
| DateDifferent | FrequencyOf |
|---------------|-------------|
| 0 | 2 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
