Count of distinct strings in comma separated list (string)? - snowflake-cloud-data-platform

For example let's say I've got a table as such:
PrimaryKey  List
1           thing, stuff
2           thing
3           stuff, doodad, thing
4           stuff, thing
Where each value in the 'List' column is a string with words separated by a comma. I'd like to get the count of each word that appears in that column so that I end up with this:
Word    Count
thing   4
stuff   3
doodad  1
I've seen a lot of really similar questions, but can't seem to figure it out. Any help is much appreciated!

You can use the SPLIT_TO_TABLE function:
Sample data:
CREATE OR REPLACE TABLE ST (PrimaryKey INT, List STRING);
INSERT INTO ST
SELECT * FROM VALUES
(1, 'thing, stuff'),
(2, 'thing'),
(3, 'stuff, doodad, thing'),
(4, 'stuff, thing') AS t(PrimaryKey, List);
Solution:
SELECT TRIM(L.VALUE) AS Word, COUNT(TRIM(L.VALUE)) AS Count
FROM ST,
LATERAL SPLIT_TO_TABLE(ST.List, ',') AS L
GROUP BY TRIM(L.VALUE);
Reference: SPLIT_TO_TABLE
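If you want to sanity-check the expected counts outside Snowflake, here is a minimal Python sketch of the same split, trim, and count logic. It only models what the query does and is not Snowflake syntax; the `rows` list and the `count_words` helper are hypothetical names made up for this illustration.

```python
from collections import Counter

# Sample rows mirroring the ST table above: (PrimaryKey, List).
rows = [
    (1, "thing, stuff"),
    (2, "thing"),
    (3, "stuff, doodad, thing"),
    (4, "stuff, thing"),
]


def count_words(rows):
    """Split each list on commas, trim whitespace, and count each word."""
    counts = Counter()
    for _pk, csv in rows:
        for piece in csv.split(","):    # plays the role of SPLIT_TO_TABLE(List, ',')
            counts[piece.strip()] += 1  # plays the role of TRIM(L.VALUE) plus GROUP BY
    return counts


word_counts = count_words(rows)
for word, n in word_counts.most_common():
    print(word, n)
```

Without the `strip()` call, ' stuff' and 'stuff' would be counted as different words, which is the same reason the query needs TRIM.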

Try this:
create table test_1 as (
SELECT fld1 FROM (
SELECT 'thing, stuff' fld1 UNION ALL
SELECT 'thing' fld1 UNION ALL
SELECT 'stuff, doodad, thing' fld1 UNION ALL
SELECT 'stuff, thing' fld1
)
);
select count(*), fld1 from (
SELECT trim(value) as fld1 FROM test_1, lateral split_to_Table(test_1.fld1,',')
order by value
) group by fld1

Both of the answers above show how to use SPLIT_TO_TABLE(), which is great. You can also use STRTOK_SPLIT_TO_TABLE(). It treats every character in the delimiter string as a separate delimiter (SPLIT_TO_TABLE accepts just one delimiter), which makes it easier to handle messy real-world data.
I changed some of the commas to hyphens so you can see it still works.
Copy, paste, and run:
with a as (SELECT * FROM VALUES
(1, 'thing, stuff'),
(2, 'thing'),
(3, 'stuff, doodad - thing'),
(4, 'stuff- thing') AS t(pk, l))
select
value::string answer,count(1)
from
a
,lateral strtok_split_to_table(a.l,(', |-' ))
group by
answer

Related

INSERT INTO table from 2 unrelated tables

Not sure how to achieve the result, need your help
Source A:
SELECT SourceAID
FROM [dbo].[SourceA]
Source B:
SELECT SourceBID
FROM [dbo].[SourceB]
Result table (select example):
SELECT SourceAID
,SourceBID
,Value
FROM [dbo].[Result]
The idea of the insert: for each SourceAID, I need to insert one record for every SourceBID. There is no relationship between these two tables.
Done by hand, it would look like this:
INSERT INTO [dbo].[Result] ([SourceAID], [SourceBID], [Value])
VALUES ('AID_1', 'BID_1', NULL),
('AID_1', 'BID_2', NULL),
('AID_1', 'BID_3', NULL),
('AID_2', 'BID_1', NULL),
('AID_2', 'BID_2', NULL),
('AID_2', 'BID_3', NULL)
and so on
As @Larnu said.
Use the following code:
INSERT INTO [dbo].[Result] ([SourceAID], [SourceBID], [Value])
SELECT
SA.SourceAID,
SB.SourceBID,
NULL
FROM
[dbo].[SourceA] AS SA
CROSS JOIN [dbo].[SourceB] AS SB
The other way is to use subqueries that share a constant join key:
INSERT INTO [dbo].[Result] ([SourceAID], [SourceBID], [Value])
SELECT SA.SourceAID, SB.SourceBID, NULL
FROM
(SELECT 1 AS ID, SourceAID FROM [dbo].[SourceA]) SA
JOIN
(SELECT 1 AS ID, SourceBID FROM [dbo].[SourceB]) SB
ON SA.ID = SB.ID
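To see what rows the cross join produces, here is a small Python sketch of the same pairing. It is only an illustration, not T-SQL; the `source_a` and `source_b` lists and the `cross_join` helper are hypothetical stand-ins for the two tables, using the IDs from the hand-written example in the question.

```python
from itertools import product

# Hypothetical stand-ins for the SourceAID and SourceBID columns.
source_a = ["AID_1", "AID_2"]
source_b = ["BID_1", "BID_2", "BID_3"]


def cross_join(a_ids, b_ids):
    """Pair every A id with every B id; None plays the role of the NULL Value."""
    return [(a, b, None) for a, b in product(a_ids, b_ids)]


result = cross_join(source_a, source_b)
for row in result:
    print(row)
```

The row count is always the product of the two input sizes, so it is worth checking both table sizes before running the real INSERT.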

CTE to pull entire tree from arbitrary entry

I'm trying to build a CTE which will pull back all records which are related to a given, arbitrary record in the database.
Create table Requests (
Id bigint,
OriginalId bigint NULL,
FollowupId bigint NULL
)
insert into Requests VALUES (1, null, 3)
insert into Requests VALUES (2, 1, 8)
insert into Requests VALUES (3, 1, 4)
insert into Requests VALUES (4, 3, null)
insert into Requests VALUES (5, null, null)
insert into Requests VALUES (6, null, 7)
insert into Requests VALUES (7, 6, null)
insert into Requests VALUES (8, 2, null)
OriginalId is always the Id of a previous record (or null). FollowupId points to the most recent followup record (which, in turn, points back via OriginalId) and can probably be ignored, but it's there if it's helpful.
I can easily pull back either all ancestors or all descendants of a given record using the following CTE
;With TransactionList (Id, originalId, followupId, Steps)
AS
(
Select Id, originalId, followupId, 0 as Steps from requests where Id = @startId
union all
select reqs.Id, reqs.originalId, reqs.followupId, Steps + 1 from requests reqs
inner join TransactionList tl on tl.Id = reqs.originalId --or tl.originalId = reqs.Id
)
SELECT Id from TransactionList
However, if I use both join conditions, I run into infinite recursion, hit the recursion limit, and it bombs out. Even combining both sets, I don't get the entire tree, just one branch of it.
I don't care about anything other than the list of Ids. They don't need to be sorted or to display their relationship. That wouldn't hurt, but it's not necessary. But I need every Id in a given tree to pull back the same list when it's passed as @startId.
As an example of what I'd like to see, this is what the output should be when @startId is set to any value 1-4 or 8:
1
2
3
4
8
And for either 6 or 7, I get back both 6 and 7.
You can just create two CTEs.
The first CTE will get the Root of the hierarchy, and the second will use the Root ID to get the descendants of the Root.
;WITH cteRoot AS (
SELECT *, 0 [Level]
FROM Requests
WHERE Id = @startId
UNION ALL
SELECT r.*, [Level] + 1
FROM Requests r
JOIN cteRoot cte ON r.Id = cte.OriginalID
),
cteDesc AS (
SELECT *
FROM cteRoot
WHERE OriginalId IS NULL
UNION ALL
SELECT r.*, [Level] + 1
FROM Requests r
JOIN cteDesc cte ON r.OriginalId = cte.Id
)
SELECT * FROM cteDesc
SQL Fiddle
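The two-step idea (climb to the root, then collect everything below it) can be checked against the sample data with a short Python sketch. This is only a model of the logic, not T-SQL; the `requests` list and the `whole_tree` helper are hypothetical names, and the sketch assumes the OriginalId links contain no cycles.

```python
# (Id, OriginalId) pairs from the Requests sample data; FollowupId is not needed.
requests = [(1, None), (2, 1), (3, 1), (4, 3), (5, None), (6, None), (7, 6), (8, 2)]


def whole_tree(requests, start_id):
    """Return every Id in the tree that contains start_id."""
    parent = dict(requests)

    # Step 1 (the cteRoot part): follow OriginalId upward until it is None.
    root = start_id
    while parent[root] is not None:
        root = parent[root]

    # Step 2 (the cteDesc part): walk down from the root, collecting descendants.
    children = {}
    for rid, orig in requests:
        children.setdefault(orig, []).append(rid)

    found, stack = [], [root]
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(children.get(current, []))
    return sorted(found)


print(whole_tree(requests, 4))
print(whole_tree(requests, 7))
```

Because every start Id is first normalised to its root, any member of a tree yields the same list, which is the property the question asks for.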

SQL Server Script Quick Replace all found strings with incrementing integer

I have a large INSERT SQL script that I want to modify with a quick replace: each found string should be replaced with an integer, where every next integer is the previous integer + 1.
Before:
INSERT Compartment (CompartmentID) VALUES ('A')
INSERT Compartment (CompartmentID) VALUES ('B')
After:
INSERT Compartment (CompartmentID) VALUES (1)
INSERT Compartment (CompartmentID) VALUES (2)
I know how to find the specific strings, but I can't find any syntax or other way to replace them with incrementing integers.
You can replace all of your char CompartmentID values with ordered numbers like this:
declare @Compartment table(CompartmentID varchar(10), name varchar(10), intID int)
INSERT INTO @Compartment(CompartmentID, name) values
('a', 'a')
, ('b', 'b')
, ('c', 'c')
, ('d', 'd')
, ('e', 'e')
UPDATE c SET CompartmentID = o.ID
FROM @Compartment c
INNER JOIN (
SELECT CompartmentID, ID = ROW_NUMBER() over(ORDER BY CompartmentID)
FROM @Compartment
) o ON c.CompartmentID = o.CompartmentID
SELECT * FROM @Compartment
Output:
CompartmentID name
1 a
2 b
3 c
4 d
5 e
It would be better to create a new column of type int or change the type of CompartmentID once the update is finished.
You should also use an identity column if you want the numbers to be incremented automatically.
I'm not sure how you want to handle empty strings. You can select the rows where CompartmentID contains a non-numeric character and update the result set like this:
DECLARE @Compartment table(CompartmentID varchar(20))
INSERT @Compartment(CompartmentID) VALUES ('A'),('A'),('B'),('1'),('A1')
-- EDIT: Changed answer
;WITH CTE as
(
SELECT CompartmentID, DENSE_RANK() over (ORDER BY CompartmentID) rn
FROM @Compartment
--WHERE CompartmentID LIKE '%[^0-9]%' OR CompartmentID = ''
)
UPDATE CTE
SET CompartmentID = rn
FROM CTE
Result:
CompartmentID
2
2
4
1
3
Note: all CompartmentID values are now changed (including the ones that were already numeric), and identical old CompartmentID values get identical numeric values.
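To make the dense-rank mapping easier to follow, here is a minimal Python sketch of the same idea. It is an illustration only, not T-SQL; `dense_rank` is a hypothetical helper, and it orders strings by Python's default code-point comparison, which I am assuming matches the ordering of these particular sample values under your collation.

```python
def dense_rank(values):
    """Map each value to its 1-based position among the sorted distinct values."""
    distinct_sorted = sorted(set(values))
    rank_of = {v: i for i, v in enumerate(distinct_sorted, start=1)}
    return [rank_of[v] for v in values]


old_ids = ["A", "A", "B", "1", "A1"]
ranks = dense_rank(old_ids)
print(ranks)
```

Duplicates share a rank and no rank is skipped, which is what distinguishes DENSE_RANK from ROW_NUMBER in the earlier answer.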

sort float numbers as a natural numbers in SQL Server

I asked the same question for jQuery here earlier; now I have the same question for a SQL Server query :) This time the values are not comma-separated; each one is a separate row in the database.
I have separate rows containing float-like numbers:
Name
K1.1
K1.10
K1.2
K3.1
K3.14
K3.5
and I want to sort these numbers like this:
Name
K1.1
K1.2
K1.10
K3.1
K3.5
K3.14
In my case, the digits after the decimal point should be treated as a natural number, so 1.2 is treated as '2' and 1.10 as '10'. That's why 1.2 comes before 1.10.
You can remove the 'K' because it is common to almost all values. A suggestion or example would be great, thanks.
You can use PARSENAME (which is more of a hack) or String functions like CHARINDEX , STUFF, LEFT etc to achieve this.
Input data
;WITH CTE AS
(
SELECT 'K1.1' Name
UNION ALL SELECT 'K1.10'
UNION ALL SELECT 'K1.2'
UNION ALL SELECT 'K3.1'
UNION ALL SELECT 'K3.14'
UNION ALL SELECT 'K3.5'
)
Using PARSENAME
SELECT Name,PARSENAME(REPLACE(Name,'K',''),2),PARSENAME(REPLACE(Name,'K',''),1)
FROM CTE
ORDER BY CONVERT(INT,PARSENAME(REPLACE(Name,'K',''),2)),
CONVERT(INT,PARSENAME(REPLACE(Name,'K',''),1))
Using String Functions
SELECT Name,LEFT(Name,CHARINDEX('.',Name) - 1), STUFF(Name,1,CHARINDEX('.',Name),'')
FROM CTE
ORDER BY CONVERT(INT,REPLACE((LEFT(Name,CHARINDEX('.',Name) - 1)),'K','')),
CONVERT(INT,STUFF(Name,1,CHARINDEX('.',Name),''))
Output
K1.1 K1 1
K1.2 K1 2
K1.10 K1 10
K3.1 K3 1
K3.5 K3 5
K3.14 K3 14
This works if there is always one char before the first number and the number is not higher than 9:
SELECT name
FROM YourTable
ORDER BY CAST(SUBSTRING(name,2,1) AS INT), --Get the number before dot
CAST(RIGHT(name,LEN(name)-CHARINDEX('.',name)) AS INT) --Get the number after the dot
Perhaps more verbose, but this should do the trick:
declare @source as table(num varchar(12));
insert into @source(num) values('K1.1'),('K1.10'),('K1.2'),('K3.1'),('K3.14'),('K3.5');
-- create helper table
with data as
(
select num,
cast(SUBSTRING(replace(num, 'K', ''), 1, CHARINDEX('.', num) - 2) as int) as [first],
cast(SUBSTRING(replace(num, 'K', ''), CHARINDEX('.', num), LEN(num)) as int) as [second]
from @source
)
-- Select and order accordingly
select num
from data
order by [first], [second]
sqlfiddle:
http://sqlfiddle.com/#!6/a9b06/2
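A shorter solution uses PARSENAME on both parts (sorting on the part after the dot alone would mix the K1 and K3 rows):
Select Num
from yourtable
order by cast(Parsename(Replace(Num, 'K', ''), 2) as Int),
cast(Parsename(Num, 1) as Int)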
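All of the answers above boil down to sorting on two integers: the number before the dot, then the number after it. Here is a minimal Python sketch of that sort key, as an illustration only and not T-SQL. `natural_key` is a hypothetical helper, and it assumes every name has the form 'K<int>.<int>' as in the question.

```python
def natural_key(name):
    """Turn 'K1.10' into (1, 10) so both parts compare as integers."""
    before, after = name.lstrip("K").split(".")
    return (int(before), int(after))


names = ["K1.1", "K1.10", "K1.2", "K3.1", "K3.14", "K3.5"]
sorted_names = sorted(names, key=natural_key)
print(sorted_names)
```

Comparing the second part as text would put '10' before '2', which is exactly the ordering the question wants to avoid.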

How can I select from list of values in SQL Server

I have very simple problem that I can't solve. I need to do something like this:
select distinct * from (1, 1, 1, 2, 5, 1, 6).
Anybody can help??
Edit
The data comes as a text file from one of our clients. It's totally unformatted (it's a single, very long line of text), but it may be possible to do so in Excel. But it's not practical for me, because I will need to use these values in my sql query. It's not convenient to do so every time I need to run a query.
The row constructor in this form is available only in SQL Server 2008 and later. You could use:
SELECT DISTINCT *
FROM (
VALUES (1), (1), (1), (2), (5), (1), (6)
) AS X(a)
For more information see:
MS official
http://www.sql-server-helper.com/sql-server-2008/row-value-constructor-as-derived-table.aspx
In general :
SELECT
DISTINCT
FieldName1, FieldName2, ..., FieldNameN
FROM
(
Values
( ValueForField1, ValueForField2,..., ValueForFieldN ),
( ValueForField1, ValueForField2,..., ValueForFieldN ),
( ValueForField1, ValueForField2,..., ValueForFieldN ),
( ValueForField1, ValueForField2,..., ValueForFieldN ),
( ValueForField1, ValueForField2,..., ValueForFieldN )
) AS TempTableName ( FieldName1, FieldName2, ..., FieldNameN )
In your case :
Select
distinct
TempTableName.Field1
From
(
VALUES
(1),
(1),
(1),
(2),
(5),
(1),
(6)
) AS TempTableName (Field1)
The simplest way to get the distinct values of a long list of comma-delimited text would be to use find and replace with UNION to get the distinct values.
SELECT 1
UNION SELECT 1
UNION SELECT 1
UNION SELECT 2
UNION SELECT 5
UNION SELECT 1
UNION SELECT 6
Applied to your long line of comma delimited text
Find and replace every comma with UNION SELECT
Add a SELECT in front of the statement
You now should have a working query
Have you tried using the following syntax?
select * from (values (1), (2), (3), (4), (5)) numbers(number)
If you want to select only certain values from a single table, you can try this:
select distinct * from table_name where table_field in (1,1,2,3,4,5)
For example:
select first_name, phone_number from telephone_list where district_id in (1,2,5,7,8,9)
If you want to select from multiple tables, then you must use UNION.
If you just want to select the values 1, 1, 1, 2, 5, 1, 6, then you can do this:
select 1
union select 1
union select 1
union select 2
union select 5
union select 1
union select 6
PostgreSQL gives you 2 ways of doing this:
SELECT DISTINCT * FROM (VALUES('a'),('b'),('a'),('v')) AS tbl(col1)
or
SELECT DISTINCT * FROM (select unnest(array['a','b', 'a','v'])) AS tbl(col1)
using array approach you can also do something like this:
SELECT DISTINCT * FROM (select unnest(string_to_array('a;b;c;d;e;f;a;b;d', ';'))) AS tbl(col1)
I know this is a pretty old thread, but I was searching for something similar and came up with this.
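Given that you had a comma-separated string, you could use string_split (available from SQL Server 2016). LTRIM removes the space that follows each comma, so that ' 1' and '1' are not treated as different values:
select distinct ltrim(value) as value from string_split('1, 1, 1, 2, 5, 1, 6',',')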
This should return
1
2
5
6
string_split takes two parameters: the input string and the separator character.
You can add an optional WHERE clause using value as the column name:
select distinct ltrim(value) as value from string_split('1, 1, 1, 2, 5, 1, 6',',')
where value > 1
produces
2
5
6
This works on SQL Server 2005, provided there is a known maximum number:
SELECT *
FROM
(SELECT ROW_NUMBER() OVER(ORDER BY a.id) NUMBER
FROM syscomments a
CROSS JOIN syscomments b) c
WHERE c.NUMBER IN (1,4,6,7,9)
GROUP BY returns the same distinct values and, depending on the query plan, may perform better than DISTINCT:
SELECT *
FROM
(
VALUES
(1),
(1),
(1),
(2),
(5),
(1),
(6)
) AS A (nums)
GROUP BY A.nums;
If you need an array, separate the array columns with a comma:
SELECT * FROM (VALUES('WOMENS'),('MENS'),('CHILDRENS')) as X([Attribute])
,(VALUES(742),(318)) AS z([StoreID])
Another way that you can use is a query like this:
SELECT DISTINCT
LTRIM(m.n.value('.[1]','varchar(8000)')) as columnName
FROM
(SELECT CAST('<XMLRoot><RowData>' + REPLACE(t.val,',','</RowData><RowData>') + '</RowData></XMLRoot>' AS XML) AS x
FROM (SELECT '1, 1, 1, 2, 5, 1, 6') AS t(val)
) dt
CROSS APPLY
x.nodes('/XMLRoot/RowData') m(n);
If it is a list of parameters from existing SQL table, for example ID list from existing Table1, then you can try this:
select distinct ID
FROM Table1
where
ID in (1, 1, 1, 2, 5, 1, 6)
ORDER BY ID;
Or, if you need List of parameters as a SQL Table constant(variable), try this:
WITH Id_list AS (
select ID
FROM Table1
where
ID in (1, 1, 1, 2, 5, 1, 6)
)
SELECT distinct * FROM Id_list
ORDER BY ID;
I create a function on most SQL DB I work on to do just this.
CREATE OR ALTER FUNCTION [dbo].[UTIL_SplitList](@parList Varchar(MAX),@splitChar Varchar(1)=',')
Returns @t table (Column_Value varchar(MAX))
as
Begin
Declare @pos integer
set @pos = CharIndex(@splitChar, @parList)
while @pos > 0
Begin
Insert Into @t (Column_Value) VALUES (Left(@parList, @pos-1))
set @parList = Right(@parList, Len(@parList) - @pos)
set @pos = CharIndex(@splitChar, @parList)
End
Insert Into @t (Column_Value) VALUES (@parList)
Return
End
Return
End
Once the function exists, it is as easy as
SELECT DISTINCT
*
FROM
[dbo].[UTIL_SplitList]('1,1,1,2,5,1,6',',')
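The loop inside the function is easier to follow when traced outside the database, so here is a minimal Python sketch of the same find-and-cut loop. It is an illustration only, not T-SQL; `split_list` is a hypothetical helper that mirrors the function's steps and, like it, assumes a single-character separator.

```python
def split_list(par_list, split_char=","):
    """Cut off the text before each separator until no separator is left."""
    values = []
    pos = par_list.find(split_char)      # like CharIndex, but -1 means "not found"
    while pos >= 0:
        values.append(par_list[:pos])    # like Left(@parList, @pos - 1)
        par_list = par_list[pos + 1:]    # keep only the text after the separator
        pos = par_list.find(split_char)
    values.append(par_list)              # whatever remains is the last value
    return values


distinct_values = sorted(set(split_list("1,1,1,2,5,1,6")))
print(distinct_values)
```

The final append after the loop matters: without it, the last item in the list (which has no trailing separator) would be dropped.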
Select user id from list of user id:
SELECT * FROM my_table WHERE user_id IN (1,3,5,7,9,4);
A technique that has worked for me is to query a table that you know has a large amount of records in it, including just the Row_Number field in your result
Select Top 10000 Row_Number() OVER (Order by fieldintable) As 'recnum' From largetable
will return a result set of 10000 records from 1 to 10000, use this within another query to give you the desired results
Use the SQL In function
Something like this:
SELECT * FROM mytable WHERE:
"VALUE" In (1,2,3,7,90,500)
Works a treat in ArcGIS
