count the number of spaces in values in sql server [duplicate]

count the number of spaces in values in sql server [duplicate] - sql-server

This question already has answers here:
How do you count the number of occurrences of a certain substring in a SQL varchar?
(23 answers)
Closed 8 years ago.
I need the number of spaces in column values in sql server.
Ex:
column1
------------
aaa bbbb - 1 space
aaa bbb ccc - 2 space
aaa bbb ccc ddd - 3 space
I need the count of spaces like this.
thanks.

SELECT LEN(column1)-LEN(REPLACE(column1, ' ', '')) FROM YourTableName

This will give a different and more accurate result than the other answers, it is also counting spaces in the end of the words, it becomes clear when tested on these examples:
DECLARE #a table(column1 varchar(20))
INSERT #a values('b c ')
INSERT #a values('b c')
INSERT #a values(' b c ')
SELECT
LEN(column1 + ';')-LEN(REPLACE(column1,' ','')) - 1 accurate,
LEN(column1)-LEN(REPLACE(column1,' ', '')) [inaccurate] -- other answers
FROM #a
Result:
accurate inaccurate
2 1
1 1
10 4

Try this one -
DECLARE #t TABLE (txt VARCHAR(50))
INSERT INTO #t (txt)
VALUES
('aaa bbbb')
, ('aaa bbb ccc')
, ('aaa bbb ccc ddd')
SELECT txt, LEN(txt) - LEN(REPLACE(txt, ' ', ''))
FROM #t

this is a code for that
select len('aaa bbb') - len(replace('aaa bbb ccc', ' ', '')) from
**tablename**
output
1
select len('aaa bbb ccc') - len(replace('aaa bbb ccc', ' ', '')) from
**tablename**
ouput
2
Tablename acan be anything table that can be in your database

Related

PATINDEX with SOUNDEX

Want to search the string using PATINDEX and SOUNDEX.
I have the following table with some sample data to search the given string using PATINDEX and SOUNDEX.
create table tbl_pat_soundex
(
col_str varchar(max)
);
insert into tbl_pat_soundex values('Smith A Steve');
insert into tbl_pat_soundex values('Steve A Smyth');
insert into tbl_pat_soundex values('A Smeeth Stive');
insert into tbl_pat_soundex values('Steve Smith A');
insert into tbl_pat_soundex values('Smit Steve A');
String to search:- 'Smith A Steve'
SELECT col_str,PATINDEX('%Smith%',col_str) [Smith],PATINDEX('%A%',col_str) [A],PATINDEX('%Steve%',col_str) [Steve]
FROM tbl_pat_soundex
Getting Output:
col_str Smith A Steve
---------------------------------
Smith A Steve 1 7 9
Steve A Smyth 0 7 1
A Smeeth Stive 0 1 0
Steve Smith A 7 13 1
Smit Steve A 0 12 6
Expected Output:
col_str Smith A Steve
---------------------------------
Smith A Steve 1 7 9
Steve A Smyth 9 7 1
A Smeeth Stive 3 1 10
Steve Smith A 7 13 1
Smit Steve A 1 12 6
Tried:
SELECT col_str,
PATINDEX('%'+soundex('Smith')+'%',soundex(col_str)) [Smith],
PATINDEX('%'+soundex('A')+'%',soundex(col_str)) [A],
PATINDEX('%'+soundex('Steve')+'%',soundex(col_str)) [Steve]
FROM tbl_pat_soundex
But getting unexpected result:
col_str Smith A Steve
---------------------------------
Smith A Steve 1 0 0
Steve A Smyth 0 0 1
A Smeeth Stive 0 1 0
Steve Smith A 0 0 1
Smit Steve A 1 0 0
Note: I have 100 Millions of records in the table to search for.

Here's one option, not sure how it would perform with 100 million records considering all that you need to do. You'll have to test that out.
At a high level how I understand this is you basically need
Search all words in a string based on the words of another string
Returning the character starting position in the original string where that word equals or sounds like the search word.
You can use DIFFERENCE() for the comparison:
DIFFERENCE compares two different SOUNDEX values, and returns an
integer value. This value measures the degree that the SOUNDEX values
match, on a scale of 0 to 4. A value of 0 indicates weak or no
similarity between the SOUNDEX values; 4 indicates strongly similar,
or even identically matching, SOUNDEX values.
You'll need to split the string based on the space ' ' and since you're 2008 you'd have to roll your own function.
I used the XML function from here, https://sqlperformance.com/2012/07/t-sql-queries/split-strings, for my examples, you'll obviously need to adjust if you have your own or want to use something different:
CREATE FUNCTION dbo.SplitStrings_XML
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(#List, #Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
I switched and use table variables to show the example, I would suggest not doing that with the amount of data you have and create and use physical tables.
Option 1 - Not dynamic:
DECLARE #tbl_pat_soundex TABLE
(
[col_str] VARCHAR(MAX)
);
INSERT INTO #tbl_pat_soundex
VALUES ( 'Smith A Steve' )
,( 'Steve A Smyth' )
,( 'A Smeeth Stive' )
,( 'Steve Smith A' )
,( 'Smit Steve A' )
SELECT DISTINCT [aa].[col_str]
, MAX([aa].[Smith]) OVER ( PARTITION BY [aa].[col_str] ) AS [Smith]
, MAX([aa].[A]) OVER ( PARTITION BY [aa].[col_str] ) AS [A]
, MAX([aa].[Steve]) OVER ( PARTITION BY [aa].[col_str] ) AS [Steve]
FROM (
SELECT [a].[col_str]
, CASE WHEN DIFFERENCE([b].[item], 'Smith') = 4 THEN
CHARINDEX([b].[item], [a].[col_str])
ELSE 0
END AS [Smith]
, CASE WHEN DIFFERENCE([b].[item], 'A') = 4 THEN
CHARINDEX([b].[item], [a].[col_str])
ELSE 0
END AS [A]
, CASE WHEN DIFFERENCE([b].[item], 'Steve') = 4 THEN
CHARINDEX([b].[item], [a].[col_str])
ELSE 0
END AS [Steve]
FROM #tbl_pat_soundex [a]
CROSS APPLY [dbo].[SplitStrings_XML]([a].[col_str], ' ') [b]
) AS [aa];
Using the function we split the string into individual words
Then we use a case statement to check the DIFFERENCE value
If that DIFFERENCE value equals 4 we then return the CHARINDEX value of the original word against string.
If doesn't equal we return 0
Then from there it's a matter of getting the max value of each based on the original string:
, MAX([aa].[Smith]) OVER ( PARTITION BY [aa].[col_str] ) AS [Smith]
, MAX([aa].[A]) OVER ( PARTITION BY [aa].[col_str] ) AS [A]
, MAX([aa].[Steve]) OVER ( PARTITION BY [aa].[col_str] ) AS [Steve]
To get you your final results:
Option 2 - Dynamic with a pivot:
We'll declare the string we want to search, split that out and search for those individuals words in the original string and then pivot the results.
--This example is using global temp tables as it's showing how
--to build a dynamic pivot
IF OBJECT_ID('tempdb..##tbl_pat_soundex') IS NOT NULL
DROP TABLE [##tbl_pat_soundex];
IF OBJECT_ID('tempdb..##tbl_col_str_SearchString') IS NOT NULL
DROP TABLE [##tbl_col_str_SearchString];
CREATE TABLE [##tbl_pat_soundex]
(
[col_str] VARCHAR(MAX)
);
INSERT INTO [##tbl_pat_soundex]
VALUES ( 'Smith A Steve' )
, ( 'Steve A Smyth' )
, ( 'A Smeeth Stive' )
, ( 'Steve Smith A' )
, ( 'Smit Steve A' );
--What are you searching for?
DECLARE #SearchString NVARCHAR(200);
SET #SearchString = N'Smith A Steve';
--We build a table we load with every combination of the words from the string and the words from the SearchString for easier comparison.
CREATE TABLE [##tbl_col_str_SearchString]
(
[col_str] NVARCHAR(MAX)
, [col_str_value] NVARCHAR(MAX)
, [SearchValue] NVARCHAR(200)
);
--Load that table for comparison
--split our original string into individual words
--also split our search string into individual words and give me all combinations.
INSERT INTO [##tbl_col_str_SearchString] (
[col_str]
, [col_str_value]
, [SearchValue]
)
SELECT DISTINCT [a].[col_str]
, [b].[item]
, [c].[item]
FROM [##tbl_pat_soundex] [a]
CROSS APPLY [dbo].[SplitStrings_XML]([a].[col_str], ' ') [b]
CROSS APPLY [dbo].[SplitStrings_XML](#SearchString, ' ') [c]
ORDER BY [a].[col_str];
--Then we can easily compare each word and search word for those that match or sound alike using DIFFERNCE()
SELECT [col_str], [col_str_value], [SearchValue], CASE WHEN DIFFERENCE([col_str_value], [SearchValue]) = 4 THEN CHARINDEX([col_str_value], [col_str]) ELSE 0 END AS [Match] FROM ##tbl_col_str_SearchString
--Then we can pivot on it
--and we will need to make it dynamic since we are not sure what what #SearchString could be.
DECLARE #PivotSQL NVARCHAR(MAX);
DECLARE #pivotColumn NVARCHAR(MAX);
SET #pivotColumn = N'[' + REPLACE(#SearchString, ' ', '],[') + N']';
SET #PivotSQL = N'SELECT * FROM (
SELECT [col_str], [SearchValue], CASE WHEN DIFFERENCE([col_str_value], [SearchValue]) = 4 THEN CHARINDEX([col_str_value], [col_str]) ELSE 0 END AS [Match] FROM ##tbl_col_str_SearchString
) aa
PIVOT (MAX([Match]) FOR [SearchValue] IN (' + #pivotColumn
+ N')) AS MaxMatch
ORDER BY [MaxMatch].[col_str]
';
--Giving us the final results.
EXEC sp_executesql #PivotSQL

How can I unpivot and then pivot my table so the columns become rows and one column becomes a row?

How can I accomplish this with unpivot and pivot.
I've seen this question asked before and has a solution with case statement and union all
In SQL, how can I count the number of values in a column and then pivot it so the column becomes the row?
and here PIVOT/UNPIVOT multiple rows and columns but I have 20 rows and 24 columns and the query would become very long and I suspect inefficient. Does anyone know how I can do this with unpivot and pivot or is case and union all the only viable option?
Hour A B C D E ... Z
-----------------------------------------
0 4 2 3 0 6 2
1 3 5 7 1 8 7
2 2 6 1 1 4 3
3 2 2 0 3 0 2
4 3 9 6 2 2 8
...
23 6 5 2 3 8 6
Field 0 1 2 3 ...23
-------- -- -- -
A 2 0 2 2 4
B 7 2 8 1 6
....
Z 6 7 7 3 8
This is what I've tried in terms of pivot but I didn't get far:
select B,[0],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23] from CTE2
pivot(
sum(A)
for hour in ([0],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23])) as pvt;
Just to clarify, the numbers in the table are just random numbers I've put to simulate data, they aren't transposed as they should be.

Well, I know you say you've solved it so this probably isn't necessary and you can feel free to use whatever answer you currently have, but here's an example of how you could approach this problem in general.
IF OBJECT_ID('tmpTable', 'U') IS NOT NULL DROP TABLE tmpTable;
CREATE TABLE tmpTable (i INT, a INT, b INT, c INT, d INT);
INSERT tmpTable VALUES (1,69,69,10,1)
, (2,5,0,2,3)
, (3,5,5,5,5)
, (4,1,2,3,4);
DECLARE #applycols NVARCHAR(MAX);
SELECT #applycols = STUFF(
(SELECT ',(' + QUOTENAME(COLUMN_NAME) + ', ''' + COLUMN_NAME + ''')'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'tmpTable'
AND COLUMN_NAME <> 'i'
FOR XML PATH ('')),1,1,'');
DECLARE #aggcols NVARCHAR(MAX) = '';
SELECT #aggcols += ', MAX(CASE WHEN i = ' + CAST(i AS NVARCHAR(255)) + ' THEN piv.colval END) ' + QUOTENAME(CAST(i AS NVARCHAR(255)))
FROM tmpTable;
DECLARE #SQL NVARCHAR(MAX) = 'SELECT piv.col' + #aggcols + '
FROM tmpTable
CROSS APPLY (VALUES ' + #applycols + ') piv(colval, col)
GROUP BY piv.col;';
EXEC(#SQL);
DROP TABLE tmpTable;
Essentially, it's using dynamic SQL to determine all the columns/values and then using a simple CROSS APPLY / MAX(CASE... to get all the values.

SQL Server Pivot Table with multiple column with dates

I have a PIVOT situation.
Source table columns:
Title Description Datetime RecordsCount
A California 2015-07-08 10:44:39.040 5
A California 2015-07-08 12:44:39.040 6
A California 2015-05-08 15:44:39.040 3
B Florida 2015-07-08 16:44:39.040 2
B Florida 2015-05-08 19:44:39.040 4
Now I need this pivoted as
2015-07-08 2015-05-08
Title Description
A California 11 3
B Florida 2 4
if we have two record counts on same dates (no matter of time) then sum them, else display in different column.
Trying to write something like this, but it throws errors.
Select * from #DataQualTest
PIVOT (SUM(RecordCount) FOR DateTime IN (Select Datetime from #DataQualTest) )
AS Pivot_Table
Please help me out with this.
Thanks

Not exactly the word for word solution but this should give you a direction.
create table #tmp
(
country varchar(max)
, date1 datetime
, record int
)
insert into #tmp values ('California', '2010-01-01', 2)
insert into #tmp values ('California', '2010-01-01', 5)
insert into #tmp values ('California', '2012-01-01', 1)
insert into #tmp values ('Florida', '2010-01-01', 3)
insert into #tmp values ('Florida', '2010-01-01', 5)
select * from #tmp
pivot (sum(record) for date1 in ([2010-01-01], [2012-01-01])) as avg
output
country 2010-01-01 2012-01-01
California 7 1
Florida 8 NULL

If you want to be more flexible, you need some pre-processing to get from full timestamps to days (in order for later on the PIVOT's grouping to actually have the anticipated effect):
CREATE VIEW DataQualTestView AS
SELECT
title
, description
, DATEFROMPARTS (DATEPART(yyyy, date_time),
DATEPART(mm, date_time),
DATEPART(dd, date_time)) AS day_from_date_time
, recordsCount
FROM DataQualTest
;
From there you could continue:
DECLARE #query AS NVARCHAR(MAX)
DECLARE #columns AS NVARCHAR(MAX)
SELECT #columns = ISNULL(#columns + ',' , '')
+ QUOTENAME(day_from_date_time)
FROM (SELECT DISTINCT
day_from_date_time
FROM DataQualTestView) AS TheDays
SET #query =
N'SELECT
title
, description
, ' + #columns + '
FROM DataQualTestView
PIVOT(SUM(recordsCount)
FOR day_from_date_time IN (' + #columns + ')) AS Pivoted'
EXEC SP_EXECUTESQL #query
GO
... and would get:
| title | description | 2015-05-08 | 2015-07-08 |
|-------|-------------|------------|------------|
| A | California | 3 | 11 |
| B | Florida | 4 | 2 |
See it in action: SQL Fiddle.
Please comment, if and as this requires adjustment / further detail.

Sorting varchar field with mixed alphanumeric data

I searched and read a lot of answers on here, but can't find one that will answer my problem, (or help me to find the answer on my own).
We have a table which contains a varchar display field, who's data is entered by the customer.
When we display the results, our customer wants the results to be ordered "correctly".
A sample of what the data could like is as follows:
"AAA 2 1 AAA"
"AAA 10 1 AAA"
"AAA 10 2 BAA"
"AAA 101 1 AAA"
"BAA 101 2 BBB"
"BAA 101 10 BBB"
"BAA 2 2 AAA"
Sorting by this column ASC returns:
1: "AAA 10 1 AAA"
2: "AAA 10 2 BAA"
3: "AAA 101 1 AAA"
4: "AAA 2 1 AAA"
5: "BAA 101 10 BBB"
6: "BAA 101 2 BBB"
7: "BAA 2 2 AAA"
The customer would like row 4 to actually be the first row (as 2 comes before 10), and similarly row 7 to be between rows 4 and 5, as shown below:
1: "AAA 2 1 AAA"
2: "AAA 10 1 AAA"
3: "AAA 10 2 BAA"
4: "AAA 101 1 AAA"
5: "BAA 2 2 AAA"
6: "BAA 101 10 BBB"
7: "BAA 101 2 BBB"
Now, the real TRICKY bit is, there is no hard and fast rule to what the data will look like in this column; it is entirely down to the customer as to what they put in here (the data shown above is just arbitrary to demonstrate the problem).
Any Help?
EDIT:
learning that this is referred to as "natural sorting" has improved my search results massively
I'm going to give the accepted answer to this question a bash and will update accordingly:
Natural (human alpha-numeric) sort in Microsoft SQL 2005

First create this function
Create FUNCTION dbo.SplitAndJoin
(
#delimited nvarchar(max),
#delimiter nvarchar(100)
) RETURNS Nvarchar(Max)
AS
BEGIN
declare #res nvarchar(max)
declare #t TABLE
(
-- Id column can be commented out, not required for sql splitting string
id int identity(1,1), -- I use this column for numbering splitted parts
val nvarchar(max)
)
declare #xml xml
set #xml = N'<root><r>' + replace(#delimited,#delimiter,'</r><r>') + '</r></root>'
insert into #t(val)
select
r.value('.','varchar(max)') as item
from #xml.nodes('//root/r') as records(r)
SELECT #res = STUFF((SELECT ' ' + case when isnumeric(val) = 1 then RIGHT('00000000'+CAST(val AS VARCHAR(8)),8) else val end
FROM #t
FOR XML PATH('')), 1, 1, '')
RETURN #Res
END
GO
This function gets an space delimited string and split it to words then join them together again by space but if the word is number it adds 8 leading zeros
then you use this query
Select * from Test
order by dbo.SplitAndJoin(col1,' ')
Live result on SQL Fiddle

Without consistency, you only have brute force
Without rules, your brute force is limited
I've made some assumptions with this code: if it starts with 3 alpha characters, then a space, then a number (up to 3 digits), let's treat it differently.
There's nothing special about this - it is just string manipulation being brute forced in to giving you "something". Hopefully it illustrates how painful this is without having consistency and rules!
DECLARE #t table (
a varchar(50)
);
INSERT INTO #t (a)
VALUES ('AAA 2 1 AAA')
, ('AAA 10 1 AAA')
, ('AAA 10 2 BAA')
, ('AAA 101 1 AAA')
, ('BAA 101 2 BBB')
, ('BAA 101 10 BBB')
, ('BAA 2 2 AAA')
, ('Completely different')
;
; WITH step1 AS (
SELECT a
, CASE WHEN a LIKE '[A-Z][A-Z][A-Z] [0-9]%' THEN 1 ELSE 0 END As fits_pattern
, CharIndex(' ', a) As first_space
FROM #t
)
, step2 AS (
SELECT *
, CharIndex(' ', a, first_space + 1) As second_space
, CASE WHEN fits_pattern = 1 THEN Left(a, 3) ELSE 'ZZZ' END As first_part
, CASE WHEN fits_pattern = 1 THEN SubString(a, first_space + 1, 1000) ELSE 'ZZZ' END As rest_of_it
FROM step1
)
, step3 AS (
SELECT *
, CASE WHEN fits_pattern = 1 THEN SubString(rest_of_it, 1, second_space - first_space - 1) ELSE 'ZZZ' END As second_part
FROM step2
)
SELECT *
, Right('000' + second_part, 3) As second_part_formatted
FROM step3
ORDER
BY first_part
, second_part_formatted
, a
;
Relevant, sorted results:
a
---------------------
AAA 2 1 AAA
AAA 10 1 AAA
AAA 10 2 BAA
AAA 101 1 AAA
BAA 2 2 AAA
BAA 101 10 BBB
BAA 101 2 BBB
Completely different
This code can be vastly improved/shortened. I've just left it verbose in order to give you some clarity over the steps taken.

Splitting (duplicating rows of) an SQL table based on the values in one column

I have a table which is the result of an export and has the columns and values like the following
Name Aliases Ranks
Ben BenA BenB BenC 1 5 3
Jerry JerryA JerryB 7 3
Aliases and Ranks are separated by a character (in this case CHAR(10)) and they have the same number of entries. But each Name could have different number of Aliases (And therefore Ranks).
I would like to write a SQL Query to give me the following table
Name Alias Rank
Ben BenA 1
Ben BenB 5
Ben BenC 3
Jerry JerryA 7
Jerry JerryB 3
How can I do this?

with cte as (
select Name, cast(null as int) as AliasStartPosition, cast(0 as int) as AliasEndPosition, Aliases + ' ' as Aliases, cast(null as int) as RankStartPosition, cast(0 as int) as RankEndPosition, Ranks + ' ' as Ranks
from (
values ('Ben', 'BenA BenB BenC', '1 5 3'),
('Jerry', 'JerryA JerryB', '7 3')
) t (Name, Aliases, Ranks)
union all
select Name, AliasEndPosition + 1, charindex(' ', Aliases, AliasEndPosition + 1), Aliases, RankEndPosition + 1, charindex(' ', Ranks, RankEndPosition + 1), Ranks
from cte
where charindex(' ', Aliases, AliasEndPosition + 1) != 0
)
select Name, substring(Aliases, AliasStartPosition, AliasEndPosition - AliasStartPosition) as Alias, substring(Ranks, RankStartPosition, RankEndPosition - RankStartPosition) as Rank
from cte
where AliasStartPosition is not null