Get COUNT() of rows where first 3 digits of column are alike


I have a result set of codes that are usually three digits followed by up to 2 digits like 012.34 or 123.45. The first three digits define a general category of group, and the digits following the decimal place define more specific qualities. There could be 77 012.xx numbers, and there are hundreds of unique 3 digit group definitions, followed by a varying number of digits per entry.
Does anyone know how to write a quick query to achieve this?

Assuming it is in a varchar column since you're storing 012.34...
SELECT LEFT(someColumn,3), COUNT(*)
FROM someTable
GROUP BY LEFT(someColumn,3)
HAVING COUNT(*) > 5 -- per your comments
ORDER BY LEFT(someColumn,3)
If it's not, then you'd do this:
SELECT LEFT(CONVERT(VARCHAR(10),someColumn),3), COUNT(*)
FROM someTable
GROUP BY LEFT(CONVERT(VARCHAR(10),someColumn),3)
ORDER BY LEFT(CONVERT(VARCHAR(10),someColumn),3)
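The grouping above can be checked outside SQL Server with a small sketch. This one uses Python's built-in sqlite3 module, so it is SQLite dialect rather than T-SQL: `substr(x, 1, 3)` stands in for `LEFT(x, 3)`, and the helper name `count_by_prefix` is only illustrative.

```python
import sqlite3


def count_by_prefix(codes, min_count=1):
    """Group codes by their first three characters and count each group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE someTable (someColumn TEXT)")
    conn.executemany("INSERT INTO someTable VALUES (?)", [(c,) for c in codes])
    # SQLite has no LEFT(); substr(x, 1, 3) plays the same role here.
    rows = conn.execute(
        """
        SELECT substr(someColumn, 1, 3) AS prefix, COUNT(*) AS n
        FROM someTable
        WHERE someColumn IS NOT NULL
        GROUP BY substr(someColumn, 1, 3)
        HAVING COUNT(*) >= ?
        ORDER BY prefix
        """,
        (min_count,),
    ).fetchall()
    conn.close()
    return rows


print(count_by_prefix(["012.34", "012.01", "123.45", None]))
```

Passing `min_count=6` mirrors the `HAVING COUNT(*) > 5` filter in the first query.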

@rypress, these look strikingly similar to ICD-9 diagnosis codes (012 being Other respiratory tuberculosis). Is this correct? If so, your result will include category and subcategory rows, which may change your counts (012 -> 012.0 -> 012.00, 012.02, ...).
Sample Data:
IF OBJECT_ID(N'TempICD') > 0
BEGIN
DROP TABLE TempICD
END
CREATE TABLE TempICD (ICD VARCHAR(10))
INSERT INTO TempICD
VALUES ('012'),('012.0'),('012.00'),('012.01'),('012.02'),
('012.03'),('012.04'),('012.05'),('012.05'),
('013'),('013.0'),('013.00'),('013.01'),('013.02'),
('013.03'),(NULL)
Query to get Category with 6 or more line items (Including Category and Sub Category):
SELECT LEFT(ICD, 3) AS ICDs,
COUNT(1) AS ICDCount
FROM TempICD
GROUP BY LEFT(ICD, 3)
HAVING COUNT(*) > 5
ORDER BY LEFT(ICD, 3)
Query to get Category with 6 or more line items (Excluding Category and Sub Category):
SELECT SUBSTRING(ICD, 1, CHARINDEX('.', ICD + '.') - 1) AS ICDs,
SUM(CASE
WHEN LEN(SUBSTRING(ICD, CHARINDEX('.', ICD) + 1, LEN(ICD))) = 2 THEN 1
ELSE 0
END) AS ICDCount
FROM TempICD
WHERE ICD IS NOT NULL
GROUP BY SUBSTRING(ICD, 1, CHARINDEX('.', ICD + '.') - 1)
HAVING SUM(CASE
WHEN LEN(SUBSTRING(ICD, CHARINDEX('.', ICD) + 1, LEN(ICD))) = 2 THEN 1
ELSE 0
END) > 5
Cleanup Script:
IF OBJECT_ID(N'TempICD') > 0
BEGIN
DROP TABLE TempICD
END

This may also help you.
I assume that the column is Decimal Data type.
SELECT CAST([COLUMN] AS INT) [GROUP],
COUNT(*) [COUNT]
FROM [TABLE] T
GROUP BY CAST([COLUMN] AS INT)

It depends on whether your result set is numeric or character data. If it is not numeric, you can use string operations:
select left(resultSet, charindex('.', resultSet + '.') - 1), count(*)
from x
group by left(resultSet, charindex('.', resultSet + '.') - 1)
order by left(resultSet, charindex('.', resultSet + '.') - 1)
With charindex you get a correct cut even when the leading part is not three digits, as it 'usually' is. Appending '.' before the search keeps values without a decimal point intact.
If your result set is numeric/float, you can use the floor function:
select floor(resultset),count(*)
from x
group by floor(resultset)
order by floor(resultset)

Related

Count of numbers followed by pipe symbol in a single data of a column in SQL Server

LEN(Column) - LEN(REPLACE(Column, '|', '')) gives the total number of pipe characters in a single row's value in SQL Server.
But I need to count only the pipe symbols that immediately follow a digit.
Eg 1: MNY-THY-2| YUI_WER-NA|JIU-ERT-8|
The output of the above record is 2.
Eg 2: MNY-YU-NA|
The output is 0
Eg 3: MNY-98765|
The output is 1
UPDATE TO MY QUESTION BASED ON ANSWERS SUGGESTED:
Eg 4: MNY-YU-1234
The output is 0. Since there is no '|' symbol in example 4, the result should be 0.
Any suggestions would be appreciated.

You can split the string on "|" and check whether the rightmost character of each part is a digit.
DECLARE @tosearch VARCHAR(MAX)='%[0-9]|%' ,@string VARCHAR(MAX)='FGL_NU_0003'
SELECT COUNT(CASE WHEN RIGHT(VALUE,1) LIKE '[0-9]' THEN 1 ELSE NULL END)
FROM STRING_SPLIT(@string,'|')
WHERE @string LIKE '%|%'
Expected Output:
MNY-YU-1234 - 0

If you are using SQL Server 2016+, STRING_SPLIT() is an option:
Table:
SELECT *
INTO Data
FROM (VALUES
('MNY-THY-2| YUI_WER-NA|JIU-ERT-8|'),
('MNY-YU-NA|'),
('MNY-98765|'),
('FGL_NU_0003')
) v (TextData)
Statement:
SELECT *
FROM Data d
OUTER APPLY (
SELECT COUNT(*) AS NumberCount
FROM STRING_SPLIT(d.TextData, '|') s
WHERE (d.TextData LIKE '%|%') AND (RIGHT(s.[value], 1) LIKE '[0-9]')
) a
Result:
TextData                          NumberCount
MNY-THY-2| YUI_WER-NA|JIU-ERT-8|  2
MNY-YU-NA|                        0
MNY-98765|                        1
FGL_NU_0003                       0
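For comparison, the same rule ("a pipe that immediately follows a digit") can be written as a regular expression. This is a minimal Python sketch, not a T-SQL solution, and `count_digit_pipes` is a hypothetical helper name.

```python
import re


def count_digit_pipes(text):
    """Count occurrences of a digit immediately followed by '|'."""
    return len(re.findall(r"[0-9]\|", text))


for sample in [
    "MNY-THY-2| YUI_WER-NA|JIU-ERT-8|",
    "MNY-YU-NA|",
    "MNY-98765|",
    "MNY-YU-1234",
]:
    print(sample, count_digit_pipes(sample))
```

One difference from the split-based queries is worth noting: a final segment that ends in a digit but has no pipe after it (for example the `B2` in `A1|B2`) is not counted here, whereas checking the last character of every split part would count it.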

How to query number based SQL Sets with Ranges in SQL

What I'm looking for is a way in MSSQL to create a complex IN or LIKE clause that contains a SET of values, some of which will be ranges.
Sort of like this, there are some single numbers, but also some ranges of numbers.
EX: SELECT * FROM table WHERE field LIKE/IN '1-10, 13, 24, 51-60'
I need to find a way to do this WITHOUT having to specify every number in the ranges separately AND without having to say "field LIKE blah OR field BETWEEN blah AND blah OR field LIKE blah".
This is just a simple example but the real query will have many groups and large ranges in it so all the OR's will not work.

One fairly easy way to do this would be to load a temp table with your values/ranges:
CREATE TABLE #Ranges (ValA int, ValB int)
INSERT INTO #Ranges
VALUES
(1, 10)
,(13, NULL)
,(24, NULL)
,(51,60)
SELECT *
FROM Table t
JOIN #Ranges R
ON (t.Field = R.ValA AND R.ValB IS NULL)
OR (t.Field BETWEEN R.ValA and R.ValB AND R.ValB IS NOT NULL)
The BETWEEN won't scale that well, though, so you may want to consider expanding this to include all values and eliminating ranges.
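The range-table join can be tried end to end with a short sketch. It runs on Python's built-in sqlite3 module (SQLite dialect, ordinary tables instead of a `#temp` table), and it folds the two join conditions into one by treating a NULL upper bound as "same as the lower bound" via COALESCE. The table and function names are assumptions made for the example.

```python
import sqlite3


def rows_in_ranges(values, ranges):
    """Return the distinct values matching a single number or a range.

    ranges is a list of (low, high) pairs; high is None for a single value.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (field INTEGER)")
    conn.execute("CREATE TABLE ranges (val_a INTEGER, val_b INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    conn.executemany("INSERT INTO ranges VALUES (?, ?)", ranges)
    rows = conn.execute(
        """
        SELECT DISTINCT t.field
        FROM t
        JOIN ranges r
          ON t.field BETWEEN r.val_a AND COALESCE(r.val_b, r.val_a)
        ORDER BY t.field
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


wanted = [(1, 10), (13, None), (24, None), (51, 60)]
print(rows_in_ranges([5, 11, 13, 24, 50, 55, 61], wanted))
```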

You can do this with CTEs.
First, create a numbers/tally table if you don't already have one (it might be better to make it permanent instead of temporary if you are going to use it a lot):
;WITH Numbers AS
(
SELECT
1 as Value
UNION ALL
SELECT
Numbers.Value + 1
FROM
Numbers
)
SELECT TOP 1000
Value
INTO ##Numbers
FROM
Numbers
OPTION (MAXRECURSION 1000)
Then you can use a CTE to parse the comma delimited string and join the ranges with the numbers table to get the "NewValue" column which contains the whole list of numbers you are looking for:
DECLARE @TestData varchar(50) = '1-10,13,24,51-60'
;WITH CTE AS
(
SELECT
1 AS RowCounter,
1 AS StartPosition,
CHARINDEX(',',@TestData) AS EndPosition
UNION ALL
SELECT
CTE.RowCounter + 1,
EndPosition + 1,
CHARINDEX(',',@TestData, CTE.EndPosition+1)
FROM CTE
WHERE
CTE.EndPosition > 0
)
SELECT
u.Value,
u.StartValue,
u.EndValue,
n.Value as NewValue
FROM
(
SELECT
Value,
SUBSTRING(Value,1,CASE WHEN CHARINDEX('-',Value) > 0 THEN CHARINDEX('-',Value)-1 ELSE LEN(Value) END) AS StartValue,
SUBSTRING(Value,CASE WHEN CHARINDEX('-',Value) > 0 THEN CHARINDEX('-',Value)+1 ELSE 1 END,LEN(Value)- CHARINDEX('-',Value)) AS EndValue
FROM
(
SELECT
SUBSTRING(@TestData, StartPosition, CASE WHEN EndPosition > 0 THEN EndPosition-StartPosition ELSE LEN(@TestData)-StartPosition+1 END) AS Value
FROM
CTE
)t
)u INNER JOIN ##Numbers n ON n.Value BETWEEN u.StartValue AND u.EndValue
All you would need to do once you have that is query the results using an IN statement, so something like
SELECT * FROM MyTable WHERE Value IN (SELECT NewValue FROM (/*subquery from above*/)t)
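If the range string is built in application code anyway, the parsing step can also be done before the query is issued. The sketch below is plain Python with hypothetical helper names; it assumes non-negative integers, since a leading minus sign would be mistaken for the range separator.

```python
def parse_ranges(spec):
    """Turn '1-10, 13, 24, 51-60' into a list of (low, high) pairs."""
    pairs = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        low, sep, high = part.partition("-")
        # A single number becomes a one-element range (n, n).
        pairs.append((int(low), int(high) if sep else int(low)))
    return pairs


def in_ranges(n, pairs):
    """True when n falls inside any of the (low, high) pairs."""
    return any(low <= n <= high for low, high in pairs)


pairs = parse_ranges("1-10, 13, 24, 51-60")
print(pairs)
print(in_ranges(55, pairs), in_ranges(11, pairs))
```

The resulting pairs could then be loaded into a range table or expanded into individual values, depending on which of the approaches above fits better.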

Get the missing value in a sequence of numbers

I made the following query for the SQL Server backend
SELECT TOP(1) (v.rownum + 99)
FROM
(
SELECT incrementNo-99 as id, ROW_NUMBER() OVER (ORDER BY incrementNo) as rownum
FROM proposals
WHERE [year] = '12'
) as v
WHERE v.rownum <> v.id
ORDER BY v.rownum
to find the first unused proposal number.
(It's not about the lastrecord +1)
But I realized ROW_NUMBER is not supported in Access.
I looked and I can't find anything similar.
Does anyone know how to get the same result as ROW_NUMBER in Access?
Maybe there's a better way of doing this.
People enter their proposal number (incrementID) with no constraint. The number looks like 13-152: the xx- prefix is the current year and the -xxx suffix is the proposal number. The last three digits are supposed to be incremental, but in some cases, maybe 10 times a year, they have to skip some numbers. That's why I can't use an auto-increment.
So I do this query so when they open the form, the default number is the first unused.
How it works:
Because the number starts at 100, I do -99 so it starts at 1.
Then I compare the row number with the id so it looks like this
ROW NUMBER | ID
1          | 1 (100)
2          | 2 (101)
3          | 3 (102)
4          | 5 (104) <--------- WRONG
5          | 6 (105)
So now I know that we skipped 4, so I return (4 + 99) = 103.
If there's a better way, I don't mind changing, but I really like this query.
If there's really no other way and I can't simulate a row number in Access, I will use the pass-through query.
Thank you

From your question it appears that you are looking for a gap in a sequence of numbers, so:
SELECT b.akey, (
SELECT Top 1 akey
FROM table1 a
WHERE a.akey > b.akey
ORDER BY a.akey) AS [next]
FROM table1 AS b
WHERE (
SELECT Top 1 akey
FROM table1 a
WHERE a.akey > b.akey
ORDER BY a.akey) <> [b].[akey]+1
ORDER BY b.akey
Where table1 is the table and akey is the sequenced number.

SELECT T.Value, T.next -1 FROM (
SELECT b.Value , (
SELECT Top 1 Value
FROM tblSequence a
WHERE a.Value > b.Value
ORDER BY a.Value) AS [next]
FROM tblSequence b
) T WHERE T.next <> T.Value +1
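The goal described in the question (the first unused number, counting up from 100) can also be stated directly, which is handy for checking what any of the queries should return. This is a plain Python sketch with a hypothetical function name, not an Access query.

```python
def first_gap(numbers, start=100):
    """Return the smallest integer >= start that is not in numbers."""
    used = set(numbers)
    n = start
    while n in used:
        n += 1
    return n


# Same data as the walkthrough in the question: 103 is the skipped number.
print(first_gap([100, 101, 102, 104, 105]))
```

When there is no gap, the function simply returns the number after the last one in the run, and for an empty list it returns `start`.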

If any clause when grouping

Doing a Sum() on a column adds up the values in that column based on group by. But let's say I want to sum these values only if none of the values is null or 0; then I need a clause that checks whether any of the values is 0 before it does the sum. How can I implement such a clause?
I'm using sql server 2005.
Thanks,
Barry

Let's suppose your table schema is:
myTable( id, colA, value)
Then, one approach is:
Select colA, sum(value)
from myTable
group by colA
having count( id ) = count( nullif( value, 0 ))
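Note that NULLIF is standard SQL and is available in SQL Server; if your RDBMS lacks it, a CASE expression does the same job.
Explanation:
The COUNT aggregate only counts non-null values. NULLIF(value, 0) turns zeros into NULL, so the two counts differ whenever a group contains a 0 or a NULL, and the HAVING clause drops that group.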
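To see the HAVING filter in action, here is a small sketch that runs the same query on Python's built-in sqlite3 module (SQLite also supports NULLIF). The function name and the auto-generated `id` column are assumptions for the example; the sample rows are the three-case data set used elsewhere in this thread.

```python
import sqlite3


def sums_without_zero_or_null(rows):
    """Sum value per colA, keeping only groups with no 0 and no NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE myTable (id INTEGER PRIMARY KEY, colA TEXT, value INTEGER)"
    )
    conn.executemany("INSERT INTO myTable (colA, value) VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT colA, SUM(value)
        FROM myTable
        GROUP BY colA
        HAVING COUNT(id) = COUNT(NULLIF(value, 0))
        ORDER BY colA
        """
    ).fetchall()
    conn.close()
    return result


data = [
    ("case1", 2), ("case1", 3),
    ("case2", 0), ("case2", 2), ("case2", 3),
    ("case3", None), ("case3", 3), ("case3", 4),
]
print(sums_without_zero_or_null(data))
```

Groups containing a 0 or a NULL are removed from the result entirely rather than reported with a sum of 0.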

You say that 0+2+3=0 for this case. Assuming that NULL+2+3 should also be zero:
SELECT GroupField,
SUM(Value) * MIN(CASE WHEN COALESCE(Value, 0) = 0 THEN 0 ELSE 1 END)
FROM SumNonZero
GROUP BY GroupField
The above statement gives this result
GroupField (No column name)
case1 5
case2 0
case3 0
with this test data
CREATE TABLE SumNonZero (
GroupField CHAR(5) NOT NULL,
Value INT
)
INSERT INTO SumNonZero(GroupField, Value)
SELECT 'case1', 2
UNION ALL SELECT 'case1', 3
UNION ALL SELECT 'case2', 0
UNION ALL SELECT 'case2', 2
UNION ALL SELECT 'case2', 3
UNION ALL SELECT 'case3', NULL
UNION ALL SELECT 'case3', 3
UNION ALL SELECT 'case3', 4

It makes no sense to eliminate 0 from a SUM because it won't affect the sum.
But you may want to SUM based on another field:
select FIELD, sum(
case when(OTHER_FIELD>0) then FIELD
else 0
end)
from TABLE
group by FIELD

How do I get the "Next available number" from an SQL Server? (Not an Identity column)

Technologies: SQL Server 2008
So I've tried a few options that I've found on SO, but nothing really provided me with a definitive answer.
I have a table with two columns, (Transaction ID, GroupID) where neither has unique values. For example:
TransID | GroupID
-----------------
23 | 4001
99 | 4001
63 | 4001
123 | 4001
77 | 2113
2645 | 2113
123 | 2113
99 | 2113
Originally, the groupID was just chosen at random by the user, but now we're automating it. Thing is, we're keeping the existing DB without any changes to the existing data(too much work, for too little gain)
Is there a way to query "GroupID" on table "GroupTransactions" for the next available value of GroupID > 2000?

I think from the question you're after the next available, although that may not be the same as max+1 right? - In that case:
Start with a list of integers, and look for those that aren't there in the groupid column, for example:
;WITH CTE_Numbers AS (
SELECT n = 2001
UNION ALL
SELECT n + 1 FROM CTE_Numbers WHERE n < 4000
)
SELECT top 1 n
FROM CTE_Numbers num
WHERE NOT EXISTS (SELECT 1 FROM MyTable tab WHERE num.n = tab.groupid)
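ORDER BY n
OPTION (MAXRECURSION 2000) -- the CTE recurses past the default limit of 100
Note: you need to tweak the 2001/4000 values in the CTE (and the MAXRECURSION hint to match) to allow for the range you want. I assumed the name of your table to be MyTable.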

select max(groupid) + 1 from GroupTransactions

The following will find the next gap above 2000:
SELECT MIN(t.GroupID)+1 AS NextID
FROM GroupTransactions t (updlock)
WHERE NOT EXISTS
(SELECT NULL FROM GroupTransactions n WHERE n.GroupID=t.GroupID+1 AND n.GroupID>2000)
AND t.GroupID>2000
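The behaviour of this gap query is easy to misread, so here is a sketch that runs an equivalent statement against the sample data from the question. It uses Python's built-in sqlite3 module, so the `(updlock)` hint is dropped (SQLite has no such hint) and the function name is hypothetical.

```python
import sqlite3


def next_gap_above(group_ids, floor=2000):
    """Return (smallest existing GroupID > floor with no successor) + 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE GroupTransactions (TransID INTEGER, GroupID INTEGER)"
    )
    conn.executemany(
        "INSERT INTO GroupTransactions (GroupID) VALUES (?)",
        [(g,) for g in group_ids],
    )
    row = conn.execute(
        """
        SELECT MIN(t.GroupID) + 1
        FROM GroupTransactions t
        WHERE t.GroupID > ?
          AND NOT EXISTS (
              SELECT 1 FROM GroupTransactions n
              WHERE n.GroupID = t.GroupID + 1
          )
        """,
        (floor,),
    ).fetchone()
    conn.close()
    return row[0]  # None when no GroupID above the floor exists


print(next_gap_above([4001, 4001, 4001, 4001, 2113, 2113, 2113, 2113]))
```

With the question's data this yields 2114 rather than 2001, because the query only looks for a free value directly after an existing ID. If no GroupID above the floor exists, the result is NULL (None here), so the caller would need a fallback for that case.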

There are always many ways to do everything. I resolved this problem by doing like this:
declare @i int = null
declare @t table (i int)
insert into @t values (1)
insert into @t values (2)
--insert into @t values (3)
--insert into @t values (4)
insert into @t values (5)
--insert into @t values (6)
--get the first missing number
select @i = min(RowNumber)
from (
select ROW_NUMBER() OVER(ORDER BY i) AS RowNumber, i
from (
--select distinct in case a number is in there multiple times
select distinct i
from @t
--start after 0 in case there are negative or 0 number
where i > 0
) as a
) as b
where RowNumber <> i
--if there are no missing numbers or no records, get the max record
if @i is null
begin
select @i = isnull(max(i),0) + 1 from @t
end
select @i

In my situation I have a system that generates message numbers or file/case/reservation numbers sequentially from 1 every year. But in some situations a number does not get used (the user was testing or practicing, or for whatever other reason) and the number was deleted.
You can use a where clause to filter by year if all entries are in the same table, and make it dynamic (my example is hardcoded). If you archive your yearly data, this is not needed. The sub-query parts for mID and mID2 must be identical.
The "union select 0 as seq" for mID is there in case your table is empty; this is the base seed number. It can be anything, e.g. 3000000 or {prefix}0000. The field is an integer. If you omit "union select 0 as seq", the query will not work on an empty table, and on a table that is missing ID 1 it will give you the ID after the first existing one (if the first number is 4, the value returned will be 5).
This query is very quick (hint: the field must be indexed); it was tested on a table of 100,000+ rows. I found that using a domain aggregate gets slower as the table increases in size.
If you remove the "top 1" you will get a list of 'next numbers' but not all the missing numbers in a sequence; i.e. if you have 1 2 4 7, the result will be 3 5 8.
set @newID = (select top 1 mID.seq + 1 as seq from
(select a.[msg_number] as seq from [tblMSG] a --where a.[msg_date] between '2023-01-01' and '2023-12-31'
union select 0 as seq ) as mID
left outer join
(Select b.[msg_number] as seq from [tblMSG] b --where b.[msg_date] between '2023-01-01' and '2023-12-31'
) as mID2 on mID.seq + 1 = mID2.seq where mID2.seq is null order by mID.seq)
-- Next: a statement to insert a row with @newID immediately in tblMSG (in a transaction block).
-- Then the row can be updated by your app.
