SQL Server - Average a column that sometimes has numbers and sometimes strings - sql-server

I'm working with a column that was poorly set up as nvarchar. It sometimes holds a number, which I want to average, and sometimes letters, NULLs, or an empty string. How can I get the average of all the numeric values greater than 0 in this column?
Side question: If I want to fix the column, what's the best way to do it without losing any of the numeric values?

Not bulletproof, but should work:
WITH cte AS (
SELECT *
FROM your_tab
WHERE ISNUMERIC(col) = 1
)
SELECT AVG(CAST(col AS DECIMAL(18,2))) AS average
FROM cte
WHERE CAST(col AS DECIMAL(18,2)) > 0;
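To make the "not bulletproof" caveat concrete, here is a minimal sketch with made-up sample values (none of them come from the question). ISNUMERIC returns 1 for any string that converts to at least one numeric type, so strings such as '$' (valid as money) or '1e5' (valid as float) pass the filter yet cannot be cast to DECIMAL:

```sql
-- Hypothetical sample values, for illustration only.
SELECT v.col,
       ISNUMERIC(v.col) AS isnumeric_flag
FROM (VALUES (N'12.5'), (N'$'), (N'1e5'), (N'abc'), (N'')) AS v(col);
-- '12.5', '$' and '1e5' are flagged 1; 'abc' and '' are flagged 0.
-- CAST(N'$' AS DECIMAL(18,2)) and CAST(N'1e5' AS DECIMAL(18,2)) would both
-- raise a conversion error, even though ISNUMERIC accepted them.
```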
SQL Server 2012+ has the great TRY_CAST function:
SELECT AVG(casted_col) AS average
FROM (
SELECT TRY_CAST(col AS DECIMAL(18,2)) AS casted_col -- NULL if the value cannot be cast
FROM your_tab
) sub
WHERE casted_col > 0;
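For the side question, one cautious approach (a sketch, not the only way) is to add a new typed column next to the old one rather than altering col in place. The column name col_num and the DECIMAL(18,2) type are assumptions; values with more than two decimal places would be rounded at that scale, so pick a type wide enough for your data.

```sql
-- Assumes SQL Server 2012+ for TRY_CAST; col_num is a hypothetical column name.
ALTER TABLE your_tab ADD col_num DECIMAL(18,2) NULL;
GO  -- batch separator (SSMS/sqlcmd), so the new column is visible to the next batch

UPDATE your_tab
SET col_num = TRY_CAST(col AS DECIMAL(18,2));   -- non-numeric values become NULL
GO

-- Review what did not convert before dropping or renaming anything.
SELECT col
FROM your_tab
WHERE col_num IS NULL
  AND col IS NOT NULL
  AND col <> N'';
```

Keeping the original column until the leftover rows have been reviewed means nothing is lost if a value turns out to need manual cleanup.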

Related

How to get the transactions that happened in 2020, 2021 in SQL Server?

I have a table with columns id (int) and trans_id (string).
trans_id contains values such as 20345,19345, where the first two characters represent the year. I want a query for transactions that happened in 2020 or 2019.
You should store dates in a date or datetime column, not as a string or integer. And you certainly shouldn't store multiple values in one column.
Assuming trans_id is an int you can do
SELECT *
FROM YourTable t
WHERE trans_id >= 19000 AND trans_id < 21000;
If trans_id is a varchar string, you can do
SELECT *
FROM YourTable t
WHERE trans_id LIKE '20%' OR trans_id LIKE '19%';
If you've gone for an even worse version and stored multiple values, you need to split them first
SELECT *
FROM YourTable t
WHERE EXISTS (SELECT 1
FROM STRING_SPLIT(trans_id, ',') s
WHERE s.value LIKE '20%' OR s.value LIKE '19%'
);
You can also use LEFT to get the first two characters of the string.
Then use IN for the list of years you need.
SELECT *
FROM YourTable t
WHERE LEFT(trans_id, 2) IN ('19', '20')
But don't use BETWEEN without casting the 2 digits to an INT.
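If a range comparison is preferred, the two-character prefix can be converted to a number first. A minimal sketch, assuming SQL Server 2012+ (for TRY_CAST) and made-up sample rows:

```sql
-- Hypothetical sample data; TRY_CAST returns NULL for prefixes that are not numeric,
-- so those rows simply fail the BETWEEN test instead of raising an error.
SELECT t.trans_id
FROM (VALUES ('20345'), ('19345'), ('18001'), ('21777'), ('AB123')) AS t(trans_id)
WHERE TRY_CAST(LEFT(t.trans_id, 2) AS INT) BETWEEN 19 AND 20;
-- Returns 20345 and 19345.
```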

How can I select only the records whose leftmost string characters are numeric in a SQL query?

I have a table AgentDetail, and I need to create a query that returns only the records whose code starts with 5 numeric digits.
The table has 3 columns:
AgentId, AgentName, AgentTextCode
The AgentTextCode column can contain 5 digits or any text value (sometimes 2-byte characters). The output should contain only those records whose value starts with 5 numeric digits (decimal values are not possible).
Sample data & output:
We can use LIKE here:
SELECT
AgentID, AgentName, AgentTextCode
FROM yourTable
WHERE AgentTextCode LIKE '[0-9][0-9][0-9][0-9][0-9]%';
SQL Server's LIKE operator supports character ranges such as [0-9], which gives it some primitive regex-like capabilities, as shown above.
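As a quick check of what this pattern accepts, the sketch below runs it against a few made-up codes (not the asker's data). Dropping the trailing % would require the value to be exactly five digits.

```sql
-- Hypothetical sample values for illustration.
SELECT v.AgentTextCode,
       CASE WHEN v.AgentTextCode LIKE '[0-9][0-9][0-9][0-9][0-9]%'
            THEN 1 ELSE 0 END AS starts_with_5_digits,
       CASE WHEN v.AgentTextCode LIKE '[0-9][0-9][0-9][0-9][0-9]'
            THEN 1 ELSE 0 END AS exactly_5_digits
FROM (VALUES (N'12345'), (N'12345ABC'), (N'1234A'), (N'ABC12'), (N'12.45')) AS v(AgentTextCode);
-- Only '12345' and '12345ABC' start with five digits; only '12345' is exactly five digits.
```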
You can use IsNumeric and Substring from TSQL:
SELECT
AgentID, AgentName, AgentTextCode
FROM yourTable
WHERE ISNUMERIC(Replace(Replace(substring(AgentTextCode, 1, 5),'+','A'),'-','A') + '.0e0') = 1;
GO
Reference here:
CAST and IsNumeric

Update wrong data to the average of a table

I have a table of data that should be integers from 1 to 7. But the data contained mistakes and non-numeric data so I saved the column as nvarchar-type variable. Now I would like to estimate the wrong data by the average of the correct data, i.e. if the value is not from 1 to 7, it should be updated to the average of the data in the same column where the average has been computed on those cells that have value 1,2, 3,4,5,6 or 7. The estimated value can be a float. How can I do that in MSSQL? I tried
SELECT AVG(CAST(ky1 AS FLOAT)) FROM esimerkkikysely
WHERE NOT ISNUMERIC(ky1)=1 OR ky1 NOT BETWEEN 1 AND 7
but it returned 0.
Also,
SELECT AVG(CAST(ky1 AS FLOAT)) FROM esimerkkikysely
WHERE ISNUMERIC(ky1)=1
returns about 4.643.
Try this. Please, PLEASE do all your updates in a new column (I've called it KY2 in the code below). The last thing you want to do is destroy the data you are working from, even if it is filled with errors.
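UPDATE esimerkkikysely
SET KY2 = CASE WHEN LTRIM(RTRIM(KY1)) IN ('1','2','3','4','5','6','7')
THEN CONVERT(FLOAT, KY1) -- valid value: copy it as a number
ELSE (SELECT AVG(CONVERT(FLOAT, e.KY1)) -- anything else: average of the valid values
FROM esimerkkikysely e
WHERE LTRIM(RTRIM(e.KY1)) IN ('1','2','3','4','5','6','7')) END
-- No outer WHERE: the CASE covers every row, so valid rows are copied into KY2
-- and invalid or NULL rows receive the average.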
I added the TRIM calls because, if the data import is as bad as you suggest, the chances of stray spaces having been imported and breaking the comparison seem quite high.
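A self-contained variation on the same idea, using a table variable with made-up rows: the average is computed once into a variable and then reused in the UPDATE. The names @t, ky2 and @avg are placeholders for this sketch, not part of the asker's schema.

```sql
DECLARE @t TABLE (ky1 NVARCHAR(30), ky2 FLOAT NULL);
INSERT INTO @t (ky1) VALUES (N'1'), (N' 7'), (N'x'), (NULL), (N'9');

-- Average of the valid answers only; the CASE yields NULL for everything else,
-- and AVG ignores NULLs.
DECLARE @avg FLOAT;
SELECT @avg = AVG(CASE WHEN LTRIM(RTRIM(ky1)) IN (N'1', N'2', N'3', N'4', N'5', N'6', N'7')
                       THEN CONVERT(FLOAT, LTRIM(RTRIM(ky1))) END)
FROM @t;

-- Valid rows keep their value; invalid or NULL rows receive the average.
UPDATE @t
SET ky2 = CASE WHEN LTRIM(RTRIM(ky1)) IN (N'1', N'2', N'3', N'4', N'5', N'6', N'7')
               THEN CONVERT(FLOAT, LTRIM(RTRIM(ky1)))
               ELSE @avg END;

SELECT ky1, ky2 FROM @t;
-- For the sample rows: ky2 is 1 and 7 for the two valid rows and 4 for the other three.
```

Wrapping the conversion in CASE keeps CONVERT away from values that are not in the valid list, which avoids relying on the order in which the filter and the conversion are evaluated.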
You only want the average of integers, between 1 and 7 inclusive, with tolerance to NULLs and strings, correct?
DECLARE @T1 TABLE (SuperColumn VARCHAR(30))
INSERT INTO @T1 VALUES ('2'), ('9874859'), ('JACKJACKSON'), ('1'), ('2'), ('2'), ('1'), ('3')
SELECT AVG(HisHighnessConverted)
FROM ( -- Do AVG only after filtering out problematic values.
SELECT TRY_CONVERT(float, SuperColumn) AS HisHighnessConverted -- TRY_CONVERT here too, so a bad value can never raise an error
FROM @T1
WHERE TRY_CONVERT(float, SuperColumn) BETWEEN 1 AND 7 -- Skips NULLs, failed converts, and successes outside of the BETWEEN range.
) AS T

SQL Server 2014: How to convert a VARCHAR column mixed with characters and numbers to corresponding numbers

I have a column called result in SQL Server 2014 which has various kinds of lab test results. The values for result can be characters, numbers (integer or decimals or scientific notations) like this:
positive
negative
not detect
n/a
101
15.3
78.002
-12.1
3.49952E-10
7.3E9
I want to only select those representing numbers, which are...
101
15.3
78.002
-12.1
3.49952E-10
7.3E9
And, I want to convert them into a numeric column with the corresponding values. I also want to get AVG, stdev, min, and max of them.
Can someone help me please?
Thanks a lot!
You could use the ISNUMERIC function to filter the rows and then CAST the values to a number:
DECLARE @SampleData AS TABLE (Value varchar(30))
INSERT INTO @SampleData
VALUES ('positive'),('negative'),('101'),('15.3'),
('78.002'),('-12.1'),('3.49952E-10'),('7.3E9')
SELECT CAST(sd.[Value] AS float) AS Value
FROM @SampleData sd
WHERE ISNUMERIC(sd.[Value]) = 1
Demo link: Rextester
In SQL Server 2012 and newer, you can also use the TRY_CAST function to try to convert a string to a numeric value - if it fails, it will not crash and burn, but instead just simply return NULL.
Based on that, you could use something like this:
-- define a CTE - an "inline" view which handles the conversion
;WITH CTE AS
(
SELECT NumValue = TRY_CAST(YourColumnName AS FLOAT)
FROM dbo.YourTable
)
-- select only those rows from the CTE that have a non-NULL "NumValue"
SELECT *
FROM CTE
WHERE NumValue IS NOT NULL
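Since the question also asks for AVG, STDEV, MIN and MAX, the same TRY_CAST idea extends to the aggregates. A sketch using the sample values listed in the question (the alias names are arbitrary); aggregate functions skip the NULLs produced by failed casts, so no extra filter is needed:

```sql
-- Requires SQL Server 2012+ for TRY_CAST.
SELECT AVG(x.NumValue)   AS Average,
       STDEV(x.NumValue) AS StdDev,
       MIN(x.NumValue)   AS MinValue,
       MAX(x.NumValue)   AS MaxValue
FROM (
    SELECT TRY_CAST(v.result AS FLOAT) AS NumValue   -- NULL for 'positive', 'n/a', etc.
    FROM (VALUES ('positive'), ('negative'), ('not detect'), ('n/a'),
                 ('101'), ('15.3'), ('78.002'), ('-12.1'),
                 ('3.49952E-10'), ('7.3E9')) AS v(result)
) AS x;
```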
You could also use pattern matching with the LIKE operator:
SELECT AVG(NumValue) AS Average
,STDEV(NumValue) AS StDev
,MIN(NumValue) AS Min
,MAX(NumValue) AS Max
FROM
(SELECT CONVERT(FLOAT,YourColumn) AS NumValue
FROM YourTable
WHERE YourColumn LIKE '%[0-9]%') x
This subquery selects any value that contains a digit, and the conversion raises an error if a value mixes digits with other text (anything other than exponential notation such as 3.49952E-10). In that case you could specify a stricter pattern after the LIKE operator.
By using the LIKE operator we can restrict the string data:
;WITH Cte (TextData)
AS
(
SELECT 'positive' UNION ALL
SELECT 'negative' UNION ALL
SELECT 'not detect' UNION ALL
SELECT 'n/a' UNION ALL
SELECT '101' UNION ALL
SELECT '15.3' UNION ALL
SELECT '78.002' UNION ALL
SELECT '-12.1' UNION ALL
SELECT '3.49952E-10' UNION ALL
SELECT '7.3E9'
)
SELECT *
FROM Cte
WHERE TextData LIKE '%[0-9]%'

Can I use a CASE statement to convert a varchar to decimal and use that in my WHERE clause?

I have a column which I want to convert to decimal so that I can compare it in my WHERE clause. I want to make sure all values from the column are greater than or equal to 1.3. I converted the column successfully in the SELECT list, but when attempting the same conversion in the WHERE clause I get the following error:
Arithmetic overflow error converting varchar to data type numeric.
I am using SQL Server 2008.
SELECT ID,
CASE
WHEN ISNUMERIC(USER_3) = 1
THEN Convert(varchar(50), CONVERT(decimal(14,2), USER_3))
END AS KG_M
FROM PART
WHERE USER_3 IS NOT NULL
AND CASE
WHEN ISNUMERIC(USER_3) = 1
THEN Convert(varchar(50), CONVERT(decimal(14,2), USER_3))
END >= 1.3
Sure, why not? Here's a self-contained example:
select a.ID
, b.KG_M
from (values
(1, N'12345678')
, (2, N'ABCDEFGH')
) as a (ID, USER_3)
cross apply (values(
case IsNumeric(a.USER_3)
when 1 then Convert(varchar(50), Convert(decimal(14, 2), a.USER_3))
else a.USER_3
end
)) as b (KG_M)
where b.KG_M >= '1.3';
We simply use the APPLY operator to contain our calculation for reuse later.
You need to choose one way to convert. I would use the native type for comparison, decimal.
SELECT * FROM
(
SELECT ID, KG_M=CAST(USER_3 AS decimal(14,2))
FROM PART
WHERE
ISNUMERIC(USER_3) = 1
)AS X
WHERE
X.KG_M >= 1.3
To allow strings that are not numbers in the output:
SELECT * FROM
(
SELECT
ID,
USER_3_AsDecimal=CASE WHEN ISNUMERIC(USER_3) = 1 THEN CAST(USER_3 AS decimal(14,2)) ELSE NULL END,
USER_3
FROM PART
WHERE
NOT USER_3 IS NULL
)AS X
WHERE
X.USER_3_AsDecimal IS NULL
OR
X.USER_3_AsDecimal >= 1.3
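For context on the error quoted in the question: one way "Arithmetic overflow error converting varchar to data type numeric" can arise is a value that passes ISNUMERIC but has more integer digits than decimal(14,2) can hold (12). The value below is made up for illustration, and this may or may not be what happened in the asker's table.

```sql
-- Hypothetical value with 13 integer digits; this sketch also runs on SQL Server 2008.
DECLARE @v varchar(50);
SET @v = '1234567890123';

SELECT ISNUMERIC(@v) AS isnumeric_flag;                 -- 1: the string is numeric

BEGIN TRY
    SELECT CONVERT(decimal(14,2), @v) AS too_narrow;    -- overflows: only 12 integer digits fit
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS error_message;            -- reports the arithmetic overflow
END CATCH;

SELECT CONVERT(decimal(38,2), @v) AS wide_enough;       -- succeeds with a wider precision
```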
The problem was a syntax error; the CASE in the WHERE clause was working the entire time.
"You should use >= '1.3' since you are converting to varchar" (credit to @Lamak in the comments).
