Trying to Populate a Column with a Query - sql-server

So I'm doing a data mining project for one of my classes. As part of it, I'm trying to apply min-max normalization to some of the data, which is the easy part. The hard part has been actually inserting the results of the queries into the table.
At first, I tried an INSERT INTO statement...
insert into dbo.CountsA([TotalCountMinMAx])
SELECT
1.00*(TotalCount-MinCount)/CountRange as TotalCountMinMax
FROM
(
SELECT
TotalCount,
MIN(TotalCount) OVER () AS MinCount,
MAX(TotalCount) OVER () - MIN(TotalCount) OVER () AS CountRange
FROM
dbo.CountsA
) X
The subquery itself works fine, but the moment I tried inserting the results into the table, it only inserted a number of null records. So instead of, say, updating ten entries in the TotalCountMinMAx column, it created ten additional records, and set all the columns to NULL.
After busting my head trying to figure that out, I tried using an UPDATE query instead.
update dbo.CountsA
set [TotalCountMinMAx]=(
SELECT
1.00*(TotalCount-MinCount)/CountRange as TotalCountMinMax
FROM
(
SELECT
TotalCount,
MIN(TotalCount) OVER () AS MinCount,
MAX(TotalCount) OVER () - MIN(TotalCount) OVER () AS CountRange
FROM
dbo.CountsA
) X)
This query failed to run entirely.
"Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression."
At this point, short of digging out my old SQL book and basically relearning SQL from scratch (I am very, very rusty), I'm out of ideas for getting either of these queries to work.

In case 1, INSERT always adds new rows below the existing data; it never changes existing rows. For example, if your table looks like this:
ID   | Col
1    | 2
2    | 4
and you insert only the Col values 3 and 4, your table becomes:
ID   | Col
1    | 2
2    | 4
NULL | 3
NULL | 4
In case 2, if you use a subquery in an UPDATE like this:
UPDATE Your_Table
SET Col = (<sub-query>)
the subquery must return a single value.
You can correlate the subquery with the row being updated so that it returns a single value, like this:
UPDATE Your_Table
SET Col = (SELECT ... FROM Your_Table AS A
WHERE A.Col_ID = Your_Table.Col_ID)

The problem is that your subquery is not correlated with the row being updated, which is why you get the "Subquery returned more than 1 value" error.
Try updating through a CTE, which is easy and more readable.
;WITH cte
AS (SELECT 1.00 * ( TotalCount - MinCount ) / CountRange AS TotalCountMinMax_N,
           TotalCountMinMax
    FROM   (SELECT TotalCount,
                   TotalCountMinMax,
                   Min(TotalCount) OVER () AS MinCount,
                   Max(TotalCount) OVER () - Min(TotalCount) OVER () AS CountRange
            FROM   dbo.CountsA) X)
UPDATE cte
SET    TotalCountMinMax = TotalCountMinMax_N
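
To see the INSERT-versus-UPDATE difference end to end, here is a minimal sketch in Python using the standard sqlite3 module rather than SQL Server. The sample values are made up, and the scalar subqueries and NULLIF guard are my substitution for the windowed MIN/MAX in the CTE above, so treat it as an illustration of the idea and not as the T-SQL itself. The point is that UPDATE fills the column on the existing rows and the row count stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CountsA (TotalCount INTEGER, TotalCountMinMax REAL)")
# Hypothetical sample data; the normalized column starts out NULL.
conn.executemany("INSERT INTO CountsA (TotalCount) VALUES (?)",
                 [(10,), (20,), (30,), (50,)])

# UPDATE modifies the existing rows. Each scalar subquery returns exactly
# one value (the table-wide MIN or MAX), so there is no "more than 1 value"
# problem. NULLIF avoids dividing by zero when all counts are equal.
conn.execute("""
    UPDATE CountsA
    SET TotalCountMinMax =
        1.0 * (TotalCount - (SELECT MIN(TotalCount) FROM CountsA))
        / NULLIF((SELECT MAX(TotalCount) FROM CountsA)
                 - (SELECT MIN(TotalCount) FROM CountsA), 0)
""")

rows = conn.execute(
    "SELECT TotalCount, TotalCountMinMax FROM CountsA ORDER BY TotalCount"
).fetchall()
row_count = conn.execute("SELECT COUNT(*) FROM CountsA").fetchone()[0]
print(row_count, rows)
```

Running the same calculation through INSERT instead would have left these four rows untouched and appended four new ones.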


Can SQLite3 SELECT (up to) some rows based on a WHERE, and in the same query SELECT other rows on another condition?

Hi, it's my first time using SQLite, and I am trying to understand whether it can count the results it gets and switch conditions to find other results. I am not sure how to describe this, so I will use an example:
From a table of players, with the same query I'd like to:
SELECT (x) players WHERE WINS>0 AND WINS<= 3 and
(y) other players WHERE WINS = Null OR WINS=0.
x and y should be integer numbers, but they could vary.
I think I can split the query in 2 queries, but I am worried about the performance since in this way I have to connect to the db twice, and I have to check on the second query that the new IDs have not been selected already in the first one (this should never happen in this example scenario, but might happen with different conditions).
If it's possible to write this all in one single query, that would be much more "simple" and straightforward.
Unfortunately I have to stick to SQLite 3.7.17 and I can't make use of all the updates until now.
mytable is defined like this one:
CREATE TABLE mytable (
ID TEXT PRIMARY KEY,
NAME TEXT,
WINS INTEGER,
LOSSES INTEGER,
NOTES TEXT
);
These are some dummy values:
INSERT INTO mytable(ID, NAME, WINS, LOSSES, NOTES) VALUES
('A001','John',1,0,'blue'),
('A002','Mark',2,1,'blue'),
('A003','Hubert',NULL,NULL,'red'),
('A004','Otto',0,0,'green'),
('A005','Johnson',3,5,'red'),
('A006','Frank',NULL,1,'green');
As an example result the query should return:
#first part (WINS>0 AND WINS<=3) x = 3
('A005','Johnson','3','5','red')
('A001','John','1','0','blue')
('A002','Mark','2','1','blue')
#second part(WINS=0 OR WINS=Null) y = 2
('A006','Frank','null','1','green')
('A004','Otto','0','0','green')
So the result should be a table like this:
('A005','Johnson','3','5','red')
('A001','John','1','0','blue')
('A002','Mark','2','1','blue')
('A006','Frank','null','1','green')
('A004','Otto','0','0','green')
Thank you for your time and knowledge. :)
Use a CTE where you filter the table for the conditions that you want to apply and with a CASE expression get an integer for the condition that is satisfied for each row.
In another CTE use ROW_NUMBER() window function to rank the rows of each condition.
Finally filter the rows with a CASE expression:
WITH
  cond AS (
    SELECT *,
           CASE
             WHEN WINS > 0 AND WINS <= 3 THEN 1
             WHEN WINS IS NULL OR WINS = 0 THEN 2 -- or WHEN COALESCE(WINS, 0) = 0 THEN 2
           END AS condition
    FROM mytable
    WHERE condition IN (1, 2)
  ),
  cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY condition ORDER BY condition, WINS) rn
    FROM cond
  )
SELECT ID, NAME, WINS, LOSSES, NOTES
FROM cte
WHERE rn <= CASE condition
              WHEN 1 THEN 3 -- x = 3
              WHEN 2 THEN 2 -- y = 2
            END
ORDER BY condition, WINS DESC;
See the demo.
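
Since the question is pinned to SQLite 3.7.17, note that WITH arrived in SQLite 3.8.3 and window functions in 3.25.0, so the query above needs a newer library. A possible fallback that avoids both is to wrap each condition in its own subquery with a LIMIT and glue them together with UNION ALL. The sketch below does this from Python's sqlite3 module; the helper name pick_players and the ORDER BY choices inside each part are my assumptions, not something from the question. It is still a single statement and a single round trip, and because the two WHERE conditions cannot both be true for one row, no ID can appear twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE mytable (
    ID TEXT PRIMARY KEY, NAME TEXT, WINS INTEGER, LOSSES INTEGER, NOTES TEXT)""")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [("A001", "John", 1, 0, "blue"),
     ("A002", "Mark", 2, 1, "blue"),
     ("A003", "Hubert", None, None, "red"),
     ("A004", "Otto", 0, 0, "green"),
     ("A005", "Johnson", 3, 5, "red"),
     ("A006", "Frank", None, 1, "green")])


def pick_players(conn, x, y):
    """Return up to x rows with 0 < WINS <= 3 and up to y rows with no wins.

    Hypothetical helper: each LIMIT lives inside its own subquery because
    SQLite only allows one LIMIT at the end of a compound SELECT.
    """
    query = """
        SELECT * FROM (SELECT ID, NAME, WINS, LOSSES, NOTES FROM mytable
                       WHERE WINS > 0 AND WINS <= 3
                       ORDER BY WINS DESC LIMIT ?)
        UNION ALL
        SELECT * FROM (SELECT ID, NAME, WINS, LOSSES, NOTES FROM mytable
                       WHERE WINS IS NULL OR WINS = 0
                       ORDER BY ID LIMIT ?)
    """
    return conn.execute(query, (x, y)).fetchall()


players = pick_players(conn, 3, 2)
for player in players:
    print(player)
```

Which rows fill the second part depends on the inner ORDER BY; here it is ID, so Hubert and Otto are picked rather than Frank and Otto.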

T-SQL Selecting TOP 1 In A Query With Aggregates/Groups

I'm still fairly new to SQL. This is a stripped-down version of the query I'm trying to run. This query is supposed to find customers with more than 3 cases and display either the top 1 case or all cases, while still showing the overall count of cases per customer in each row in addition to all the case numbers.
The TOP 1 subquery approach didn't work, but is there another way to get the results I need? Hope that makes sense.
Here's the code:
SELECT t1.StoreID, t1.CustomerID, t2.LastName, t2.FirstName
,COUNT(t1.CaseNo) AS CasesCount
,(SELECT TOP 1 t1.CaseNo)
FROM MainDatabase t1
INNER JOIN CustomerDatabase t2
ON t1.StoreID = t2.StoreID
WHERE t1.SubmittedDate >= '01/01/2017' AND t1.SubmittedDate <= '05/31/2017'
GROUP BY t1.StoreID, t1.CustomerID, t2.LastName, t2.FirstName
HAVING COUNT (t1.CaseNo) >= 3
ORDER BY t1.StoreID, t1.PatronID
I would like it to look something like this, either one row with just the most recent case and detail or several rows showing all details of each case in addition to the store id, customer id, last name, first name, and case count.
Data Example
For these I usually like to make a temp table of aggregates:
DROP TABLE IF EXISTS #tmp;
CREATE TABLE #tmp (
    CustomerID int NOT NULL DEFAULT 0,
    case_count int NOT NULL DEFAULT 0,
    case_max int NOT NULL DEFAULT 0
);
INSERT INTO #tmp
    (CustomerID, case_count, case_max)
SELECT t1.CustomerID, COUNT(t1.CaseNo), MAX(t1.CaseNo)
FROM MainDatabase t1
GROUP BY t1.CustomerID;
Then you can join this "tmp" table back to any other table you want to display the number of cases on, or the max case number on. And you can limit it to customers that have more than 3 cases with WHERE case_count > 3
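
As a rough illustration of that join-back step, here is a small Python sqlite3 sketch. It uses a SQLite TEMP table named tmp in place of the T-SQL #tmp table, and the customer and case numbers are invented for the example. Each detail row ends up carrying the per-customer count and the highest case number alongside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MainDatabase (CustomerID INTEGER, CaseNo INTEGER)")
# Hypothetical data: customer 1 has four cases, customer 2 has two.
conn.executemany("INSERT INTO MainDatabase VALUES (?, ?)",
                 [(1, 101), (1, 102), (1, 103), (1, 104), (2, 201), (2, 202)])

# One row of aggregates per customer (stand-in for the #tmp table).
conn.execute("""
    CREATE TEMP TABLE tmp AS
    SELECT CustomerID,
           COUNT(CaseNo) AS case_count,
           MAX(CaseNo)   AS case_max
    FROM MainDatabase
    GROUP BY CustomerID
""")

# Join the aggregates back to the detail rows and keep only customers
# with more than 3 cases.
detail = conn.execute("""
    SELECT m.CustomerID, m.CaseNo, t.case_count, t.case_max
    FROM MainDatabase AS m
    JOIN tmp AS t ON t.CustomerID = m.CustomerID
    WHERE t.case_count > 3
    ORDER BY m.CaseNo
""").fetchall()
for row in detail:
    print(row)
```

To show only the most recent case per customer instead of every case, the same join could add a filter such as m.CaseNo = t.case_max, assuming the highest case number is the most recent one.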

How to compare records in same SQL Server table

My requirement is to compare each column of a row with the same column in the previous row.
Compare row 2 with row 1
Compare row 3 with row 2
Also, if there is no difference, I need to make that column NULL. Eg: request_status_id of row 3 is same as that of row 2 so I need to update request_status_id of row 3 to NULL.
Is there a clean way to do this?
You can use the following UPDATE statement, which employs the LAG window function available from SQL Server 2012 onwards:
UPDATE #mytable
SET request_status_id = NULL
FROM #mytable AS m
INNER JOIN (
SELECT payment_history_id, request_status_id,
LAG(request_status_id) OVER(ORDER BY payment_history_id) AS prevRequest_status_id
FROM #mytable ) t
ON m.payment_history_id = t.payment_history_id
WHERE t.request_status_id = t.prevRequest_status_id
SQL Fiddle Demo here
EDIT:
It seems the OP's requirement is to set every column of the table to NULL when its previous value is the same as the current value. In this case the query becomes a bit more verbose. Here is an example with two columns being set. It can easily be expanded to incorporate any other column of the table:
UPDATE #mytable
SET request_status_id = CASE WHEN t.request_status_id = t.prevRequest_status_id THEN NULL
                             ELSE t.request_status_id
                        END,
    request_entity_id = CASE WHEN t.request_entity_id = t.prevRequest_entity_id THEN NULL
                             ELSE t.request_entity_id
                        END
FROM #mytable AS m
INNER JOIN (
    SELECT payment_history_id, request_status_id, request_entity_id,
           LAG(request_status_id) OVER(ORDER BY payment_history_id) AS prevRequest_status_id,
           LAG(request_entity_id) OVER(ORDER BY payment_history_id) AS prevRequest_entity_id
    FROM #mytable ) t
ON m.payment_history_id = t.payment_history_id
SQL Fiddle Demo here
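
For anyone who wants to try the previous-row comparison without a SQL Server instance, here is a hedged sketch using Python's sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (needed for LAG), and the four sample rows are made up. Unlike the single UPDATE ... FROM statement above, it first snapshots the LAG comparisons with a SELECT and then applies the updates, so every comparison is made against the original values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE mytable (
    payment_history_id INTEGER PRIMARY KEY,
    request_status_id  INTEGER,
    request_entity_id  INTEGER)""")
# Hypothetical rows; row 3 repeats the status of row 2, and so on.
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)",
                 [(1, 10, 7), (2, 20, 7), (3, 20, 8), (4, 20, 8)])

# Step 1: for each row, record whether each column equals the previous
# row's value. For the first row LAG yields NULL, so the flag is NULL.
flags = conn.execute("""
    SELECT payment_history_id,
           request_status_id = LAG(request_status_id)
               OVER (ORDER BY payment_history_id) AS same_status,
           request_entity_id = LAG(request_entity_id)
               OVER (ORDER BY payment_history_id) AS same_entity
    FROM mytable
""").fetchall()

# Step 2: blank out the unchanged columns, using the snapshot from step 1.
for row_id, same_status, same_entity in flags:
    if same_status:
        conn.execute("UPDATE mytable SET request_status_id = NULL "
                     "WHERE payment_history_id = ?", (row_id,))
    if same_entity:
        conn.execute("UPDATE mytable SET request_entity_id = NULL "
                     "WHERE payment_history_id = ?", (row_id,))

result = conn.execute(
    "SELECT * FROM mytable ORDER BY payment_history_id").fetchall()
for row in result:
    print(row)
```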

T-SQL First_Value() - How can it return more than a single value?

I have the following query:
Select PH.SubId
From dbo.PanelHistory PH
Where
PH.Scribe2Time <> (Select FIRST_VALUE(ReadTimeLocal) OVER (Order By ReadTimeLocal) From dbo.PanelWorkflow Where ProcessNumber = 2690 And dbo.PanelWorkflow.SubId = PH.SubId)
I'm getting an error (512) that says: Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
How can the subquery return more than a single value? There can only be one first value. I must be overlooking something with this query.
By the way, I realize I could easily use Min() instead of First_Value, but I wanted to experiment with some of these Windowing functions.
How many rows do you see?
SELECT FIRST_VALUE(name) OVER (ORDER BY create_date) AS RN
FROM sys.objects
Even though there is only one distinct first value it still returns it for every row in the query.
So if the subquery itself matches multiple rows, you will get this error. You could get rid of it with DISTINCT or TOP 1.
Probably not very efficient but you say this is just for experimental purposes.
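
The row-count behaviour is easy to check for yourself. The sketch below uses Python's sqlite3 module (assuming SQLite 3.25+ for window functions) with a made-up three-row PanelWorkflow table. The windowed query returns the same first value three times, while DISTINCT or a plain MIN collapses it to one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PanelWorkflow (SubId INTEGER, ReadTimeLocal TEXT)")
# Hypothetical data: three reads for the same SubId.
conn.executemany("INSERT INTO PanelWorkflow VALUES (?, ?)",
                 [(1, "2014-01-03"), (1, "2014-01-01"), (1, "2014-01-02")])

# A window function adds a column; it does not reduce the number of rows.
windowed = conn.execute("""
    SELECT FIRST_VALUE(ReadTimeLocal) OVER (ORDER BY ReadTimeLocal)
    FROM PanelWorkflow
    WHERE SubId = 1
""").fetchall()

# DISTINCT collapses the repeated value to a single row.
distinct = conn.execute("""
    SELECT DISTINCT FIRST_VALUE(ReadTimeLocal) OVER (ORDER BY ReadTimeLocal)
    FROM PanelWorkflow
    WHERE SubId = 1
""").fetchall()

# A plain aggregate also returns exactly one row.
aggregated = conn.execute(
    "SELECT MIN(ReadTimeLocal) FROM PanelWorkflow WHERE SubId = 1"
).fetchall()

print(len(windowed), len(distinct), len(aggregated))
```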
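This isn't an answer. It's just an extended comment prompted by the following statement:
I could easily use Min() instead of First_Value, but I wanted to experiment with some of these Windowing functions.
MIN can't always be used instead of FIRST_VALUE. As the example below shows, the two can differ when the column being returned is not the column the window is ordered by.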
Example:
SET NOCOUNT ON;
DECLARE @MyTable TABLE(ID INT, TranDate DATETIME)
INSERT @MyTable VALUES (1, '2012-02-02'), (2, '2011-01-01'), (3, '2013-03-03')
SELECT MIN(ID) AS MIN_ID FROM @MyTable
SELECT ID, MIN(ID) OVER(ORDER BY TranDate) AS MIN_ID_ORDER_BY FROM @MyTable;
SELECT ID, FIRST_VALUE(ID) OVER(ORDER BY TranDate) AS FIRST_VALUE_ID_ORDER_BY FROM @MyTable;
Results:
MIN_ID
-----------
1

ID          MIN_ID_ORDER_BY
----------- ---------------
2           2
1           1
3           1

ID          FIRST_VALUE_ID_ORDER_BY
----------- -----------------------
2           2
1           2
3           2
FIRST_VALUE() will still return a row for every record that meets your WHERE clause. TOP 1 should work:
Select PH.SubId
From dbo.PanelHistory PH
Where
PH.Scribe2Time <> (Select TOP 1 ReadTimeLocal
From dbo.PanelWorkflow
Where ProcessNumber = 2690
And dbo.PanelWorkflow.SubId = PH.SubId
Order By ReadTimeLocal)
or MIN:
Select PH.SubId
From dbo.PanelHistory PH
Where
PH.Scribe2Time <> (Select MIN(ReadTimeLocal)
From dbo.PanelWorkflow
Where ProcessNumber = 2690
And dbo.PanelWorkflow.SubId = PH.SubId)
The OVER/PARTITION BY functions are column functions that can look at other rows. They aren't row functions; by that I mean they don't affect an entire row, the number of rows returned, etc. An OVER aggregate can depend on values in other rows, but the tangible result is only to calculate a single column in the current row.
You may have seen something similar to what you are trying to do with the ROW_NUMBER() ranking function. Multiple rows are still returned, but only one of them has a ROW_NUMBER of 1. The rest are filtered out in an enclosing WHERE or JOIN predicate.

Updating multiple rows previous to a particular date

I'm working with a website that had an update done to its business layer, so now I need to convert the old data to match the way new data is saved. I'm writing a SQL query that updates a couple of data columns on a member table, where the person_id matches on the member and registration code tables, a registration code is present on the reg code table, and that reg code is flagged as used before a certain date on the reg code table.
UPDATE vs_member
SET premium_acct = 1, tenant_reg_key = (
SELECT DISTINCT tenant_reg_key
FROM vs_tenant_reg_key_tbl t
WHERE person_id = t.person_id)
WHERE person_id in (
SELECT t2.person_id
FROM vs_tenant_reg_key_tbl t2
WHERE person_id = t2.person_id AND t2.used = 1 AND premium_acct = 0 AND date_joined <= '2013-11-08')
I'm receiving an error stating the following: Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Not exactly sure how to handle that, any help would be appreciated.
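In your subselect, WHERE person_id = t.person_id probably isn't doing what you think. The unqualified person_id resolves to the subquery's own table (t) rather than to vs_member, so the condition compares t.person_id with itself and matches every row with a non-NULL person_id.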
Consider writing your UPDATE statement using joins instead of subselects. Perhaps something like this would do what you want.
UPDATE V
SET premium_acct = 1, tenant_reg_key = T.tenant_reg_key
FROM vs_member V
INNER JOIN vs_tenant_reg_key_tbl T ON T.person_id = V.person_id
WHERE T.used = 1 AND T.premium_acct = 0 AND T.date_joined < '2013-11-08'
Check if the following part of your query is returning more than one entry:
SELECT DISTINCT tenant_reg_key
FROM vs_tenant_reg_key_tbl t
WHERE person_id = t.person_id
If so, you won't be able to set tenant_reg_key to the results of this query, as it's essentially trying to add more than one value -- once, mind you -- to one entry.
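
One way to run that check for every person at once is to group by person_id and look for more than one distinct key. The sketch below does this with Python's sqlite3 module on invented data; only the table and column names come from the question, and the key values are placeholders. Any person_id it returns is one for which a scalar subquery would produce more than one value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vs_tenant_reg_key_tbl (person_id INTEGER, tenant_reg_key TEXT)")
# Hypothetical data: person 2 has two different keys, person 3 has the
# same key twice (DISTINCT collapses that to one value).
conn.executemany("INSERT INTO vs_tenant_reg_key_tbl VALUES (?, ?)",
                 [(1, "K-100"), (2, "K-200"), (2, "K-201"),
                  (3, "K-300"), (3, "K-300")])

# List every person whose keys cannot be reduced to a single value.
ambiguous = conn.execute("""
    SELECT person_id, COUNT(DISTINCT tenant_reg_key) AS key_count
    FROM vs_tenant_reg_key_tbl
    GROUP BY person_id
    HAVING COUNT(DISTINCT tenant_reg_key) > 1
    ORDER BY person_id
""").fetchall()
print(ambiguous)
```

If this list is non-empty, you have to decide which key wins (for example the most recent one) before the UPDATE can assign a single value per member.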
