How to do a looping select union in Postgres - database

I'm trying to select from Table2 for each row returned by an initial select statement on Table1.
Initial select statement would look something like
Select * from Table1 Where GroupID='someId' AND ObjectID='someID'
Table1 would return something like the following (there will only ever be 0-1 returned rows with a null EndDate value).
Table2 will look as follows
Basically, my goal is to loop through the rows of the first select statement. For each row, I want to take the objectid, startdate, and enddate and select all the matching rows in Table2, then do a UNION(?) of everything I've selected from Table2. In other words, I want all the rows in Table2 whose objectid matches the given objectid and whose timestamp falls between the start and end dates (or between the start date and the current timestamp if enddate is null). Does that make sense? I've been looking online but have not found a way to achieve a looping select union like this.
I'm not sure if this is possible in a single select statement or if a stored proc is needed/cleaner (either is fine but I'd rather avoid stored proc if I can).

You can do it with a join:
SELECT t2.*
FROM Table2 t2 INNER JOIN Table1 t1
ON t1.objectid = t2.objectid
AND t2.timestamp BETWEEN t1.startdate AND COALESCE(t1.enddate, current_timestamp)
WHERE t1.GroupID = 'someId' AND t1.ObjectID = 'someID'
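The join above can be tried out with a minimal sketch like the following. It uses Python's built-in sqlite3 module rather than Postgres, and the sample rows, the Ts and Value columns, and the ISO date strings are assumptions made up for illustration; only the shape of the query follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (GroupID TEXT, ObjectID TEXT, StartDate TEXT, EndDate TEXT);
CREATE TABLE Table2 (ObjectID TEXT, Ts TEXT, Value INTEGER);
INSERT INTO Table1 VALUES
  ('g1', 'obj1', '2020-01-01', '2020-01-31'),
  ('g1', 'obj1', '2020-03-01', NULL);
INSERT INTO Table2 VALUES
  ('obj1', '2020-01-15', 1),
  ('obj1', '2020-02-15', 2),
  ('obj1', '2020-03-10', 3),
  ('obj2', '2020-01-15', 4);
""")

# One join replaces the "loop + UNION": each Table2 row is kept if it falls
# inside any matching Table1 range. A NULL EndDate means "up to now".
rows = conn.execute(
    """
    SELECT t2.ObjectID, t2.Ts, t2.Value
    FROM Table2 t2
    INNER JOIN Table1 t1
      ON t1.ObjectID = t2.ObjectID
     AND t2.Ts BETWEEN t1.StartDate AND COALESCE(t1.EndDate, CURRENT_TIMESTAMP)
    WHERE t1.GroupID = ? AND t1.ObjectID = ?
    ORDER BY t2.Ts
    """,
    ("g1", "obj1"),
).fetchall()

print(rows)
```

The row dated 2020-02-15 falls between the two ranges and is dropped, as is the row for obj2. If ranges in Table1 could overlap, the same Table2 row would appear once per matching range, so SELECT DISTINCT may be needed in that case.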

Related

Can you set multiple column names as a macro in SQL to query against?

Can you set multiple column names from a SQL table as a macro in SQL to query against?
For example I have multiple columns I am hitting against multiple times, can I use a macro or some type of reference to identify them ONCE to avoid displaying them repetitively and cluttering up the code?
The current code works, I am just looking for a cleaner/streamlined option.
Current Code:
WHERE ('ABC') IN
([CODE1],[CODE2],[CODE3],[CODE4],[CODE5],[CODE6],[CODE7],[CODE8]
,[CODE9],[CODE10],[CODE11],[CODE12],[CODE13],[CODE14],[CODE15]
,[CODE16],[CODE17],[CODE18],[CODE19],[CODE20],[CODE21],[CODE22]
,[CODE23],[CODE24],[CODE25])
AND ('CFS') IN
([CODE1],[CODE2],[CODE3],[CODE4],[CODE5],[CODE6],[CODE7],[CODE8]
,[CODE9],[CODE10],[CODE11],[CODE12],[CODE13],[CODE14],[CODE15]
,[CODE16],[CODE17],[CODE18],[CODE19],[CODE20],[CODE21],[CODE22]
,[CODE23],[CODE24],[CODE25])
etc. (20 more times)
Goal:
WHERE 'ABC' IN (&columnsmentionedabove)
OR 'FGS' in (&columnsmentionedabove)
OR 'g6s' in (&columnsmentionedabove)
etc.....
This is inherited code and just seems very clunky.
Thank you
Numbered columns like this are almost always a sign you should have an additional table. So if your existing table structure is like this:
Table1
Table1ID, OtherFields, Code1, Code2, Code3.... Code25
You really want something more like this:
Table1
Table1ID, OtherFields
Table1Codes
Table1ID, Code
Where each entry in Table1 will have many entries in Table1Codes. Then you write JOIN statements to show the two sets side-by-side when needed.
FROM Table1 t
INNER JOIN Table1Codes tc1 ON tc1.Table1ID = t.Table1ID AND tc1.Code = 'ABC'
INNER JOIN Table1Codes tc2 ON tc2.Table1ID = t.Table1ID AND tc2.Code = 'CFS'
Or
FROM Table1 t
INNER JOIN Table1Codes tc1 ON tc1.Table1ID = t.Table1ID AND tc1.Code IN ('ABC','FGS','g6s')
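As a rough illustration of the normalized layout, here is a small sketch using Python's built-in sqlite3 module instead of SQL Server. The sample IDs and codes are invented for the example; the table and column names follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (Table1ID INTEGER PRIMARY KEY);
CREATE TABLE Table1Codes (Table1ID INTEGER, Code TEXT);
INSERT INTO Table1 VALUES (1), (2), (3);
INSERT INTO Table1Codes VALUES
  (1, 'ABC'), (1, 'CFS'),
  (2, 'ABC'),
  (3, 'XYZ');
""")

# Rows that carry BOTH codes: one join per required code.
has_both = conn.execute("""
    SELECT t.Table1ID
    FROM Table1 t
    INNER JOIN Table1Codes tc1 ON tc1.Table1ID = t.Table1ID AND tc1.Code = 'ABC'
    INNER JOIN Table1Codes tc2 ON tc2.Table1ID = t.Table1ID AND tc2.Code = 'CFS'
    ORDER BY t.Table1ID
""").fetchall()

# Rows that carry ANY of the codes: a single join with IN.
# DISTINCT guards against duplicates when a row matches several codes.
has_any = conn.execute("""
    SELECT DISTINCT t.Table1ID
    FROM Table1 t
    INNER JOIN Table1Codes tc ON tc.Table1ID = t.Table1ID
     AND tc.Code IN ('ABC', 'FGS', 'g6s')
    ORDER BY t.Table1ID
""").fetchall()

print(has_both, has_any)
```

The first query corresponds to the AND form in the question's current code, and the second to the OR form in its stated goal.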
If you can't change the table's schema, as is often the case, you can UNPIVOT it. For example, assuming CODE1...CODE25 come from MyTable, wrap the UNPIVOT operation inside a CTE:
;WITH
cte AS
(
SELECT upvt.*
FROM MyTable
UNPIVOT (
CodeValue FOR CodeLabel IN ([CODE1], [CODE2], ..., [CODE25])
) upvt
)
SELECT *
FROM cte
WHERE CodeValue IN ('ABC', 'DEF', ...)
The unpivot operation is not free. Make sure you filter as much as possible from MyTable before unpivoting it.

Join columns to rows

Suppose you have a table Table1 with columns
UserId, Item1, Item2, Item3, Item4, Item5, Item6, Item7, Item8, Item9, Item10
and another table Table2 with columns
UserId, ItemId, Name
The values in the Item columns of Table1 are ItemId values from Table2. I need to display
UserId, ItemId, Name
as 10 rows, where Item1 is the first row and Item10 is the last row. If there's any way to avoid CASE WHEN, that would be great. I may have more columns in the future and would hate to hardcode the 10 columns.
I think you want a reverse pivot in this case. You don't use CASE, like you would in a normal pivot, but instead UNION ALL, like this:
select Table1.UserId, Table2.ItemId, Table2.Name
from Table1 inner join Table2 on Table1.Item1 = Table2.ItemId
UNION ALL
select Table1.UserId, Table2.ItemId, Table2.Name
from Table1 inner join Table2 on Table1.Item2 = Table2.ItemId
UNION ALL
...
select Table1.UserId, Table2.ItemId, Table2.Name
from Table1 inner join Table2 on Table1.Item10 = Table2.ItemId
If you have more items, you should also be able to write a snippet that generates the repeating UNION ALL syntax so you don't have to type it all by hand.
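A generator for the repeated branches could look roughly like this. The function name build_unpivot_sql is hypothetical, and the sketch assumes the columns are named Item1 through ItemN exactly as in the question.

```python
def build_unpivot_sql(item_count: int) -> str:
    """Return one SELECT per ItemN column, glued together with UNION ALL."""
    if item_count < 1:
        raise ValueError("item_count must be at least 1")
    branch = (
        "select Table1.UserId, Table2.ItemId, Table2.Name\n"
        "from Table1 inner join Table2 on Table1.Item{n} = Table2.ItemId"
    )
    # Column names come from a trusted integer range, not from user input.
    return "\nUNION ALL\n".join(
        branch.format(n=n) for n in range(1, item_count + 1)
    )


sql = build_unpivot_sql(3)
print(sql)
```

When more item columns are added later, only the argument changes. Note that UNION ALL by itself does not promise any row order, so an explicit sequence column and ORDER BY would be needed if Item1 must come first.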
If you don't have to do this entirely in SQL, I would highly recommend using e.g. R or Python to reshape the transactions into an ML-usable form. The gather function in the tidyr package does exactly what you want to do.
Another way is to cross-tabulate. It's absolutely fine to derive a solution in standard SQL, but a lot of problems are much easier to solve in R or Python.
A table1 with just 3 columns
userid, itemid, sequence
would be more conducive for your purposes. You would be required to convert your AzureML output from the single line
Uid1, itm1,itm2,itm3,...,itm10
into 10 lines like
Uid1, itm1, 1
Uid1, itm2, 2
Uid1, itm3, 3
...
Uid1, itm10,10
Assuming you get the above output line as a (temporary) table from AzureML named tbla, you could use the following UNION ALL construct (as suggested by Spencer Simpson):
INSERT INTO table1 (userid, itemid, sequence)
SELECT uid, itm1, 1 FROM tbla UNION ALL
SELECT uid, itm2, 2 FROM tbla UNION ALL
SELECT uid, itm3, 3 FROM tbla UNION ALL
SELECT uid, itm4, 4 FROM tbla UNION ALL
...
SELECT uid, itm10, 10 FROM tbla
This stores the information in table1, which will then be the only table you have to deal with. No JOINs will be required anymore.
Note: I am not quite sure what your column name relates to. Is it the name of an item or the name of a user?
In both cases there should be a second table table2 that takes care of the correspondence between name and userid/itemid like
itm/usr name
This table will then be joined into any query that also needs to display the name column.
What I did to work around this was to use Python (or R) and the melt function.
There is also a pivot_table function on the dataframe.
That way, you can convert your columns to rows and then join those rows to the other table.
See: Reshaping and Pivot Tables
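To make the idea concrete without assuming any dataframe library is installed, the following plain-Python sketch mimics the reshaping step that a melt function performs. It is not the library's implementation; the melt helper, the sample row, and the names lookup are all hypothetical stand-ins for Table1 and Table2.

```python
# Wide layout: one row per user, one column per item slot (stand-in for Table1).
wide_rows = [{"UserId": "u1", "Item1": 10, "Item2": 20, "Item3": 30}]

# Lookup from ItemId to Name (stand-in for Table2).
names = {10: "apple", 20: "pear", 30: "plum"}


def melt(rows, id_col):
    """Turn every non-id column of each row into its own (id, variable, value) row."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col != id_col:
                long_rows.append(
                    {id_col: row[id_col], "variable": col, "value": value}
                )
    return long_rows


long_rows = melt(wide_rows, "UserId")

# "Join" the long rows to the lookup to get UserId, ItemId, Name.
joined = [(r["UserId"], r["value"], names[r["value"]]) for r in long_rows]
print(joined)
```

Because the loop walks the columns in their original order, Item1 ends up first and the last item column ends up last, and new item columns are picked up without any code change.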

Subquery returned more than 1 value error after executing update query

I am using SQL Server 2012 and I have a strange problem updating a table.
My select query returns three rows and looks like this:
select * from
TAble1 p join
(select ProductId=max(ProductId) from Table2 s group by s.ProductId) pin on p.id=pin.ProductId
where p.categoryid=238
and the returned row is:
Now, when I run this update query:
update TAble1 set sizing=0 from
TAble1 p join
(select ProductId=max(ProductId) from TAble2 s group by s.ProductId) pin on p.id=pin.ProductId
where p.categoryid=238
I got this error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
The statement has been terminated.
Where is the problem in my query?
Looks like the problem that generates an exception is somewhere else (inside of a trigger for example).
This line could be the reason why there is more than one row updated
(select ProductId=max(ProductId) from TAble2 s group by s.ProductId)
If you want to obtain max ProductID (a single value) - remove it from GROUP BY clause. Currently you are requesting server to return maximum from a single value - which is absurd. It simply returns a list of all ProductID values from Table2. Which is the same as
select distinct ProductID from Table2
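The equivalence described above is easy to check with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and the sample ProductId values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table2 (ProductId INTEGER);
INSERT INTO Table2 VALUES (1), (1), (2), (3), (3);
""")

# MAX over groups of a single ProductId just returns that ProductId.
grouped = sorted(conn.execute(
    "SELECT MAX(ProductId) FROM Table2 GROUP BY ProductId"
).fetchall())

# Same result as asking for the distinct values directly.
distinct = sorted(conn.execute(
    "SELECT DISTINCT ProductId FROM Table2"
).fetchall())

# Dropping the GROUP BY yields the single overall maximum.
single_max = conn.execute("SELECT MAX(ProductId) FROM Table2").fetchone()[0]

print(grouped, distinct, single_max)
```

The grouped query returns one row per distinct ProductId, while the ungrouped one returns a single value.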
This query:
select ProductId=max(ProductId) from Table2 s group by s.ProductId
will give you the DISTINCT ProductId values, not the MAX. You also don't have a link to TAble1, so in fact your query will update the sizing column for all of TAble1.
Try this:
UPDATE p
SET sizing = 0
FROM TAble1 p
JOIN
(SELECT max(s.ProductId) AS ProductId
FROM TAble2 s) pin
ON p.id = pin.ProductId
WHERE p.categoryid = 238
I think a better question is what isn't the problem in your query. In your SELECT what exactly are you supposed to be joining? And what index are you joining it to? You're using a GROUP BY but not including the GROUP BY as a column in your SELECT. You don't need to alias the 'TAble2 s' in the subquery. TAble p doesn't have a categoryid column based on what you've shown. You shouldn't need the FROM clause in the UPDATE query in most cases, especially since you're just setting a column to a static value.
But to answer your question: the subquery: "select ProductId=max(ProductId) from TAble2 s group by s.ProductId" returns all ProductId rows, so it fails when you're trying to join.
Since you're not using info from Table2 why not just simply update like this:
update TAble1 set sizing=0 where categoryid=238

Joining a calculated field to a field in another table

I have created a variable table called Table_A which has two columns, Age and Age_Range. The Datatype for Age is integer.
The next stage is a select statement where I’m pulling the Order_Number and a calculated field from Table_B. I want to join the calculated field from Table_B with Age from Table_A, so that I can see what the range is against the calculated field and its order number.
My first attempt was:
SELECT Order_Number, DATEDIFF(DAY,Order_Date,CAST(GETDATE()AS DATE)) AS Ageing, Age_Range
FROM Table_B LEFT JOIN Table_A ON Table_B.Ageing = Table_A.Age_Range
This didn’t work and I understand why. Usually in Access, I would just build the first query with the calculated field and then build the second query joining the calculated field with the desired field from the table. I’ve been looking at sub queries and derived tables, which I believe may solve my problem, but I’m not having any luck. I know this is a basic question, but I’ve just started out with SQL.
Thanks
You cannot join like that because the SELECT list is evaluated after the FROM and JOIN clauses, so the alias Ageing does not exist yet when the ON condition is evaluated.
You can read about it here: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/70efeffe-76b9-4b7e-b4a1-ba53f5d21916/order-of-execution-of-sql-queries
You can make a workaround using CROSS APPLY
SELECT B.Order_Number
, T.Ageing
, A.Age_Range
FROM Table_B AS B
CROSS APPLY (SELECT DATEDIFF(DAY, B.Order_Date, GETDATE())) AS T(Ageing)
LEFT JOIN Table_A AS A
ON T.Ageing = A.Age -- join on the integer Age column, as described in the question
If elegant code is not necessary, repeat the expression in the join condition (joining on the integer Age column, as described in the question):
SELECT Order_Number, DATEDIFF(DAY,Order_Date,CAST(GETDATE() AS DATE)) AS Ageing, Age_Range
FROM Table_B LEFT JOIN Table_A ON DATEDIFF(DAY,Order_Date,CAST(GETDATE() AS DATE)) = Table_A.Age
Otherwise use CROSS APPLY as already suggested (performance will be the same). By the way, you do not need to CAST GETDATE() to DATE; DATEDIFF works without it, so you can simply write:
SELECT Order_Number, DATEDIFF(DAY,Order_Date,GETDATE()) AS Ageing, Age_Range
FROM Table_B LEFT JOIN Table_A ON DATEDIFF(DAY,Order_Date,GETDATE()) = Table_A.Age
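The derived-table approach the question mentions can be sketched as follows. This uses Python's built-in sqlite3 module, which has neither CROSS APPLY nor DATEDIFF, so the day difference is computed with SQLite's julianday function and a fixed "today" parameter instead of GETDATE(). The sample orders, the range labels, and the age_range helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_A (Age INTEGER, Age_Range TEXT);
CREATE TABLE Table_B (Order_Number INTEGER, Order_Date TEXT);
INSERT INTO Table_B VALUES (1001, '2024-01-01'), (1002, '2024-02-15');
""")


def age_range(age: int) -> str:
    """Hypothetical bucketing: one Table_A row per age, labelled with its range."""
    if age <= 30:
        return "0-30"
    if age <= 60:
        return "31-60"
    return "61-90"


conn.executemany(
    "INSERT INTO Table_A VALUES (?, ?)",
    [(age, age_range(age)) for age in range(0, 91)],
)

# The inner query computes Ageing first, so the outer join can refer to it by name.
rows = conn.execute(
    """
    SELECT b.Order_Number, b.Ageing, a.Age_Range
    FROM (
        SELECT Order_Number,
               CAST(julianday(:today) - julianday(Order_Date) AS INTEGER) AS Ageing
        FROM Table_B
    ) AS b
    LEFT JOIN Table_A AS a ON b.Ageing = a.Age
    ORDER BY b.Order_Number
    """,
    {"today": "2024-03-01"},
).fetchall()

print(rows)
```

This mirrors the two-step habit from Access: the subquery plays the role of the first saved query, and the outer query joins its calculated column to Table_A.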

Updating column with value from other table, can't use distinct function

My original data is in Table2. I created Table1 from scratch. I populated Column A like this:
INSERT INTO Table1("item")
SELECT DISTINCT(Table2."item")
FROM Table2
I populated Table1.Totals (Column B) like this:
UPDATE Table1
SET totals = t2.q
FROM Table1 INNER JOIN
(
SELECT t2."item"
, SUM(t2.quantity) AS q
FROM Table2 AS t2
GROUP BY t2."item"
) AS t2
ON Table1."item" = t2."item"
How can I populate Table1."date"? My UPDATE above doesn't work here because I can't use an aggregate function on a date. I was able to get the results I wanted using the following code in a separate query:
SELECT DISTINCT Table1."item"
, Table2."date"
FROM Table1 INNER JOIN Table2
ON Table1."item" = Table2."item"
ORDER BY Table1."item"
But how do I use the results of this query to SET the value of the column? I'm using SQL Server 2008.
If you can't do the insert all over again, as @Lamak suggested, then you could perform an UPDATE this way:
UPDATE t1
SET t1.Date = s.Date
FROM Table1 AS t1
INNER JOIN
(
SELECT Item, [Date] = MAX([Date]) -- or MIN()
FROM Table2
GROUP BY Item
) AS s
ON t1.Item = s.Item;
For SQL Server you could have used a single INSERT statement:
INSERT INTO Table1(Item, Totals, [Date])
SELECT Item, SUM(Quantity), MIN([Date]) -- It could be MAX([Date])
FROM Table2
GROUP BY Item
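A runnable sketch of the single-statement approach, using Python's built-in sqlite3 module in place of SQL Server. The sample rows are made up, and MIN is picked arbitrarily as the date aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table2 (item TEXT, quantity INTEGER, "date" TEXT);
CREATE TABLE Table1 (item TEXT, totals INTEGER, "date" TEXT);
INSERT INTO Table2 VALUES
  ('a', 2, '2020-01-05'),
  ('a', 3, '2020-01-02'),
  ('b', 1, '2020-02-01');
""")

# One pass over Table2 fills all three columns; dates can be aggregated
# with MIN or MAX even though they cannot be summed.
conn.execute("""
    INSERT INTO Table1 (item, totals, "date")
    SELECT item, SUM(quantity), MIN("date")
    FROM Table2
    GROUP BY item
""")

result = conn.execute(
    'SELECT item, totals, "date" FROM Table1 ORDER BY item'
).fetchall()
print(result)
```

Grouping by item already yields one row per distinct item, so the separate DISTINCT insert and the follow-up UPDATE statements are no longer needed.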
The easiest way is to use a simple CTAS (create table as select), which SQL Server writes as SELECT ... INTO. This creates Table1, so it must not already exist:
select item as item, SUM(quantity) as Q, MIN([date]) as d into Table1
from Table2
group by item
Instead of creating a table, you could create a view, using a select statement like the one in @Lamak's answer. That way you wouldn't have to refresh the new row set each time Table2 changes.
