Join columns to rows - sql-server

Suppose you have a table Table1 with columns
UserId, Item1, Item2, Item3, Item4, Item5, Item6, Item7, Item8, Item9, Item10
and you have another table Table2 with
UserId, ItemId, Name
. The values in Table1 are the ItemId values from Table2. I need to display
UserId, ItemId, Name
with one row per item column, so 10 rows per user: Item1 is the 1st row and Item10 is the last row. If there's any way to avoid CASE WHEN, that would be great. I may have more columns in the future and would hate to hardcode the 10 columns.

I think you want a reverse pivot in this case. You don't use CASE, like you would in a normal pivot, but instead UNION ALL, like this:
select Table1.UserId, Table2.ItemId, Table2.Name
from Table1 inner join Table2 on Table1.Item1 = Table2.ItemId
UNION ALL
select Table1.UserId, Table2.ItemId, Table2.Name
from Table1 inner join Table2 on Table1.Item2 = Table2.ItemId
UNION ALL
...
select Table1.UserId, Table2.ItemId, Table2.Name
from Table1 inner join Table2 on Table1.Item10 = Table2.ItemId
If you have more items, you should also be able to write a snippet that generates the repeating UNION ALL syntax so you don't have to type it all by hand.
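As a rough sketch of such a generator, the Python below builds one SELECT per item column and glues them together with UNION ALL. The function name build_unpivot_sql, the extra Position column (added so the rows can be ordered Item1 first), and the sample data are assumptions for illustration; SQLite stands in for SQL Server only so the generated statement can be run.

```python
import sqlite3

def build_unpivot_sql(n_items: int) -> str:
    """Return one SELECT per ItemN column, joined together with UNION ALL."""
    branches = [
        f"select Table1.UserId as UserId, {i} as Position, "
        f"Table2.ItemId as ItemId, Table2.Name as Name\n"
        f"from Table1 inner join Table2 on Table1.Item{i} = Table2.ItemId"
        for i in range(1, n_items + 1)
    ]
    # Position is a hypothetical helper column that preserves the column order.
    return "\nUNION ALL\n".join(branches) + "\norder by UserId, Position"

# Tiny demo with 3 item columns instead of 10 (made-up data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Table1 (UserId integer, Item1 integer, Item2 integer, Item3 integer);
    create table Table2 (ItemId integer, Name text);
    insert into Table1 values (1, 30, 10, 20);
    insert into Table2 values (10, 'apple'), (20, 'pear'), (30, 'plum');
""")
rows = conn.execute(build_unpivot_sql(3)).fetchall()
for row in rows:
    print(row)
```

Adding an eleventh column then only means calling build_unpivot_sql(11) instead of editing the query by hand.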

If you are not required to do this entirely in SQL, I would highly recommend using R or Python to get the transactions into a shape that is usable for ML. The gather function in the tidyr package does exactly what you want.
Another way is to cross-tabulate. It's absolutely fine to derive a solution in standard SQL, but many problems like this are much easier to solve in R or Python.

A table1 with just 3 columns
userid, itemid, sequence
would be more conducive for your purposes. You would be required to convert your AzureML output from the single line
Uid1, itm1,itm2,itm3,...,itm10
into 10 lines like
Uid1, itm1, 1
Uid1, itm2, 2
Uid1, itm3, 3
...
Uid1, itm10,10
Assuming you get the above output line as a (temporary) table from AzureML named tbla, you could use the following UNION ALL construct (as suggested by Spencer Simpson):
INSERT INTO table1 (userid, itemid, sequence)
SELECT uid, itm1, 1 FROM tbla UNION ALL
SELECT uid, itm2, 2 FROM tbla UNION ALL
SELECT uid, itm3, 3 FROM tbla UNION ALL
SELECT uid, itm4, 4 FROM tbla UNION ALL
...
SELECT uid, itm10, 10 FROM tbla
This stores the information in table1, which will then be the only table you have to deal with. No JOINs will be required anymore.
Note: I am not quite sure what your column name relates to. Is it the name of an item or the name of a user?
In both cases there should be a second table table2 that takes care of the correspondence between name and userid/itemid like
itm/usr name
This table will then be joined into any query that also needs to display the name column.

What I did to work around this was to use Python (or R) and use the melt function.
There is also a pivot_table function in the dataframe.
So, you can have your columns be converted to rows. Then join those rows on the other table.
Reshaping and Pivot Tables
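To make the wide-to-long idea concrete without depending on any library, here is a stdlib-only sketch of the kind of reshaping a melt performs. It is not the pandas API: the function melt_rows, the "variable"/"value" keys, and the sample data are hypothetical, and the dictionary lookup stands in for the join against the other table.

```python
def melt_rows(rows, id_key, value_keys):
    """Turn each wide row into one (id, variable, value) row per value column."""
    long_rows = []
    for row in rows:
        for key in value_keys:
            long_rows.append({id_key: row[id_key], "variable": key, "value": row[key]})
    return long_rows

# One wide row, as in Table1 (made-up values, 3 item columns for brevity).
wide = [{"UserId": 1, "Item1": 30, "Item2": 10, "Item3": 20}]
# Stands in for Table2: ItemId -> Name.
names = {10: "apple", 20: "pear", 30: "plum"}

long_rows = melt_rows(wide, "UserId", ["Item1", "Item2", "Item3"])
# "Join" each melted row to its name; the column order is preserved.
joined = [(r["UserId"], r["value"], names[r["value"]]) for r in long_rows]
print(joined)
```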

Related

How to do a looping select union in Postgres

I'm trying to do select from Table2 for each row in an initial select statement from Table1.
Initial select statement would look something like
Select * from Table1 Where GroupID='someId' AND ObjectID='someID'
Table1 would return something like the following (there will only ever be 0-1 returned rows with a null EndDate value).
Table2 will look as follows
Basically my goal is to loop through the rows of the first select statement. For each row I want to take the objectid, startdate, and enddate to select all the appropriate rows in Table2, and afterwards do a UNION(?) on all the data I've selected from Table2. I want to be able to select all the rows in Table2 whose objectid matches the given objectid and whose timestamp is between the start/end dates (or up to the current timestamp if enddate is null). Does that make sense? I've been looking online, but have not found a way to achieve a looping select union like this.
I'm not sure if this is possible in a single select statement or if a stored proc is needed/cleaner (either is fine but I'd rather avoid stored proc if I can).
You can do it with a join:
SELECT t2.*
FROM Table2 t2 INNER JOIN Table1 t1
ON t1.objectid = t2.objectid
AND t2.timestamp BETWEEN t1.startdate AND COALESCE(t1.enddate, current_timestamp)
WHERE t1.GroupID = 'someId' AND t1.ObjectID = 'someID'
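The join above can be tried out on a small data set. The sketch below uses SQLite from Python as a stand-in for Postgres; the column types, the reading column, and all sample rows are assumptions, since the original tables were not shown. Timestamps are stored as ISO-8601 text so that BETWEEN compares them correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (GroupID TEXT, ObjectID TEXT, startdate TEXT, enddate TEXT);
    CREATE TABLE Table2 (objectid TEXT, timestamp TEXT, reading INTEGER);
    -- One closed interval and one still-open interval (NULL enddate).
    INSERT INTO Table1 VALUES
        ('g1', 'obj1', '2020-01-01 00:00:00', '2020-01-31 23:59:59'),
        ('g1', 'obj1', '2020-03-01 00:00:00', NULL);
    INSERT INTO Table2 VALUES
        ('obj1', '2020-01-15 12:00:00', 1),   -- inside the first interval
        ('obj1', '2020-02-15 12:00:00', 2),   -- between the two intervals
        ('obj1', '2020-03-15 12:00:00', 3),   -- inside the open interval
        ('obj2', '2020-01-15 12:00:00', 4);   -- different object
""")
rows = conn.execute("""
    SELECT t2.reading
    FROM Table2 t2 INNER JOIN Table1 t1
      ON t1.ObjectID = t2.objectid
     AND t2.timestamp BETWEEN t1.startdate AND COALESCE(t1.enddate, current_timestamp)
    WHERE t1.GroupID = ? AND t1.ObjectID = ?
    ORDER BY t2.reading
""", ("g1", "obj1")).fetchall()
print(rows)
```

Each Table1 row acts as one iteration of the "loop", and the join collects the matching Table2 rows for all of them at once, so no explicit UNION is needed.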

Can you set multiple column names as a macro in SQL to query against?

Can you set multiple column names from a SQL table as a macro in SQL to query against?
For example I have multiple columns I am hitting against multiple times, can I use a macro or some type of reference to identify them ONCE to avoid displaying them repetitively and cluttering up the code?
The current code works, I am just looking for a cleaner/streamlined option.
Current Code:
WHERE ('ABC') IN
([CODE1],[CODE2],[CODE3],[CODE4],[CODE5],[CODE6],[CODE7],[CODE8]
,[CODE9],[CODE10],[CODE11],[CODE12],[CODE13],[CODE14],[CODE15]
,[CODE16],[CODE17],[CODE18],[CODE19],[CODE20],[CODE21],[CODE22]
,[CODE23],[CODE24],[CODE25])
AND ('CFS') IN
([CODE1],[CODE2],[CODE3],[CODE4],[CODE5],[CODE6],[CODE7],[CODE8]
,[CODE9],[CODE10],[CODE11],[CODE12],[CODE13],[CODE14],[CODE15]
,[CODE16],[CODE17],[CODE18],[CODE19],[CODE20],[CODE21],[CODE22]
,[CODE23],[CODE24],[CODE25])
etc. (20 more times)
Goal:
WHERE 'ABC' IN (&columnsmentionedabove)
OR 'FGS' in (&columnsmentionedabove)
OR 'g6s' in (&columnsmentionedabove)
etc.....
This is inherited code and just seems very clunky.
Thank you
Numbered columns like this are almost always a sign you should have an additional table. So if your existing table structure is like this:
Table1
Table1ID, OtherFields, Code1, Code2, Code3.... Code25
You really want something more like this:
Table1
Table1ID, OtherFields
Table1Codes
Table1ID, Code
Where each entry in Table1 will have many entries in Table1Codes. Then you write JOIN statements to show the two sets side-by-side when needed.
FROM Table1 t
INNER JOIN Table1Codes tc1 ON tc1.Table1ID = t.Table1ID AND tc1.Code = 'ABC'
INNER JOIN Table1Codes tc2 ON tc2.Table1ID = t.Table1ID AND tc2.Code = 'CFS'
Or
FROM Table1 t
INNER JOIN Table1Codes tc ON tc.Table1ID = t.Table1ID AND tc.Code IN ('ABC','FGS','g6s')
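A minimal sketch of the normalized layout, run through SQLite from Python so it can be tested; the sample rows and the SELECT lists are made up for illustration. The first query finds rows that carry both codes (one join per required code), the second finds rows that carry any of the listed codes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Table1ID INTEGER PRIMARY KEY, OtherFields TEXT);
    CREATE TABLE Table1Codes (Table1ID INTEGER, Code TEXT);
    INSERT INTO Table1 VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO Table1Codes VALUES
        (1, 'ABC'), (1, 'CFS'),
        (2, 'ABC'),
        (3, 'g6s');
""")

# Rows that carry BOTH codes: each join uses its own alias.
both = conn.execute("""
    SELECT t.Table1ID
    FROM Table1 t
    INNER JOIN Table1Codes tc1 ON tc1.Table1ID = t.Table1ID AND tc1.Code = 'ABC'
    INNER JOIN Table1Codes tc2 ON tc2.Table1ID = t.Table1ID AND tc2.Code = 'CFS'
    ORDER BY t.Table1ID
""").fetchall()

# Rows that carry ANY of the codes: a single join with IN.
# DISTINCT guards against a row matching more than one code.
any_code = conn.execute("""
    SELECT DISTINCT t.Table1ID
    FROM Table1 t
    INNER JOIN Table1Codes tc ON tc.Table1ID = t.Table1ID
                             AND tc.Code IN ('ABC', 'FGS', 'g6s')
    ORDER BY t.Table1ID
""").fetchall()

print(both, any_code)
```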
If you can't change the table's schema, as is often the case, you can UNPIVOT it. For example, assuming CODE1...CODE25 come from MyTable, wrap the UNPIVOT operation inside a CTE:
;WITH
cte AS
(
SELECT upvt.*
FROM MyTable
UNPIVOT (
CodeValue FOR CodeLabel IN ([CODE1], [CODE2], ..., [CODE25])
) upvt
)
SELECT *
FROM cte
WHERE CodeValue IN ('ABC', 'DEF', ...)
The unpivot operation is not free. Make sure you filter as much as possible from MyTable before unpivoting it.
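If typing out the bracketed column list is the tedious part, it can be generated. The helper below is a hypothetical convenience (the name unpivot_column_list is not from any library) and assumes the columns really are named CODE1 through CODE25 with no gaps.

```python
def unpivot_column_list(prefix: str, count: int) -> str:
    """Build '[CODE1], [CODE2], ...' for use inside UNPIVOT ... IN (...)."""
    return ", ".join(f"[{prefix}{i}]" for i in range(1, count + 1))

columns = unpivot_column_list("CODE", 25)
# Paste the result into the UNPIVOT clause of the CTE.
unpivot_clause = f"UNPIVOT (\n    CodeValue FOR CodeLabel IN ({columns})\n) upvt"
print(unpivot_clause)
```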

Alternative of UNION in sql server

I have 2 tables which contain 5 unique cities each. I want all 10 cities, but I don't want to use UNION. Is there any alternative to UNION?
SELECT DISTINCT CITY FROM TABLE1
UNION
SELECT DISTINCT CITY FROM TABLE2
Here is an alternate way
SELECT DISTINCT CASE WHEN a.city IS NULL THEN b.city ELSE a.city END AS city
FROM Table1 a FULL JOIN Table2 b ON 1 = 0
It offers no advantage over UNION, but you might be interested in seeing FULL JOIN, which has its similarities to UNION.
You can apply Full Outer join instead of Union
SELECT DISTINCT ISNULL(t.City,t1.City)
FROM dbo.TABLE1 t
FULL OUTER JOIN dbo.TABLE2 t1 ON t.City = t1.City;
This query provides you the same result as union
You can insert the data that you want into a temporary table and retrieve it from there. That will avoid the need for a UNION.
SELECT DISTINCT CITY
INTO #City
FROM TABLE1
INSERT INTO #City
SELECT DISTINCT CITY
FROM TABLE2
SELECT DISTINCT City
FROM #City
If the first table is sure to contain all the records of the second table, then one can check whether the id is found by a subquery, combined with an OR clause.
I'm using an ORM framework which doesn't support the UNION operator (Apache OJB) and, with the above assumption, this strategy has proven to be faster than with the use of FULL OUTER JOIN.
For instance, if the table STUDENT contains all the students of a province/state with a field for their current main school, and another table, STUDENT_SECONDARY_SCHOOL, contains information for those students attending a second school part time, I can get the union of all students attending a particular school either full time or part time this way:
SELECT STD_ID FROM STUDENT
WHERE
STD_SCHOOL='the_school'
OR
STD_ID IN (SELECT STD_ID FROM STUDENT_SECONDARY_SCHOOL WHERE STD_SCHOOL='the_school')
Again, I want to emphasize that this is NOT the equivalent of a UNION but can be useful in some situations.
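A small check of the OR-plus-subquery strategy against a real UNION, using SQLite from Python. The sample students are invented, and the comparison only holds under the stated assumption that every student in STUDENT_SECONDARY_SCHOOL also appears in STUDENT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (STD_ID INTEGER PRIMARY KEY, STD_SCHOOL TEXT);
    CREATE TABLE STUDENT_SECONDARY_SCHOOL (STD_ID INTEGER, STD_SCHOOL TEXT);
    -- Student 1 attends the_school full time, student 2 part time, student 3 not at all.
    INSERT INTO STUDENT VALUES (1, 'the_school'), (2, 'other_school'), (3, 'other_school');
    INSERT INTO STUDENT_SECONDARY_SCHOOL VALUES (2, 'the_school');
""")

with_or = conn.execute("""
    SELECT STD_ID FROM STUDENT
    WHERE STD_SCHOOL = 'the_school'
       OR STD_ID IN (SELECT STD_ID FROM STUDENT_SECONDARY_SCHOOL
                     WHERE STD_SCHOOL = 'the_school')
    ORDER BY 1
""").fetchall()

with_union = conn.execute("""
    SELECT STD_ID FROM STUDENT WHERE STD_SCHOOL = 'the_school'
    UNION
    SELECT STD_ID FROM STUDENT_SECONDARY_SCHOOL WHERE STD_SCHOOL = 'the_school'
    ORDER BY 1
""").fetchall()

print(with_or, with_union)
```

If STUDENT_SECONDARY_SCHOOL held a student missing from STUDENT, the OR version would silently drop that student while the UNION would keep it.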

Distinct count over two tables. SQL

I'm very new to SQL, I apologize if something doesn't make sense!
I have two tables, each of which has a column 'client_nbr'. Some of the client_nbrs will overlap between the two tables. I need to count the number of clients with a given value in the 'age' column across both tables. For example, the results should look something like
age - 5 count - 3,000
And that will only count a client number once, even if it is in both tables.
When I do this for one table I run:
Select age, count(distinct(client_nbr))
From table1
Group by age
I tried to follow the example here: http://www.sqlservercurry.com/2011/07/sql-server-distinct-count-multiple.html?m=1
Using:
Select table1.age,table2.age,
Count(distinct(table1.client_nbr)) as total
From table1,table2
Where table1.client_nbr=table2.client_nbr
Group by table1.age,table2.age
It didn't work out though. The total count was less than when I run a distinct count on just table1.
Thank you in advance!
Try this instead:
SELECT age, COUNT(DISTINCT client_nbr) AS Total
FROM
(
SELECT age, client_nbr FROM table1
UNION ALL
SELECT age, client_nbr FROM table2
) AS t
GROUP BY age
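To see why the derived-table version counts each client once while the inner-join attempt undercounts, here is a small run through SQLite from Python. The sample clients and ages are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (client_nbr INTEGER, age INTEGER);
    CREATE TABLE table2 (client_nbr INTEGER, age INTEGER);
    -- Client 2 appears in both tables; all other clients appear in only one.
    INSERT INTO table1 VALUES (1, 5), (2, 5), (3, 7);
    INSERT INTO table2 VALUES (2, 5), (4, 5), (5, 7);
""")

# Stack both tables, then count each client once per age.
union_counts = conn.execute("""
    SELECT age, COUNT(DISTINCT client_nbr) AS Total
    FROM (
        SELECT age, client_nbr FROM table1
        UNION ALL
        SELECT age, client_nbr FROM table2
    ) AS t
    GROUP BY age
    ORDER BY age
""").fetchall()

# The inner-join attempt only sees clients present in BOTH tables.
join_counts = conn.execute("""
    SELECT table1.age, table2.age, COUNT(DISTINCT table1.client_nbr) AS total
    FROM table1, table2
    WHERE table1.client_nbr = table2.client_nbr
    GROUP BY table1.age, table2.age
""").fetchall()

print(union_counts, join_counts)
```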
You are using an implicit inner join in your query, meaning only the clients contained in both tables are returned. Use an outer join on client_nbr to get all the clients from both tables:
SELECT COALESCE(table1.age, table2.age) AS age,
COUNT(DISTINCT COALESCE(table1.client_nbr, table2.client_nbr)) AS total
FROM table1 FULL OUTER JOIN table2 ON table1.client_nbr = table2.client_nbr
GROUP BY COALESCE(table1.age, table2.age)

Oracle tables in one view

I have 2 tables in an Oracle database, which have the same column names and types.
For example:
Table1: id, name, comment
Table2: id, name, comment
How can I show all the data from both tables in one view?
If you want 4 separate columns, simply use aliases, like you would any other select.
create or replace view vw_my_view as
select t1.id t1_id
,t1.comment t1_comment
,t2.id t2_id
,t2.comment t2_comment
from table1 t1
inner join table2 t2 on t1.id = t2.id -- your join condition
-- where <filter conditions>
EDIT Of course, your tables will relate to each other in some way, otherwise there is no way for a single row of your view to mean anything. You will therefore have a join condition to join the two tables, such as t1.id = t2.id
If you want them in two columns, use Union
create or replace view vw_my_view as
select id
,comment
from table1
union all -- use ALL unless you want to lose rows
select id
,comment
from table2;
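The difference between UNION ALL and UNION in such a view can be checked on a toy data set. This sketch uses SQLite from Python rather than Oracle, so it uses plain CREATE VIEW (SQLite has no CREATE OR REPLACE VIEW); the view names vw_all and vw_distinct and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT, comment TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT, comment TEXT);
    -- The row (1, 'a', 'same') exists in both tables.
    INSERT INTO table1 VALUES (1, 'a', 'same'), (2, 'b', 'only in t1');
    INSERT INTO table2 VALUES (1, 'a', 'same'), (3, 'c', 'only in t2');

    CREATE VIEW vw_all AS
        SELECT id, comment FROM table1
        UNION ALL
        SELECT id, comment FROM table2;

    CREATE VIEW vw_distinct AS
        SELECT id, comment FROM table1
        UNION
        SELECT id, comment FROM table2;
""")

all_count = conn.execute("SELECT COUNT(*) FROM vw_all").fetchone()[0]
distinct_count = conn.execute("SELECT COUNT(*) FROM vw_distinct").fetchone()[0]
print(all_count, distinct_count)
```

UNION ALL keeps every row from both tables, while UNION collapses the row that is identical in both.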
Why two identical tables? Whatever happened to "Don't Repeat Yourself"? Sorry, sounds like a bad design smell to me.
Whatever difference inspired you to create two tables, I'll bet it really could be another attribute to distinguish two groups in one table.
SELECT * FROM TABLE1 UNION SELECT * FROM TABLE2
(or UNION ALL if you want duplicates)
I agree with what duffymo says, but if there is a good reason for it then a UNION will do it for you. e.g.
SELECT id, name, comment FROM Table1
UNION
SELECT id, name, comment FROM Table2
select * from table1
union
select * from table2;
