MSSQL: Joining multiple tables with dynamic values

I'm struggling to get the following 3 tables into one query:
tPerson
ID FirstName
1 'Jack'
2 'Liz'
tAttribute
ID AttributeName
101 'LastName'
102 'Gender'
tData
PersonID AttributeID AttributeValue
1 101 'Nicholson'
1 102 'Male'
2 101 'Taylor'
2 102 'Female'
Important: The attributes in tAttribute are dynamic. There could be more, e.g.:
ID AttributeName
103 'Income'
104 'MostPopularMovie'
Question: How can I write my query (or queries, if necessary) so that I get the following output:
PersonID FirstName LastName Gender [otherFields]
1 'Jack' 'Nicholson' 'Male' [otherValues]
2 'Liz' 'Taylor' 'Female' [otherValues]
I often read "What have you tried so far?", but posting all my failed attempts using subqueries and joins wouldn't make much sense. I'm just not that confident with SQL.
Many thanks in advance.

Thanks to @Tab Alleman, I googled for "SQL PIVOT" and came up with the following result:
SELECT PersonID,
FirstName,
[LastName],
[Gender]
FROM (
SELECT tPerson.ID AS PersonID,
tPerson.FirstName,
tAttribute.AttributeName,
tData.AttributeValue
FROM tAttribute
INNER JOIN tData ON (
tAttribute.ID = tData.AttributeID
)
INNER JOIN tPerson ON (
tData.PersonID = tPerson.ID
)
) AS unPivotResult
PIVOT (
MAX(AttributeValue)
FOR AttributeName IN ([LastName],[Gender])
) AS pivotResult
Addition: I didn't know how to get LastName and Gender dynamically via SQL, so I did that with ColdFusion, which I use for programming. It will look like this:
<!--- "local.attributes" gets generated by making another query,--->
<!--- I just wrote it statically here for this example --->
<cfset local.attributes = "[LastName],[Gender]" />
<cfquery name="local.persons">
SELECT PersonID,
FirstName,
#local.attributes#
FROM (
...
) AS unPivotResult
PIVOT (
MAX(AttributeValue)
FOR AttributeName IN (#local.attributes#)
) AS pivotResult
</cfquery>
It'd be cool, if I could replace the ColdFusion part with something like
SELECT AttributeName FROM tAttribute and then use that to get the brackets-definition.
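One way to replace the ColdFusion step is to read the attribute names from tAttribute first and generate the column list from them. The sketch below uses Python with the standard-library sqlite3 module purely so that it is self-contained. SQLite has no PIVOT operator, so the sketch swaps in the equivalent MAX(CASE ...) conditional aggregation. The helper names quote_ident and build_pivot_sql and the in-memory database are assumptions for illustration, not part of the original answer. On SQL Server, the analogous step would be to build the bracketed list from tAttribute (for example with QUOTENAME) and run the PIVOT through dynamic SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tPerson (ID INTEGER PRIMARY KEY, FirstName TEXT);
    CREATE TABLE tAttribute (ID INTEGER PRIMARY KEY, AttributeName TEXT);
    CREATE TABLE tData (PersonID INTEGER, AttributeID INTEGER, AttributeValue TEXT);
    INSERT INTO tPerson VALUES (1, 'Jack'), (2, 'Liz');
    INSERT INTO tAttribute VALUES (101, 'LastName'), (102, 'Gender');
    INSERT INTO tData VALUES
        (1, 101, 'Nicholson'), (1, 102, 'Male'),
        (2, 101, 'Taylor'),    (2, 102, 'Female');
""")


def quote_ident(name):
    # Hypothetical helper: quote an identifier so that an unusual
    # attribute name cannot break the generated SQL.
    return '"' + name.replace('"', '""') + '"'


def build_pivot_sql(conn):
    # Hypothetical helper: one MAX(CASE ...) column per row in tAttribute.
    attrs = conn.execute(
        "SELECT ID, AttributeName FROM tAttribute ORDER BY ID"
    ).fetchall()
    select_list = ["p.ID AS PersonID", "p.FirstName"]
    for attr_id, name in attrs:
        select_list.append(
            f"MAX(CASE WHEN d.AttributeID = {int(attr_id)} "
            f"THEN d.AttributeValue END) AS {quote_ident(name)}"
        )
    return (
        "SELECT " + ", ".join(select_list) + " "
        "FROM tPerson p "
        "LEFT JOIN tData d ON d.PersonID = p.ID "
        "GROUP BY p.ID, p.FirstName "
        "ORDER BY p.ID"
    )


rows = conn.execute(build_pivot_sql(conn)).fetchall()
for row in rows:
    print(row)
```

Because the column list is regenerated on every call, a new row in tAttribute (such as 103 'Income') shows up as an extra column without any change to the query text.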

Related

how to create a table join on elements in an Array in Google BigQuery

I have some data, contact_IDs, that are in an Array of Structs inside a table called deals as in the example below.
WITH deals AS (
Select "012345" as deal_ID,
[STRUCT(["abc"] as company_ID, [123,678,810] as contact_ID)]
AS associations)
SELECT
deal_ID,
ARRAY(
SELECT AS STRUCT
( SELECT STRING_AGG(CAST(id AS STRING), ', ')
FROM t.contact_ID id
) AS contact_ID
FROM d.associations t
) AS contacts
FROM deals d
The query above takes the contact_IDs in the associations array and lists them separated by commas.
Row deal_ID contacts.contact_ID
1 012345 123, 678, 810
But my problem now is that I need to replace the contact_IDs with first and last names from another table called contacts, which looks like the following; contact_ID is INT64 and the name fields are Strings.
contact_id first_name last_name
123 Jane Doe
678 John Smith
810 Alice Acre
I've attempted doing it with a subquery like this:
WITH deals AS (
Select "012345" as deal_ID,
[STRUCT(["abc"] as company_ID, [123,678,810] as contact_ID)]
AS associations)
SELECT
deal_ID,
ARRAY(
SELECT AS STRUCT
company_ID,
( SELECT STRING_AGG(
(select concat(c.first_name, " ", c.last_name)
from contacts c
where c.contact_id=id), ', ')
FROM t.contact_ID id
) AS contact_name
FROM d.associations t
) AS contacts
FROM deals d
But this gives an error "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN." But I can't figure out how to make a join between deals.associations.contact_ID and contacts.contact_id when the thing I need to be joining on is inside the deals.associations array...
Thanks in advance for any guidance.
Below is for BigQuery Standard SQL
#standardSQL
SELECT deal_ID,
ARRAY_AGG(STRUCT(company_ID, contact_name)) AS contacts
FROM (
SELECT
deal_ID,
ANY_VALUE(company_ID) AS company_ID,
STRING_AGG(FORMAT('%s %s', IFNULL(first_name, ''), IFNULL(last_name, '')), ', ') AS contact_name
FROM deals d,
d.associations AS contact,
contact.contact_ID AS contact_ID
LEFT JOIN contacts c
USING(contact_ID)
GROUP BY deal_ID, FORMAT('%t', company_ID)
)
GROUP BY deal_ID
If applied to the sample data from your question, the output is:
Row deal_ID contacts.company_ID contacts.contact_name
1 012345 abc Jane Doe, John Smith, Alice Acre
Note - below
FROM deals d,
d.associations AS contact,
contact.contact_ID AS contact_ID
is a shortcut for
FROM deals,
UNNEST(associations) AS contact,
UNNEST(contact_ID) AS contact_ID
This is simply my preference: where possible, I avoid writing an explicit UNNEST() in the query text.
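The answer works because it first flattens the nested arrays into one row per contact, then does an ordinary join, and only then aggregates the names back together. The following is a rough sketch of those three steps in Python with the standard-library sqlite3 module, not BigQuery itself. The flat table, the assoc_idx and pos columns, and the contact_names variable are assumptions made up for this illustration.

```python
import sqlite3

# Sample data from the question; the nested arrays are plain Python lists here.
deals = [
    {"deal_ID": "012345",
     "associations": [{"company_ID": ["abc"], "contact_ID": [123, 678, 810]}]},
]
contacts = [(123, "Jane", "Doe"), (678, "John", "Smith"), (810, "Alice", "Acre")]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (contact_id INTEGER, first_name TEXT, last_name TEXT)"
)
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?)", contacts)

# Step 1: flatten to one row per (deal, association, contact). This plays
# the role of UNNEST(associations) followed by UNNEST(contact_ID).
conn.execute(
    "CREATE TABLE flat (deal_ID TEXT, assoc_idx INTEGER, pos INTEGER, contact_id INTEGER)"
)
for deal in deals:
    for assoc_idx, assoc in enumerate(deal["associations"]):
        for pos, contact_id in enumerate(assoc["contact_ID"]):
            conn.execute(
                "INSERT INTO flat VALUES (?, ?, ?, ?)",
                (deal["deal_ID"], assoc_idx, pos, contact_id),
            )

# Step 2: an ordinary LEFT JOIN against contacts, now that the IDs are rows.
joined = conn.execute("""
    SELECT f.deal_ID, f.assoc_idx,
           COALESCE(c.first_name, '') || ' ' || COALESCE(c.last_name, '') AS name
    FROM flat f
    LEFT JOIN contacts c ON c.contact_id = f.contact_id
    ORDER BY f.deal_ID, f.assoc_idx, f.pos
""").fetchall()

# Step 3: aggregate the names back into one string per association.
grouped = {}
for deal_id, assoc_idx, name in joined:
    grouped.setdefault((deal_id, assoc_idx), []).append(name)
contact_names = {key: ", ".join(names) for key, names in grouped.items()}
print(contact_names)
```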

Using a CTE to split results across a CROSS APPLY

I have some data that I need to output as rows containing markup tags, which I'm doing inside a table valued function.
This has been working fine up to a point using code in the format below, using the search query to gather my data, and then inserting into my returned table using the output from results.
I now need to take a longer data field and split it up over a number of rows, and I'm at something of a loss as to how to achieve this.
I started with the idea that I wanted to use a CTE to process the data from my query, but I can't see a way to get the data from my search query into my CTE and from there into my results set.
I guess I can see an alternative way of doing this by creating another table valued function in the database that returns a results set if I feed it my comment_text column, but it seems like a waste to do it that way.
Does anyone see a route through to a solution?
Example "Real" Table:
DECLARE @Comments TABLE
(
id INT NOT NULL IDENTITY PRIMARY KEY CLUSTERED,
comment_date DATETIME NOT NULL,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
comment_title VARCHAR(50) NOT NULL,
comment_text char(500)
);
Add Comment Rows:
INSERT INTO @Comments VALUES(CURRENT_TIMESTAMP, 'Bob', 'Example','Bob''s Comment', 'Text of Bob''s comment.');
INSERT INTO @Comments VALUES(CURRENT_TIMESTAMP, 'Alice', 'Example','Alice''s Comment', 'Text of Alice''s comment that is much longer and will need to be split over multiple rows.');
Format of returned results table:
DECLARE @return_table TABLE
(
comment_date DATETIME,
commenter_name VARCHAR(101),
markup VARCHAR(100)
);
Naive query (can't run, because the column comment_text referenced in the SplitComment CTE can't be resolved):
WITH SplitComment(note,start_idx) AS
(
SELECT '<Note>'+SUBSTRING(comment_text,0,50)+'</Note>', 0
UNION ALL
SELECT '<Text>'+SUBSTRING(note,start_idx,50)+'</Text>', start_idx+50 FROM SplitComment WHERE (start_idx+50) < LEN(note)
)
INSERT INTO @return_table
SELECT results.* FROM
(
SELECT
comment_date,
CAST(first_name+' '+last_name AS VARCHAR(101)) commenter,
comment_title,
comment_text
FROM @Comments
) AS search
CROSS APPLY
(
SELECT comment_date, commenter, '<title>'+comment_title+'</title>' markup
UNION ALL SELECT comment_date, commenter, SplitComment
) AS results;
SELECT * FROM @return_table;
Results (when the function is run without the CTE):
comment_date commenter_name markup
2017-07-07 11:53:57.240 Bob Example <title>Bob's Comment</title>
2017-07-07 11:53:57.240 Alice Example <title>Alice's Comment</title>
Ideally, I'd like to get one additional row for Bob's comment, and two rows for Alice's comment. Something like this:
comment_date commenter_name markup
2017-07-07 11:53:57.240 Bob Example <title>Bob's Comment</title>
2017-07-07 11:53:57.240 Bob Example <Note>Bob's Comment</Note>
2017-07-07 11:53:57.240 Alice Example <title>Alice's Comment</title>
2017-07-07 11:53:57.240 Alice Example <Note>Text of Alice''s comment that is much longer and w</Note>
2017-07-07 11:53:57.240 Alice Example <Text>ill need to be split over multiple rows.</Text>
Maybe you are looking for something like this (it's a simplified version; I used only first_name and comment_date as the "identifier").
I tested it with the data below, assuming for now a maximum length of 50 characters for each slice of the text column.
Tip: change comment_text datatype to VARCHAR(500)
DECLARE @Comments TABLE
(
id INT NOT NULL IDENTITY PRIMARY KEY CLUSTERED,
comment_date DATETIME NOT NULL,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
comment_title VARCHAR(50) NOT NULL,
comment_text VARCHAR(500)
);
INSERT INTO @Comments VALUES(CURRENT_TIMESTAMP, 'Bob', 'Example','Bob''s Comment', 'Text of Bob''s comment.');
INSERT INTO @Comments VALUES(CURRENT_TIMESTAMP, 'Alice', 'Example','Alice''s Comment'
, 'Text of Alice''s comment that is much longer and will need to be split over multiple rows aaaaaa bbbbbb cccccc ddddddddddd eeeeeeeeeeee fffffffffffff ggggggggggggg.');
-- Anchor member: the first 50 characters of every comment, wrapped in <Note>.
WITH CTE AS (SELECT comment_date, first_name, '<Note>'+CAST( SUBSTRING(comment_text, 1, 50) AS VARCHAR(500)) +'</Note>' AS comment_text, 1 AS RN
FROM @Comments
UNION ALL
-- Recursive member: the next 50-character slice, wrapped in <Text>.
SELECT A.comment_date, A.first_name, '<Text>'+CAST( SUBSTRING(A.comment_text, B.RN*50+1, 50) AS VARCHAR(500)) +'</Text>' AS comment_text, B.RN+1 AS RN
FROM @Comments A
INNER JOIN CTE B ON A.comment_date=B.comment_date AND A.first_name=B.first_name
-- Recurse while unread characters remain (> RN*50, so a final slice of one character is kept).
WHERE LEN(A.comment_text) > B.RN*50
)
SELECT A.comment_date, A.first_name, '<title>'+ comment_title+'</title>' AS markup
FROM @Comments A
UNION ALL
SELECT B.comment_date, B.first_name, B.comment_text AS markup
FROM CTE B ;
Output:
comment_date first_name markup
2017-07-07 14:30:51.117 Bob <title>Bob's Comment</title>
2017-07-07 14:30:51.117 Alice <title>Alice's Comment</title>
2017-07-07 14:30:51.117 Bob <Note>Text of Bob's comment.</Note>
2017-07-07 14:30:51.117 Alice <Note>Text of Alice's comment that is much longer and wi</Note>
2017-07-07 14:30:51.117 Alice <Text>ll need to be split over multiple rows aaaaaa bbbb</Text>
2017-07-07 14:30:51.117 Alice <Text>bb cccccc ddddddddddd eeeeeeeeeeee fffffffffffff g</Text>
2017-07-07 14:30:51.117 Alice <Text>gggggggggggg.</Text>
Here's a solution that also allows sorting the result set.
It uses a recursive CTE to calculate the positions in the long text.
And by joining the table to the CTE, the text can be sliced up into rows.
with cte as
(
select id, 1 as lvl, len(comment_text) as posmax, 1 pos1, 50 limit
from @Comments
union all
select id, lvl + 1, posmax, iif(pos1+limit<posmax,pos1+limit,posmax), limit
from cte
-- <= rather than <, so a final slice of exactly one character is not dropped
where pos1+limit<=posmax
)
, CTE2 AS
(
select id, 0 as lvl,
comment_date,
concat(first_name,' ',last_name) as commenter,
'<Title>'+rtrim(comment_title)+'</Title>' as markup
from @Comments
union all
select t.id, c.lvl,
comment_date,
concat(first_name,' ',last_name) as commenter_name,
concat(iif(lvl=1,'<Note>','<Text>'),substring(comment_text,pos1,limit),iif(lvl=1,'</Note>','</Text>')) as markup
from @Comments t
join cte c on c.id = t.id
)
select comment_date, commenter, markup
from CTE2
order by id, lvl;
Output:
comment_date commenter markup
----------------------- ------------ -------------------------------------
2017-07-07 15:06:31.293 Bob Example <Title>Bob's Comment</Title>
2017-07-07 15:06:31.293 Bob Example <Note>Text of Bob's comment.</Note>
2017-07-07 15:06:31.293 Alice Example <Title>Alice's Comment</Title>
2017-07-07 15:06:31.293 Alice Example <Note>Text of Alice's comment that is much longer and wi</Note>
2017-07-07 15:06:31.293 Alice Example <Text>ll need to be split over multiple rows.</Text>
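Both answers rely on the same idea: a recursive CTE walks through each comment in steps of 50 characters, and a substring call cuts out the slice at each step. Here is a minimal sketch of that idea using Python's standard-library sqlite3 module instead of SQL Server, so substr and length stand in for SUBSTRING and LEN. The table name comments, the CTE name pieces, and the third test row are assumptions added for illustration. The third row checks the boundary case where exactly one character spills into a second slice.

```python
import sqlite3

CHUNK = 50  # slice width, as in the answers above

texts = [
    "Text of Bob's comment.",
    "Text of Alice's comment that is much longer and will need "
    "to be split over multiple rows.",
    "x" * 51,  # boundary case: exactly one character spills over
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, comment_text TEXT)")
conn.executemany(
    "INSERT INTO comments (comment_text) VALUES (?)", [(t,) for t in texts]
)

chunks = conn.execute(
    """
    WITH RECURSIVE pieces(id, lvl, pos) AS (
        -- Anchor: every comment starts with a slice at position 1.
        SELECT id, 1, 1 FROM comments
        UNION ALL
        -- Step: advance by one slice while the next start position
        -- still lies inside the text.
        SELECT p.id, p.lvl + 1, p.pos + :n
        FROM pieces p
        JOIN comments c ON c.id = p.id
        WHERE p.pos + :n <= length(c.comment_text)
    )
    SELECT c.id, p.lvl, substr(c.comment_text, p.pos, :n) AS chunk
    FROM pieces p
    JOIN comments c ON c.id = p.id
    ORDER BY c.id, p.lvl
    """,
    {"n": CHUNK},
).fetchall()

for row in chunks:
    print(row)
```

Keeping lvl in the result is what makes the final ORDER BY possible, which is the same reason the second answer carries lvl through to its outer query.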

listagg data to useable format?

This is my first time working with the LISTAGG function and I'm confused. I can select the data easily enough, but the characters of the USERS column all have spaces in between them, and when I try to copy and paste it, no data from that column is copied. I've tried with two different IDEs. Am I doing something wrong?
Example:
select course_id, listagg(firstname, ', ') within group (order by course_id) as users
from (
select distinct u.firstname, u.lastname, u.student_id, cm.course_id
from course_users cu
join users u on u.pk1 = cu.users_pk1
join course_main cm on cm.pk1 = cu.crsmain_pk1
and cm.course_id like '2015SP%'
)
group by course_id;
Yields:
I had a similar problem, and it turned out to be an encoding issue. I solved it like this (change to another encoding if needed):
...listagg(convert(firstname, 'UTF8', 'AL16UTF16'), ', ')...
Your firstname column seems to be defined as nvarchar2:
with t as (
select '2015SP.BOS.PPB.556.A'as course_id,
cast('Alissa' as nvarchar2(10)) as firstname
from dual
union all select '2015SP.BOS.PPB.556.A'as course_id,
cast('Dorothea' as nvarchar2(10)) as firstname
from dual
)
select course_id, listagg(firstname, ', ')
within group (order by course_id) as users
from t
group by course_id;
COURSE_ID USERS
-------------------- ------------------------------
2015SP.BOS.PPB.556.A
... and I can't copy/paste the users values from SQL Developer either, but it displays with spaces, as you can see from SQL*Plus:
COURSE_ID USERS
-------------------- ------------------------------
2015SP.BOS.PPB.556.A A l i s s a, D o r o t h e a
As the documentation says, the listagg() function always returns varchar2 (or raw), so passing in an nvarchar2 value causes an implicit conversion which is throwing out your results.
If you're stuck with your column being of that data type, you could cast it to varchar2 inside the listagg call:
column users format a30
with t as (
select '2015SP.BOS.PPB.556.A'as course_id,
cast('Alissa' as nvarchar2(10)) as firstname
from dual
union all select '2015SP.BOS.PPB.556.A'as course_id,
cast('Dorothea' as nvarchar2(10)) as firstname
from dual
)
select course_id, listagg(cast(firstname as varchar2(10)), ', ')
within group (order by course_id) as users
from t
group by course_id;
COURSE_ID USERS
-------------------- ------------------------------
2015SP.BOS.PPB.556.A Alissa, Dorothea
But you probably don't really want it to be nvarchar2 at all.
Apparently it's a known (unresolved?) bug in 11. TO_CHAR() worked for me...
SELECT wiporderno, LISTAGG(TO_CHAR(medium), ',') WITHIN GROUP(ORDER BY wiporderno) AS jobclassification
...where medium was the problematic column/data type.

Select Different Column Value for Row with Max Value

I'm hoping for a cleaner way to do something that I know how to do one way. I want to retrieve the UserId for the MAX ID value as well as that MAX ID value. Let's say I have a table with data like this:
ID UserId Value
1 10 'Foo'
2 15 'Blah'
3 10 'Blech'
4 20 'Qwerty'
I want to retrieve:
ID UserId
4 20
I know I could do this like so:
SELECT
t.ID,
t.UserID
FROM
(
SELECT MAX(ID) as [MaxID]
FROM table
) as m
JOIN table as t ON m.MaxID = t.ID
I'm only vaguely familiar with the ROW_NUMBER(), RANK() and other similar methods and I can't help believing that this scenario could benefit from some such method to get rid of joining back to the table.
You can definitely use ROW_NUMBER for something like this:
with t1Rank as
(
select *
, t1Rank = row_number() over (order by ID desc)
from t1
)
select ID, UserID
from t1Rank
where t1Rank = 1
The advantage with this approach is you can bring Value (or other fields as required) into the result set, too. Plus you can tweak the ordering/grouping as required.
You could also just do it with a sub-query like this:
SELECT ID ,
UserID
FROM table
WHERE ID = ( SELECT MAX(ID)
FROM table
);
SELECT TOP 1 ID, UserID FROM <table> ORDER BY ID DESC
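For comparison, the sub-query and the top-1 approaches can be tried side by side on the sample data. This is a small sketch in Python with the standard-library sqlite3 module, where ORDER BY ... LIMIT 1 is used as SQLite's counterpart to SQL Server's TOP 1. The table name t1 follows the ROW_NUMBER answer, and the variable names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (ID INTEGER PRIMARY KEY, UserId INTEGER, Value TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, 10, "Foo"), (2, 15, "Blah"), (3, 10, "Blech"), (4, 20, "Qwerty")],
)

# Sub-query approach: find the maximum ID, then fetch that row.
by_subquery = conn.execute(
    "SELECT ID, UserId FROM t1 WHERE ID = (SELECT MAX(ID) FROM t1)"
).fetchall()

# Sort-and-take-first approach (TOP 1 in SQL Server, LIMIT 1 in SQLite).
by_order = conn.execute(
    "SELECT ID, UserId FROM t1 ORDER BY ID DESC LIMIT 1"
).fetchall()

print(by_subquery, by_order)
```

Both statements read the table only once from the caller's point of view and avoid the explicit join back to the table that the question wanted to get rid of.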

Getting filtered results with subquery

I have a table with something like the following:
ID Name Color
------------
1 Bob Blue
2 John Yellow
1 Bob Green
3 Sara Red
3 Sara Green
What I would like to do is return a filtered list of results whereby the following data is returned:
ID Name Color
------------
1 Bob Blue
2 John Yellow
3 Sara Red
i.e. I would like to return 1 row per user. (I do not mind which row is returned for a particular user; I just need [ID] to be unique.) I already have something that works but is really slow: I create a temp table holding all the IDs and then use an OUTER APPLY to select the top 1 row from the same table, i.e.
CREATE TABLE #tb
(
[ID] [int]
)
INSERT INTO #tb
select distinct [ID] from MyTable
select
T1.[ID],
V2.[Name],
V2.Color
from
#tb T1
OUTER APPLY
(
SELECT TOP 1 * FROM MyTable T2 WHERE T2.[ID] = T1.[ID]
) AS V2
DROP TABLE #tb
Can somebody suggest how I may improve it?
Thanks
Try:
WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS 'RowNo',
ID, Name, Color
FROM table
)
SELECT ID,Name,color
FROM CTE
WHERE RowNo = 1
or
select
ID, Name, Color
from
(
select
ID, Name, Color,
-- rank() can tie; if (ID, Color) pairs can repeat, use row_number() instead
rank() over (partition by ID order by Color) as ColorRank
from
MyTable
)
HRRanks
where
ColorRank = 1
If you're using SQL Server 2005 or higher, you could use the Ranking functions and just grab the first one in the list.
http://msdn.microsoft.com/en-us/library/ms189798.aspx
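The ROW_NUMBER approach can be checked against the sample rows from the question. The sketch below uses Python's standard-library sqlite3 module and assumes the bundled SQLite is version 3.25 or newer, since older versions have no window functions. Ordering by Color inside the partition is an assumption made here so that the chosen row is deterministic; the question itself does not care which row wins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Name TEXT, Color TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, "Bob", "Blue"), (2, "John", "Yellow"), (1, "Bob", "Green"),
        (3, "Sara", "Red"), (3, "Sara", "Green"),
    ],
)

# Number the rows within each ID, then keep only the first row per ID.
one_per_id = conn.execute("""
    SELECT ID, Name, Color
    FROM (
        SELECT ID, Name, Color,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Color) AS RowNo
        FROM MyTable
    ) AS ranked
    WHERE RowNo = 1
    ORDER BY ID
""").fetchall()

for row in one_per_id:
    print(row)
```

With this ordering Sara's row is the Green one rather than the Red one shown in the question, which is fine given that any row per ID is acceptable.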
