SQL query: for each project, find the project number, project name and the total number of hours that employees worked for this project and order the results by the total work hours.
Consider this example data:
[OP should edit this to make sure it reflects their actual circumstance]
DECLARE @Project TABLE (PNo INT IDENTITY, PName NVARCHAR(50))
DECLARE @WorksFor TABLE (WNo INT IDENTITY, WName NVARCHAR(50), Hours INT, PNo INT)
INSERT INTO @Project (PName) VALUES
('SQL Refectoring'),('Unit Testing')
INSERT INTO @WorksFor (WName, Hours, PNo) VALUES
('Joe', 1, 1),('Joe', 5, 1),('Jim', 3, 1),('Bob', 2, 1),
('Joe', 1, 2),('Joe', 3, 2),('Joe', 7, 2),('Joe', 1, 2)
What is wrong with my query? Why does this SQL code not work?
select Pname, Pnumber,
(select sum(Hours) from Worksfor group by (Pno)) as total_hours
from
Project P, Worksfor w
where
p.Pnumber = w.Pno
order by
w.Hours
Using some example data:
DECLARE @Project TABLE (PNo INT IDENTITY, PName NVARCHAR(50))
DECLARE @WorksFor TABLE (WNo INT IDENTITY, WName NVARCHAR(50), Hours INT, PNo INT)
INSERT INTO @Project (PName) VALUES
('SQL Refectoring'),('Unit Testing')
INSERT INTO @WorksFor (WName, Hours, PNo) VALUES
('Joe', 1, 1),('Joe', 5, 1),('Jim', 3, 1),('Bob', 2, 1),
('Joe', 1, 2),('Joe', 3, 2),('Joe', 7, 2),('Joe', 1, 2)
We can join these two tables together properly; the comma-separated join syntax you are using is outdated and discouraged, as mentioned in the comments.
SELECT PName, p.PNo, SUM(wf.Hours) AS hours
FROM @Project p
INNER JOIN @WorksFor wf
ON p.PNo = wf.PNo
GROUP BY PName, p.PNo
Here we are joining the Project table to the WorksFor table on the PNo column. Then we group by PNo and PName and sum the hours:
PName PNo hours
-------------------------
SQL Refectoring 1 11
Unit Testing 2 12
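As for why the original query fails: the subquery in its SELECT list is not correlated to the outer row and, because of its GROUP BY, returns one total per project, so SQL Server rejects it as soon as there is more than one project (error 512, "Subquery returned more than 1 value"). Below is a minimal sketch of the full requirement, including the ordering by total hours that the question asks for. It assumes the sample data above; the explicit PNo values and the TotalHours alias are my own choices for illustration.

```sql
-- Sample data (explicit PNo values instead of IDENTITY, to keep the keys obvious)
DECLARE @Project TABLE (PNo INT PRIMARY KEY, PName NVARCHAR(50));
DECLARE @WorksFor TABLE (WName NVARCHAR(50), Hours INT, PNo INT);

INSERT INTO @Project (PNo, PName) VALUES
(1, 'SQL Refectoring'), (2, 'Unit Testing');

INSERT INTO @WorksFor (WName, Hours, PNo) VALUES
('Joe', 1, 1), ('Joe', 5, 1), ('Jim', 3, 1), ('Bob', 2, 1),
('Joe', 1, 2), ('Joe', 3, 2), ('Joe', 7, 2), ('Joe', 1, 2);

-- One row per project, ordered by the summed hours
SELECT p.PNo, p.PName, SUM(wf.Hours) AS TotalHours
FROM @Project AS p
INNER JOIN @WorksFor AS wf
    ON wf.PNo = p.PNo
GROUP BY p.PNo, p.PName
ORDER BY TotalHours;
-- Returns (1, 'SQL Refectoring', 11) followed by (2, 'Unit Testing', 12)
```

If projects with no recorded hours should also appear, a LEFT JOIN from the project table would be needed instead of the INNER JOIN.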
Things to consider:
Table and column names should be descriptive; there is little reason to abbreviate them into gibberish any longer. 'ThisIsTheProjectNameAsDefinedByProjectManagement' is obviously excessive, but 'PMOOfficeProjectName' is reasonable. Table names should also, usually, be pluralized, as they represent one or many entities.
Aliases should always be defined and used when referring to columns. This makes code easier to read and prevents ambiguity.
I am trying to create a dynamic forecast for 18(!) months in which each month depends on the previous columns (months), and I am stuck:
I have three columns:
Stock
SafetyStock
Need for production - another select with clause WHERE date = getdate()
What I need to achieve:
Index, Stock - Current month, SafetyStock - Current month, Need for production (select * from Nfp where date = getdate()), Stock - Current month + 1, SafetyStock - Current month + 1, Need for Production - Current month + 1 ... etc. up to 18 months
Calculations:
Stock - Current month + 1 = Stock previous month + SafetyStock previous month - Needs for production of current month
Is there any way to create something like this? It has to be dynamic and calculate for the current date and the next 18 months. So right now I have to calculate from 2020-10 to, let's say, 2022-04.
What I have tried:
I prepared 18 CTEs and joined everything. Then I do the calculations. It works, but it is slow and I think it is not professional.
I have tried dynamic SQL; below you can see my code, but I got stuck when I wanted to add a computed column that depends on a previous computed column:
------------------- CODE -------------------------
if object_id('tempdb..#tmp') is not null
drop table #tmp
if object_id('tempdb..#tmp2') is not null
drop table #tmp2
declare @cols as int
declare @iteration as int
declare @Mth as nvarchar(30)
declare @data as date
declare @sql as nvarchar(max)
declare @sql2 as nvarchar(max)
set @cols = 18
set @iteration = 0
set @Mth = month(getdate())
set @data = cast(getdate() as date)
select
10 as SS,
12 as Stock
into #tmp
WHILE @iteration < @cols
begin
set @iteration = @iteration + 1
set @sql =
'
alter table #tmp
add [StockUwzgledniajacSS - ' + cast(concat(year(DATEADD(Month, @Iteration, @data)),'-', month(DATEADD(Month, @Iteration, @data))) as nvarchar(max)) +'] as (Stock - SS)
'
exec (@sql)
set @Mth= @Mth+ 1
set @sql2 =
'
alter table #tmp
add [StockUwzgledniajacSS - ' + @Mth +'] as ([StockUwzgledniajacSS - ' + @Mth +'])
'
end
select * from #tmp
Thanks in advance!
Update 1 note: I wrote this before you posted your data. I believe this still holds but, of course, the stock levels are very different. Given that your NFP data is by day and your report is by month, I suggest adding a step to preprocess that data into months, e.g. the sum of NFP values grouped by month.
Update 2 (next day) note: Based on the OP's comments below, I've tried to integrate this with what was written and to answer the question more directly, e.g. by creating a reporting table #tmp.
Given that the OP also mentions millions of rows, I imagine each row represents a specific part/item - I've included this as a field called StockNum.
I have done something that probably doesn't do your calculations properly, but demonstrates the approach and should get you over your current hurdle. Indeed, if you haven't used these before, then updating this code with your own calculations will help you to understand how it works so you can maintain it.
I'm assuming the key issue here for calculation is that this month's stock is based on last month's stock and then new stock minus old stock for this month.
It is possible to calculate this in 18 separate statements (update table set col2 = some function of col1, then update table set col3 = some function of col2, etc). However, updating the same table multiple times is often an anti-pattern causing poor performance - especially if you need to read the base data again and again.
Instead, something like this is often best calculated using a recursive CTE, which 'builds' a set of data based on previous results.
The key difference in this approach is that it
Creates the reporting table (without any data/calculations going in)
Calculates the data as a separate step - but with columns/fields that can be used to link to the reporting table
Inserts the data from calculations into the reporting table as a single insert statement.
I have used temporary tables/etc liberally, to help demonstrate the process.
You haven't explained what safety stock is, nor how you measure what's coming in, so for the example below, I have assumed safety stock is the amount produced and is 5 per month. I've then assumed that NFP is amount going out each month (e.g., forward estimates of sales). The key result will be stock at the end of month (e.g., which you could then review whether it's too high or too low).
As you want to store it in a table that has each month as columns, the first step is to create a list with the relevant buckets (months). These include fields used for matching in later calculations/etc. Note I have included some date fields (startdate and enddate) which may be useful when you customise the code. This part of the SQL is designed to be as straightforward as possible.
We then create the scratch table that has our reference data for stock movements, replacing your SELECT * FROM NFP WHERE date = getdate()
/* SET UP BUCKET LIST TO HELP CALCULATION */
CREATE TABLE #RepBuckets (BucketNum int, BucketName nvarchar(30), BucketStartDate datetime, BucketEndDate datetime)
INSERT INTO #RepBuckets (BucketNum) VALUES
(0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),
(11),(12),(13),(14),(15),(16),(17),(18)
DECLARE @CurrentBucketStart date
SET @CurrentBucketStart = DATEFROMPARTS(YEAR(getdate()), MONTH(getdate()), 1)
UPDATE #RepBuckets
SET BucketName = 'StockAtEnd_' + FORMAT(DATEADD(month, BucketNum, @CurrentBucketStart), 'MMM_yy'),
BucketStartDate = DATEADD(month, BucketNum, @CurrentBucketStart),
BucketEndDate = DATEADD(month, BucketNum + 1, @CurrentBucketStart)
/* CREATE BASE DATA */
-- Current stock
CREATE TABLE #Stock (StockNum int, MonthNum int, StockAtStart int, SafetyStock int, NFP int, StockAtEnd int, PRIMARY KEY(StockNum, MonthNum))
INSERT INTO #Stock (StockNum, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd) VALUES
(12422, 0, NULL, NULL, NULL, 10)
-- Simulates SELECT * FROM NFP WHERE date = getdate()
CREATE TABLE #NFP_by_month (StockNum int, MonthNum int, StockNFP int, PRIMARY KEY(StockNum, MonthNum))
INSERT INTO #NFP_by_month (StockNum, MonthNum, StockNFP) VALUES
(12422, 1, 4), (12422, 7, 4), (12422, 13, 4),
(12422, 2, 5), (12422, 8, 5), (12422, 14, 5),
(12422, 3, 2), (12422, 9, 2), (12422, 15, 2),
(12422, 4, 7), (12422, 10, 7), (12422, 16, 7),
(12422, 5, 9), (12422, 11, 9), (12422, 17, 9),
(12422, 6, 3), (12422, 12, 3), (12422, 18, 3)
We then use the recursive CTE to calculate our data and store the results in the table #StockProjections.
What this does is:
Starts with your current stock (the last row in the #Stock table). Note that the only value that matters in that row is the stock at end of month.
Uses that stock level at the end of last month as the stock level at the start of the new month.
Adds the safety stock, subtracts the NFP, and calculates your stock at end.
Note that within the recursive part of the CTE, 'SBM' (StockByMonth) refers to last month's data. This is then used with whatever external data is needed (e.g., #NFP_by_month) to calculate the new data.
These calculations create a table with
StockNum (the ID number of the relevant stock item - for this example, I've used one stock item 12422)
MonthNum (I've used integers here rather than dates, for clarity/simplicity)
BucketName (an nvarchar representing the month, used for column names)
Stock at start of month
Safety stock (which I assume is incoming stock, 5 per month)
NFP (which I assume is outgoing stock, varies by month and comes from a scratch table here - you'll need to adjust this to your select)
Stock at end of month
/* CALCULATE PROJECTIONS */
CREATE TABLE #StockProjections (StockNum int, BucketName nvarchar(30), MonthNum int, StockAtStart int, SafetyStock int, NFP int, StockAtEnd int, PRIMARY KEY (StockNum, BucketName))
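; WITH StockByMonth AS
(-- Anchor: the latest known month from #Stock. TOP/ORDER BY is wrapped in a
-- derived table because ORDER BY cannot directly precede UNION ALL.
SELECT StockNum, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd
FROM (SELECT TOP 1 StockNum, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd
FROM #Stock
ORDER BY MonthNum DESC) S
-- Recursion: last month's closing stock becomes this month's opening stock
UNION ALL
SELECT NFP.StockNum,
SBM.MonthNum + 1 AS MonthNum,
SBM.StockAtEnd AS NewStockAtStart,
5 AS Safety_Stock,
NFP.StockNFP,
SBM.StockAtEnd + 5 - NFP.StockNFP AS NewStockAtEnd
FROM StockByMonth SBM
INNER JOIN #NFP_by_month NFP ON NFP.StockNum = SBM.StockNum AND NFP.MonthNum = SBM.MonthNum + 1
WHERE NFP.MonthNum <= 18
)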
INSERT INTO #StockProjections (StockNum, BucketName, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd)
SELECT StockNum, BucketName, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd
FROM StockByMonth
INNER JOIN #RepBuckets ON StockByMonth.MonthNum = #RepBuckets.BucketNum
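To make the recursion easier to follow, here is a stripped-down sketch of the same idea for a single item and three months. It is not the full solution: the table variable @NFP, the variables @OpeningStock and @SafetyStock, and the sample numbers are assumptions chosen for illustration, with safety stock again treated as a fixed 5 units coming in per month.

```sql
-- Assumed monthly outgoing quantities (NFP) for months 1 to 3
DECLARE @NFP TABLE (MonthNum int PRIMARY KEY, StockNFP int);
INSERT INTO @NFP (MonthNum, StockNFP) VALUES (1, 4), (2, 5), (3, 2);

DECLARE @OpeningStock int = 10;  -- stock at the end of month 0
DECLARE @SafetyStock int = 5;    -- assumed incoming stock per month

WITH StockByMonth AS
(
    -- Anchor: month 0 only carries a closing stock
    SELECT 0 AS MonthNum,
           CAST(NULL AS int) AS StockAtStart,
           @OpeningStock AS StockAtEnd
    UNION ALL
    -- Recursive member: each month starts from the previous month's closing stock
    SELECT SBM.MonthNum + 1,
           SBM.StockAtEnd,
           SBM.StockAtEnd + @SafetyStock - NFP.StockNFP
    FROM StockByMonth AS SBM
    INNER JOIN @NFP AS NFP
        ON NFP.MonthNum = SBM.MonthNum + 1
)
SELECT MonthNum, StockAtStart, StockAtEnd
FROM StockByMonth
ORDER BY MonthNum;
-- Closing stock per month: 10, 11 (10 + 5 - 4), 11 (11 + 5 - 5), 14 (11 + 5 - 2)
```

The recursion stops by itself once the join finds no NFP row for the next month, so the number of projected months is driven by the data rather than by a hard-coded loop.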
Now we have the data, we set up a table for reporting purposes. Note that this table has the month names embedded into the column names (e.g., StockAtEnd_Jun_21). It would be easier to use a generic name (e.g., StockAtEnd_Month4) but I've gone for the slightly more complex case here for demonstration.
/* SET UP TABLE FOR REPORTING */
DECLARE @cols int = 18
DECLARE @iteration int = 0
DECLARE @colname nvarchar(30)
DECLARE @sql2 as nvarchar(max)
CREATE TABLE #tmp (StockNum int PRIMARY KEY)
WHILE @iteration <= @cols
BEGIN
SET @colname = (SELECT TOP 1 BucketName FROM #RepBuckets WHERE BucketNum = @iteration)
SET @sql2 = 'ALTER TABLE #tmp ADD ' + QUOTENAME(@colname) + ' int'
EXEC (@sql2)
SET @iteration = @iteration + 1
END
The last step is to add the data to your reporting table. I've used a pivot here but feel free to use whatever you like.
/* POPULATE TABLE */
DECLARE @columnList nvarchar(max) = N'';
SELECT @columnList += QUOTENAME(BucketName) + N' ' FROM #RepBuckets
SET @columnList = REPLACE(RTRIM(@columnList), ' ', ', ')
DECLARE @sql3 nvarchar(max)
SET @sql3 = N'
;WITH StockPivotCTE AS
(SELECT *
FROM (SELECT StockNum, BucketName, StockAtEnd
FROM #StockProjections
) StockSummary
PIVOT
(SUM(StockAtEnd)
FOR [BucketName]
IN (' + @columnList + N')
) AS StockPivot
)
INSERT INTO #tmp (StockNum, ' + @columnList + N')
SELECT StockNum, ' + @columnList + N'
FROM StockPivotCTE'
EXEC (@sql3)
Here's a DB<>fiddle showing it running with results of each sub-step.
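For readers unfamiliar with PIVOT, it may help to see roughly what the dynamic statement expands to once the column list has been substituted in. The sketch below is a static version with three hand-written bucket names; the table variable @Projections and the names StockAtEnd_M0 to StockAtEnd_M2 are placeholders of mine, not the month-based names generated above.

```sql
-- Assumed long-format projections: one row per stock item and bucket
DECLARE @Projections TABLE (StockNum int, BucketName nvarchar(30), StockAtEnd int);
INSERT INTO @Projections (StockNum, BucketName, StockAtEnd) VALUES
(12422, N'StockAtEnd_M0', 10),
(12422, N'StockAtEnd_M1', 11),
(12422, N'StockAtEnd_M2', 11);

-- Static pivot: each listed BucketName value becomes its own column
SELECT StockNum, [StockAtEnd_M0], [StockAtEnd_M1], [StockAtEnd_M2]
FROM (SELECT StockNum, BucketName, StockAtEnd
      FROM @Projections) AS StockSummary
PIVOT
(
    SUM(StockAtEnd)
    FOR BucketName IN ([StockAtEnd_M0], [StockAtEnd_M1], [StockAtEnd_M2])
) AS StockPivot;
-- One row: 12422, 10, 11, 11
```

The dynamic version is only needed because the IN list of a PIVOT must be written out literally and the month names change every time the report runs.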
I have text stored in the table "StructureStrings"
Create Table StructureStrings(Id INT Primary Key,String nvarchar(4000))
Sample Data:
Id String
1 Select * from Employee where Id BETWEEN ### and ### and Customer Id> ###
2 Select * from Customer where Id BETWEEN ### and ###
3 Select * from Department where Id=###
and I want to replace the "###" token with values fetched from another table
named "StructureValues"
Create Table StructureValues (Id INT Primary Key,Value nvarchar(255))
Id Value
1 33
2 20
3 44
I want to replace the "###" token present in the strings like
Select * from Employee where Id BETWEEN 33 and 20 and Customer Id> 44
Select * from Customer where Id BETWEEN 33 and 20
Select * from Department where Id=33
PS: The assumption here is that the tokens will be replaced with the values in the same order, i.e. the first occurrence of "###" will be replaced by the first value of the "StructureValues.Value" column, and so on.
Posting this as a new answer, rather than editing my previous one.
This uses Jeff Moden's DelimitedSplit8K; it does not use the built in splitter available in SQL Server 2016 onwards, as it does not provide an item number (thus no join criteria).
You'll first need to put the function on your server; then you'll be able to use this. DO NOT expect it to perform well. There's a lot of REPLACE in this, which will hinder performance.
SELECT (SELECT REPLACE(DS.Item, '###', CONVERT(nvarchar(100), SV.[Value]))
FROM StructureStrings sq
CROSS APPLY DelimitedSplit8K (REPLACE(sq.String,'###','###|'), '|') DS --NOTE this uses a varchar, not an nvarchar, you may need to change this if you really have Unicode characters
JOIN StructureValues SV ON DS.ItemNumber = SV.Id
WHERE SS.Id = sq.id
FOR XML PATH ('')) AS NewString
FROM StructureStrings SS;
If you have any questions, please place the comments on this answer; do not put them under the question, which has already become quite a long discussion.
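If installing a splitter function is not an option, one alternative is to replace the tokens one at a time with a recursive CTE. This is a different technique from the answer above, shown only as a sketch: it assumes the values are numbered 1, 2, 3, ... in StructureValues.Id, that no value itself contains "###", and it uses table variables in place of the real tables.

```sql
DECLARE @StructureStrings TABLE (Id int PRIMARY KEY, String nvarchar(4000));
DECLARE @StructureValues TABLE (Id int PRIMARY KEY, [Value] nvarchar(255));

INSERT INTO @StructureStrings (Id, String) VALUES
(1, N'Select * from Employee where Id BETWEEN ### and ### and Customer Id> ###'),
(2, N'Select * from Customer where Id BETWEEN ### and ###'),
(3, N'Select * from Department where Id=###');

INSERT INTO @StructureValues (Id, [Value]) VALUES (1, N'33'), (2, N'20'), (3, N'44');

WITH Replaced AS
(
    -- Anchor: the untouched string, zero replacements made so far
    SELECT SS.Id, CAST(SS.String AS nvarchar(4000)) AS String, 0 AS Step
    FROM @StructureStrings AS SS
    UNION ALL
    -- Recursive member: swap the first remaining token for value number Step + 1
    SELECT R.Id,
           CAST(STUFF(R.String, CHARINDEX(N'###', R.String), 3, SV.[Value]) AS nvarchar(4000)),
           R.Step + 1
    FROM Replaced AS R
    INNER JOIN @StructureValues AS SV
        ON SV.Id = R.Step + 1
    WHERE CHARINDEX(N'###', R.String) > 0
)
-- Keep only the last step reached for each string
SELECT Id, String AS NewString
FROM (SELECT Id, String,
             ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Step DESC) AS rn
      FROM Replaced) AS LastStep
WHERE rn = 1
ORDER BY Id;
```

With the sample data this yields the three strings asked for in the question (33, 20 and 44 substituted in order). A string with more tokens than there are values simply keeps its remaining "###" tokens.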
Maybe this is what you are looking for.
DECLARE @Employee TABLE (Id int)
DECLARE @StructureValues TABLE (Id int, Value int)
INSERT INTO @Employee
VALUES (1), (2), (3), (10), (15), (20), (21)
INSERT INTO @StructureValues
VALUES (1, 10), (2, 20)
SELECT *
FROM @Employee
WHERE Id BETWEEN (SELECT MIN(Value) FROM @StructureValues) AND (SELECT MAX(Value) FROM @StructureValues)
Very different take here:
CREATE TABLE StructureStrings(Id int PRIMARY KEY,String nvarchar(4000));
INSERT INTO StructureStrings
VALUES (1,'SELECT * FROM Employee WHERE Id BETWEEN ### AND ###'),
(2,'SELECT * FROM Customer WHERE Id BETWEEN ### AND ###');
CREATE TABLE StructureValues (Id int, [Value] int);
INSERT INTO StructureValues
VALUES (1,10),
(2,20);
GO
DECLARE @SQL nvarchar(4000);
--I'm assuming that as you gave one output you are supplying an ID or something?
DECLARE @Id int = 1;
WITH CTE AS(
SELECT SS.Id,
SS.String,
SV.[Value],
LEAD([Value]) OVER (ORDER BY SV.Id) AS NextValue,
STUFF(SS.String,PATINDEX('%###%',SS.String),3,CONVERT(varchar(10),[Value])) AS ReplacedString
FROM StructureStrings SS
JOIN StructureValues SV ON SS.Id = SV.Id)
SELECT @SQL = STUFF(ReplacedString,PATINDEX('%###%',ReplacedString),3,CONVERT(varchar(10),NextValue))
FROM CTE
WHERE Id = @Id;
PRINT @SQL;
--EXEC (@SQL); --yes, I should really be using sp_executesql
GO
DROP TABLE StructureValues;
DROP TABLE StructureStrings;
Edit: Note that Id 2 will return NULL, as there isn't a value to LEAD to. If this needs to change, we'll need more logic on what the value should be when there is no value to LEAD to.
Edit 2: This was based on the OP's original post, not what he puts it as later. As it currently stands, it's impossible.
I would like to ask for your help with a Gordian knot in my head regarding SQL Server. I'm trying to replace a UDF with a joined view, but I'm struggling to get the view to return what I need. A bit of clever ordering or similar may well do the trick, but I'm stuck.
Unfortunately I can't sign up to SQL fiddle at the moment so I have to present the test data here:
CREATE TABLE #Contacts
(
ID INT NOT NULL IDENTITY,
Firstname VARCHAR(50) NULL,
Lastname VARCHAR(50) NULL
)
GO
CREATE TABLE #Cars (ID INT NOT NULL IDENTITY, CarModel VARCHAR(50) NULL)
GO
CREATE TABLE #Ownership
(
Contacts_ID INT NOT NULL,
cars_id INT NOT NULL,
ownership_type TINYINT NOT NULL,
DisplayName VARCHAR(50) NULL
)
GO
CREATE TABLE #Races(ID INT NOT NULL IDENTITY, RaceName VARCHAR(50) NOT NULL)
GO
CREATE TABLE #RaceEntries
(
ID INT NOT NULL IDENTITY,
Races_ID INT NOT NULL,
Contacts_ID INT NOT NULL,
cars_id INT NOT NULL
)
INSERT [#Contacts] ([Firstname], [Lastname])
SELECT
'Justin', 'Case' UNION ALL SELECT
'Gladys', 'Friday' UNION ALL SELECT
'Mandy', 'Lifeboats'
GO
INSERT [#Cars] ([CarModel])
VALUES ('Great Car')
GO
INSERT [#Races] ([RaceName])
VALUES ('A Car Race')
INSERT [#Ownership] ([Contacts_ID], [cars_id], [ownership_type], [DisplayName]) SELECT
1, 1, 0, NULL UNION ALL SELECT
2, 1, 1, NULL UNION ALL SELECT
3, 1, 1, 'Mandy Lifeboats Racing Team'
INSERT [#RaceEntries] ([Races_ID], [Contacts_ID], [cars_id]) SELECT
1, 1, 1 UNION ALL SELECT
1, 3, 1 UNION ALL SELECT
1, 2, 1
What I'd like:
SELECT
[cars_id], mvo.Ownername
FROM
[#RaceEntries] -- join a view that returns the ownername
LEFT OUTER JOIN
#myViewOwnername AS mvo ON mvo.Contacts_ID = [#RaceEntries].[Contacts_ID] AND mvo.Cars_ID = [#RaceEntries].[cars_id]
The issue here is that every car only has one owner (type 0 in ownership). It can have other contacts as representatives.
Usually on lists for #RaceEntries, the owner's name is displayed, unless the representative has an agreed "override" (so that his or a company name is displayed instead).
In the above example, Justin Case's entry is straightforward. He is the owner (type 0), end of story.
When Gladys Friday enters (she doesn't have an override "DisplayName") the system should again return Justin Case's name as the owner.
In the last example, Mandy Lifeboats has a DisplayName and therefore that should be returned.
Ideally, I would end up with a view or similar that does the heavy lifting, so that I can join it to x000 records from #RaceEntries (joined on car and contact ID) to get the correct owner name back.
I hope I've simplified the example as much as possible, the real thing is a bit more complex... Please let me know if I should prepare anything else to make helping a bit easier. Many thanks!
I think this might be what you are looking for:
select [#Ownership].Contacts_ID,
[#Ownership].Cars_ID,
coalesce([DisplayName], CarOwners.ContactName) Ownername
into #myViewOwnername
from [#Ownership]
join
(
select [cars_id], Contacts_ID, [#Contacts].Firstname + ' ' + [#Contacts].Lastname ContactName
from [#Ownership]
join [#Contacts]
on [#Ownership].Contacts_ID = [#Contacts].ID
where ownership_type = 0
) CarOwners
on [#Ownership].cars_id = CarOwners.cars_id
A car has one direct owner (type 0), so the sub-query will get you all the owner names of each car. Then you just join that to your ownership table on the cars_id fields. If there is a display name in the ownership table you display that, otherwise you show the car owner's name.
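To check the logic against the three sample race entries, the same idea can be written as a single query joined directly to the entries. This is only a sketch under assumptions: it uses table variables with explicit IDs instead of the temp tables above, and the aliases (re, o, own, c) are mine.

```sql
DECLARE @Contacts TABLE (ID int PRIMARY KEY, Firstname varchar(50), Lastname varchar(50));
DECLARE @Ownership TABLE (Contacts_ID int, cars_id int, ownership_type tinyint, DisplayName varchar(50));
DECLARE @RaceEntries TABLE (Races_ID int, Contacts_ID int, cars_id int);

INSERT INTO @Contacts (ID, Firstname, Lastname) VALUES
(1, 'Justin', 'Case'), (2, 'Gladys', 'Friday'), (3, 'Mandy', 'Lifeboats');

INSERT INTO @Ownership (Contacts_ID, cars_id, ownership_type, DisplayName) VALUES
(1, 1, 0, NULL), (2, 1, 1, NULL), (3, 1, 1, 'Mandy Lifeboats Racing Team');

INSERT INTO @RaceEntries (Races_ID, Contacts_ID, cars_id) VALUES
(1, 1, 1), (1, 3, 1), (1, 2, 1);

SELECT re.Contacts_ID,
       re.cars_id,
       -- The entrant's own override wins; otherwise fall back to the type 0 owner's name
       COALESCE(o.DisplayName, c.Firstname + ' ' + c.Lastname) AS Ownername
FROM @RaceEntries AS re
INNER JOIN @Ownership AS o            -- the entrant's own ownership row
    ON o.Contacts_ID = re.Contacts_ID AND o.cars_id = re.cars_id
INNER JOIN @Ownership AS own          -- the car's single type 0 owner
    ON own.cars_id = re.cars_id AND own.ownership_type = 0
INNER JOIN @Contacts AS c
    ON c.ID = own.Contacts_ID
ORDER BY re.Contacts_ID;
-- Contact 1: Justin Case, contact 2: Justin Case, contact 3: Mandy Lifeboats Racing Team
```

The query relies on every car having exactly one type 0 ownership row, as stated in the question; a car with none would drop out of the result, and a car with several would produce duplicate rows.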
I'd be surprised if this hasn't been asked before, but I haven't been able to find anything. Excel has a function
CHOOSE(n, x_1, x_2, x_3, ...)
which returns x_n for the given value of n.
Is there anything similar in SQL (either standard or MS-specific) supported by SQL Server 2008? I know it should really be implemented using a lookup table in the database, but for what I'm doing I'm not able to add new tables to the database.
I could create a temporary table and populate it from the SQL script, or use
CASE n WHEN 1 THEN x_1 WHEN 2 THEN x_2 WHEN 3 THEN x_3 ... END
but is there anything less cumbersome?
Unfortunately no, it does not seem to be present in your version.
The CHOOSE function has only been available since SQL Server 2012; it works much as you describe for the Excel function.
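As a short illustration of the difference, here are the two forms side by side. The variable @n and the string literals are placeholders; the first statement only runs on SQL Server 2012 or later, while the CASE form also works on 2008.

```sql
DECLARE @n int = 2;

-- SQL Server 2012 and later: picks the @n-th argument (1-based), NULL if out of range
SELECT CHOOSE(@n, 'x_1', 'x_2', 'x_3') AS Chosen;   -- 'x_2'

-- Equivalent that also works on SQL Server 2008
SELECT CASE @n
           WHEN 1 THEN 'x_1'
           WHEN 2 THEN 'x_2'
           WHEN 3 THEN 'x_3'
       END AS Chosen;                                -- 'x_2'
```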
"but for what I'm doing I'm not able to add new tables to the database". Well you always can use temporary table, table variable or, if it's really one time thing - derived table:
select
...,
l.v
from <your table> as t
left outer join (values
(1, x_1), (2, x_2), (3, x_3)
) as l(n, v) on l.n = t.n
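A concrete, runnable version of the derived-table lookup might look like the following. The table variable @t and the colour values are made up for the example; the VALUES constructor in the FROM clause is available from SQL Server 2008 onwards.

```sql
-- Hypothetical source rows, each carrying an index n to look up
DECLARE @t TABLE (id int, n int);
INSERT INTO @t (id, n) VALUES (1, 1), (2, 3), (3, 7);

SELECT t.id, t.n, l.v
FROM @t AS t
LEFT OUTER JOIN (VALUES
    (1, 'red'), (2, 'green'), (3, 'blue')
) AS l(n, v) ON l.n = t.n
ORDER BY t.id;
-- (1, 1, 'red'), (2, 3, 'blue'), (3, 7, NULL): an index with no match yields NULL
```

The LEFT OUTER JOIN mirrors the behaviour of a CASE expression without an ELSE branch, which also returns NULL for an unmatched index.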
Of course, you can always try to create your own choose() function:
create function dbo.f_Choose5(
@index int,
@value1 sql_variant,
@value2 sql_variant,
@value3 sql_variant,
@value4 sql_variant,
@value5 sql_variant
)
returns sql_variant
as
begin
return (
case @index
when 1 then @value1
when 2 then @value2
when 3 then @value3
when 4 then @value4
when 5 then @value5
end
)
end
select dbo.f_Choose5(3, 1, 2 ,3, 4, 5)
select dbo.f_Choose5(3, 1, 2 ,3, default, default)
but you have to keep in mind that scalar functions are not really optimized in SQL Server.
I am currently trying to re-write a stored procedure to take into account the normalisation of one of our tables. In the original procedure we have two tables:
CREATE TABLE #t_batch
(batch_id integer,
thread_group NVARCHAR(60),
dye_code_1 NVARCHAR(10),
dye_conc_1 NUMERIC(19, 7),
dye_code_2 NVARCHAR(10),
dye_conc_2 NUMERIC(19, 7),
dye_code_3 NVARCHAR(10),
dye_conc_3 NUMERIC(19, 7),
dye_code_4 NVARCHAR(10),
dye_conc_4 NUMERIC(19, 7),
dye_code_5 NVARCHAR(10),
dye_conc_5 NUMERIC(19, 7),
dye_code_6 NVARCHAR(10),
dye_conc_6 NUMERIC(19, 7))
CREATE TABLE #t_group
(group_id INTEGER IDENTITY(1, 1),
dye_code_1 NVARCHAR(10),
dye_conc_1 NUMERIC(19, 7),
dye_code_2 NVARCHAR(10),
dye_conc_2 NUMERIC(19, 7),
dye_code_3 NVARCHAR(10),
dye_conc_3 NUMERIC(19, 7),
dye_code_4 NVARCHAR(10),
dye_conc_4 NUMERIC(19, 7),
dye_code_5 NVARCHAR(10),
dye_conc_5 NUMERIC(19, 7),
dye_code_6 NVARCHAR(10),
dye_conc_6 NUMERIC(19, 7),
thread_group NVARCHAR(60),
num_batches INTEGER)
After a number of actions #t_batch was populated with a number of records. We then inserted data into #t_group in the following way:
INSERT INTO #t_group
(dye_code_1, dye_conc_1, dye_code_2, dye_conc_2, dye_code_3, dye_conc_3,
dye_code_4, dye_conc_4, dye_code_5, dye_conc_5, dye_code_6, dye_conc_6,
thread_group, num_batches)
SELECT dye_code_1, dye_conc_1, dye_code_2, dye_conc_2, dye_code_3, dye_conc_3,
dye_code_4, dye_conc_4, dye_code_5, dye_conc_5, dye_code_6, dye_conc_6,
thread_group, COUNT(batch_id)
FROM #t_batch
GROUP BY dye_code_1, dye_conc_1, dye_code_2, dye_conc_2, dye_code_3, dye_conc_3,
dye_code_4, dye_conc_4, dye_code_5, dye_conc_5, dye_code_6, dye_conc_6,
thread_group
ORDER BY dye_code_1, dye_conc_1, dye_code_2, dye_conc_2, dye_code_3, dye_conc_3,
dye_code_4, dye_conc_4, dye_code_5, dye_conc_5, dye_code_6, dye_conc_6,
thread_group
So, we had a series of records that are grouped by the dye columns and a unique group_id for each unique combination of dyes and their concentrations. Also, there is a count of the batch records for each group.
However, since there is in reality no limit to the number of dyes for a batch the tables have been normalised:
CREATE TABLE #t_batch
(batch_id INTEGER,
thread_group NVARCHAR(60))
CREATE TABLE #t_batch_dye
(batch_id_fk INTEGER,
stage INTEGER,
sequence INTEGER,
dye_code NVARCHAR(10),
dye_conc NUMERIC(19,7))
CREATE TABLE #t_group
(group_id INTEGER IDENTITY(1, 1),
thread_group NVARCHAR(60),
num_batches INTEGER)
CREATE TABLE #t_group_dye
(group_id INTEGER,
stage INTEGER,
sequence INTEGER,
dye_code NVARCHAR(10),
dye_conc NUMERIC(19,7))
Now, my question is: assuming that we have #t_batch and #t_batch_dye populated and that there are a varying number of #t_batch_dye records for each record in #t_batch, how can I insert records into #t_group with a unique group_id for each unique combination of dyes and their concentrations as well as a count of the batches for each group?
Is this something I could use the PIVOT keyword for? The examples I have found on the web all seem to assume that the number of pivoted fields is known in advance.
Many thanks,
David
Glasgow, Scotland
Update:
What I have done is to use a function that returns a concatenated string of codes and concs and used that to group the data.
DECLARE @dyes NVARCHAR(2000)
SELECT @dyes = ISNULL(@dyes,'') + dye_code + ' ' + convert(nvarchar, requested_dye_conc) + ' '
FROM #t_batch_dye
WHERE batch_id_fk = @batch_id
ORDER BY dye_code ASC
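The same "concatenate, then group" idea can also be done in one set-based statement, without calling a function per batch. The sketch below is one possible way rather than the procedure actually used: it builds the signature with FOR XML PATH (which works on SQL Server 2008), uses table variables and invented sample rows, and assumes dye codes contain no characters that XML would escape, such as & or <.

```sql
DECLARE @t_batch TABLE (batch_id int PRIMARY KEY, thread_group nvarchar(60));
DECLARE @t_batch_dye TABLE (batch_id_fk int, stage int, [sequence] int,
                            dye_code nvarchar(10), dye_conc numeric(19,7));

INSERT INTO @t_batch (batch_id, thread_group) VALUES (1, N'A'), (2, N'A'), (3, N'A');

INSERT INTO @t_batch_dye (batch_id_fk, stage, [sequence], dye_code, dye_conc) VALUES
(1, 1, 1, N'RED', 0.5), (1, 1, 2, N'BLUE', 0.25),
(2, 1, 1, N'RED', 0.5), (2, 1, 2, N'BLUE', 0.25),
(3, 1, 1, N'RED', 0.5);

WITH Signatures AS
(
    -- One row per batch, with all its dyes flattened into a single string
    SELECT b.batch_id,
           b.thread_group,
           (SELECT d.dye_code + N':' + CONVERT(nvarchar(30), d.dye_conc) + N';'
            FROM @t_batch_dye AS d
            WHERE d.batch_id_fk = b.batch_id
            ORDER BY d.stage, d.[sequence]
            FOR XML PATH('')) AS dye_signature
    FROM @t_batch AS b
)
SELECT thread_group, dye_signature, COUNT(*) AS num_batches
FROM Signatures
GROUP BY thread_group, dye_signature;
-- Batches 1 and 2 share a signature (num_batches = 2); batch 3 forms its own group
```

The grouped result could feed the insert into #t_group, and joining the signatures back to the batches would then give the rows for #t_group_dye. On SQL Server 2017 or later, STRING_AGG would be a simpler way to build the signature.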
You're correct in assuming that PIVOT and more traditional methods of cross-tab querying expect you to know how many columns you want in advance. At that point, you'll need to use some dynamic SQL to get what you're after:
Dynamic Pivoting in SQL Server
SQL Server dynamic PIVOT query?
Pivots with dynamic columns in SQL Server
A partial answer, and not an ideal one:
If you know that there will never be more than say 20 dye combinations, you can create another temp table with
select t.batch_id, t.thread_group,
-- max() collapses the dye rows of one batch into a single row
max(case when d.sequence=1 then d.dye_code end) as code1,
max(case when d.sequence=1 then d.dye_conc end) as conc1,
max(case when d.sequence=2 then d.dye_code end) as code2,
max(case when d.sequence=2 then d.dye_conc end) as conc2,
max(case when d.sequence=3 then d.dye_code end) as code3,
max(case when d.sequence=3 then d.dye_conc end) as conc3,
<lots of boring copy&paste...>
max(case when d.sequence=20 then d.dye_code end) as code20,
max(case when d.sequence=20 then d.dye_conc end) as conc20
from #t_batch t
join #t_batch_dye d on t.batch_id = d.batch_id_fk
group by t.batch_id, t.thread_group
and then select your group out of that, using all of code1 to conc20. It's not beautiful, but it's clear. And I know it negates the whole point of normalising your tables out in the first place! Good luck.