How do I do a SQL Group By with an aggregate? - sql-server

I have a pair of linked tables, the first with job information and the second with material information (each job has 1-10 materials). I am also joining a couple other tables (with customer info and additional job info). I want a total sum of all material of the same type within the same job. Thus job 78100 has three entries of the same material with quantities of 800, 800, and 900. I want the final table to say "78100, etc. . . 2500."
I get the results I want with the following code but I also need the material name, which is in another table that is joined only to CJF.
SELECT
OJ.JobN, C.CustomerName, OJ.JobDescription, OJ.Quantity,
OJ.JobCloseDate, CJ.FSC,
(SELECT SUM(CJF.MT_NETSHEETS)
FROM CT_JobForm CJF
WHERE CJF.JobN = OJ.JobN)
FROM
OpenJob OJ
JOIN
Customer C ON OJ.CustomerN = C.CustomerN
JOIN
CT_Job2 CJ ON OJ.JobN = CJ.JobN
WHERE
JobDescription LIKE '%FSC%'
AND JobCloseDate BETWEEN '01-01-2016' AND '12-31-2016'
ORDER BY
OJ.JobN ASC
I can get the material name using this code, but doing so causes me to lose the grouping (the new code is inside the double asterisks).
SELECT
OJ.JobN, C.CustomerName, OJ.JobDescription, OJ.Quantity,
OJ.JobCloseDate, CJ.FSC,
(SELECT SUM(CJF.MT_NETSHEETS)
FROM CT_JobForm CJF
WHERE CJF.JobN = OJ.JobN),
**(SELECT MCC.MCCDescription
FROM MatlCostCntr MCC
WHERE CJF.MCCN = MCC.MCCN)**
FROM
OpenJob OJ
JOIN
Customer C ON OJ.CustomerN = C.CustomerN
JOIN
CT_Job2 CJ ON OJ.JobN = CJ.JobN
**JOIN
CT_JobForm CJF ON CJF.JobN = OJ.JobN**
WHERE
JobDescription LIKE '%FSC%'
AND JobCloseDate BETWEEN '01-01-2016' AND '12-31-2016'
ORDER BY
OJ.JobN ASC
I tried nesting the MCCN Select statement in the Netsheet one but couldn't make that work at all. Is there a way to do this that isn't more work than just manually fixing it in Excel?
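For what it's worth, the grouping described above can be sketched as a plain GROUP BY over the job and the material, with the material table joined in directly instead of through scalar subqueries. The snippet below is only an illustration: it runs against SQLite through Python's sqlite3 module rather than SQL Server, the schema is cut down to the columns involved, and the rows (including the material names '80# Gloss' and 'Offset') are made up. Only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down versions of the tables named in the question (assumed layout).
    CREATE TABLE OpenJob (JobN INTEGER PRIMARY KEY, JobDescription TEXT);
    CREATE TABLE MatlCostCntr (MCCN INTEGER PRIMARY KEY, MCCDescription TEXT);
    CREATE TABLE CT_JobForm (JobN INTEGER, MCCN INTEGER, MT_NETSHEETS INTEGER);

    INSERT INTO OpenJob VALUES (78100, 'FSC brochure');
    INSERT INTO MatlCostCntr VALUES (1, '80# Gloss'), (2, 'Offset');
    INSERT INTO CT_JobForm VALUES
        (78100, 1, 800), (78100, 1, 800), (78100, 1, 900), (78100, 2, 500);
""")

# One output row per (job, material); SUM collapses the repeated material rows.
rows = conn.execute("""
    SELECT OJ.JobN, MCC.MCCDescription, SUM(CJF.MT_NETSHEETS) AS TotalSheets
    FROM OpenJob OJ
    JOIN CT_JobForm CJF ON CJF.JobN = OJ.JobN
    JOIN MatlCostCntr MCC ON MCC.MCCN = CJF.MCCN
    WHERE OJ.JobDescription LIKE '%FSC%'
    GROUP BY OJ.JobN, MCC.MCCDescription
    ORDER BY OJ.JobN, MCC.MCCDescription
""").fetchall()

print(rows)
```

With the sample rows, job 78100 gets one line per material, and the three 800/800/900 entries come back as a single 2500. In SQL Server, any further non-aggregated columns in the SELECT list (CustomerName, JobCloseDate and so on) would also have to be listed in the GROUP BY.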

Related

SQL-Server join issue when filtering on a freetext (vchar) column

I am stuck with a SQL Server view I am trying to create. The view returns a list of resources that are assigned to a project along with a few other details such as contract information.
I'm having issues with the resource_contracts table though because I'm stuck dealing with what is essentially a free-text field.
SELECT DISTINCT
CONCAT(RTRIM(res.first_name),' ', RTRIM(res.surname)) AS fullname ,
res.main_res_id ,
res.resource_id ,
res.resource_typ ,
res.status ,
rel.rel_value ,
asn.booking_project AS project ,
asn.booking_project_descr AS project_descr ,
asn.assignment_position AS position ,
asn.date_from AS commencement_date ,
DATEADD(DAY,1,asn.date_to) AS end_date ,
con.comment_fx
FROM resourcees res
INNER JOIN resource_relations rel
ON
res.main_res_id = rel.resource_id
AND rel.date_to >= CAST(CURRENT_TIMESTAMP AS DATE)
AND res.client = rel.client
LEFT OUTER JOIN resource_relations cc
ON
res.client = cc.client
AND res.resource_id = cc.resource_id
AND cc.rel_attr_id = 'C1'
AND res.date_to BETWEEN cc.date_from AND cc.date_to
AND cc.status = 'N'
INNER JOIN relation_values ar2
ON
cc.rel_value = ar2.dim_value
AND ar2.client = res.client
INNER JOIN assignments asn
ON
res.main_res_id = asn.resource_id
LEFT OUTER JOIN resource_contracts con
ON
con.dim_value = res.main_res_id
AND res.client = con.client
AND con.comment_fx LIKE '%CONAU%'
AND con.date_to_fx >= asn.date_to
WHERE
asn.booking_project = '123456'
ORDER BY
fullname
I guess the above looks fairly large. It's the last join causing the issue for reference.
The resource_contracts table contains three columns. I hate this setup, but it's outside of my control unfortunately.
date_from_fx = DATETIME
date_to_fx = DATETIME
comment_fx = VCHAR 255
It's used to record a contract's start and end dates, plus a free-text field that, annoyingly, could contain anything. Sample values might be "CONAU SPP" or "CONSG ABC".
I'm stuck on the comment_fx field above however.
I specifically want to see contracts containing CONAU, or return a NULL value if the resource does not have one that meets the date requirement, or does not have a row at all. Unfortunately this logic is getting mixed up with any other contract they have, such as "CONSG ABC".
No matter what join I apply, I either see only the resources with the required contract, or I get duplicate rows with null values and a mix of the non-applicable contracts. I guess I am missing something simple.
Ultimately I need to produce a list of resources that are assigned to a project, but do not have the required contract (CONAU), that list will trigger another process that I've already sorted out.
Updated:
Let me show you the data that gets returned if we removed the resource_contracts table causing my issues:
Data result
Apologies, I couldn't format the table into something that resembled a table to paste here.
Here is the data from the contracts table:
contracts table
There's kind of multiple things I want to do here but I'll simplify it into a single one.
I'm trying to send a parameter to the query, 'CONAU' for example. So it will return all resources that do NOT have a valid row containing the string CONAU.
Problem is, they might have other rows like CONSG, or no rows at all.
During my attempts I would often get the wrong rows to show or when using ISNULL in the SELECT part, I would get null rows and duplicated data.
The conditions could be inverted but I'm trying to learn this myself.
SQL fiddle too: http://sqlfiddle.com/#!18/7558f/2
WHERE
asn.booking_project = '123456'
AND con.comment_fx LIKE '%CONAU%'
To the trailing end.
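The requirement stated in the question (resources on a project that do NOT have a valid row containing the given string) can be sketched as an anti-join with NOT EXISTS, which avoids both the duplicate rows and the interference from other contracts. This is only a sketch under assumptions: it runs on SQLite via Python's sqlite3 module, the tables are reduced to a handful of columns, the table name resources and all sample rows are invented, and the function name resources_missing_contract is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for the tables in the question (assumed layout).
    CREATE TABLE resources (main_res_id INTEGER, first_name TEXT);
    CREATE TABLE assignments (resource_id INTEGER, booking_project TEXT, date_to TEXT);
    CREATE TABLE resource_contracts (dim_value INTEGER, comment_fx TEXT, date_to_fx TEXT);

    INSERT INTO resources VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cho'), (4, 'Dee');
    INSERT INTO assignments VALUES
        (1, '123456', '2020-06-30'), (2, '123456', '2020-06-30'),
        (3, '123456', '2020-06-30'), (4, '123456', '2020-06-30');
    INSERT INTO resource_contracts VALUES
        (1, 'CONAU SPP', '2020-12-31'),   -- valid CONAU contract
        (1, 'CONSG ABC', '2020-12-31'),   -- unrelated extra contract
        (2, 'CONSG ABC', '2020-12-31'),   -- only a non-matching contract
        (4, 'CONAU SPP', '2020-01-31');   -- CONAU, but it ends before the assignment
""")

def resources_missing_contract(conn, project, code):
    """Return names of resources on `project` with no valid contract containing `code`."""
    sql = """
        SELECT res.first_name
        FROM resources res
        JOIN assignments asn ON asn.resource_id = res.main_res_id
        WHERE asn.booking_project = ?
          AND NOT EXISTS (
              SELECT 1
              FROM resource_contracts con
              WHERE con.dim_value = res.main_res_id
                AND con.comment_fx LIKE '%' || ? || '%'
                AND con.date_to_fx >= asn.date_to
          )
        ORDER BY res.first_name
    """
    return [name for (name,) in conn.execute(sql, (project, code))]

missing = resources_missing_contract(conn, "123456", "CONAU")
print(missing)
```

Here Ann is excluded because she has a valid CONAU row, while Bob (other contract only), Cho (no rows) and Dee (expired CONAU) are all returned once each. T-SQL conventionally concatenates strings with + rather than ||, so the LIKE pattern would need adjusting there.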

SqlServer Many to Many AND

I have 3 (hypothetical) tables.
Photos (a list of photos)
Attributes (things describing the photos)
PhotosToAttributes (a table to link the first 2)
I want to retrieve the Names of all the Photos that have a list of attributes.
For example, all photos that have both dark lighting and are portraits (AttributeID 1 and 2). Or, for example, all photos that have dark lighting, are portraits and were taken at a wedding (AttributeID 1 and 2 and 5). Or any arbitrary number of attributes.
The scale of the database will be maybe 10,000 rows in Photos, 100 Rows in Attributes and 100,000 rows in PhotosToAttributes.
This question: SQL: Many-To-Many table AND query is very close. (I think.) I also read the linked answers about performance. That leads to something like the following. But, how do I get Name instead of PhotoID? And presumably my code (C#) will build this query and adjust the attribute list and count as necessary?
SELECT PhotoID
FROM PhotosToAttributes
WHERE AttributeID IN (1, 2, 5)
GROUP by PhotoID
HAVING COUNT(1) = 3
I'm a bit database illiterate (it's been 20 years since I took a database class); I'm not even sure this is a good way to structure the tables. I wanted to be able to add new attributes and photos at will without changing the data access code.
It is probably a reasonable way to structure the database. An alternate would be to keep all the attributes as a delimited list in a varchar field, but that would lead to performance issues as you search the field.
Your code is close. To take it to the final step, you just need to join the other two tables like this:
Select p.Name, p.PhotoID
From Photos As p
Join PhotosToAttributes As pta On p.PhotoID = pta.PhotoID
Join Attributes As a On pta.AttributeID = a.AttributeID
Where a.Name In ('Dark Light', 'Portrait', 'Wedding')
Group By p.Name, p.PhotoID
Having Count(*) = 3;
By joining the Attributes table like that it means you can search for attributes by their name, instead of their ID.
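On the question of building this from application code for an arbitrary number of attributes, here is a rough sketch of one way it might look. It uses Python and SQLite purely so that it can be run as-is; the question's application is C#, where the same shape should carry over using command parameters. The sample rows and the function name photos_with_all are made up, and the sketch assumes that attribute names are unique and that a photo cannot be linked to the same attribute twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Photos (PhotoID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Attributes (AttributeID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PhotosToAttributes (PhotoID INTEGER, AttributeID INTEGER,
                                     PRIMARY KEY (PhotoID, AttributeID));

    INSERT INTO Photos VALUES (1, 'bride.jpg'), (2, 'alley.jpg'), (3, 'cake.jpg');
    INSERT INTO Attributes VALUES (1, 'Dark Light'), (2, 'Portrait'), (5, 'Wedding');
    INSERT INTO PhotosToAttributes VALUES
        (1, 1), (1, 2), (1, 5),   -- bride.jpg has all three attributes
        (2, 1), (2, 2),           -- alley.jpg is dark and a portrait
        (3, 5);                   -- cake.jpg is only a wedding photo
""")

def photos_with_all(conn, attribute_names):
    """Return names of photos that carry every attribute in attribute_names."""
    wanted = sorted(set(attribute_names))          # drop accidental duplicates
    placeholders = ", ".join("?" for _ in wanted)  # one ? per attribute
    sql = f"""
        SELECT p.Name
        FROM Photos p
        JOIN PhotosToAttributes pta ON pta.PhotoID = p.PhotoID
        JOIN Attributes a ON a.AttributeID = pta.AttributeID
        WHERE a.Name IN ({placeholders})
        GROUP BY p.PhotoID, p.Name
        HAVING COUNT(*) = ?
        ORDER BY p.Name
    """
    # The attribute values and the required count are passed as parameters,
    # so only the number of placeholders changes between calls.
    return [name for (name,) in conn.execute(sql, (*wanted, len(wanted)))]

print(photos_with_all(conn, ["Dark Light", "Portrait"]))
print(photos_with_all(conn, ["Dark Light", "Portrait", "Wedding"]))
```

Only the placeholder list is built as text; the values themselves never end up in the SQL string, which keeps the query safe when attribute names come from user input.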
First, create a view from your joins:
create view vw_PhotosWithAttributes
as
select
p.PhotoId,
a.AttributeID,
p.Name PhotoName,
a.Name AttributeName
from Photos p
inner join PhotosToAttributes pa on p.PhotoId = pa.PhotoId
inner join Attributes a on a.AttributeID = pa.AttributeID
You can then easily query by attribute, name, or ID, but don't forget to index the fields properly.

SQL SUM() function with parameters returned by query for each row

First of all, sorry for that weird title. Here is the thing:
I work for an online shop which sells products on Amazon. Since we sell sets of different items, it happens that we send the same item within multiple sets to Amazon FBA. To get the total quantity of one item across all sets, I wrote the following query:
SELECT
SUM(nQuantity)
AS [total]
FROM [amazon_fba]
INNER JOIN (SELECT
[cArtNr]
FROM [tArtikel]
INNER JOIN (SELECT
[kStueckliste]
FROM [tStueckliste]
WHERE [kArtikel] = (SELECT
[kArtikel]
FROM [tArtikel]
WHERE [cHAN] = 12345)) [bar]
ON [tArtikel].[kStueckliste] = [bar].[kStueckliste]) [foo]
ON [amazon_fba].[cSellerSKU] = [foo].[cArtNr]
The cHAN=12345 part is just used to pick one specific item for which we want to know the total number of items. This query itself works fine, so this is not the problem.
However, I also know that all products that are part of sets have [tArtikel].[kStueckliste]=0, which -in theory- makes identifying them pretty easy. Which got me to the idea, that I could use this query to instantly generate a list of all these products with their respective total, like:
kArtikel | total
=================
01234 | 23
56789 | 42
So basically I needed something like
foreach (
select [kArtikel]
from [tArtikel]
where [tArtikel].[kStueckliste]=0
) do (
< the query I made >
)
Thus I tried the following statement:
SELECT
SUM(nQuantity)
AS [total]
FROM [amazon_fba]
INNER JOIN (SELECT
[cArtNr]
FROM [tArtikel]
INNER JOIN (SELECT
[kStueckliste]
FROM [tStueckliste]
INNER JOIN (SELECT
[kArtikel]
FROM [tArtikel]
WHERE [tArtikel].[kStueckliste] = 0) [baz]
ON [tStueckliste].[kArtikel] = [baz].[kArtikel]) [bar]
ON [tArtikel].[kStueckliste] = [bar].[kStueckliste]) [foo]
ON [amazon_fba].[cSellerSKU] = [foo].[cArtNr]
This did not -as I hoped- return a list of sums, but instead gave me the total sum of all sums I wanted to create.
Since I am pretty new to SQL (about two weeks in maybe), I have neither any idea what to do, nor where my mistake is, nor what phrasing I should use to google my way around, hence the weird title of this post. So if anyone could help me with that and/or point me in the right direction I'd be really happy :)
I write MySQL rather than SQL Server's T-SQL, but I believe it's very similar apart from a few functions and bits of syntax. Here's what I think should work for you:
select ar.cArtNr, sum(am.nQuantity) as total
from amazon_fba am
join tArtikel ar on ar.cArtNr=am.cSellerSKU
join tStueckliste st on st.kStueckliste=ar.kStueckliste
where ar.kStueckliste=0
group by ar.cArtNr;
Adding the group by will do the split out by articles, but reducing the number of brackets (in this instance derived tables) will speed up the query provided you're using indexes. Again, this is how I would do it in MySQL, and the only other query language I have experience in is BigQuery which won't help here.
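To make the effect of the GROUP BY concrete, here is a small sketch of the difference between one grand total and one total per item. It is an illustration only: it runs on SQLite through Python's sqlite3 module, the three tables are reduced to the columns used in the question, and the rows are invented. It also assumes one particular reading of the schema, namely that tStueckliste lists the component items of each set and that tArtikel.kStueckliste on a set article points at its parts list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tArtikel (kArtikel INTEGER, cArtNr TEXT, kStueckliste INTEGER);
    CREATE TABLE tStueckliste (kStueckliste INTEGER, kArtikel INTEGER);
    CREATE TABLE amazon_fba (cSellerSKU TEXT, nQuantity INTEGER);

    -- Two single items (kStueckliste = 0) and three sets that contain them.
    INSERT INTO tArtikel VALUES
        (1234, 'ITEM-1', 0), (56789, 'ITEM-2', 0),
        (10, 'SET-A', 100), (11, 'SET-B', 101), (12, 'SET-C', 102);
    INSERT INTO tStueckliste VALUES
        (100, 1234), (100, 56789),   -- SET-A contains both items
        (101, 1234),                 -- SET-B contains ITEM-1 only
        (102, 56789);                -- SET-C contains ITEM-2 only
    INSERT INTO amazon_fba VALUES ('SET-A', 20), ('SET-B', 3), ('SET-C', 22);
""")

JOINS = """
    FROM tStueckliste st
    JOIN tArtikel item    ON item.kArtikel = st.kArtikel AND item.kStueckliste = 0
    JOIN tArtikel set_art ON set_art.kStueckliste = st.kStueckliste
    JOIN amazon_fba fba   ON fba.cSellerSKU = set_art.cArtNr
"""

# Without GROUP BY the aggregate collapses everything into a single row.
grand_total = conn.execute("SELECT SUM(fba.nQuantity)" + JOINS).fetchone()[0]

# Carrying the item key through the joins and grouping by it gives one row per item.
per_item = conn.execute(
    "SELECT st.kArtikel, SUM(fba.nQuantity) AS total"
    + JOINS
    + "GROUP BY st.kArtikel ORDER BY st.kArtikel"
).fetchall()

print(grand_total)
print(per_item)
```

The key point is that the item identifier has to survive all the way to the outer SELECT. In the nested version from the question, kArtikel is dropped inside the innermost derived table, so there is nothing left to group by.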

Multi join issue

**EDIT** Thanks for all the input, and sorry for the late reply. I was away during the weekend without internet access. I realized from the answers that I needed to provide more information so people could understand the problem more thoroughly, so here it comes:
I am migrating an old database design to a new design. The old one is a mess and very confusing (I wasn't involved in the old design). I've attached a picture of the relevant part of the old design below:
The table called Item will exist in the new design as well, and it has all the columns that I need in the new design except one, and this is where my problem begins. I need the column I named 'neededProp' to be associated with each migrated row from Item (by associated I mean it should become a column in the new Item table).
So for a particular eid in table Environment there can be n entries in table Item. The "corresponding" set exists in table Room. The only way to know which rows in Item and Room are associated is through the columns "itemId" and "objectId" in the respective tables. For example, for a particular eid there might be 100 entries in Item and Room, and their "itemId" and "objectId" values can run from 1 to 100, so that column is only unique within a particular eid (or baseSeq, as it is called in table BaseFile).
Basically, you can say that the tables Environment and BaseFile resemble each other, and the tables Item and Room resemble each other. The difference is that some tables lack some columns and others may have some extra. I have no idea why it was designed like this from the beginning.
My question is if someone can help me with creating a query so that I can be able to find out the proper "neededProp" for each row in the Item-table so I can get that data into the new design?
**OLD PART** This might be a trivial question but I can't get it to work as I want. I want to join a few tables as in the SQL statement below. If I start like this and run this query
select * from Environment e
join items ei on e.eid = ei.eid
I get like 400000 rows which is what I want. However if I add one more line so it looks like this:
select * from Environment e
join items ei on e.eid= ei.eid
left join Room r on e.roomnr = r.roomobjectnr
I get an insane amount of rows so there must be some multiplication going on. I want to get the same amount of rows ( like 400000 in this case ) even after joining the third table. Is that possible somehow? Maybe like creating a temporary view with the first 2 rows.
I am using MSSQL server.
Without knowing what data your second query returns, it's very difficult to say exactly how to write this. Most likely there is an additional column in Room that you have forgotten to join on, such as something indicating a facility or hallway, so that you end up with multiple 'Room 1' entries, for example.
However, to answer your question regarding another way to write this out without using a temp table I've crufted up the below as an example of using a common table expression which will only return one record per source row.
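;WITH cte_EnvironmentItems AS (
-- NOTE: a CTE cannot return two columns with the same name (e.g. eid), so replace * with an explicit column list if names collide
SELECT *
FROM Environment E
INNER JOIN Items I ON I.eid = E.eid
), cte_RankedRoom AS (
SELECT *
-- number the rows within each roomobjectnr so that RN = 1 keeps one row per room
,ROW_NUMBER() OVER (PARTITION BY R.roomobjectnr ORDER BY R.UpdateDate DESC) [RN]
FROM Room R
)
SELECT *
FROM cte_EnvironmentItems E
LEFT JOIN cte_RankedRoom R ON E.roomnr = R.roomobjectnr
AND R.RN = 1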
By the way, do you need any columns from the Room table? If not, then:
select * from Environment e
join items ei on e.eid= ei.eid
where e.roomnr in (select r.roomobjectnr from Room r )
Otherwise:
select * from Environment e
join items ei on e.eid= ei.eid
left join (select distinct roomobjectnr from Room) r on e.roomnr = r.roomobjectnr
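The row multiplication, and why joining to a de-duplicated Room stops it, can be seen on a toy data set. The sketch below uses SQLite through Python's sqlite3 module instead of SQL Server, and the tables hold only the join columns plus made-up rows, so treat it as an illustration of the mechanism rather than of the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Environment (eid INTEGER, roomnr INTEGER);
    CREATE TABLE items (eid INTEGER, itemId INTEGER);
    CREATE TABLE Room (roomobjectnr INTEGER, objectId INTEGER);

    INSERT INTO Environment VALUES (1, 7), (2, 8);
    INSERT INTO items VALUES (1, 1), (1, 2), (2, 1);
    -- roomobjectnr is NOT unique in Room: three rows share the value 7.
    INSERT INTO Room VALUES (7, 1), (7, 2), (7, 3), (8, 1);
""")

def count_rows(sql):
    """Run a SELECT COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

base = count_rows("""
    SELECT COUNT(*) FROM Environment e
    JOIN items ei ON e.eid = ei.eid
""")

# Every Environment/items row is repeated once per matching Room row.
multiplied = count_rows("""
    SELECT COUNT(*) FROM Environment e
    JOIN items ei ON e.eid = ei.eid
    LEFT JOIN Room r ON e.roomnr = r.roomobjectnr
""")

# Joining to one row per roomobjectnr keeps the original row count.
deduplicated = count_rows("""
    SELECT COUNT(*) FROM Environment e
    JOIN items ei ON e.eid = ei.eid
    LEFT JOIN (SELECT DISTINCT roomobjectnr FROM Room) r
           ON e.roomnr = r.roomobjectnr
""")

print(base, multiplied, deduplicated)
```

The row count only stays stable when the right-hand side of the join has at most one row per join key. If a specific Room row is needed (for neededProp, say), the join condition has to include whatever else identifies that row.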

How can I convert a view containing a START WITH...CONNECT BY sub-query to SQL Server?

I am trying to convert a view from an Oracle RDBMS to SQL Server. The view looks like:
create or replace view user_part_v
as
select part_region.part_id, users.id as users_id
from part_region, users
where part_region.region_id in(select region_id
from region_relation
start with region_id = users.region_id
connect by parent_region_id = prior region_id)
Having read about recursive CTE's and also about their use in sub-queries, my best guess at translating the above into SQL Server syntax is:
create view user_part_v
as
with region_structure(region_id, parent_region_id) as (
select region_id
, parent_region_id
from region_relation
where parent_region_id = users.region_id
union all
select r.region_id
, r.parent_region_id
from region_relation r
join region_structure rs on rs.parent_region_id = r.region_id
)
select part_region.part_id, users.id as users_id
from part_region, users
where part_region.region_id in(select region_id from region_structure)
Obviously this gives me an error about the reference to users.region_id in the CTE definition.
How can I achieve the same result in SQL Server as I get from the Oracle view?
Background
I am working on the conversion of a system from running on an Oracle 11g RDBMS to SQL Server 2008. This system is a relatively large Java EE based system, using JPA (Hibernate) to query from the database.
Many of the queries use the above mentioned view to restrict the results returned to those appropriate for the current user. If I cannot convert the view directly then the conversion will be much harder as I will need to change all of the places where we query the database to achieve the same result.
The tables referenced by this view have a structure similar to:
USERS
    ID
    REGION_ID
REGION
    ID
    NAME
REGION_RELATIONSHIP
    PARENT_REGION_ID
    REGION_ID
PART
    ID
    PARTNO
    DESCRIPTION
PART_REGION
    PART_ID
    REGION_ID
So, we have regions, arranged into a hierarchy. A user may be assigned to a region. A part may be assigned to many regions. A user may only see the parts assigned to their region. The regions reference various geographic regions:
World
    Europe
        Germany
        France
        ...
    North America
        Canada
        USA
            New York
            ...
If a part, #123, is assigned to the region USA, and the user is assigned to the region New York, then the user should be able to see that part.
UPDATE: I was able to work around the error by creating a separate view that contained the necessary data, and then having my main view join to this view. This has the system working, but I have not yet done thorough correctness or performance testing. I am still open to suggestions for better solutions.
I reformatted your original query to make it easier for me to read.
create or replace view user_part_v
as
select part_region.part_id, users.id as users_id
from part_region, users
where part_region.region_id in(
select region_id
from region_relation
start with region_id = users.region_id
connect by parent_region_id = prior region_id
);
Let's examine what's going on in this query.
select part_region.part_id, users.id as users_id
from part_region, users
This is an old-style join where the tables are cartesian joined and then the results are reduced by the subsequent where clause(s).
where part_region.region_id in(
select region_id
from region_relation
start with region_id = users.region_id
connect by parent_region_id = prior region_id
);
The sub-query that's using the connect by statement is using the region_id from the users table in outer query to define the starting point for the recursion.
Then the in clause checks to see if the region_id for the part_region is found in the results of the recursive query.
This recursion follows the parent-child linkages given in the region_relation table.
So the combination of doing an in clause with a sub-query that references the parent and the old-style join means that you have to consider what the query is meant to accomplish and approach it from that direction (rather than just a tweaked re-arrangement of the old query) to be able to translate it into a single recursive CTE.
This query will also return multiple rows if the part is assigned to multiple regions along the same branch of the region hierarchy, e.g. if the part is assigned to both North America and USA, a user assigned to New York will get two rows returned for their users_id with the same part_id.
Given the Oracle view and the background you gave of what the view is supposed to do, I think what you're looking for is something more like this:
create view user_part_v
as
with user_regions(users_id, region_id, parent_region_id) as (
select u.id, u.region_id, rr.parent_region_id
from users u
left join region_relation rr on u.region_id = rr.region_id
union all
select ur.users_id, rr.region_id, rr.parent_region_id
from user_regions ur
inner join region_relation rr on ur.parent_region_id = rr.region_id
)
select pr.part_id, ur.users_id
from part_region pr
inner join user_regions ur on pr.region_id = ur.region_id;
Note that I've added the users_id to the output of the recursive CTE, and then just done a simple inner join of the part_region table and the CTE results.
Let me break down the query for you.
select u.id, u.region_id, rr.parent_region_id
from users u
left join region_relation rr on u.region_id = rr.region_id
This is the starting set for our recursion. We're taking the region_relation table and joining it against the users table, to get the starting point for the recursion for every user. That starting point being the region the user is assigned to along with the parent_region_id for that region. A left join is done here and the region_id is pulled from the user table in case the user is assigned to a top-most region (which means there won't be an entry in the region_relation table for that region).
select ur.users_id, rr.region_id, rr.parent_region_id
from user_regions ur
inner join region_relation rr on ur.parent_region_id = rr.region_id
This is the recursive part of the CTE. We take the existing results for each user, then add rows for each user for the parent regions of the existing set. This recursion happens until we run out of parents. (i.e. we hit rows that have no entries for their region_id in the region_relationship table.)
select pr.part_id, ur.users_id
from part_region pr
inner join user_regions ur on pr.region_id = ur.region_id;
This is the part where we grab our final result set. Assuming (as I do from your description) that each region has only one parent (which would mean that there's only one row in region_relationship for each region_id), a simple join will return all the users that should be able to view the part based on the part's region_id. This is because there is exactly one row returned per user for the user's assigned region, and one row per user for each parent region up to the hierarchy root.
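The walk described in this breakdown can be tried out on a tiny hierarchy. The following is a sketch only: it runs on SQLite through Python's sqlite3 module, where the CTE needs the RECURSIVE keyword that SQL Server omits, and the region numbers, user ids and part ids are invented (1 = World, 2 = North America, 3 = USA, 4 = New York). As in the explanation above, the top-most region is assumed to have no row of its own in region_relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, region_id INTEGER);
    CREATE TABLE region_relation (region_id INTEGER, parent_region_id INTEGER);
    CREATE TABLE part_region (part_id INTEGER, region_id INTEGER);

    -- 1 = World, 2 = North America, 3 = USA, 4 = New York (made-up ids).
    INSERT INTO region_relation VALUES (2, 1), (3, 2), (4, 3);
    INSERT INTO users VALUES (100, 4), (200, 3), (300, 1);
    INSERT INTO part_region VALUES (123, 3), (456, 4);
""")

visible = conn.execute("""
    WITH RECURSIVE user_regions(users_id, region_id, parent_region_id) AS (
        -- Anchor: each user's own region, with its parent if it has one.
        SELECT u.id, u.region_id, rr.parent_region_id
        FROM users u
        LEFT JOIN region_relation rr ON u.region_id = rr.region_id
        UNION ALL
        -- Recursive step: move one level up for every row found so far.
        SELECT ur.users_id, rr.region_id, rr.parent_region_id
        FROM user_regions ur
        JOIN region_relation rr ON ur.parent_region_id = rr.region_id
    )
    SELECT pr.part_id, ur.users_id
    FROM part_region pr
    JOIN user_regions ur ON pr.region_id = ur.region_id
    ORDER BY pr.part_id, ur.users_id
""").fetchall()

print(visible)
```

With this data the New York user (100) sees part 123, which is assigned to USA, while the USA user (200) does not see part 456, which is assigned to New York. One thing the sketch makes visible: because the recursive step joins on rows that exist in region_relation, a root region with no row of its own is only included for users assigned directly to it, so it may be worth checking how the root is stored in the real table.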
NOTE:
Both the original query and this one do have a limitation that I want to make sure you are aware of. If the part is assigned to a region that is lower in the hierarchy than the user (i.e. a region that is a descendant of the user's region, like the part being assigned to New York and the user to USA instead of the other way around), the user won't see that part. The part has to be assigned to either the user's assigned region, or one higher in the region hierarchy.
Another thing is that this query still exhibits the behavior I mentioned above for the original query: if a part is assigned to multiple regions along the same branch of the hierarchy, multiple rows will be returned for the same combination of users_id and part_id. I kept this because I wasn't sure if you wanted that behavior changed or not.
If this is actually an issue and you want to eliminate the duplicates, then you can replace the query below the CTE with this one:
select p.id as part_id, u.id as users_id
from part p
cross join users u
where exists (
select 1
from part_region pr
inner join user_regions ur on pr.region_id = ur.region_id
where pr.part_id = p.id
and ur.users_id = u.id
);
This does a cartesian join between the part table and the users table and then only returns rows where the combination of the two has at least one row in the results of the subquery, which are the results that we are trying to de-duplicate.
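To see the duplicate rows and the effect of the EXISTS form side by side, here is a small sketch. It is an illustration under assumptions: SQLite via Python's sqlite3 module stands in for SQL Server (hence the RECURSIVE keyword), the tables hold only the columns involved, and the ids are invented (2 = North America, 3 = USA, 4 = New York, with part 123 assigned to both North America and USA).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, region_id INTEGER);
    CREATE TABLE part (id INTEGER PRIMARY KEY);
    CREATE TABLE region_relation (region_id INTEGER, parent_region_id INTEGER);
    CREATE TABLE part_region (part_id INTEGER, region_id INTEGER);

    INSERT INTO region_relation VALUES (2, 1), (3, 2), (4, 3);
    INSERT INTO users VALUES (100, 4);                  -- user in New York
    INSERT INTO part VALUES (123);
    INSERT INTO part_region VALUES (123, 2), (123, 3);  -- same branch, twice
""")

# The same recursive CTE is prepended to both queries below.
USER_REGIONS_CTE = """
    WITH RECURSIVE user_regions(users_id, region_id, parent_region_id) AS (
        SELECT u.id, u.region_id, rr.parent_region_id
        FROM users u
        LEFT JOIN region_relation rr ON u.region_id = rr.region_id
        UNION ALL
        SELECT ur.users_id, rr.region_id, rr.parent_region_id
        FROM user_regions ur
        JOIN region_relation rr ON ur.parent_region_id = rr.region_id
    )
"""

# Plain join: one row per matching region, so the pair appears twice.
joined = conn.execute(USER_REGIONS_CTE + """
    SELECT pr.part_id, ur.users_id
    FROM part_region pr
    JOIN user_regions ur ON pr.region_id = ur.region_id
""").fetchall()

# EXISTS: each (part, user) pair is tested once, however many regions match.
deduplicated = conn.execute(USER_REGIONS_CTE + """
    SELECT p.id, u.id
    FROM part p
    CROSS JOIN users u
    WHERE EXISTS (
        SELECT 1
        FROM part_region pr
        JOIN user_regions ur ON pr.region_id = ur.region_id
        WHERE pr.part_id = p.id
          AND ur.users_id = u.id
    )
""").fetchall()

print(joined)
print(deduplicated)
```

The plain join returns the (123, 100) pair twice, once for each matching region, and the EXISTS version returns it once. The cost is the cross join of part and users, so on large tables it would be worth comparing against SELECT DISTINCT over the plain join.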
