SQL join with same if exists, if not, any - sql-server

I need to update a column with the prices of products. A product can be identical to the one in the order, or only similar to it.
When the product is identical, it's easy.
But when the product is similar, not all of its characteristics match the order table, and I don't know how to do the match.
So far, I've written a query like this:
Update #SelledProducts
Set S.Price=O.Price
FROM
#SelledProducts S, #OrdersWithPrice O
WHERE S.MandatoryCharacteristic1=O.MandatoryCharacteristic1
AND S.MandatoryCharacteristic2=O.MandatoryCharacteristic2
AND S.MandatoryCharacteristic3=O.MandatoryCharacteristic3
--The following is wrong:
...
AND S.OptionalCharacteristic1=O.OptionalCharacteristic1
AND S.OptionalCharacteristic2=O.OptionalCharacteristic2
But of course it's not working when the OptionalCharacteristics are not equal. With optional characteristic, I mean:
The order can have a red box, but if there are no red boxes, there can be any color boxes for the same order.
How can I achieve this? I'm using SQL Server 2008.
Thanks

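You can use an OR condition instead of AND for the optional characteristics, so that a row matches when at least one of them is equal. Note that this still requires one optional characteristic to match; it does not fall back to "any" order when none of them do.
Here is an example: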
Update #SelledProducts
Set S.Price=O.Price
FROM
#SelledProducts S, #OrdersWithPrice O
WHERE S.MandatoryCharacteristic1=O.MandatoryCharacteristic1
AND S.MandatoryCharacteristic2=O.MandatoryCharacteristic2
AND S.MandatoryCharacteristic3=O.MandatoryCharacteristic3
--Optional characteristics: at least one must match
...
AND ( S.OptionalCharacteristic1=O.OptionalCharacteristic1
OR S.OptionalCharacteristic2=O.OptionalCharacteristic2)

Related

MSSQL select query with prioritized OR

I need to build one MSSQL query that selects one row that is the best match.
Ideally, we have a match on street, zip code and house number.
Only if that does not deliver any results, a match on just street and zip code is sufficient
I have this query so far:
SELECT TOP 1 * FROM realestates
WHERE
(Address_Street = '[Street]'
AND Address_ZipCode = '1200'
AND Address_Number = '160')
OR
(Address_Street = '[Street]'
AND Address_ZipCode = '1200')
MSSQL currently gives me the result where the Address_Number is NOT 160, so it seems like the 2nd clause (where only street and zipcode have to match) is taking precedence over the 1st. If I switch around the two OR clauses, same result :)
How could I prioritize the first OR clause, so that MSSQL stops looking for other results if we found a match where the three fields are present?
The problem here isn't the WHERE (though it is a "problem"), it's the lack of an ORDER BY. You have a TOP (1), but you have nothing that tells the data engine which row is the "top" row, so an arbitrary row is returned. You need to provide logic in the ORDER BY to tell the data engine which is the "first" row. With the rudimentary logic you have in your question, this would likely be:
SELECT TOP (1)
{Explicit Column List}
FROM realestates
WHERE Address_Street = '[Street]'
AND Address_ZipCode = '1200'
ORDER BY CASE Address_Number WHEN '160' THEN 1 ELSE 2 END;
You can't prioritize anything in the WHERE clause. It always results in ALL the matching rows. What you can do is use TOP or FETCH to limit how many results you will see.
However, in order for this to be effective, you MUST have an ORDER BY clause. SQL tables are unordered sets by definition. This means without an ORDER BY clause the database is free to return rows in any order it finds convenient. Mostly this will be the order of the primary key, but there are plenty of things that can change this.
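The effect of the ORDER BY can be checked on a small data set. The following is a rough sketch that uses SQLite via Python's sqlite3 module, so LIMIT 1 stands in for TOP (1); the Id column, the sample rows, and the best_match helper are invented for illustration, and Id is added as a secondary sort key only to make the fallback deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE realestates (
    Id INTEGER PRIMARY KEY,   -- hypothetical key for the example
    Address_Street TEXT,
    Address_ZipCode TEXT,
    Address_Number TEXT
);
INSERT INTO realestates VALUES
    (1, 'Main Street', '1200', '12'),
    (2, 'Main Street', '1200', '160'),
    (3, 'Other Street', '1200', '160');
""")

def best_match(street, zip_code, number):
    """Return the best row: exact house number first, else any row on the street."""
    return conn.execute(
        """
        SELECT Id, Address_Number
        FROM realestates
        WHERE Address_Street = ? AND Address_ZipCode = ?
        ORDER BY CASE WHEN Address_Number = ? THEN 1 ELSE 2 END, Id
        LIMIT 1
        """,
        (street, zip_code, number),
    ).fetchone()

exact = best_match("Main Street", "1200", "160")
fallback = best_match("Main Street", "1200", "999")
missing = best_match("Nowhere Lane", "1200", "160")
print(exact, fallback, missing)
```

The WHERE clause keeps every row on the street and zip code; only the ORDER BY decides that the row with the requested house number wins when it exists.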

Groupby and count() with alias and 'normal' dataframe: python pandas versus mssql

Coming from a SQL environment, I am learning some things in Python Pandas. I have a question regarding grouping and aggregates.
Say I group a dataset by Age Category and count the different categories. In MSSQL I would write this:
SELECT AgeCategory, COUNT(*) AS Cnt
FROM TableA
GROUP BY AgeCategory
ORDER BY 1
The result set is a 'normal' table with two columns; the second column I named Cnt.
When I want to do the equivalent in Pandas, the groupby object is different in format. So now I have to reset the index and rename the column in a following line. My code would look like this:
grouped = df.groupby('AgeCategory')['ColA'].count().reset_index()
grouped.columns = ['AgeCategory', 'Count']
grouped
My question is if this can be accomplished in one go. Seems like I am over-doing it, but I lack experience.
Thanks for any advice.
Regards, M.
Use the name parameter of Series.reset_index:
grouped = df.groupby('AgeCategory')['ColA'].count().reset_index(name='Count')
Or:
grouped = df.groupby('AgeCategory').size().reset_index(name='Count')
The difference is that GroupBy.count excludes missing values, while GroupBy.size does not.
More information about aggregation in pandas.
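A small example may make the count/size difference concrete. The data below is made up, and the third variant (named aggregation with as_index=False) is an additional option that is not part of the answer above; it is shown only as another way to get a flat, named result in one expression.

```python
import pandas as pd

# Hypothetical data: ColA contains missing values in both categories.
df = pd.DataFrame({
    "AgeCategory": ["18-30", "18-30", "31-50", "31-50", "31-50"],
    "ColA": [1.0, None, 2.0, 3.0, None],
})

# count() ignores missing values in ColA.
counted = df.groupby("AgeCategory")["ColA"].count().reset_index(name="Count")

# size() counts rows per group, missing values included (like COUNT(*) in SQL).
sized = df.groupby("AgeCategory").size().reset_index(name="Count")

# Named aggregation: the output column name is given directly in agg().
named = df.groupby("AgeCategory", as_index=False).agg(Count=("ColA", "count"))

print(counted)
print(sized)
print(named)
```

Since the SQL query uses COUNT(*), size() is the closer equivalent when the column may contain missing values.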

Using expressions for a value in Parameters

I have a report that returns various products depending on which product group you select. Most of these products all have similar product codes that allow me to use the LIKE operator to get the required results. However, for one particular product group, I have the following problem:
VSAMPLES
VSAMPLES2016
VSAMPLES2016DD
VSAMPLESADD
VSAMPLESET
VSAMPLESLARGE
VSAMPLESLARGEADD
VSAMPLESNEW
I only need the top two products to be listed. But using 'VSAMPLES%' as a parameter value will return all of these products.
Can I write an expression for the parameter value that will use 'VSAMPLES%' and 'VSAMPLES2016%' to only return these two products?
EDIT
The query is:
SELECT STRC_CODE, STRC_DESC FROM DeFactoUser.F_ST_Products
WHERE STRC_CODE LIKE @ProductCode
I am using LIKE so I don't have to specify dozens of products for each group.
For one parameter value I am using 'PA.A%'. This works perfectly because every product starting with PA.A is needed. With VSAMPLES this isn't the case.
Parameter Values are as follows:
So, can I not add a value to the Aspire tab that will return only those two products?
OK, what I was asking might not have been possible. I fixed the issue by altering my query.
SELECT STRC_CODE, STRC_STATUS, STRC_DESC FROM DeFactoUser.F_ST_Products
WHERE STRC_CODE LIKE @ProductCode AND STRC_CODE NOT IN ('VSAMPLES2016DD',
'VSAMPLESADD', 'VSAMPLESET', 'VSAMPLESLARGE', 'VSAMPLESLARGEADD',
'VSAMPLESNEW')
This results in only the two products I needed being returned when I use VSAMPLES% as a value.
Much simpler than I thought.
Thanks for the input into the question I asked.
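To see how the LIKE pattern and the NOT IN list interact, here is a minimal sketch that runs the same filter against SQLite from Python. It is only an approximation of the report query: the table is reduced to the STRC_CODE column, the schema prefix is dropped, 'PA.A100' is an invented code for the other product group, and products_for is a hypothetical helper that plays the role of the report parameter.

```python
import sqlite3

codes = [
    "VSAMPLES", "VSAMPLES2016", "VSAMPLES2016DD", "VSAMPLESADD",
    "VSAMPLESET", "VSAMPLESLARGE", "VSAMPLESLARGEADD", "VSAMPLESNEW",
    "PA.A100",  # invented code for the PA.A group
]
# Codes that match VSAMPLES% but must never be listed.
excluded = [
    "VSAMPLES2016DD", "VSAMPLESADD", "VSAMPLESET",
    "VSAMPLESLARGE", "VSAMPLESLARGEADD", "VSAMPLESNEW",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE F_ST_Products (STRC_CODE TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO F_ST_Products VALUES (?)", [(c,) for c in codes])

def products_for(pattern):
    """Return the codes matching the LIKE pattern, minus the excluded ones."""
    placeholders = ", ".join("?" for _ in excluded)
    sql = (
        "SELECT STRC_CODE FROM F_ST_Products "
        f"WHERE STRC_CODE LIKE ? AND STRC_CODE NOT IN ({placeholders}) "
        "ORDER BY STRC_CODE"
    )
    return [row[0] for row in conn.execute(sql, [pattern, *excluded])]

vsamples = products_for("VSAMPLES%")
pa_group = products_for("PA.A%")
print(vsamples, pa_group)
```

Because the exclusion list only names VSAMPLES codes, the other parameter values keep working unchanged.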

Relational database structure design advice

This is a textual description of data for which I need to create a database design (using SQLite) for an application.
The application needs to keep a record of operations. Each operation has a Name and a list of parameters. Each parameter has a Name and a Value. However, the values of the parameters will change over the lifetime of the app (in fact, the user will be able to change them using the GUI), and we want to keep a history of the values a given parameter has had. Furthermore, each operation can have multiple parameter sets. A parameter set is like an envelope which encompasses a set of parameter values (which all belong to the same operation) and gives this envelope a unique Number and a non-unique Description.
This is what I have so far:
[Database model image][1]
The database model should allow me to perform these actions on the database data:
1) Show a list of operations - I know how to do this.
2) Show a list of parameters for a given operation - I know how to do this.
3) For a given operation, show all its parameters as columns and show the values of the parameters as rows - each row represents a different parameter value from the history of values. I'm stuck at this one.
4) For a given operation, show a list of all parameter sets which belong to that operation. I'm stuck at this one too.
5) For a given operation and for a given parameter set, get the latest values of its parameters. Stuck at this.
I'm not sure if I should re-work my database model or if I should look for proper SQL statements to accomplish the tasks above with the model that I have. Any help is greatly appreciated. Thank you.
EDIT 1
I have re-worked my database model following helpful advice from @Marek Herman. Thanks to that I am able to accomplish tasks 1) 2) 4).
Now I'm trying to accomplish 5) which should not be that difficult with the current database model. I have this SQL statement:
SELECT Parameter.ParameterIdentifier, ParameterValue.ParameterValue,
ParameterValueVersion.VersionNumber, ParameterValueVersion.ChangedOn
FROM ParameterValueVersion INNER JOIN
(((Operation INNER JOIN Parameter ON Operation.OperationPLC_ID = Parameter.OperationPLC_ID)
INNER JOIN ParameterSet ON Operation.OperationPLC_ID = ParameterSet.OperationPLC_ID)
INNER JOIN ParameterValue ON (ParameterSet.ID = ParameterValue.ParameterSetID) AND
(Parameter.ID = ParameterValue.ParameterID)) ON ParameterValueVersion.ID = ParameterValue.ParameterValueVersionID
WHERE (Operation.OperationPLC_ID=[opID] AND
ParameterSet.ParameterSetNumber=[parSetNum]);
where [opID] and [parSetNum] are the input parameters. This SQL statement actually only joins all these tables together on their PK->FK relationship: Operation, Parameter, ParameterSet, ParameterValue, ParameterValueVersion and filters the rows by specified OperationPLC_ID and ParameterSetNumber.
Here is an example of an output of this SQL statement. Each row shows a name of a parameter, its value, a version number of the value and date of change of that value. Some parameters only have one value (only one version -e.g., "OFFSET"). Some parameters have two values. For example "PREFILLING" has a value of "3" which was input on Oct 20, 2016 (and has a version number 1) and it also has a value of "3.5" which was input on Oct 21, 2016 and has a version number of 2. So I'd like to show only the latest versions of the values of the parameters. Any advice how to modify the SQL statement is much appreciated. Thank you.
EDIT 2
I guess I figured out how to perform 5). I had to study a bit how GROUP BY works. This did the trick:
SELECT Parameter.ParameterIdentifier, last(ParameterValue.ParameterValue) AS ParameterValue, last(ParameterValueVersion.ChangedOn) AS ChangedOn, max(ParameterValueVersion.VersionNumber) AS VersionNumber
FROM ParameterValueVersion INNER JOIN
(((Operation INNER JOIN Parameter ON Operation.OperationPLC_ID = Parameter.OperationPLC_ID)
INNER JOIN ParameterSet ON Operation.OperationPLC_ID = ParameterSet.OperationPLC_ID)
INNER JOIN ParameterValue ON (ParameterSet.ID = ParameterValue.ParameterSetID) AND
(Parameter.ID = ParameterValue.ParameterID)) ON ParameterValueVersion.ID = ParameterValue.ParameterValueVersionID
WHERE (((Operation.OperationPLC_ID)=[opID]) AND ((ParameterSet.ParameterSetNumber)=[parSetNum]))
GROUP BY Parameter.ParameterIdentifier
ORDER BY Parameter.ParameterIdentifier
Now I still need to figure out how to perform task no. 3. I'm gonna study the suggested COALESCE function. Thank you.
0) I would connect ParameterSet to Operation and Parameter and not to ParameterValue.
1) okay!
2) okay!
3) I think you can use the COALESCE() function to display the columns and then it should be possible to show all parameters with matching OperationID
4) you can do that if you do point #0
5) same as above I think
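For task 5 (latest value per parameter), a common pattern is to join the history to a subquery that holds the highest version number per parameter. The sketch below illustrates that pattern in SQLite through Python. It is not the model from the question: ParameterHistory is a hypothetical, flattened table standing in for the joined result, and the OFFSET value and date are invented, since the question only gives the PREFILLING values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened history, one row per parameter value version.
CREATE TABLE ParameterHistory (
    ParameterIdentifier TEXT,
    ParameterValue TEXT,
    VersionNumber INTEGER,
    ChangedOn TEXT
);
INSERT INTO ParameterHistory VALUES
    ('OFFSET',     '0',   1, '2016-10-20'),   -- invented value and date
    ('PREFILLING', '3',   1, '2016-10-20'),
    ('PREFILLING', '3.5', 2, '2016-10-21');
""")

# Keep only the row whose version equals the maximum version of its parameter.
latest = conn.execute("""
SELECT v.ParameterIdentifier, v.ParameterValue, v.VersionNumber, v.ChangedOn
FROM ParameterHistory AS v
INNER JOIN (
    SELECT ParameterIdentifier, MAX(VersionNumber) AS MaxVersion
    FROM ParameterHistory
    GROUP BY ParameterIdentifier
) AS newest
    ON newest.ParameterIdentifier = v.ParameterIdentifier
   AND newest.MaxVersion = v.VersionNumber
ORDER BY v.ParameterIdentifier
""").fetchall()

for row in latest:
    print(row)
```

Unlike picking the value, date, and version with separate aggregates, this keeps all three columns from the same history row.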

Determining Difference Between Items On-Hand and Items Required per Project in Access 2003

I'm usually a PHP programmer, but I'm currently working on a project in MS Access 2003 and I'm a complete VBA newbie. I'm trying to do something that I could easily do in PHP but I have no idea how to do it in Access. The facts are as follows:
Tables and relevant fields:
tblItems: item_id, on_hand
tblProjects: project_id
tblProjectItems: project_id, item_id
Goal: Determine which projects I could potentially do, given the items on-hand.
I need to find a way to compare each project's required items against the items on-hand to determine if there are any items missing. If not, add the project to the list of potential projects. In PHP I would compare an array of on-hand items with an array of project items required, using the array_diff function; if no difference, add project_id to an array of potential projects.
For example, if...
$arrItemsOnHand = 1,3,4,5,6,8,10,11,15
$arrProjects[1] = 1,10
$arrProjects[2] = 8,9,12
$arrProjects[3] = 7,13
$arrProjects[4] = 1,3
$arrProjects[5] = 2,14
$arrProjects[6] = 2,5,8,10,11,15
$arrProjects[7] = 2,4,5,6,8,10,11,15
...the result should be:
$arrPotentialProjects = 1,4
Is there any way to do this in Access?
Consider a single query to reach your goal: "Determine which projects I could potentially do, given the items on-hand."
SELECT
pi.project_id,
Count(pi.item_id) AS NumberOfItems,
Sum(IIf(i.on_hand='yes', 1, 0)) AS NumberOnHand
FROM
tblProjectItems AS pi
INNER JOIN tblItems AS i
ON pi.item_id = i.item_id
GROUP BY pi.project_id
HAVING Count(pi.item_id) = Sum(IIf(i.on_hand='yes', 1, 0));
That query computes the number of required items for each project and the number of those items which are on hand.
When those two numbers don't match, that means at least one of the required items for that project is not on hand.
So the HAVING clause excludes those rows from the query result set, leaving only rows where the two numbers match --- those are the projects for which all required items are on hand.
I realize my description was not great. (Sorry.) I think it should make more sense if you run the query both with and without the HAVING clause ... and then read the description again.
Anyhow, if that query gives you what you need, I don't think you need VBA array handling for this. And if you can use that query as your form's RecordSource or as the RowSource for a list or combo box, you may not need VBA at all.
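The query can be tried against the example data from the question. The sketch below runs it in SQLite from Python rather than in Access, so IIf is written as a CASE expression. It assumes that every item has a row in tblItems and that on_hand holds 'yes' or 'no', which is how the query above treats the column.

```python
import sqlite3

# Example data from the question.
items_on_hand = {1, 3, 4, 5, 6, 8, 10, 11, 15}
projects = {
    1: [1, 10],
    2: [8, 9, 12],
    3: [7, 13],
    4: [1, 3],
    5: [2, 14],
    6: [2, 5, 8, 10, 11, 15],
    7: [2, 4, 5, 6, 8, 10, 11, 15],
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblItems (item_id INTEGER PRIMARY KEY, on_hand TEXT);
CREATE TABLE tblProjectItems (project_id INTEGER, item_id INTEGER);
""")
# Assumption: items 1..15 all exist, flagged 'yes' or 'no'.
conn.executemany(
    "INSERT INTO tblItems VALUES (?, ?)",
    [(i, "yes" if i in items_on_hand else "no") for i in range(1, 16)],
)
conn.executemany(
    "INSERT INTO tblProjectItems VALUES (?, ?)",
    [(pid, item) for pid, required in projects.items() for item in required],
)

# A project qualifies when its required-item count equals its on-hand count.
potential = [row[0] for row in conn.execute("""
SELECT pi.project_id
FROM tblProjectItems AS pi
INNER JOIN tblItems AS i ON pi.item_id = i.item_id
GROUP BY pi.project_id
HAVING COUNT(pi.item_id) = SUM(CASE WHEN i.on_hand = 'yes' THEN 1 ELSE 0 END)
ORDER BY pi.project_id
""")]
print(potential)
```

With this data the query returns projects 1 and 4, matching the expected $arrPotentialProjects in the question.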
