SQL Server Full traceability of a product - sql-server

I'm creating a program that manages a manufacturing plant, and I need to show each product's traceability (all the paths it took from creation until its final delivery).
Let me show you an example:
The plant creates the document A001, with quantity 400.
Then, they need to split the product, creating documents B002 and B003, both with quantity 200 and both with their Parent field value of A001.
After that, they'll split B002 into smaller pieces. That creates documents C004, C005, C006 and C007, all with quantity 50 and all with the Parent field value B002. These smaller pieces can also be split again...
Now, if I wanted to trace the full cycle of document B002, I'd check the Parent field and cross it with the document field to get that info, and then get the documents where the Parent field is B002. That's the "easy" part.
Now the tricky part.
I want to know the full cycle of document C007. I'd have to check its parent and get the B002 document, THEN get that document's Parent and get the A001 document. I'd also check for documents with Parent C007 and find none.
Or know the full cycle of document A001. I'd check if there was any Parent (there won't be), then I'd have to get all the documents with Parent A001, then get all documents with Parent B002 and B003 and so on.
Is there any function in SQL that lets me do this, or do I have to create a procedure that calls itself over and over to check for both parents and children? And if so, I have no idea what to do, so any help would be appreciated.

Basically you are asking for something simple that has been done thousands of times: traversing a tree (finding the root and walking the branches).
There are various approaches to that, among other things a special data type (HierarchyId) that supports that right in SQL Server.
https://msdn.microsoft.com/en-us/library/bb677290.aspx
is the documentation for this.
That said, you will likely use a normal field as the ID, and then the usual approach is a stored procedure or a recursive query.
http://vyaskn.tripod.com/hierarchies_in_sql_server_databases.htm
has some thoughts about it, and Google turns up plenty more (there are various approaches to querying hierarchies).
http://blog.sqlauthority.com/2012/04/24/sql-server-introduction-to-hierarchical-query-using-a-recursive-cte-a-primer/
is from quite a reputable source and uses a recursive CTE like this:
WITH MyCTE
AS ( -- anchor member: rows with no manager (the roots)
SELECT EmpID, FirstName, LastName, ManagerID
FROM Employee
WHERE ManagerID IS NULL
UNION ALL
-- recursive member: employees whose manager is already in the result
SELECT e.EmpID, e.FirstName, e.LastName, e.ManagerID
FROM Employee e
INNER JOIN MyCTE ON e.ManagerID = MyCTE.EmpID
WHERE e.ManagerID IS NOT NULL )
SELECT *
FROM MyCTE
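Applied to the documents from the question, the same idea can be sketched with two recursive CTEs, one walking up the Parent chain and one walking down to the children. This is only a sketch under assumptions: the table name documents and its columns doc, parent and quantity are made up for illustration, and it runs on SQLite through Python's sqlite3 module so it can be tried without a server. In SQL Server the CTE syntax is the same except that the RECURSIVE keyword is left out.

```python
import sqlite3

# Hypothetical table and column names; the data is the example from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (doc TEXT PRIMARY KEY, parent TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [
        ("A001", None, 400),
        ("B002", "A001", 200),
        ("B003", "A001", 200),
        ("C004", "B002", 50),
        ("C005", "B002", 50),
        ("C006", "B002", 50),
        ("C007", "B002", 50),
    ],
)

TRACE_SQL = """
WITH RECURSIVE
ancestors(doc, parent) AS (
    -- start at the requested document, then climb via the Parent field
    SELECT doc, parent FROM documents WHERE doc = :start
    UNION ALL
    SELECT d.doc, d.parent
    FROM documents d
    JOIN ancestors a ON d.doc = a.parent
),
descendants(doc, parent) AS (
    -- start at the requested document, then collect everything split from it
    SELECT doc, parent FROM documents WHERE doc = :start
    UNION ALL
    SELECT d.doc, d.parent
    FROM documents d
    JOIN descendants c ON d.parent = c.doc
)
SELECT doc FROM ancestors
UNION
SELECT doc FROM descendants
ORDER BY doc
"""


def trace(start):
    """Return every document on the path above and below `start`."""
    return [row[0] for row in conn.execute(TRACE_SQL, {"start": start})]


print(trace("C007"))
print(trace("B002"))
```

Tracing A001 with this sketch returns all seven documents, since every other document descends from it.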

Related

Oracle/mySQL database

I'm kinda new in databases and I need to solve this one:
client(NAME, CITY, STREET, NUMBER, BALANCE, CLIENT_CODE);
order(ORDER_NUMBER, CLIENT_CODE, PROVIDER_NAME, FUEL, QTY, DATE);
provider(PROVIDER_NAME, PROVIDER_ADRESS, FUEL_CODE, PRICE, STOCK);
fuel(FUEL_CODE, FUEL_NAME);
I've already tried (and succeeded, I guess) to create a bridge table named provider_fuel between provider and fuel in order to solve the many-to-many relationship, but I don't have any idea how I could create a connection between order and provider.
I've changed the entities like this:
provider(PROVIDER_ID, PROVIDER_ADRESS);
fuel(FUEL_CODE, FUEL_NAME);
provider_fuel(ID_ENTRY, ID_PROVIDER, ID_FUEL, PRICE, STOCK).
Is that even ok? If so, how can I make connections between all those entities in my database?
I have to mention that my app should allow clients to place orders from providers who have their specific fuels, prices and so on.
I would suggest the following table layouts:
address(ID_ADDRESS, CITY, STREET, NUMBER);
client(ID_CLIENT, NAME, ID_ADDRESS);
client_order(ID_CLIENT_ORDER, ID_CLIENT, RECEIVED_DATE);
client_order_detail(ID_CLIENT_ORDER_DETAIL, ID_CLIENT_ORDER, ID_PROVIDER, ID_PRODUCT,
ORDER_QTY, STATUS, DELIVERED_DATE); -- Status in ('OPEN', 'DELIVERED')
provider(ID_PROVIDER, NAME, ID_ADDRESS);
product(ID_PRODUCT, PRODUCT_NAME);
provider_product(ID_PROVIDER, ID_PRODUCT, PRICE, STOCK_QTY);
You can, if you like, expand this. For example, a provider might have multiple locations from which product can be supplied, in which case you'd need to re-work the single PROVIDER table into something like
PROVIDER(ID_PROVIDER, NAME)
PROVIDER_PRODUCT(ID_PROVIDER, ID_PRODUCT, PRICE)
PROVIDER_ADDRESS(ID_PROVIDER_ADDRESS, ID_PROVIDER, ID_ADDRESS)
and then rework PROVIDER_PRODUCT as
PROVIDER_ADDRESS_PRODUCT(ID_PROVIDER_ADDRESS, ID_PRODUCT, STOCK_QTY)
I suppose it's also possible that the price a provider charges might depend on the location from which it's shipped so you might need to change the model to accommodate that. The point is that there are many different ways to do this, and it depends very much on the requirements you have.
EDIT
To obtain the total value of an order using the tables above you'd use a query such as:
SELECT co.ID_CLIENT_ORDER, SUM(cod.ORDER_QTY * pp.PRICE) AS ORDER_TOTAL_VALUE
FROM CLIENT_ORDER co
INNER JOIN CLIENT_ORDER_DETAIL cod
ON cod.ID_CLIENT_ORDER = co.ID_CLIENT_ORDER
INNER JOIN PROVIDER_PRODUCT pp
ON pp.ID_PROVIDER = cod.ID_PROVIDER AND
pp.ID_PRODUCT = cod.ID_PRODUCT
GROUP BY co.ID_CLIENT_ORDER
ORDER BY co.ID_CLIENT_ORDER
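To try the order-total query out, here is a small sketch that builds just the three tables it touches and runs it. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle or MySQL, and the column types and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE provider_product (
        id_provider INTEGER,
        id_product  INTEGER,
        price       REAL,
        stock_qty   INTEGER,
        PRIMARY KEY (id_provider, id_product)
    );
    CREATE TABLE client_order (
        id_client_order INTEGER PRIMARY KEY,
        id_client       INTEGER,
        received_date   TEXT
    );
    CREATE TABLE client_order_detail (
        id_client_order_detail INTEGER PRIMARY KEY,
        id_client_order INTEGER REFERENCES client_order (id_client_order),
        id_provider     INTEGER,
        id_product      INTEGER,
        order_qty       INTEGER,
        status          TEXT CHECK (status IN ('OPEN', 'DELIVERED')),
        delivered_date  TEXT,
        FOREIGN KEY (id_provider, id_product)
            REFERENCES provider_product (id_provider, id_product)
    );
    -- sample data (made up)
    INSERT INTO provider_product VALUES (1, 10, 2.0, 500), (1, 11, 3.0, 200);
    INSERT INTO client_order VALUES (1, 100, '2024-01-05'), (2, 101, '2024-01-06');
    INSERT INTO client_order_detail VALUES
        (1, 1, 1, 10, 10, 'OPEN', NULL),
        (2, 1, 1, 11, 5,  'OPEN', NULL),
        (3, 2, 1, 11, 4,  'DELIVERED', '2024-01-08');
    """
)

totals = conn.execute(
    """
    SELECT co.id_client_order, SUM(cod.order_qty * pp.price) AS order_total_value
    FROM client_order co
    INNER JOIN client_order_detail cod
        ON cod.id_client_order = co.id_client_order
    INNER JOIN provider_product pp
        ON pp.id_provider = cod.id_provider
       AND pp.id_product = cod.id_product
    GROUP BY co.id_client_order
    ORDER BY co.id_client_order
    """
).fetchall()

print(totals)
```

Note that the total is computed from the provider's current price, so it changes whenever the price in provider_product is updated.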
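You can put id_provider in the order table and get the provider details by that id. Note that ORDER is a reserved word, so the table name has to be quoted or renamed (orders below). For example:
SELECT o.*, p.price, p.fuel_code FROM orders o JOIN provider p ON o.id_provider = p.id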
Welcome to the Oracle database world!
You can make connections between tables or views using a JOIN.
In your example, you can connect your tables like this (ORDER is a reserved word, so the table is called orders here):
SELECT od.* FROM orders od, provider pd WHERE od.provider_name=pd.provider_name
Or you can write it like this. It's up to your style. But I prefer the first one.
SELECT od.* FROM orders od INNER JOIN provider pd ON od.provider_name=pd.provider_name
It's generally better to join on NUMBER columns than on names, so I suggest creating a provider_id column (NUMBER type) in both tables and joining on that.
There are five main types of joins in SQL (INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER and CROSS). Each of them returns a different set of rows.
If I understand correctly your database represents a market place where a client can buy different kinds of fuels from providers. So far you have used natural keys:
A client has some login code that identifies them (client_code).
A provider is identified by their name. Do you want this? That would mean a provider's name stays constant and cannot change. That may or may not be desired. If the company JACOB'S FUELSHOP INC changes to THE FUELSHOP INC, do you want to treat it as the same company or a different one?
You also generate order numbers. Do you want them unique in your database, or per client or per provider? The order table's key would accordingly be either just order_number or order_number + client_code or order_number + provider_name.
Now you don't want to have one provider only sell one type of fuel, but various types. You introduce a bridge table. But in the same step you introduce technical IDs. Do you want to use technical IDs instead of natural keys?
You have provider(PROVIDER_ID, PROVIDER_ADRESS) where the provider's name is either missing or included in the address. This allows you to change the name later. But remember that you may then want to store the name with every order, as you may want to reprint it later or just say whether you placed the order with the legal company JACOB'S FUELSHOP INC or THE FUELSHOP INC. Technical IDs are fine, but you must remember to still keep unique constraints on the natural keys and think about consequences such as the one just mentioned.
As to the fuel price: You are storing the current fuel price, but you are not storing the price when ordered. Add the price to the orders table.
As to your question: Order and provider are already linked by provider_name in the original design. In the new design you've introduced the provider_id. If you want to use this instead, put provider_id in the orders table. As mentioned, you may or may not want the provider name in that table, too.
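The point about storing the price at order time can be sketched like this. The snippet copies the provider's current price into the order row when the order is placed, so a later price change does not alter old orders. It runs on SQLite through Python's sqlite3 module; the table name orders, the unit_price column and the place_order helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE provider_fuel (
        id_provider INTEGER,
        id_fuel     INTEGER,
        price       REAL,
        stock       INTEGER,
        PRIMARY KEY (id_provider, id_fuel)
    );
    -- ORDER is a reserved word, hence "orders"
    CREATE TABLE orders (
        order_number INTEGER PRIMARY KEY,
        client_code  TEXT,
        id_provider  INTEGER,
        id_fuel      INTEGER,
        qty          INTEGER,
        unit_price   REAL,  -- price at the time of ordering
        FOREIGN KEY (id_provider, id_fuel)
            REFERENCES provider_fuel (id_provider, id_fuel)
    );
    INSERT INTO provider_fuel VALUES (1, 10, 1.5, 1000);
    """
)


def place_order(client_code, id_provider, id_fuel, qty):
    """Insert an order, snapshotting the provider's current price."""
    conn.execute(
        """
        INSERT INTO orders (client_code, id_provider, id_fuel, qty, unit_price)
        SELECT ?, id_provider, id_fuel, ?, price
        FROM provider_fuel
        WHERE id_provider = ? AND id_fuel = ?
        """,
        (client_code, qty, id_provider, id_fuel),
    )


place_order("C1", 1, 10, 100)

# The provider raises the price afterwards.
conn.execute(
    "UPDATE provider_fuel SET price = 2.0 WHERE id_provider = 1 AND id_fuel = 10"
)

order_total = conn.execute(
    "SELECT qty * unit_price FROM orders WHERE client_code = 'C1'"
).fetchone()[0]
print(order_total)  # still based on the price of 1.5 at order time
```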

parent-child and child-parent relationships in one query SOQL

Is it possible to write a query to access parent-child and child-parent objects, in one SOQL query?
I have a scenario where, I need to access Account Objects from child and child of Account too, in the same query.
Example:
Select Id,
(Select Id,Name, (Select Id, Address from Addresses__r) from x__r.Account),
x__r.Account.Name
From x
(Pardon me, if I use any wrong terms. I am pretty new to Salesforce)
Yes, it is possible to do so.
In the example below, we are using sub-query to retrieve Ids of all Account Territories (parent-to-child relationship) and, at the same time, we are using child-to-parent relationship to retrieve TimeZoneSidKey of the User who has created the record.
SELECT Id, (SELECT Id FROM Accounts_Territories__r), CreatedBy.TimeZoneSidKey FROM Account
Documentation on the topic

What is the best way to design a database to store record with a lot of values?

I want to design a database for events and track a lot of statistics about them.
Option 1
Create one table for Events and put all my statistic columns in it, like number of males, number of females, number of unidentified gender, temperature that day, time it started, any fights, whether the police were called, and so on.
The query would be a very simple select * from events
Option 2
Create two tables, one for Events and one for EventsAttributes. In the Events table I would store important stuff like id, event title, and start/end time.
In EventsAttributes I would store all the event statistics and link them back to Events with an eventId foreign key.
The query would look like below. (attributeType == 1 would represent number of males)
select e.*,
(select ev.value from EventAttributes ev where ev.eventId = e.id and attributeType = 1) as NumberOfMale
from Events e
The query would not be as straightforward as in option 1, but I want to design it the right way and live with the messy query.
So which option is the right way to do it, and why (I'm not a database admin, but curious).
Thank you for your time.
I prefer option 2 for designing the database.
With option 2, you apply the best practice of database normalization.
There are three main reasons to normalize a database:
The first is to minimize duplicate data.
The second is to minimize or avoid data modification issues
The third is to simplify queries.
For more details, read Designing a Normalized Database
You can create views (queries) based on this normalized database to support Option (1).
In this way, database will be ready for any future scaling.
Update:
You can use the PIVOT operator and a common table expression (CTE) to get eventAttributes1, eventAttributes2, ...
Suppose your tables are events and event_attributes, as described below:
events
----------
# event_id
event_title
start_date
end_date
event_attributes
-------------
#event_id
#att_type
att_value
# is primary key
-- using table expression (it's like a dynamic view)
with query as (
select e.event_id, e.event_title,a.att_type, a.att_value
from events e
join event_attributes a on e.event_id =a.event_id
)
select event_id , event_title,
[1] as eventAttributes1, -- one output column per attribute type: [1], [2], ...
[2] as eventAttributes2,
[3] as eventAttributes3
FROM query
PIVOT(SUM(att_value) FOR att_type IN ([1],[2],[3])) as pvt
For details on PIVOT, read: Using PIVOT
For details on CTEs, read: Using Common Table Expressions
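PIVOT is specific to SQL Server. For trying the idea elsewhere, the same reshaping can be done with conditional aggregation (one SUM(CASE ...) per attribute type). The sketch below does that on SQLite through Python's sqlite3 module. The sample rows are made up, and treating att_type 2 as the number of females is an assumption; the question only defines type 1 as the number of males.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (
        event_id    INTEGER PRIMARY KEY,
        event_title TEXT
    );
    CREATE TABLE event_attributes (
        event_id  INTEGER REFERENCES events (event_id),
        att_type  INTEGER,
        att_value INTEGER,
        PRIMARY KEY (event_id, att_type)
    );
    INSERT INTO events VALUES (1, 'Concert'), (2, 'Fair');
    -- att_type 1 = number of males, 2 = number of females (assumed)
    INSERT INTO event_attributes VALUES (1, 1, 120), (1, 2, 140), (2, 1, 30);
    """
)

rows = conn.execute(
    """
    SELECT e.event_id,
           e.event_title,
           SUM(CASE WHEN a.att_type = 1 THEN a.att_value END) AS number_of_male,
           SUM(CASE WHEN a.att_type = 2 THEN a.att_value END) AS number_of_female
    FROM events e
    LEFT JOIN event_attributes a ON a.event_id = e.event_id
    GROUP BY e.event_id, e.event_title
    ORDER BY e.event_id
    """
).fetchall()

for row in rows:
    print(row)
```

An attribute that was never recorded for an event comes back as NULL (None in Python) rather than 0, which keeps "not recorded" distinguishable from a real zero.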

In SQL Server, what is the most efficient way to compare records to other records for duplicates within a given range of values?

We have an SQL Server that gets daily imports of data files from clients. This data is interrelated and we are always scrubbing it and having to look for suspect duplicate records between these files.
Finding and tagging suspect records can get pretty complicated. We use logic that requires some field values to be the same, allows some field values to differ, and allows a range to be specified for how different certain field values can be. The only way we've found to do it is by using a cursor based process, and it places a heavy burden on the database.
So I wanted to ask if there's a more efficient way to do this. I've heard it said that there's almost always a more efficient way to replace cursors with clever JOINS. But I have to admit I'm having a lot of trouble with this one.
For a concrete example suppose we have 1 table, an "orders" table, with the following 6 fields.
(order_id, customer_id, product_id, quantity, sale_date, price)
We want to look through the records to find suspect duplicates on the following example criteria. These get increasingly harder.
Records that have the same product_id, sale_date, and quantity but different customer_id's should be marked as suspect duplicates for review
Records that have the same customer_id, product_id, quantity and have sale_dates within five days of each other should be marked as suspect duplicates for review
Records that have the same customer_id and product_id, but different quantities within 20 units, and sale dates within five days of each other should be considered suspect.
Is it possible to satisfy each one of these criteria with a single SQL Query that uses JOINS? Is this the most efficient way to do this?
If this gets much more involved, then you might be looking at a simple ETL process to do the heavy lifting for you: the load on the database should be manageable in the sense that you will be loading to your ETL environment, running transformations/checks/comparisons and then writing your results to perhaps a staging table that outputs the stats you need. It sounds like a lot of work, but once it is set up, tweaking it is no great pain.
On the other hand, if you are looking at comparing vast amounts of data, then that might entail significant network traffic.
I am thinking efficient will mean adding an index on the fields whose contents you are comparing. Not sure offhand if a mega-join is what you need, or whether it is enough to write the primary keys of the suspect records into a holding table and list the problems later. I.e. do you need to know why each record is suspect in the result set?
You could start with
-- Assuming some pkid (primary key) has been added
1.
select o.pkid, o.order_id, o.customer_id, o.product_id, o.quantity, o.sale_date
from orders o
join orders o2 on o.product_id=o2.product_id and o.sale_date=o2.sale_date
and o.quantity=o2.quantity and o.customer_id<>o2.customer_id
then keep joining up more copies of orders, I suppose
You can do this in a single Case statement. In the scenario below, the value for MarkedForReview will tell you which of your three tests (1, 2, or 3) triggered the review. Note that the second test has to be checked before the third, because a pair with equal quantities also falls within the 20-unit range.
With InputData As
(
Select O.order_id, O.product_id, O.sale_date, O.quantity, O.customer_id
, Case
-- Test 1: same date and quantity, different customer
When O2.customer_id <> O.customer_id
And O2.sale_date = O.sale_date
And O2.quantity = O.quantity Then 1
-- Tests 2 and 3 only apply to the same customer
When O2.customer_id <> O.customer_id Then 0
-- Test 2: same quantity, sale dates within five days
When O2.quantity = O.quantity
And Abs(DateDiff(d, O.sale_date, O2.sale_date)) <= 5 Then 2
-- Test 3: quantities within 20 units, sale dates within five days
When Abs(O.quantity - O2.quantity) <= 20
And Abs(DateDiff(d, O.sale_date, O2.sale_date)) <= 5 Then 3
Else 0
End As MarkedForReview
From Orders As O
Join Orders As O2
On O2.order_id <> O.order_id
And O2.product_id = O.product_id
)
Select order_id, product_id, sale_date, quantity, customer_id, MarkedForReview
From InputData
Where MarkedForReview <> 0
Btw, if you are using something prior to SQL Server 2005, you can achieve the equivalent query using a derived table. Also note that you can return the id of the complementary order that triggered the review. Both orders that trigger a review will obviously be returned.
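For anyone who wants to experiment with the three criteria from the question, here is a small self-contained sketch of a set-based self-join. It runs on SQLite through Python's sqlite3 module, so julianday() stands in for SQL Server's DATEDIFF; the sample rows are made up, and each suspect pair is reported once together with the order it matched and the rule that fired.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER,
        product_id  INTEGER,
        quantity    INTEGER,
        sale_date   TEXT,
        price       REAL
    );
    INSERT INTO orders VALUES
        (1, 100, 7, 10, '2024-01-01', 5.0),
        (2, 200, 7, 10, '2024-01-01', 5.0),  -- rule 1 with order 1
        (3, 300, 8, 50, '2024-02-01', 9.0),
        (4, 300, 8, 50, '2024-02-04', 9.0),  -- rule 2 with order 3
        (5, 400, 9, 30, '2024-03-01', 2.0),
        (6, 400, 9, 45, '2024-03-05', 2.0),  -- rule 3 with order 5
        (7, 500, 9, 30, '2024-06-01', 2.0);  -- matches nothing
    """
)

SUSPECTS_SQL = """
SELECT order_id, matched_order_id, rule_no
FROM (
    SELECT o.order_id,
           o2.order_id AS matched_order_id,
           CASE
               -- rule 1: same date and quantity, different customer
               WHEN o2.customer_id <> o.customer_id
                    AND o2.sale_date = o.sale_date
                    AND o2.quantity = o.quantity THEN 1
               -- rule 2: same customer and quantity, dates within 5 days
               WHEN o2.customer_id = o.customer_id
                    AND o2.quantity = o.quantity
                    AND ABS(julianday(o.sale_date) - julianday(o2.sale_date)) <= 5
                    THEN 2
               -- rule 3: same customer, quantities within 20, dates within 5 days
               WHEN o2.customer_id = o.customer_id
                    AND ABS(o.quantity - o2.quantity) <= 20
                    AND ABS(julianday(o.sale_date) - julianday(o2.sale_date)) <= 5
                    THEN 3
           END AS rule_no
    FROM orders o
    JOIN orders o2
        ON o2.product_id = o.product_id
       AND o2.order_id > o.order_id   -- report each pair only once
)
WHERE rule_no IS NOT NULL
ORDER BY order_id
"""

suspects = conn.execute(SUSPECTS_SQL).fetchall()
print(suspects)
```

Joining on product_id alone is what all three rules have in common, so an index on that column (plus sale_date) is a natural thing to try first; whether it beats the cursor on real data is something only a test against your own tables can tell.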

Activity list ala SO

We are building a set of features for our application. One of them is a list of recent user activities à la SO. I'm having a little problem finding the best way to design the table for these activities.
Currently we have an Activities table with the following columns
UserId (Id of the user the activity is for)
Type (Type of activity - i.e. PostedInForum, RepliedInForum, WroteOnWall - it's a tinyint with values taken from an enumerator in C#)
TargetObjectId (An id of the target of the activity. For PostedInForum this will be the Post ID, for WroteOnWall this will be the ID of the User whose wall was written on)
CreatedAtUtc (Creationdate)
My problem is that the TargetObjectId column doesn't feel right. It's a soft link: no foreign keys, and only knowledge of the Type tells you what this column really contains.
Do any of you have a suggestion for an alternate/better way of storing a list of user activities?
I should also mention that the site will be multilingual, so you should be able to see the activity list in a range of languages - that's why we haven't chosen for instance to just put the activity text/html in the table.
Thanks
You can put all content in a single table with a discriminator column and then just select top 20 ... from ... order by CreatedAtUtc desc.
Alternatively, if you store different types of content in different tables, you can try something like (not sure about exact syntax):
select top 20 * from (
select * from (select top 20 ID, CreatedAtUtc, 'PostedToForum' as ActivityType from ForumPosts order by CreatedAtUtc desc) f
union all
select * from (select top 20 ID, CreatedAtUtc, 'WroteOnWall' as ActivityType from WallPosts order by CreatedAtUtc desc) w) t
order by t.CreatedAtUtc desc
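Here is a runnable sketch of the second approach, using SQLite through Python's sqlite3 module, so LIMIT takes the place of SQL Server's TOP. The ForumPosts and WallPosts columns and the sample rows are assumptions. The query returns only an activity type key, not display text, so the text for each language can be looked up when the list is rendered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ForumPosts (ID INTEGER PRIMARY KEY, UserId INTEGER, CreatedAtUtc TEXT);
    CREATE TABLE WallPosts  (ID INTEGER PRIMARY KEY, UserId INTEGER, CreatedAtUtc TEXT);
    INSERT INTO ForumPosts VALUES
        (1, 1, '2024-01-01T10:00:00'),
        (2, 1, '2024-01-03T09:00:00');
    INSERT INTO WallPosts VALUES
        (1, 1, '2024-01-02T12:00:00');
    """
)

RECENT_SQL = """
SELECT ID, CreatedAtUtc, ActivityType
FROM (
    -- newest rows of each source table, tagged with a type key
    SELECT * FROM (SELECT ID, CreatedAtUtc, 'PostedInForum' AS ActivityType
                   FROM ForumPosts ORDER BY CreatedAtUtc DESC LIMIT :n)
    UNION ALL
    SELECT * FROM (SELECT ID, CreatedAtUtc, 'WroteOnWall' AS ActivityType
                   FROM WallPosts ORDER BY CreatedAtUtc DESC LIMIT :n)
)
ORDER BY CreatedAtUtc DESC
LIMIT :n
"""


def recent_activity(limit=20):
    """Return the newest activities across both tables, newest first."""
    return conn.execute(RECENT_SQL, {"n": limit}).fetchall()


for activity in recent_activity():
    print(activity)
```

Limiting each branch before the union keeps the merged set small: at most 20 rows per source table have to be sorted for the final top 20.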
You might want to check out http://activitystrea.ms/ for inspiration, especially the schema definition. If you look at that spec you'll see that there is also the concept of a "Target" object. I have recently done something very similar but I had to create my own database to encapsulate all of the activity data and feed data into it because I was collecting activity data from multiple applications with disparate data sources in different databases.
Max

Resources