SSIS: Lookup data and merge again


Somehow I have a feeling that this should be an easy one, but I cannot see how to do it in an SSIS package.
I have two tables: Order and CustomerMapping. The CustomerMapping table does not contain all my customers, only those whose Id and CustomerName have changed.
The definitions of the tables are:
Order:
Id (int)
CustomerId (int)
CustomerName (nvarchar(50))
CustomerMapping:
Id (int)
ObsoleteId (int)
CustomerName (nvarchar(50))
ObsoleteCustomerName (nvarchar(50))
Data in Order table is:
Id  CustomerId  CustomerName
1   100         Customer 1
2   101         Customer 2
3   102         Customer 3
Data in CustomerMapping:
Id  ObsoleteId  CustomerName    ObsoleteCustomerName
20  100         New Customer 1  Customer 1
21  101         New Customer 2  Customer 2
I want to use the components that SSIS provides to build what I would write like this in SQL:
SELECT o.Id,
       CustomerId = ISNULL(cm.Id, o.CustomerId),
       CustomerName = ISNULL(cm.CustomerName, o.CustomerName)
FROM [Order] AS o
LEFT JOIN CustomerMapping AS cm ON o.CustomerId = cm.ObsoleteId
The result of the above query is:
Id  CustomerId  CustomerName
1   20          New Customer 1
2   21          New Customer 2
3   102         Customer 3
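The expected result can be checked outside SQL Server with a small sketch. The following is only an illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in database; COALESCE takes the place of ISNULL, and the function name remap_orders is hypothetical.

```python
import sqlite3

def remap_orders():
    """Return the orders with obsolete customer ids and names replaced
    wherever CustomerMapping has a matching ObsoleteId."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE "Order" (Id INTEGER, CustomerId INTEGER, CustomerName TEXT);
        CREATE TABLE CustomerMapping (
            Id INTEGER, ObsoleteId INTEGER,
            CustomerName TEXT, ObsoleteCustomerName TEXT);
        INSERT INTO "Order" VALUES
            (1, 100, 'Customer 1'), (2, 101, 'Customer 2'), (3, 102, 'Customer 3');
        INSERT INTO CustomerMapping VALUES
            (20, 100, 'New Customer 1', 'Customer 1'),
            (21, 101, 'New Customer 2', 'Customer 2');
    """)
    # LEFT JOIN keeps every order; COALESCE falls back to the order's own
    # values when no mapping row was found (SQLite's counterpart of ISNULL).
    rows = con.execute("""
        SELECT o.Id,
               COALESCE(cm.Id, o.CustomerId) AS CustomerId,
               COALESCE(cm.CustomerName, o.CustomerName) AS CustomerName
        FROM "Order" AS o
        LEFT JOIN CustomerMapping AS cm ON o.CustomerId = cm.ObsoleteId
        ORDER BY o.Id
    """).fetchall()
    con.close()
    return rows
```

Calling remap_orders() returns one tuple per order, with the third order left unchanged because it has no mapping row.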
I now want to take this result set and save it into a new table.
I know that I can just take the above SQL and do what I want (which might be what I end up doing), but I believe I can do this with a Lookup, Merge and/or Merge Join. I must admit, though, that I cannot see how to do it without splitting the lookup output into two and then having to save from each new "thread".
The above is rather simplified. I have 4 columns which I have to compare, and then I do some other work on the whole lot before saving it into a new table, which is why I want to keep a single "thread".

Related

Generate String Value Table Automatically in EF Core

I have a table called Customer with columns such as National Code and Name. It also has attributes such as Contact Numbers and Recommenders. Because a customer can have more than one contact number and more than one recommender, these need to be stored in another table.
Suppose I also have other tables like Customer, each of which has attributes that can hold more than one value.
What is your suggestion for storing these values?
In one source, it was suggested that a table called StringValue be used for storage for each table. Does EF Core have a way to implement StringValue without writing additional code?
Example:
Customer Table:
CustomerId Name NationalCode
------------------------------------------------------------------------
1 David xxxx
------------------------------------------------------------------------
StringValue Table:
StringId CustomerId StringName Value
------------------------------------------------------------------------
10 1 PhoneNumber 915245
11 1 PhoneNumber 985452
12 1 PhoneNumber 935446
13 1 Recommenders Mr Jhon
14 1 Recommenders Mr bb
------------------------------------------------------------------------

I think it is more intuitive to create a new table for each field that can have more than one record, and then configure a one-to-many relationship between the two tables. Taking your case as an example, you can divide the customer table into three tables linked by foreign keys:
1.Customer Table:
CustomerId Name NationalCode
---------------------------------------------
1 David xxxx
2.Contact Table:
Id CustomerId PhoneNumber
---------------------------------------------
1 1 915245
2 1 985452
3 1 935446
3.Recommender Table:
Id CustomerId RecommenderName
---------------------------------------------
1 1 Mr Jhon
2 1 Mr bb
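As a rough illustration of the three-table layout above, here is a minimal sketch of the schema and a lookup. It assumes SQLite through Python's sqlite3 module rather than EF Core, and the helper names build_customer_db and phone_numbers are hypothetical.

```python
import sqlite3

def build_customer_db():
    """Create Customer, Contact and Recommender with one-to-many foreign keys."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    con.executescript("""
        CREATE TABLE Customer (
            CustomerId INTEGER PRIMARY KEY, Name TEXT, NationalCode TEXT);
        CREATE TABLE Contact (
            Id INTEGER PRIMARY KEY,
            CustomerId INTEGER NOT NULL REFERENCES Customer(CustomerId),
            PhoneNumber TEXT);
        CREATE TABLE Recommender (
            Id INTEGER PRIMARY KEY,
            CustomerId INTEGER NOT NULL REFERENCES Customer(CustomerId),
            RecommenderName TEXT);
        INSERT INTO Customer VALUES (1, 'David', 'xxxx');
        INSERT INTO Contact (CustomerId, PhoneNumber) VALUES
            (1, '915245'), (1, '985452'), (1, '935446');
        INSERT INTO Recommender (CustomerId, RecommenderName) VALUES
            (1, 'Mr Jhon'), (1, 'Mr bb');
    """)
    return con

def phone_numbers(con, customer_id):
    """Return all phone numbers stored for one customer, in insertion order."""
    rows = con.execute(
        "SELECT PhoneNumber FROM Contact WHERE CustomerId = ? ORDER BY Id",
        (customer_id,))
    return [row[0] for row in rows]
```

Each multi-valued attribute gets its own typed table, so the foreign key rejects a contact row that points at a customer that does not exist.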

T-SQL Select, manipulate, and re-insert via stored procedure

The short version is I'm trying to map from a flat table to a new set of tables with a stored procedure.
The long version: I want to SELECT records from an existing table, and then for each record INSERT into a new set of tables (most columns will go into one table, but some will go to others and be related back to this new table).
I'm a little new to stored procedures and T-SQL. I haven't been able to find anything particularly clear on this subject.
It would appear I want to do something along the lines of
INSERT INTO [dbo].[MyNewTable] (col1, col2, col3)
SELECT OldCol1, OldCol2, OldCol3
FROM [dbo].[MyOldTable]
But I'm uncertain how to get that to save related records since I'm splitting it into multiple tables. I'll also need to manipulate some of the data from the old columns before it will fit into the new columns.
Thanks
Example data
MyOldTable
Id | Year | Make | Model | Customer Name
572 | 2001 | Ford | Focus | Bobby Smith
782 | 2015 | Ford | Mustang | Bobby Smith
Into (with no worries about duplicate customers or retaining old Ids):
MyNewCarTable
Id | Year | Make | Model
1 | 2001 | Ford | Focus
2 | 2015 | Ford | Mustang
MyNewCustomerTable
Id | FirstName | LastName | CarId
1 | Bobby | Smith | 1
2 | Bobby | Smith | 2

I would preserve the Id from your old table in the new table until you have finished processing the data.
I assume you create an Identity column Id on your MyNewCarTable:
INSERT INTO MyNewCarTable (OldId, Year, Make, Model)
SELECT Id, Year, Make, Model FROM MyOldTable
Then join the old table to the new table to insert into your second table. I assume your MyNewCustomerTable also has an Id column with Identity enabled.
INSERT INTO MyNewCustomerTable (CustomerName, CarId)
SELECT CustomerName, new.Id
FROM MyOldTable old
JOIN MyNewCarTable new ON old.Id = new.OldId
Note: I have not split Customer Name into First Name and Last Name, as I was unsure about the existing data.
If you don't want OldId in MyNewCarTable, you can drop the column afterwards:
ALTER TABLE MyNewCarTable DROP COLUMN OldId
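The two-step approach can be tried end to end with the example data from the question. This is a minimal sketch, assuming SQLite through Python's sqlite3 module in place of SQL Server, and assuming names consist of a first name and a last name separated by one space (the real data may not be that clean). The function name migrate_cars is hypothetical.

```python
import sqlite3

def migrate_cars():
    """Copy cars first (keeping OldId), then attach customers via OldId."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE MyOldTable (
            Id INTEGER, Year INTEGER, Make TEXT, Model TEXT, CustomerName TEXT);
        INSERT INTO MyOldTable VALUES
            (572, 2001, 'Ford', 'Focus', 'Bobby Smith'),
            (782, 2015, 'Ford', 'Mustang', 'Bobby Smith');
        CREATE TABLE MyNewCarTable (
            Id INTEGER PRIMARY KEY AUTOINCREMENT,
            OldId INTEGER, Year INTEGER, Make TEXT, Model TEXT);
        CREATE TABLE MyNewCustomerTable (
            Id INTEGER PRIMARY KEY AUTOINCREMENT,
            FirstName TEXT, LastName TEXT, CarId INTEGER);

        -- Step 1: copy the cars, keeping the old Id as a temporary bridge.
        INSERT INTO MyNewCarTable (OldId, Year, Make, Model)
        SELECT Id, Year, Make, Model FROM MyOldTable ORDER BY Id;
    """)
    # Step 2: join old rows to their new car rows through OldId.
    pairs = con.execute("""
        SELECT o.CustomerName, c.Id
        FROM MyOldTable AS o
        JOIN MyNewCarTable AS c ON o.Id = c.OldId
        ORDER BY c.Id
    """).fetchall()
    for name, car_id in pairs:
        # Assumption: "First Last"; everything after the first space is the last name.
        first, _, last = name.partition(" ")
        con.execute(
            "INSERT INTO MyNewCustomerTable (FirstName, LastName, CarId) "
            "VALUES (?, ?, ?)", (first, last, car_id))
    con.commit()
    return con
```

The name split is done in Python here only to keep the sketch short; in T-SQL the same manipulation would go into the SELECT list of the second INSERT.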

You are missing a step in your normalization. You do not need to duplicate your customer information per vehicle. You need three tables for 4th Normal form. This will reduce storage size and more importantly allow an update to the customer data to take place in one location.
Customer
CustomerID
FirstName
LastName
Car
CarID
Make
Model
Year
CustomerCar
CustomerCarID
CarID
CustomerID
DatePurchased
This way you can have multiple owners per car and multiple cars per owner, and only one record needs to be updated per car and/or customer: 4th Normal Form.

If I am reading this correctly, you want to take each row from table 1, and create a new record into table A using some of that row data, and then data from the same original row into Table B, Table C but referencing back to Table A again?
If that's the case, you will create TableA with an Identity column and make that the PK.
Insert the required column data into that table and use @@IDENTITY (or, more safely, SCOPE_IDENTITY()) to retrieve the last identity value. Then insert the remaining data from the original table into the other tables (TableB, TableC, etc.), using the identity you retrieved from TableA as the FK in those tables.
By Example:
Table 1 has columns col1, col2, col3, col4, col5
Table A has TabAID, col1, col2
Table B has TabBID, TabAID, col3
TableC has TabCID, TabAID, col4
When the first row is read, the values for col1 & col2 are inserted into TableA.
The Identity is captured from that row inserted, and then value for col3 AND the identity are entered into TableB, and then value for col4 AND the identity are entered into TableC.
This is a standard data migration technique for normalizing data.
Hope this assists,
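The row-by-row technique described above can be sketched as follows. This is an illustration only, assuming SQLite through Python's sqlite3 module, where cursor.lastrowid plays the role of the captured identity value; the function name split_rows is hypothetical and col5 is left out for brevity.

```python
import sqlite3

def split_rows(source_rows):
    """Insert each (col1, col2, col3, col4) row into TableA, then use the
    generated TabAID as the foreign key for TableB and TableC."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE TableA (
            TabAID INTEGER PRIMARY KEY AUTOINCREMENT, col1 TEXT, col2 TEXT);
        CREATE TABLE TableB (
            TabBID INTEGER PRIMARY KEY AUTOINCREMENT,
            TabAID INTEGER REFERENCES TableA(TabAID), col3 TEXT);
        CREATE TABLE TableC (
            TabCID INTEGER PRIMARY KEY AUTOINCREMENT,
            TabAID INTEGER REFERENCES TableA(TabAID), col4 TEXT);
    """)
    for col1, col2, col3, col4 in source_rows:
        cur = con.execute(
            "INSERT INTO TableA (col1, col2) VALUES (?, ?)", (col1, col2))
        tab_a_id = cur.lastrowid  # identity generated for the row just inserted
        con.execute(
            "INSERT INTO TableB (TabAID, col3) VALUES (?, ?)", (tab_a_id, col3))
        con.execute(
            "INSERT INTO TableC (TabAID, col4) VALUES (?, ?)", (tab_a_id, col4))
    con.commit()
    return con
```

Reading the identity from the cursor that performed the insert avoids picking up an id generated by some other statement, which is the same concern that makes a scope-limited identity function preferable in T-SQL.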

Some basic questions on database design and how to insert accordingly with LINQ to Entities

OK, I am a total newbie, so bear with me.
I am trying to implement an ordering system and wish to save the orders to the database with LINQ to Entities. I can do that now, but for each new object that is saved to the orders table a new row is inserted with a new OrderNo for each ProductID, whereas I obviously should be able to have multiple ProductIDs for each OrderNo.
Everything is very simplified as I am just testing.
I have an orders table with columns as such:
OrderNo PK, Identity specification
Line int PL
ProductID int
and a products table
ProductID int PK
An order entity object is instantiated and its properties
are populated with data from a form which is posted to an action method.
It is then saved to the orders table with the following code:
(DropDownList1Value) has value of an existing ProductID and "DropDownList1Value" is the id of the DropDownList element in view.
[HttpPost]
public ActionResult OrderProcessor()
{
    int productId = int.Parse(Request.Form["DropDownList1Value"]);
    var order = new Order();
    order.ProductID = productId;
    context.Orders.AddObject(order);
    context.SaveChanges();
    return View(order);
}
So the records that are inserted look as such:
Sorry, couldn't line up the values under their respective column name in this editor.
OrderNo Line ProductID
101 0 3
102 0 5
103 0 2
Whereas I want something like this:
OrderNo Line ProductID
101 1 3
101 2 5
101 3 2
102 1 2
So I wish to know how I can modify the orders table so it can have multiple records with the same OrderNo and just increment Line for different ProductIDs, and how I go about inserting such records with LINQ to Entities, where I will obviously have many ProductIDs from multiple DropDownLists, all for the one order.
Currently I have foreign key dependency on ProductID in Products table,
so no record in the Orders table can have ProductID which does not exists in the Products table.
I need to make the table depend on the whole key that is OrderNo + Line
and have the "Line" auto increment.
Or is there a much better way of implementing of what I am after here?
Thanks.

Let me first explain briefly what I understood.
There is an invoice, which contains several products for one order number.
and this is how your invoice looks like:
Order Number: 101
------------------
Sl. Products
1 3
2 5
3 2
Before answering, I want to point out that you are taking the OrderId from a form (that is, from the client side). This is a wrong and INSECURE approach. Let the order id be auto-generated by the database.
I would suggest tweaking your database design a little.
Here is a solution that will work.
Note: I am assuming your database supports AUTO-INCREMENT; for MS SQL replace it with IDENTITY, and for Oracle you need to create a sequence.
Product (
id INT PK AUTO-INCREMENT
);
Order (
id INT PK AUTO-INCREMENT
user-id INT FK # user who purchased
### and other common details Like date of purchase etc.
);
Order-Detail (
id INT PK AUTO-INCREMENT
order-id INT FK # Common order id
pdt-id INT FK # product which was purchased.
);
When you make a purchase:
1. Insert a row in order table
2. Fetch the last inserted id
3. Insert order-id from last step and products which are purchased in Order-Detail table,
Fetch all the orders made by a user:
1. Read from order table.
Fetch all products purchased for an order:
1. Fetch details from Order-Detail
Note: You will get List of products purchased, Use Order-detail.id as "Line"
EDIT:
Thanks to HLGEM's comment:
If you think the price of a product may change, then instead of updating the price add a new row to the table (and flag the old row so that it won't be visible; you can also have a column in the new row pointing to the old row). Old purchases will then point to the old product row and new orders will point to the updated (new) row.
There is one more approach to this problem:
store the current cost of the product in the order-detail table.
If you have difficulty understanding the above solution, here is another, simpler one.
In the Order table, make a composite primary key consisting of OrderNo and Line.
Whenever you insert into the database you will need to generate the line number in your code, which you can do by running a loop over the array of products being purchased.
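The insert sequence (create the order, fetch its generated id, then insert one line per product with a line number generated in code) might look like the sketch below. It assumes SQLite through Python's sqlite3 module rather than LINQ to Entities, and the table names OrderHeader and OrderLine and both function names are hypothetical.

```python
import sqlite3

def create_schema():
    """One row per order, plus one row per (OrderNo, Line) pair."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE OrderHeader (OrderNo INTEGER PRIMARY KEY AUTOINCREMENT);
        CREATE TABLE OrderLine (
            OrderNo   INTEGER NOT NULL REFERENCES OrderHeader(OrderNo),
            Line      INTEGER NOT NULL,
            ProductID INTEGER NOT NULL,
            PRIMARY KEY (OrderNo, Line));
    """)
    return con

def place_order(con, product_ids):
    """Insert one order and number its lines 1..n in the order given."""
    cur = con.execute("INSERT INTO OrderHeader DEFAULT VALUES")
    order_no = cur.lastrowid  # id generated by the database, not by the client
    con.executemany(
        "INSERT INTO OrderLine (OrderNo, Line, ProductID) VALUES (?, ?, ?)",
        [(order_no, line, pid) for line, pid in enumerate(product_ids, start=1)])
    con.commit()
    return order_no
```

For example, place_order(con, [3, 5, 2]) stores three lines numbered 1 to 3 under a single order number, and the composite primary key rejects a duplicate line number within the same order.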

I think it would be better to split your current Order table into two separate tables:
Order table
(PK, Identity specification) OrderId
Perhaps other fields like Invoice address, Delivery address, etc.
OrderLine table
(PK, Identity specification) OrderLineId
(FK to Order table) OrderId
(FK to Product table) ProductId
For both tables you have an Entity in your class model: class Order and class OrderLine and a one-to-many relationship between them, so Order has a collection of OrderLines.
Creating an order with all order lines would then look like this:
var order = new Order();
foreach (var item in collection)
{
    var orderLine = new OrderLine();
    // Get productId from your DropDownLists
    orderLine.ProductId = productId;
    order.OrderLines.Add(orderLine);
}
context.Orders.AddObject(order);
context.SaveChanges();
Edit
The MVC MusicStore Tutorial might also help for the first steps to create an order processing system with ASP.NET MVC and Entity Framework. It contains classes for orders and order details (among others) and explains their relationships.

How to automatically set the SNo of the row in MS SQL Server 2005?

I have a couple of rows in a database table (lets call it Customer). Each row is numbered by SNo, which gets automatically incremented by the identity property inherent in MS SQLServer. But when I delete a particular row that particular row number is left blank, but I want the table to auto correct itself.
To give you a example:
I have a sample Customer Table with following rows:
SNo CustomerName Age
1 Dani 28
2 Alex 29
3 Duran 21
4 Mark 24
And suppose I delete 3rd row the table looks like this:
SNo CustomerName Age
1 Dani 28
2 Alex 29
4 Mark 24
But I want the table to look like this:
SNo CustomerName Age
1 Dani 28
2 Alex 29
3 Mark 24
How can I achieve that?
Please help me out
Thanks in anticipation

As has been pointed out, doing that would break anything in a relationship with SNo. However, if you are doing this because you need ordinal numbers in your presentation layer, for example, you can produce a [1..n] row number with:
SELECT ROW_NUMBER() OVER(ORDER BY SNo ASC), SNo, CustomerName, Age FROM Customer
Obviously, in this case the row number is just an incrementing number; it is meaningless in relation to anything else.
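To see the effect on the sample data from the question, here is a small sketch. It assumes SQLite 3.25 or later (which supports window functions) through Python's sqlite3 module as a stand-in for SQL Server; the function name numbered_customers and the RowNo alias are hypothetical.

```python
import sqlite3

def numbered_customers():
    """Delete the row with SNo = 3, then return (RowNo, SNo, CustomerName, Age)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Customer (
            SNo INTEGER PRIMARY KEY, CustomerName TEXT, Age INTEGER);
        INSERT INTO Customer VALUES
            (1, 'Dani', 28), (2, 'Alex', 29), (3, 'Duran', 21), (4, 'Mark', 24);
        DELETE FROM Customer WHERE SNo = 3;
    """)
    # RowNo is computed at query time; the stored SNo values keep their gap.
    rows = con.execute("""
        SELECT ROW_NUMBER() OVER (ORDER BY SNo ASC) AS RowNo,
               SNo, CustomerName, Age
        FROM Customer
        ORDER BY SNo
    """).fetchall()
    con.close()
    return rows
```

Mark is displayed as row 3 while still being stored with SNo 4, so anything that references SNo stays valid.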

I don't think you want to do that. Imagine the scenario where you have another table CustomerOrder that stores all customer orders. The structure for that table might look something like this:
CustomerOrder
-------------
OrderID INT
SNo INT
OrderDate DATETIME
...
In this case, the SNo field in the CustomerOrder table is a foreign key referencing the Customer table, and we use it to relate orders to a customer. If you delete a record from your Customer table (say with SNo = 1), are you going to go back and update the SNo values in the entire CustomerOrder table? It's best to just let the IDs auto-increment and not worry about gaps in the IDs due to deletions.

Why not create a view?
CREATE VIEW <ViewName>
AS
SELECT
    ROW_NUMBER() OVER(ORDER BY SNo ASC) AS SNo
    ,CustomerName
    ,Age
FROM Customer
GO
Then access the data in the Customer table by selecting from the view.
Of course the SNo shown by the view has no meaning in the context of relationships, but the data returned will look exactly like you want it to look.

You have to use DBCC CHECKIDENT(table_name, RESEED, next_val_less_1);

As has been pointed out in other answers, this is a bad idea, and if the reason is presentation there are other solutions.
-- Add data to temp table
select SNo, CustomerName, Age
into #Customer
from Customer
-- Truncate Customer
-- Resets identity to seed value for column
truncate table Customer
-- Add rows back to Customer
insert into Customer(CustomerName, Age)
select CustomerName, Age
from #Customer
order by SNo
drop table #Customer

Outputting Results from complicated database structure (SQL Server)

This will be a long question so I'll try and explain it as best as I can.
I've developed a simple reporting tool in which a number of results are stored and given a report id, these results were generated from a particular quote being used on the main system, with a huge list of these being stored in a quotes table. Here are the current batch:
REPORTS
REP_ID DESC QUOTE_ID
-----------------------------------
1 Test 1
2 Today 1
3 Last Week 2
RESULTS
RES_ID TITLE REFERENCE REP_ID
---------------------------------------------------
1 Equipment Toby 1
2 Inventory Carl 1
3 Stocks Guest 2
4 Portfolios Guest 3
QUOTE
QUOTE_ID QUOTE
------------------------------------
1 Booking a meeting room
2 Car Park Policy
3 New User Guide
So far, so good, a simple stored procedure was able to pull all the information necessary.
Now, the feature list has been upped to include categories and groups of the quotes. In the Reports table quote_id has been changed to group_id to link to the following tables.
REPORTS
- REPORT_ID
- DESC
- GROUP_ID
GROUP
- GROUP_ID
- GROUP
GROUP_CAT_JOIN
- GCJ_ID
- CAT_ID
- GROUP_ID
CATEGORIES
- CAT_ID
- CATEGORY
CAT_QUOTE_JOIN
- CQJ_ID
- CAT_ID
- QUOTE_ID
The idea of these changes is so that instead of running a report on a quote I should now write a report for a group where a group is a set of quotes for certain occasions. I should also be able to run a report on a category where a category is also a set of quotes for certain departments. The trick is that several categories can fall into one group.
To explain it further, the results table has a report_id that links to reports, reports has a group_id that links to groups, groups and categories are linked through a group_cat_join table, the same with categories and quotes through a cat_quote_join table.
In basic terms I should be able to pull all the results from either a group of quotes or a category of quotes. The query will aim to pull all the results from a certain report under either a certain category, a group or both. This puzzle has left me stumped for days now as inner joins don't appear to be working and I'm struggling to find other ways to solve the problem using SQL.
Can anyone here help me?
Here's some extra clarification.
I want to be able to return all the results within a category, but as of right now the solution below and the ones I've tried always output every solution within a description, which is not what I want.
Here's an example of the data I have in there at the moment
Results
RES_ID TITLE REFERENCE REP_ID
---------------------------------------------------
1 Equipment Toby 1
2 Inventory Carl 1
3 Stocks Guest 2
4 Portfolios Guest 3
Reports
REP_ID DESC GROUP_ID
-----------------------------------
1 Test 1
2 Today 1
3 Last Week 2
GROUP
GROUP_ID GROUP
---------------------------------
1 Standard
2 Target Week
GROUP_CAT_JOIN
GCJ_ID GROUP_ID CAT_ID
----------------------------------
1 1 1
2 1 2
3 2 3
CATEGORIES
CAT_ID CAT
-------------------------------
1 York Office
2 Glasgow Office
3 Aberdeen Office
CAT_QUOTE_JOIN
CQJ_ID CAT_ID QUOTE_ID
-----------------------------------
1 1 1
2 2 2
3 3 3
QUOTE
QUOTE_ID QUOTE
------------------------------------
1 Booking a meeting room
2 Car Park Policy
3 New User Guide
This is the test data I am using at the moment and to my knowledge it is similar to what will be run through once this is done. In all honesty I'm still trying to get my head around this structure.
The result I am looking for is if I choose to search by group I'll get everything within a group, if I choose everything inside a category I get everything just inside that category, and if I choose something from a category in a group I get everything inside that category. The problem at the moment is that whenever the group is referenced everything inside every category that's linked to the group is pulled.

The following will get the necessary rows from the results:
select
    a.*
from
    results a
    inner join reports b on
        a.rep_id = b.rep_id
        and (-1 = @GroupID or
             b.group_id = @GroupID)
        and (-1 = @CatID or
             b.cat_id = @CatID)
Note that I used -1 as the placeholder for all Groups and Categories. Obviously, use a value that makes sense to you. However, this way, you can specify a specific group_id or a specific cat_id and get the results that you want.
Additionally, if you want Group/Category/Quote details, you can always append more inner joins to get that info.
Also note that I added the Group_ID and Cat_ID conditions to the Reports table. This would be the SQL necessary if and only if you add a Cat_ID column to the Reports table. I know that your current table structure doesn't support this, but it needs to. Otherwise, as my grandfather used to say, "Boy, you can't get there from here." The issue here is that you want to limit reports by group and category, but reports only knows about group. Therefore, we need to tie something to the category from reports. Otherwise, it will never, ever, ever limit reports by category. The only thing that you can limit by both group and category is quotes. And that doesn't seem to be your requirement.
As an addendum: If you add cat_id to results instead of reports, the join condition should be:
        and (-1 = @CatID or
             a.cat_id = @CatID)
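The -1 placeholder pattern can be tried out with the question's sample data. This is only a sketch, assuming SQLite through Python's sqlite3 module and assuming the extra cat_id column on reports that this answer proposes; the cat_id values below are invented for illustration, as are the names ALL, make_reports_db and find_results.

```python
import sqlite3

ALL = -1  # placeholder meaning "do not filter on this column"

def make_reports_db():
    """Sample data from the question, plus a hypothetical reports.cat_id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE reports (
            rep_id INTEGER, "desc" TEXT, group_id INTEGER, cat_id INTEGER);
        CREATE TABLE results (
            res_id INTEGER, title TEXT, reference TEXT, rep_id INTEGER);
        -- The cat_id values are invented for this sketch.
        INSERT INTO reports VALUES
            (1, 'Test', 1, 1), (2, 'Today', 1, 2), (3, 'Last Week', 2, 3);
        INSERT INTO results VALUES
            (1, 'Equipment', 'Toby', 1), (2, 'Inventory', 'Carl', 1),
            (3, 'Stocks', 'Guest', 2), (4, 'Portfolios', 'Guest', 3);
    """)
    return con

def find_results(con, group_id=ALL, cat_id=ALL):
    """Return (res_id, title) rows, filtered only by the ids that are not ALL."""
    return con.execute("""
        SELECT a.res_id, a.title
        FROM results AS a
        INNER JOIN reports AS b
            ON a.rep_id = b.rep_id
           AND (-1 = :group_id OR b.group_id = :group_id)
           AND (-1 = :cat_id OR b.cat_id = :cat_id)
        ORDER BY a.res_id
    """, {"group_id": group_id, "cat_id": cat_id}).fetchall()
```

With this data, filtering by group 1 alone returns results 1 to 3, while group 1 combined with category 2 narrows that to result 3, which is the behaviour the question asks for.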

Is this what you are looking for?
SELECT a.*
FROM Results a
JOIN Reports b ON a.REP_Id = b.REP_Id
WHERE EXISTS (
    SELECT * FROM CAT_QUOTE_JOIN c
    WHERE c.QUOTE_ID = b.QUOTE_ID -- correlation to the outer query
    AND c.CAT_ID = @CAT_ID -- parameterization
)
OR EXISTS (
    -- note that subquery table aliases are not visible to other subqueries
    -- so we can reuse the same letters
    SELECT * FROM CAT_QUOTE_JOIN c, GROUP_CAT_JOIN d
    WHERE c.CAT_ID = d.CAT_ID -- subquery join
    AND c.QUOTE_ID = b.QUOTE_ID -- correlation to the outer query
    AND d.GROUP_ID = @GROUP_ID -- parameterization
)
