SSIS Lookup Suggestion

I have a task to generate a derived column (RestrictionID) from a lookup table and add it to a source table when 6 columns in both tables match. The 6 columns include Country, Department, Account, etc. I decided to go with the SSIS Lookup and generate the derived column when there's a match. Each ID also has a time limit and an amount limit. Once all the records have IDs, I'm supposed to calculate a running total per ID to enforce the limits, which is the easy part.
The only problem is that this lookup table changes almost daily, and any or all of the 6 columns can be NULL. Even the completely NULL rows have an ID. A NULL means the restriction is open. E.g., if the Country column of one record in the lookup table is NULL, then the ID of that record can be assigned to source records with any country. If one row in the lookup table has NULL in all 6 columns, it is completely open and all records in the source qualify for that ID. The source table doesn't have NULLs.
Please assist if possible
Thanks

If NULL in the lookup means "match anything, ignore this column", compare each column against ISNULL(lookup column, parameter) in your WHERE clause. For example, use a stored proc, pass your source values in, and return the matching ID:
select lookup.ID
from lookup
where @Country = isnull(lookup.Country, @Country) -- if lookup.Country is NULL, the parameter is compared with itself, creating a 1=1 scenario
and @Department = isnull(lookup.Department, @Department)
and ...
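
The same NULL-as-wildcard test can also be written as a join, so every source row is matched in one pass. Below is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server (COALESCE plays the role of ISNULL). The table contents, the RowId column, and the reduction to two of the six columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; only 2 of the 6 matching columns are shown.
    CREATE TABLE source (RowId INTEGER, Country TEXT, Department TEXT);
    CREATE TABLE lookup (ID INTEGER, Country TEXT, Department TEXT);
    INSERT INTO source VALUES (1, 'US', 'Sales'), (2, 'DE', 'IT');
    INSERT INTO lookup VALUES
        (10, 'US', 'Sales'),   -- fully specified restriction
        (20, NULL, 'IT'),      -- any country, IT only
        (30, NULL, NULL);      -- completely open restriction
""")

# A NULL lookup column falls back to the source value, so that column always matches.
matches = conn.execute("""
    SELECT s.RowId, l.ID
    FROM source AS s
    JOIN lookup AS l
      ON s.Country    = COALESCE(l.Country,    s.Country)
     AND s.Department = COALESCE(l.Department, s.Department)
    ORDER BY s.RowId, l.ID
""").fetchall()

print(matches)
```

Note that a source row can qualify for more than one ID this way (every row matches the completely open restriction), so a rule for choosing among them would still be needed.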

Related

Bad design to compare to computed columns?

Using SQL Server I have a table with a computed column. That column concatenates 60 columns:
CREATE TABLE foo
(
    Id INT NOT NULL,
    PartNumber NVARCHAR(100),
    field_1 INT NULL,
    field_2 INT NULL,
    -- and so forth up to field_60
    field_60 INT NULL
);
ALTER TABLE foo
ADD RecordKey AS CONCAT (field_1, '-', field_2, '-', -- and so on up to 60
) PERSISTED;
CREATE INDEX ix_foo_RecordKey ON dbo.foo (RecordKey);
Why I used a persisted column:
Not having the need to index 60 columns
To test to see if a current record exists by checking just one column
This table will contain no fewer than 20 million records. Adds/Inserts/updates happen a lot, and some binaries do tens of thousands of inserts/updates/deletes per run and we want these to be quick and live.
Currently we have C# code that manages records in table foo. It has a function which concatenates the same fields, in the same order, as the computed column. If a record with that same concatenated key already exists we might not insert, or we might insert but call other functions that we may not normally.
Is this a bad design? The big danger I see is if the code for any reason doesn't match the concatenation order of the computed column (if one is edited but not the other).
Rules/Requirements
We want to show records in JQGrid. We already have C# that can do so if the records come from a single table or view
We need the ability to check two records to verify if they both have the same values for all of the 60 columns

A better table design would be
parts table
-----------
id
partnumber
other_common_attributes_for_all_parts
attributes table
----------------
id
attribute_name
attribute_unit (if needed)
part_attributes table
---------------------
part_id (foreign key to parts)
attribute_id (foreign key to attributes)
attribute_value
It looks complicated, but with proper indexing (for example on part_id and attribute_id) this stays fast even if part_attributes contains billions of records!
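
To make the three-table layout concrete, here is a minimal sketch, assuming Python's built-in sqlite3 module rather than SQL Server. It also shows one possible way to meet the "do two records have the same values" requirement under this design. The sample rows and the helper same_attributes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (id INTEGER PRIMARY KEY, partnumber TEXT);
    CREATE TABLE attributes (id INTEGER PRIMARY KEY, attribute_name TEXT);
    CREATE TABLE part_attributes (
        part_id INTEGER REFERENCES parts(id),
        attribute_id INTEGER REFERENCES attributes(id),
        attribute_value INTEGER,
        PRIMARY KEY (part_id, attribute_id)   -- doubles as the lookup index
    );
    -- Hypothetical sample data: parts 1 and 2 agree, part 3 differs in field_2.
    INSERT INTO parts VALUES (1, 'A-100'), (2, 'B-200'), (3, 'C-300');
    INSERT INTO attributes VALUES (1, 'field_1'), (2, 'field_2');
    INSERT INTO part_attributes VALUES
        (1, 1, 5), (1, 2, 7),
        (2, 1, 5), (2, 2, 7),
        (3, 1, 5), (3, 2, 9);
""")

def same_attributes(conn, part_a, part_b):
    """Return True if both parts have exactly the same (attribute, value) pairs."""
    diff_sql = """
        SELECT COUNT(*) FROM (
            SELECT attribute_id, attribute_value FROM part_attributes WHERE part_id = ?
            EXCEPT
            SELECT attribute_id, attribute_value FROM part_attributes WHERE part_id = ?
        )
    """
    # The set difference must be empty in both directions.
    a_minus_b = conn.execute(diff_sql, (part_a, part_b)).fetchone()[0]
    b_minus_a = conn.execute(diff_sql, (part_b, part_a)).fetchone()[0]
    return a_minus_b == 0 and b_minus_a == 0

print(same_attributes(conn, 1, 2), same_attributes(conn, 1, 3))
```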

try to add a new column as foreign key in existing table with data and existing data manipulation

A very simple example. I have web API with a table in the database
Employees
---------
Id
---------
Name
and for example, I have 50 records.
Now I have to implement a feature to add extra info about the department. Because this is a one-to-many relationship, the new database schema adds a department id:
Employees       Department
---------       ----------
Id              Id
Name            Name
DepartmentId
For this, I run the following query (I use SQL Server):
alter table Employees add constraint fk_employees_departmentid
foreign key (DepartmentId) references Department(Id);
But now I have some issues to handle
1) Now I have the 50 existing records without a DepartmentId. Must I add this value manually? What is the best practice? For 50 records it is possible, but what about 2000 records or more?
2) When I add the DepartmentId column, I set it to allow null values (is that correct?), but as a foreign key, I don't want to allow null values. Can I change it, or how can I handle it?

1) Now I have the 50 existing records without a DepartmentId. Must I add this value manually? What is the best practice? For 50 records it is possible, but what about 2000 records or more?
It depends. You could set up a new department called "Unassigned" and assign them all to that, or you could send a spreadsheet to HR saying "the following employees don't have an assigned department; what department are they in? PS: don't remove the EmployeeID column from the sheet before you send it back; I need it to update the DB". It's very much a business question, not a technical one. X thousand records is easy to handle; it'll just take a bit of time to work through if you (or someone else) are doing it manually. This information is likely to be available somewhere else. You could perhaps send a list to all department heads saying "are any of these people yours? Please remove all the names that aren't in your team from this spreadsheet and send it back to me", then update the DB based on what you get back.
As this is a one-time operation, you don't need anything particularly fancy for it. You can just get your Excel sheet back and, in an empty column, put:
="UPDATE Employees SET DepartmentId = 5 WHERE Id = " & A1
Then fill it down to generate a bunch of UPDATE statements, copy the text into your query tool, and run it. There's no need to load the sheet into a table or write update joins; just let Excel write the SQL for you, then copy, paste, and run. If HR has sent back the sheet with a list of department names, put the department names and IDs somewhere else on the sheet, use VLOOKUP or XLOOKUP to turn each name into the department ID, and compose your SQL from that.
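
If a script is handier than a spreadsheet formula, the same statement generation can be sketched in a few lines of Python. This is only an illustration: the (employee id, department id) pairs are made up, build_updates is a hypothetical helper, and the Employees, Id, and DepartmentId names are taken from the question.

```python
# Hypothetical (employee id, department id) pairs, e.g. typed in from the returned sheet.
assignments = [(1, 5), (2, 5), (3, 7)]

def build_updates(pairs):
    """Build one UPDATE statement per (employee id, department id) pair."""
    statements = []
    for employee_id, department_id in pairs:
        # int() rejects non-numeric cell text before it can end up in the SQL.
        statements.append(
            f"UPDATE Employees SET DepartmentId = {int(department_id)} "
            f"WHERE Id = {int(employee_id)};"
        )
    return statements

updates = build_updates(assignments)
print("\n".join(updates))
```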
2) When I add the DepartmentId column, I set it to allow null values (is that correct?), but as a foreign key, I don't want to allow null values. Can I change it, or how can I handle it?
Foreign key columns are allowed to contain NULL values. It isn't the FK that imposes a "no NULLs" restriction; it's the nullability of the column (alter the column to DepartmentId INT NOT NULL to impose it). A FK references a primary key, and the primary key may not be NULL (or, in some databases, at most one record can have a [partly] NULL key), but you could just leave those DepartmentId values NULL. If you do alter the column to be NOT NULL, you'll need to fill in the NULL values first or the change will fail.
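
The behaviour described above can be tried out with a small sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server. The sample names and the "Unassigned" department are illustrative. The final NOT NULL step is left out because SQLite cannot alter an existing column's nullability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
    CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Employees (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    INSERT INTO Employees (Name) VALUES ('Ann'), ('Bob');
    -- New nullable FK column: existing rows simply get NULL.
    ALTER TABLE Employees ADD COLUMN DepartmentId INTEGER REFERENCES Department(Id);
""")

count_nulls = "SELECT COUNT(*) FROM Employees WHERE DepartmentId IS NULL"
nulls_before = conn.execute(count_nulls).fetchone()[0]

# NULL is accepted by the FK, but a value with no matching department is not.
try:
    conn.execute("UPDATE Employees SET DepartmentId = 99 WHERE Name = 'Ann'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Backfill with a placeholder department so that no NULLs remain.
conn.execute("INSERT INTO Department (Id, Name) VALUES (1, 'Unassigned')")
conn.execute("UPDATE Employees SET DepartmentId = 1 WHERE DepartmentId IS NULL")
nulls_after = conn.execute(count_nulls).fetchone()[0]

print(nulls_before, rejected, nulls_after)
```

Once no NULLs remain, SQL Server lets you tighten the column with ALTER TABLE Employees ALTER COLUMN DepartmentId INT NOT NULL.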

SQL Server Insert If Not Exists - No Primary Key

I have Table A and Table B.
Table A contains data from another source.
Table B contains data that is inserted from Table A along with data from other tables. I have done the initial insert of data from A to B but now what I am trying to do is insert the records that do not exist already in Table B from Table A on a daily basis. Unfortunately, there is no primary key or unique identifier in Table A which is making this difficult.
Table A contains a field called file_name which has values that looks like this:
this_is_a_file_name_01011980.txt
There can be duplicate values in this column (multiple files from the same date).
In Table B I created a column data_date which extracts the date from the table a.file_name field. There is also a load_date field which just uses GETDATE() at the time the data is inserted.
I am thinking I can somehow compare the dates in these tables to decide what needs to be inserted. For example:
If the file date from Table A (would need to extract again) is greater than the load_date of Table B, then insert these records into Table B.
Let me know if any clarification is needed.

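You could use EXISTS or EXCEPT. From the explanation here, it seems like EXCEPT would make short work of this. Both sides of EXCEPT must return the same number of columns with compatible types, so if Table B has extra columns such as data_date and load_date, list the shared columns explicitly instead of using *. Something like this: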
insert tableB
select * from tableA
except
select * from tableB
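
Here is a minimal runnable sketch of the EXCEPT approach with explicit column lists, assuming Python's built-in sqlite3 module instead of SQL Server. The amount column and the sample rows are hypothetical, and DATE('now') stands in for GETDATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layout: tableB has one extra column that tableA lacks.
    CREATE TABLE tableA (file_name TEXT, amount INTEGER);
    CREATE TABLE tableB (file_name TEXT, amount INTEGER, load_date TEXT);
    INSERT INTO tableA VALUES
        ('this_is_a_file_name_01011980.txt', 10),
        ('this_is_a_file_name_01021980.txt', 20);
    INSERT INTO tableB VALUES
        ('this_is_a_file_name_01011980.txt', 10, '1980-01-01');
""")

# Compare only the shared columns, then stamp the load date on the new rows.
insert_new = """
    INSERT INTO tableB (file_name, amount, load_date)
    SELECT file_name, amount, DATE('now')
    FROM (
        SELECT file_name, amount FROM tableA
        EXCEPT
        SELECT file_name, amount FROM tableB
    )
"""
count_rows = "SELECT COUNT(*) FROM tableB"

conn.execute(insert_new)
after_first_run = conn.execute(count_rows).fetchone()[0]
conn.execute(insert_new)  # running it again inserts nothing new
after_second_run = conn.execute(count_rows).fetchone()[0]

print(after_first_run, after_second_run)
```

One caveat: EXCEPT returns distinct rows, so rows in Table A that are identical in every compared column are inserted only once.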

Inserting new rows in a table

When I add new rows using INSERT INTO ... SELECT, the new rows get added randomly in between the already existing rows, instead of getting added to the end of the table.
I'm using: Insert into Table1 (Name1) select Name from Table2.

SQL tables are modeled after unordered sets, and hence you should not assume that there is any order to your data in the table. The only order which exists is what you specify when you query using ORDER BY, e.g.
SELECT Name1
FROM Table1
ORDER BY Name1
An index can also be thought of as a way of ordering your records, but an index is mostly a distinct entity from the actual table.
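
A minimal sketch of the point about ORDER BY, assuming Python's built-in sqlite3 module and made-up names: whatever order the rows were inserted in, the query result is sorted only because the query asks for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Name1 TEXT);
    CREATE TABLE Table2 (Name TEXT);
    -- Hypothetical sample names.
    INSERT INTO Table1 VALUES ('Mia'), ('Bob');
    INSERT INTO Table2 VALUES ('Zoe'), ('Al');
    INSERT INTO Table1 (Name1) SELECT Name FROM Table2;
""")

# Without ORDER BY the row order is unspecified; with it, the order is guaranteed.
ordered = [row[0] for row in conn.execute("SELECT Name1 FROM Table1 ORDER BY Name1")]
print(ordered)
```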

I agree with Tim's answer. But if you still want the rows kept in insertion order, you can add an incrementing primary key yourself (like 1, 2, 3 ... or 10, 20, 30 ...) and sort by it.
Although I don't recommend it, I think the following can help if you don't want to manage the primary key yourself:
How do I add a auto_increment primary key in SQL Server database?

Merging SQLite3 tables with identical primary keys

I am trying to merge two tables with financial information about the same list of stocks: the first is the prices table (containing daily, weekly, monthly, etc. price data) and the second is the ratios table (containing valuation and other ratios). Both tables have identical primary key numerical ID columns (referencing the same stock tickers). After creating a connection cursor cur, my code for doing this is:
CREATE TABLE IF NOT EXISTS prices_n_ratios AS SELECT * FROM
(SELECT * FROM prices INNER JOIN ratios ON prices.id = ratios.id);
DROP TABLE prices;
DROP TABLE ratios;
This works fine except that the new prices_n_ratios table contains an extra column named ID:1 whose name is causing problems during further processing.
How do I avoid the creation of this column, maybe by somehow excluding the second table's primary key ID column from * (listing all the column names is not an option)? Or, if I can't, how can I get rid of this extra column from the generated table, as I have found it very hard to delete it in SQLite3?

Just list all the columns you actually want in the SELECT clause, instead of using *.
Alternatively, join with the USING clause, which automatically removes the duplicate column:
SELECT * FROM prices JOIN ratios USING (id)
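
A minimal sketch of the USING variant applied to the CREATE TABLE ... AS statement from the question, assuming Python's built-in sqlite3 module. The daily and pe columns and the sample rows are hypothetical placeholders for the real price and ratio columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    -- Hypothetical, heavily reduced versions of the two tables.
    CREATE TABLE prices (id INTEGER PRIMARY KEY, daily REAL);
    CREATE TABLE ratios (id INTEGER PRIMARY KEY, pe REAL);
    INSERT INTO prices VALUES (1, 10.5), (2, 20.0);
    INSERT INTO ratios VALUES (1, 15.0), (2, 30.0);

    -- USING (id) keeps a single id column in the SELECT * result.
    CREATE TABLE IF NOT EXISTS prices_n_ratios AS
        SELECT * FROM prices JOIN ratios USING (id);
""")

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in cur.execute("PRAGMA table_info(prices_n_ratios)")]
print(columns)
```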