How to create dynamic tables in SSIS - sql-server

I have a scenario where I need to load my CSV files into multiple tables. The table names are different or unknown until run time. How can I create an SSIS package that loads data into a table whose name is not known in advance? To be specific, I have a table A and a table B, and the structure of both tables is the same. Today they may be called Table A and Table B, but they may change to Table C and Table D with the same structure. I want to write an SSIS package to load data into these 2 tables, but since the table names change all the time, I do not know how to accomplish this.
Thanks!
I do not know where to begin.
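One common pattern for this (a sketch, not the only option; the staging table and column names below are hypothetical) is to bulk load the CSV into a staging table with the fixed structure, then move the rows with dynamic SQL from an Execute SQL Task, taking the target table name from a package variable:
-- Move staged rows into a table whose name is only known at run time.
-- @TargetTable would be populated from an SSIS variable; dbo.StagingTable
-- and the column list are assumptions for illustration.
DECLARE @TargetTable sysname = N'TableC';
DECLARE @sql nvarchar(max) =
    N'INSERT INTO ' + QUOTENAME(@TargetTable) + N' (Col1, Col2)
      SELECT Col1, Col2 FROM dbo.StagingTable;';
EXEC sp_executesql @sql;
Alternatively, an OLE DB Destination can take its table name from a variable (the "Table name or view name variable" access mode), which keeps the load inside the data flow and avoids dynamic SQL.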

Related

SSIS flat file with joins

I have a flat file with the following columns:
Device Name
Device Type
Device Location
Device Zone
I need to insert these into a SQL Server table called Devices.
The Devices table has the following structure:
DeviceName
DeviceTypeId (foreign key from DeviceType table)
DeviceLocationId (foreign key from DeviceLocation table)
DeviceZoneId (foreign key from DeviceZone table)
The DeviceType, DeviceLocation and DeviceZone tables are already prepopulated.
Now I need to write an ETL process that reads the flat file and, for each row, gets the DeviceTypeId, DeviceLocationId and DeviceZoneId from the corresponding tables and inserts the row into the Devices table.
I am sure this is not new, but it has been a while since I worked on such SSIS packages, and help would be appreciated.
Load the flat file content into a staging table and write a stored procedure to handle the inserts and updates in T-SQL.
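A minimal sketch of that procedure, assuming the staging table mirrors the flat file and each lookup table has a Name column next to its Id (the staging table and Name columns are assumptions):
-- Resolve each foreign key by joining the staged text values
-- to the prepopulated lookup tables.
CREATE PROCEDURE dbo.LoadDevices
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.Devices (DeviceName, DeviceTypeId, DeviceLocationId, DeviceZoneId)
    SELECT s.DeviceName, t.DeviceTypeId, l.DeviceLocationId, z.DeviceZoneId
    FROM dbo.StagingDevices AS s
    JOIN dbo.DeviceType     AS t ON t.Name = s.DeviceType
    JOIN dbo.DeviceLocation AS l ON l.Name = s.DeviceLocation
    JOIN dbo.DeviceZone     AS z ON z.Name = s.DeviceZone;
END;
Note that the inner joins silently drop rows with no match; switch to LEFT JOINs and route rows with NULL ids to an error table if you need to inspect them.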
Having FK relationships between the destination tables can cause a lot of trouble with a single data flow and a multicast.
The problem is that you have no control over the order of the inserts so the child record could be inserted before the parent.
Also, for identity columns on the tables, you cannot retrieve the identity value from one stream and use it in another without using subsequent merge joins.
The simplest way to do this is to use a Lookup Transformation to get the ID for each value. Be aware that duplicates may lead to a problem: you have to make sure that a value is not found multiple times in the foreign tables.
Also, make sure to redirect rows that have no match to a staging table so you can check them later.
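A quick way to check for such duplicates up front (assuming a Name column on each lookup table, as in the sketch above):
-- Any row returned here would make the Lookup ambiguous.
SELECT Name, COUNT(*) AS Occurrences
FROM dbo.DeviceType
GROUP BY Name
HAVING COUNT(*) > 1;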
You can refer to the following article for a step by step guide to Lookup Transformation:
An Overview of the LOOKUP TRANSFORMATION in SSIS

How does PostgreSQL manage columns?

I want to know how PostgreSQL manages the columns of a table.
Say, for example, I have created a table that contains 2 fields. How does PostgreSQL manage these columns and the table? In how many catalog tables does PostgreSQL create an entry for a single column?
I would like to understand the structures PostgreSQL uses to manage a table and its fields.
I only know about the pg_attribute table.
It would be good if anyone could share useful links.
Any help would be really appreciated.
Tables (and indexes) are organized in 8KB blocks in files in the data directory.
The column definitions are only in pg_attribute.
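For example, this shows what pg_attribute records for a table (the table name is a placeholder):
-- One row per column; the table itself has a single row in pg_class.
SELECT attname, atttypid::regtype AS data_type, attnum
FROM pg_attribute
WHERE attrelid = 'mytable'::regclass
  AND attnum > 0        -- skip system columns such as ctid and xmin
  AND NOT attisdropped  -- dropped columns keep their entry, marked as dropped
ORDER BY attnum;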
A table row with all its columns is stored together in one table block, and a table block can contain several such rows. In other words, PostgreSQL uses the traditional row-oriented storage model.
Details can be read in the documentation.
Note: Don't use PostgreSQL 9.1 any more.

How to create a 'sanitized' copy of our SQL Server database?

We're a manufacturing company, and we've hired a couple of data scientists to look for patterns and correlation in our manufacturing data. We want to give them a copy of our reporting database (SQL 2014), but it must be in a 'sanitized' form. This means that all table names get converted to 'Table1', 'Table2' etc., and column names in each table become 'Column1', 'Column2' etc. There will be roughly 100 tables, some having 30+ columns, and some tables have 2B+ rows.
I know there is a hard way to do this. This would be to manually create each table, with the sanitized table name and column names, and then use something like SSIS to bulk insert the rows from one table to another. This would be rather time consuming and tedious because of the manual SSIS column mapping required, and manual setup of each table.
I'm hoping someone has done something like this before and has a much faster, more efficient way.
By the way, the 'sanitized' database will have no indexes or foreign keys. Also, it may not seem to make sense why we would want to do this, but this is what was agreed to by our Director of Manufacturing and the data scientists for the first round of what will be many rounds of analysis.
You basically want to scrub the data and objects, correct? Here is what I would do.
1. Restore a backup of the db.
2. Drop all objects not needed (indexes, constraints, stored procedures, views, functions, triggers, etc.).
3. Create a mapping table with two columns and populate it: each row has the original table name and the new table name.
4. Write a script that iterates through that table, row by row, and renames your tables. Better yet, put the data into Excel and create a third column that builds the T-SQL you want to run, then cut/paste and execute it in SSMS.
5. Repeat step 4, but for all columns. Best to query sys.columns to get all the objects you need, put the results into Excel, and build your T-SQL (see the sketch after this list).
6. Repeat again for any other objects needed.
Backup/restore will be quicker than dabbling in SSIS and data transfer.
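For step 5, a sketch that builds the rename statements straight from sys.columns (the ColumnN naming scheme is the one from the question):
-- Generate one sp_rename per user-table column; review the output,
-- then paste and execute it in SSMS. column_id can have gaps if
-- columns were ever dropped; ROW_NUMBER() would give contiguous names.
SELECT 'EXEC sp_rename ''' + s.name + '.' + t.name + '.' + c.name
       + ''', ''Column' + CAST(c.column_id AS varchar(10)) + ''', ''COLUMN'';'
FROM sys.columns AS c
JOIN sys.tables  AS t ON t.object_id = c.object_id
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
ORDER BY t.name, c.column_id;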
They can see the data but they can't see the column names? What can that possibly accomplish? What are you protecting by not revealing the table or column names? How is a data scientist supposed to evaluate data without context? Without an FK, all I see is a bunch of numbers in a column named colx. What are you expecting to accomplish? Get a confidentiality agreement. Consider an FK column customerID versus a materialID: patterns have widely different meanings and analyses. I would correlate a quality measure with materialID or shiftID, but not with a customerID.
Oh look, there is a correlation between tableA.colB and tableX.colY. Well yes, that customer is a college team and they use aluminum bats.
On top of that you strip indexes (on tables with 2B+ rows) so the analysis they run will be slow. What does that accomplish?
As for the question as stated: do a backup/restore. Using the system tables, drop all triggers, FKs, indexes, and constraints. Don't forget to drop the triggers and constraints - they may disclose some trade secret. Then rename the columns, and then the tables.

SQL Normalizing array of tables into multiple new tables

I have a database with 51 tables all with the same schema (one table per state). Each table has a couple million rows and about 50 columns.
I've normalized the columns into 6 other tables, and now I want to import all of the data from those 51 tables into the 6 new tables. The column names are all the same, and so I'm hoping I can automate the process of importing all the data.
I'm assuming what I'll need to do is:
Select the names of all the tables that are in the raw schema
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'raw'
Iterate over all the results
Grab all rows from that table, and INSERT ... SELECT the appropriate columns into the appropriate new tables
Delete the rows from the raw table
Is there anything I'm missing? Also, is there any way to have this run on the SQL Server so I don't have to have my SQL Server Management Studio open the whole time?
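Here is a minimal sketch of that loop (the normalized target and its columns are placeholders; the real version would split each row across the 6 tables). Wrapped in a stored procedure, it can be scheduled as a SQL Server Agent job, so SSMS does not have to stay open:
DECLARE @table sysname, @sql nvarchar(max);
DECLARE raw_tables CURSOR FOR
    SELECT TABLE_NAME
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_SCHEMA = 'raw';
OPEN raw_tables;
FETCH NEXT FROM raw_tables INTO @table;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Copy the rows out, then clear the source table.
    SET @sql = N'INSERT INTO dbo.NormalizedTarget (ColA, ColB)
                 SELECT ColA, ColB FROM raw.' + QUOTENAME(@table) + N';
                 DELETE FROM raw.' + QUOTENAME(@table) + N';';
    EXEC sp_executesql @sql;
    FETCH NEXT FROM raw_tables INTO @table;
END
CLOSE raw_tables;
DEALLOCATE raw_tables;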
Yes, obviously, you can automate it with T-SQL. But I recommend using SSIS in this case. As you say, the structure of all the tables is the same, so you can build one ETL process and then just change the table name in the source. Consequently, you will have the following advantages:
Solve the issue with a couple of clicks
Low risk of errors
You will be able to use any number of data transformations

Is it possible to make postgres automatically update the column definition of an audit table?

I have the following question. I am using audit tables for some entities in my project, so for instance if there is a "people" table there will be a "public_people_audit" table (where "public" is the schema the table is in and "audit" is just a suffix that was chosen).
Now the thing is that when someone from the team modifies the "people" table and adds a column to it, they may forget to do the same on the audit table, and the system will fail because it will try to insert the new column's value into the audit table and won't find it.
I know that the team should be careful and apply the modification to both tables, but it would be very helpful if there were a way to automate this, so that if someone runs "ALTER TABLE people ADD COLUMN foo VARCHAR(10)" the same command is executed on the public_people_audit table.
The short answer: no.
The longer answer is that you can approximate it with a quick script. Make a simple text file listing the tables that need auditing. The script reads the text file, looks at the columns in the base tables, and makes sure any missing columns are added to the audit table; the core comparison is sketched below.
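The column comparison fits in one query per table pair. This sketch generates the missing ALTER TABLE statements for the people/public_people_audit pair (it assumes both tables live in the public schema; also, data_type ignores length modifiers, so a production script would use format_type() instead):
-- Emit an ALTER TABLE for every column present on people but
-- missing from public_people_audit; run the generated statements.
SELECT format('ALTER TABLE public.public_people_audit ADD COLUMN %I %s;',
              column_name, data_type)
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'people'
  AND column_name NOT IN (
        SELECT column_name
        FROM information_schema.columns
        WHERE table_schema = 'public'
          AND table_name = 'public_people_audit');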
