How to send data from OLE DB source to Anchor model tables using ETL procedure? - sql-server

I'm currently solving this task: some data should be sent from AdventureWorks2012 to Anchor model tables on the same server in MsSQL.
This is my Anchor Model
At this point I have a pretty simple Integration Services project in Visual Studio and it looks like this.
Control flow:
For example Load_territories is:
The main requirement is to fill all tables of Anchor model tables in MsSQL but I'm constantly facing a problem: the amount of attributes in tables are different and some of them are repeating
At this picture in the second table basically TR_ID,TR_GRP_TR_ID, TR_TID_TR_ID, TR_TNM_TR_ID contain the same values from dwh_key but it's impossible to create a one-to-many relation between attributes. My tutor has recommended me to use Lookup but I cannot figure out how to implement them in this project

This may be considered as cheating, but if you insert data into the latest view rather than the separate 6NF tables all of those ID fields will be populated by underlying trigger logic. I suspect that this would defeat the purpose of using SSIS though, since you would effectively be loading attributes sequentially rather than in parallel.
Another option is to leave surrogate key management to the ETL tool. This would require that you switch the data type for your identities from integers to GUID:s. SSIS can then generate a GUID and you can then use that very same GUID to populate all the attributes. Note that the anchor would have to be loaded first, or you will get a foreign key violation.
The most common solution though, is to leave surrogate key management to the database (and use integers). You would have a step in which you populate the metadata column in the anchor with the desired number of new identities to be created. Using the metadata number you can then select the newly generated identities and merge them into your data flow. It doesn't matter which number gets assigned to which row. After that all attributes can be populated in parallel, including their ID columns.
Of course, if this is intended to be used for more than an initial load, you would also have to add steps to detect if the data you are loading is already known or not.
I can also recommend watching the video tutorial referenced in this blog post: https://clinthuijbers.wordpress.com/2013/06/14/ssis-anchor-modeling-example-tutorial/

Related

How do I write to an access query that uses linked tables?

I have been tasked with create a tracker for company work flow.
I have 10 tables with data in them. There are attributes all the tables have in common. I made a table with those attributes, giving those records a unique ID that could join them to the unique attribute records of the original tables. I am also linking a personnel table to the original tables. All of these tables exist on my SQL Server back end. I Have made a query in Access that displays all the information I was given. I'm going to use the forms in access as a front end to display, edit, and add records.
The issue I am encountering is that I can not write to a query that has externally linked tables. I have spent a lot of time normalizing this data and I know I can get around it by making tables with redundant attributes in SQL and not writing to the query, but rather write to the linked tables instead. Just wondering if there is a way around this.
Thanks
In general, even without linked tables, such queries are NOT updatable.
The general approach when working with multiple tables is to use sub forms for the child tables. That way, each form is only bound to one table. (You are free even to bind such forms directly to the linked table).
Thus, you might say have a customer table and then a table of invoices. So your main form will display the one customer and is bound to that one table.
In the sub form, you can then display all of the invoices.
So to combine multiple tables into a form or screen that allows users to update the data, or add more data, you don’t build some query that joins all the data together, but simply combine several forms into one form. But each of those separate forms will display data from the given related child table.
Here is a typical invoice form thus built in Access:
The top part is the “main” form of an invoice. It is bound to the customer table – one record. Then the multiple lines of detail is the invoice details table. So the form does NOT use queries, but each part of the form and sub forms are bound to the given related tables. You are binding each form directly to the linked table (or tables if you need to show related data like above).
This approach allows you to cobble together a set of forms that edit related tables, but each form is bound directly to the linked table.
So the fact of linked tables or not is moot – such queries are not used to edit data and such queries from link or non-linked tables are NOT updatable.
So your form + sub forms will follow the pattern of related tables that you need to work together as one whole view and means to edit data. You don’t need nor want to use a query to fill these forms.
You most certainly will provide some kind of “search” form, or some means to pull up say one customer invoice, and that invoice along with its sub forms will display the related data, and also allow editing of that data.

Event Sourcing SQL Populate Parent and Child Table

Following up from question
CQRS Read Model Design when Event Sourcing with a Parent-Child-GrandChild… relationship:
We utilize Event sourcing with SQL Server 2016 at Example: furniture company.
(1) We have a Parent and Child table. Say a FurnitureDescriptionTable, (Parent table- description of all furniture Items) and FurnitureOrders(Child - multiple customers orders, refers to FurnitureDescription table). Should the join column between these be Guid or Integer Identity in SQL?
(2) If Guid, who generates the Guid, API or SQL? any reason?
Choosing what kind of type you need for for primary/foreign keys is a known problem in RDBMS world. Simple googling will help. But still:
Guids are usually done on the application side. This option is popular (since you are referring to CQRS) when command handlers can generate complete domain objects, including the identity. Otherwise, you need to have a unique identity generator, which might be non-trivial, but still feasible in some databases, like using Oracle sequences.
Numbers are usually chosen for database-generated ids. Then, new id will only be known when the row is inserted to a table. For event-sourcing scenario this is not an option, since you will only insert on the read side, but objects are created on the write side.

TPT entities derived from TPH base classes in Entity Framework 6?

I have a project that is using EF6 Database first mapped to a SQL database. This is all new so I control the EF model as well as the database schema.
I currently have a table that I'll call Vehicle for simplicity. I use a discriminator column to get subclass Entities Car and Truck. This all works fine.
Now I need to do a 'soft delete' and move any deleted vehicles to a VehicleHistory table. (After trying this w/ EF i will probably use a SQL transaction). This needs to be reviewable so I need this history table mapped as well, but I would like to keep it within the inheritance hierarchy so its easily reused in other classes.
My idea was to create 'vehiclecurrent' and 'vehiclehistory' tables with FK's to Vehicle for shared columns. i would then use TPT in EF to get 'carcurrent','carhistory', ect... derived from my TPH classes(so e.g. carhistory->car->vehicle). This is not working and I get Error 3034: "Entities w/ different keys are mapped to the same row"
So my question is basically how can I pull this off? Will this approach work and how, or is there another way to accomplish this? Thanks!

database design - best practice- one table for web form drop down options or separate table for each drop down options

I'm looking at the best practice approach here. I have a web page that has several drop down options. The drop downs are not related, they are for misc. values (location, building codes, etc). The database right now has a table for each set of options (e.g. table for building codes, table for locations, etc). I'm wondering if I could just combine them all into on table (called listOptions) and then just query that one table.
Location Table
LocationID (int)
LocatValue (nvarchar(25))
LocatDescription (nvarchar(25))
BuildingCode Table
BCID (int)
BCValue (nvarchar(25))
BCDescription (nvarchar(25))
Instead of the above, is there any reason why I can't do this?
ListOptions Table
ID (int)
listValue (nvarchar(25))
listDescription (nvarchar(25))
groupID (int) //where groupid corresponds to Location, Building Code, etc
Now, when I query the table, I can pass to the query the groupID to pull back the other values I need.
Putting in one table is an antipattern. These are differnt lookups and you cannot enforce referential integrity in the datbase (which is the ciorrect place to enforce it as applications are often not the only way data gets changed) unless they are in separate tables. Data integrity is FAR more important than saving a few minutes of development time if you need an additonal lookup.
If you plan to use the values later in some referencing FKeys - better use separate tables.
But why do you need "all in one" table? Which problem it solves?
You could do this.
I believe that is your master data and it would not be having any huge amounts of rows that it might create and performance problems.
Secondly, why would you want to do it once your app is up and running. It should have thought about earlier. The tables might be used in a lot of places and it's might be a lot of coding and most importantly testing.
Can you throw further light into your requirements.
You can keep them in separate tables and have your stored procedure return one set of data with a "datatype" key that signifies which set of values go with what option.
However, I would urge you to consider a much different approach. This suggestion is based on years of building data driven websites. If these drop-down options don't change very often then why not build server-side include files instead of querying the database. We did this with most of our websites. Think about it, each time the page is presented you query the database for the same list of values... that data hardly ever changes.
In cases when that data did have the tendency to change, we simply added a routine to the back end admin that rebuilt the server-side include file whenever an add, change or delete was done to one of the lookup values. This reduced database I/O's and spead up the load time of all our websites.
We had approximately 600 websites on the same server all using the same instance of SQL Server (separate databases) our total server database I/O's were drastically reduced.
Edit:
We simply built SSI that looked like this...
<option value="1'>Blue</option>
<option value="2'>Red</option>
<option value="3'>Green</option>
With single table it would be easy to add new groups in favour of creating new tables, but for best practices concerns you should also have a group table so you can name those groups in the db for future maintenance
The best practice depends on your requirements.
Do the values of location and building vary frequently? Where do the values come from? Are they imported from external data? Do other tables refer the unique table (so that I need a two-field key to preper join the tables)?
For example, I use unique table with hetorogeneus data for constants or configuration values.
But if the data vary often or are imported from external source, I prefer use separate tables.

inserting into a view in SQL server

I have a SQL Server as backend and use ms access as frontend.
I have two tables (persons and managers), manager is derived from persons (a 1:1 relation), thus i created a view managersFull which is basically a:
SELECT *
FROM `managers` `m`
INNER JOIN `persons` `p`
ON `m`.`id` = `p`.`id`
id in persons is autoincrementing and the primary key, id in managers is the primary key and a foreign key, referencing persons.id
now i want to be able to insert a new dataset with a form in ms access, but i can’t get it to work. no error message, no status line, nothing. the new rows aren’t inserted, and i have to press escape to cancel my changes to get back to design view in ms access.
i’m talking about a managers form and i want to be able to enter manager AND person information at the same time in a single form
my question is now: is it possible what i want to do here? if not, is there a “simple” workaround using after insert triggers or some lines of vba code?
thanks in advance
The problem is that your view is across several tables. If you access multiple tables you could update or insert in only one of them.
Please also check the MSDN for more detailed information on restrictions and on proper strategies for view updates
Assuming ODBC, some things to consider:
make sure you have a timestamp field in the person table, and that it is returned in your managers view. You also probably need the real PK of the person table in the manager view (I'm assuming your view takes the FK used for the self-join and aliases it as the ID field -- I wouldn't do that myself, as it is confusing. Instead, I'd use the real foreign key name in the managers view, and let the PK stand on its own with its real name).
try the Jet/ACE-specific DISTINCTROW predicate in your recordsource. With Jet/ACE back ends, this often makes it possible to insert into both tables when it's otherwise impossible. I don't know for certain if Jet will be smart enough to tell SQL Server to do the right thing, though.
if neither of those things works, change your form to use a recordsource based on your person table, and use a combo box based on the managers view as the control with which you edit the record to relate the person to a manager.
Ilya Kochetov pointed out that you can only update one table, but the work-around would be to apply the updates to the fields on one table and then the other. This solution assumes that the only access you have to these two tables is through this view and that you are not allowed to create a stored procedure to take care of this.
To model and maintain two related tables in access you don’t use a query or view that is a join of both tables. What you do is use a main form, and drop in a sub-form that is based on the child table. If the link master and child setting in the sub-form is set correctly, then you not need to write any code and access will insert the person’s id in the link field.
So, don’t use a joined table here. Simply use a form + sub-form setup and you be able to edit and maintain the data and the data in the related child table.
This means you base the form on the table, and not a view. And you base the sub-form on the child table. So, don't use a view here.

Resources