DataVault modelling for Domain Reference Table - data-modeling

Folks,
Quick Version:
How should I model HUBs, SATs and LINKs when I have multiple domain lookup references in my HUB_SAT?
If you were to model these generically from the source schema, how would you differentiate between FKs that should be LINKs and FKs that should be References?
Long Version:
I am building out a generic solution for generating DV models from an existing 3NF MSSQL schema. In my source database I have one huge Domain reference table which holds the majority of the business lookup keys:
Key INT (Unique)
TypeID INT
Description VARCHAR
Posting Code
... some other fields that are not relevant to the discussion
As I see it there are five basic choices for linking to this table:
1. Create it as a HUB and then produce LINK tables for each business HUB that refers to it
2. Create it as a single ReferenceLookup table and include the R_ReferenceIDs in the SAT table
3. Create a separate ReferenceLookup table for each TypeID and link from the SAT using the R_ReferenceID
4. Create separate HUBs for each TypeID and generate LINK tables
5. Create a single LINK table with a LINK_SAT table to hold details of which reference value is mapped by the LINK
Of these, #3 feels like the best design (but also the hardest to model correctly, especially as in my case the lookup table has an FK to the Type table).
From the Wikipedia for DataVault,
Reference tables are referenced from Satellites, but never bound with physical foreign keys.
My generic code is based on the design pattern as explained in the BIML DataVault walkthrough
I am looking at all tables in the source schema to determine whether each one is a HUB (has a PK and multiple FKs, plus fields that are not FKs), a SAT (has a PK and only one FK) or a LINK (has a PK, more than one FK, and all fields are in the PK or an FK).
I then build:
HUBs with a HUB_ID and the source PK
SATs with the non FK fields of the HUB source table
SATs for the source SAT tables
LINKs from the source LINK tables
LINKs from the HUB FK relationships
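For anyone following along, the HUB/SAT/LINK rule of thumb above can be sketched in a few lines of Python. This is only my reading of the rule, not the BIML walkthrough's code: the `SourceTable` class and the sample table and column names are hypothetical, and tables with no PK or no FKs come back as UNKNOWN because the rule as stated does not cover them.

```python
from dataclasses import dataclass


@dataclass
class SourceTable:
    """Hypothetical container for the metadata the rule needs."""
    name: str
    pk: frozenset       # primary-key column names
    fks: dict           # FK column name -> referenced table name
    columns: frozenset  # every column name in the table


def classify(table: SourceTable) -> str:
    """Return HUB, SAT, LINK or UNKNOWN from the PK/FK counts."""
    if not table.pk:
        return "UNKNOWN"  # the rule only covers tables that have a PK
    # Fields that are neither part of the PK nor an FK column.
    plain = table.columns - table.pk - set(table.fks)
    if len(table.fks) > 1:
        return "HUB" if plain else "LINK"
    if len(table.fks) == 1:
        return "SAT"
    return "UNKNOWN"  # zero FKs: not covered by the rule in the text


# Hypothetical sample tables.
asset = SourceTable(
    "Asset",
    frozenset({"Asset_ID"}),
    {"AssetTypeID": "Lookup", "AssetPurposeID": "Lookup"},
    frozenset({"Asset_ID", "AssetTypeID", "AssetPurposeID", "BuildDate"}),
)
asset_note = SourceTable(
    "AssetNote",
    frozenset({"Note_ID"}),
    {"Asset_ID": "Asset"},
    frozenset({"Note_ID", "Asset_ID", "NoteText"}),
)
asset_site = SourceTable(
    "AssetSite",
    frozenset({"Asset_ID", "Site_ID"}),
    {"Asset_ID": "Asset", "Site_ID": "Site"},
    frozenset({"Asset_ID", "Site_ID"}),
)

kinds = {t.name: classify(t) for t in (asset, asset_note, asset_site)}
print(kinds)
```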
This all works up to a point (i.e. I have tables for all of the above). However, there are some pretty wide tables where a significant number of the fields are simply R_RefID fields that all look up on the same HUB, and they are all bound with FKs on the entity table referencing the reference table.
E.G. source Asset table has reference fields for
- Asset Type
- Asset Purpose
- Asset Manager
- Asset Funder
- ...
so in the preliminary model I have:
ASSET_HUB (HubID, Asset_ID)
ASSET_SAT (SAT_ID, BuildDate, DisposalDate, ....)
Lookup_HUB (Hub_ID, LookupID)
ASSET_Lookup_1_LINK (Link_ID, ASSET_HUB_ID, Lookup_HUB_ID)
ASSET_Lookup_2_LINK (Link_ID, ASSET_HUB_ID, Lookup_HUB_ID)
ASSET_Lookup_3_LINK (Link_ID, ASSET_HUB_ID, Lookup_HUB_ID)
ASSET_Lookup_4_LINK (Link_ID, ASSET_HUB_ID, Lookup_HUB_ID)
but there is no way of identifying what each of the LINK tables represents in the domain model.
How would you go about interrogating the schema to determine whether a table is a genuine HUB candidate or whether it should be a REF table instead, and how would you determine whether an FK should be treated as a LINK or a SAT.R_RefID? I am after strategy rather than code (but I'm not going to turn down code if it is on offer :) ). My source DB is SQL2008R2 and my development environment is SQL2016_Dev.
In Response to tobi6:
In the source system the business entity has a number of attribute fields which are just XXX_ID types that look up their descriptors from the domain reference table. If you model this domain reference table as a HUB then you either have to have separate LINK tables for each lookup (LINK tables are automatically generated because there is an FK on the business entity), or multiple active LINK records with a LINK_SAT to identify which attribute you are tracking (actually this creates a 5th design pattern option). If I tag the domain reference table as a REFerence then the XXX_IDs stay in the HUB_SAT, which feels like a better solution but is harder to model generically. I.e. how do I determine whether the business entity FK should create a LINK, a LINK and LINK_SAT, or a SAT.R_RefID?
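One possible heuristic, offered purely as a sketch and not as an established Data Vault rule: if a single source table carries several FKs to the same target, treat that target as a reference table and keep those columns in the SAT; otherwise generate a LINK. The FK list below is hypothetical sample data; in SQL Server the same triples could be read from `sys.foreign_keys` and `sys.foreign_key_columns`. The threshold `min_repeats` is an assumption you would tune.

```python
from collections import defaultdict

# (referencing table, FK column, referenced table): hypothetical metadata
foreign_keys = [
    ("Asset", "AssetTypeID", "Lookup"),
    ("Asset", "AssetPurposeID", "Lookup"),
    ("Asset", "AssetManagerID", "Lookup"),
    ("Asset", "SiteID", "Site"),
    ("Lease", "LeaseTypeID", "Lookup"),
    ("Lease", "AssetID", "Asset"),
]


def reference_candidates(fks, min_repeats=2):
    """Targets that some single table points at min_repeats times or more."""
    columns_per_pair = defaultdict(list)
    for src, col, dst in fks:
        columns_per_pair[(src, dst)].append(col)
    return {
        dst
        for (src, dst), cols in columns_per_pair.items()
        if len(cols) >= min_repeats
    }


def plan(fks):
    """Decide, per FK, whether it stays in the SAT or becomes a LINK."""
    refs = reference_candidates(fks)
    decisions = []
    for src, col, dst in fks:
        if dst in refs:
            # Keep the ID in the satellite; the column name carries the role.
            decisions.append((src, col, "SAT.R_RefID"))
        else:
            decisions.append((src, col, f"LINK {src}_{dst}_LINK"))
    return decisions


refs = reference_candidates(foreign_keys)
decisions = plan(foreign_keys)
for row in decisions:
    print(row)
```

Because the FK column name is kept alongside each decision, the role of each lookup (type, purpose, manager) is not lost the way it is with numbered LINK tables.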

Related

Database of TYPO3 - Is there a relation between tables?

I don't have much knowledge about the TYPO3 database, so I have a question.
Does the TYPO3 database have any relations between tables?
I can see primary keys and indexes in my database, but there are no foreign keys, so I can't find any relations when I inspect it with phpMyAdmin.
Does it mean that all tables are independent and the tables are searched just using indexes? Is it a so-called b-tree?
If I create a "database model" diagram using these tables: pages, tt_content, be_users and fe_groups, how can I show a relation in the diagram? Is it just a line with no relation (cardinality)?
Can I draw a diagram showing these relations using only these four tables?
The tables pages and be_users have an m:n relation, while fe_groups and pages have a 1:n relation and fe_groups and be_users have a 1:n relation. Is that right?
And if I mark primary and foreign keys, where should I put them in each table? Or is that perhaps not possible in this case?
Thank you for your help.
There are a lot of relations, but most are not visible in the database.
To understand the relations you need to understand the TCA of TYPO3, which defines the kind of relation. The relation is handled in the datahandler and shown in the formhandler, which enables the editors to manage relations.
Some relations exist for (nearly) every record in TYPO3.
Each record has some fields with basic relations:
pid = parent/page ID = the page (table: pages) where a record is stored
Even pages are stored in pages, so a tree is built (similar to folders on your disk).
uid = unique ID = the unique identification of a record, which is the reference for relations to this record
If you have activated versioning or language support, you get further fields, as they identify variants of that record and contain a reference to the original (relations to the same table).
These fields are mostly single-valued, so storing just an int is enough.
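To make the pid/uid idea concrete, here is a small sketch using Python's built-in sqlite3 module. The tables are heavily simplified stand-ins (the real TYPO3 tables have many more columns), and the sample rows, including the use of pid 0 for the root page, are my own illustration. Note there is no declared foreign key anywhere; the relation exists only because pid holds the uid of a page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages      (uid INTEGER PRIMARY KEY, pid INTEGER NOT NULL, title  TEXT);
CREATE TABLE tt_content (uid INTEGER PRIMARY KEY, pid INTEGER NOT NULL, header TEXT);
-- Sample data: Home > Products > Widgets
INSERT INTO pages VALUES (1, 0, 'Home'), (2, 1, 'Products'), (3, 2, 'Widgets');
INSERT INTO tt_content VALUES (10, 3, 'Widget intro'), (11, 1, 'Welcome');
""")

# Walk from page 3 up to the root by following pid -> uid.
rootline = [
    row[0]
    for row in conn.execute("""
        WITH RECURSIVE up(uid, pid, title, depth) AS (
            SELECT uid, pid, title, 0 FROM pages WHERE uid = ?
            UNION ALL
            SELECT p.uid, p.pid, p.title, up.depth + 1
              FROM pages p JOIN up ON p.uid = up.pid
        )
        SELECT title FROM up ORDER BY depth DESC
    """, (3,))
]

# Content elements stored on page 3: tt_content.pid points at pages.uid.
content_on_page = conn.execute("""
    SELECT c.header, p.title
      FROM tt_content c JOIN pages p ON p.uid = c.pid
     WHERE p.uid = ?
""", (3,)).fetchall()

print(rootline)
print(content_on_page)
```

In a diagram this would be drawn as a 1:n line from pages to tt_content and a 1:n line from pages back to itself.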
But there are also m:n relations. In older versions those were stored in a string of comma-separated ints, but today they are mostly stored in an mm-table, which builds the connection between two different records.
In those records an int field remains for identification, holding the count of relations.
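A rough sketch of the mm-table pattern, again with sqlite3. The table names (`tx_demo_item`, `tx_demo_tag`, `tx_demo_item_tag_mm`) are hypothetical, and the `uid_local` / `uid_foreign` column names follow the naming I have seen in TYPO3 mm-tables; treat both as assumptions rather than a definitive schema. The `tags` column on the item is the int field that holds the count of relations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tx_demo_item (uid INTEGER PRIMARY KEY, title TEXT, tags INTEGER DEFAULT 0);
CREATE TABLE tx_demo_tag  (uid INTEGER PRIMARY KEY, title TEXT);
-- The mm-table only connects the two sides.
CREATE TABLE tx_demo_item_tag_mm (uid_local INTEGER, uid_foreign INTEGER);

INSERT INTO tx_demo_item (uid, title) VALUES (1, 'Item A'), (2, 'Item B');
INSERT INTO tx_demo_tag VALUES (1, 'red'), (2, 'blue');
INSERT INTO tx_demo_item_tag_mm VALUES (1, 1), (1, 2), (2, 2);

-- Keep the count of relations in the item record itself.
UPDATE tx_demo_item
   SET tags = (SELECT COUNT(*) FROM tx_demo_item_tag_mm mm
                WHERE mm.uid_local = tx_demo_item.uid);
""")

counts = conn.execute(
    "SELECT uid, tags FROM tx_demo_item ORDER BY uid"
).fetchall()

pairs = conn.execute("""
    SELECT i.title, t.title
      FROM tx_demo_item i
      JOIN tx_demo_item_tag_mm mm ON mm.uid_local = i.uid
      JOIN tx_demo_tag t ON t.uid = mm.uid_foreign
     ORDER BY i.uid, t.uid
""").fetchall()

print(counts)
print(pairs)
```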
In this way the diagram of the core tables (of which there are more than 40) and their relations would be very large.
Regarding the tables you mentioned:
pages builds the elementary page tree
tt_content contains the basic content visible on the website
(be_? = BackEnd-?, fe_? = FrontEnd-?)
be_users holds the data about editors working with TYPO3 (there is rights management that controls which user can access which kind of information in TYPO3). There is also logging, and each record contains a (relation) field recording which be_user created it.
The access rights can be stored in the be_users record, but mostly they are stored in be_groups, which are related to be_users, so the be_user inherits all the rights from the be_groups.
In the same way there exists a pair for the frontend: fe_users and fe_groups. The fe_groups can be assigned to records, and so visibility in the frontend can be controlled (only logged-in members of that group can see the information).
In this way there are relations from fe_groups to pages and tt_content, which adds more lines to your diagram of those few tables.

Composite Projects - handling additional columns

From this post....
http://blogs.msdn.com/b/ssdt/archive/2012/06/26/composite-projects-and-schema-compare.aspx
...it seems that (Same) Database References are a way to share common parts of a database.
If a specific database needs additional columns on a table from a (Same) Database Reference is there any way of handling that?
I was hoping you might be able to override the definition of a table from a Database Reference simply by re-declaring the table in the referencing Database Project.
e.g. if you had an Employee table in a Common Database project, a definition for the Employee table in a Client Database referencing Common Database would override the definition in the Common project. Instead, when you go to deploy the project you get the error...
SQL71508: The model already has an element that has the same name dbo.Employee.
EDIT:
Anticipating the feedback below, the resolution I've made is to not use database references for the existing client databases. Instead I've created a structure as follows....
+OurCompanyDatabases
+Common
Common.sqlproj
+dbo...
+ClientA
+dbo....
+ClientB
+dbo....
ClientA.sqlproj
ClientB.sqlproj
So I've got multiple sqlproj files within the same folder and I include and exclude files from the projects as required.
So, for example, where ClientA's Sales table has a ClientARewardsID column added, I exclude the Sales table within the /OurCompanyDatabases/Common/dbo folder and add a new Sales table within the /OurCompanyDatabases/ClientA/dbo folder.
This way Client A and Client B can retain the full use of SSDT update and deployment, whilst minimizing the duplication of sql scripts. I'm hoping this will reduce the cost of maintenance on the sites.
Going forward I will use database references, and additional columns will be added in new tables with a 1:1 foreign key relationship with the Common table.
No, it doesn't support an inheritance-type model and you can only really share complete objects, so in your case you would have it structured like:
proj a - TableA
references - proj shared
proj b - TableA
references - proj shared
proj shared - TableXYZ
Then you can have two different definitions of TableA but still share all of the objects that are the same.
There is another option: you could leave the table definition out of SSDT (or include one or the other) and then handle any changes and the deployment yourself in post-deploy scripts, using my filter (http://agilesqlclub.codeplex.com/) to stop SSDT deploying any changes to your table. But this sort of invalidates one of the main reasons for using SSDT (merge-type deployments for free).
ed
It's much safer and better practise to add a new table for the extra columns, and make its primary key a foreign key to the table it extends.
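As a rough illustration of that extension-table idea, here is a sketch using Python's sqlite3 module rather than an SSDT project. The `Sales` table and `ClientARewardsID` column come from the question; `Sales_ClientA`, `Amount` and the sample rows are hypothetical. The extension table's primary key is also a foreign key to the shared table, which keeps the relationship 1:1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
-- Shared definition (would live in the Common project).
CREATE TABLE Sales (
    SalesID INTEGER PRIMARY KEY,
    Amount  REAL NOT NULL
);
-- Client-specific columns (would live in the ClientA project).
CREATE TABLE Sales_ClientA (
    SalesID          INTEGER PRIMARY KEY REFERENCES Sales(SalesID),
    ClientARewardsID INTEGER NOT NULL
);
INSERT INTO Sales VALUES (1, 100.0), (2, 250.0);
INSERT INTO Sales_ClientA VALUES (1, 777);
""")

# Client A reads the shared columns plus its own extras.
rows = conn.execute("""
    SELECT s.SalesID, s.Amount, x.ClientARewardsID
      FROM Sales s
      LEFT JOIN Sales_ClientA x ON x.SalesID = s.SalesID
     ORDER BY s.SalesID
""").fetchall()

# An extension row cannot exist without its base row.
try:
    conn.execute("INSERT INTO Sales_ClientA VALUES (99, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows, orphan_rejected)
```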

Linking an address table to multiple other tables

I have been asked to add a new address book table to our database (SQL Server 2012).
To simplify the related part of the database, there are three tables each linked to each other in a one to many fashion: Company (has many) Products (has many) Projects and the idea is that one or many addresses will be able to exist at any one of these levels. The thinking is that in the front-end system, a user will be able to view and select specific addresses for the project they specify and more generic addresses relating to its parent product and company.
The issue now is how best to model this in the database.
I have thought of two possible ideas so far so wonder if anyone has had a similar type of relationship to model themselves and how they implemented it?
Idea one:
The new address table will additionally contain three fields: companyID, productID and projectID. These fields will be related to the relevant tables and be nullable to represent company and product level addresses. e.g. companyID 2, productID 1, projectID NULL is a product level address.
My issue with this is that I am storing the relationship information in the table, so if a project is ever changed to be related to a different product, the data in this table will be incorrect. I could potentially NULL all but the level I am interested in, but this will make parent addresses a little harder to get.
Idea two:
On the address table have a typeID and a genericID. genericID could contain the IDs from the Company, Product and Project tables with the typeID determining which table it came from. I am a little stuck how to set up the necessary constraints to do this though and wonder if this is going to get tricky to deal with in the future
Many thanks,
I would suggest using Idea one and avoiding Idea two.
The second idea is known as the Polymorphic Association anti-pattern.
Objective: Reference Multiple Parents
Resulting side effect: Using a dual-purpose foreign key violates first normal form (atomicity) and loses referential integrity
Solution: Simplify the Relationship
The simplification of the relationship can be achieved in two ways:
Having multiple nullable foreign keys (Idea one): this is simple and applicable if the number of tables (product, project, ...) using the relation is limited. (Think about what happens when more are added.)
Another, more generic solution is inheritance: define a new entity as the base table for (product, project, ...) to make them addressable. Naming it organization_unit may be more rational. The primary key of this organization_unit table will be the primary key of (product, project, ...). Other collections such as the Address, Image, Contract ... tables will have a relation to this base table.
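The inheritance option can be sketched roughly as follows with Python's sqlite3 module. The column names, the `kind` discriminator and the sample rows are my own assumptions; only the idea of a shared organization_unit base table whose key is reused by company, product and project comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Base table: every addressable thing gets one row here.
CREATE TABLE organization_unit (id INTEGER PRIMARY KEY, kind TEXT NOT NULL);
-- Sub-tables reuse the base table's key as their own primary key.
CREATE TABLE company (id INTEGER PRIMARY KEY REFERENCES organization_unit(id), name TEXT);
CREATE TABLE product (id INTEGER PRIMARY KEY REFERENCES organization_unit(id),
                      company_id INTEGER NOT NULL REFERENCES company(id), name TEXT);
CREATE TABLE project (id INTEGER PRIMARY KEY REFERENCES organization_unit(id),
                      product_id INTEGER NOT NULL REFERENCES product(id), name TEXT);
-- Address needs only ONE real foreign key.
CREATE TABLE address (id INTEGER PRIMARY KEY,
                      organization_unit_id INTEGER NOT NULL REFERENCES organization_unit(id),
                      line TEXT);

INSERT INTO organization_unit VALUES (1, 'company'), (2, 'product'),
                                     (3, 'project'), (4, 'company');
INSERT INTO company VALUES (1, 'Acme'), (4, 'Other Co');
INSERT INTO product VALUES (2, 1, 'Widget');
INSERT INTO project VALUES (3, 2, 'Launch');
INSERT INTO address VALUES (1, 1, 'Acme HQ'), (2, 2, 'Widget depot'),
                           (3, 3, 'Launch site'), (4, 4, 'Other HQ');
""")

# Addresses for project 3 plus its parent product and company.
lines = [
    row[0]
    for row in conn.execute("""
        SELECT a.line
          FROM project pj
          JOIN product pd ON pd.id = pj.product_id
          JOIN address a
            ON a.organization_unit_id IN (pj.id, pd.id, pd.company_id)
         WHERE pj.id = ?
         ORDER BY a.id
    """, (3,))
]
print(lines)
```

The parent levels are resolved through the project's current product at query time, so nothing in address goes stale if a project is moved to a different product.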
It sounds like you could use junction tables: http://en.wikipedia.org/wiki/Junction_table.
They will give you the flexibility you need to maintain your foreign key constraints, as well as share addresses between levels or entities if that is desired.
You would have one each for Company_Address, Product_Address, and Project_Address.
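A minimal sketch of the junction-table layout, using Python's sqlite3 module. The three junction table names come from the answer; the other column names and the sample rows are hypothetical. The last few lines move the project to a different product to show that the parent-level addresses follow the relationship instead of being stored in the address row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Company (CompanyID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY,
                      CompanyID INTEGER NOT NULL REFERENCES Company(CompanyID), Name TEXT);
CREATE TABLE Project (ProjectID INTEGER PRIMARY KEY,
                      ProductID INTEGER NOT NULL REFERENCES Product(ProductID), Name TEXT);
CREATE TABLE Address (AddressID INTEGER PRIMARY KEY, Line TEXT);

CREATE TABLE Company_Address (CompanyID INTEGER REFERENCES Company(CompanyID),
                              AddressID INTEGER REFERENCES Address(AddressID),
                              PRIMARY KEY (CompanyID, AddressID));
CREATE TABLE Product_Address (ProductID INTEGER REFERENCES Product(ProductID),
                              AddressID INTEGER REFERENCES Address(AddressID),
                              PRIMARY KEY (ProductID, AddressID));
CREATE TABLE Project_Address (ProjectID INTEGER REFERENCES Project(ProjectID),
                              AddressID INTEGER REFERENCES Address(AddressID),
                              PRIMARY KEY (ProjectID, AddressID));

INSERT INTO Company VALUES (1, 'Acme');
INSERT INTO Product VALUES (1, 1, 'Widget'), (2, 1, 'Gadget');
INSERT INTO Project VALUES (1, 1, 'Launch');
INSERT INTO Address VALUES (1, 'Head office'), (2, 'Widget depot'),
                           (3, 'Gadget depot'), (4, 'Launch site');
INSERT INTO Company_Address VALUES (1, 1);
INSERT INTO Product_Address VALUES (1, 2), (2, 3);
INSERT INTO Project_Address VALUES (1, 4);
""")

QUERY = """
SELECT 'project' AS level, a.Line
  FROM Project_Address x JOIN Address a ON a.AddressID = x.AddressID
 WHERE x.ProjectID = :p
UNION ALL
SELECT 'product', a.Line
  FROM Project pj
  JOIN Product_Address x ON x.ProductID = pj.ProductID
  JOIN Address a ON a.AddressID = x.AddressID
 WHERE pj.ProjectID = :p
UNION ALL
SELECT 'company', a.Line
  FROM Project pj
  JOIN Product pd ON pd.ProductID = pj.ProductID
  JOIN Company_Address x ON x.CompanyID = pd.CompanyID
  JOIN Address a ON a.AddressID = x.AddressID
 WHERE pj.ProjectID = :p
"""


def addresses_for_project(project_id):
    """Project-level addresses plus those of the parent product and company."""
    return set(conn.execute(QUERY, {"p": project_id}).fetchall())


before = addresses_for_project(1)
conn.execute("UPDATE Project SET ProductID = 2 WHERE ProjectID = 1")
after = addresses_for_project(1)
print(before)
print(after)
```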

Design for amplifier make/model settings database?

As a personal project, I essentially want to create a web application that allows users to submit amplifier settings for specific tones, which will render images and create an archive of guitar tones for specific amps.
I know that I first should design a database to support this web application. After reading about relational databases and normalization, I have started to draft a database, but I've confused myself in the process.
So far, I've created the following tables:
tbl_Makes (list of amplifier brands):
tbl_Models (list of amplifier models, linked to their brand by the MakeID field):
But I am at a bit of a loss on how to design the remaining table(s). I assume I will need a tbl_Settings table which contains both MakeID, and ModelID as foreign keys, but also some sort of column(s) to hold the amplifier settings. The issue I'm currently facing is that most amplifiers have different settings, so I'm not sure how I'd handle that. Would I need an additional table for each amplifier model to hold its specific settings?
Any suggestions? Is my current database design ok, or does it need to be modified?
You may be breaking it down too far. A table with Make and Model may be good enough. However with your current design, you would make the MakeID a Foreign Key to the Make table's Primary Key (ID).
Then you'd have a "settings" table which has a ModelID, attribute, and value, since each amp may have different attributes. You may want to have an attribute table and use an attributeID if you want to control the attribute types (with a PK and FK relationship).
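Roughly, that ModelID / attribute / value layout could look like the sketch below, written with Python's sqlite3 module. All table names, column names other than MakeID and ModelID, and the sample makes, models and knob values are hypothetical placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Makes  (MakeID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Models (ModelID INTEGER PRIMARY KEY,
                     MakeID INTEGER NOT NULL REFERENCES Makes(MakeID),
                     Name TEXT NOT NULL);
-- Controlled list of knob names.
CREATE TABLE Attributes (AttributeID INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
-- One row per (model, knob); models can have different knobs.
CREATE TABLE Settings (ModelID INTEGER NOT NULL REFERENCES Models(ModelID),
                       AttributeID INTEGER NOT NULL REFERENCES Attributes(AttributeID),
                       Value TEXT NOT NULL,
                       PRIMARY KEY (ModelID, AttributeID));

INSERT INTO Makes VALUES (1, 'Make A'), (2, 'Make B');
INSERT INTO Models VALUES (1, 1, 'Model X'), (2, 2, 'Model Y');
INSERT INTO Attributes VALUES (1, 'Gain'), (2, 'Bass'), (3, 'Presence');
INSERT INTO Settings VALUES (1, 1, '7'), (1, 2, '5'), (2, 1, '4'), (2, 3, '6');
""")


def settings_for(model_id):
    """Return {attribute name: value} for one amplifier model."""
    rows = conn.execute("""
        SELECT a.Name, s.Value
          FROM Settings s JOIN Attributes a ON a.AttributeID = s.AttributeID
         WHERE s.ModelID = ?
    """, (model_id,))
    return dict(rows.fetchall())


model_x = settings_for(1)
model_y = settings_for(2)
print(model_x, model_y)
```

Adding a new amp with an unusual knob then means inserting rows, not creating a new table per model.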
Oh, and for the love of God, please don't prefix tables with tbl_

Suggestion on DB Design to log user activity in a system

I would like to ask for advice on a good database design for logging user activity.
Currently I am implementing the following approach for a simple website where a user can post/edit/delete an article on the website.
Table Logbook
- log_id
- log_change[Enum: new/edit/remove]
- log_date
- member_id
- post_id
Table Post
- post_id
- post_title
- etc....
Table Member
- member_id
- member_username
- member_pwd
- etc..
Using this table, every time a user makes a new post (or edits/removes an article), it will be logged in the Logbook (along with the time when it happens).
However, what if I am dealing with a larger system where a user can not only post an article but also do other things, such as log in/log out (of the system) or make a purchase (transaction)?
Should I go for a different table for each module? For example, if the system has modules like Posting article, E-commerce, etc., I would have log tables for:
Article Log
E-Commerce Log
Where each table will log activity in each corresponding module.
You could use an entity sub-typing design approach, where common log attributes, like who and when are tracked in a single table for all types of changes. For changes that have additional attributes you can have additional tables, one for each type.
Each of the sub-type log tables references the common table using a foreign key. Typically the foreign key from the sub-type table to the common table is also the primary key of the sub-type table, i.e. the relationship is 1:1.
In such a design, the common table often includes a column (partitioning attribute) which indicates which sub-type is applicable to each record in the common table.
This approach reduces the amount of code you need to build and maintain your logging system while allowing you to keep your log tables normalized.
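A small sketch of that sub-typing layout with Python's sqlite3 module. The Logbook, Member and log_* names come from the question; `log_type`, `ArticleLog`, `PurchaseLog`, `order_total` and the sample rows are hypothetical. A login needs no extra attributes, so it only gets a row in the common table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Member (member_id INTEGER PRIMARY KEY, member_username TEXT);

-- Common table: who, when, and which sub-type applies.
CREATE TABLE Logbook (
    log_id    INTEGER PRIMARY KEY,
    log_type  TEXT NOT NULL,   -- partitioning attribute
    log_date  TEXT NOT NULL,
    member_id INTEGER NOT NULL REFERENCES Member(member_id)
);
-- Sub-type tables: PK is also the FK to the common table (1:1).
CREATE TABLE ArticleLog (
    log_id     INTEGER PRIMARY KEY REFERENCES Logbook(log_id),
    log_change TEXT NOT NULL CHECK (log_change IN ('new', 'edit', 'remove')),
    post_id    INTEGER NOT NULL
);
CREATE TABLE PurchaseLog (
    log_id      INTEGER PRIMARY KEY REFERENCES Logbook(log_id),
    order_total REAL NOT NULL
);

INSERT INTO Member VALUES (1, 'alice');
INSERT INTO Logbook VALUES (1, 'login',    '2024-01-01', 1),
                           (2, 'article',  '2024-01-02', 1),
                           (3, 'purchase', '2024-01-03', 1);
INSERT INTO ArticleLog  VALUES (2, 'new', 42);
INSERT INTO PurchaseLog VALUES (3, 19.99);
""")

# One activity feed across all modules, with sub-type details where present.
feed = conn.execute("""
    SELECT l.log_id, l.log_type, a.log_change, a.post_id, p.order_total
      FROM Logbook l
      LEFT JOIN ArticleLog  a ON a.log_id = l.log_id
      LEFT JOIN PurchaseLog p ON p.log_id = l.log_id
     ORDER BY l.log_id
""").fetchall()
print(feed)
```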
