Composite Projects - handling additional columns - sql-server

From this post....
http://blogs.msdn.com/b/ssdt/archive/2012/06/26/composite-projects-and-schema-compare.aspx
...it seems that (Same) Database References are a way to share common parts of a database.
If a specific database needs additional columns on a table from a (Same) Database Reference is there any way of handling that?
I was hoping you might be able to override the definition of a table from a Database Reference simply by re-declaring the table in the referencing Database Project.
e.g. if you had an Employee table in a Common database project, a definition for the Employee table in a Client database project referencing the Common database would override the definition in the Common project. Instead, when you go to deploy the project, you get the error...
SQL71508: The model already has an element that has the same name dbo.Employee.
EDIT:
Anticipating the feedback below, the resolution I've made is to not use database references for the existing client databases. Instead I've created a structure as follows....
+OurCompanyDatabases
    +Common
        Common.sqlproj
        +dbo...
    +ClientA
        +dbo....
    +ClientB
        +dbo....
    ClientA.sqlproj
    ClientB.sqlproj
So I've got multiple sqlproj files within the same folder and I include and exclude files from the projects as required.
So, for example, when ClientA's Sales table has a ClientARewardsID column added, I exclude the Sales table within the /OurCompanyDatabases/Common/dbo folder and add a new Sales table within the /OurCompanyDatabases/ClientA/dbo folder.
This way Client A and Client B retain the full use of SSDT update and deployment, whilst minimizing the duplication of SQL scripts. I'm hoping this will reduce the cost of maintenance on the sites.
Going forward I will use database references, and additional columns will be added in new tables with a 1:1 foreign key relationship to the Common table.

No, it doesn't support an inheritance-type model and you can only really share complete objects, so in your case you would have it structured like:
proj a - TableA
    references proj shared
proj b - TableA
    references proj shared
proj shared - TableXYZ
Then you can have two different definitions of TableA but still share all of the objects that are the same.
There is another option: you could leave the table definition out of SSDT (or include one version or the other), then handle any changes and the deployment yourself in post-deploy scripts, and use my filter (http://agilesqlclub.codeplex.com/) to stop SSDT deploying any changes to your table. But this sort of invalidates one of the main reasons for using SSDT (merge-type deployments for free).
Ed

It's much safer and better practice to add a new table for the extra columns, and make its primary key a foreign key to the table it extends.
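For instance, a minimal sketch of that pattern for the Sales example above (the SalesID key column name is an assumption):

-- ClientA-specific columns live in an extension table whose primary key
-- is also a foreign key to the shared table it extends.
CREATE TABLE dbo.SalesClientA
(
    SalesID          int NOT NULL
        CONSTRAINT PK_SalesClientA PRIMARY KEY
        CONSTRAINT FK_SalesClientA_Sales REFERENCES dbo.Sales (SalesID),
    ClientARewardsID int NOT NULL
);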

Related

Rename table or column in SQL server without breaking existing apps

I have an existing database in MS SQL Server and want to rename some tables and columns because the names currently used aren't accurate to what they represent.
I have multiple web and desktop applications that access the database, using Entity Framework (code first). There are too many to update in one go, and I cannot afford for all the apps to stop working.
I was thinking it would be nice if SQL Server allowed a 'permanent' alias for tables and columns, but I don't think this feature exists.
Or I was wondering if there was a way in EF to have two names for the same property?
For the tables, you could rename them and then create a synonym with the old name pointing to the new name.
For the columns, changing their name will break your application. You could create computed columns with the old name that simply display the value of the renamed column, though (but this seems a little silly).
Note, however, that a computed column cannot reference another computed column, so you would have to duplicate the column in its entirety. That could lead to problems down the line if you don't update the definition of both columns.
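To make both suggestions concrete, here is a minimal sketch, assuming a table dbo.MyTable with a column MyColumn being renamed to TheTable and TheColumn:

EXEC sp_rename 'dbo.MyTable', 'TheTable';
EXEC sp_rename 'dbo.TheTable.MyColumn', 'TheColumn', 'COLUMN';

-- A synonym keeps the old table name resolving to the new one:
CREATE SYNONYM dbo.MyTable FOR dbo.TheTable;

-- A computed column re-exposes the renamed column under its old name
-- (read-only, and admittedly a little silly):
ALTER TABLE dbo.TheTable ADD MyColumn AS (TheColumn);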
A view containing a simple select statement acts exactly like a table. You really need to fix this properly across the database and applications. However, if you want to go the view route, I suggest you do this:
Say you have a table called MyTable, with a column called MyColumn, that you want to rename to TheTable and TheColumn respectively.
Create a schema, say, new
Move the original table into it with ALTER SCHEMA new TRANSFER dbo.MyTable
Rename the table and column.
Now you have a table called new.TheTable with a column called TheColumn. Everything is broken.
Lastly, create a view that looks just like the old table:
CREATE VIEW dbo.MyTable
AS
SELECT Column1, Column2, Column3, TheColumn As MyColumn
FROM new.TheTable;
Now everything works again.
All your fixed 'new' tables are in the new schema
However, now everything is extra complicated.
This is basically an illustration that you should just fix it properly across the whole app, one rename at a time, with careful change management. Definitely don't complicate it with triggers.
Since you are using code first with multiple web and desktop applications, you are likely managing database changes from one place through migrations and ignoring changes made elsewhere.
You can create an empty migration and add code that will change the table name and column names to what you want. The migration should then create a view that will select from that table with the original table and column names. When you apply this migration, everything should still be working as normal from all applications. There are no model changes since you didn’t touch the model classes. Inserts, updates, and deletes will still happen through the view. There is no need for potentially buggy triggers or synonyms on the table in this option.
Now that you have the table changed, you can focus on the application code. If it helps, you can add annotations over the column and table names and start refactoring the code. You need to make sure you don’t make model changes that will break the other apps. If apps ignore model changes, you can get away with adding annotations over the columns and classes on all the apps before refactoring. You can get rid of the view sooner this way.

Use fully qualified table names for Database Projects, Unresolved reference error

I have an existing very large database (several hundred tables) for which I am using a database project in VS 2017. Consider a UDF as simple as
create function [MySchema].fn_Name -- MySchema is located in MyDbName
(
    @num int
)
returns table
as return
    select mt.num
    from [MyDbName].dbo.MyTable mt
    where mt.num = @num
I have a database project for MyDbName and several tables that are pertinent to this particular project, but those tables do NOT include MyTable. If I want to use fully qualified names, how can I prevent the unresolved reference error (SQL71561) for MyTable? Do I have to explicitly add all of the tables that my procs and functions reference into my database project?
It is quite hard to understand your question exactly, but there could be several scenarios:
If you have a single database project (MyDbName) and both the table and the UDF exist there, you should not be using a fully qualified name: the project would just be referencing itself, so it makes no sense. (There is a scenario where you have multiple instances of the same database structure, with one serving as master and the others as slaves, but as I understand it that is not the case here.) So simply do not use the database name in the statement.
If dbo.MyTable exists in a different database project than the UDF, then you need to add a database reference and use a SQLCMD variable to refer to the database. A database reference is added by referencing the .dacpac of the other database project. Your query would then look something like this:
inner join [$(MyDbName)].dbo.MyTable mt
To answer your question: no, you don't have to add the tables themselves, but you do need to add a reference to the project which holds those tables.
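With the reference in place, the UDF from the question might look something like this (a sketch; the body is simplified to keep it self-contained):

create function [MySchema].fn_Name
(
    @num int
)
returns table
as return
    select mt.num
    from [$(MyDbName)].dbo.MyTable mt
    where mt.num = @num;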

DataVault modelling for Domain Reference Table

Folks,
Quick Version:
How should I model HUBs, SATs and LINKs when I have multiple domain lookup references in my HUB_SAT?
If you were to generically model these from the source schema, how would you differentiate between FKs that should be LINKs and FKs that should be References?
Long Version:
I am building out a generic solution for generating DV models from an existing 3NF MSSQL schema. In my source database I have one huge Domain reference table which holds the majority of the business lookup keys:
Key INT (Unique)
TypeID INT
Description VARCHAR
Posting Code
... some other fields that are not relevant to the discussion
As I see it there are five basic choices for linking to this table:
1. Create it as a HUB and then produce LINK tables for each business HUB that refers to it
2. Create it as a single ReferenceLookup table and include the R_ReferenceIDs in the SAT table
3. Create a separate ReferenceLookup table for each TypeID and link from the SAT using the R_ReferenceID
4. Create separate HUBs for each TypeID and generate LINK tables
5. Create a single LINK table with a LINK_SAT table to hold details of which reference value is mapped by the LINK
Of these, #3 feels like the best design (but also the hardest to model correctly, especially as in my case the lookup table has an FK to the Type table).
From the Wikipedia for DataVault,
Reference tables are referenced from Satellites, but never bound with physical foreign keys.
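For illustration, a sketch of what option 3 could look like (all names are assumptions; note the deliberate absence of a physical FK, per the convention above):

CREATE SCHEMA ref;
GO

-- One reference table per TypeID, e.g. asset types:
CREATE TABLE ref.AssetType
(
    R_AssetTypeID int          NOT NULL PRIMARY KEY,
    Description   varchar(100) NOT NULL
);

-- The SAT carries the R_ReferenceID but no physical foreign key:
CREATE TABLE dbo.ASSET_SAT
(
    SAT_ID        int  NOT NULL PRIMARY KEY,
    ASSET_HUB_ID  int  NOT NULL,
    BuildDate     date NULL,
    DisposalDate  date NULL,
    R_AssetTypeID int  NOT NULL  -- resolved against ref.AssetType at query time
);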
My generic code is based on the design pattern as explained in the BIML DataVault walkthrough
I am looking at all tables in the source schema to determine whether they are a HUB (has a PK and multiple FKs, plus fields that are not FKs), a SAT (has a PK and only one FK) or a LINK (has a PK and more than one FK, and all fields are part of the PK or an FK); a sketch of this classification follows the list below.
I then build:
HUBs with a HUB_ID and the source PK
SATs with the non FK fields of the HUB source table
SATs for the source SAT tables
LINKs from the source LINK tables
LINKs from the HUB FK relationships
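As a starting point, here is what that interrogation might look like against the SQL Server catalog views (a heuristic sketch only, using the classification rules above):

-- Classify tables as HUB / SAT / LINK candidates from their key structure.
WITH key_cols AS (
    -- columns that participate in the primary key or in any foreign key
    SELECT ic.object_id, ic.column_id
    FROM sys.index_columns ic
    JOIN sys.indexes i
      ON i.object_id = ic.object_id AND i.index_id = ic.index_id
    WHERE i.is_primary_key = 1
    UNION
    SELECT fkc.parent_object_id, fkc.parent_column_id
    FROM sys.foreign_key_columns fkc
),
stats AS (
    SELECT t.object_id,
           SCHEMA_NAME(t.schema_id) + '.' + t.name AS table_name,
           (SELECT COUNT(*) FROM sys.foreign_keys fk
            WHERE fk.parent_object_id = t.object_id) AS fk_count,
           (SELECT COUNT(*) FROM sys.columns c
            WHERE c.object_id = t.object_id
              AND NOT EXISTS (SELECT 1 FROM key_cols k
                              WHERE k.object_id = c.object_id
                                AND k.column_id = c.column_id)) AS non_key_cols
    FROM sys.tables t
)
SELECT table_name, fk_count, non_key_cols,
       CASE
           WHEN fk_count = 1                      THEN 'SAT candidate'
           WHEN fk_count > 1 AND non_key_cols = 0 THEN 'LINK candidate'
           ELSE 'HUB candidate'
       END AS dv_role
FROM stats
ORDER BY table_name;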
This is all working up to a point (i.e. I have tables for all of the above). However, there are some pretty wide tables where a significant number of the fields are simply R_RefID fields, all looking up on the same HUB, and all bound with FKs on the entity table referencing the reference table.
E.g. the source Asset table has reference fields for
- Asset Type
- Asset Purpose
- Asset Manager
- Asset Funder
- ...
so in the preliminary model I have:
ASSET_HUB (HubID, Asset_ID)
ASSET_SAT (SAT_ID, BuildDate, DisposalDate, ....)
Lookup_HUB (Hub_ID, LookupID)
ASSET_Lookup_1_LINK (Link_ID, ASSET_HUB_ID, Lookup_HUB_ID)
ASSET_Lookup_2_LINK (Link_ID, ASSET_HUB_ID, Lookup_HUB_ID)
ASSET_Lookup_3_LINK (Link_ID, ASSET_HUB_ID, Lookup_HUB_ID)
ASSET_Lookup_4_LINK (Link_ID, ASSET_HUB_ID, Lookup_HUB_ID)
but there is no way of identifying what each of the LINK tables represents in the domain model.
How would you go about interrogating the schema to determine whether a table is a genuine HUB candidate or whether it should be a REF table instead, and how would you determine whether an FK should be treated as a LINK or a SAT.R_RefID? I am after strategy rather than code (but I'm not going to turn down code if it is on offer :) ). My source DB is SQL2008R2 and my development environment is SQL2016_Dev.
In response to tobi6:
In the source system, the business entity has a number of attribute fields which are just XXX_ID types that look up their descriptors from the domain reference table. If you model this domain reference table as a HUB, then you either have to have separate LINK tables for each lookup (LINK tables are automatically generated because there is an FK on the business entity), or multiple active LINK records with a LINK_SAT to identify which attribute you are tracking (this actually creates a 5th design pattern option). If I tag the domain reference table as a REFerence, then the XXX_IDs stay in the HUB_SAT, which feels like a better solution but is harder to model generically, i.e. how do I determine whether the business entity FK should create a LINK, a LINK and LINK_SAT, or a SAT.R_RefID?

DACPAC package with complex changes

I'm looking to switch to DACPACs for our database changes, but I'm a bit at a loss about what to do when it comes to more complex database updates. To illustrate what I mean, let me use a simple example that has the same problem.
Say I have a Customer table that is currently live and I want to add a new CustomerType table with a foreign key from Customer to CustomerType. The new column in Customer should be required (not nullable), but should not have a default value.
I want to use some arbitrary formula to set up the initial type for the existing customers upon upgrading. How would I accomplish this using a DACPAC?
The DACPAC will only know there's a new column and will try to add it to the Customer table, which will of course fail because it is required. Setting a default value is undesirable, as is allowing null values.
Since the DACPAC should be usable to upgrade from every state to the latest, I don't see what kind of configuration or pre/post scripts I should set up to make this work.
Various searches have produced a disappointing lack of useful results :(
I hope there's someone here that can help out. Thanks in advance.
The answer will vary a bit depending on how you're planning to deploy the dacpac(s). One common case is having the dacpac replace a collection of T-SQL update scripts that are executed in sequence to update a database schema from one version to the next. In this case you might choose to have one dacpac file for each schema version of your database, and to update a database you would publish the dacpacs in sequence until it reaches the latest version.
In that case, it's possible to use a post-deploy script to fix up the schema as appropriate. For your example scenario, you can model the database in the database project with the new column specified as NULL and without the FK relationship with the new table. Then, in a post-deploy script, you can author the T-SQL necessary to execute an UPDATE statement to fill the new table and the new column, an ALTER statement to change the column from NULL to NOT NULL, and finally a statement to add the foreign key relationship.
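For the Customer/CustomerType example, the post-deploy script might look something like this sketch (column and constraint names are assumed, and the CASE stands in for your arbitrary formula):

-- Post-deploy: runs after every publish, so each step must be idempotent.

-- 1. Seed the new lookup table if it is empty.
IF NOT EXISTS (SELECT 1 FROM dbo.CustomerType)
    INSERT INTO dbo.CustomerType (CustomerTypeID, Name)
    VALUES (1, 'Standard'), (2, 'Premium');

-- 2. Backfill existing customers (replace the CASE with your formula).
UPDATE dbo.Customer
SET CustomerTypeID = CASE WHEN CustomerID % 2 = 0 THEN 2 ELSE 1 END
WHERE CustomerTypeID IS NULL;

-- 3. Tighten the column and add the FK once the data is in place.
IF COLUMNPROPERTY(OBJECT_ID('dbo.Customer'), 'CustomerTypeID', 'AllowsNull') = 1
    ALTER TABLE dbo.Customer ALTER COLUMN CustomerTypeID int NOT NULL;

IF OBJECT_ID('dbo.FK_Customer_CustomerType', 'F') IS NULL
    ALTER TABLE dbo.Customer ADD CONSTRAINT FK_Customer_CustomerType
        FOREIGN KEY (CustomerTypeID) REFERENCES dbo.CustomerType (CustomerTypeID);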
Then moving forward you can remove the post-deploy script and model the new column and table with the proper column type and FK relationship.

What would you do to avoid conflicting data in this database schema?

I'm working on a multi-user internet database-driven website with SQL Server 2008 / LinqToSQL / custom-made repositories as the DAL. I have run across a normalization problem which can lead to an inconsistent database state if exploited correctly and I am wondering how to deal with the problem.
The problem: Several different companies have access to my website. They should be able to track their Projects and Clients at my website. Some (but not all) of the projects should be assignable to clients.
This results in the following database schema:
Companies:
    ID
    CompanyName

Clients:
    ID
    CompanyID (not nullable)
    FirstName
    LastName

Projects:
    ID
    CompanyID (not nullable)
    ClientID (nullable)
    ProjectName
This leads to the following relationships:
Companies-Clients (1:n)
Companies-Projects (1:n)
Clients-Projects (1:n)
Now, if a user is malicious, he might for example insert a Project with his own CompanyID, but with a ClientID belonging to another user, leaving the database in an inconsistent state.
The problem occurs in a similar fashion all over my database schema, so I'd like to solve this in a generic way if any possible. I had the following two ideas:
Check for database writes that might lead to inconsistencies in the DAL. This would be generic, but requires some additional database queries before update and create queries are performed, so it will cost some performance.
Create an additional table for the Clients-Projects relationship and make sure the relationships created this way are consistent. This also requires some additional select queries, but far fewer than in the first case. On the other hand it is not generic, so it is easier to miss something in the long run, especially when adding more tables / dependencies to the database.
What would you do? Is there any better solution I missed?
Edit: You might wonder why the Projects table has a CompanyID. This is because I want users to be able to add projects with and without clients. I need to keep track of which company (and therefore which website user) a clientless project belongs to, which is why a project needs a CompanyID.
I'd go with the latter, having one or more tables that define the allowable relationships between entities.
Note, there's no circularity in the references you have, so the title is misleading.
What you have is the possibility of conflicting data, that's different.
Why do you have "CompanyID" in the project table? The ID of the company involved is implicitly given by the client you link to. You don't need it.
Remove that column and you've removed your problem.
Additionally, what is the purpose of the "name" column in the client table? Can you have a client with one name, differing from the name of the company?
Or is "client" the person at that company?
Edit: OK, with the clarification about projects without clients, I would separate out the references, but you're not going to get rid of the problem you're describing without constraints that prevent multiple references being made.
A simple constraint for your existing tables would be that the CompanyID and ClientID fields of a project row cannot both be non-null at the same time.
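In T-SQL that constraint might be sketched as (assuming both columns are made nullable):

ALTER TABLE dbo.Projects
ADD CONSTRAINT CK_Projects_CompanyOrClient
    CHECK (CompanyID IS NULL OR ClientID IS NULL);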
If you want to use the table like this and avoid all the new queries, just put triggers on the table; when a user tries to insert a row with wrong data, the trigger will stop them.
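A rough sketch of such a trigger, using the schema from the question:

CREATE TRIGGER dbo.trg_Projects_ClientCompany
ON dbo.Projects
AFTER INSERT, UPDATE
AS
BEGIN
    -- Reject rows whose client belongs to a different company.
    IF EXISTS (SELECT 1
               FROM inserted i
               JOIN dbo.Clients c ON c.ID = i.ClientID
               WHERE c.CompanyID <> i.CompanyID)
    BEGIN
        RAISERROR('Client must belong to the same company as the project.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;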
Best Regards,
Iordan
My first thought would be to create a special client record for each company with the name "No client". Then eliminate the CompanyId from the Project table, and if a project has no client, use the "No client" record rather than a "normal" client record. If processing of such no-client records is special, add a flag to the no-client record to explicitly identify it. (I'd hate to rely on the name being "No Client" or something like that -- too fuzzy.)
Then there would be no way to store inconsistent data so the problem would go away.
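A sketch of that suggestion (the IsNoClient flag column and the placeholder row are illustrative assumptions):

ALTER TABLE dbo.Clients
ADD IsNoClient bit NOT NULL CONSTRAINT DF_Clients_IsNoClient DEFAULT (0);
GO

-- One explicit "No client" row per company (CompanyID 1 used as an example):
INSERT INTO dbo.Clients (CompanyID, FirstName, LastName, IsNoClient)
VALUES (1, 'No', 'Client', 1);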
In the end I implemented a completely generic solution which solves my problem without much runtime overhead and without requiring any changes to the database. I'll describe it here in case someone else has the same problem.
First off, the approach only works because the only table that other tables are referencing through multiple paths is the Companies table. Since this is the case in my database, I only have to check whether all n:1 referenced entities of each entity that is to be created / updated / deleted are referencing the same company (or no company at all).
I am enforcing this by deriving all of my Linq entities from one of the following types:
SingleReferenceEntityBase - The norm. Only checks (via reflection) if there really is only one reference (whether direct or transitive) to the Companies table. If this is the case, the references to the Companies table cannot become inconsistent.
MultiReferenceEntityBase - For special cases such as the Projects table above. Asks all directly referenced entities what company ID they are referencing. Raises an exception if there is an inconsistency. This costs me a few select queries per CRUD operation, but since MultiReferenceEntities are much rarer than SingleReferenceEntities, this is negligible.
Both of these types implement a CheckReferences method, and I call it whenever the Linq entity is written to the database, by partially implementing the OnValidate(System.Data.Linq.ChangeAction action) method which is automatically generated for all Linq entities.
