Where Do You Put Commonly Shared Tables, Stored Procedures, and Functions? - sql-server

I have a SQL Server with a number of databases. Most are for applications, but some store data for reporting and analysis. I also have information that is not specific to any one database, but can be used by several of them.
A good example is my company's fiscal calendar. I store this information in a table. Putting the same fiscal calendar table in each database is a bad idea for me. Even with the negative of having multiple database dependencies, I think it is worth it because otherwise there is too much risk for inconsistency. What I do now is put the fiscal calendar and other similar functions and procedures in a database simply titled "Community".
I have the rare and glorious opportunity of moving to a new server and refactoring everything as I go. I am wondering if I should change this practice. Below are a few specific questions:
Am I unaware of any disadvantages of my current method?
Is there a better place or name to use to store this type of information?
What is your experience with issues like this, and am I missing what should be an obvious solution?
Thanks

You've already taken the important step of separating the shared data into its own database. I don't think there's a better approach. The name is fairly subjective, but Common is another term frequently used for this purpose.
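A minimal sketch of the shared-database idea, assuming SQLite (through Python's sqlite3 module) as a stand-in for SQL Server. The attached database alias "common", the table names, and the sample rows are all hypothetical. In SQL Server the same join would use a three-part name such as Common.dbo.FiscalCalendar rather than an attached database.

```python
import sqlite3


def build_demo():
    """Create an application database plus an attached shared database."""
    # The main connection plays the role of one application database.
    conn = sqlite3.connect(":memory:")
    # The attached database plays the role of the shared "Common" database.
    conn.execute("ATTACH DATABASE ':memory:' AS common")
    conn.execute(
        "CREATE TABLE common.fiscal_calendar ("
        " calendar_date TEXT PRIMARY KEY,"
        " fiscal_year INTEGER NOT NULL,"
        " fiscal_period INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO common.fiscal_calendar VALUES (?, ?, ?)",
        [("2024-01-15", 2024, 7), ("2024-07-01", 2025, 1)],
    )
    # Application-specific data lives in the application database only.
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " order_date TEXT NOT NULL,"
        " amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "2024-01-15", 100.0), (2, "2024-07-01", 250.0), (3, "2024-07-01", 50.0)],
    )
    return conn


def sales_by_fiscal_year(conn):
    """Join local orders to the single shared copy of the fiscal calendar."""
    return conn.execute(
        "SELECT fc.fiscal_year, SUM(o.amount)"
        " FROM orders AS o"
        " JOIN common.fiscal_calendar AS fc ON fc.calendar_date = o.order_date"
        " GROUP BY fc.fiscal_year"
        " ORDER BY fc.fiscal_year"
    ).fetchall()
```

Every application database joins to the same calendar rows, so there is only one copy to keep correct. The cost is the cross-database dependency the question already mentions.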

I would hide this behind a "shared data service" or something similar, rather than rely on the existence of a database.
You don't have to be a big shop before you need to move one app onto its own servers, and then you're bollixed.
At the very least, I'd use a linked server to hide it even if on the same server so you are independent of actual server names.

Related

SQL Server copy data across databases

I'm using SQL Server 2019. I have a "MasterDB" database that is exposed to GUI application.
I'm also going to have as many as 40 user databases, like "User1DB", "User2DB", etc. All of these user databases will have exactly the same schema and tables.
I have a requirement to copy table data (overwriting the target) from one user database (let's say "User1DB") to another (say "User2DB"). I need to do this from within the "MasterDB" database, since the GUI client app is going to have access to only this database. How should I handle this dynamic situation? I'd prefer static SQL (in the form of stored procedures) rather than dynamic SQL.
Any suggestion will greatly be appreciated. Thanks.
Check out this question here for transferring data from one database to another.
Aside from that, I agree with @DaleK here. There is no real reason to have a database per user if we are making the assumption that a user is someone who is logging into your frontend app.
I can somewhat understand replicating your schema per customer if you are running some multi-billion record enterprise application where you physically have so much data per customer that it makes sense to split it up, but based on your question that doesn't seem to be the case.
So, if our assumptions are correct, you just need to have a user table, where your fields might be...
UserTable
UserId
FName
LName
EmailAddress
...
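If that assumption holds, the copy-and-overwrite requirement becomes two static, parameterized statements against one table keyed by user. A minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for SQL Server. The user_settings table and its columns are hypothetical and only illustrate a per-user data table.

```python
import sqlite3


def create_schema(conn):
    """One shared table keyed by user_id instead of one database per user."""
    conn.execute(
        "CREATE TABLE user_settings ("
        " user_id INTEGER NOT NULL,"
        " setting_name TEXT NOT NULL,"
        " setting_value TEXT,"
        " PRIMARY KEY (user_id, setting_name))"
    )


def copy_user_data(conn, source_user_id, target_user_id):
    """Overwrite the target user's rows with a copy of the source user's rows."""
    # "with conn" commits both statements together, or rolls both back on error.
    with conn:
        conn.execute(
            "DELETE FROM user_settings WHERE user_id = ?",
            (target_user_id,),
        )
        conn.execute(
            "INSERT INTO user_settings (user_id, setting_name, setting_value)"
            " SELECT ?, setting_name, setting_value"
            " FROM user_settings WHERE user_id = ?",
            (target_user_id, source_user_id),
        )
```

Both statements are fixed text with parameters, so nothing has to be assembled from database names at run time.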
Edit:
I see in the comments you are referring to "source control data" ... I suggest you study up on databases and how they're meant to be designed, implemented, and how data should be transacted. There are a ton of great articles and books out there on this with a simple Google search.
If you are looking to replicate your data for backup purposes, look into some data warehouse design principles, maybe creating a separate datastore in a different geographic region for that. The latter is a very complex subject that I can't cover in this answer, and it sounds like it goes far beyond your current needs. My suggestion is to backtrack and hash out the needs of your application, while understanding some of the fundamentals of databases (or different methods of storing data). Implement something and then see where it can be expanded upon or refactored.
Beyond that, I can't be more detailed than the original question you posted. Hope this helps.

Transact SQL, prefix with schema or not?

I'm developing a tool where I've prefixed tables etc. with "dbo", and now I get requests for custom schema names. I'm thinking of dropping the prefixes and instead letting the user control this via the associated login against the database. I know there's talk about "performance", since name resolution has to search the user's schema and then fall back on dbo, but is that really an issue? Opinions?
First, I would look at this question as a feature request from your customers (users?). So the immediate decision to make is, should you even consider looking into this now, or do you have other feature requests that are obviously more important and deliver more benefit to the customer?
For example, for now you could simply tell customers that your application requires its own database that should not be shared with other applications or manipulated in any way by the customer. Then you don't have to worry about schemas or the same object name in two schemas because your application 'owns' the database. Perhaps this is already the case, but if so then I don't understand why your customers care which schema your objects are in.
Second, assuming that you do decide to work on it, you should gather some information about why people are asking for this, to make sure that you clearly understand what they expect you to deliver and what the benefit is for them. If customers are really saying "your application runs slowly", then the choice of schema is highly unlikely to be the reason; it's much more probable that indexing, schema design or your application code are the areas to look at.
Finally, if you still want to go ahead you need to find a technical solution. This is partly a deployment issue and partly a coding issue. It's a deployment issue because you have to deploy your database objects in a specific schema that is specified at installation time, and all your patches and later releases need to be aware of that too. The coding issue is that you need your database code to be "schema-aware", in case you end up in a situation where you have dbo.TableName, MyTool.TableName and OtherSchema.TableName all in the same database. The solution is obviously to reference the schema name in all code, which is considered an important best practice anyway. But exactly how you do this depends on how you have structured your application, if you use an ORM etc.
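One way to keep application code schema-aware is to build every object reference from a schema name chosen at installation time. A minimal sketch in Python, with hypothetical helper names (quote_identifier, qualified_name, build_select). The quoting rule follows T-SQL bracket quoting, where a closing bracket inside a name is doubled.

```python
def quote_identifier(name):
    """Wrap a name in brackets, doubling any closing bracket inside it."""
    if not name:
        raise ValueError("identifier must not be empty")
    return "[" + name.replace("]", "]]") + "]"


def qualified_name(schema, object_name):
    """Return a two-part, schema-qualified object name."""
    return quote_identifier(schema) + "." + quote_identifier(object_name)


def build_select(schema, table, columns):
    """Build a SELECT that always names the schema explicitly."""
    column_list = ", ".join(quote_identifier(column) for column in columns)
    return "SELECT " + column_list + " FROM " + qualified_name(schema, table)
```

With the schema held in one configuration value, build_select("MyTool", "TableName", ["Id"]) and build_select("dbo", "TableName", ["Id"]) target different objects in the same database without any other code change.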

Should we start with multiple small-grained databases for an app that may scale massively

We're developing a new eCommerce website and are using NHibernate for the first time. At present we are splitting our data into multiple SQL Server databases, divided per area of functionality. So we have one for UserInfo, one for Orders, one for ProductCatalogue and so on...
Our justification for this decision is twofold really:
the website has the potential to be HUGE (it is a new website for one of the largest online brands in the UK) and we feel that by partitioning our data along functional lines we will be able to move the databases onto their own servers which would give us an easy scaling route should we need it;
my team has always worked this way - partly as a consequence of following the MS Commerce Server pattern from previous projects.
However, reading up on this decision on the internet, we find that the normal response to this sort of model is extremely scathing. "Creating more work for the devs now in order to create more work for the devs later" is one sample comment from Stack Overflow!
In addition, NHibernate is much easier to use with only one database (just one SessionFactory needed). And knowing that Stack Overflow ran off just one box for a long time makes me think that maybe we should not try to be so clever.
So, my question is, "are we correct in thinking that using fine-grained databases might increase our ability to scale or should we sacrifice this for easier development"?
Why don't you just design your database properly and put the files on appropriate disks? Use a cluster if necessary. Creating multiple databases is not inherently a scaling solution. Also, cross-database referential integrity? Good luck.
What's your definition of "HUGE"? SQL Server can handle massive databases, but one thing I've learnt is that people often have no idea what constitutes a lot of data.
I've never worked on a project like this. I'm used to databases with several hundred tables, which has never been a problem.
Therefore I can't say whether your idea is a good one, because I have never tried it. The "my team has always worked this way" argument is a major driver for many decisions, and I can't even say that it is always wrong.
With NHibernate you organize your data in classes. They can be in different namespaces and assemblies. You usually don't work much with the database directly, you don't need this kind of structure there.
About the scalability argument: I'm not sure if it is really scaling well when you need to access several databases every time. I mean: you always need users and orders and probably more. Then you need to get all this data from several databases.
Agree fully with starskythehutch - keep your related tables together in the same DB. BUT, you may want to consider having separate databases for things that are not related or non-critical to your main product; but that are a part of the app.
For example, if you decide to log every visit/hit to the site in a DB, you should probably keep that in a separate DB.
The reasons to consider this:
1. Huge number of transactions, say hundreds of thousands per second. Keeping non-critical, unrelated data in a separate DB ensures that it does not cause transaction log contention in your main DB.
2. Restore, DBCC CHECKDB, and backup times. If you put unrelated, non-critical data in your main DB, you increase the size of that DB, which slows these operations down. Keeping it in a separate DB helps these operations run faster.

Prefixing database table names

I have noticed a lot of companies use a prefix for their database tables. E.g. tables would be named MS_Order, MS_User, etc. Is there a good reason for doing this?
The only reason I can think of is to avoid name collision. But does it really happen? Do people run multiple apps in one database? Is there any other reason?
Personally, I don't see any value in it. In fact, it's a bummer for intellisense-like features because everything begins with MS_. :) The Master agrees with me too.
Huge schemas often have many tables with similar, but distinct, purposes. Thus, various "segmented" naming conventions.
Darn, didn't get first post :-)
In SQL Server 2005 and above the schema feature eliminates the need for any kind of prefix. A good example of their usage can be found by reading about the Schemas in AdventureWorks.
In some older versions of SQL Server, having a prefix to create a pseudo-namespace might have been useful for DBs with lots of tables.
Other than that I can't really see the point.
Even when the database only contains one application, the prefixes can be useful in grouping like parts of the application together. So tables that contain customer information might be prefixed with cust_, those that contain warehouse information might be prefixed with inv_ (for inventory), those that contain financial information might be prefixed with fin_, etc.
I've worked on systems where there is an existing database for an application which was created and maintained by a different company and we've needed to add another app that uses large amounts of the same data with just a few extra tables of our own, so in that case having an app specific prefix can help with the separation.
Slightly tangentially to the original question, I've seen databases use prefixes to indicate the type of data that a table is holding. There'd be a prefix for lookup tables, which are obviously pretty static in both size and content, and a different prefix for the tables that contain variable data. This in turn may be broken into having one prefix for tables that are added to but not really changed, like logging, processed orders, customer transactions etc., and another for more variable data like customer balance or whatever. Link tables could also have their own prefix to separate them out too.
I have never seen a naming collision, as it usually doesn't make sense to put tables from different applications into the same database namespace. If you had some sort of reusable library that could be integrated into different applications, perhaps that might be a reason, but I haven't seen anything like that.
Though, now that I think about it, there are some cheap web hosting providers that only allow users to create a very small number of databases, so it would be possible to run a number of different applications using a single database, so long as the names didn't collide (and such a prefixing convention would certainly help).
Multiple applications using a particular table, correct. Prefixes prevent name collision. Also, it makes it rather simple to backup tables and keep them in the same database, just change the prefix and your backup will be fully functional, etc. Aside from that, it's just good practice.
Prefixes are a good way to sort out which sql objects are associated with which app when multiple apps dip into the same database.
I have also prefixed sql objects differently within the same app to facilitate easier management of security. i.e. all the objects with admin_ need this security applied and the rest need something else.
Prefixes can be handy for humans, search tools and scripts. If the situation is a simple one, however, there is probably no use for them at all.
It's most often used if several applications are sharing one database. For example, if you install WordPress, it prefixes all tables with "wp_". This is good if you want your applications to share data very easily (sessions across all applications in your company, for example).
There are better ways to accomplish this, however, and I never prefix my table names, as each application has its own self-contained database.

django AuditTrail vs Reversion

I am working on a new web app and I need to store any changes in the database to audit table(s). The purpose of such audit tables is that later on, in a real physical audit, we can ascertain what happened in a situation, who edited what, and what the state of the db was at the time of, e.g., a complex calculation.
So the audit tables will mostly be written and not read, though a report may sometimes be generated.
I have looked at the available solutions:
AuditTrail - simple and that is why I am inclining towards it, I can understand it single file code.
Reversion - looks simple enough to use but not sure how easy it would be to modify it if needed.
rcsField seems to be very complex and too much for my needs
I haven't tried any of these, so I wanted to hear about some real experiences and which one I should be using. E.g. which one is faster, uses less space, and is easy to extend and maintain?
Personally, I prefer to create audit tables in the database and populate them through triggers, so that any change, even from ad hoc queries in the query window, is stored. I would never consider an audit solution that is not based in the database itself. This is important because people who are making malicious changes to the database or committing fraud are not likely to do so through the web interface but on the backend directly. Far more of this stuff happens from disgruntled or larcenous employees than from outside hackers. If you are using an ORM already, your data is at risk because the permissions are at the table level rather than the stored procedure level where they belong. Therefore it is even more important that you capture any possible change to the data, not just what came from the GUI. We have a dynamic proc to create audit tables that is run whenever new tables are added to the database. Since our audit tables store only the changes and not the whole record, we do not need to change them every time a field is added.
Also, when evaluating possible solutions, make sure you consider how hard it will be to revert the data to undo a specific change. Once you have audit tables, you will find that this is one of the most important things you need to do from them. Also consider how hard it will be to maintain the information as the database schema changes.
Choosing a solution because it appears to be the easiest to understand is not generally a good idea. That should be the lowest of your selection criteria, after meeting the requirements, security, etc.
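The trigger-based approach described in this answer can be sketched as follows, assuming SQLite through Python's sqlite3 module as a stand-in for SQL Server. The audit_log layout and the create_audit_triggers helper are hypothetical, not the answerer's actual procedure. A SQL Server version would read the inserted and deleted pseudo-tables inside the trigger instead of OLD and NEW.

```python
import sqlite3

# One narrow audit table: each row records a single changed column value.
AUDIT_DDL = """
CREATE TABLE IF NOT EXISTS audit_log (
    audit_id    INTEGER PRIMARY KEY,
    table_name  TEXT NOT NULL,
    row_id      INTEGER NOT NULL,
    column_name TEXT NOT NULL,
    old_value   TEXT,
    new_value   TEXT,
    changed_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def create_audit_triggers(conn, table, key_column):
    """Generate one UPDATE trigger per non-key column of the given table.

    Table and column names are interpolated into DDL, so they must come
    from trusted schema metadata, never from user input.
    """
    conn.execute(AUDIT_DDL)
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    for column in columns:
        if column == key_column:
            continue
        # IS NOT is a NULL-safe comparison, so only real changes are logged.
        conn.execute(f"""
            CREATE TRIGGER IF NOT EXISTS "audit_{table}_{column}"
            AFTER UPDATE OF "{column}" ON "{table}"
            WHEN OLD."{column}" IS NOT NEW."{column}"
            BEGIN
                INSERT INTO audit_log
                    (table_name, row_id, column_name, old_value, new_value)
                VALUES
                    ('{table}', OLD."{key_column}", '{column}',
                     OLD."{column}", NEW."{column}");
            END
        """)
```

Because the triggers live in the database, an UPDATE typed into a query window is logged the same way as one issued by the application. Columns added later need the helper to be run again for that table, but the audit table itself does not change.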
I can't give you real experience with any of them but would like to make an observation.
I assume by AuditTrail you mean AuditTrail on the Django wiki. If so, I think you'll want to instead look at HistoricalRecords, developed by the same author (Marty Alchin, aka @gulopine) in his book Pro Django. It should work better with Django 1.x.
This is the approach I'll be using on an upcoming project, not because it necessarily beats the others from a technical standpoint, but because it matches the "real world" expectations of the audit trail for that application.
As I stated in my question, rcsField seems to be too much for my needs, which are simple: I want to store any changes to my tables and maybe come back to those changes later to generate some reports.
So I tested AuditTrail and Reversion.
Reversion seems to be a better, full-blown application with many features (which I do not need). Also, as far as I know, it saves data in a single table in XML or YAML format, which I think has two drawbacks:
it will generate too much data in a single table;
I may not be able to use existing db tools to read that data.
AuditTrail wins in that regard: for each table it generates a corresponding audit table, so changes can be tracked easily, there is less data per table, and the data can be easily manipulated and used for report generation.
So I am going with AuditTrail.
