Serial numbers, created and modified in SQL Server - sql-server

I need to add serial numbers to most of the entities in my application because I'm going to be running a Lucene search index side-by-side.
Rather than having to run an ongoing polling process, or manually run my indexer from my application, I'm thinking of the following:
Add a Created column with a default value of GETUTCDATE().
Add a Modified column with a default value of GETUTCDATE().
Add an ON UPDATE trigger to the table that updates Modified to GETUTCDATE() (can this happen as the UPDATE is executed? i.e. it adds SET [Modified] = GETUTCDATE() to the SQL query instead of updating it individually afterwards?)
The ON UPDATE trigger will call my Lucene indexer to update its index (this would presumably have to be an xp_cmdshell call, but is there a way of sending a message to the process instead of starting a new one? I've heard I could use named pipes, but how do you use named pipes from within a sproc or trigger? Searching for "SQL Server named pipes" gives me irrelevant results, of course).
Does this sound okay, and how can I solve the small sub-problems?

As I understand it, you have to introduce two columns to your existing tables and have them processed (at least one of them) at runtime and used by an external component.
Your first three points are nothing unusual. There are two types of triggers in SQL Server, distinguished by when the trigger gets processed: the INSTEAD OF trigger (processed in place of the triggering statement, before the insert happens) and the AFTER trigger. However, inside an INSTEAD OF trigger you have to provide the logic that actually inserts the data into the table, along with whatever custom processing you require. I usually avoid that if it's not really necessary.
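To answer the side question: an AFTER trigger runs a second UPDATE after the original statement completes; only an INSTEAD OF trigger would let you fold SET [Modified] = GETUTCDATE() into the same statement. For the Modified column, a plain AFTER UPDATE trigger is usually enough - a minimal sketch, assuming a table dbo.MyEntity with an integer key Id (both names hypothetical):

-- Hypothetical table and column names; adjust to your schema.
CREATE TRIGGER trg_MyEntity_SetModified
ON dbo.MyEntity
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Stamp only the rows touched by the UPDATE that fired the trigger.
    UPDATE e
    SET Modified = GETUTCDATE()
    FROM dbo.MyEntity AS e
    INNER JOIN inserted AS i ON i.Id = e.Id;
END

With the default RECURSIVE_TRIGGERS setting (OFF), the trigger's own UPDATE won't fire the trigger again.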
Now, about your fourth point - it's tricky and there are several approaches to solve it in SQL Server, but all of them are at least a bit ugly. Basically you have to either execute an external process or send a message to it. I don't have any experience with the Lucene indexer, but I guess one of these methods (execute or send message) would apply.
So, you can do one of the following to directly or indirectly access the external component, meaning access the Lucene indexer directly or via some proxy module:
Implement an unsafe CLR trigger; basically you execute .NET code inside the trigger and thus get access to the whole .NET Framework (be careful with that - it's not entirely true)
Implement an unsafe CLR procedure; the only difference from the CLR trigger is that you wouldn't call it immediately after the INSERT, but you will do fine with some database job that runs it periodically
Use xp_cmdshell; you already know about this one, but you can combine this approach with the job-wrapping technique from the previous point
Call a web service; this technique is usually marked as experimental AND you have to implement the service yourself (unless the Lucene indexer installs some web service of its own)
There surely are other methods I can't think of right now...
I would personally go with the third option (job + xp_cmdshell) because of its simplicity, but that's just because I lack any knowledge of how the Lucene indexer works.
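As a rough illustration of the job + xp_cmdshell route (xp_cmdshell is disabled by default and has security implications; the indexer path below is made up), a SQL Agent job step could contain something like:

-- One-time setup: enable xp_cmdshell (only if your security policy allows it).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;

-- Job step: shell out to the (hypothetical) indexer executable.
EXEC xp_cmdshell 'C:\Tools\LuceneIndexer.exe --incremental';

Running this from a periodic job rather than from the trigger itself keeps the UPDATE from having to wait for the indexer to finish.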
EDIT (another option):
Use Query Notifications; SQL Server Service Broker allows an external application to connect and monitor interesting changes. You even have several options for how to do that (basically synchronous or asynchronous); the only precondition is that your Service Broker is up, running and available to your application. This is a more sophisticated method of informing an external component that something has changed.
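Query Notifications ride on Service Broker, so the first thing to check is whether the broker is enabled for the database - a quick sketch (MyDatabase is a placeholder name):

-- Is Service Broker enabled for the current database?
SELECT name, is_broker_enabled
FROM sys.databases
WHERE name = DB_NAME();

-- Enable it if necessary (needs exclusive access to the database).
ALTER DATABASE MyDatabase SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE;

The external application (for example via SqlDependency in .NET) then subscribes to the query it cares about and gets notified when the underlying data changes.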

Related

Change Implementation of Select, Update, and Insert in SQL Server

To give you the question first: I want to know if it is possible to create a stored procedure or something in SQL Server that intercepts and translates SELECT, INSERT, and UPDATE commands. Now for the explanation:
I am writing a web application to replace an old desktop app. It's a business app which is basically a database interface with reports and searches and all the good ol' CRUD. The new and old apps need to live in harmony together, since some customers may be using the old and new together to access the same DB.
My problem is that the original database format stores most data in a single blob of text (1 nvarchar(MAX) field). I want to add functionality to search on fields stored in the blob, but it will be cumbersome and slow. I would like to update the database format without changing the desktop app at all, hence the question above.
It occurs to me that I could do this on the client by writing a wrapper class for the data access object and then do a bulk replace in the client code to reference the wrapper, but I want to know what my options are on the server as well.
In case anyone wants to know, the old app is in VB6 and the new in C#.
EDIT
Alright, so it looks like if I do anything on the server side we are looking at adding stored procedures and then updating the client VB6 code to reference the stored procs - something like a bulk replace of SELECT with sp_oldselect ... to return the data in a different format. I'm guessing a client-side wrapper would be the best solution for the time being. Old apps die hard.
You can create a bunch of views for the old client and let it query those views. It will be slow as hell in most cases, but it can 'replace' the select query. For updates and inserts... well... INSTEAD OF triggers on the views could help in some cases, but it will require lots of processing.
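A rough sketch of that idea, assuming the new normalized table is dbo.CustomersNew and the old client still expects an object called Customers (all names and the blob format here are made up):

-- The view presents the old single-blob shape to the legacy client.
CREATE VIEW dbo.Customers
AS
SELECT c.Id,
       c.FirstName + '|' + c.LastName AS DataBlob  -- stand-in for the legacy blob format
FROM dbo.CustomersNew AS c;
GO

-- The INSTEAD OF trigger intercepts the legacy client's INSERTs and splits the
-- blob into the new columns (the parsing below is only a placeholder).
CREATE TRIGGER trg_Customers_Insert
ON dbo.Customers
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.CustomersNew (FirstName, LastName)
    SELECT LEFT(i.DataBlob, CHARINDEX('|', i.DataBlob) - 1),
           SUBSTRING(i.DataBlob, CHARINDEX('|', i.DataBlob) + 1, LEN(i.DataBlob))
    FROM inserted AS i;
END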
However my suggestion is to provide exactly the same functionality in the web app and deprecate the desktop app. When the desktop app's share is low enough, stop supporting it. From this point, you are (mostly) free to add new functions, upgrade the database schema, etc.
I agree with JonH that a lot can go wrong here, but you can try reading up on INSTEAD OF triggers in MS SQL Server here: https://technet.microsoft.com/en-us/library/ms179288(v=sql.105).aspx

Execute multiple Stored Procedures with Quartz.NET

Still trying to wrap my head around Quartz.NET after reading all the tutorials, which seem very code-specific rather than implementation-focused. Here's what I'm trying to do: I have 20 SQL stored procs that do various things, like querying log tables, resubmitting data to other processes, etc. I'd like to have these SPs running throughout the day at regular intervals, so it seems like a natural fit for Quartz.NET. I plan on creating a Windows service that implements Quartz.NET and contains jobs in assemblies in the same folder as the Quartz assembly.
One bad way to implement this, I think, would be to write a single job class for every SP and associate a separate trigger with each one. The job class would simply execute a particular SP whose name was hard-coded in the class. That's the bad way.
But for the life of me I can't figure out what the Good way would be. Obviously having a single job class that just does a generic 'execute SP by name', where the names come from a simple SQL table, seems like the way to go, but how would I get different triggers associated with different SPs, and how would Quartz know to load up all twenty SPs on separate threads?
And how would Quartz know to pickup a changed trigger for example for one of the SPs? Would that have to be a start/stop cycle on the Win Svc to reload jobs and triggers, or would I have to hand code some kind of "reload" too?
Any thoughts? Am I misunderstanding what Quartz is? The verbiage makes it sound like it's an Enterprise Scheduler, a System, a thing you install. All the documentation OTOH makes it seem like just a bunch of classes you stitch together to create your OWN scheduler or scheduling system, no different from the classes MS provides in .NET to create apps that do FTP for example. Maybe I'm expecting too much?
A pretty easy way to fulfill your requirements could be:
Start with sample server
Take the Quartz.NET distribution's server as a starting point; there you have a ready-made template for a Windows service that utilizes TopShelf for easy installation
Use XML configuration with change detection
The quartz.config file contains the actual configuration; there you can see that jobs and triggers are read from the XML file quartz_jobs.xml.
You need to add quartz.plugin.xml.scanInterval = 10 to watch for changes (every ten seconds)
Use trigger job data maps to parameterize the job
You can use the same job class for every trigger if the SQL execution is as trivial as you propose. Just add the needed configuration to the trigger's definition in the XML (the sample here runs every ten seconds; add as many triggers as you want):
<trigger>
  <simple>
    <name>sqlTrigger1</name>
    <job-name>genericSqlJob</job-name>
    <job-group>sqlJobs</job-group>
    <job-data-map>
      <entry>
        <key>sql_to_run</key>
        <value>select 1</value>
      </entry>
    </job-data-map>
    <misfire-instruction>SmartPolicy</misfire-instruction>
    <repeat-count>-1</repeat-count>
    <repeat-interval>10000</repeat-interval>
  </simple>
</trigger>
Just use the quartz_jobs.xml as a base and make the required changes.
Use configuration in your job
You can access the configuration in your job from the context's MergedJobDataMap, which contains both the job's and the trigger's parameters, the latter overriding the former.
public void Execute(IJobExecutionContext context)
{
    string sqlToRun = context.MergedJobDataMap.GetString("sql_to_run");
    SqlTemplate.ExecuteSql(sqlToRun);
}

How should I rename many Stored Procedures without breaking stuff?

My database has had several successive maintainers over the years and any naming guidelines that may have once been in place have been ignored.
I'd like to rename the stored procedures to a consistent format. Obviously I can rename them from within SQL Server Management Studio, but this will not then update the calls made in the website code behind (C#/ASP.NET).
Is there anything I can do to ensure all calls get updated to the new names, short of searching for every single old procedure name in the code? Does Visual Studio have the ability to refactor such stored procedure names?
NB I do not believe my question to be a duplicate of this question as the latter is solely about renaming within the database.
You could make the change in stages:
Copy the old stored procedures to new stored procedures under their new names.
Alter the old stored procedures to call the new ones.
Add logging to the old stored procedures once you've changed all the code in the website (see the sketch after this list).
After a while when you're not seeing any calls to the old stored procedures and you're happy you've found all the calls in the web site, you can remove the old stored procedures and logging.
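A sketch of the "alter the old proc and add logging" steps, with made-up procedure and table names (the old dbo.GetCustomer has been copied to dbo.Customer_Get):

-- Audit table that records every call to a legacy wrapper proc.
CREATE TABLE dbo.LegacySprocAudit
(
    SprocName sysname       NOT NULL,
    CalledAt  datetime      NOT NULL DEFAULT GETUTCDATE(),
    AppName   nvarchar(128) NULL,
    LoginName nvarchar(128) NULL
);
GO

-- The old proc becomes a thin wrapper: log the call, then delegate.
ALTER PROCEDURE dbo.GetCustomer
    @CustomerId int
AS
BEGIN
    INSERT INTO dbo.LegacySprocAudit (SprocName, AppName, LoginName)
    VALUES ('dbo.GetCustomer', APP_NAME(), SUSER_SNAME());

    EXEC dbo.Customer_Get @CustomerId = @CustomerId;
END

Once the audit table stays empty for long enough, the wrapper (and the audit table) can be dropped.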
You can move the 'guts' of the SPROC to a new SPROC meeting your new naming conventions, and then leave the original sproc as a shell / wrapper which delegates to the new SPROC.
You can also add an 'audit' table to track when the old wrapper SPROC is called - this way you will know that there are no dependencies on the old SPROC, and the old SPROC can be safely dropped (also, make sure that it isn't just 'your app' using the DB - e.g. cross database joins or other apps)
This has a small performance penalty, and won't really buy you that much (other than being able to 'find' your new SPROCs easier)
You will need to handle this in at least two areas, the application and the database. There could be other areas as well, and you have to be careful not to overlook them.
The Application
A Nice Practice for Future Projects
It helps to abstract your sprocs out. In our apps, we wrap all of our sprocs in a giant class, so I can make calls like this:
Dim SomeData as DataTable = Sprocs.sproc_GetSomeData(5)
That way, the code end is nice and encapsulated. I can go into Sprocs.sproc_GetSomeData and tweak the sproc name in just one place, and of course I can right click on the method and do a symbolic rename to fix the method call solution-wide.
Without the Abstraction
Without that abstraction, you can just do Find In Files (Ctrl+Shift+F) for the sproc name and then, if the results look right, open the files up and Find/Replace all the occurrences.
The Sql Server
Don't Trust View Dependencies
On the SQL Server end, theoretically in SSMS 2008 you can right click on a sproc and select View Dependencies.
That should show you a list of all the places where the sproc is used in the database, however my confidence in this feature is very low. It might be better in SQL 2008, but in previous versions it definitely had problems.
View Dependencies hurt me, and it will take time for that to heal. :)
Wrap It!
You end up having to keep the old sproc around for a while. This is the major reason why renaming sprocs is such a project - it can take a month to finally be done with it.
First replace its contents with some simple T-SQL that calls the new sproc with the same parameters, and write some logging so that once some time goes by, you can tell whether the old sproc is actually unused.
Finally, when you're sure the old sproc is unused, delete it.
Other Areas?
There could be a lot of other areas as well. Reporting Services springs to mind. SSIS packages. Using the technique of keeping the old sproc around and re-routing to the new one (mentioned above) will help you know if you missed anything, however it won't tell you what you missed. This can lead to much pain!
Good luck!
Short of testing every path in your application to ensure that any calls to the database and the relevant stored procedures have been updated... no.
Use global search and replace (but review each suggested replacement) to try to avoid missing any instances. If your app is well structured then there really should only be one place each stored proc is called.
As far as changing your application, I have all my stored procs as settings in the web.config file, so all the names are in one place and can be changed at any time to match changes to the database.
When the application needs to call a stored proc, the name is determined from web.config.
This makes it easier to manage all the potential calls which the application could make to the database services layer.
It will be a bit of a tedious search through your source code and other database objects I'm afraid.
Don't forget SSIS Packages, SQL Agent Jobs, Reporting Services rdl as well as your main application code.
You could use a regular expression like spProc1|spProc2 to search in the source code for all object names at the same time if you have a tool that supports searching through files using regular expressions (I have used RegexBuddy for this in the past)
If you just want to cover the possibility that you might have missed the odd one, you could leave all the previous stored procedures behind for a month and have them log a custom SQL trace event with APP_NAME(), SUSER_NAME() and any other info you find helpful, then have them call the renamed version. Then set up a trace monitoring this event.
If you use a connection to the DB, stored procedures, etc., you should create a service class to delegate these calls.
This way, when something in your database, SPs, etc. changes, you only have to update your service class, and everything is protected from breaking.
There are tools for VS that can manage changing a name, like Refactor and ReSharper.
I did this and I relied heavily on global search in my source code for stored procedure names, and on SQL Digger to find SQL procs that call other SQL procs.
http://www.sqldigger.com/
SQL Server (as of SQL 2000) poorly understands its own dependencies, so one is left searching the text of the scripts to find dependencies, which could be other stored procs or substrings of dynamic SQL.
I would obtain a list of references to a procedure by using the following, because SSMS dependencies doesn't pickup dynamic SQL references or references outside the database.
SELECT OBJECT_NAME(m.object_id), m.*
FROM SYS.SQL_MODULES m
WHERE m.definition LIKE N'%my_sproc_name%'
The SQL needs to be run in every database where there could be references.
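If there are several databases to cover, one (admittedly hacky) option is the undocumented sp_MSforeachdb procedure - a sketch:

-- Runs the module search in every database on the instance. sp_MSforeachdb is
-- undocumented, so treat this as a convenience, not something to build on.
EXEC sp_MSforeachdb N'
    USE [?];
    SELECT DB_NAME() AS database_name, OBJECT_NAME(m.object_id) AS object_name
    FROM sys.sql_modules AS m
    WHERE m.definition LIKE N''%my_sproc_name%'';';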
syscomments and INFORMATION_SCHEMA.ROUTINES have nvarchar(4000) columns, so if "mySprocName" is used at position 3998, it won't be found. syscomments does at least span multiple rows, but ROUTINES truncates. Should you disagree, take it up with gbn.
Based on that list of dependencies, I'd create the new stored procedures starting with the foundation stored procedures - those with the fewest dependencies. But I'd be careful not to create stored procedures with names prefixed with "sp_".
Verify the foundation procedures work identically to the existing ones.
Move to the next level of stored procedures - repeat steps 1-3 as needed until the highest-level procedure has been processed.
Test switching the application over to the new procedures - don't wait until all the procedures are updated to test the interaction with the application code. This doesn't need to be done for every stored procedure, but waiting to do it wholesale isn't a great approach either.
Developing in parallel has its risks too:
Any changes to existing code need to also be applied to the new code. If possible, work in areas where development is frozen, or use a bug fix as an opportunity to migrate to new code rather than applying the patch in two places (while also minimizing downtime for the transition).
Use a utility like FileSeek to search the contents of each and every file in your project folder. Don't trust the Windows search - it's slow and user-unfriendly.
So if you had a stored procedure named OldSprocOne and want to rename it to SP_NewOne, search for all occurrences of OldSprocOne, then search for all occurrences of SP_NewOne to make sure the new name isn't already being used somewhere else and won't cause problems. Then rename each and every occurrence in the code.
This can be very time consuming and repetitive for larger systems.
Rather than worrying about the names of the procedures, I would be more concerned with ignoring them entirely and replacing your legacy DAL with the Enterprise Library Data Access Block 5
Database Accessors in Enterprise Library 5 DAAB - Database.ExecuteSprocAccessor
Having code that is like
public Contact FetchById(int id)
{
    return _database.ExecuteSprocAccessor<Contact>
        ("FetchContactById", id).SingleOrDefault();
}
will have at least a billion times more value than having stored procs with consistent names, especially if the current code passes around DataTables or DataSets ::shudders::
I'm all in favor of refactoring any sort of code.
What you really need here is a method of slowly and incrementally renaming your stored procs.
I certainly would not do a global find and replace.
Rather, as you identify small pieces of functionality and understand the relationships between the procs, you can re-factor in small pieces.
Fundamental to this process, though, is source-code control of your database.
If you do not manage changes to your database the same as normal code, you will be in serious trouble.
Have a look at DBSourceTools. http://dbsourcetools.codeplex.com
It's specifically designed to help developers get their databases under source code control.
You need a repeatable method of restoring your database to a specific state - prior to refactoring.
Then re-apply your refactored changes in a controlled way.
Once you have embraced this mindset, this mammoth and error-prone task will become simple.
This is assuming that you use SQL Server 2005 or above. An option that I have used before is to rename the old database object and create a SQL Server synonym with the old name. This allows you to update your objects to whatever convention you choose and replace the references in code, SSIS packages, etc. as you come across them. Then you can concentrate on updating the references in your code gradually over however many maintenance releases you choose (as opposed to breaking them all at once). Once you feel that you've found all the references, you can remove the synonym as the code goes to QA.
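A minimal sketch of the synonym approach (object names are made up):

-- Rename the procedure to the new convention...
EXEC sp_rename 'dbo.GetCustomer', 'Customer_Get';
GO

-- ...and leave a synonym behind under the old name so existing callers keep working.
CREATE SYNONYM dbo.GetCustomer FOR dbo.Customer_Get;
GO

-- Once every reference has been updated, drop the synonym.
-- DROP SYNONYM dbo.GetCustomer;

One caveat: sp_rename doesn't update the procedure's stored definition text, so scripting a DROP/CREATE under the new name is often the cleaner way to do the rename itself.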

Why django checks whether settings.DATABASE_NAME db actually exists for running testcases?

I will be frequently running test cases for my Django project. But one fine day it occurred to me that Django actually checks that the settings.DATABASE_NAME database exists when running test cases.
Why is this so? All I thought was that Django takes settings.DATABASE_NAME and creates a test DB called 'test_' + settings.DATABASE_NAME. Does it also check whether a database with name = settings.DATABASE_NAME actually exists (before creating the test DB)? Ideally speaking, only the name should be checked, not the actual existence of the DB, right?
I browsed through the Django source code and found out that the "connection" used to create the test DB is actually created using the DATABASE settings options. All it should be bothered about is the settings' values, not their actual existence. Right?
Neat question... you know, this had never occurred to me. The short answer is that Django itself doesn't need to verify that the DATABASE_NAME actually exists, but it does need to connect to the database in order to create the test database. Most databases accept (and some require) the DATABASE_NAME in order to formulate the connection string; often this is because the database name to which you're connecting contributes to the permissions for your connection session.
Because the test database doesn't exist yet, django has to first connect using the normal settings.DATABASE_NAME in order to create the test database.
So, it works like this:
Django's test runner passes off to the backend-specific database handler
The backend-specific database handler has a function called create_test_db which will use the normal settings to connect to the database. It does this using a plain cursor = self.connection.cursor() command, which obviously uses the normal settings values because that's all it knows to be in existence at this point.
Once connected to the database, the backend-specific handler will issue a CREATE DATABASE command with the name of the new test database.
The backend-specific handler closes the connection, then returns to the test runner, which swaps the normal settings.DATABASE_NAME for the test_database_name
The test will then run as normal. All subsequent calls to connection.cursor() will use the normal settings module, but now that module has the swapped out database name
At the end, the test runner restores the old database name after calling the backend-specific handler's destroy_test_db function.
If you're interested, the relevant code for the main part is in django.db.backends.creation. Have a look at the _create_test_db function.
I suppose that it would be possible for the Django designers to make exceptions on a db-by-db basis, since not every DB needs the current database name in the connection string, but that would require a bit of refactoring. Right now, the create_test_db function is actually in one of the backend base classes, and most actual backend handlers don't override it, so there'd be a fair amount of code to push downstream and duplicate in each backend.

Re-Running Database Development Scripts

In our current database development environment we have automated build processes that check all the SQL code out of SVN, create database scripts, and apply them to the various development/QA databases.
This is all well and good, and is a tremendous improvement over what we did in the past, but we have a problem with re-running scripts. Obviously this isn't a problem with some scripts, like altering procedures, because you can run them over and over without adversely affecting the system. Right now, to add metadata and run statements like create/alter table statements, we add code to check and see if the objects exist, and if they do, we don't run them.
Our problem is that we really only get one shot to run the script, because once the script has been run, the objects are in the environment and the system won't run the script again. If something needs to change once it's been deployed, we have a difficult process of running update scripts against the update scripts and hoping that everything falls in the correct order and all of the PKs line up between the environments (the databases are, shall we say, "special").
Short of dropping the database and starting the process from scratch (the last most current release), does anyone have a more elegant solution to this?
I'm not sure how best to approach the problem in your specific environment, but I'd suggest reading up on Rails' migrations feature for some inspiration on how to get started.
http://wiki.rubyonrails.org/rails/pages/UnderstandingMigrations
We address this - or at least a similar problem to this - as follows:
The schema has a version number - this is represented by a table which has one row per version which, as well as the version number, carries boring things like a date/time stamp for when that version came into existence.
By having the schema create/modify DDL wrapped in code that performs the changes for us.
In the context above, one would build the schema change code as part of the build process, then run it, and it would only apply schema changes that haven't already been applied.
In our experience (which is bound not to be representative) in most cases the schema changes are sufficiently small/fast that they can safely be run in a transaction which means that if it fails we get a rollback and the db is "safe" - although one would always recommend taking backups before applying schema updates if practicable.
I evolved this out of nasty, painful experience. It's not a perfect system (or an original idea), but as a result of working this way we have a high degree of confidence that if there are two instances of one of our databases with the same version, then the schema for those two databases will be the same in almost all respects, and that we can safely bring any DB up to the current schema for that application without ill effects. (That last isn't 100% true, unfortunately - there's always an exception - but it's not too far from the truth!)
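A bare-bones version of the version-table idea (the table, version number and guarded change below are all just illustrative):

CREATE TABLE dbo.SchemaVersion
(
    VersionNumber int      NOT NULL PRIMARY KEY,
    AppliedAt     datetime NOT NULL DEFAULT GETDATE()
);
GO

-- Each change script is guarded by the version check, so the build can re-run
-- the whole set of scripts and only the missing versions are applied.
IF NOT EXISTS (SELECT 1 FROM dbo.SchemaVersion WHERE VersionNumber = 42)
BEGIN
    BEGIN TRANSACTION;

    ALTER TABLE dbo.Widgets ADD colour varchar(20) NULL;  -- the actual schema change

    INSERT INTO dbo.SchemaVersion (VersionNumber) VALUES (42);

    COMMIT TRANSACTION;
END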
Do you keep your existing data in the database? If not, you may want to look at something similar to what Matt mentioned for .NET called RikMigrations
http://www.rikware.com/RikMigrations.html
I use that on my projects to update my database on the fly, while keeping track of revisions. Also, it makes it very simple to move database schema to different servers, etc.
If you want re-runnability in your scripts, then you can't have them as definitions... what I mean by this is that you need to focus on change scripts rather than "here is my table" scripts.
let's say you have a table Customers:
create table Customers (
    id int identity(1,1) primary key,
    first_name varchar(255) not null,
    last_name varchar(255) not null
)
and later you want to add a status column. Don't modify your original table script; that one has already run (and can have the if(! exists) syntax to prevent it from causing errors when run again).
Instead, have a new script, called add_customer_status.sql
in this script you'll have something like:
alter table Customers
    add status varchar(50) null
go

update Customers set status = 'Silver' where status is null
go

alter table Customers
    alter column status varchar(50) not null
Again you can wrap this with an if(! exists) block to allow re-running, but here we've leveraged the notion that this is a change script, and we adapt the database accordingly. If there is data already in the customers table then we're still okay, since we add the column, seed it with data, then add the not null constraint.
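For the re-runnable variant, the guard can check the system catalog before applying the change - a sketch against the Customers table above (the data seeding and NOT NULL steps would follow in their own batches):

-- Only add the column if it isn't there yet, so the script can be run repeatedly.
IF NOT EXISTS (SELECT 1 FROM sys.columns
               WHERE object_id = OBJECT_ID(N'dbo.Customers')
                 AND name = N'status')
BEGIN
    ALTER TABLE dbo.Customers ADD status varchar(50) NULL;
END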
Both of the migration frameworks mentioned above are good, I've also had excellent experience with MigratorDotNet.
Scott named a couple of other SQL tools that address the problem of change management. But I'm still rolling my own.
I would like to second this question, and add my puzzlement that there is still no free, community-based tool for this problem. Obviously, scripts are not a satisfactory way to maintain a database schema; neither are instances. So, why don't we keep metadata in a separate (and while we're at it, platform-neutral) format?
That's what I'm doing now. My master database schema is a version-controlled XML file, created initially from a simple web service. A simple JavaScript program compares instances against it, and a simple XSL transform yields the CREATE or ALTER statements. It has limits, like RikMigrations; for instance it doesn't always sequence inter-dependent objects correctly. (But guess what - neither does Microsoft's SQL Server Database Publication tool.) Really, it's too simple. I simply didn't include objects (roles, users, etc.) that I wasn't using.
So, my view is that this problem is indeed inadequately addressed, and that sooner or later we'll have to get together and tackle the devilish details.
We went the 'drop and recreate the schema' route. We had some classes in our JUnit test package which parameterized the scripts to create all the objects in the schema for the developer executing the code. This allowed all the developers to share one test database and everyone could simultaneously create/test/drop their test tables without conflicts.
Did it take a long time to run? Yes. At first we used the setup method for this, which meant the tables were dropped/created for every test, and that took way too long. Then we created a TestSuite which could be run once before all the tests for a class and then cleaned up when all the class tests were complete. This still meant that the DB setup ran many times when we ran our 'AllTests' class, which included all the tests in all our packages.
How I solved it was by adding a semaphore to the OracleTestSuite code, so when the first test requested the database to be set up it would do that, but any subsequent call would just increment a counter. As each tearDown() method was called, the counter would be decremented until it reached 0, at which point the OracleTestSuite code would drop everything. One issue this leaves is whether the tests assume that the database is empty. It can be convenient to let database tests know the order in which they run so they can take advantage of the state of the database, because it can reduce the duplication of DB setup.
We used the concept of ObjectMothers to solve a similar problem with creating complex domain objects for testing purposes. Mock objects might be a better answer but we hadn't heard about them at the time. After all this time, I'd recommend creating test helper methods that could create standardized datasets for the typical scenarios. Plus that would help document the important edge cases from a data perspective.
