We're migrating from an old version of MS SQL Server to a much more recent version. The trouble is, we have too many equally old stored procedures that won't run in the new version; in particular, many use the "*=" notation instead of "LEFT JOIN" or "RIGHT JOIN". Upgrading them by hand would be extremely error-prone and time-consuming (just one of the databases I'm in charge of has 900+ SPs, and I've yet to check the other four!), so I'm wondering if there's any software out there that can upgrade all these procedures. Any help would be greatly appreciated!
You don't mention the target version of SQL Server, but I believe all of the recent versions have a corresponding SQL Server Upgrade Advisor. I believe this tool will be useful, as it will identify all of the places you use the obsolete syntax.
Assuming you only have server-side code (and no dynamic SQL), this would be useful.
However, I don't think you will find any tool that can do this automatically for all cases -- the old-style syntax was at times ambiguous, and there are other possible problems.
You might find this article useful, though, as it shows how you can use SSMS to convert an old-style join. It is not perfect, and it does not always yield the highest-performance equivalent join, but it may be better than nothing.
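For reference, this is roughly what the conversion looks like. The table and column names here are hypothetical, but the pattern is the one the tools have to recognise:

```sql
-- Old-style T-SQL outer join ("*=" means LEFT OUTER JOIN); rejected on newer versions:
SELECT c.Name, o.OrderDate
FROM Customers c, Orders o
WHERE c.CustomerID *= o.CustomerID;

-- ANSI equivalent that runs on all supported versions:
SELECT c.Name, o.OrderDate
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID;
```

The ambiguity mentioned above comes in when the old WHERE clause mixes join conditions and filter conditions: depending on intent, a filter may belong in the ON clause or in the WHERE clause, and the two give different results for an outer join, which is why a purely mechanical conversion can't always be trusted.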
I never used the old-style joins myself on MS as my first version of MS-SQL was 7.0.
For some of our environments, we have upgraded to SQL Server 2017. What we do is take a SQL 2012 database backup and restore it on SQL 2017 through some automated scripts. The same was the case when these databases were migrated to SQL 2012 from SQL 2008 R2. But the compatibility level of our databases is still 2008 and 2012, because we don't update the compatibility level of a database during migration. The sole reason is the difference in memory architecture among these SQL versions.
So can anyone help me understand the possible impacts if we upgrade the compatibility level of our databases to 2017 from 2012 or 2008? And is there any parallel feature that we'll have to enable/use in order to have minimal impact from the compatibility level change on our environment?
Any help in understanding this would be appreciated.
Adding as an answer as I was running out of space as a comment:
I suspect this question is too broad to be able to provide a comprehensive answer. The best approach might be to simply try the change in a test environment and see what happens. What impact there is will depend largely on your type of workload and what features you're using.
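As a sketch of that test-first approach (the database name here is hypothetical):

```sql
-- See where each database currently stands:
SELECT name, compatibility_level
FROM sys.databases;

-- In the test environment, raise the level and re-run the workload.
-- 140 is the level corresponding to SQL Server 2017.
ALTER DATABASE MyDb SET COMPATIBILITY_LEVEL = 140;
```

The change takes effect immediately and can be reverted the same way, which makes it cheap to flip back and forth while comparing query plans.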
In my experience, the biggest catch in these cases when going from a pre-2014 version to a later one is the change to the cardinality estimator. The average query will probably get a little faster, but you probably won't notice that, because the real problems it solves will have been optimised away. On the other hand, there will be a handful of queries that get slower, and you'll need to identify and fix those. As a quick short-term fix you can use the hint OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION')).
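Applied to a hypothetical query, the fallback looks like this (the database-scoped option is an alternative available from SQL Server 2016 onwards):

```sql
-- Per-query fallback to the pre-2014 cardinality estimator:
SELECT o.CustomerID, COUNT(*) AS OrderCount
FROM dbo.Orders o
GROUP BY o.CustomerID
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

-- Or flip the whole database back to the legacy estimator while you investigate:
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;
```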
If you're using MERGE statements, there's a risk those might start throwing errors, as there are a few known bugs where the query deadlocks itself; in a big ETL pipeline we had to switch to dropping some indexes before merging and rebuilding them afterwards.
I'm just starting out with MS SQL Server (2008 Express) and spent the last hour or so trying out some basic queries. How does SQL scripting (correct term?) compare to general computer programming? Is writing SQL queries as involved, as lengthy to learn, in need of as much practice, etc., as software development? Is writing advanced SQL queries comparable to software development, at least to some degree, and if so, could you explain how?
Also, I found a couple of tutorial and reference sites for learning SQL, but could you recommend some other sites too?
Also (2), I was playing around with the SQL query designer in MS SQL, and it seems like a good tool for learning to write SQL commands, but is there a way to use the designer to INSERT data? It seems that it's only for SELECTing data.
Any other general comments for learning, better understanding, etc. of SQL would be appreciated. Thanks!
First of all, SQL is more about databases and less about programming, in the sense that you cannot succeed just by "writing good queries": you must also know how to structure your data, make optimized tables, choose appropriate types, etc. You can spend a day thinking about how your data will be stored without really writing any queries. SQL is not a way to solve an abstract problem, but a way to store and retrieve data efficiently and safely. For example, making maintenance and backup plans is purely a DBA job, and has nothing to do with SQL queries.
Is it lengthy to learn? Well, here it is quite similar to general development. Basic SQL syntax is pretty simple, so after reading the first pages of a SQL book, you will probably be able to insert, retrieve and remove data. But to master SQL and database work, you must be ready to spend years of practice. Just like CSS: writing CSS is easy. Mastering it is hard.
Some advice:
Take security into account.
You communicate with SQL Server by sending strings, and the server must interpret them. The big mistake is to let the end user participate in building your queries: it leads to security holes, giving an attacker the ability to do whatever they want with your data (this is called SQL injection). It's just like letting everyone write any source code they want and execute it on your machine. In practice, nobody lets a third party run arbitrary code on their machine, but plenty of developers forget to sanitize user input before querying the database.
Use parametrized queries and stored procedures.
You may want to start using parametrized queries as soon as possible. They will increase security, improve performance, and somewhat force you to write better queries (even if that last point is debatable).
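A minimal sketch of what that looks like on SQL Server, using sp_executesql (the table and column names are made up for illustration):

```sql
-- The user-supplied value travels as a typed parameter, never as part of the
-- SQL text, so a name like O'Brien cannot break out of the query.
DECLARE @name nvarchar(50) = N'O''Brien';  -- pretend this came from user input

EXEC sp_executesql
    N'SELECT CustomerID FROM dbo.Customers WHERE LastName = @LastName',
    N'@LastName nvarchar(50)',
    @LastName = @name;
```

The same idea applies from client code: every data access library for SQL Server (ADO.NET, ODBC, etc.) has a parameter mechanism that ends up going through the same path.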
After learning SQL for a few weeks or months, you may also want to learn what stored procedures are and how to use them. They have their strong points, but don't make a mistake I made when I started learning SQL: do not decide to use stored procedures everywhere.
Use frameworks.
If you are a .NET developer, learn to use LINQ to SQL. If you have already used LINQ on .NET objects (lists, collections, etc.), it can be very helpful for figuring out how to do some queries. By the way, remember you can write LINQ queries and see how LINQ transforms them into SQL queries.
Keep in mind that using a framework or an abstraction layer will not make you a database guru. Use it as a helpful tool, but sometimes do the SQL work yourself. Yes, it can free you from writing SQL queries by hand, even on large-scale projects (for example, Stack Overflow uses LINQ to SQL). But sooner or later, you will either need to work on a project which does not use LINQ, or you will run into limitations of LINQ versus plain SQL.
Borrow a book.
Seriously, if you want to learn this stuff, buy or borrow a book. Tutorials will explain how to do one precise thing, but you will miss the opportunity to learn something you never thought about. For example, database partitioning or mirroring is something you must know if you want to work as a DBA. Any book about databases will talk about partitioning; on the other hand, few tutorials will lead you to this subject by themselves.
Test, evaluate, profile.
SQL is about optimized queries. Anybody can write a SELECT statement, but many people will write it in a non-optimized form.
If you are dealing with a few-kilobyte database containing at most a hundred records, all your queries will perform well; but when things scale up, you will notice that a simple SELECT query takes three seconds on a table with a few billion rows instead of a few milliseconds.
To learn how to write optimized queries and create optimized databases, try to work on large data sets. The AdventureWorks demo database from Microsoft is a good starting point, but you may also sometimes need to fill the database with random data just to have enough of it to measure performance correctly.
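One common trick for generating that filler data (table name hypothetical) is to cross-join a couple of system catalog views, which yields millions of rows without any loop:

```sql
-- Create and fill a test table with 500,000 rows of throwaway data:
SELECT TOP (500000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Id,
    NEWID() AS RandomValue
INTO dbo.PerfTest
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;
```

With a table that size you can actually see the difference an index makes, which is invisible on a hundred rows.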
Use Microsoft SQL Server Profiler (included with SQL Server 2008 Enterprise). It will help you see what the server is really doing and how fast, and find bottlenecks and poorly written queries.
Learn from others.
Reading a book is a good starting point, but it is not enough. Read the material on Stack Overflow, especially questions related to developers doing DBA work. For example, see the "Database Development Mistakes Made by App Developers" question, and return to its answers from time to time while learning SQL.
If you have a precise question (a query which does not produce what you expected, a strange performance issue, etc.), feel free to ask it on Stack Overflow. The community is great, and there are plenty of people who know this stuff extremely well.
Sometimes, talking to a DBA in your company (if there is one) can also be an opportunity to learn things.
is there a way to use the designer to INSERT data? It seems that it's only for SELECTing data
If I remember well, the query designer in Visual Studio lets you build INSERT statements too. But maybe I'm wrong. In any case, you can use Microsoft SQL Server Management Studio (included with Microsoft SQL 2008 Enterprise), which lets you see how some queries are built (right-click an element in Object Explorer, then use the "Script database as..." menu).
I think you'll find that the key issue is that SQL is declarative, unlike most computing languages you're likely familiar with. This is fundamental. Grab any computer science textbook and start there.
SQL is no more or less difficult than anything else in my view. Historically it was an area which people would tend to specialize in, but that was a consequence of the technology available at the time. It's now more accessible and the tools are significantly better, so expertise is generally spread more widely now.
It is different: SQL programming is quite restricted. When writing complex logic you might find it cumbersome, with its limited programming constructs, unclear code (as there is no modular programming), and poor implementations of features like cursors.
I read somewhere on SO that a database is not for coding; it's only for storing and querying data. That's well said, in some sense.
What I believe is important to learn in that area is, first, knowing all the features available in the DB so that you use it efficiently, and second, improving your querying/analytical skills.
Basic SQL features can be learnt from W3Schools (joins, grouping, etc.).
Advanced DB features can be learnt from your DBMS certification exam book (the most basic certification exam, be it Oracle or SQL Server).
For analytical skills and some fun: the SQL puzzles by Joe Celko.
I came to know about some uses of undocumented stored procedures (such as sp_MSforeachdb, etc.) in MS SQL.
What are they? And why are they 'undocumented'?
+1 on precipitous. These procs are generally used by replication or the management tools; they are undocumented because the dev team reserves the right to change them at any time. Many have changed over the years, especially in SQL 2000 SP3 and SQL 2005.
My speculation would be that it's because they are used internally, are not supported, and might change. sp_who2 is another one that I find very handy. Maybe the Management Studio activity monitor uses it -- same output. Did I mention undocumented probably means unsupported? Don't depend on these to stick around or produce the same results next year.
sp_MSforeachdb and sp_MSforeachtable are unlikely to change.
Both can be used like this:
EXEC sp_MSforeachtable "print '?'; DBCC DBREINDEX ('?')"
where the question mark '?' is replaced by the table name (or the DB name in the other SP).
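A couple of sp_MSforeachdb invocations following the same pattern:

```sql
-- '?' is replaced by each database name in turn:
EXEC sp_MSforeachdb 'PRINT ''?''';

-- Run a consistency check against every database on the instance:
EXEC sp_MSforeachdb 'DBCC CHECKDB (''?'')';
```

Note the doubled single quotes: the command you pass is itself a string, so any literal inside it has to be escaped.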
Undocumented means unsupported and that MS reserves the right to change or remove said commands at any time without any notice whatsoever.
Any documented feature goes through two deprecation stages before it is removed. An undocumented command can be removed, even in a service pack or hotfix, without any warning or announcement.
Most likely, one of the internal SQL Server developers needed these stored procedures to implement the functionality they were working on, so they developed them and used them in their code. When working with the technical documentation people, they covered the scope of their project and included in the official documentation only the portion of it that applied to the customer. Over time, people found the extra stored procedures (because you can't hide them) and started using them. While the internal SQL Server developers wouldn't want to change these undocumented procedures, I'm sure they would in two seconds if they had to for their next project.
As others have said, they are unsupported features that are not intended for general consumption, although they can't stop you from having a go, and indeed, sometimes they can be very useful.
But as internal code, they might have unexpected side-effects or limitations, and may well be here one day and gone the next.
Use them carefully if you wish, but don't rely on them entirely.
In my current project, the DB is SQL 2005 and the load is around 35 transactions/second. The client is expecting more business and are planning for 300 transactions/second. Currently even with good infrastructure, DB is having performance issues. A typical transaction will have at least one update/insert and a couple of selects.
Have you guys worked on any systems that handled more than 300 txn/s running on SQL 2005 or 2008? If so, what kind of infrastructure did you use, and how complex were the transactions? Please share your experience. Someone has already suggested using Teradata, and I want to know if this is really needed or not. Not my job exactly, but I'm curious how much SQL Server can handle.
It's impossible to tell without performance testing -- it depends too much on your environment (the data in your tables, your hardware, the queries being run).
According to tpc.org, it's possible for SQL Server 2005 to reach 1,379 transactions per second. Here is a link to a system that's done it. (There are SQL Server-based systems on that site that have far more transactions... the one I linked was just the first one I looked at.)
Of course, as Kragen mentioned, whether you can achieve these results is impossible for anyone here to say.
Infrastructure needs for high-performance SQL Servers may be very different from your current setup.
But if you are currently having issues, it is very possible that the main part of your problem lies in bad database design and bad query design. There are many ways to write poorly performing queries, and in a high-transaction system you can't afford any of them: no SELECT *, no cursors, no correlated subqueries, no badly performing functions, no WHERE clauses that aren't sargable, and so on.
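As one concrete illustration of the sargability point (table and columns hypothetical): wrapping an indexed column in a function forces a scan, while rewriting the predicate so the bare column is compared against a range allows an index seek.

```sql
-- Non-sargable: the function on the column defeats any index on OrderDate.
SELECT OrderID FROM dbo.Orders
WHERE YEAR(OrderDate) = 2009;

-- Sargable rewrite: the column stands alone, so an index seek is possible.
SELECT OrderID FROM dbo.Orders
WHERE OrderDate >= '20090101' AND OrderDate < '20100101';
```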
The very first thing I'd suggest is to get yourself several books on SQL Server performance tuning and read them. Then you will know where your system's problems are likely to be and how to actually determine that.
An interesting article:
http://sqlblog.com/blogs/paul_nielsen/archive/2007/12/12/10-lessons-from-35k-tps.aspx
I know there have been questions in the past about SQL 2005 versus Lucene.NET, but since 2008 came out with a lot of changes, I was wondering if anyone can give me pros/cons (or a link to an article).
SQL Server FTS is going to be easier to manage for a small deployment. Since FTS is integrated with the DB, the RDBMS handles updating the index automatically. The con here is that you don't have an obvious scaling solution short of replicating DB's. So if you don't need to scale, SQL Server FTS is probably "safer". Politically, most shops are going to be more comfortable with a pure SQL Server solution.
On the Lucene side, I would favor SOLR over straight-up Lucene. With either solution you have to do more work yourself: updating the index when the data changes, and mapping data yourself to the SOLR/Lucene index. The pros are that you can easily scale by adding additional indexes. You could run these indexes on very lean Linux servers, which eliminates some license costs. If you take the Lucene/SOLR route, I would aim to put ALL the data you need directly into the index, rather than putting pointers back to the DB in the index. You can include data in the index that is not searchable, so for example you could have pre-built HTML or XML stored in the index and serve it up as a search result. With this approach your DB could be down but you would still be able to serve up search results in a disconnected mode.
I've never seen a head-to-head performance comparison between SQL Server 2008 and Lucene, but would love to see one.
I built a medium-size knowledge base (maybe 2GB of indexed text) on top of SQL Server 2005's FTS in 2006, and have now moved it to 2008's iFTS. Both situations have worked well for me, but the move from 2005 to 2008 was actually an improvement for me.
My situation was NOT like StackOverflow's, in the sense that I was indexing data that was only refreshed nightly; however, I was trying to join search results from multiple CONTAINSTABLE statements back into each other and into relational tables.
In 2005's FTS, this meant each CONTAINSTABLE would have to execute its search on the index, return the full results and then have the DB engine join those results to the relational tables (this was all transparent to me, but it was happening and was expensive to the queries). 2008's iFTS improved this situation because the database integration allows the multiple CONTAINSTABLE results to become part of the query plan which made a lot of searches more efficient.
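The pattern being described looks roughly like this (table, column, and search terms hypothetical): each CONTAINSTABLE produces a keyset with a relevance RANK that has to be joined back to the base table, and under 2005 each of those joins materialised the full full-text result first.

```sql
-- CONTAINSTABLE returns a [KEY]/[RANK] rowset that you join back to the base table:
SELECT a.Title, ft.[RANK]
FROM CONTAINSTABLE(dbo.Articles, Body, 'database NEAR performance') AS ft
JOIN dbo.Articles AS a ON a.ArticleID = ft.[KEY]
ORDER BY ft.[RANK] DESC;
```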
I think that both 2005 and 2008's FTS engines, as well as Lucene.NET, have architectural tradeoffs that are going to align better or worse to a lot of project circumstances - I just got lucky that the upgrade worked in my favor. I can completely see why 2008's iFTS wouldn't work in the same configuration as 2005's for the highly OLTP nature of a use case like StackOverflow.com. However, I would not discount the possibility that the 2008 iFTS could be isolated from the heavy insert transaction load... but it also sounds like it could be as much work to accomplish that as move to Lucene.NET ... and the cool factor of Lucene.NET is hard to ignore ;)
Anyway, for me, the ease and efficiency of SQL 2008's iFTS in the majority of situations probably edges out Lucene's 'cool' factor (though it is easy to use, I've never used it in a production system, so I'm reserving comment on that). I would be interested to know how much more efficient Lucene is (or has turned out to be? is it implemented now?) in StackOverflow or similar situations.
This might help:
https://blog.stackoverflow.com/2008/11/sql-2008-full-text-search-problems/
I haven't used SQL Server 2008 personally, though based on that blog entry, it looks like the full-text search functionality is slower than it was in 2005.
We use both full-text search options, but in my opinion it depends on the data itself and your needs.
We scale with web servers, and therefore I like Lucene, because I don't put that much load on the SQL Server.
If you're starting from scratch and want full-text search, I would prefer the SQL Server solution, because I think it is really fast to get results; with Lucene you have to implement more at the start (and also acquire some know-how).
One consideration that you need to keep in mind is what kind of search constraints you have in addition to the full-text constraint. If you are doing constraints that lucene can't provide, then you will almost certainly want to use FTS. One of the nice things about 2008 is that they improved the integration of FTS with standard sql server queries so performance should be better with mixed database and FT constraints than it was in 2005.
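A sketch of such a mixed query (table and columns hypothetical), where the full-text predicate sits alongside ordinary relational filters and the optimizer can weigh both:

```sql
-- Full-text predicate combined with ordinary relational constraints:
SELECT p.ProductID, p.Name
FROM dbo.Products AS p
WHERE CONTAINS(p.Description, 'waterproof')
  AND p.Price < 50
  AND p.InStock = 1;
```

With an external engine like Lucene, the Price and InStock filters would either have to be duplicated into the index or applied in a second pass after the full-text results come back.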