MS SQL SERVER: sql queries compared to general programming - sql-server

i'm just starting out with ms sql server (2008 express) and spent the last hour or so trying out some basic queries. how does sql scripting (correct term?) compare to general computer programming? is writing sql queries as involved, lengthy to learn, needs as much practice, etc. as software development? is writing advanced sql queries comparable to software development, at least to some degree, and if so could you explain that?
also, i found a couple of tutorial and reference sites for learning sql but could you also recommend some other sites.
also(2), i was playing around with the sql query designer in msSQL and it seems like a good tool to learn writing sql commands, but is there a way to use the designer to INSERT data. it seems that it's only for SELECT ing data.
any other general comments for learning, better understanding, etc SQL would be appreciated. thanks

First of all, SQL is more about databases and less about programming, in that you cannot succeed just by "writing good queries": you must also know how to structure your data, design efficient tables, choose appropriate types, etc. You can spend a day thinking about how your data will be stored without writing a single query. SQL is not a way to solve an abstract problem, but a way to store and retrieve data efficiently and safely. For example, making maintenance and backup plans is purely a DBA job, and has nothing to do with SQL queries.
Is it lengthy to learn? Well, here, it is quite similar to general development. Basic SQL syntax is pretty simple, so after reading the first page of a SQL book, you will probably be able to insert, retrieve and remove data. But to master SQL and databases, you must be ready to spend years practicing. Just like CSS: writing CSS is easy. Mastering it is hard.
Some advice:
Take security into account.
You communicate with SQL Server by sending strings, and the server must interpret them. The big mistake is to let the end user participate in building your queries: it leads to security holes, with the ability for an attacker to do whatever they want with your data (it's called SQL injection). It's just like letting everyone write any source code they want and execute it on your machine. In practice, nobody lets a stranger run arbitrary code on their machine, but plenty of developers forget to sanitize user input before querying the database.
Use parametrized queries and stored procedures.
You may want to start using parametrized queries as soon as possible. They increase security, can improve performance, and somehow force you to write better queries (even if that is debatable).
After learning SQL for a few weeks or months, you may also want to learn what stored procedures are and how to use them. They have their strong points, but don't make the mistake I made when I started learning SQL: do not decide to use stored procedures everywhere.
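To make the injection risk and the parametrized alternative concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of SQL Server only so that it runs anywhere, and the users table and the malicious input are made up for the illustration; drivers for SQL Server generally offer an equivalent placeholder mechanism.

```python
import sqlite3

# In-memory database with a hypothetical "users" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# Input typed by a hostile user.
malicious = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so it rewrites the query itself.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the value travels separately from the SQL text and is treated as data only.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # the injected OR clause matches every row
print(len(safe_rows))    # no user is literally named like the input
```

The only difference between the two queries is who assembles the final statement: in the first the application glues text together, in the second the database receives the value as a parameter.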
Use frameworks.
If you are a .NET developer, learn to use Linq-to-SQL. If you already used Linq on .NET objects (lists, collections, etc.), it can be very helpful to figure out how to do some queries. By the way, remember you can use Linq queries and see how Linq transforms them into SQL queries.
Keep in mind that using a framework or an abstraction layer will not make you a database guru. Use it as a helpful tool, but sometimes do the SQL work yourself. Yes, it can free you from writing SQL queries by hand, even on large-scale projects (for example, StackOverflow uses Linq-to-SQL). But sooner or later, you will either need to work on a project which does not use Linq, or you will run into limitations of Linq compared to plain SQL.
Borrow a book.
Seriously, if you want to learn stuff, buy or borrow a book. Tutorials will explain to you how to do one precise thing, but you will lose the opportunity to learn something you never thought about. For example, database partitioning or mirroring is something you must know if you want to work as a DBA. Any book about databases will talk about partitioning; on the other hand, there are few tutorials which will lead you to this subject by themselves.
Test, evaluate, profile.
SQL is about optimized queries. Anybody can write a select statement, but many people will write it in a non-optimized form.
If you are dealing with a database of a few kilobytes holding a hundred records at most, all your queries will perform well, but when things scale up, you will notice that a simple select query takes three seconds on a table with a few billion rows instead of a few milliseconds.
To learn how to write optimized queries and create optimized databases, try to work on large data sets. The AdventureWorks demo database from Microsoft is a good starting point, but you may also sometimes need to fill the database with random data just to have enough rows to measure performance correctly.
Use Microsoft SQL Profiler (included in SQL Server 2008 Enterprise). It will help you to know what the server is really doing and how fast, and to find bottlenecks and poorly-written queries.
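As a rough sketch of "fill the database with random stuff and measure", the snippet below loads random rows into a made-up orders table and times the same lookups before and after adding an index. It uses Python's sqlite3 rather than SQL Server and a stopwatch rather than a profiler, so treat the numbers as an illustration of the method, not as a benchmark.

```python
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)

# Fill the table with random data so there is something to measure.
random.seed(0)
rows = [(random.randint(1, 50_000), random.uniform(1, 500)) for _ in range(100_000)]
conn.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)", rows)


def time_lookups(n=50):
    """Run n lookups by customer_id and return the elapsed seconds."""
    start = time.perf_counter()
    for customer_id in range(1, n + 1):
        conn.execute(
            "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (customer_id,)
        ).fetchone()
    return time.perf_counter() - start


before = time_lookups()  # every lookup scans the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = time_lookups()   # every lookup can seek through the index

print(f"without index: {before:.4f}s, with index: {after:.4f}s")
```

On a table of a hundred rows the two timings would be indistinguishable, which is exactly why small test databases hide this kind of problem.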
Learn from others.
Reading a book is a good place to start, but it is not enough. Read the stuff on StackOverflow, especially the questions related to developers doing DBA work. For example, see the question "Database Development Mistakes Made by App Developers", and come back to its answers from time to time while learning SQL.
If you have a precise question (a query which does not produce what you expected, a strange performance issue, etc.), feel free to ask it on StackOverflow. The community is great, and there are plenty of people who know this stuff extremely well.
Sometimes, talking to the DBA in your company (if there is one) can also be an opportunity to learn things.
is there a way to use the designer to INSERT data. it seems that it's only for SELECT ing data
If I remember correctly, the query designer in Visual Studio lets you build insert statements too. But maybe I'm wrong. In any case, you can use Microsoft SQL Management Studio (included with Microsoft SQL 2008 Enterprise), which lets you see how to build some cool queries (right-click on an element in Object Explorer, then use the "Script database as..." menu).

I think you'll find that the key issue is that SQL is declarative, unlike most computing languages you're likely familiar with. This is fundamental. Grab any computer science text book and start there.
SQL is no more or less difficult than anything else in my view. Historically it was an area which people would tend to specialize in, but that was a consequence of the technology available at the time. It's now more accessible and the tools are significantly better, so expertise is generally spread more widely now.

It is different. SQL programming is quite restricted: when writing complex logic you might find it cumbersome, with its limited programming options, unclear code (as there is no modular programming), and bad implementations of things like cursors.
I read somewhere on SO that a database is not for coding; it's only for storing data and querying. That's well said, in some sense.
What I believe is important to learn in this area is, first, all the features available in the db so that you use it efficiently and, second, how to improve your querying/analytical skills.
Basic SQL features can be learnt from w3schools (joins, grouping, etc.)
Advanced db features can be learnt from your DBMS certification exam book (the most basic certification exam, be it Oracle or SQL Server).
Analytical skills and some fun: puzzles by Joe Celko

Related

What do I need to know about databases?

In general, I think I do alright when it comes to coding in programming languages, but I think I'm missing something huge when it comes to databases.
I see job ads requesting knowledge of MySQL, MSSQL, Oracle, etc. but I'm at a loss to determine what the differences would be.
You see, like so many new programmers, I tend to treat my databases as a dumping ground for data. Most of what I do comes down to relatively simple SQL (INSERT this, SELECT that, DELETE this_other_thing), which is mostly independent of the engine I'm using (with minor exceptions, of course, mostly minor tweaks for syntax).
Could someone explain some common use cases for databases where the specific platform comes into play?
I'm sure things like stored procedures are a big one, but (a) these are mostly written in a specific language (T-SQL, etc) which would be a different job ad requirement than the specific RDBMS itself, and (b) I've heard from various sources that stored procedures are on their way out and that in a lot of cases they shouldn't be used now anyway. I believe Jeff Atwood is a member of this camp.
Thanks.
The above concepts do not vary much for MySQL, SQL Server, Oracle, etc.
With this question, I'm mostly trying to determine the important difference between these. I.e. why would a job ad demand n years experience with MySQL when most common use cases are relatively stable across RDBMS platforms.
CRUD statements, joins, indexes.. all of these are relatively straightforward within the confines of a certain engine. The concepts are easily transferable if you know a different RDBMS.
What I'm looking for are the specifics which would cause an employer to specify a specific engine rather than "experience using common database engines."
I believe that the essential knowledge about databases should be:
What databases are for
Basic CRUD Operations
SELECT queries with JOINs
Normalization
Basic Indexing
Referential Integrity with Foreign Key Constraints
Basic Check Constraints
The above concepts do not vary much between MySQL, SQL Server, Oracle, Postgres, and other relational database systems. However you'd find a different set of concepts for the now-popular NoSQL databases, such as CouchDB, MongoDB, SimpleDB, Cassandra, Bigtable, and many others.
After the CRUD statements, to be an effective DB programmer I think some of the most important things to understand are JOIN statements. Understand the difference between LEFT and RIGHT, OUTER and INNER joins, and know when to use each. Most importantly, know what the database is actually constructing when it performs a JOIN.
For me, the Wikipedia article was very helpful.
Also, indexing is very important - this is how relational databases can perform fast queries. Understand how to use them and what happens under the hood.
Wikipedia article on DB indexing.
You should also know how to construct a many-to-one relationship (using foreign keys) and a many-to-many relationship (using join tables).
I know that in your question you're asking about specific DB implementations, but if you're to be taken literally and you only know about SELECT, INSERT, UPDATE, and DELETE, then the above concepts will be far more valuable than learning the intricacies of a particular implementation.
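As a small illustration of the join and relationship concepts above, the sketch below builds a many-to-many relationship through a join table and then compares an INNER JOIN with a LEFT JOIN. The students/courses schema is invented for the example, and SQLite (via Python's sqlite3) is used only because it needs no setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses  (id INTEGER PRIMARY KEY, title TEXT);
-- Join table: one row per (student, course) pair models many-to-many.
CREATE TABLE enrollments (
    student_id INTEGER REFERENCES students(id),
    course_id  INTEGER REFERENCES courses(id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO students VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
INSERT INTO courses  VALUES (10, 'Databases'), (20, 'Compilers');
INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

# INNER JOIN: only students that have at least one enrollment appear.
inner = conn.execute("""
    SELECT s.name, c.title
    FROM students s
    JOIN enrollments e ON e.student_id = s.id
    JOIN courses c ON c.id = e.course_id
    ORDER BY s.name, c.title
""").fetchall()

# LEFT JOIN: every student appears; missing matches come back as NULL (None).
left = conn.execute("""
    SELECT s.name, c.title
    FROM students s
    LEFT JOIN enrollments e ON e.student_id = s.id
    LEFT JOIN courses c ON c.id = e.course_id
    ORDER BY s.name, c.title
""").fetchall()

print(inner)
print(left)
```

The student without enrollments is dropped by the inner join and kept, with an empty course, by the left join; that difference is what the database is "constructing" in each case.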
It's not just stored procs and functions. Each database has fundamental differences and quirks that are important to understand even though SQL works more or less the same.
Examples:
Oracle and MySQL handle locking differently, in different situations.
Oracle doesn't have autoincrementing primary keys like MySQL and SQL Server.
Subtle vendor-specific behavior, like the way Oracle does sorting for VARCHARs differently depending on locale.
If you really want to improve your applications, you eventually have to become familiar with the details about how your specific database works. Most of the time it doesn't make a lot of difference, but when it does matter, it usually makes a big difference, especially when it comes to performance.
Some things which seem to come up when talking with my Database-keen colleagues:
Row vs page vs table lock escalation when doing multiple complex joins, which sometimes means doing very different things on different vendors' dbs. This is where the theory really hits the tarmac, and often it is non-intuitive.
Differences between how cursors are best used on different vendor db implementations
Odd stuff in the stored proc language variants, like how best to handle failure cases
Differences in how temporary tables and views are best used depending on the underlying implementations.
None of these things really matters until you are trying to solve something that either has to
- Run very fast
- Contain lots and lots of data
- Get very big and complex (i.e. multiple queries hitting the same tables simultaneously)
These are the kinds of things that DBAs should be helping with, so it depends on whether you are aiming to be a DBA or a programmer. None of the above has really hurt me yet, because I've not worked on db-intensive systems, but I've worked near a few, and the programmers on those end up knowing a lot about the internals, restrictions, and good features of the specific database they are using.
Best way to get knowledge like that (other than on the job) is to read the manuals or hang out with people that already know and ask them about it.
Don't forget relational schemas, primary and foreign keys, and how they are related. To start with DBs, I would use MySQL and MSSQL, as these are the most common in the market. I consider Oracle a more advanced and complex db.
As for the level of differences there are between vendors, it is because SQL is a standard (http://en.wikipedia.org/wiki/SQL#Standardization) and vendors implement that standard differently.
Each of these vendors tries to offer extras to win the crowd over... that's why you see functions available in one and not the other. But sometimes such a function makes its way into the standard, so it's not always a bad thing.
As for stored procs, I would agree: ORMs and today's practices tend towards a greater separation of concerns, removing business logic from the database and considering it "only" a repository.
My 2 cents
I see job ads requesting knowledge of MySQL, MSSQL, Oracle, etc. but I'm at a loss to determine what the differences would be.
I'm what's called a SQL Developer. You won't see the differences much when you are doing run of the mill database work (CRUD). However the differences become quite apparent when you are dealing with the database's own brand of SQL.
When talking SQL outside of the standards, there are 4 distinct types of commands. These are:
Data Manipulation Language (DML)
Data Definition Language (DDL)
Data Control Language (DCL)
Transaction Control Language (TCL)
The biggest differences come in the last two, DCL and TCL. Those have a LOT of database specific non-standard SQL commands. The first two, DML and DDL, are very similar across any database that uses the relational model.
Also the bigger database vendors have nicknamed their SQL implementation. Here's a short sample:
SQL Server : T-SQL
Oracle : PL/SQL
PostgreSQL : PL/pgSQL
Firebird : IB-SQL
MySQL : mSQL
The list goes on, but you get the point. Wikipedia has good articles on the different command acronyms.
I have found that most employers won't be able to articulate this, because most will use non-technical managers and/or HR to do the hiring. They are basically being told by the tech managers that the new hires need to know X technology. This and also, because the majority are too lazy to hire for intelligence, instead they fall back on the "We have X, so darn it, we need to hire somebody that knows X!" meme. The differences are actually not that hard to learn, for the people who frequent StackOverflow. I'm confident that anybody here can learn these fairly fast.
Even something as simple as an auto-incrementing primary key can be very different in Oracle, mysql, and SQL Server.
Some other important differences:
SQL Server makes a distinction between clustering key and primary key; other databases do not. This choice comes with major performance implications.
SQL Server allows the SET @Total = Total = @Total + Amount syntax for fast computations of things like running totals. mysql lets you use a user variable in a similar way (I think). In other databases you'd probably have to use a correlated subquery. Huge difference in performance.
SQL Server can generate "sequential GUIDs" with newsequentialid. I'm not sure which other databases have this feature, but as with the above two points, there are significant performance implications to using a traditional GUID as opposed to a sequential or comb.
Oracle's CONNECT BY is a very useful and pretty unique syntax. Common Table Expressions in SQL Server and mysql are similar but not exactly the same.
Support for ranking/ordering functions varies vastly across different databases. I'm constantly posting answers here invoking ROW_NUMBER. A lot of queries are much harder to write without this - but at the same time, abusing it can hurt performance.
XML support is all over the map. Most databases have reasonably good support for it now, but both syntax and semantics are completely different on every platform.
Date/time handling can be quite different. Oracle has several different date/time-related types, some including time zone information. In general, Oracle is way better than other databases at managing temporal data, and has several features that you will miss if you switch. Until recently, Microsoft didn't have the date and time types, just datetime, which was much harder to normalize.
Spatial types are different and/or nonexistent in different databases. mysql exposes an entire OpenGIS model; Microsoft's support is a bit more basic but still competent. Oracle has it, but it's a little hard to find information on, and it's some sort of optional add-on. I think DB2 is starting to get it, but support is still a little spotty.
mysql actually lets you choose how to store an index (i.e. btree or hash). This is also an important performance consideration.
SQL Server allows you to INCLUDE columns in an index - very important for performance.
Oracle allows you to create function-based indexes, bitmap indexes, and so on. These can be pretty difficult to wrap your head around.
Oracle can perform "skip seeks" in very specific situations, something that I don't believe is supported in other databases (yet). This might factor into how you order index columns.
SQL Server has CLR types/functions/aggregates. Obviously not supported in any other database product.
Trigger support varies significantly. SQL Server has AFTER and INSTEAD OF. mysql has BEFORE and AFTER. Oracle has all of those and more. These all behave quite differently.
I'm sure that there are many, many more differences, but that should give you at least a basic idea of why 5 years of experience with Oracle is completely different from 5 years of experience with SQL Server.
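To illustrate the portable fallback mentioned in the running-totals point, here is a sketch of a running total computed with a correlated subquery. It runs on SQLite through Python's sqlite3, and the payments table is invented for the example; it does not reproduce the SQL Server-specific SET syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER);
INSERT INTO payments (id, amount) VALUES (1, 10), (2, 25), (3, 5), (4, 40);
""")

# For each row, the inner query re-sums every earlier row, so the total work
# grows roughly quadratically with the number of rows.
running = conn.execute("""
    SELECT p.id,
           p.amount,
           (SELECT SUM(q.amount) FROM payments q WHERE q.id <= p.id) AS running_total
    FROM payments p
    ORDER BY p.id
""").fetchall()

for row in running:
    print(row)
```

This form works almost everywhere, which is its appeal, but the repeated inner sum is the reason the vendor-specific shortcuts can be so much faster on large tables.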
That databases are encoded collections of assertions of fact.
That the logical structure of the tables corresponds to the syntactical structure of those "assertions of fact".
That Normalization theory helps you find the most optimal logical structure of the database, by minimizing redundancy, i.e. minimizing the possibility for contradictions in said assertions of fact to occur.
That database constraints are really nothing else than business rules, expressed in a formal way and in terms of the components of the database.
That really every and any business rule can be expressed as a database constraint.
That therefore, it is possible for the DBMS to enforce any and every business rule you can imagine.
That there is a very important difference between logical design and physical design.
That SQL and SQL systems are, eurhm, not really helpful (and that's putting it mildly), in supporting developers to recognise this important distinction.
That SQL and SQL systems are, eurhm, significantly deficient (and that's putting it mildly), in their support for database constraints.
That these latter two examples are a very good illustration of the importance of the difference between a model (Codd's RM) and its implementation (some particular SQL system). As far as relational database technology is concerned, the latter deviate ever more preposterously from the former.
And whatever else I forgot to remember.
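As a very small illustration of "constraints are business rules", the sketch below states two rules in the schema and lets the DBMS reject data that breaks them. The accounts/transfers tables and the rules are invented for the example, and it only covers the simple kinds of rule that SQL systems do support (CHECK and foreign keys), here on SQLite via Python's sqlite3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE accounts (
    id      INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)          -- rule: no overdrafts
);
CREATE TABLE transfers (
    id         INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES accounts(id),   -- rule: account must exist
    amount     INTEGER NOT NULL CHECK (amount > 0)         -- rule: positive amounts
);
INSERT INTO accounts (id, balance) VALUES (1, 100);
""")


def violates(sql):
    """Return True if the DBMS rejects the statement as a constraint violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


rejected_overdraft = violates("UPDATE accounts SET balance = -5 WHERE id = 1")
rejected_orphan = violates("INSERT INTO transfers (account_id, amount) VALUES (99, 10)")
accepted = not violates("INSERT INTO transfers (account_id, amount) VALUES (1, 10)")

print(rejected_overdraft, rejected_orphan, accepted)
```

The application code never checks the rules itself; the statements that would create a contradiction simply fail.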

What database to use for big data storage and manipulation?

I have to decide which database server to use for my next project, but the simple decision to use MySQL, as in almost all the projects I did, is harder now, because I expect a very large number of records.
The database will store a user list, some other irrelevant tables, and lastly, some user-collected data. Let's say I have 6000 users responding to a quiz about each other. Simple math shows that if each of those users completes the quiz about everyone else (and in my project it is 99% sure that will happen), I'll end up with 35.99 million records (they exclude themselves, so in this particular situation the calculation is 6000*5999). Unfortunately 6000 may be a small number, with the real one growing day by day.
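The arithmetic behind that estimate can be checked in a couple of lines. The function name is made up, and the larger user counts are just hypothetical growth scenarios, not figures from the project.

```python
def quiz_records(users: int) -> int:
    """Each user answers about every other user, never about themselves."""
    return users * (users - 1)


for n in (6_000, 12_000, 60_000):
    print(f"{n:>6} users -> {quiz_records(n):>13,} records")
```

The point worth noticing is that the row count grows with the square of the user count: doubling the users roughly quadruples the records.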
What to choose? MySQL and maybe if things go well and the project grows to expand it in a cluster? PostgreSQL, MSSQL? Oracle?
I've read about all of them, each one has its pros and cons, but I still don't know what to choose. The advantage of MySQL and PostgreSQL is of course, the starting price of $0 which is pretty nice in a usual self-funded startup.
Any opinions, pieces of advice? If you encountered this situation in your experience as developers, I'd love to hear from you.
These days, free isn't something that differentiates between databases any more. Both Oracle and SQL Server have free versions, but the limitations are in resources: 4 GB database size, RAM and single CPU utilization. Millions of records is not a concern - it's what datatypes you're using.
I saw the OP's comment about not liking MS software - that's your prerogative, but using the free version of either Oracle or SQL Server does give you a seamless transition to the higher-end versions of the respective database.
Personally, my choice would be either Oracle or SQL Server because of, IMHO, real feature considerations like hierarchical query support, subquery factoring/CTE, packages (long before I get concerned with functions/procedures), full text searching, xml support, etc.
MySQL will handle 35 million records no problem. Worry about scalability when you get there. You can easily add RAID hard disks backing your database tables, and if you really start getting big you can get a Compellent SAN that will scream... Don't worry about the DB engine as much as the underlying hardware. MySQL rocks for us with millions of records.
I've had no problems handling tables as large as 36,000,000 rows on MySQL and Oracle.
Just be sure that you index the proper columns, run EXPLAINs for your queries, and maintain proper design principles.
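Here is a sketch of the "index the proper columns and run EXPLAIN" habit. It uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 because it needs no server; MySQL's EXPLAIN and Oracle's plan output look different, and the answers table is a made-up stand-in for the quiz data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE answers ("
    "id INTEGER PRIMARY KEY, rater_id INTEGER, rated_id INTEGER, score INTEGER)"
)

query = "SELECT score FROM answers WHERE rated_id = ?"


def plan(sql):
    """Return the human-readable step descriptions of the query plan."""
    # The last column of each EXPLAIN QUERY PLAN row is the description text.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, (1,))]


plan_before = plan(query)  # no usable index: the table is scanned
conn.execute("CREATE INDEX idx_answers_rated ON answers (rated_id)")
plan_after = plan(query)   # the new index is used to search

print(plan_before)
print(plan_after)
```

Reading the plan before and after a schema change is cheap, and it tells you whether the index you added is the one the query actually uses.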
Most of the truly large scale web properties use a distributed key-value store. That said, 35 million is large, but not that large. With most modern databases, your main two scaling worries should be throughput and what happens when no single box can contain your entire database anymore. And both of these problems can be solved to some degree for any database you choose to use. (Caching, replication, sharding, etc.)
Use MySQL until you can't anymore. At that point, you ought to be rolling in dough anyways and you now have a very desirable problem.
Use MySQL as it's free and you have experience with it.
Besides in my opinion it matters more on how you design the tables than which database you use.
35 million records can be easily handled by MS SQL Server (assuming proper database design, indices, etc.). You can start with the free SQL Server Express edition and later, if you need, you can upgrade to the full version which supports clustering, etc.
SQL Server Express does have some limitations - single CPU, 1 GB memory, max 4 GB database size and a few other things. I'm not sure how quickly these limitations will become a problem but you can always move to the full version when you run into them.
MySQL(i) & PostgreSQL
$0 cost
large community
many tutorials
well documented
MSSQL
You can get "money" from MS if you promote that you are using MSSQL (secret information from some companies I worked for)
MS tools work very well
Complete tool set from C# IDE over .NET lib to Windows Server 2003
Oracle
Professional and commercial provider
Used by many large companies (I also heard about Blizzard (World of Warcraft) using Oracle)
- expensive
The final decision depends on the very special requirements of your project.
Make yourself a quick list of things that ARE IMPORTANT for your project (e.g. fast queries) and look up which database's strengths match your requirements best.
Everything is about design. SQL databases are a bit like cars: you just have to know which component goes where.
Make a clear design and you won't struggle with any of them.
Maybe you can test Firebird
Blog post about big Firebird database here
MySQL licence is here (not always free).
PostgreSQL and Firebird are free.
First of all, don't think about performance. Premature optimization being the root of all evil and all that. You can always throw more hardware and/or tuning at it later.
All of the databases mentioned should perform nicely if tuned/maintained correctly. I'd focus on manageability and familiarity. IMHO open source databases excel at manageability (perhaps not the best GUIs, but the CLI has been my home for a long long time).
And if the database becomes the bottleneck, why limit yourself to those choices? How about a key-value distributed database? Or perhaps serialize data directly to disk? Storing data outside of a RDBMS, while often frowned upon, might be the correct path. Or simply use the common route of denormalization.
Always remember not to optimize prematurely.
As far as opinions go (since you specifically asked for it) I favor open source databases, specifically PostgreSQL. It's rock solid, fast and very well-featured. And even with (relatively) large datasets it has performed superbly on mediocre hardware (some tuning involved, of course, but you can't skip that step no matter which db you end up choosing).

Swapping out databases?

It seems like the goal of a lot of ORM tools and custom data access layers (DAO pattern, etc.) is to abstract the database to the point where you could supposedly swap out the entire database system with minimal work.
Following the common DAL patterns is usually a good idea in code, but it seems like it would never be minimal work to swap out a database. (Cost, training, data migration, etc.)
Does anyone have any experience with swapping out one database for another in a large system, and dealing with the implications in code? Is it worth it to worry about abstracting the actual database from your code?
Question 1: Does anyone have any experience with swapping out one database for another in a large system, and dealing with the implications in code?
Yes we tried it. Our customer is using a large MS Access based Delphi client server application. After about five years we considered switching to SQL Server. We analyzed the problem and concluded that swapping the database would be very costly and provide only a few advantages. Customer decided not to swap the database. The application is still running fine and the customer is still happy.
Note that:
MS Access is only being used for data storage and report generation.
The server application ensures that MS Access is only being accessed on the server. Normal multi-user MS Access applications will transfer large chunks of the Access database over the network - resulting in slow and unreliable database functionality. This is not the case for this application. Client <> Server <> MS Access. Only the server application communicates with the MS Access database. Actually the server has exclusive access to the MS Access database. No other computer can open the MS Access database. Conclusion: MS Access is being used as a true RDBMS, Relational DataBase Management System - please no flaming about MS Access being inferior and unstable - it has been running fine for more than 10 years.
The most important issues you will have to consider:
SQL statements: (SELECT, UPDATE, DELETE, INSERT, CREATE TABLE) and make sure they would be compatible with the SQL database. It's amazing how much all the RDBMS differ in the details (date formats, number formats, search formats, string formats, join syntax, create table syntax, stored procedures, user defined functions, (auto) primary keys, etc.)
Report generation: Depending on your database you might be using a different reporting tool. Our customer has over 200 complex reports. Converting all these reports is very time consuming.
Performance: all RDBMS perform differently in different environments. Normally performance optimizations are very much RDBMS dependent.
Costs: the costs of tools, developers, server and user licenses vary greatly. They range from free to very expensive. Free does not mean cheap and expensive does not always equate to good. A cost/value comparison will have to be made.
Experience: making the best use of your RDBMS requires experience. If you have to develop for an "unknown" RDBMS your productivity will suffer.
Question 2: Is it worth it to worry about abstracting the actual database from your code?
Yes. In an ideal world, swapping a database would just be adjusting the data connection string. In the real world this is not possible because all databases are different. They all have tables and SQL support but the differences are in the details. If you can keep the differences of the databases shielded through abstraction - please do so. Make a list of the databases you need to support. Check the selected database systems for the differences. Provide centralized code to handle the differences. Support one RDBMS and provide stubs for future support of other RDBMS.
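One way to picture "centralized code to handle the differences" is a small dialect layer. The sketch below is hypothetical: the class and method names are invented for the illustration, only two well-known differences are covered (limiting rows and auto-incrementing keys), and table names are assumed to be trusted identifiers rather than user input.

```python
from abc import ABC, abstractmethod


class Dialect(ABC):
    """Hypothetical interface collecting vendor-specific SQL in one place."""

    @abstractmethod
    def first_rows(self, table: str, n: int) -> str:
        """SQL returning at most n rows of a table."""

    @abstractmethod
    def auto_id_column(self, name: str) -> str:
        """Column definition for an auto-incrementing primary key."""


class MySQLDialect(Dialect):
    def first_rows(self, table, n):
        return f"SELECT * FROM {table} LIMIT {int(n)}"

    def auto_id_column(self, name):
        return f"{name} INT AUTO_INCREMENT PRIMARY KEY"


class SQLServerDialect(Dialect):
    def first_rows(self, table, n):
        return f"SELECT TOP ({int(n)}) * FROM {table}"

    def auto_id_column(self, name):
        return f"{name} INT IDENTITY(1,1) PRIMARY KEY"


def create_users_table(dialect: Dialect) -> str:
    """Application code asks the dialect instead of hard-coding vendor syntax."""
    return f"CREATE TABLE users ({dialect.auto_id_column('id')}, name VARCHAR(100))"


for dialect in (MySQLDialect(), SQLServerDialect()):
    print(create_users_table(dialect))
    print(dialect.first_rows("users", 10))
```

Supporting one RDBMS and leaving stubs for the others then amounts to implementing one such class fully and adding further classes only when they are needed.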
I disagree that the purpose is to be able to swap out databases, and I think you are correct in showing some suspicion about ORMs leading towards that goal.
However, I would still use an ORM, as it abstracts away the details of data access. Isn't this the goal of object oriented programming? Keep your concerns separated.
I think the primary use case for database abstraction (via ORM tools) is to be able to ship a product that works with multiple database brands. I believe it's a rarer occurrence for a company to switch between database vendors, but that's still one of the use cases.
I've worked jobs where we started out using MySQL for monetary reasons (think a startup) and, once we started making money, wanted to switch to Oracle. We didn't end up making the switch, but it was nice to have the option.
Still, ORM tools are not completely leak-free abstractions, and I know our migration still would have been painful and costly. It totally depends on what you are building, but it has been my experience that -- for performance reasons, usually -- you end up either working around your ORM solution or exploiting vendor-specific features at some point.
The only time I've seen a database switch was from HSQL during early development to Oracle as the project progressed. The ORM made this easy.
I often use the DAO pattern to swap out data services (from a database to web service or to swap a web service to a test stub).
For ORM I don't think the goal is to enable you to switch databases - it is to shield you from the complexities of different database implementations and remove the need to worry about the fine details of translating between relational and object representations of your data.
By having someone smart write an ORM that handles caching, only updates fields that have changed, groups updates, etc I don't need to. Although in the cases where I need something special I can still revert to SQL if I want.

How long does it take to become reasonably proficient in Oracle given SQL Server [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Improve this question
When applying for jobs through agents, I sometimes get blocked by an agent who asks, "Do you know software package X?" When I reply that I know the similar package Y, they might say, "Unless you know X, I cannot put you forward."
The problem is that some of these agents don't know what they're talking about; they are merely being used by their clients as a screening filter.
It would be useful to be able to tell these agents that, because I know Y, I can expect to become reasonably proficient in X within a given number of days or months. Since I don't know X, I can't estimate that time myself, which is why I'm asking this question.
Most recently X was Oracle and Y was SQL Server.
Could those of you who know both give an opinion on how long it takes to become reasonably proficient? NB: I'm not talking about becoming a DBA!
I'll state my bias upfront: Oracle is far more complex than SQL Server. So it depends on what you'll be asked to do. You say that this is not for a DBA position, but that definition is pretty fluid. At my company, developers are charged with designing tables, adding the correct indexes, and determining partitioning.
If all you'll do is code in Java or C# and call packages written by a DBA or Oracle developer, then you are safe.
But if you take all of your MSSS (MS SQL Server) experience and just add semicolons to the ends of your lines, you'll kill your Oracle instance. Many standard practices in MSSS are anathema to Oracle. In MSSS it's recommended to have a clustered index on almost every table; in Oracle we build IOTs (index-organized tables) only for specific purposes. In MSSS, doing DDL in T-SQL is as easy as falling off a log; in Oracle it is made difficult on purpose, is discouraged, and is in fact somewhat dangerous. In MSSS you whip off #temp tables like they're jelly beans; in Oracle we plan temporary tables in advance, since they are permanent database objects that aren't just created in the middle of a proc when the logic gets a little tricky.
That said, would you be able to make Oracle do something? Well, yes, but the real question is whether it will work efficiently and scale to meet the needs of the business your agent placed you at. And the answer to that is a resounding no.
I've gone the other way (Oracle first, then SQL Server). My experience is:
SQL queries - trivial differences, except in the realm of string-date conversions, which are WAY easier in Oracle.
Stored Procedures - T-SQL syntax is significantly different from PL/SQL. There is a learning curve there, but nothing insurmountable.
Database Admin - very different, but WAY easier in SQL Server. If that's part of the job description, then they might be justified in considering someone else.
If you're being hired as a DBA, it will take a while to switch between databases as the management of them differs (I base this on my experience with Oracle and DB2 only).
I DON'T know SQL Server, but I can imagine a Microsoft program with its nice GUI management would be vastly different from managing DB2/z, for example (although you can use the fancy DB2 LUW (Linux/UNIX/Windows) tools if you're that way inclined).
If you're just cutting code to use the database, the SQL differences are minor (relatively). That shouldn't take much time at all, assuming you're already proficient with SQL.
I've done SQL Server for 10 years, but started working with Oracle about a year and a half ago. Like any database, there's no magical proficiency point - you just learn more about it the more you use it. In terms of what a developer would need to know, it shouldn't take more than a week (and even that's a bit much) for an experienced developer to get up to speed with using it. PL/SQL isn't that much different than T-SQL, although stored procs are kind of different.
Programming-wise, the Oracle data access classes are modeled on ADO.NET (we use OleDb actually - one of the side benefits is that you don't have to futz around with the OracleBlob class in order to access BLOB data), so not much learning to be done there.
I originally had my team using TOAD, because I had heard that TOAD is what you're supposed to use with Oracle (although I also heard bad things about it). We eventually got to the point where I was the only one using TOAD, and everyone else was using SqlDeveloper. Avoid TOAD.
I went from SQL Server to Oracle in 2001 where I went from working on a VB6/SQL Server project (as a developer) to working as an Oracle development DBA for a large J2EE project. Here are the edited highlights of my experiences and some reflections.
For development, the basic principles of SQL are not radically different. T-SQL is a somewhat different beast from PL/SQL, so the idioms are a bit different. Most competent programmers should be able to make the jump by just tinkering around and getting some good SQL Server books such as the Guru's Guide, or Oracle books such as Expert One-on-One Oracle, depending on which way you're going.
I'd say for a developer a week or two to get used to another database platform will get you most of the way there. The basic principles are fairly similar (modulo differences in the architecture); really only the window dressing is different. However, if you're going to Oracle, get a copy of a third-party query tool such as TOAD as these are much, much, much better than the ones that Oracle supplies.
Agents are notoriously bad for matching specific buzzwords, and I get this on a semi-regular basis (I'm a contractor). If you need to bone up on SQL Server, the Developer Edition is very cheap and will install on a desktop O/S such as Windows XP. Oracle also offers free downloads for all their supported platforms that you can use to tinker.
You might also get some mileage from asking a stackoverflow question along the lines of "What are the main idiomatic differences between PL/SQL and T-SQL".
If all you want to do is put a bullet point on your resume to get past the agents, just add it; that will get you past the agents and give you a chance to get your foot in the door. If you are an honest type, grab Oracle XE, read the appropriate guide at the documentation library, and spend a day or five throwing together a blog/address book/flickr clone/etc. in your favorite language against Oracle.
As others have suggested, there are significant differences in how you would do things compared to SQL Server; if you just apply what you know from SQL Server to Oracle, you'll likely kill performance.
I would disagree that the SQL is similar. The SQL SYNTAX is similar (to the lowest common denominator of ANSI) for basic CRUD development. But actually building packages, writing multi-table, multi-step joins, doing bulk updates and inserts, and using the rich and powerful Oracle feature set is very different from SQL Server. This takes a different mindset and can take many years to master. However...
The purpose of the CV / resume is to get an interview.
Work with your agent to highlight your best TALENTS and BEHAVIOURS, not purely skills.
Skills can be learnt and taught. A talent for, and the demonstration of, learning new skills in new environments is gold dust for an employer.
Don't try to blag it. Don't lie. You'll be found out. Use your experience gained from SQL Server to demonstrate that you can solve the problem. Show eagerness to learn, train and graft.
But if they really want a "parachute in and start running" Oracle programmer, then you're stuffed!
As far as I know, it is better to learn Oracle than SQL Server. Throughout my life, most of the people I have come across work with SQL Server, and there is a lot of competition around it because that database is easy to learn. So if you learn Oracle, it puts you in a stronger position early on, with less competition.

When is it time to change database backends?

Is there a general rule of thumb to follow when storing web application data to know what database backend should be used? Is it the number of hits per day, the number of rows of data, or some other metric that I should consider when choosing?
My initial idea is that the order for this would look something like the following (but not necessarily, which is why I'm asking the question).
Flat Files
BDB
SQLite
MySQL
PostgreSQL
SQL Server
Oracle
It's not quite that easy. The only general rule of thumb is that you should look for another solution when the current one can't keep up anymore. That could include using different software (not necessarily in any globally fixed order), hardware or architecture.
You will probably get a lot more benefit out of caching data using something like memcached than switching to another random storage backend.
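As a rough illustration of the caching suggestion above, here is a minimal cache-aside sketch in Python. It is not a memcached client: a plain dictionary with expiry times stands in for the cache so the example runs on its own, and the `articles` table, the `CachedArticles` class and its method names are all assumptions made up for this example. With a real memcached client the shape would be the same (look in the cache, fall back to the database on a miss, store the result).

```python
import sqlite3
import time


class CachedArticles:
    """Cache-aside reads: try the cache first, hit the database only on a miss."""

    def __init__(self, conn: sqlite3.Connection, ttl_seconds: float = 60.0) -> None:
        self.conn = conn
        self.ttl = ttl_seconds
        self._cache = {}   # key -> (expires_at, value); stand-in for memcached
        self.db_hits = 0   # counts how often we actually queried the database

    def get_title(self, article_id: int):
        key = f"article:{article_id}:title"
        entry = self._cache.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]                      # cache hit, no query issued
        self.db_hits += 1                        # cache miss, go to the database
        row = self.conn.execute(
            "SELECT title FROM articles WHERE id = ?", (article_id,)
        ).fetchone()
        value = row[0] if row else None
        self._cache[key] = (time.monotonic() + self.ttl, value)
        return value

    def invalidate(self, article_id: int) -> None:
        """Call after writing to the row so readers don't keep seeing stale data."""
        self._cache.pop(f"article:{article_id}:title", None)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO articles VALUES (1, 'Hello')")
conn.commit()
```

Repeated calls to `get_title(1)` on a `CachedArticles(conn)` instance touch the database once until the entry expires or is invalidated. The price is the usual one for caching: a reader can see stale data until then.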
If you think you are ever going to need one of the heavyweights (SQL Server, Oracle), you should start with one of those at the beginning. Data migrations are extremely difficult. In the long run it will cost you less to just start at the top and stay there.
I think you're being overly specific in your rankings. You can pretty much start with flat files and the like for very small data sets, go up to something like DBM for slightly bigger ones that don't require SQL-like syntax, and go to some kind of SQL database after that.
But who wants to do all that rewriting? If the application will benefit from access to joins, stored procedures, triggers, foreign key validation, and the like--just use a SQL database regardless of the dataset size.
Which one should depend more on the client's existing installations and what DBA skills are available than on the amount of data you're holding.
In other words, the size of your database is far from the only consideration, and maybe not the most important one.
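To show what features like joins and foreign key validation buy you even on a tiny dataset, here is a small sketch using Python's built-in `sqlite3` module. The `customers`/`orders` schema and the function names are made up for illustration. Note that SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` is set on the connection; other engines enforce them by default.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a tiny two-table database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            total       REAL NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Ada');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 17.5);
        """
    )
    return conn


def orphan_rejected(conn: sqlite3.Connection) -> bool:
    """Try to insert an order for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO orders VALUES (12, 999, 5.0)")
    except sqlite3.IntegrityError:
        return True   # the database refused the inconsistent row
    return False


def customer_totals(conn: sqlite3.Connection):
    """One join replaces the hand-written matching you'd do over flat files."""
    return conn.execute(
        """
        SELECT c.name, COUNT(o.id), SUM(o.total)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id
        """
    ).fetchall()
```

With flat files, both the consistency check and the aggregation would be application code you write and maintain yourself.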
There is no blanket answer to this, but ALMOST always, using flat files is not a good idea. You have to parse through them (I suppose) and they do not scale well. Starting with a proper database, like Oracle or SQL Server (or MySQL or Postgres if you are looking for free options), is a good idea. For very little overhead, you will save yourself a lot of effort and headache later on. They also allow you to structure your data in a non-stupid fashion, leaving you free to think about WHAT you will do with the data rather than HOW you will be getting it in and out.
It really depends on your data and how you intend to use it. At one of my previous positions, we used Postgres because of its native geolocation and time zone extensions, which allowed us to manage our data using polygonal data types. We needed that, and we also wanted to use stored procedures, views and the like.
Now, another place I worked at used MySQL simply because the data was normalized, standard row by row data.
SQL Server's free editions have long had a database size limit (2 GB for MSDE 2000, 4 GB for SQL Server 2005/2008 Express), but despite that limitation it remains a very stable platform for small to medium applications in which old data is purged.
Now, from working with Oracle and SQL Server 05/08, all I can tell you is that if you want the cream of the crop for stability, scalability and flexibility, then these two are your best bet. For enterprise applications, I strongly recommend them (merely because that's what we use where I work now).
Other things to consider:
Language integration (ASP.NET session storage, role management, etc.)
Query types (SELECT, UPDATE, DELETE), although this is more of a schema design issue than a DBMS issue
Data storage requirements
How your application uses the database is the most critical consideration, mainly which queries are used most often (SELECT, INSERT or UPDATE).
For example, SQLite is geared toward smaller applications, but for a "web" application you might want a bigger one like MySQL or SQL Server.
The way you write scripts and your web application platforms also matters. If you're developing on a Microsoft platform, then SQL Server is a better alternative.
Typically, I go with what is commonly accepted by whichever framework I am using. So, if I'm doing .NET => SQL Server, Python (via Django or Pylons) => MySQL or SQLite.
I almost never use flat files though.
There is more to choosing an RDBMS solution than just "back end horsepower". The ability to have commitment control, for example, so you can roll back a failed transaction, is one reason.
Unless you are in the megatransaction rate application, most database engines would be adequate - so it becomes a question of how much you want to pay for the software, whether it runs on the hardware and operating system environment you want, and what expertise you have in managing that software.
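As a small illustration of the commitment control mentioned above, here is a sketch with Python's built-in `sqlite3` module. The `accounts` table and the `transfer` function are hypothetical. Used as a context manager, the connection commits if the block finishes normally and rolls back if an exception escapes, so a failed transfer leaves no half-applied update behind.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> None:
    """Move money between accounts: both updates are committed together or not at all."""
    with conn:  # commits on success, rolls back if an exception escapes the block
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Raising here undoes BOTH updates above.
            raise ValueError("insufficient funds")


def balances(conn: sqlite3.Connection):
    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

Doing the same with flat files means inventing your own way to undo a partially written change, which is exactly the work a database engine already does for you.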
That progression sounds painful. If you're going to include MS products (especially the for-pay SQL Server) in there anywhere, you may as well use the whole stack, since you only have to pay for the last of these:
SQL Server Compact -> SQL Server Express -> SQL Server Enterprise (clustered).
If you target your app at SQL Server Compact initially, all your SQL code is guaranteed to scale up to the next version without modification. If you get bigger than SQL Server Enterprise, then congratulations. That's what they call a good problem to have.
Also: go back and check the SO podcasts. I believe they talked about this briefly.
This question depends on your situation really.
If you have control over the server you're deploying to and you can install whatever services you need, then the time it takes to install a MySQL or MSSQL Express server and code against an existing database framework, VERSUS coding against a flat file structure, is too small to be worth worrying about.
What about FireBird? Where would that fit into that list?
And let's not forget the requirements that the "customer" of your solution must also have in place. If you're writing a commercial application for small companies, then Oracle might not be a good choice. But if you're writing a customized solution for a large enterprise which must share data among multiple campuses and has a good-sized IT department, then the decision between Oracle and SQL Server would come down to what the customer most likely already has deployed.
Data migration nowadays isn't that bad, since we have those great tools from Embarcadero, so I would instead let the customer's needs drive the decision.
If you have the option, SQL Server is a good choice from the word go, predominantly because you have access to solid procedures and functions, and the database backup facilities are totally reliable. Wrapping up as much of your logic as you can inside the database itself (rather than in whatever language you are using) helps security and performance. Indeed, there's a good argument to be made for always using procedures for insert/update logic, as passing values as procedure parameters protects you against injection attacks (provided the procedures don't themselves build dynamic SQL from those parameters).
If I have the choice, the only time I'd consider MySQL in preference is for a large, fairly simple database predominantly used for read access. This isn't to decry MySQL, which has improved markedly of late and which I happily use when I don't have the choice, but for more complex systems with update/insert activity MSSQL is generally the superior option.
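The injection point above comes down to whether user input travels as data or becomes part of the SQL text. SQLite has no stored procedures, so the sketch below uses a parameterized query in Python's built-in `sqlite3` module as a stand-in; that is an assumption of this example, not the approach the answer describes, but the protection comes from the same mechanism (bound parameters). The `users` table and function names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)])
conn.commit()


def find_unsafe(conn: sqlite3.Connection, name: str):
    # Vulnerable: the input is pasted into the SQL text and parsed as SQL.
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()


def find_safe(conn: sqlite3.Connection, name: str):
    # The input is bound as a parameter, so it is only ever treated as a value.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Hypothetical malicious input: closes the string literal and adds a condition
# that is always true.
payload = "nobody' OR '1'='1"
```

Here `find_unsafe(conn, payload)` returns every row in the table, while `find_safe(conn, payload)` returns nothing, because no user is literally named `nobody' OR '1'='1`.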
I think your list is subjective but I will play your game.
Flat Files
BDB
SQLite
MySQL
PostgreSQL
SQL Server
Oracle
Teradata

Resources