Compare application database created in SQL Server and Oracle - sql-server

I looked around for this task I have on my hands but did not find anything helpful. I am primarily a Java person with sound knowledge of database from software development point of view. I do have some knowledge of DBA functions with what can and cannot be done but not able to come up with a good solution.
The task I have is to compare the databases created in SQL Server and Oracle by our application installer.
I think I have been able to come up with some queries (of course, by searching online) in SQL Server that will give me things like number of tables in a schema, each table's columns with data types and indexes, different types of constraints, triggers, etc. (with their count) created for each of those tables. I can provide those SQLs if somebody is interested.
However, Oracle seem to be more tricky. I would appreciate if somebody can help or maybe point me in the right direction.
I am trying to find out somethings like following:
Number of tables created
Number of indexes, constraints (with their types), triggers for each of those tables
Number of stored procedures/functions created
Number of views created
Any help will be greatly appreciated.
Thank you.

First off, if you are already comfortable writing Java code, I'm not sure that I would be writing a bunch of SQL to do this comparison. JDBC already has a DatabaseMetaData class that has methods like getTables to get all the tables. That would give you one API to work with and let you leverage the fact that the folks that wrote the JDBC drivers already wrote all the code to query the data dictionary tables in whatever database you are using. This will also let you focus on differences in how the objects your installer creates will be perceived by the application.
If you are going to write specific SQL, the Oracle data dictionary tables are pretty easy to work with. The ones you'll care about are going to follow the pattern [user|all|dba]_<<type of thing>>. The [user|all|dba] prefix indicates whether you are looking for objects that you own (user), objects that you have access to (all), or all objects in the database (dba). Normal users often don't have access to the dba views because that is a potential security issue-- generally you don't want people to know that an object exists if they don't have access to it. In my examples, I'll use the all versions of the objects but you can change all to user or dba depending on what you're after.
all_tables will show you information about all the tables you have access to. You probably want to add a filter on owner for the schema(s) that your installer touches since you may have access to tables that are not part of your application.
all_indexes, all_constraints, and all_triggers will show you information about indexes, constraints, and triggers. Again, you may want to add a predicate on owner to limit yourself to the schema(s) that you care about.
all_procedures will show you information about procedures and functions both stand-alone and in packages.
all_views will show you information about all views.
If you are really just interested in counts, you may be able to simply go to all_objects and do a count grouping by object_type. I'm guessing that you'll want to see attributes of the various objects so you'll want to go to the various object-specific views.

Related

SQL Server copy data across databases

I'm using SQL Server 2019. I have a "MasterDB" database that is exposed to GUI application.
I'm also going to have as many as 40 user database like "User1DB", "User2DB", etc. And all these user databases will have "exact same" schema and tables.
I have a requirement to copy tables data (overwriting target) from one user database (let's say "User1DB") to the other (say "User2DB"). I need to do this from within "MasterDB" database since the GUI client app going to have access to only this database. How to handle this dynamic situation? I'd prefer static SQL (in form of Stored Procedures) rather than dynamic SQL.
Any suggestion will greatly be appreciated. Thanks.
Check out this question here for transferring data from one database to another.
Aside from that, I agree with #DaleK here. There is no real reason to have a database per user if we are making the assumption that a user is someone who is logging into your frontend app.
I can somewhat understand replicating your schema per customer if you are running some multi-billion record enterprise application where you physically have so much data per customer that it makes sense to split it up, but based on your question that doesn't seem to be the case.
So, if our assumptions are correct, you just need to have a user table, where your fields might be...
UserTable
UserId
FName
LName
EmailAddress
...
Edit:
I see in the comments you are referring to "source control data" ... I suggest you study up on databases and how they're meant to be designed, implemented, and how data should be transacted. There are a ton of great articles and books out there on this with a simple Google search.
If you are looking to replicate your data for backup purposes, look into some data warehouse design principles, maybe creating a separate datastore in a different geographic region for that. The latter is a very complex subject to which I can't go over in this answer, but it sounds like that goes far beyond your current needs. My suggestion is to backtrack and hash out the needs for your application, while understanding some of the fundamentals of databases (or different methods of storing data). Implement something and then see where it can be expanded upon / refactored.
Beyond that, I can't be more detailed than the original question you posted. Hope this helps.

Transact SQL, prefix with schema or not?

I'm developing a tool where I've prefixed tables etc. with "dbo" now I get requests for custom schema names. I'm thinking of skipping them and instead let the user control this via the associated login against the Db. I know there's talk about "performance" since it needs to search the users's schemes and then fallback on dbo etc, but is that really an issue? Opinions?
First, I would look at this question as a feature request from your customers (users?). So the immediate decision to make is, should you even consider looking into this now, or do you have other feature requests that are obviously more important and deliver more benefit to the customer?
For example, for now you could simply tell customers that your application requires its own database that should not be shared with other applications or manipulated in any way by the customer. Then you don't have to worry about schemas or the same object name in two schemas because your application 'owns' the database. Perhaps this is already the case, but if so then I don't understand why your customers care which schema your objects are in.
Second, assuming that you do decide to work on it, you should gather some information about why people are asking for this, to make sure that you clearly understand what they expect you to deliver and what the benefit is for them. If customers are really saying "your application runs slowly" then the choice of schema is highly unlikely to be the reason, it's much more probable that indexing, schema design or your application code are the areas to look at.
Finally, if you still want to go ahead you need to find a technical solution. This is partly a deployment issue and partly a coding issue. It's a deployment issue because you have to deploy your database objects in a specific schema that is specified at installation time, and all your patches and later releases need to be aware of that too. The coding issue is that you need your database code to be "schema-aware", in case you end up in a situation where you have dbo.TableName, MyTool.TableName and OtherSchema.TableName all in the same database. The solution is obviously to reference the schema name in all code, which is considered an important best practice anyway. But exactly how you do this depends on how you have structured your application, if you use an ORM etc.

Automatic database schema generation system?

I'm working with a client who has a piece of custom website software that has something I haven't seen before. It has a MySQL Database backend, but most of the tables are auto-generated by the php code. This allows end-users to create tables and fields as they see fit. So it's a database within a database, but obviously without all the features available in the 'outermost' database. There are a couple tables that are basically mappings of auto-generated table names and fields to user-friendly table names and fields.* This makes queries feel very unintuitive :P
They are looking for some additional features, ones that are immediately available when you use the database directly, such as data type enforcement, foreign keys, unique indexes, etc. But since this a database within a database, all those features have to be added into the php code that runs the database. The first thing that came to my mind is Inner Platform Effect* -- but I don't see a way to get out of database emulation and still provide them with the features they need!
I'm wondering, could I create a system that gives users nerfed ability to create 'real' tables, thus gaining all the relational features for free? In the past, it's always been the developer/admin who made the tables, and then the users did CRUD operations through the application. I just have an uncomfortable feeling about giving users access to schema operations, even when it is through the application. I'm in uncharted territory.
Is there a name for this kind of system? Internally, in the code, this is called a 'collection' system. The name of 'virtual' tables and fields within the database is called a 'taxonomy'. Is this similiar to CCK or the taxonomy modules in Drupal? I'm looking for models of software that do this kind of this, so I can see what the pitfalls and benefits are. Basically I'm looking for more outside information about this kind of system.
Note this is not a simple key-value mapping, as the wikipedia article on inner-platform effect references. These work like actual tuples of multiple cells -- like simple database tables.
I've done this, you can make it pretty simple or go completely nuts with it. You do run into problems though when you put it into customers' hands, are we going to ask them to figure out primary keys, unique constraints and foreign keys?
So assuming you want to go ahead with that in mind, you need some type of data dictionary, aka meta-data repository. You have a start, but you need to add the ideas that columns are collected into tables, then specify primary and foreign keys.
After that, generating DDL is fairly trivial. Loop through tables, loop through columns, build a CREATE TABLE command. The only hitch is you need to sequence the tables so that parents are created before children. That is not hard, implement a http://en.wikipedia.org/wiki/Topological_ordering
At the second level, you first have to examine the existing database and then sometimes only issue ALTER TABLE ADD COLUMN... commands. So it starts to get complicated.
Then things continue to get more complicated as you consider allowing DEFAULTS, specifying indexes, and so on. The task is finite, but can be much larger than it seems.
You may wish to consider how much of this you really want to support, and then make a value judgment about coding it up.
My triangulum project does this: http://code.google.com/p/triangulum-db/ but it is only at Alpha 2 and I would not recommend using it in a Production situation just yet.
You may also look at Doctrine, http://www.doctrine-project.org/, they have some sort of text-based dictionary to build databases out of, but I'm not sure how far they've gone with it.

How to generate a diagram of a very large database schema (SQL Server)

I have a very large database I need to diagram. The database is SQL Server 2008 on x64. It is large in that there are hundreds of related tables, each with up to 2000 fields (some are sparse), multiple relationships between tables (often hundreds per table, in fact), multiple schemas... you get the idea.
I tried to use the Database Diagrams feature of SQL Server Management Studio, but it crashed with a Win32Exception: "Not enough storage is available to process this command..."
I tried to use Visio's reverse engineering feature on a different machine to connect in and diagram it, but that's been going for a few hours with no sign of completion.
The scripts to build this giant schema are being by a tool we built for the job. While the tool is doing its job just fine, it's tricky to visualise its output.
I'm after a tool to kick out a diagram of this database so we can do this. Any suggestions?
EDIT:
Just to emphasize, the diagram is indeed not supposed to be used for actual useful reference. It's a client relationship management device to demonstrate the complexity/scale of the system.
I worked at a place that had several hundred tables (near 1k) and no one really knew what was going on in the system, company was growing and hiring a lot. A guy was tasked with doing a diagram, and he auto-magically created a gigantic tiled poster that contained every table with lines connecting various tables (going all over the place). I'm not sure what he used, it was Unix and Oracle years ago (way before Linux and open source). There was no real rhyme or reason to the layout of the the tables in his diagram. He had successfully created a diagram of every table. The "poster" was put on a wall in a common area, and got a few looks, but no one ever really used it, it was unusable, too cluttered, too unorganized. As a result, I used MS-Word to create a single page diagram containing the 20 main tables (it went through a few iterations as I "discovered" new main tables) with lines for each foreign key and each table located in a logical manner. I showed the column name, data type, nullability, PK, and all FKs. I put my diagram up on my wall by my monitor. Eventually everyone wanted a copy of my diagram, including the person that made the "poster". When I left that job they were still giving my diagram to new hires.
I recommend that you work like an explorer, find the key tables and map them as you go, making as many specific diagrams as necessary as you discover the system. Trying to make a gigantic "poster" automatically will not work very well.
Generating an image of any kind for a database of that size simply becomes eye candy that is stuck on a wall that draw's gasps, and honestly serves no real purpose except occasional glances. Why not use a tool like Red Gate's Documentation tool that will serve an actual purpose? Please understand I'm not saying this in a mocking way, but I've been down this road before trying to diagram a huge database, and I succeeded to some degree, but never found a good outlet where it was of some use.
Since you have multiple schemas maybe a good idea is to generate diagrams per schema instead
Use graphviz. Use some SQL statements to generate the digram, then run it through dot.exe to generate a PDF or PNG.
I've used it to generate digrams of data within SQL Server tables. No reason why you can use it for tables too.
http://www.graphviz.org/
There are also java, silverlight, and AJAX utilities for navigating extra large graphs, as PDF is only for one page.
I'd avoid doing the whole thing in a single diagram. As you mentioned, the tools crash, and it's probably not possible to easily comprehend a diagram with hundreds of tables with potentially thousands of records per table. Can you generate diagrams of smaller logical areas with some overlap to other
logical areas?
Alternately, you could try using something like graphviz to parse the DDL statements and then produce a graph. It will probably churn for a while, but I remember seeing in a university poster-sized diagrams with tiny print, that were probably of the same complexity as yours. Good luck!
FWIW, assuming you do want to go ahead with this I've personally found that the visual studio 2010 database modeller does the nicest diagrams I've come across so far - Just import your database as if you were going to use it for Linq2SQL
schemaspy
provides a handy interface to generate interactive diagrams that span multiple schemas using graphviz as a backend. I've never tried it on anything this size though.
IntelliJ (specifically IDEA as just tried with this, but I believe their other IDEs offer this feature https://www.jetbrains.com/) has a built in database client facility, from here you can connect to your database and analyse individual tables, specific combination or table or all your tables by highlighting the desired tables, 'right clicking' and selecting the 'diagram' option. You can save for later reference and also print. I have just tried this on a large DB of 500+ tables and it rendered in seconds, the vector diagram serves as an alternative way to digest database structures visually and the relationships and constraints between certain tables but not recommended for printing.

Can you provide some advice on setting up my database?

I'm working on a MUD (Multi User Dungeon) in Python and am just now getting around to the point where I need to add some rooms, enemies, items, etc. I could hardcode all this in, but it seems like this is more of a job for a database.
However, I've never really done any work with databases before so I was wondering if you have any advice on how to set this up?
What format should I store the data in?
I was thinking of storing a Dictionary object in the database for each entity. In htis way, I could then simply add new attributes to the database on the fly without altering the columns of the database. Does that sound reasonable?
Should I store all the information in the same database but in different tables or different entities (enemies and rooms) in different databases.
I know this will be a can of worms, but what are some suggestions for a good database? Is MySQL a good choice?
1) There's almost never any reason to have data for the same application in different databases. Not unless you're a Fortune500 size company (OK, i'm exaggregating).
2) Store the info in different tables.
As an example:
T1: Rooms
T2: Room common properties (aplicable to every room), with a row per **room*
T3: Room unique properties (applicable to minority of rooms, with a row per property per room - thos makes it easy to add custom properties without adding new columns
T4: Room-Room connections
Having T2 AND T3 is important as it allows you to combine efficiency and speed of row-per-room idea where it's applicable with flexibility/maintanability/space saving of attribute-per-entity-per-row (or Object/attribute/value as IIRC it's called in fancy terms) schema
Good discussion is here
3) Implementation wise, try to write something re-usable, e.g. have generic "Get_room" methods, which underneath access the DB -= ideally via transact SQL or ANSI SQL so you can survive changing of DB back-end fairly painlessly.
For initial work, you can use SQLite. Cheap, easy and SQL compatible (the best property of all). Install is pretty much nothing, DB management can be done by freeware tools or even FireFox plugin IIRC (all of FireFox 3 data stores - history, bookmarks, places, etc... - are all SQLite databases).
For later, either MySQL or Postgres (I don't do either one professionally so can't recommend one). IIRC at some point Sybase had free personal db server as well, but no idea if that's still the case.
This technique is called entity-attribute-value model. It's normally preferred to have DB schema that reflects the structure of the objects, and update the schema when your object structure changes. Such strict schema is easier to query and it's easier to make sure that the data is correct on the database level.
One database with multiple tables is the way to do.
If you want a database server, I've recommend PostgreSQL. MySQL has some advantages, like easy replication, but PostgreSQL is generally nicer to work with. If you want something smaller that works directly with the application, SQLite is a good embedded database.
Storing an entire object (serialized/encoded) as a value in the database is bad for querying - I am sure that some queries in your mud will NOT need to know 100% of attributes, or may retrieve a list of object by a value of attributes.
it seems like this is more of a job
for a database
True, although 'database' doesn't have to mean 'relational database'. Most existing MUDs store all data in memory, and read it in from flat-file saved in a plain-text data format. I'm not necessarily recommending this route, just pointing out that a traditional database is by no means necessary. If you do want to go the relational route, recent versions of Python come with sqlite which is a lightweight embedded relational database with good SQL support.
Using relational databases with your code can be awkward. Any change to a game logic class can require a parallel change to the database, and changes to the code that read and write to the database. For this reason good planning will help you a lot, but it's hard to plan a good database schema without experience. At least get your entity classes planned first, then build a database schema around it. Reading up on normalizing a database and understanding the principles there will help.
You may want to use an 'object-relational mapper' which can simplify a lot of this for you. Examples in Python include SQLObject, SQLAlchemy, and Autumn. These hide a lot of the complexities for you, but as a result can hide some of the important details too. I'd recommend using the database directly until you are more familiar with it, and consider using an ORM in the future.
I was thinking of storing a Dictionary
object in the database for each
entity. In htis way, I could then
simply add new attributes to the
database on the fly without altering
the columns of the database. Does that
sound reasonable?
Unfortunately not - if you do that, you waste 99% of the capabilities of the database and are effectively using it as a glorified data store. However, if you don't need aforementioned database capabilities, this is a valid route if you use the right tool for the job. The standard shelve module is well worth looking at for this purpose.
Should I store all the information in
the same database but in different
tables or different entities (enemies
and rooms) in different databases.
One database. One table in the database per entity type. That's the typical approach when using a relational database (eg. MySQL, SQL Server, SQLite, etc).
I know this will be a can of worms,
but what are some suggestions for a
good database? Is MySQL a good choice?
I would advise sticking with sqlite until you're more familiar with SQL. Otherwise, MySQL is a reasonable choice for a free game database, as is PostGreSQL.
One database. Each database table should refer to an actual data object.
For instance, create a table for all items, all creatures, all character classes, all treasures, etc.
Spend some time now and figure out how objects will relate to each other, as this will affect your database structure. For example, can a character have more than one character class? Can monsters have character classes? Can monsters carry items? Can rooms have more than one monster?
It seems pedantic, but you'll save yourself a whole lot of trouble early by figuring out what database objects "belong" to which other database objects.

Resources