I have a tool which uses SQL scripts to apply changes to a customer database. Often this invloves changing a column definition (datatype etc). The problem is that often there are primary keys applied by the user that we don't know about (and they don't remember), which trips up the process (eg when changing columns belonging to the indexes or primary keys).
The requirement given to me is that this update process should be 'seamless', with no human involvement to prepare the ground. I have also researched this on this forum, and as far as I can see my particular question has not yet been asked.
I know how to disable and then later rebuild all indexes on a database, and even those only in certain tables, but if the index is on a primary key I still can't change any column that is part of the primary key unless I explicitly drop the PK by name, and later recreate it explicitly, which means I have to know about it at code-time. I can probably write a query to find the name of the primary key on a table if one is there, but how to know how to recreate it?
How can I, using Transact-SQL (or PL/SQL), detect, drop and then recreate the primary keys on given tables, without knowing at code time what they are or what columns belong to them? The key is that the tool cannot know in advance what the primary keys are are on any given table, nor what they comprise. The SQL code must handle this itself.
Better still would be to detect if a known column belongs to a primary key, then drop and later recreate that after I have changed the column.
This needs to be done in both Oracle and Sql Server, ideally purely with SQL code.
TIA
I really don't understand why would a customer define his own primary keys for the tables? Moreover, I don't understand why would you let them? In my world, if customer changes schema in any way, this automatically means end of support for them.
I will strongly advise against dropping and recreating primary keys on production database. Any number of bad things can happen, leading to data loss.
And it's not just the PKs, you will have to drop the foreign key constraints first. And FKs may reference not only the PKs but the unique constraints as well, so yao have to deal with those as well.
Your best bet would be to create a new table with the required schema, copy the data, drop original table and rename the new one. Of course, you will have to handle the FKs, but it's easier. Check this link an example:
http://sqlblog.com/blogs/john_paul_cook/archive/2009/09/17/script-to-create-all-foreign-keys.aspx
Related
I have large table with many rows. I want to change the primary key from int to bigint.
My question is do I have to update/rebuild the indexes? Or is that automatically done behind the scenes?
I assume you are dropping and recreating the PK for the data type change and it is a clustered index. In this case, the non-clustered indexes are rebuilt automatically when the clustered primary key is dropped and again when it's recreated.
With a large table, you could manually drop the non-clustered indexes first and recreate afterwards. That way, they are only rebuilt once and save some time.
In principle, the only thing you need to do is to issue ALTER TABLE ALTER COLUMN .... SQL Server will fix the implicit primary key index as part of the type change.
In practice you will need to plan for such a change.
In some circumstances the change can be done online by issuing WITH (ONLINE=ON). (The default is to do the change "offline", blocking access to the table while the change is carried out.). However, depending on the type of index the change might not appear to be entirely "online".
You need enough disk space (the old and new key will exist on disk at the same time).
Are there foreign keys referencing the PK? FK ints to PK bigins are allowed, but generally a very bad idea.
If this is a production system the only responsible thing to do is to test it. If there is no test system then copy a subset of the base table to a parallel table on the production server and do the change. Check if normal production queries can be executed against the table during the change.
This might be too subjective, but it's been puzzling me some time.
If you have a Fact table that allows duplicates with 10 dimensions that do not, do you really need a primary key?
Why Are There Duplicates?
It's a bit tricky, but ideally each duplicate is actually valid. There is just not a unique identifier to separate them from the source system recording the record. We don't own that system so there is no way to ever change it.
Data
The data is in batch and only include the previous days worth of records. Therefore, in the event of a republish. We just drop the entire days worth of records and republish the new day of records without the use of a primary key.
This is how I would fix bad data.
Generate A Primary Key Already
I can, but if it's never used or have anyway to validate if the duplicate is legit, why do it?
SQL Server database tables do not require a primary key.
A database engine may well create a primary key in the background though.
Yes, SQL Server don't need primary key. Mostly, it needs in CLUSTERED index. Because, if you have another NONCLUSTERED indexes on this table, every of them will use CLUSTERED index for pointing data. So, primary key is good example of clustered key. And if it's short, and you have another indexes - it's reason to create it.
First, I want to talk a little about the Foreign key constraint rule and how helpful it is. Suppose I have two tables, a primary table with the primary column called ID, the other table is the foreign one which also has a primary column called ID. This column in the foreign table refers to the ID column in the primary table. If we don't establish any Foreign key relation/constraint between those tables, we may fall foul of many problems related to integrity.
If we create the foreign key relation for them, any changes to the ID column in primary table will 'auto' reflect to the ID column in the foreign table, changes here can be made by DELETE, UPDATE queries. Moreover, any changes to the ID in the foreign table should be constrained by the ID column in the primary table, for example there shouldn't any new value inserted or updated in the ID column of the foreign table unless it does exist in the ID column of the primary table.
I know that SQLite doesn't support foreign key constraint (with full functions as detailed above) and I have to use TRIGGER to work around this problem. I have used TRIGGER to work around successfully in one way (Any changes to the ID column in the primary table will refect to the ID column in the foreign table) but the reverse way (should throw/raise any error if there is a confict occurs, for example, there are only values 1,2,3 in the ID column of the primary table, but the value 2 in the ID column of the foreign table is updated to 4 -> not exist in the primary table -> should throw error) is not easy. The difficult is SQLite doesn't also support IF statement and RAISERROR function. If these features were supported, I could work around easily.
I wonder how you can use SQLite if it doesn't support some important features? Even working around by using TRIGGER is not easy and I think it's impossible, except that you don't care about the reverse way. (In fact, the reverse way is not really necessary if you set up your SQL queries carefully, but who can make sure? Raising error is a mechanism reminding us to fix and correct and making it work exactly without corrupting data and the bugs can't be invisible.
If you still don't know what I want, I would like to have some last words, my purpose is to achieve the full functionality of the Foreign key constraint which is not supported in SQLite (even you can create such a relationship but it's fake, not real as you can benefit from it in SQL Server, SQL Server Ce, MS Access or MySQL).
Your help would be highly appreciated.
PS: I really like SQLite because it is file-based, easy to deploy, supports large file size (an advantage over SQL Server Ce) but some missing features have made me re-think many times, I'm afraid if going for it, my application may be unreliable and corrupt unpredictably.
To answer the question that you have skillfully hidden in your rant:
SQLite allows the RAISE function inside triggers; because of the lack of control flow statements, this must be used with a SELECT:
CREATE TRIGGER check_that_id_exists_in_parent
BEFORE UPDATE OF id ON child_table
FOR EACH ROW
BEGIN
SELECT RAISE(ABORT, 'parent ID does not exist')
WHERE NOT EXISTS (SELECT 1
FROM parent_table
WHERE id = NEW.id);
END;
I'm attempiting to use cx_OracleTool's CopyData.py script to copy data between two tables on separate Oracle schemas/instances:
http://cx-oracletools.sourceforge.net/cx_OracleTools.html
When I run it against my tables, I get the error:
No primary or unique constraint found on table.
I don't know much about Oracle, to be honest, but from what I can tell the tables don't seem to have any PK constraint or anything like that defined.
The merits of this aside, I think it's simply been setup that way for expediency, and it's unlikely to change anytime nearterm.
Is there any way to get copyData.py to run in this scenario without a PK constraint?
Cheers,
Victor
The issue is that CopyData checks to see if the row exists in the destination table, and it can't do that without a unique key.
If it is acceptable to insert all rows and not update changed ones, use the --no-check-exists option. According to the code this will bypass the primary key check.
Otherwise, use the --key-columns=COLS option to manually specify the columns to be used as the unique key. This will also bypass the primary key check.
I've got a database that doesn't have any foreign keys. I've done some checks and there are a a fair few orphaned records.
Its a pretty large database 500 + tables and I'm looking at the possibility of building the foreign keys back in.
Other than trawling though every single table over time?
Has anybody ever been through this process before and can maybe offer some insights or tips on how to make the process a little easier.
Any help advice appreciated.
I assume you mean "doesn't have any foreign key constraints"...if there were no foreign keys, you wouldn't know which records matched at all.
Do the primary and foreign key fields have the same name? As in, the PK table has a "CustomerId" field and the FK table(s) also have a "CustomerId" field? If so, you might be able to query the column properties (perhaps using INFORMATION_SCHEMA, you didn't mention an RDBMS) to figure out some implied relationships. Just query for all the tables that have a field called "CustomerId" that is not a PK and there's a good (but not certain) bet that those tables should have an FK constraint to the Customer table. You could even use the output of the query to generate the DDL to create the constraints.
You can work from the largest to smallest tables, or start with the least performant area of the database. Adding keys should help your performance significantly, but you'll have to resolve the orphan rows first. You may need input from the business for that. Expect them to be very confused about what's going on.