So I have a Drupal 7 database with 2 million users that need to move to Drupal 8 with a minimum of downtime (target is an hour). The Drupal migrate module appears to solve this problem, but it writes new rows one item at a time and in my tests, 4 thousand users + related data took 20 minutes on frankly beastly AWS instances. Extrapolating to the full dataset, it would take me 7 days to run the migration, and that amount of downtime is not reasonable.
I've made a feature request against Drupal core but I also wanted to see if the community has any ideas that I missed. Also, I want to spawn some discussion about this issue.
If anyone still cares about this, I have resolved this issue. Further research showed that not only does the Drupal migration module write new rows one at a time, but it also reads rows from the source one at a time. Further, for each row Drupal will write to a mapping table for the source table so that it can support rollback and update.
Since a user's data is stored in one separate table per custom field, this results in something like 8 reads and 16 writes for each user.
I ended up extending Drupal's MigrateExecutable class to run the process. Then I overrode both the part that reads data and the part that writes it to do their work in batches, and to skip writing to the mapping tables. I believe that my projected time is now down to less than an hour (a speedup of 168 times!).
Still, trying to use the Drupal infrastructure was more trouble than it was worth. If you are doing this yourself, just write a command-line application and do the SQL queries manually.
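For anyone attempting the same thing, the batching idea is independent of Drupal: read source rows in chunks and write them with multi-row inserts instead of paying a round trip per row. A minimal sketch in plain Python with sqlite3 (table and column names are invented for illustration, not Drupal's actual schema):

```python
import sqlite3

def migrate_in_batches(src, dst, batch_size=1000):
    """Copy users from src to dst in batches instead of row-at-a-time."""
    read = src.execute("SELECT uid, name, mail FROM users")
    while True:
        rows = read.fetchmany(batch_size)  # one fetch per batch, not per row
        if not rows:
            break
        # one multi-row INSERT per batch, not one INSERT per row
        dst.executemany("INSERT INTO users (uid, name, mail) VALUES (?, ?, ?)", rows)
        dst.commit()

# demo with in-memory databases standing in for the D7 and D8 databases
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE users (uid INTEGER, name TEXT, mail TEXT)")
dst.execute("CREATE TABLE users (uid INTEGER, name TEXT, mail TEXT)")
src.executemany("INSERT INTO users VALUES (?, ?, ?)",
                [(i, f"user{i}", f"user{i}@example.com") for i in range(5000)])
src.commit()
migrate_in_batches(src, dst)
count = dst.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(count)
```

The same shape applies whatever the database driver: the win comes from `fetchmany` plus `executemany`, which collapse thousands of round trips into a handful.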
I am using JDeveloper 11.1.2.3.0
I have implemented af:calendar functionality in my application. My calendar is based on a ViewObject that queries a database table with a large number of records (500-1000). Performing the selection through a select query against my database table is very fast, only some ms. The problem is that my af:calendar takes too long to load: more than 5 seconds. If I just want to change the month, or the calendar view, I have to wait approximately that amount of time. I searched a lot through the net but found no explanation for this. Can anyone please explain why it takes so long? Has anyone ever faced this issue?
PS: I have tested even with JDeveloper 12 and the problem is exactly the same.
You should look into the ViewObject tuning properties to see how many records you fetch in a single network access, and do the same check for the executable that populates your calendar.
Also try using the HTTP Analyzer to see what network traffic is going on and the ADF Logger to check what SQL is being sent to the DB.
https://blogs.oracle.com/shay/entry/monitoring_adf_pages_round_trips
Well, I am going to query 4 GB of data using a cfquery. It's going to be painful to query the whole database, as it will take a very long time to get the data back.
I tried a stored procedure when the data was 2 GB, and it wasn't really fast at that time either.
The data pull will be done based on a date range the user selects on an HTML page.
I have been advised to archive data in order to speed up querying the database.
Do you think that I'll have to create a separate table with only the fields that are required, and then query this newly created table?
Well, the size of the current table is 4 GB, but it is increasing day by day; basically, it's a response database (it stores information coming from somewhere else). After doing some research, I am wondering if writing a trigger could be one option. If I do this, then as soon as a new entry (row) is added to the current 4 GB table, the trigger will run a SQL query that transfers the contents of the required fields into the newly created table. This will keep happening as long as I keep getting new values in my original 4 GB database.
Does the above approach sound good enough to tackle my problem? I have one more concern: even though I am filtering out only the fields required for querying into a new table, at some point the size of my new table will also increase, and that could also slow down querying it.
Please correct me if I am wrong somewhere.
Thanks
More Information:
I am using SQL Server. Indexing is currently done but it's not effective.
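To make the trigger idea above concrete, here is a minimal sketch using sqlite3 from Python. All table and column names are invented, and note that SQL Server's trigger syntax differs (it uses an `inserted` pseudo-table rather than `NEW`), so this is an illustration of the mechanism, not copy-paste T-SQL:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- wide table standing in for the growing 4 GB response table
CREATE TABLE responses (id INTEGER PRIMARY KEY, payload TEXT,
                        created TEXT, extra TEXT);
-- narrow reporting table holding only the fields the date-range query needs
CREATE TABLE responses_report (id INTEGER PRIMARY KEY, created TEXT);
-- copy the required fields into the narrow table on every insert
CREATE TRIGGER copy_to_report AFTER INSERT ON responses
BEGIN
    INSERT INTO responses_report (id, created) VALUES (NEW.id, NEW.created);
END;
""")

con.execute("INSERT INTO responses (payload, created, extra) VALUES (?, ?, ?)",
            ("big blob of data", "2013-05-01", "unused"))
row = con.execute("SELECT id, created FROM responses_report").fetchone()
print(row)
```

The narrow table stays small per row, but as the answer below notes, it still grows with the source table, so indexing the date column there matters just as much.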
Archiving the data will be farting against thunder. The data has to travel from your database to your application. Then your application has to process it to build the chart. The more data you have, the longer that will take.
If it is really necessary to chart that much data, you might want to simply acknowledge that your app will be slow and do things to deal with it. This includes code to prevent multiple page requests, displays to the user, and such.
I have a WordPress blog with about 33,000 posts, and the database is about 2.2 GB. The speed of the blog is very fast, except when I try to publish or update any post: it will run for minutes until timeout, but the process continues to run in the background at 100% CPU. I am wondering if there is any workaround? I am sure WP can handle a lot more posts and a bigger database without such an issue.
Delete your post/page revisions. WordPress saves a full copy of each post as a revision every time you save. Deleting them will drop the size of your database drastically - I've reduced databases to 10% of their original sizes - with a subsequent increase in performance. See http://wordpress.org/extend/plugins/better-delete-revision/
Or run this query in phpmyadmin:
DELETE a,b,c
FROM wp_posts a
LEFT JOIN wp_term_relationships b ON (a.ID = b.object_id)
LEFT JOIN wp_postmeta c ON (a.ID = c.post_id)
WHERE a.post_type = 'revision'
Add define ('WP_POST_REVISIONS', 0); to the wp-config.php file to prevent future revisions from being saved.
If you have your own server, look into using mysqltuner.pl to analyze the MySQL database server and its load, and adjust your my.cnf file for better performance. See https://github.com/rackerhacker/MySQLTuner-perl
Also look for other non-WP tables in the database that are large. Some web stats plugins write logs to the database, and those tables can get huge. Even if not in use, such large tables can impact performance. Deactivate/delete the plugins and clear their tables, or clear the tables manually.
I've got ~14100 posts in the wp_posts table on a site I manage. The DB is ~102MB. Using that as a reference your DB should be 240MB or so. Why is your DB so large?
If the problem occurs when you post, chances are that you've got a plugin that is trying to do a tremendous amount of work (possibly why your DB is so large), or maybe that is trying to contact a third party site, and is timing out. Check your plugins. Disable them one by one if it isn't obvious where the problem is.
Use this. Problem Solved.
http://wordpress.org/extend/plugins/wp-super-cache/
Tested on a server with over 500K unique users per month.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
For example: Updating all rows of the customer table because you forgot to add the where clause.
What was it like, realizing it and reporting it to your coworkers or customers?
What were the lessons learned?
I think my worst mistake was
truncate table Customers
truncate table Transactions
I didn't check which MSSQL server I was logged into; I wanted to clear out my local copy... The familiar "OH s**t" came when it was taking significantly longer than about half a second to delete. My boss noticed I went visibly white and asked what I had just done. About half a minute later, our site monitor went nuts and started emailing us saying the site was down.
Lesson learned? Never keep a connection open to a live DB longer than absolutely needed.
I was only up till 4am restoring the data from the backups too! My boss felt sorry for me, and bought me dinner...
I work for a small e-commerce company; there are 2 developers and a DBA, me being one of the developers. I'm normally not in the habit of updating production data on the fly; if we have stored procedures we've changed, we put them through source control and have an official deployment routine set up.
Well, anyway, a user came to me needing an update done to our contact database - batch-updating a bunch of facilities. So I wrote out the query in our test environment, something like
update facilities set address1 = '123 Fake Street'
where facilityid in (1, 2, 3)
Something like that. Ran it in test, 3 rows updated. Copied it to the clipboard, pasted it into Terminal Services on our production SQL box, ran it, and watched in horror as it took 5 seconds to execute and updated 100,000 rows. Somehow I had copied the first line and not the second, and wasn't paying attention as I CTRL+V'd, CTRL+E'd.
My DBA, an older Greek gentleman, probably the grumpiest person I've met, was not thrilled. Luckily we had a backup, and it didn't break any pages; luckily that field is only really for display purposes (and billing/shipping).
Lesson learned was pay attention to what you're copying and pasting, probably some others too.
A junior DBA meant to do:
delete from [table] where [condition]
Instead they typed:
delete [table] where [condition]
Which is valid T-SQL, but basically ignores the where [condition] bit completely (at least it did back then on MSSQL 2000/97 - I forget which) and wipes the entire table.
That was fun :-/
About 7 years ago, I was generating a change script for a client's DB after working late. I had only changed stored procedures but when I generated the SQL I had "script dependent objects" checked. I ran it on my local machine and all appeared to work well. I ran it on the client's server and the script succeeded.
Then I loaded the web site and the site was empty. To my horror, the "script dependent objects" setting did a DROP TABLE for every table that my stored procedures touched.
I immediately called the lead dev and boss letting them know what happened and asking where the latest backup of the DB could be located. 2 other devs were conferenced in and the conclusion we came to was that no backup system was even in place and no data could be restored. The client lost their entire website's content and I was the root cause. The result was a $5000 credit given to our client.
For me it was a great lesson, and now I am super-cautious about running any change scripts, and backing up DBs first. I'm still with the same company today, and whenever the jokes come up about backups or database scripts someone always brings up the famous "DROP TABLE" incident.
Something to the effect of:
update email set processedTime=null,sentTime=null
on a production newsletter database, resending every email in the database.
I once managed to write an updating cursor that never exited. On a 2M+ row table. The locks just escalated and escalated until this 16-core, 8GB RAM (in 2002!) box actually ground to a halt (of the blue screen variety).
update Customers set ModifyUser = 'Terrapin'
I forgot the where clause - pretty innocent, but on a table with 5000+ customers, my name will be on every record for a while...
Lesson learned: use transaction commit and rollback!
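That lesson can be demonstrated outside SQL Server too. A sketch with Python's sqlite3 (the table name comes from the anecdote; everything else is invented) showing how an explicit transaction lets you inspect the damage and undo it:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customers (id INTEGER PRIMARY KEY, ModifyUser TEXT)")
con.executemany("INSERT INTO Customers (ModifyUser) VALUES (?)",
                [("alice",), ("bob",), ("carol",)])
con.commit()

cur = con.execute("UPDATE Customers SET ModifyUser = 'Terrapin'")  # forgot the WHERE
print(cur.rowcount)  # 3 rows touched - far more than intended
con.rollback()       # the open transaction lets us undo the mistake

names = [r[0] for r in con.execute("SELECT ModifyUser FROM Customers ORDER BY id")]
print(names)
```

Checking the reported row count against what you expected, before committing, is the cheap insurance this whole thread keeps rediscovering.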
We were trying to fix a busted node on an Oracle cluster.
The storage management module was having problems, so we clicked the un-install button with the intention of re-installing and copying the configuration over from another node.
Hmm, it turns out the un-install button applied to the entire cluster, so it cheerfully removed the storage management module from all the nodes in the system.
Causing every node in the production cluster to crash. And since none of the nodes had a storage manager, they wouldn't come up!
Here's an interesting fact about backups... the oldest backups get rotated off-site, and you know what your oldest files on a database are? The configuration files that got set up when the system was installed.
So we had to have the offsite people send a courier with that tape, and a couple of hours later we had everything reinstalled and running. Now we keep local copies of the installation and configuration files!
I thought I was working in the testing DB (which apparently wasn't the case), so when I finished 'testing' I ran a script to reset all data back to the standard test data we use... ouch!
Luckily this happened on a database that had backups in place, so after figuring out that I had done something wrong, we could easily bring back the original database.
However, this incident did teach the company I worked for to really separate the production and test environments.
I don't remember all the SQL statements that ran out of control, but I have one lesson learned - do it in a transaction if you can (beware of the big log files!).
In production, if you can, proceed the old fashioned way:
Use a maintenance window
Backup
Perform your change
verify
restore if something went wrong
Pretty uncool, but it generally works, and you can even hand this procedure to somebody else to run during their night shift while you're getting your well-deserved sleep :-)
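The steps above (backup, change, verify, restore on failure) can be sketched as a script. Here with Python and sqlite3, with invented names, and a deliberately botched update so the restore path runs:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT)")
con.executemany("INSERT INTO customers (city) VALUES (?)",
                [("Berlin",), ("Paris",)])
con.commit()

# 1. backup the table before touching it
con.execute("CREATE TABLE customers_backup AS SELECT * FROM customers")
# 2. perform the change (deliberately wrong: the WHERE clause is forgotten)
con.execute("UPDATE customers SET city = 'London'")
# 3. verify: we intended to change exactly one row
changed = con.execute(
    "SELECT COUNT(*) FROM customers WHERE city = 'London'").fetchone()[0]
if changed != 1:
    # 4. restore from the backup because more rows changed than expected
    con.execute("DELETE FROM customers")
    con.execute("INSERT INTO customers SELECT * FROM customers_backup")
con.commit()

cities = [r[0] for r in con.execute("SELECT city FROM customers ORDER BY id")]
print(cities)
```

On a real server the backup step would be a dump or snapshot rather than a copy table, but the shape of the procedure is the same.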
I did exactly what you suggested. I updated all the rows in a table that held customer documents because I forgot to add the "where ID = 5" at the end. That was a mistake.
But I was smart and paranoid. I knew I would screw up one day. I had issued a "start transaction". I issued a rollback and then checked the table was OK.
It wasn't.
Lesson learned in production: despite the fact that we like to use InnoDB tables in MySQL for many, MANY reasons... be SURE you haven't stumbled onto one of the few remaining MyISAM tables, which don't respect transactions and can't be rolled back. Don't trust MySQL under any circumstances, and habitually issuing a "start transaction" is a good thing. Even in the worst-case scenario (what happened here) it didn't hurt anything, and it would have protected me on the InnoDB tables.
I had to restore the table from a backup. Luckily we have nightly backups, the data almost never changes, and the table is a few dozen rows so it was near instantaneous. For reference, no one knew that we still had non-InnoDB tables around, we thought we converted them all long ago. No one told me to look out for this gotcha, no one knew it was there. My boss would have done the same exact thing (if he had hit enter too early before typing the where clause too).
I discovered I didn't understand Oracle redo log files (terminology? it was a long time ago) and lost a week's trade data, which had to be manually re-keyed from paper tickets.
There was a silver lining - during the weekend I spent inputting it, I learned a lot about the usability of my trade input screen, which improved dramatically thereafter.
Worst case scenario for most people is production data loss, but if they're not running nightly backups or replicating data to a DR site, then they deserve everything they get!
@Keith: in T-SQL, isn't the FROM keyword optional for a DELETE? Both of those statements do exactly the same thing...
The worst thing that happened to me was that a production server consumed all the space on the HD. I was using SQL Server, so I looked at the database files and saw that the log was about 10 GB, so I decided to do what I always do when I want to truncate a log file: detach the database, delete the log file, and then attach again. Well, I learned that if the log file was not closed properly this procedure does not work, so I ended up with an mdf file and no log file. Thankfully, on the Microsoft site I found a way to restore the database into recovery and move it to another database.
Updating all rows of the customer table because you forgot to add the where clause.
That is exactly what I did :| . I had updated the password column for all users to a sample string I had typed into the console. The worst part of it was that I was on the production server, checking out some queries, when I did this. My seniors then had to restore an old backup and field some calls from some really disgruntled customers. Of course, there was another time when I did misuse a delete statement, which I don't even want to talk about ;-)
I dropped the live database and deleted it.
Lesson learned: ensure you know your SQL - and make sure that you back up before you touch stuff.
This didn't happen to me, just a customer of ours whose mess I had to clean up.
They had a SQL server running on a RAID5 disk array - nice hotswap drives complete with lighted disk status indicators. Green = Good, Red = Bad.
One of their drives turned from green to red, and the genius who was told to pull and replace the (red) bad drive took a (green) good one out instead. This didn't quite manage to bring down the RAID set completely - it opted for the somewhat-readable (red) drive over the unavailable (green) one for several minutes. After they realized the mistake and swapped the drives back, any data blocks that had been written during this time became gibberish, as disk synchronization was lost. 24 straight hours later, after writing meta-programs to recover readable data and reconstruct a medium-sized schema, they were back up and running.
Morals of this story include... never use RAID5, always maintain backups, and be careful who you hire.
I made a major mistake on a customer's production system once -- luckily, while wondering why the command was taking so long to execute, I realized what I had done and canceled it before the world came to an end.
Moral of this story include ... always start a new transaction before changing ANYTHING, test the results are what you expect and then and only then commit the transaction.
As a general observation, many classes of rm -rf / type errors can be prevented by properly defining foreign key constraints on your schema and staying far away from any command labeled 'CASCADE'.
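As an illustration of that point, a sketch with Python's sqlite3 (names invented): with a plain foreign key and no ON DELETE CASCADE, deleting a parent row that still has children fails loudly instead of silently wiping data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # sqlite enforces FKs only when enabled
con.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id)  -- no ON DELETE CASCADE
);
INSERT INTO customers (id) VALUES (1);
INSERT INTO orders (id, customer_id) VALUES (10, 1);
""")

try:
    con.execute("DELETE FROM customers WHERE id = 1")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True  # the constraint refused to orphan the order rows
print(blocked)
```

With CASCADE the same DELETE would quietly take the orders with it, which is exactly the failure mode being warned about.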
Truncate table T_DAT_STORE
T_DAT_STORE was the fact table of the department I work in. I thought I was connected to the development database. Fortunately, we have a daily backup, which hadn't been needed until that day, and the data was restored in six hours.
Since then I double-check everything before a truncate, and periodically I ask for a backup restoration of minor tables just to verify the backups are working (backups aren't done by my department).