Migrating data between different schemas - database

I have a small Django website where people have signed up and uploaded pictures and other content.
I now want to rebuild the website's API. This will change the database schema, and I want to migrate all the user information from the old database to the new one.
What's the best practice for doing this? Links to tutorials would be helpful.
The database backend is Postgres with PostGIS.
TIA

There are different approaches to data migration. At my previous employer we rewrote much of the code from scratch, and before deploying the new application we had to migrate the old data. Two methods are:
Migrate the data from the old schema directly in the DB: this is very useful, especially if the legacy data is huge. If you let the database copy from one table/database to another, it will be extremely fast. You need SQL knowledge for this, though (google 'insert into from another database'); see the first sketch after this list.
Write a script or Django management command to load the data into Django models and go from there (see the second sketch after this list). This will not be as fast as the DB option, but it may be easier to code and, depending on the scale of your changes, your only option. If you need to do some computation along the way, a high-level language such as Python will be helpful.
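A minimal sketch of the first option, assuming the old and new tables live in the same Postgres database (for separate databases, look at dblink or postgres_fdw instead); the schema, table, and column names here are all made up:

    # Let Postgres copy the rows itself - no per-row round-trips through Python.
    import psycopg2

    conn = psycopg2.connect("dbname=mysite user=postgres")
    with conn, conn.cursor() as cur:  # commits on success, rolls back on error
        cur.execute("""
            INSERT INTO new_schema.users (id, username, email, joined_at)
            SELECT id, username, email, date_joined
            FROM old_schema.users
        """)
    conn.close()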
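And a sketch of the second option as a Django management command; OldUser and NewProfile are hypothetical models standing in for your real ones. Saved as e.g. myapp/management/commands/copy_users.py, it runs with "python manage.py copy_users":

    from django.core.management.base import BaseCommand
    from myapp.models import OldUser, NewProfile

    class Command(BaseCommand):
        help = "Copy users from the legacy tables into the new schema"

        def handle(self, *args, **options):
            # iterator() avoids loading every row into memory at once
            for old in OldUser.objects.all().iterator():
                NewProfile.objects.create(
                    username=old.username,
                    email=old.email,
                    # any per-row computation or cleanup happens here, in Python
                    display_name=old.username.title(),
                )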

Related

New node.js app using legacy database, new database, and Redis caching layer

We're developing a new version of our site using Node, but we need to continue using a legacy mysql database as-is yet also add new fields to some models via new tables in a new database, AND add a caching layer.
What's the best way to do this? We were thinking of using Jugglingdb and writing our own adapter. It would need to do several things:
round-robin select from several servers in our db herd.
cache into Redis for read-only connections
know which fields are in the legacy database and which are in the new database.
connect to databases for CRUD connections.
Is this something theoretically doable using a jugglingdb adapter? Or does anyone have other recommendations using another better technique and/or a completely different ORM package?
There's an adapter, jugglingdb-redis-hq, that has a "backyard" feature that is almost what we want, except that it seems to be a sort of backwards caching, i.e. making a persistent copy of expired Redis data over in the database. We don't want to touch the database for reads or writes unless we're changing or inserting something.
Amazing that it's been 3 years since I posted this question. What we ended up doing, and we're finally almost live with this, is this stack:
nodejs (of course)
hapijs for backend framework
Sequelize ORM to talk to MySQL (Sequelize has built-in connection pooling!)
Redis for caching
graphql api using graphql-sequelize module
wrote a service layer under hapi application layer to make queries to graphql api
Crucially, Sequelize did not make it easy to hold connections to two different databases, so we decided to only add new tables to the old schema and not make any changes to the old tables. We've since ended up making a couple of minor ALTER TABLEs when we really had to. I'm still curious whether we could have done this part another way, and whether another ORM would have let us meld the two databases under the hood more easily.

Database creation and query

So I have to create a recipe website, and HTML/CSS is mainly my forte. I need a database to search through over 100 recipes and sort them, mainly by author, apart from the other sorting orders. I don't want to use a CMS like Joomla. How do I go about it?
Do I store the entire recipe (with a picture or two) in the database, or only a link to the recipe?
Secondly, the client will be updating the website as well. Is there any way to simplify the process for a client who has absolutely no knowledge of adding things to a database?
You're going to need to do some server-side scripting. If you don't want to use a CMS or framework, you (or someone else) will have to write the code for all of the site.
DB design pointers:
Store the recipe in the database, along with the author, etc.
Don't store the pictures themselves in the db, even though it's easy enough to do. Better to keep the images on the server and have a field in the db, called 'filename' or similar, which stores the path to each image (a table sketch follows this list).
For the client - you will need to build a backend/admin page(s) with 'forms' for the client to upload (add), update and delete recipes and pictures.
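As a rough illustration of these pointers, here is a hypothetical table layout, shown with Python's built-in sqlite3 for brevity (the same DDL translates to MySQL/Postgres):

    import sqlite3

    conn = sqlite3.connect("recipes.db")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS recipe (
            id     INTEGER PRIMARY KEY,
            title  TEXT NOT NULL,
            author TEXT NOT NULL,  -- sortable, per the requirement
            body   TEXT NOT NULL   -- the recipe text lives in the db
        );
        CREATE TABLE IF NOT EXISTS recipe_image (
            id        INTEGER PRIMARY KEY,
            recipe_id INTEGER NOT NULL REFERENCES recipe(id),
            filename  TEXT NOT NULL  -- path to the image on the server, not the image itself
        );
    """)
    conn.commit()
    conn.close()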
You don't need to save pictures in the database. See PrestaShop's database model for an example (look only at the parts relating to images, as there are various tables).
Regards and good luck!
You can add pictures into databases as well; to keep things manageable, reduce the size of the images before inserting them.
For the database you can use PHP or server-side JavaScript; both provide easy ways of accessing a database.
Most database drivers also have built-in transaction commit and rollback support.

Google AppEngine DB Management best practice?

Google App Engine offers a datastore (a kind of DB wrapper) to hold your data.
It does not supply an editor for this datastore - only a viewer.
When developing a web application with another DB - MSSQL, MySQL, etc. - I change the DB structure many times during development.
In the App Engine datastore you have to edit its structure and data using code - Java in my case.
Do you - App Engine developers - have any best practices for managing these DB updates and saving them in some smart way for deployment?
I don't know about "best practice", but I have a servlet that I use during development which can upload and download all entity data as JSON.
I can then use a regular text editor to make changes, or I use a hacked version of JSONpad to edit data live in the system.
Since I use JSON throughout my application, this works best for me. One could also do the same thing with XML and use any one of the many XML editors.
Also, I use the low-level API for all my applications, so my data models tend to be fairly simple.
There are plenty of JSON/XML editors that could be adapted to your purposes with a little bit of work.
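Mine is Java, but on the Python runtime the same idea might look roughly like this - an untested sketch using the ndb API, with 'Item' as a hypothetical kind, and something you should only ever expose in development:

    import json
    import webapp2
    from google.appengine.ext import ndb

    class Item(ndb.Model):
        name = ndb.StringProperty()

    class DumpHandler(webapp2.RequestHandler):
        def get(self):
            # Serialize every entity of the kind as JSON for offline editing.
            entities = [e.to_dict() for e in Item.query()]
            self.response.headers['Content-Type'] = 'application/json'
            self.response.write(json.dumps(entities, default=str))

    app = webapp2.WSGIApplication([('/dev/dump', DumpHandler)])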

Interacting with external DB via Django

I'm working on a Django app that interacts with an existing database (think ERP/transaction-type data) to perform analysis. There will be minimal or no updating of the existing database - mainly reading data in. It's just a small, simple setup, so there are no replication issues to think about around updates.
The analysis would result in new records created within the Django Model.
Currently the existing DB runs on PostgreSQL.
I am aware of Alex Gaynor's GSoC multi-db code which, from what I gather, is ticket #1142 and has no patch in trunk yet.
So from what I gather there are three options I can see:
1) Point Django at the same db as the ERP and let it create the tables it needs within it (all the ERP tables have a prefix, so there would be no collision). However, this strikes me as hacky and a recipe for disaster.
2) Create a new db for Django and automatically copy over the required tables. Better, but then I can't pick up updates, though I can probably live with that.
3) Try out the multidb patch.
Are there other better ideas out there? I'm leaning towards at least trying out the multidb patch but I'm a little worried about stability and forwards compatibility.
How about not using Django's ORM layer at all for that DB? If the interaction is minimal, you might do it faster by just using direct SQL with the appropriate PostgreSQL Python library; a minimal sketch follows.
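For example, with psycopg2 (connection details and table/column names are invented):

    import psycopg2

    # Read-only access to the ERP database, bypassing Django's ORM entirely.
    conn = psycopg2.connect("dbname=erp user=readonly host=erp-db")
    with conn.cursor() as cur:
        cur.execute("SELECT order_id, total FROM erp_orders WHERE total > %s", (1000,))
        for order_id, total in cur.fetchall():
            # feed the rows into your analysis, then save results via your Django models
            print(order_id, total)
    conn.close()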

Umbraco Database Question- Adding custom tables

I'm working on a site managed by Umbraco. I need to store data about images and clients. I don't think there is any way I can store that data in the existing tables.
Is there any reason I shouldn't add the tables I'll need to the Umbraco database, rather than creating a separate DB? I like Umbraco so far but the documentation is a little thin and I haven't found any suggestions one way or the other.
TIA
I have built a site using Umbraco, with a separate application with a database of vehicles. I used the same database as Umbraco is using, and prefixed all my custom app tables with a few letters to distinguish them easily (eg: vehicles_xxx)
I have had no problems with this arrangement, and don't believe there's much risk involved. Of course you'll need to take care when upgrading Umbraco (never upgrade the live environment before fully testing, and preferably do it locally anyway); however, it's unlikely an upgrade script will ever alter or delete tables that it does not know about.
There's heaps of doco available for Umbraco now - much more than when I started. However, a question like this is always best for the forums. :)
all the best
greg
You might use the Umbraco API to store and retrieve your data and enjoy the ease of not having to worry about tables and much more. Or you can create your own tables. Do as Gregorius says - using the Umbraco db is fine.
Your choice depends on:
do you have a lot of data?
do you have a large relation model?
If not - then go with Umbraco API
The rest of the answers you'll find on http://our.umbraco.org
/Jesper Ordrup
