Django database scalability - database

We have a new Django-powered project that is likely to see heavy traffic (meaning heavy database interaction), so we need to consider database scalability in advance. After some research, the following questions are still unclear to us:
Coarse-grained: how do we assign one database table (a Django model) to a specific database (possibly on another server)?
Fine-grained: how do we assign a group of table rows to a specific database (so-called sharding, possibly on another database server as well)?
How do we direct writes and reads to different databases? (This would help with future MySQL master/slave replication.)
We are looking for a solution that:
is transparent to the application code (meaning no additional code in views.py)
works at the ORM level (meaning it only needs to be specified in models.py)
is compatible with the current (and future) Django releases (to keep changes minimal when upgrading Django)
I'm still researching this and will share in this thread later if I find anything useful.
I hope someone with experience can answer. Thanks.

Don't forget about caching either. Using memcached to relieve your DB of load is key to building a high performance site.
As Alex said, Django core doesn't support the specific features you are asking for, though they are definitely on the to-do list.
If you don't do this in the application layer, you're basically asking for performance trouble. There aren't any really good open source automation layers for this sort of task, since it tends to break SQL axioms. If you're really concerned about it, you should be coding the entire application for it, not simply hoping that your ORM will take care of it.
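To make the application-layer point concrete, here is a minimal sketch of routing rows to shards by key. The shard aliases and the `shard_for` and `group_by_shard` helpers are hypothetical names, not part of Django or any library. A stable hash is used instead of Python's built-in `hash()`, so that every process maps a key to the same shard.

```python
import zlib

# Hypothetical shard aliases; in a real project these would name
# separately configured database connections.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key, shards=SHARDS):
    """Map a row key (e.g. a user id) to one shard alias.

    zlib.crc32 is stable across processes and machines, unlike the
    built-in hash(), which is randomized per process for strings.
    """
    digest = zlib.crc32(str(key).encode("utf-8"))
    return shards[digest % len(shards)]


def group_by_shard(keys, shards=SHARDS):
    """Group keys by shard so each database is queried only once."""
    groups = {}
    for key in keys:
        groups.setdefault(shard_for(key, shards), []).append(key)
    return groups
```

Note that with this modulo scheme, changing the number of shards remaps most keys, which is one reason sharding tends to need designing in from the start rather than bolting on.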

There is the GSoC project by Alex Gaynor that will in future allow the use of multiple databases in one Django project. But right now there is no working cross-RDBMS solution.
There is no solution for this right now either.
And again, there is no cross-RDBMS solution. But if you are using MySQL, you can try an excellent third-party Django application called mysql_replicated. It makes it easy to set up a master-slave replication scenario.
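For readers on later releases: the multi-database work mentioned above shipped in Django 1.2, which lets a project register database routers through the `DATABASE_ROUTERS` setting. A minimal sketch of a read/write split follows. The aliases `default` and `replica` and the class name are assumptions and would have to match entries in your own `DATABASES` setting.

```python
class PrimaryReplicaRouter:
    """Send reads to a replica and writes to the primary.

    Assumes DATABASES defines the aliases "default" (primary)
    and "replica"; both names are illustrative.
    """

    def db_for_read(self, model, **hints):
        return "replica"

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases hold the same data, so relations are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Only run schema migrations against the primary.
        return db == "default"
```

It would be enabled with something like `DATABASE_ROUTERS = ["path.to.PrimaryReplicaRouter"]` in settings (the dotted path is a placeholder). Views keep using the ORM unchanged, though replication lag means a read issued right after a write may not see it yet.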

For various reasons we are using Django with SQLAlchemy here. Maybe a combination of Django and SQLAlchemy would also work for your needs.

Related

Database Agnostic Application

The database for one of the applications I am working on has not yet been confirmed by the business.
Best guess is Oracle and DB2.
What I've heard is that the project will initially go live with DB2 V9 and then move to Oracle 11g.
We are using Spring 3.0.5, Hibernate 3.5, JPA2 and JBoss 5 for this project.
So what are the best practices here going into the build phase and test phase?
Shall I build using DB2 first and worry about Oracle later (this doesn't sound right)?
Or shall I write using JPA (Hibernate) and then generate the database schema?
Or something else?
PS: I have no control over the choice of the DB, what and when, as these are strategic decisions made by people sitting in nice rooms getting fat cheques and big bonuses.
Thanks,
Adi
Obviously you lose access to database-specific features if you write your application to be database agnostic. Apart from the automatic optimizations done by JPA and Hibernate, the database is reduced to its common features. You have to leave some things on automatic and trust JPA/Hibernate to get them right, things you could set specifically if you knew the database (e.g. ID generator strategies).
But it seems that the developer-specific features of the database are not relevant to the decision, so they can't be relevant to the application. Which other reasons influence the decision (like price, money, cash, personal relations, management tools, hardware requirements, existing knowledge and personnel) can only be speculated about.
So IMHO you don't have a choice. Strictly avoid anything database specific. That means letting JPA/Hibernate generate the schema (your point #2). In this project setup you shouldn't tinker with the database manually.
Well... sadly there ARE some hidden traps in JPA/Hibernate development that make it database dependent (e.g. logarithmic functions are not mapped consistently). So you should run all your tests against all candidate databases from day one. As you write "Best guess is...", you should just grab any database available and test against it. This should be easy to set up with the given stack.
And you should try to accelerate the decision about the database used, if possible.
Just "write using JPA (Hibernate)" and develop it to be database agnostic. Put all your business logic in Java code, not stored procedures.
If you are using Spring you don't need JBoss; you could use just Tomcat, which has about a quarter of the footprint and is much simpler, IMHO.
In Spring vs. JBoss, JBoss represents all that is bad, while Spring represents all that is good in Java enterprise development.
We have had this issue and had to migrate late in the project, leading to a lot of extra work, frustration and delays.
My advice is to define an abstraction layer. Go to the point where you have a data model without any database, say with tables or text files.
Then when you have to switch to some database, you can optimize for it while staying free to continue application development on any already developed model. That way you don't delay the developers working on the app while someone is tuning the DB2 layer. When everything is duly validated, the team can switch over to it.
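A minimal sketch of such an abstraction layer, in Python for brevity rather than the Java stack used in the question. The `CustomerRepository` interface, both implementations and the `greet` function are hypothetical, and SQLite merely stands in for DB2 or Oracle. The point is only that application code depends on the interface, so an in-memory version can be used until the database layer is ready.

```python
import sqlite3
from abc import ABC, abstractmethod


class CustomerRepository(ABC):
    """What the application needs, with no database details."""

    @abstractmethod
    def add(self, customer_id, name): ...

    @abstractmethod
    def get_name(self, customer_id): ...


class InMemoryCustomerRepository(CustomerRepository):
    """Stand-in used while the real database is undecided."""

    def __init__(self):
        self._rows = {}

    def add(self, customer_id, name):
        self._rows[customer_id] = name

    def get_name(self, customer_id):
        return self._rows.get(customer_id)


class SqlCustomerRepository(CustomerRepository):
    """Database-backed version (SQLite used as a placeholder)."""

    def __init__(self, connection):
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add(self, customer_id, name):
        self._conn.execute(
            "INSERT INTO customer (id, name) VALUES (?, ?)",
            (customer_id, name),
        )

    def get_name(self, customer_id):
        row = self._conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


def greet(repo, customer_id):
    """Application code: works with either implementation."""
    name = repo.get_name(customer_id)
    return f"Hello, {name}!" if name else "Unknown customer"
```

Running the same tests against both implementations is a cheap way to check that the database-backed layer behaves like the stand-in before the team switches over.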
I will disagree with the currently accepted answer suggesting avoiding database specific things. From a performance perspective, that would be a pity, and it's definitely doable.
JPA/Hibernate and also jOOQ can abstract over a lot of things and if you're using the query builder APIs of either technology (criteria query in JPA, or jOOQ for more advanced SQL), you can get very far in a vendor agnostic way without removing all the vendor specific stuff. For example, you can easily create a vendor specific predicate like this:
.where(oracle ? oracleCondition() : db2Condition())
What you should do from the very beginning of such a project, once you know you'll have to support both dialects, is run integration tests on both database products. For this, I recommend testcontainers, which makes running such tests quite simple. If you have to add support for another dialect, and if you're using one of the above abstractions, you can simply add another testcontainers configuration, check if your application still works, tweak 2-3 things, and you're set.
Disclaimer: I work for the company behind jOOQ.
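The same idea can be sketched outside jOOQ. The following Python sketch (a hypothetical function and hypothetical dialect flags, not jOOQ's API) keeps one query builder and switches only the vendor-specific part, here row limiting: Oracle 11g has no `FETCH FIRST` clause and is usually limited through `ROWNUM`, while DB2 accepts `FETCH FIRST n ROWS ONLY`.

```python
def limit_rows(base_query, limit, dialect):
    """Wrap base_query so it returns at most `limit` rows.

    `dialect` is a hypothetical string flag: "oracle" or "db2".
    """
    limit = int(limit)  # never interpolate unchecked input into SQL
    if dialect == "oracle":
        # Oracle 11g: filter on ROWNUM outside the (possibly ordered) subquery.
        return f"SELECT * FROM ({base_query}) WHERE ROWNUM <= {limit}"
    if dialect == "db2":
        return f"{base_query} FETCH FIRST {limit} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")
```

Each branch is exactly the kind of code that a per-database integration test should exercise.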

If I choose RavenDB, what benefits of SQL Server do I lose?

If I choose RavenDB for a fairly standard CMS-like web application, what do I lose compared to SQL Server?
EDIT: The word "benefits" in the title is a somewhat controversial term. Maybe I should have said "possibilities" or "features"; I hope it's clear what I'm after.
A few things that come to mind (but I'm new to RavenDB so this is just a few suggestions, some may be wrong, I hope someone would provide a more complete and accurate list):
Quick but customizable administrative interface using ASP.NET Dynamic Data (there is some built-in Silverlight admin application but I'm quite sure that it wouldn't replace a full-fledged admin section in my case)
Possibly some querying capabilities? Or can Raven indexes replace virtually every SQL query I might think of?
Entity Framework integration (I know some people hate EF but I think that being an EF provider means that you can easily publish the data as OData, use EF code-first etc., right?)
Azure deployment (not true according to comments)
Myriad of SQL querying / management tools
A more complete / accurate list would be greatly appreciated.
(Note: I'm not saying that I will need all (or any) of those, I'd just like to understand what's going to be unavailable if I choose RavenDB. Also, please don't discuss RavenDB strengths, I am aware of them and they are easily digestible from the official website.)
You may want to look at these two recent blog posts by Ayende (RavenDB's creator) on when you should use RavenDB and when you shouldn't.
When should you use ravendb
When should you not use ravendb
Beyond the technology, you should consider your team members, as RavenDB is an adjustment in thinking for those of us who have backgrounds in RDBMS. What type of stretch will this be for those involved? Will your users expect reports, and what will they say when you tell them that you did not consider the questions they want answered when you created the indexes for the document database? While you get a big boost in productivity when designing and implementing your domain, document databases are different from SQL databases.
Quick but customizable administrative interface using ASP.NET Dynamic Data (there is some built-in Silverlight admin application but I'm quite sure that it wouldn't replace a full-fledged admin section in my case)
ASP.NET MVC has supported scaffolding based on POCOs since its second version, but it's not such a quick-and-dirty solution.
Possibly some querying capabilities? Or can Raven indexes replace virtually every SQL query I might think of?
You should think about your queries first. RavenDB is not a reporting database.
Entity Framework integration (I know some people hate EF but I think that being an EF provider means that you can easily publish the data as OData, use EF code-first etc., right?)
You are so focused on tools. Code First is how you work with document databases anyway. Why do you need OData? RavenDB has a REST API out of the box.
WCF RIA Services (Silverlight).
You'll need to do all that WCF plumbing work.

Simplest way to develop an app that can use multiple types of databases?

I have a project for a class which requires that if a database is used, options exist for the user to pick a database to use which could be of a different type. So while I can use e.g. MySQL for development, in the final version of the project, the user must be able to choose a database (Oracle, MySQL, SQLite, etc.) upon installation. What's the easiest way to go about this, if there is an easy way?
The language used is up to me as long as it's supported by the department's Linux machines, so it could be Java, PHP, Perl, etc. I've been researching and found info on ODBC, JDBC, and SQLJ (such as this) but I'm quite new to databases so I'm having a hard time figuring out what would be best for my needs. It's also possible there may not be a simple enough way to do this; the professor admitted he's not a database guy either and he seemed to think it would be easy without having a clear idea of what it would take.
This is for a web app, but it ought to be fairly straightforward, using for example HTML and JavaScript on the client side and Java with a MySQL database on the server side. No mention has been made of frameworks, so I think they're too much. I have the option of using Tomcat or Apache if necessary, but the overall idea is to keep things simple, and everything used should be installable, changeable and configurable with just user-level access. So something like having to recompile PHP to use ODBC would be out, I think.
Within these limitations, what would be the best way (if any) to be able to interact with an arbitrary database?
The issue I think you will have here is that SQL is not truly standard. What I mean is that vendors (Oracle, MySQL etc.) have included types and features that are not in the SQL standard in order to "tie you in" to their DB, such as Oracle's VARCHAR2 and so on.
When I was at university, my final year project was to create an application that allowed users to create relational databases using JDBC with a Java front-end.
Using JDBC was very simple, but the issue was finding enough SQL features/types that all the vendors have in common, so that users could switch between them without any issues. A way round this is to implement modules that deal with vendor-specific issues and write ways to translate between them. For example, you may develop a database for MySQL with lots of MySQL-specific code, but then want to use Oracle, and then there are issues you would need to resolve.
I would spend some time looking at what core of the SQL standard all the vendors implement and then code for those features. I think the issue wouldn't be the technology you use but rather the SQL you write.
Hope this helps, apologies if it's not helpful!
Well, you can go two ways (in Java):
You can develop your own classes to work with different databases and load their drivers in JDBC. This way you will create a data access layer for yourself, which takes some time.
You can use Hibernate (or other ORMs). This way Hibernate will take care of things for you and you only have to know how to use Hibernate. Learning Hibernate may take some time, but when you get used to it, it can be very useful for your future projects.
If you want to stick with Java, there is Hibernate (which wouldn't require a framework). Hibernate is fairly easy to use. You write HQL, which gets translated to the SQL needed for the database you're using.
Maybe use an object relational mapper (ORM) or database abstraction layer (DAL). They are designed to provide a standard API to multiple database backends, making it possible to switch between different backends with minimal or no changes to your code. In Python, for example, a popular ORM is SQLAlchemy, and an excellent DAL is the web2py DAL (it's part of the web2py framework but can be used as a standalone DAL outside the framework as well). There are many other options in other languages as well.
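As a small illustration of the abstraction idea in Python: drivers that follow the DB-API (PEP 249) share the same `connect`/`cursor`/`execute` shape, so the backend can be picked from configuration at install time. The sketch below rests on several assumptions: only the standard-library `sqlite3` driver is wired in, the `BACKENDS` registry and the `open_database` and `count_notes` helpers are hypothetical, and other drivers would need their own entry and may use a different placeholder style than `?`.

```python
import sqlite3

# Hypothetical registry: backend name -> function returning a DB-API
# connection. Only SQLite ships with Python; MySQL or Oracle entries
# would wrap third-party drivers chosen at install time.
BACKENDS = {
    "sqlite": lambda cfg: sqlite3.connect(cfg.get("path", ":memory:")),
}


def open_database(config):
    """Open a connection for the backend named in the config dict."""
    try:
        factory = BACKENDS[config["backend"]]
    except KeyError:
        raise ValueError(f"unknown backend: {config.get('backend')!r}")
    return factory(config)


def count_notes(conn):
    """Uses only widely supported SQL, so any backend can sit behind conn."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM notes")
    return cur.fetchone()[0]
```

The rest of the application would only ever call `open_database` with the installer's configuration, which keeps the choice of database in one place. An ORM or DAL does the same job more thoroughly, including the SQL dialect differences this sketch ignores.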
Use a framework with a database abstraction layer and an ORM. Try Symfony or Rails.
There are a lot of object-relational mapping frameworks, unless you prefer plain JDBC, which should work fine for simple/small applications.

Umbraco Database Question- Adding custom tables

I'm working on a site managed by Umbraco. I need to store data about images and clients. I don't think there is any way I can store that data in the existing tables.
Is there any reason I shouldn't add the tables I'll need to the Umbraco database, rather than creating a separate DB? I like Umbraco so far but the documentation is a little thin and I haven't found any suggestions one way or the other.
TIA
I have built a site using Umbraco, with a separate application with a database of vehicles. I used the same database as Umbraco is using, and prefixed all my custom app tables with a few letters to distinguish them easily (eg: vehicles_xxx)
I have had no problems with this arrangement, and don't believe there's much risk involved. Of course you'll need to take care when upgrading Umbraco (never upgrade in the live environment before fully testing, and preferably do it locally anyway), however it's unlikely an upgrade script will ever alter or delete any tables that it does not know about.
There's heaps of doco available for Umbraco now, much more than when I started. However, a question like this is always best for the forums. :)
all the best
greg
You might use the Umbraco API to store and retrieve your data and enjoy the ease of not having to worry about tables and much more. Or you can create your own tables. Do as Gregorius says: using the Umbraco DB is fine.
Your choice depends on:
do you have a lot of data?
do you have a large relation model?
If not - then go with Umbraco API
The rest of the answers you'll find on http://our.umbraco.org
/Jesper Ordrup

Is there an ORM that can dynamically generate a DAL from within a .Net WinForms app?

I'd greatly appreciate your advice on a strange specification.
We have a requirement to create an application where users can drag/drop field types onto a form so that they can create their own "app". I have the front-end set up, but the back-end is a big problem.
There are forward-mapping ORMs and reverse-mapping ORMs, yet I've not found one that can be embedded within the application and generate the tables, relationships, etc. when the user starts up the app. Of course, if a table, field or other entity already exists, it must not overwrite it (or the underlying data).
ActiveRecord is the closest I've found, but it is web-based and does not extend to a WinForms environment. I would SO prefer not to have our crew write our own DAL, debug it, etc. when there might be an ORM out there that can do this.
Does anyone know of an ORM that can do this? If not, how would you go about solving this nightmare in the making?
Thank you so much for your help.
Although you can think of a solution for this problem built with ORM, I do not think this is a good idea: these tools are designed to solve another class of problems. The only way is to build the application yourself.
That's an unfortunate app-- if your users wanted to do that, they'd just buy Visual Studio!
This isn't a good position to be in, because no, I'm not aware of any suitable way to do this with an ORM. Sadly, if you're looking for this because of schedule pressure, your project may be in trouble.
In theory you can use any ORM that can automatically generate a database schema. For example, see DataObjects.Net: it typically generates and upgrades the database schema from a persistent model, based on persistent classes and additional custom definitions. But I can hardly imagine how your whole application would work in this case... there are so many potential issues.
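To show what "generate the schema at startup without overwriting anything" can look like without an ORM, here is a minimal sketch in Python against SQLite rather than .NET. The `ensure_table` helper, the field-definition format and the type mapping are all hypothetical. A real form-designer app would need far more (relationships, type changes, migration of existing data), which is where the potential issues come in.

```python
import re
import sqlite3

# Hypothetical mapping from form field types to SQL column types.
FIELD_TYPES = {"text": "TEXT", "number": "REAL", "checkbox": "INTEGER"}
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def ensure_table(conn, table, fields):
    """Create `table` if missing and add any missing columns.

    `fields` maps column name -> form field type. Existing tables,
    columns and rows are left untouched.
    """
    for name in [table, *fields]:
        # Identifiers cannot be bound as parameters, so validate them.
        if not IDENTIFIER.match(name):
            raise ValueError(f"invalid identifier: {name!r}")

    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (id INTEGER PRIMARY KEY)")
    # PRAGMA table_info returns one row per column; index 1 is the name.
    existing = {
        row[1].lower() for row in conn.execute(f"PRAGMA table_info({table})")
    }
    for name, field_type in fields.items():
        if name.lower() not in existing:
            column_type = FIELD_TYPES[field_type]
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {column_type}")
```

Calling `ensure_table` on every startup with the user's current form definition only ever adds tables and columns, so data entered under an earlier version of the form stays in place.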

Resources