Would anyone please explain which is better to use, SQLite or SQL Server? I was using an XML file as data storage for add, delete, and update operations. Someone suggested using SQLite for faster operation, but I am not familiar with SQLite; I only know SQL Server.
SQLite is a great embedded database that you deploy along with your application. If you're writing a distributed application that customers will install, then SQLite has the big advantage of not having any separate installer or maintenance--it's just a single dll that gets deployed along with the rest of your application.
SQLite also runs in process and reduces a lot of the overhead that a database brings--all data is cached and queried in-process.
SQLite integrates with your .NET application better than SQL Server does. You can write custom functions in any .NET language that run inside the SQLite engine but still within your application's calling process and address space, and which can therefore call back into your application to pull in additional data or perform actions while executing a query. This very unusual ability makes certain tasks significantly easier.
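The answer above talks about .NET; as a rough illustration of the same in-process idea, here is a minimal sketch using Python's built-in sqlite3 module (the function, table, and data are made up for the example):

import sqlite3

# A plain application function that the query engine can call back into,
# because SQLite runs inside our own process.
def discounted(price, category):
    return price * 0.9 if category == 'sale' else price

conn = sqlite3.connect(':memory:')
conn.create_function('discounted', 2, discounted)

conn.execute('CREATE TABLE items (name TEXT, price REAL, category TEXT)')
conn.execute("INSERT INTO items VALUES ('hat', 10.0, 'sale')")

# The SQL below calls straight back into the Python function defined above.
for name, price in conn.execute('SELECT name, discounted(price, category) FROM items'):
    print(name, price)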
For this kind of local, in-process access, SQLite is generally a lot faster than SQL Server.
However, SQLite only supports a single writer at a time (meaning the execution of an individual transaction). SQLite locks the entire database when it needs a lock (either read or write) and only one writer can hold a write lock at a time. Due to its speed this actually isn't a problem for low to moderate size applications, but if you have a higher volume of writes (hundreds per second) then it could become a bottleneck. There are a number of possible solutions like separating the database data into different databases and caching the writes to a queue and writing them asynchronously. However, if your application is likely to run into these usage requirements and hasn't already been written for SQLite, then it's best to use something else like SQL Server that has finer grained locking.
UPDATE: SQLite 3.7.0 added a new journal mode called Write-Ahead Logging (WAL) that supports concurrent reading while writing. In our internal multi-process contention test, the timing went from 110 seconds to 8 seconds for the exact same sequence of contentious reads/writes.
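For reference, WAL is switched on per database file with a PRAGMA; a minimal Python sketch (the file name is made up, and this only shows enabling the mode, not the contention test mentioned above):

import sqlite3

conn = sqlite3.connect('app.db')
# Switch the database to write-ahead logging; the setting is stored in the
# database file, so it only needs to be done once per database.
mode = conn.execute('PRAGMA journal_mode=WAL').fetchone()[0]
print(mode)  # prints 'wal' if the switch succeeded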
The two are in different leagues altogether. One is built for enterprise-level data management; the other is for embedded or serverless environments such as mobile devices. SQLite deployments can hold many hundreds of GB of data, but that is not what it is built for.
Update, to reflect the updated question:
Please read this blog post on SQLite. I hope it helps and points you to resources for programmatically accessing SQLite from .NET.
As Wikipedia says
Database Triggers are commonly used to:
- audit changes (e.g. keep a log of the users and roles involved in changes)
- enhance changes (e.g. ensure that every change to a record is time-stamped by the server's clock)
- enforce business rules (e.g. require that every invoice have at least one line item)

Ref: Database trigger - Wikipedia
But we can do these things inside the Business Layer using a common programming language (especially with OOP) easily. So what is the necessity of database triggers in modern software architecture? Why do we really need them?
That might work if all data is changed by your application only. But there are other cases, which I have seen very frequently:
- There are other applications (like batch jobs doing imports) that do not use the business layer.
- You cannot easily use plain SQL scripts as a means for hotfixes.
Apart from that, in some cases you can even combine both worlds: define a trigger in the database and use Java to implement it. PostgreSQL, for example, supports triggers written in Java. In Oracle, you can call a Java method from a PL/SQL trigger, and in MS SQL Server you can define CLR-based triggers.
This way not every programmer needs to learn PL/SQL, and data integrity is enforced by the database.
Think about performance. If this is all to be done from the application, there are most likely a lot of extra SQL*Net round trips, slowing the application down. Having these actions defined in the database makes sure they are always enforced, not only when the application is used to access the data.
When the database is in control, you have your rules defined on the central location, the database, instead of in many locations in the application.
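As a small illustration of keeping the rule in the database itself, here is a sketch using SQLite from Python (table, column, and trigger names are invented for the example); the same idea carries over to PL/SQL, PL/Java, or CLR triggers on the bigger engines:

import sqlite3

conn = sqlite3.connect('orders.db')
conn.executescript("""
    CREATE TABLE IF NOT EXISTS invoice (id INTEGER PRIMARY KEY, total REAL);
    CREATE TABLE IF NOT EXISTS invoice_audit (
        invoice_id INTEGER, old_total REAL, new_total REAL,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP);

    -- Every update is logged here, no matter which application issued it.
    CREATE TRIGGER IF NOT EXISTS trg_invoice_audit
    AFTER UPDATE ON invoice
    BEGIN
        INSERT INTO invoice_audit (invoice_id, old_total, new_total)
        VALUES (OLD.id, OLD.total, NEW.total);
    END;
""")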
Yes, you can completely omit database triggers.
However, if you can't guarantee that your database will only ever be accessed through the application layer (which in practice you can't), then you need them. Yes, you can perform all your database logic in the application layer, but if you have a table that needs X done to it whenever it is updated, then the only way to guarantee that is a trigger. Otherwise, people accessing your database directly, outside your application, will break your application.
There is nothing else you can do. If you need a trigger, use one. Do not assume that all connections to your database will be through your application...
I'm gathering information for an upcoming massive online game. I have experience with mega-massive farm-like games (millions of DAU), and SQL databases were a great solution. I also worked on a massive online game where a NoSQL DB was used, and that particular DB (Mongo) was not the best fit: it behaved badly with lots of connections and lots of concurrent writes going on.
I'm looking for facts, benchmarks, and presentations about modern massive online games, and technical details about their backend infrastructure, databases in particular.
For example I'm interested in:
- Can it manage thousands of connections? Maybe some external tool can help (like PgBouncer for Postgres).
- Can it manage tens of thousands of concurrent reads and writes?
- What about disk space fragmentation? Can it be optimized without stopping the database?
- What about smart replication? Can it tell which data is missing from a replica when the master fails? Can I safely promote a replica to master, know exactly what data is missing, and act appropriately?
- Can it fail gracefully? (like Postgres, for example)
- Good reports from production use.
Start with the premise that hard crashes are exceedingly rare, and when they occur it won't be a tragedy if some information is lost.
Use of the database shouldn't be strongly coupled to the routine management of the game. Routine events ought to be managed through more ephemeral storage. Some secondary process should organize ephemeral events for eventual storage in a database. At the extreme, you could imagine there being just one database read and one database write per character per session.
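A rough sketch of that "buffer first, persist later" idea, assuming an in-process queue and a SQLite table invented purely for illustration (a real game would use a proper queue and a server database):

import sqlite3
from collections import deque

events = deque()  # ephemeral, in-process buffer of routine game events

def record(player_id, kind, payload):
    # Hot path: no database work at all, just append to the buffer.
    events.append((player_id, kind, payload))

def flush(conn):
    # Secondary step: drain the buffer and persist everything in one transaction.
    batch = [events.popleft() for _ in range(len(events))]
    if batch:
        with conn:
            conn.executemany(
                'INSERT INTO game_events (player_id, kind, payload) VALUES (?, ?, ?)',
                batch)

conn = sqlite3.connect('game.db')
conn.execute('CREATE TABLE IF NOT EXISTS game_events (player_id INTEGER, kind TEXT, payload TEXT)')
record(1, 'login', '{}')
record(1, 'move', '{"x": 3, "y": 7}')
flush(conn)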
Have you considered NoSQL?
NoSQL database systems are often highly optimized for retrieval and appending operations and often offer little functionality beyond record storage (e.g. key–value stores). The reduced run-time flexibility compared to full SQL systems is compensated by marked gains in scalability and performance for certain data models.
In short, NoSQL database management systems are useful when working with a huge quantity of data when the data's nature does not require a relational model. The data can be structured, but NoSQL is used when what really matters is the ability to store and retrieve great quantities of data, not the relationships between the elements. Usage examples might be to store millions of key–value pairs in one or a few associative arrays or to store millions of data records. This organization is particularly useful for statistical or real-time analyses of growing lists of elements (such as Twitter posts or the Internet server logs from a large group of users).
There are higher-level NoSQL solutions, for example CouchDB, which has built-in replication support.
When a web app queries some information frequently, how can I improve performance by caching the query results?
(The information is something like the top news on a website; my database is SQL Server 2008, and the application runs on Tomcat.)
I can suggest the following:
In your database you can use indexed views; please check: How to mimic Oracle materialized views on MS SQL Server?
If you use JPA or Hibernate, it can cache entities (objects).
http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/performance.html#performance-cache
http://en.wikibooks.org/wiki/Java_Persistence/Caching
If you're looking for a cache system that is independent of the database and the ORM, you can look at Memcached or Ehcache (a small sketch follows the links below).
http://memcached.org/
http://ehcache.org/
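Their stack is Java on Tomcat, but as a language-neutral sketch of the cache-aside pattern, here is what it might look like in Python with the pymemcache client; the key name, TTL, and load_top_news function are placeholders:

import json
from pymemcache.client.base import Client

cache = Client(('127.0.0.1', 11211))

def load_top_news():
    # Placeholder for the real database query (e.g. via JDBC/JPA on a Java stack).
    return [{'id': 1, 'title': 'Example headline'}]

def get_top_news():
    cached = cache.get('top_news')
    if cached is not None:
        return json.loads(cached)          # cache hit: no database round trip
    news = load_top_news()                 # cache miss: query the database
    cache.set('top_news', json.dumps(news), expire=300)  # keep for 5 minutes
    return news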
An option, though not recommended, is to manage a cache in your application yourself; for example, you can store the list of countries in the ServletContext (also known as the application context), but then you need to implement the cache logic (update, delete and insert objects) yourself, and you need to be careful with heap memory.
You can use a combination of the above strategies; it depends on the context of your business.
Best regards,
Ernesto.
This is a pretty general question and, as you'd expect, there are many options.
Closest to the UI, your web platform might have 'content caching.' ASP.NET, for example, will cache portions of a page for specified periods of time.
You could use a caching tool like memcached and cache a recordset (or whatever the stand-alone Java data structure is).
Some ORMs provide caching too.
And (this list is probably not exhaustive) you could define structures in your database to 'cache' results like this: run the complex queries and save the results into tables that are queried more often but are cheaper to query.
Just some ideas.
The answer for a really big site is: all of the above. We do all our queries via stored procs. That helps because the query is compiled and one execution plan is reused.
We have a wickedly complicated table-valued function. It's so expensive that we built a cache table. The table has the same general format as the function's result set, but with two extras: one is an expiry time, the other is a search key. The search key is the parameters that go into the function, concatenated together. Whenever we're about to query that table, we run a proc to check whether the data is stale. If it is, we start a transaction, delete the rows, then run the function and insert the fresh rows. This means we run the function maybe 2 or 3% of the times we used to, and the proc call we make to check for staleness is much cheaper. Whenever the app updates the relevant data, it marks the cache rows as stale, but it doesn't delete them; we leave that to the cache-check proc. Why? Maybe nobody will need that data right now, so fewer DB hits.
Then we hit the second layer: we cache many recordsets in memcached, including the results of all the procs that call that function, and many more. That actually happens in our ASP layer, which we still have. ADO recordsets can be persisted to XML natively, which then goes into memcached as a string.
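A loose paraphrase of that cache-table idea (an expiry time plus a search key built from the parameters), sketched here with Python and SQLite rather than the stored procedures and table-valued function described above; all names are invented:

import sqlite3
import time

conn = sqlite3.connect('cache.db')
conn.execute("""CREATE TABLE IF NOT EXISTS result_cache (
    search_key TEXT PRIMARY KEY, payload TEXT, expires_at REAL)""")

def cached_call(expensive_fn, *params, ttl=600):
    key = '|'.join(str(p) for p in params)      # parameters concatenated, as in the answer
    row = conn.execute(
        'SELECT payload, expires_at FROM result_cache WHERE search_key = ?',
        (key,)).fetchone()
    if row and row[1] > time.time():
        return row[0]                           # still fresh: skip the expensive call
    payload = expensive_fn(*params)             # stale or missing: recompute
    with conn:
        conn.execute(
            'INSERT OR REPLACE INTO result_cache VALUES (?, ?, ?)',
            (key, payload, time.time() + ttl))
    return payload

# Example use: result = cached_call(run_expensive_report, '2013-01', 'EU')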
I would like to remove the SQL dependency for small chunks of data that I load on (almost) every request in a web application. Most of the data is key-value/document structured, but a relational solution is not excluded. The data is not too big, so I want to keep it in memory for higher availability.
What solution would you recommend?
The simplest and most widely used in-memory key-value store is memcached. The introduction page reiterates what you are asking for:
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
The client list is impressive. It's been around for a long time. The documentation is good. It has APIs for almost every programming language. Horizontal scaling is pretty simple. In my experience, memcached is good.
You may also want to look into MemBase.
Redis is perfect for this kind of data. It also supports some fundamental data structures and provides operations on them.
I recently converted my Django forum app to use it for all real-time/tracking data - it's so good to no longer have the icky feeling you get when you do this kind of stuff (SET views = views + 1 and other writes on every page view) with a relational database.
Here's an example of using Redis to store data required for user activity tracking, including keeping an ordered set of last seen users up to date, in Python:
import datetime
import time

from django.utils.html import escape

# `redis` is an already-connected client instance; ACTIVE_USERS, USER_USERNAME,
# USER_LAST_SEEN and USER_DOING are key-name constants defined elsewhere.

def seen_user(user, doing, item=None):
    """
    Stores what a User was doing when they were last seen and updates
    their last seen time in the active users sorted set.
    """
    last_seen = int(time.mktime(datetime.datetime.now().timetuple()))
    # redis-py 2.x argument order; redis-py >= 3.0 takes a mapping instead:
    # redis.zadd(ACTIVE_USERS, {user.pk: last_seen})
    redis.zadd(ACTIVE_USERS, user.pk, last_seen)
    redis.setnx(USER_USERNAME % user.pk, user.username)
    redis.set(USER_LAST_SEEN % user.pk, last_seen)
    if item:
        # Append the item's URL and an escaped label for it.
        doing = '%s %s (%s)' % (
            doing, item.get_absolute_url(), escape(str(item)))
    redis.set(USER_DOING % user.pk, doing)
If you don't mind the SQL but want to keep the DB in memory, you might want to check out SQLite (see http://www.sqlite.org/inmemorydb.html).
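A minimal sketch of that option, using Python's built-in sqlite3 module with the special ':memory:' name:

import sqlite3

# ':memory:' keeps the whole database in RAM; it disappears when the connection closes.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)')
conn.execute("INSERT INTO kv VALUES ('greeting', 'hello')")
print(conn.execute("SELECT value FROM kv WHERE key = 'greeting'").fetchone()[0])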
If you don't want the sql and you really only have key-value pairs, why not just store them in a map / hash / associative array and be done with it?
If you end up needing an in-memory database, H2 is a very good option.
One more database to consider: Berkeley DB. Berkeley DB allows you to configure the database to be in-memory, on-disk or both. It supports both a key-value (NoSQL) and a SQL API. Berkeley DB is often used in combination with web applications because it's embedded, easily deployed (it deploys with your application), highly configurable and very reliable. There are several e-Retail web sites that rely on Berkeley DB for their e-Commerce applications, including Amazon.com.
I'm not sure this is what you are looking for, but you should look into a caching framework (something that may already be included in the tools you are using now). With the repository pattern, you ask for the data and check whether you have it in the cache by key. If you don't, you fetch it from the database; if you do, you fetch it from the cache.
How long to keep data in the cache depends on the kind of data you are handling, so that's up to you. Perhaps a sliding timeout is best, as you'll keep the data as long as the key keeps being requested. That means that once the cache has data for a user and the user goes away, the data will expire from the cache.
Can you shard this data? Is the data access pattern simple and stable (i.e. it does not change with changing business requirements)? How critical is this data (session context, for example, is not too hard to restore, whereas some preferences a user has entered on a settings page should not be lost)?
Typically, provided you can shard and your data access patterns are simple and do not mutate too much, you choose Redis. If you look for something more reliable and supporting more advanced data access patterns, Tarantool is a good option.
Please do check out this:
http://www.mongodb.org/
It's a really good NoSQL database with drivers and support for all major languages.
What would be the best DB for inserting records at a very high rate?
The DB will have only one table and the application is very simple: insert a row into the DB and commit it, but the insertion rate will be very high. I'm targeting about 5000 row inserts per second.
Any of the very expensive DBs like Oracle or SQL Server are out of the question.
Also, what are the technologies for taking a DB backup, and will it be possible to create one DB from the older backed-up DBs?
I can't use the in-memory capabilities of any DB, as I can't afford to lose data if the application crashes. I need to commit each row as soon as I receive it.
If your main goal is to insert a lot of data in a little time, perhaps the filesystem is all you need.
Why not write the data to a file, optionally in a DB-friendly format (CSV, XML, ...)? That way you can probably achieve ten times your performance goal without too much trouble. And most OSs are robust enough nowadays to prevent data loss on application failures.
Edit: as said below, journaling file systems are pretty much designed so that data is not lost in case of software (or even hardware, in the case of RAID arrays) failures. ZFS has a good reputation.
Postgres provides a WAL (Write-Ahead Log), which essentially does inserts into RAM until the buffer is full or the system has time to breathe. You combine a large WAL cache with a UPS (for safety) and you have very efficient insert performance.
If you can't do SQLite, I'd take a look at Firebird SQL if I were you.
To get high throughput you will need to batch inserts into a big transaction. I really doubt you could find any db that allows you to round trip 5000 times a second from your client.
SQLite can handle tons of inserts (25K per second in a single transaction) provided things are not too multithreaded and the inserts are batched.
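A small sketch of that batching point, using Python's sqlite3 (the table name and sample rows are made up); the key is one commit per batch rather than one per row:

import sqlite3

conn = sqlite3.connect('ingest.db')
conn.execute('CREATE TABLE IF NOT EXISTS readings (ts REAL, value REAL)')

def insert_batch(rows):
    # One transaction (and one commit) for the whole batch instead of one per row.
    with conn:
        conn.executemany('INSERT INTO readings VALUES (?, ?)', rows)

insert_batch([(1.0, 42.0), (2.0, 43.5), (3.0, 41.9)])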
Also, if structured correctly, I see no reason why MySQL or Postgres would not support 5000 rows per second (provided the rows are not too fat). Both MySQL and Postgres are a lot more forgiving of a larger number of transactions.
The performance you want is really not that hard to achieve, even on a "traditional" relational DBMS. If you look at the results for unclustered TPC-C (the de facto standard benchmark for transaction processing), many systems can provide ten times your requirement on a single, unclustered machine. If you are going for cheap and solid, you might want to check out DB2 Express-C. It is limited to two cores and two gigabytes of memory, but that should be more than enough to satisfy your needs.