Is an in-memory database a viable backup option for performing read operations in case of database failures? One could insert data into an in-memory database once in a while, and if the database server/web server goes down (a rare occurrence), one could still access the data in the in-memory database outside of the web server.
If you're going to hold your entire database in memory, you might just as well perform all operations there and hold your backup on disk.
No: a power outage means your database is gone. The same happens if the DB process dies and the OS deallocates all the memory it was using.
I'd recommend a second hard drive, external or internal, and dump the data to that hard drive.
It probably depends on your database usage. For instance, it would be hard for me to imagine Stack Overflow doing this.
On the other hand, not every application is SO. If your database usage is limited, you could take a cue from mobile applications, which accept the fact that a server may not always be available, and treat your web application as though it were a mobile client. See Architecting Disconnected Mobile Applications Using a Service Oriented Architecture.
Related
What is a memory database? Is SQLite a memory database?
In this mode, does it support persisting data to a local file?
An in-memory database supports all operations and database access syntax, but doesn't actually persist; it's just data structures in memory. This makes it fast, and great for developer experimentation and (relatively small amounts of) temporary data, but isn't suited to anything where you want the data to persist (it's persisting the data that really costs, yet that's the #1 reason for using a database) or where the overall dataset is larger than you can comfortably fit in your available physical memory.
SQLite databases are created coupled to a particular file, or to the pseudo-file “:memory:”, which is used when you want an in-memory database. You can't change the location of a database while it is open, and an in-memory database is disposed of when you close its connection; the only way to persist it is to use queries to pull data out of it and write it somewhere else (e.g., an on-disk database, or some kind of dump file).
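For example, with Python's built-in sqlite3 module (just a sketch; the table and file names are made up), you can work against ":memory:" and use the backup API to copy the contents to an on-disk database when you do want to keep them:

```python
import sqlite3

# Work entirely in memory: fast, but gone as soon as the connection closes.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, user TEXT)")
mem.execute("INSERT INTO sessions (user) VALUES (?)", ("alice",))
mem.commit()

# Persist a copy to disk before closing (Connection.backup, Python 3.7+).
disk = sqlite3.connect("sessions_snapshot.db")
with disk:
    mem.backup(disk)

disk.close()
mem.close()
```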
SQLite supports memory-only databases - it's one of the options. It is useful when persistence is not important, but being able to execute SQL queries quickly against relational data is.
The detailed description of in-memory databases:
https://www.sqlite.org/inmemorydb.html
Typically I use a database such as MySQL or PostgreSQL on the same machine as the application using it, which makes access easy and secure. I'm just now building my first site that will have a separate physical database server (it will later this year). I'm wondering three things:
(security) What should I look into, for starters, regarding the security of accessing a separate machine's database?
(scalability) Are there scalability issues that I should think about pertaining to this (technology agnostic)?
(more ServerFaultish, but related) If I start the DB on the same physical server (in a separate VMware VM) and later move it to a different physical server, are there implicit problems that I'll have to deal with? Isn't another VM still accessed via localhost?
If these questions are completely ludicrous, I apologize to you DB experts.
Easy, I'll grant you. Secure... well, security has very little to do with the physical location of the database server.
To get to your three questions though:
First, look at how you can limit access to database tables using the database server's security model. Namely, if your application does not need to drop tables, make sure the user it uses to connect does not have that ability. Second, look into how to encrypt the connection between the database server and your application. In Windows this is pretty transparent through Kerberos and can even be enforced by group policy settings; I'm not sure about other platforms. Third, look into what features the database has for encrypting the data "at rest". Meaning, does it natively support encryption of the actual data files themselves?
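As a rough illustration of the first two points (assuming PostgreSQL and the psycopg2 driver purely for the sketch; the host, database, and account names are made up), the application would connect with a restricted account and require an encrypted connection:

```python
import psycopg2

# Connect as a least-privilege account (one granted only SELECT/INSERT/UPDATE
# on the tables the app needs, and certainly not DROP/CREATE), and require TLS
# so traffic between the web tier and the DB server can't be read on the wire.
conn = psycopg2.connect(
    host="db.internal.example.com",
    dbname="appdb",
    user="app_readwrite",   # hypothetical restricted account
    password="secret",
    sslmode="require",      # refuse unencrypted connections
)

with conn, conn.cursor() as cur:
    cur.execute("SELECT id, name FROM customers WHERE id = %s", (42,))
    print(cur.fetchone())

conn.close()
```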
The point here is that your application is only one possible entry point to the database server itself. Ask yourself what would happen if someone could connect directly, without going through your application, using your app's credentials. Next ask what could happen if they found a SQL injection issue. Also ask yourself what information could be gleaned if someone were able to monitor the IP traffic going between your app and the server: can they discern any data? Finally, ask yourself what would happen if they got a copy of the database itself.
The lengths you go to for #1 will depend on several factors, such as how valuable the data is (e.g., what would happen to you, your company, or your clients if it were lost) and how much time you have to come up with an ideal solution.
Scalability: This is purely a function of load. Unfortunately, the only way to scale most database applications is to scale up, meaning that you acquire a larger database server as the need arises. Stack Overflow went through this not too long ago. Some database types (NoSQL stores such as MongoDB) support a concept known as sharding; MySQL, PostgreSQL, etc. don't. Instead you'll have to specifically design the app to handle it, which means not using things like auto-incrementing keys. This can be a royal PITA... which is why scaling up is a much easier prospect, depending on your application.
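For instance, one common way to avoid auto-incrementing keys when rows may end up spread across shards is to generate globally unique IDs in the application (a minimal sketch; the table and column names are made up):

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL)")

# A UUID is unique across shards, so rows created on different servers can
# later be merged or moved without the key collisions auto-increment causes.
order_id = str(uuid.uuid4())
conn.execute("INSERT INTO orders (id, total) VALUES (?, ?)", (order_id, 19.99))
conn.commit()
print(order_id)
```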
Another VM is not accessible via "localhost". localhost refers to the current server, and whether that server is a VM or not is immaterial; you'll have to reference your database server by name. Transitioning the database VM to another physical server should then have zero impact, as you are referencing it by name. Beyond that there aren't any other considerations.
In addition to Chris's valid response,
Security
Use a security mechanism on the network in addition to whatever security features the database or app framework provides. Perhaps this is as simple as firewalling the network, running IPsec, or running over an SSL tunnel. The point is that you shouldn't assume the DB authors are network security experts, or that the DB authentication mechanism has even addressed network security at all.
Scalability
One scalability issue comes to mind when moving from a local to a remote DB: remote TCP/IP communication is much slower than local pipe communication. Your app may have hidden scalability issues due to frequent round trips to the DB, because it waits for each DB response before issuing the next query. On a local system the latency is so small you may never have noticed it.
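A rough sketch of the difference batching makes (assuming psycopg2 against a remote PostgreSQL server; the table, host, and credentials are made up):

```python
import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect(host="db.example.com", dbname="appdb",
                        user="app", password="secret")
rows = [(i, f"item-{i}") for i in range(1000)]

with conn, conn.cursor() as cur:
    # Chatty version: 1000 separate round trips, each paying the full
    # network latency to the remote server.
    # for row in rows:
    #     cur.execute("INSERT INTO items (id, name) VALUES (%s, %s)", row)

    # Batched version: many rows per statement, so only a handful of
    # round trips regardless of how many rows are inserted.
    execute_values(cur, "INSERT INTO items (id, name) VALUES %s", rows)
```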
I'm writing a Comet application that has to keep track of each open connection to the server. I want to write an entry to the database for each connection, and I will have to search the database for the proper connections every time the application receives new data (often), which is why I don't want to start off on the wrong foot by choosing slow database software. Any suggestions for a database that favors rapid, small pieces of data (rather than occasional large pieces of data)?
I suggest instead using a server platform that allows the creation of persistent servers that keep all such info in memory. All database access is then limited to writes (if you want to actually save any information permanently), which are usually significantly less frequent in typical Comet apps (such as chats/games).
Databases are not made to keep such data. Accessing a database directly always means composing query strings, often sending them to a DB server (sometimes even over the network), the DB lookup itself, serialization of the results, sending them back, deserialization, and traversing the fetched results. There is no way this can be anywhere near as fast as just retrieving a value from memory.
If you really want to stick with PHP, then I suggest you have a look at memcached and similar caching servers.
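A minimal sketch of the idea, using the pymemcache client from Python purely for illustration (the key names and payloads are made up; the PHP memcached extension works along the same lines):

```python
import json
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

# Register an open Comet connection: a key lookup instead of a SQL query.
conn_id = "conn:42"
cache.set(conn_id,
          json.dumps({"user": "alice", "channel": "lobby"}).encode(),
          expire=300)  # stale entries expire automatically

# When new data arrives, find the connection without touching the database.
raw = cache.get(conn_id)
if raw is not None:
    info = json.loads(raw)
    print("deliver to", info["user"])

# When the connection closes, drop the entry.
cache.delete(conn_id)
```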
greetz
back2dos
SQL Server 2008 has a FileStream data type that can be used for rapid, small pieces of data. McLaren Electronic Systems uses it to capture and analyze telemetry/sensor data from Formula One race cars.
Hypersonic: http://hsqldb.org/
MySQL (for webapps)
I just wanted to know what you guys think about this.
I have an app written in Visual Basic .NET as my front end and an Oracle 11g Standard database as the back end. I have about 20 PCs running this app locally; they're all inserting, updating, and deleting data on this single database. I want to develop a solution for the case where the server database crashes or cannot stay online. So I'm thinking of having Oracle 10g XE on each PC, so that all the data is stored in the local DB, and running a process once in a while (e.g., every 15 minutes) to send/get the data to/from the main server. What do you think about this strategy?
Oracle does have a mechanism for sharing data between databases, called Replication. Oracle XE supports Basic Replication (read-only and updateable materialized view site only). Obviously it depends on the specifics of your requirements, but from the little you have given us this might be a viable solution for you. Run each POS off its own Oracle XE database with regular synchronisations to the main (master) database.
Each POS has its data in updatable materialized views. That is, it can read and write its own data to the local XE database. These materialized views are part of a replication group which synchronizes their data with a master table in the main database. Going the other way, the main database pushes its product data to read-only materialized views in the POS databases. The value of this architecture is that the POS machines always connect to their local XE databases and never connect to the master database. This is a lot cleaner than connecting to the central database most of the time and switching to local databases in an emergency.
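As a very rough sketch of the read-only direction (product data flowing from the master down to a POS database), shown here with Python and cx_Oracle purely for convenience: the schema, database link, and credentials are made up, and updatable or fast-refreshable materialized views need additional setup (materialized view logs, replication groups) covered in the Oracle documentation.

```python
import cx_Oracle

# Connect to the local XE database on the POS machine (hypothetical credentials).
conn = cx_Oracle.connect("pos_user", "pos_pw", "localhost/XE")
cur = conn.cursor()

# A read-only copy of the master's product table, pulled over a database link.
# COMPLETE refresh keeps the sketch simple; FAST refresh needs an MV log on the master.
cur.execute("""
    CREATE MATERIALIZED VIEW products_mv
    REFRESH COMPLETE ON DEMAND
    AS SELECT * FROM products@master_link
""")

# Run periodically (e.g., every 15 minutes) to pull down the latest product data.
cur.callproc("DBMS_MVIEW.REFRESH", ["PRODUCTS_MV"])
conn.commit()
```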
Unfortunately the documentation is a bit confusing, because it is called Advanced Replication and doesn't really mention "basic replication" at all. But Basic Replication covers most things - Advanced Replication is mainly Writeable Materialized Views and Multi-Master replication, neither of which you need anyway. I'm not saying Replication is easy, because it does cover some tricksy concepts. But using Oracle's built-in functionality has surely got to be better than rolling your own.
Note that your system would still be extremely exposed to the failure of the main database. Your client may think another Oracle license is a bit pricy (I wouldn't disagree). However, in extreme cases, failure to recover a database can kill a company.
This sounds like a horrendous idea. Duplicating data from one database to another is a complex subject, and the process you're describing involves 20 duplications!
To be of any use in the event of a crash, you will also need a two-way replication mechanism: the 20 clients will continue to update their local DBs. How do you deal with concurrent updates? The merging process alone, with 20 databases, will cost so much in resources that it would have been cheaper to have a tried and tested professional DR (Disaster Recovery) process.
A true standby database on the other hand would be simpler to deploy, simpler to test, simpler to maintain and will cost less in resources. I suggest you don't reinvent the wheel :)
Edit:
By the way if you just want a backup and recovery plan, duplicating the database is NOT the solution. I suggest you read the online documentation about recovery:
Oracle Database Backup and Recovery Basics
Oracle Database Backup and Recovery Advanced User's Guide
I had the "pleasure" of trying to make exactly this sort of solution more robust on a SQL Server based POS system. As Vincent says, it's a complex process, fraught with unforeseen nightmare scenarios and difficult-to-maintain code (e.g., ugly DOS .bat files I had to write). I would have to agree with him that a standby scenario is the more robust solution.
That said, if your client won't spring for another license (and I do see their point) you seem to be stuck doing exactly this sort of thing. It can be done, but let your client know that the homegrown replication system is going to be a costly one, and will likely take quite some time to get the wrinkles worked out. It also probably won't scale well as the number of retail sites increases.
I run a very high-traffic (10M impressions a day), high-revenue web site built with .NET. The core meta data is stored on a SQL Server. My team and I have a unique caching strategy that involves querying the database for new meta data at regular intervals from a middle-tier server, serializing the data to files, and sending those to the web nodes. The web application uses the data in these files (some are actually serialized objects) to instantiate objects and caches those in memory to use for real-time requests.
The advantages of this model are that it:
Allows the web nodes to cache all data in memory without incurring any IO overhead from querying a database.
Keeps the web servers running and generating revenue if the database ever goes down, either unexpectedly or for maintenance windows. You can even fire up a web server without having to retrieve its initial data from the DB, because all the data it needs is in files on its own disks.
Allows us to be completely horizontally scalable: if throughput suffers, we can just add a web server.
The disadvantage is that this caching and persistence layer adds complexity to the code that queries the database, packages the data, and unpackages it on the web server. Any time our domain model requires us to add entities, more of this "plumbing" has to be coded. This architecture has been in place for four years and there are probably better ways to tackle it.
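For concreteness, the file hand-off amounts to something like the following (a minimal Python sketch with made-up names, standing in for the actual .NET serialization code):

```python
import pickle
import sqlite3

SNAPSHOT = "metadata_snapshot.pkl"

def publish_snapshot(db_path):
    """Middle tier: query the database and serialize the result to a file
    that gets shipped to every web node."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT id, name, price FROM products").fetchall()
    with open(SNAPSHOT, "wb") as f:
        pickle.dump(rows, f)
    conn.close()

def load_snapshot():
    """Web node: hydrate in-memory objects from the local file, never
    touching the database at request time."""
    with open(SNAPSHOT, "rb") as f:
        return {row[0]: row for row in pickle.load(f)}
```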
One strategy I have been considering is using replication to replicate our master sql server database to local database instances installed on each web server. The web server application would use normal sql/ORM techniques to instantiate objects. Here, we can still sustain a master database outage and we would not have to code up specialized caching code and could instead use nHibernate to handle the persistence.
This seems like a more elegant solution, and I would like to see what others think, or whether anyone else has alternatives to suggest.
I think you're overthinking this. SQL Server already has mechanisms available to you to handle these kinds of things.
First, implement a SQL Server cluster to protect your main database. You can fail over from node to node in the cluster without losing data, and downtime is a matter of seconds, max.
Second, implement database mirroring to protect from a cluster failure. Depending on whether you use synchronous or asynchronous mirroring, your mirrored server will either be updated in realtime or a few minutes behind. If you do it in realtime, you can fail over to the mirror automatically inside your app - SQL Server 2005 & above support embedding the mirror server's name in the connection string, so you don't even have to lift a finger. The app just connects to whatever server's live.
Between these two things, you're protected from just about any main database failure short of a datacenter-wide power outage or network outage, and there's none of the complexity of the replication stuff. That covers your high availability issue, and lets you answer the scaling question separately.
My favorite starting point for scaling is using three separate connection strings in your application, choosing the right one based on the needs of your query:
Realtime - Points directly at the one master server. All writes go to this connection string, and only the most mission-critical reads go here.
Near-Realtime - Points at a load balanced pool of read-only SQL Servers that are getting updated by replication or log shipping. In your original design, these lived on the web servers, but that's dangerous practice and a maintenance nightmare. SQL Server needs a lot of memory (not to mention money for licensing) and you don't want to be tied into adding a database server for every single web server.
Delayed Reporting - In your environment right now, it's going to point to the same load-balanced pool of subscribers, but down the road you can use a technology like log shipping to have a pool of servers 8-24 hours behind. These scale out really well, but the data's far behind. It's great for reporting, search, long-term history, and other non-realtime needs.
If you design your app to use those 3 connection strings from the start, scaling is a lot easier, and doesn't involve any coding complexity - just pick the right connection string.
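In code, that can be as simple as keeping the three strings side by side and picking one per query. A Python sketch with made-up server names (a .NET app would do the equivalent in configuration plus a small helper):

```python
# Hypothetical connection strings for the three tiers of data freshness.
CONNECTIONS = {
    "realtime":      "Server=sql-master;Database=App;...",    # writes + mission-critical reads
    "near_realtime": "Server=sql-readpool;Database=App;...",  # load-balanced read-only replicas
    "delayed":       "Server=sql-reporting;Database=App;...", # hours behind, reporting only
}

def connection_for(query_kind):
    """Pick the connection string based on how fresh the data must be."""
    if query_kind in ("write", "critical_read"):
        return CONNECTIONS["realtime"]
    if query_kind == "read":
        return CONNECTIONS["near_realtime"]
    return CONNECTIONS["delayed"]

print(connection_for("read"))
```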
Have you considered memcached? It:
is in memory
can run locally
is fully scalable horizontally
prevents the need to re-cache on each web server
It may fit the bill. Check out Google for lots of details and usage stories.
Just some additions to what RickNZ proposed above.
Since the master data you are currently caching won't change frequently (probably only during some maintenance window), here is what you should do first on the database side:
Create a snapshot replication publication for the master tables which you want to cache. Adding new entities later will be equally easy.
On all the web servers, install SQL Express and subscribe to this publication.
Since this is not frequently changing data, you can rest assured there won't be much server resource usage, apart from the network trips for the master data.
All the caching that was available via the previous mechanism is still available, minus the headaches that come when you add new entities.
Next, you can leverage the .NET mechanisms suggested above. You won't face a memcached cluster failure unless your web server itself goes down. There is a lot available in .NET which a .NET pro can point out after this stage.
It seems to me that Windows Server AppFabric is exactly what you are looking for. (AKA "Velocity"). From the introductory documentation:
Windows Server AppFabric provides a distributed in-memory application cache platform for developing scalable, available, and high-performance applications. AppFabric fuses memory across multiple computers to give a single unified cache view to applications. Applications can store any serializable CLR object without worrying about where the object gets stored. Scalability can be achieved by simply adding more computers on demand. The cache also allows for copies of data to be stored across the cluster, thus protecting data against failures. It runs as a service accessed over the network. In addition, Windows Server AppFabric provides seamless integration with ASP.NET that enables ASP.NET session objects to be stored in the distributed cache without having to write to databases. This increases both the performance and scalability of ASP.NET applications.
Have you considered using SqlDependency caching?
You could also write the data to the local disk at the web tier, if you're concerned about initial start-up time or DB outages. But at least with a SqlDependency, you shouldn't have to poll the DB to look for changes. It can also be made relatively transparent.
In my experience, adding a DB instance on web servers generally doesn't work out too well from a scalability or performance perspective.
If you're concerned about performance and scalability, you might consider partitioning your data tier. The specifics depend on your app, but as an example, you could move read-only data onto a couple of SQL Express servers that are populated with replication.
In case it helps, I talk about this subject at length in my book (Ultra-Fast ASP.NET).