I have got some tabular data that, due to unrelated issues, is proving too slow to get out of SQL Server in real time. As we get more users this will only get worse, so I am thinking of using Redis as a front-end cache to store users' tabular, pageable data.
This data can become stale after about 10 minutes, at which point I would like to fetch the record set again and put it in Redis.
The app is a .NET MVC app. I was thinking that when the user logs into the app this data gets pulled out of the database (which takes around 10 seconds) and put into Redis, ready to be consumed by the MVC client. I would put an expiry on that data, and when it becomes stale it would get refetched from the SQL Server database.
Does this all sound reasonable? I'm a little bit scared that:
The user could get to the page before the data is in Redis
If Redis goes down or does not respond, I need to ensure that the ViewModel can be filled directly from SQL Server without Redis being there
I would go with the ServiceStack Redis implementation; it has all the pieces you need. Redis is particularly good for caching compared with other NoSQL stores. But if you have a very read/write-heavy application, I would suggest looking at a NoSQL database combined with SQL Server, which will help with scalability.
Let me know if you need any further details. You just need to run the NuGet command and you are almost up and running.
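For illustration, here is a minimal sketch of that idea with ServiceStack.Redis, assuming a pooled client manager is wired up at startup; the UserGridRow type and the key format are made up for the example:

```csharp
using System;
using System.Collections.Generic;
using ServiceStack.Redis;

// Hypothetical row type standing in for the real tabular, pageable data.
public class UserGridRow
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class GridCache
{
    private readonly IRedisClientsManager _redisManager;

    public GridCache(IRedisClientsManager redisManager)
    {
        _redisManager = redisManager; // e.g. new PooledRedisClientManager("localhost:6379")
    }

    public void StoreRows(int userId, List<UserGridRow> rows)
    {
        using (var redis = _redisManager.GetClient())
        {
            // Expire after 10 minutes so stale data drops out on its own.
            redis.Set("user:" + userId + ":grid", rows, TimeSpan.FromMinutes(10));
        }
    }

    public List<UserGridRow> TryGetRows(int userId)
    {
        using (var redis = _redisManager.GetClient())
        {
            // Returns null when the key is missing or has expired.
            return redis.Get<List<UserGridRow>>("user:" + userId + ":grid");
        }
    }
}
```

On a cache miss the call returns null and the ViewModel can fall back to SQL Server; wrapping the Redis calls in a try/catch gives the same fallback path if Redis itself is unavailable.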
You could use something like MemcacheD to store cached pages in memory.
You can set a validity of 10 minutes on a cached object. After that the cache will automatically remove the object.
Your actual repository would have to do these steps:
1. Check the cache for the data you want; if it is there, great, use it
2. If the cached data doesn't exist, go to SQL server to retrieve it
3. Update the cache with data returned from SQL server
I've used the Enyim client before. It works great. https://github.com/enyim/EnyimMemcached
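A rough cache-aside sketch of those three steps using the Enyim client; the repository shape, the key format and LoadFromSqlServer() are placeholders for your own data access code:

```csharp
using System;
using System.Collections.Generic;
using Enyim.Caching;
using Enyim.Caching.Memcached;

[Serializable] // the default transcoder serializes non-primitive types
public class ReportRow { public int Id { get; set; } }

public class ReportRepository
{
    private readonly MemcachedClient _cache = new MemcachedClient();

    public List<ReportRow> GetReport(int userId)
    {
        string key = "report:" + userId;

        // 1. Check the cache first.
        var cached = _cache.Get<List<ReportRow>>(key);
        if (cached != null)
            return cached;

        // 2. Cache miss: go to SQL Server.
        List<ReportRow> rows = LoadFromSqlServer(userId);

        // 3. Store it with a 10 minute validity so it expires on its own.
        _cache.Store(StoreMode.Set, key, rows, TimeSpan.FromMinutes(10));
        return rows;
    }

    private List<ReportRow> LoadFromSqlServer(int userId)
    {
        // your existing (slow) query goes here
        return new List<ReportRow>();
    }
}
```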
I might also use something like Quartz to schedule a background task to prime the cache. http://quartznet.sourceforge.net/
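And a hedged sketch of priming the cache on a schedule, assuming a recent Quartz.NET (3.x, async API); PrimeCacheJob and the 10 minute interval are just examples:

```csharp
using System.Threading.Tasks;
using Quartz;
using Quartz.Impl;

// Job that reloads the expensive record set and pushes it into the cache.
public class PrimeCacheJob : IJob
{
    public async Task Execute(IJobExecutionContext context)
    {
        // call the same load-and-cache code the login path uses
        await Task.CompletedTask;
    }
}

public static class CacheScheduler
{
    public static async Task StartAsync()
    {
        IScheduler scheduler = await StdSchedulerFactory.GetDefaultScheduler();
        await scheduler.Start();

        IJobDetail job = JobBuilder.Create<PrimeCacheJob>().WithIdentity("primeCache").Build();
        ITrigger trigger = TriggerBuilder.Create()
            .StartNow()
            .WithSimpleSchedule(s => s.WithIntervalInMinutes(10).RepeatForever())
            .Build();

        await scheduler.ScheduleJob(job, trigger);
    }
}
```

Running the priming job before users land on the page also addresses the worry about someone reaching the page before the data is in the cache.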
I have a service that posts data to a web server (ASP.NET Core 3.1) each second, which I store to a SQL Server database using EF Core 3.1.
Up until now, when trying to store new data, I have done the following for each new data row separately:
Checked whether the data entity exists in the database (the entity type is configured with an .IsUnique() index in the OnModelCreating() method)
If it does not exist - added the single entity
DBContext.SaveChanges()
However, this seems a bit "heavy" on the SQL server, with quite a lot of calls. It is running on Azure, and sometimes it seems the database has trouble keeping up and the web server starts returning 500 (internal server error, as far as I understand). This sometimes happens when someone calls another controller on the web server and tries to retrieve some data (larger chunks) from the SQL server. (That is perhaps for another question - about Azure SQL Server reliability.)
Is it better to keep a buffer on the web server and save everything in one go, e.g. DBContext.AddRange(entities), with a somewhat coarser time resolution (i.e. once per minute)? I do not know exactly what happens if one or more of the rows are duplicates. Are the non-duplicates stored, or are all the inserts refused? (I can't seem to find an explanation of this.)
Any help on the matter is much appreciated.
EDIT 2021-02-08:
I try to expand a bit on the situation:
outside my control: MQTT Broker(publishing messages)
in my control:
MQTT client (currently as an azure webjob), subscribes to MQTT Broker
ASP.NET server
SQL Database
The MQTT client is collecting and grouping messages from different sensors from mqtt broker into a format that (more or less) can be stored directly in the database.
The ASP.NET server acts as a middle-man between the MQTT client and the SQL database, BUT ALSO continuously sends "live" updates to anyone visiting the website. So currently the web server has many jobs (perhaps the problem arises here??):
receive data from the MQTT client
store/retrieve data to/from the database
serve visitors with "live" data from MQTT client as well as historic data from database
Hope this helps with the understanding.
I ended up with a buffer service built around a ConcurrentDictionary that I use in my ASP.NET controller. That way I can make sure duplicates are handled in my code in a controlled way (the existing entry is updated or the new one discarded, based on the quality of the received data). Each minute I flush the last minute's data to the database, so I always keep one minute of data in the buffer. Bonus: I can also serve current data to visitors much more quickly from the buffer service instead of going to the database.
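A minimal sketch of that buffer-service approach, assuming ASP.NET Core 3.1 with a hosted BackgroundService; SensorReading, MyDbContext and the quality-based duplicate rule are illustrative stand-ins for the real types:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Hypothetical entity and context; your real EF Core model goes here.
public class SensorReading
{
    public int Id { get; set; }
    public string SensorId { get; set; }
    public DateTime Timestamp { get; set; }
    public double Value { get; set; }
    public int Quality { get; set; }
}

public class MyDbContext : DbContext
{
    public MyDbContext(DbContextOptions<MyDbContext> options) : base(options) { }
    public DbSet<SensorReading> Readings { get; set; }
}

// Singleton buffer the controller writes into; duplicates are resolved in code.
public class ReadingBuffer
{
    private readonly ConcurrentDictionary<string, SensorReading> _buffer =
        new ConcurrentDictionary<string, SensorReading>();

    public void Add(SensorReading reading)
    {
        var key = reading.SensorId + "|" + reading.Timestamp.Ticks;
        _buffer.AddOrUpdate(key, reading,
            (_, existing) => reading.Quality >= existing.Quality ? reading : existing);
    }

    public List<SensorReading> Drain()
    {
        var items = new List<SensorReading>();
        foreach (var key in _buffer.Keys.ToArray())
            if (_buffer.TryRemove(key, out var reading))
                items.Add(reading);
        return items;
    }
}

// Flushes the buffer to the database once a minute in a single round trip.
public class FlushService : BackgroundService
{
    private readonly ReadingBuffer _buffer;
    private readonly IServiceScopeFactory _scopes;

    public FlushService(ReadingBuffer buffer, IServiceScopeFactory scopes)
    {
        _buffer = buffer;
        _scopes = scopes;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            await Task.Delay(TimeSpan.FromMinutes(1), stoppingToken);

            var batch = _buffer.Drain();
            if (batch.Count == 0) continue;

            using var scope = _scopes.CreateScope();
            var db = scope.ServiceProvider.GetRequiredService<MyDbContext>();
            db.AddRange(batch); // one round trip instead of one per row
            await db.SaveChangesAsync(stoppingToken);
        }
    }
}
```

ReadingBuffer would be registered as a singleton and FlushService via AddHostedService in Startup; the controller then just calls buffer.Add(...) and returns immediately.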
The current single application server can handle about 5,000 concurrent requests. However, the user base will be in the millions, and I may need two application servers to handle the requests.
So the design is to put a load balancer in front, in the hope that it will handle over 10,000 concurrent requests. However, each user's data is stored in a single database. So with two or more servers, should I do the following?
Having two instances of the database
Real-time sync between the two databases
Is this correct?
However, if so, will the sync process lower the performance of the servers, since database replication seems costly?
Thank you.
You probably want to think of your service in "tiers". In this instance, you've got two tiers; the application tier and the database tier.
Typically, your application tier is going to be considerably easier to scale horizontally (i.e. by adding more application servers behind a load balancer) than your database tier.
With that in mind, the best approach is probably to overprovision your database (i.e. put it on its own, meaty server) and have your application servers all connect to that same database. Depending on the database software you're using, you could also look at using read replicas (AWS docs) to reduce the strain on your database.
You can also look at caching via Memcached / Redis to reduce the amount of load you're placing on the database.
So, tl;dr: put your DB on its own big server, and spread your application code across many small servers, all connecting to that same DB server.
The most cost-effective option could be to synchronize a standby node with data from the active node, since this is achievable with an open-source relational database (e.g. MariaDB).
Do not store computed results and statistics that can easily be derived at run time; this helps keep the data size down.
If historical data is not needed urgently for queries, it can be written to a text file in a format that is easy to import into the database later (e.g. .csv).
Data objects that are updated very often can be kept in an in-memory database as key/value pairs; use a scheduled task to perform batch updates/inserts into the relational database to achieve persistence.
Implement retry logic for the database batch update tasks to handle database downtime or network errors (see the sketch after this list).
Consider writing data to the relational database as serialized objects.
Cache configuration data in memory from the database, refreshing it either periodically or via an API when the relevant part changes.
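For the retry point above, a minimal sketch with exponential backoff; the attempt count and delays are arbitrary assumptions:

```csharp
using System;
using System.Threading.Tasks;

public static class RetryHelper
{
    // Retries a batch write with exponential backoff before giving up.
    public static async Task RetryAsync(Func<Task> saveBatch, int maxAttempts = 5)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await saveBatch();
                return;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // back off 2, 4, 8, 16 seconds before trying the batch again
                await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
        }
    }
}
```

A batch write would then be wrapped as RetryHelper.RetryAsync(() => db.SaveChangesAsync()) so transient outages are retried instead of dropped.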
I'd like to be able to do the following in a HTML5 (iPad) web app:
upload data to an online database (which would be probably <50Mb in size if I was to build the online database in something like SQLite)
extract either a subset or a full copy of data to an offline webdatabase
(travel out of 3G network coverage range)
perform a bunch of analytic-type calculations on the downloaded data
save parameters for my calculations to the offline webdatabase
repeat, saving different parameter sets for several different offline analytic-type calculation sessions over an extended period
(head back into areas with 3G network coverage)
sync the saved parameters from my offline webdatabase to the central, online database
I'm comfortable with every step up till the last one...
I'm trying to find information on whether it's possible to sync an offline webdatabase with a central database, but can't find anything covering the topic. Is it possible to do this? If so, could you please supply link/s to information on it, OR describe how it would work in enough detail to implement it for my specific app?
Thanks in advance
I haven't worked specifically with HTML5 local databases, but I have worked with mobile devices that require offline updates and resyncing to a central data store.
Whether the dataset is created on the server or on the offline client, I make sure its primary key is a UUID. I also make sure to timestamp the record each time it is updated.
I also make a note of when the offline client was last synced.
So, when resyncing to the central database, I first query the offline client for records that have changed since the last sync. I then query the central database to determine if any of those records have changed since the last sync.
If they haven't changed on the central database, I update them with the data from the offline client. If the records on the server have changed since the last sync, I update the client with the server's data.
If the UUID does not exist on the central server but does on the offline client, I insert it, and vice versa.
To purge records, I create a "purge" column, and when the sync query is run, I delete the record from each database (or mark it as inactive, depending on application requirements).
If both records have changed since last update, I have to either rely on user input to reconcile or a rule that specifies which record "wins".
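A sketch of that reconciliation, written in C# only to make the shape concrete (the real code would live wherever your sync runs); Record, the in-memory stores and the "server wins" conflict rule are assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record shape: UUID key, update timestamp, soft-delete flag.
public class Record
{
    public Guid Uuid { get; set; }
    public DateTime UpdatedAt { get; set; }
    public bool Purge { get; set; }
}

public static class Sync
{
    public static void Reconcile(IDictionary<Guid, Record> client,
                                 IDictionary<Guid, Record> server,
                                 DateTime lastSync)
    {
        // Push client-side changes up.
        foreach (var local in client.Values.Where(r => r.UpdatedAt > lastSync).ToList())
        {
            if (!server.TryGetValue(local.Uuid, out var remote))
                server[local.Uuid] = local;      // new on client: insert on server
            else if (remote.UpdatedAt <= lastSync)
                server[local.Uuid] = local;      // untouched on server: client wins
            else
                client[local.Uuid] = remote;     // changed on both: example rule, server wins
        }

        // Pull server-side changes (and brand new server records) down.
        foreach (var remote in server.Values.Where(r => r.UpdatedAt > lastSync).ToList())
        {
            if (!client.TryGetValue(remote.Uuid, out var local) || local.UpdatedAt <= lastSync)
                client[remote.Uuid] = remote;
        }
    }
}
```

Records flagged with the purge column would be handled in a final pass over both stores, deleting or deactivating them as described above.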
I usually don't trust built-in database import functions, unless I'm importing into a completely empty database.
Steps:
Keep a list of changes on the local database.
When connected to remote database, check for any changes since last sync on remote.
If the changes on the remote side conflict with local changes, ask the user what to do.
For all other changes, proceed with sync:
download all online changes which did not change locally.
upload all local changes which did not change remotely.
This method can actually work with any combination of databases, provided there is a data converter on one side.
It looks to me, from a few sites I visited, that (as long as you are using SQLite for your Server db) it should be possible.
HTML5 webdatabases also use SQLite (although not all browsers support it and W3C seems to have dropped support for it)
so...
If you export the data using the .dump command and then import it into the web database using the $ sqlite mydb.db < mydump.sql syntax, you should be able to do this with some fiddling with a PHP or Java backend.
Then, when you want to sync the "offline" data to your server, just do the opposite: dump the web database to a dump.sql file and import it into the server database.
This site explains exporting to and importing from SQLite dumps
SOURCE: dumping and restoring an SQLite DB
HTML5 supports a browser-side SQLite database; I have tried it in Mozilla Firefox and Chrome, and it works fine.
I also have a requirement with an offline form: the user enters the data and clicks save, which stores it in the local browser database. Later, when the user comes online and syncs with the server, they can click a sync button which actually syncs the data from the browser database to any other data source.
In a Firebird database driven Delphi application we need to bring some data online, so we can add online-reporting capabilities to our application.
The current approach is: whenever data is changed or added, send it to the online server (PHP + MySQL); if that fails, add it to a queue and try again. The server holding the data is then able to create its own reports.
So, to conclude: what is a good way to bring that data online.
At the moment I know these two different strategies:
event based: whenever changes are detected, push them to the web server / mysql db. As you wrote, this requires queueing in case the destination system does not receive the messages.
snapshot based: extract the relevant data in intervals (for example every hour) and transfer it to the web server / mysql db.
The snapshot-based strategy allows you to preprocess the data so that it fits nicely into the web / MySQL db data structure, which can help decouple the systems better and keep more business logic on the side of the sending system (Delphi). It also generates a more continuous load, as it does not care about mass data changes.
Another option would be replication, but I don't know of a system that replicates between Firebird and MySQL databases.
For adding online reporting capability you can also check out Fast Report Server.
I am building an ASP.NET MVC site where I have a fast dedicated server for the web app, but the database is stored on a very busy MS SQL Server used by many other applications.
So even though the web server is very fast, the application response time is slow, mainly because of the slow response from the db server.
I cannot change the db server as all data entered in the web application needs to arrive there at the end (for backup reasons).
The database is used only by the web app, and I would like to find a caching mechanism where all the data is cached on the web server and updates are sent to the db asynchronously.
It is not important for me to have an immediate correspondence between the data read from the db and newly inserted data (think of reading questions on StackOverflow: new questions do not necessarily have to show up immediately after insertion).
I thought about building an in-between WCF service that would exchange and sync the data between the slow db server and a local one (maybe SQLite or a SQL Express instance).
What would be the best pattern for this problem?
What is your bottleneck? Reading data or Writing data?
If you are concerned about reading data, a memory-based caching mechanism like memcached would be a performance booster, as most of the mainstream and biggest web sites have found. Scaling facebook hi5 with memcached is a good read. Implementing application-side page caches would also cut the number of queries the application fires, lowering db load and improving response time. But this will not have much effect on the database server's load, since your database has other heavy users.
If writing data is the bottleneck, implementing some kind of asynchronous middleware storage service seems like a necessity. If you have both fast and slow response-time data storage on the frontend server, going with a lightweight database like MySQL or PostgreSQL (maybe not that lightweight ;) ) and using your real database as a replication slave for your site would be a good choice.
I would do what you are already considering. Use another database for the application and only use the current one for backup purposes.
I had this problem once, and we decided to go for a combination of data warehousing (i.e. pulling data from the database every once in a while and storing this in a separate read-only database) and message queuing via a Windows service (for the updates.)
This worked surprisingly well, because MSMQ ensured reliable message delivery (updates weren't lost) and the data warehousing made sure that data was available in a local database.
It still will depend on a few factors though. If you have tons of data to transfer to your web application it might take some time to rebuild the warehouse and you might need to consider data replication or transaction log shipping. Also, changes are not visible until the warehouse is rebuilt and the messages are processed.
On the other hand, this solution is scalable and can be relatively easy to implement. (You can use integration services to pull the data to the warehouse for example and use a BL layer for processing changes.)
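As a hedged illustration of the queuing half, this is roughly how the web application could hand an update to MSMQ for the Windows service to process; the queue path and UpdateMessage type are invented for the example:

```csharp
using System.Messaging;

// Hypothetical update payload; XmlMessageFormatter needs public properties
// and a parameterless constructor.
public class UpdateMessage
{
    public int QuestionId { get; set; }
    public string Body { get; set; }
}

public static class UpdateQueue
{
    private const string Path = @".\Private$\DbUpdates";

    public static void Send(UpdateMessage update)
    {
        if (!MessageQueue.Exists(Path))
            MessageQueue.Create(Path);

        using (var queue = new MessageQueue(Path))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(UpdateMessage) });
            // MSMQ persists the message until the Windows service dequeues it
            // and applies the change to the backing SQL Server.
            queue.Send(update);
        }
    }
}
```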
There are many replication techniques that should give you proper results. By installing a SQL Server instance on the 'web' side of your configuration, you'll have the choice between:
Making snapshot replications from the web side (publisher) to the database-server side (subscriber). You'll need a paid version of SQL Server on the web server. I have never worked with this kind of configuration, but it might use a lot of the web server's resources at scheduled synchronization times.
Making merge (or transactional, if required) replication between the database-server side (publisher) and the web side (subscriber). You can then use the free version of MS SQL Server and schedule the synchronization process to run according to your tolerance for potential loss of data if the web server goes down.
I wonder if you could improve it by adding an MDF file on your web side instead of dealing with the server at another IP...
Just add a SQL Server 2008 Express Edition file and try it; as long as you don't exceed 4 GB of data you will be fine. Of course there are more restrictions, but just for the speed of it, why not try?
You should also consider the network switches involved. If the DB server is talking to a number of web servers, it may be constrained by the network connection speed. If they are only connected via a 100 Mb network switch, you may want to look at upgrading that too.
The WCF service would be a very poor engineering solution to this problem - why build your own when you can use the standard SQL Server connectivity mechanisms to ensure data is transferred correctly? Log shipping will send the data across at selected intervals.
This way, you get the fast local sql server, and the data is preserved correctly in the slow backup server.
You should investigate the slow SQL Server though; the performance problem could have nothing to do with its load and more to do with the queries and indexes you're asking it to work with.