I deleted everything in my database. Then, I queried the data that was just deleted and I get results! How is this possible?
I am using the Realtime Database Unity SDK. For testing purposes, I want to regularly purge the whole database and populate it with new data. Imagine my surprise when my queries returned some old, deleted data. It is as if the deleted data persists in some void that can still be accessed.
I have been tinkering with this issue for days now. Here are my steps:
I'm using GetReference(item).Push().Key; which auto-generates a unique key.
I write the new item to the database with GetReference(item).SetValueAsync().
I check my Firebase console, and indeed, the data was correctly recorded.
I create a query that returns the JSON value of item. It works fine.
I delete item from the database.
I run the query again and item is returned. ITEM IS NOT SUPPOSED TO EXIST ANYMORE!
Out of curiosity, I write a query to return all the data in my database (which should be empty) and it returns every object I have created over the last few days. This is literally hundreds of items... from an empty database.
It seems like data persists for a few days after it is deleted.
Realizing this I decided to test what would happen if I manually made an object that uses an existing key from one of the deleted objects.
My query returns the new object. Yay!
I take a break and come back 15 minutes later. I run the exact query again. I get the old, deleted object and not the new one. WHAT THE HECK IS GOING ON?
At this point I am questioning whether Realtime Database is even a real database. It seems to break the rules of both consistency and integrity.
I've also considered that I might be deleting the data incorrectly. I was mostly doing it manually, through the browser. I also have tried RemoveValueAsync() and SetRawJsonValueAsync(null). Nothing seems to make a difference.
Please, please, please can someone tell me what is going on? I will be forever grateful.
EDIT: It turns out that the phantom data was coming from the cache on my device. Turning persistence off solved the problem. Apparently, performing the same query multiple times only retrieves the data from the database the first time. The subsequent queries look into the cache.
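The behaviour described in the edit can be reproduced with a toy model. This is only a sketch of the symptom, not how the Firebase SDK works internally; `FakeServer`, `CachingClient` and `demo` are hypothetical names made up for illustration.

```python
# Toy model only: these classes are hypothetical and are not the Firebase SDK.


class FakeServer:
    """Stands in for the remote database."""

    def __init__(self):
        self.data = {}


class CachingClient:
    """Reads the server once per key, then answers from a local cache."""

    def __init__(self, server, persistence_enabled=True):
        self.server = server
        self.persistence_enabled = persistence_enabled
        self.cache = {}

    def get(self, key):
        if self.persistence_enabled and key in self.cache:
            # Served locally: stale if the server changed in the meantime.
            return self.cache[key]
        value = self.server.data.get(key)  # fresh read from the server
        if self.persistence_enabled:
            self.cache[key] = value
        return value


def demo():
    server = FakeServer()
    server.data["item"] = {"name": "old"}
    cached = CachingClient(server, persistence_enabled=True)
    direct = CachingClient(server, persistence_enabled=False)
    first = cached.get("item")   # populates the local cache
    del server.data["item"]      # delete the item on the server side
    # The cached client still sees the item; the direct client does not.
    return first, cached.get("item"), direct.get("item")
```

With persistence on, the second read returns the deleted item because it never reaches the server; with persistence off, the same read returns nothing.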
Well, I am going to query 4 GB of data using a cfquery. It's going to be a pain to query the whole database, as it will take a very long time to get the data back.
I tried a stored procedure when the data was 2 GB and it wasn't really fast at that time either.
The data will be pulled based on the date range the user selects from an HTML page.
It has been suggested that I archive data in order to speed up querying the database.
Do you think that I'll have to create a separate table with only fields that are required and then query this newly created table?
Well, the current table is 4 GB, but it is growing day by day; basically, it's a response database (it stores information received from somewhere else). After doing some research, I am wondering whether writing a trigger could be an option. If I do this, then as soon as a new entry (row) is added to the current 4 GB table, the trigger will run a SQL query that copies the contents of the required fields into the newly created table. This will keep happening as long as I keep getting new values in my original 4 GB database.
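The trigger idea above might look roughly like the following sketch. It uses SQLite through Python's `sqlite3` module purely as a stand-in (the question is about SQL Server, whose trigger syntax differs), and the table, column and trigger names (`responses`, `responses_slim`, `received_on`, `score`, `raw_payload`, `copy_to_slim`) are invented for illustration.

```python
import sqlite3


def build_db():
    """Create a wide table, a slim copy, and a trigger that keeps them in sync."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE responses (
            id INTEGER PRIMARY KEY,
            received_on TEXT NOT NULL,   -- ISO date, e.g. 2024-01-05
            score INTEGER,
            raw_payload TEXT             -- bulky column the report never reads
        );
        CREATE TABLE responses_slim (
            id INTEGER PRIMARY KEY,
            received_on TEXT NOT NULL,
            score INTEGER
        );
        CREATE INDEX idx_slim_received_on ON responses_slim (received_on);

        -- Copy only the required fields whenever a new row arrives.
        CREATE TRIGGER copy_to_slim AFTER INSERT ON responses
        BEGIN
            INSERT INTO responses_slim (id, received_on, score)
            VALUES (NEW.id, NEW.received_on, NEW.score);
        END;
        """
    )
    return conn


def rows_in_range(conn, start, end):
    """Return slim rows with start <= received_on < end."""
    cur = conn.execute(
        "SELECT id, received_on, score FROM responses_slim "
        "WHERE received_on >= ? AND received_on < ? ORDER BY received_on",
        (start, end),
    )
    return cur.fetchall()
```

The slim table still grows with every insert, so the date-range filter and the index on the date column remain what keeps the query bounded.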
Does the above approach sound good enough to tackle my problem? I have one more concern: even though I am copying only the required fields into a new table, at some point the new table will also grow, and couldn't that also slow down querying it?
Please correct me if I am wrong somewhere.
Thanks
More Information:
I am using SQL Server. Indexing is currently done but it's not effective.
Archiving the data will be farting against thunder. The data has to travel from your database to your application. Then your application has to process it to build the chart. The more data you have, the longer that will take.
If it is really necessary to chart that much data, you might want to simply acknowledge that your app will be slow and do things to deal with it. This includes code to prevent multiple page requests, displays to the user, and such.
I am developing a web app right now, where clients will frequently (every few seconds) send read/write requests on certain data. As of right now, I have my server immediately write to the database when a user changes something, and immediately read from the database when they want to view something. This is working fine for me, but I am guessing that it would be quite slow if there were thousands of users online.
Would it be more efficient to save write requests in an object on the server side, then do a bulk update at a certain time interval? This would help in situations where the same data is edited multiple times, since it would now only require one db insert. It would also mean that I would read from the object for any data that hasn't yet been synced, which could mean increased efficiency by avoiding db reads. At the same time though, I feel like this would be a liability for two reasons: 1. A server crash would erase all data that hasn't yet been synced. 2. A bulk insert has the possibility of creating sudden spikes of lag due to mass database calls.
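For reference, the buffering idea described above could be sketched as follows. This is only an illustration of the proposal, not a recommendation; `WriteBuffer`, `make_conn` and the `settings` table are hypothetical names, and SQLite stands in for whatever database the app uses.

```python
import sqlite3


def make_conn():
    """In-memory database with one key/value table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
    return conn


class WriteBuffer:
    """Holds pending writes in memory and flushes them in one batch."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = {}  # key -> latest value; repeated edits overwrite each other

    def write(self, key, value):
        self.pending[key] = value

    def read(self, key):
        # Unsynced data must be served from the buffer, otherwise reads are stale.
        if key in self.pending:
            return self.pending[key]
        row = self.conn.execute(
            "SELECT value FROM settings WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def flush(self):
        """Write every pending key in one transaction; return how many were written."""
        batch = list(self.pending.items())
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)", batch
            )
        self.pending.clear()  # only reached if the commit succeeded
        return len(batch)
```

Everything in `pending` lives only in process memory until `flush()` runs, which is exactly the crash risk raised in point 1.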
How should I approach this? Is my current approach ok, or should I queue inserts for a later time?
If a user makes a change to data and takes an action that (s)he expects will save the data, you should do everything you can to ensure the data is actually saved. Example: Let's say you delay the write for a while. The user is in a hurry, makes a change then closes the browser. If you don't save right when they take an action that they expect saves the data, there would be a data loss.
Web stacks generally scale horizontally. Don't start to optimize this kind of thing unless there's evidence that you really have to.
I am having a problem and I need your help.
I am working with Play Framework v1.2.4 in Java, and my server is deployed on Heroku.
Everything works fine and I can access my databases, but I am running into trouble when I do a couple of saves to the database.
I have a method that stores data in the database several times and then sends a notification to a mobile phone. My problem is that the notification arrives before the database has finished saving the data: when the notification arrives, the phone requests the updated data from the server, and the server returns the data without the last update. If I request the update again a few seconds later, the data is correct, so I think this is a timing problem.
The idea would be for the server to send the notification only once the database has finished saving the data.
I don't know whether this happens because I am using the free tier of Heroku, but I want to be sure before paying for it.
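The ordering asked for above (notify only after the save has finished) can be sketched in a few lines. This is a generic illustration using Python's `sqlite3`, not Play Framework code; `save_then_notify`, the `events` table and the `notify` callback are hypothetical.

```python
import sqlite3


def save_then_notify(conn, updates, notify):
    """Persist all updates in one transaction, then send the notification."""
    with conn:  # commits when the block exits normally, rolls back on error
        conn.executemany(
            "INSERT INTO events (payload) VALUES (?)", [(u,) for u in updates]
        )
    # This line is reached only after the commit has succeeded, so a client
    # that reacts to the notification will find the new rows.
    notify(len(updates))
```

If the notification were sent inside the `with` block, a client could query the server before the commit and see the old data, which matches the symptom described.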
In general, requests to a cloud database are slower than the same requests against a database on your local machine. Even a simple query that takes 0.0001 sec on your computer can take as long as 0.5 sec in the cloud. The reason is simple: cloud providers use shared databases plus (geo) replication, which just cannot be compared to a database accessed by a single program on the same machine.
Also keep in mind that the free Heroku DB plans don't offer ANY database cache, which means that every query is fetched from the cloud directly.
As we don't know your application, it's hard to say what the bottleneck is. Still, you almost certainly have at least 3 ways to address the problem. They are not alternatives; you will probably need to use (or at least check) all of them.
Risk a basic paid plan and see how things change; maybe it will be good enough for you, maybe not.
Redesign your application to make fewer queries. For example, instead of sending 10 queries to select 10 different rows, send one query that selects all 10 records at once.
Use Play's cache API to avoid selecting the same set of data again and again. For example, if you have categories that rarely change but you need the category tree for each article, you don't need to fetch the categories from the DB every time. Instead, you can store a List of categories in the cache, so you only need one request to fetch the article's content (which can be cached for a short time as well...)
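The last two suggestions can be illustrated with a small sketch. It uses Python and SQLite rather than Play, so it shows the general shape only; `fetch_articles`, `SimpleCache` and the `articles` table are hypothetical names, and `SimpleCache` is a stand-in for a real cache API, not Play's.

```python
import sqlite3


def fetch_articles(conn, ids):
    """Fetch several rows with one IN (...) query instead of one query per id."""
    ids = list(ids)
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT id, title FROM articles WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, ids).fetchall()


class SimpleCache:
    """Tiny in-process cache; a stand-in for a real cache API."""

    def __init__(self):
        self._values = {}

    def get_or_load(self, key, loader):
        # Call the (slow) loader only the first time a key is requested.
        if key not in self._values:
            self._values[key] = loader()
        return self._values[key]
```

A real cache would also need an expiry time so that rarely changing data such as categories is eventually refreshed.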
I'm working on a system that mirrors remote datasets using initials and deltas. When an initial comes in, it mass deletes anything preexisting and mass inserts the fresh data. When a delta comes in, the system does a bunch of work to translate it into updates, inserts, and deletes. Initials and deltas are processed inside long transactions to maintain data integrity.
Unfortunately the current solution isn't scaling very well. The transactions are so large and long running that our RDBMS bogs down with various contention problems. Also, there isn't a good audit trail for how the deltas are applied, making it difficult to troubleshoot issues causing the local and remote versions of the dataset to get out of sync.
One idea is to not run the initials and deltas in transactions at all, and instead to attach a version number to each record indicating which delta or initial it came from. Once an initial or delta is successfully loaded, the application can be alerted that a new version of the dataset is available.
This just leaves the issue of how exactly to compose a view of a dataset up to a given version from the initial and deltas. (Apple's TimeMachine does something similar, using hard links on the file system to create "view" of a certain point in time.)
Does anyone have experience solving this kind of problem or implementing this particular solution?
Thanks!
Have one writer and several reader databases. You send the write to the one database, and have it propagate the exact same changes to all the other databases. The reader databases will be eventually consistent and the time to update is very fast. I have seen this done in environments that get upwards of 1M page views per day. It is very scalable. You can even put a hardware router in front of all the read databases to load balance them.
Thanks to those who tried.
For anyone else who ends up here, I'm benchmarking a solution that adds a "dataset_version_id" and "dataset_version_verb" column to each table in question. A correlated subquery inside a stored procedure is then used to retrieve the current dataset_version_id when retrieving specific records. If the latest version of the record has dataset_version_verb of "delete", it's filtered out of the results by a WHERE clause.
This approach has an average ~ 80% performance hit so far, which may be acceptable for our purposes.
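A minimal sketch of the versioning scheme described above, assuming a single `records` table and SQLite in place of the real RDBMS and stored procedure. Only the `dataset_version_id` and `dataset_version_verb` columns and the "delete" verb come from the post; the table name, the other columns, the other verb values and the function names are invented for illustration.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE records (
            record_key TEXT NOT NULL,
            payload TEXT,
            dataset_version_id INTEGER NOT NULL,
            dataset_version_verb TEXT NOT NULL,  -- 'insert', 'update' or 'delete'
            PRIMARY KEY (record_key, dataset_version_id)
        )
        """
    )
    return conn


def view_at(conn, version):
    """Return the dataset as it looked at the given version.

    For each key, the correlated subquery picks the newest row at or below
    `version`; keys whose newest row is a delete are filtered out.
    """
    cur = conn.execute(
        """
        SELECT r.record_key, r.payload
        FROM records AS r
        WHERE r.dataset_version_id = (
                SELECT MAX(r2.dataset_version_id)
                FROM records AS r2
                WHERE r2.record_key = r.record_key
                  AND r2.dataset_version_id <= ?
              )
          AND r.dataset_version_verb <> 'delete'
        ORDER BY r.record_key
        """,
        (version,),
    )
    return cur.fetchall()
```

Because rows are only ever appended, loading a delta does not need a long transaction, and the version column doubles as an audit trail of which delta touched each record.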
I am not a great VB programmer, but I am tasked with maintaining/enhancing a VB6 desktop application that uses Sybase ASE as a back-end. This app has about 500 users.
Recently, I added functionality to this application which performs an additional insert/update to a single row in the database, key field being transaction number and the field is indexed. The table being updated generally has about 6000 records in it, as records are removed when transactions are completed. After deployment, the app worked fine for a day and a half before users were reporting slow performance.
Eventually, we traced the performance issue to a table lock in the database and had to roll back to the previous version of the app. The first day of use was on Monday, which is generally a very heavy day for system use, so I'm confused why the issue didn't appear on that day.
In the code that was in place, there is a call to start a Sybase transaction. Within the block between the BeginTrans and CommitTrans, there is a call to a DLL file that updates the database. I placed my new code in a class module in the DLL.
I'm confused as to why a single insert/update to a single row would cause such a problem, especially since the system had been working okay before the change. Is it possible I've exposed a larger problem here? Or that I just need to reconsider my approach?
Thanks in advance to anyone who has been in a similar situation and can offer advice.
It turns out that the culprit was a message box that appears within the scope of the BeginTrans and CommitTrans calls. The user with the message box would maintain a blocking lock on the database until they acknowledged the message. The solution was to move the message box outside of the aforementioned scope.
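The effect of waiting for the user inside a transaction can be reproduced with a small sketch. It uses SQLite from Python instead of VB6 and Sybase ASE, so the locking granularity differs (SQLite locks the whole database file), and `make_db`, `second_writer_blocked`, `prompt_user` and the `txns` table are hypothetical.

```python
import os
import sqlite3
import tempfile


def make_db():
    """Create a temporary database file with two rows and return its path."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE txns (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany("INSERT INTO txns VALUES (?, ?)", [(1, "open"), (2, "open")])
    conn.commit()
    conn.close()
    return path


def second_writer_blocked(db_path, prompt_inside_transaction):
    """Return True if another user's write fails while the prompt is showing."""
    # isolation_level=None: transactions are started and ended explicitly.
    writer = sqlite3.connect(db_path, isolation_level=None)
    other = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    blocked = False

    def prompt_user():
        # Stands in for the message box: while it is "open", another user writes.
        nonlocal blocked
        try:
            other.execute("UPDATE txns SET status = 'other' WHERE id = 2")
        except sqlite3.OperationalError:  # "database is locked"
            blocked = True

    writer.execute("BEGIN IMMEDIATE")  # take the write lock
    writer.execute("UPDATE txns SET status = 'done' WHERE id = 1")
    if prompt_inside_transaction:
        prompt_user()                  # lock is still held while the user decides
        writer.execute("COMMIT")
    else:
        writer.execute("COMMIT")       # release the lock first
        prompt_user()
    writer.close()
    other.close()
    return blocked
```

Calling `second_writer_blocked(path, True)` reports the other writer as blocked, while `second_writer_blocked(path, False)` does not, which mirrors moving the message box outside the BeginTrans/CommitTrans scope.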
I am not able to understand the complete picture without the SQL code that you are using.
Also, if it is a single insert OR update, why are you using a transaction? Is it possible that many users will try to update the same row?
It would be helpful if you posted both the VB code and your SQL (with the query plan if possible). However, with the information we have, I would run update statistics table_name against the table to make sure that the query plan is up to date.
If you're sure that your code has to run within a transaction, have you tried adding your own transaction block containing your SQL rather than using the one already there?