Database for read and append only [closed]

Basically my application needs to dump data into a database daily, but once data is written, it never needs to be updated.
Hence, is appending to a CSV or JSON file sufficient for this purpose, or would it be more computationally efficient to write to a standard SQL database?
Edit
Use-Case Update
I am expecting to store one entry per activity count per day. There are about 6-8 activities.
It is essentially a log. I would like to analyse the trend of activities, for example. There are no relationships between the different activities, though.
If in some cases there might be a need for updates, would that imply that a proper database would be more suitable than a text file?

It depends on the nature of the data, but a style of database other than an SQL one could be suitable, such as MongoDB, which essentially stores JSON objects.
SQL is great when you need entities to have relationships to each other, or if you can take advantage of the type of select queries it can provide you with.
Database systems do have some overhead and can have gotchas you might not expect, like loading a lot of data into memory so it's ready to be searched.
But storing text files has drawbacks too; for example, it might become difficult to manage your data as it grows.
It basically sounds like your use-case is similar to logging, in which case dumping it into a file is fine.
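If it ever outgrows a flat file, even a single-file SQL database keeps this use case simple. A minimal sketch, assuming hypothetical table and column names (SQLite syntax for the date formatting):

```sql
-- One row per activity per day; the daily dump is a plain append.
CREATE TABLE activity_log (
    log_date       DATE    NOT NULL,
    activity_name  TEXT    NOT NULL,   -- one of the ~6-8 activities
    activity_count INTEGER NOT NULL,
    PRIMARY KEY (log_date, activity_name)
);

INSERT INTO activity_log (log_date, activity_name, activity_count)
VALUES ('2020-05-01', 'login', 42);

-- Trend analysis becomes a simple aggregate, e.g. weekly totals per activity.
SELECT activity_name,
       strftime('%Y-%W', log_date) AS week,   -- SQLite; use EXTRACT/DATEPART elsewhere
       SUM(activity_count)         AS total
FROM activity_log
GROUP BY activity_name, week
ORDER BY week, activity_name;
```

If the occasional update mentioned in the edit does become necessary, it is a one-line UPDATE against the primary key, whereas a CSV file would have to be rewritten.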

Related

Database growth - How to handle with big tables [closed]

My SQL Server database keeps getting bigger. It is currently around 2 GB and growing quickly. Some tables hold a lot of data, on the order of millions of rows, and that data is heavily used by SELECTs for graphs and reports.
I expect that in one more year I'll have about 5-6 million rows in one of the tables. I have indexes and the database is well organized... my only worry is the time it will take to generate reports and so on.
How do I find data, SUM, COUNT, and check several variables based on columns in such big tables?
What can you suggest? Is there a way to reorganize or split the tables? I want to make sure I'm always handling this the best way and keeping everything in good shape.
If it's an ordinary transactional database, you can go for a data warehousing solution for your reporting purposes.
Data warehouses are usually more efficient in these types of situations.
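Short of a full warehouse, a common first step is to pre-aggregate the transactional rows into a summary table that the reports read instead of scanning millions of detail rows. A rough T-SQL sketch, with hypothetical orders and sales_daily_summary tables and columns:

```sql
-- Hypothetical reporting table: one row per product per day instead of
-- millions of individual order rows.
CREATE TABLE sales_daily_summary (
    sales_date   DATE          NOT NULL,
    product_id   INT           NOT NULL,
    order_count  INT           NOT NULL,
    total_amount DECIMAL(18,2) NOT NULL,
    PRIMARY KEY (sales_date, product_id)
);

-- Refreshed periodically (e.g. nightly, loading yesterday's orders).
INSERT INTO sales_daily_summary (sales_date, product_id, order_count, total_amount)
SELECT CAST(order_date AS DATE), product_id, COUNT(*), SUM(amount)
FROM orders
WHERE order_date >= DATEADD(DAY, -1, CAST(GETDATE() AS DATE))
  AND order_date <  CAST(GETDATE() AS DATE)
GROUP BY CAST(order_date AS DATE), product_id;

-- Reports then SUM/COUNT over thousands of summary rows rather than millions
-- of detail rows.
SELECT sales_date, SUM(total_amount) AS revenue
FROM sales_daily_summary
GROUP BY sales_date
ORDER BY sales_date;
```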

Why do NoSql databases want data to be as flat as possible? Firebase [closed]

Why do NoSQL databases want the data to be as flat as possible, especially in the case of Firebase? Can anyone provide read or write calculations to show the differences in I/O between relatively flat databases and relatively deep, multi-level databases?
When it comes to the Firebase Realtime Database, it's about much more than performance; it's about your information retrieval strategy.
You are free to save the IDs of everyone who liked a specific post nested inside the post structure, but if you don't need to retrieve all of that information every time you fetch a post (suppose you query a list of posts only to show them as summary cards), then you won't want it nested; you'll want it flat, under a "post_likes/{postId}" node for example.
Remember that in the Firebase Realtime Database you can't filter out the nodes you don't want to receive. The moment you retrieve a node, you get everything beneath it, all the way down the structure.
Think about the same example now, but for comments. The same thing applies, so we could structure our comments under a "post_comments/{postId}" node and only retrieve them when we actually want to show the comments.
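To make the flat layout concrete, here is a rough sketch of the resulting JSON tree in the Realtime Database (node and field names are purely illustrative):

```json
{
  "posts": {
    "postId1": { "author": "alice", "title": "Hello world" }
  },
  "post_likes": {
    "postId1": { "uidA": true, "uidB": true }
  },
  "post_comments": {
    "postId1": {
      "commentId1": { "author": "bob", "text": "Nice post" }
    }
  }
}
```

Reading /posts for the summary cards now downloads only the post data itself; /post_likes/postId1 and /post_comments/postId1 are fetched separately, only when they are actually needed.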

Redmine - Database Structure/Normalization [closed]

I am using Redmine for project management and issue tracking.
I was looking at the database tables and the underlying structure and was wondering if anyone who is VERY experienced with database architecture can comment on the structure.
I am concerned that once there are many users and hundreds (or thousands) of projects (each project containing many issues, with each issue containing many messages, etc.), the database structure could possibly turn out to be a weak point.
How is the performance impacted by this design?
I would like to hear about the pros and cons of how the tables are laid out and how the data is separated or normalized, and whether or not it might be worth restructuring. What would be the benefits of separating the data out into more tables (with fewer columns per table)?
The database structure looks typical for an issue/project tracking system. If you can come up with a better structure, I would be very interested in seeing it :).
What you have to remember is that applying normalisation rules is all fine and dandy, but if you apply them too strictly you may hit performance problems (and the dreaded de-normalisation hacks start to creep in). In other words, there's a balancing act between reasonable normalisation and hardcore (too much) normalisation.
You would have to have a good reason to restructure that database model. For example, it could be that for some particular query the database design does not serve the answer in an efficient manner. You could then start asking yourself what other table(s) could be created to hold the data you need for optimal query performance. You could also ask yourself what other indexes could be put in place to allow for optimal performance.
The fact is that until you have the very high numbers of users, projects, and issues in this database that you predict, it is hard to answer those questions. Maybe you could generate data for some fake users and projects and test the database to back up your concerns? Remember the adage of Professor Donald Knuth: premature optimization is the root of all evil.
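If you want data rather than guesses, one way to test the concern is to bulk-generate synthetic rows and compare query plans with and without an additional index. A rough sketch in PostgreSQL syntax, using a simplified stand-in for an issues table rather than Redmine's real schema:

```sql
-- Simplified stand-in for an issues table, purely for load testing.
CREATE TABLE issues_test (
    id         INT PRIMARY KEY,
    project_id INT NOT NULL,
    status_id  INT NOT NULL,
    subject    VARCHAR(255) NOT NULL,
    created_on TIMESTAMP NOT NULL
);

-- Generate half a million synthetic issues spread over 1000 fake projects.
INSERT INTO issues_test (id, project_id, status_id, subject, created_on)
SELECT n,
       (n % 1000) + 1,
       (n % 6) + 1,
       'fake issue ' || n,
       NOW() - (n || ' minutes')::interval
FROM generate_series(1, 500000) AS n;

-- Compare the plans of the queries you actually care about,
-- before and after adding a supporting index.
EXPLAIN ANALYZE
SELECT COUNT(*) FROM issues_test WHERE project_id = 42 AND status_id = 1;

CREATE INDEX idx_issues_project_status ON issues_test (project_id, status_id);

EXPLAIN ANALYZE
SELECT COUNT(*) FROM issues_test WHERE project_id = 42 AND status_id = 1;
```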

When developing a database, is it important to keep in mind a future application? [closed]

I am in the process of designing a database for the first time outside of the classroom, in order to make a future Java application work with complete desired functionality. As I try to design entity relationship diagrams and tables, I find myself constantly thinking about the Java project that comes later. I am beginning to wonder if this is making me more confused and if I am making this more difficult for myself; I am getting nervous that I might not yet be skilled enough to pull this off.
Should I just focus on producing the most normalized database I can and trust that it will allow for my application to do everything it needs to do?
Or,
Should I definitely be keeping my future application in mind with each step of database development to ensure total functionality?
Edit: I would also appreciate any recommendations on free database design tools.
Databases are notoriously hard to refactor, so if you know about something you haven't gotten to yet but are definitely going to do, you need to consider it in your design. This is especially true if the future something (for example, reporting) is going to need to look at lots of records, or needs moment-in-time data as opposed to doing calculations on the fly. This is the difference between storing the cost of an order versus calculating it based on current prices, for instance. If you only look at the order process, you may think it is fine to just calculate the price, but reporting will need to know what the price was at the time the order happened, or the financial records will be messed up.
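To make the order example concrete: the difference is between joining to the product table for the current price and snapshotting the price onto the order line when the order is placed. A minimal sketch with hypothetical tables:

```sql
CREATE TABLE products (
    product_id    INT PRIMARY KEY,
    name          VARCHAR(100)  NOT NULL,
    current_price DECIMAL(10,2) NOT NULL      -- changes over time
);

CREATE TABLE order_items (
    order_id   INT NOT NULL,
    product_id INT NOT NULL REFERENCES products (product_id),
    quantity   INT NOT NULL,
    unit_price DECIMAL(10,2) NOT NULL,        -- price at the moment of the order
    PRIMARY KEY (order_id, product_id)
);

-- Copy the price at order time instead of joining to products later;
-- future price changes no longer rewrite financial history.
INSERT INTO order_items (order_id, product_id, quantity, unit_price)
SELECT 1001, p.product_id, 3, p.current_price
FROM products AS p
WHERE p.product_id = 42;

-- Reporting then uses the stored snapshot.
SELECT order_id, SUM(quantity * unit_price) AS order_total
FROM order_items
GROUP BY order_id;
```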
You might read this:
What are the general guidelines and best practices to keep in mind while designing database for an application?

Is it practical using cursors when it comes to database auditing (only on SQL Server) [closed]

I've been researching SQL cursors recently, and a colleague of mine said that cursors are best used for auditing. I tried to look for material on the internet but had no luck.
Can anyone explain why cursors are good for auditing despite their disadvantages?
Like any task, it's about picking the right tool for the job. Some disparage the use of cursors due to obviously bad examples of their use, but cursors have their place. They are particularly useful for subsetting data and for reducing code redundancy:
Primarily, I use cursors to perform tasks on subsets of very large datasets, e.g. banking data. With billions of records, there are some operations you wouldn't want to do all at once, so looping through by day is a good option. There are other methods of iterating through subsets, but a cursor performs well at this task; it's still set-based operations, just on smaller sets.
Cursors are also great for looping through multiple tables/fields in a database: there is no need to re-write a procedure for multiple tables if it is going to do the same thing in each table, or if you are consistently working across a variety of databases. For example, I needed to analyze a multitude of log files generated by multiple systems, but they all had date and IP fields. It was trivial to have a cursor loop through each of the tables and combine all the relevant data in one place.
I wouldn't use a cursor to perform row-by-row actions unless necessary, and while I can't think of a use case off the top of my head, I'm sure they exist.
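As an illustration of the multi-table log scenario above, here is a rough T-SQL sketch (all table and column names are made up) in which a cursor walks every log table sharing the same date and IP columns and folds them into one audit table; note that each iteration still runs a set-based INSERT...SELECT:

```sql
-- Hypothetical target table collecting rows from several per-system log tables
-- that all expose the same (event_date, ip) columns.
CREATE TABLE audit_combined (
    source_table SYSNAME     NOT NULL,
    event_date   DATETIME    NOT NULL,
    ip           VARCHAR(45) NOT NULL
);

DECLARE @table_name SYSNAME;
DECLARE @sql NVARCHAR(MAX);

-- Cursor over the log tables we want to audit (here: every table whose name
-- starts with 'log_').
DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name
    FROM sys.tables
    WHERE name LIKE 'log_%';

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @table_name;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Same statement for every table, so no per-table procedures are needed.
    SET @sql = N'INSERT INTO audit_combined (source_table, event_date, ip) '
             + N'SELECT ''' + @table_name + N''', event_date, ip '
             + N'FROM ' + QUOTENAME(@table_name) + N';';
    EXEC sys.sp_executesql @sql;

    FETCH NEXT FROM table_cursor INTO @table_name;
END;

CLOSE table_cursor;
DEALLOCATE table_cursor;
```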
