Opinions on NoSQL and indexing lots of data? [closed] - database

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I was at a .NET development group meeting a couple of weeks ago and the speaker was extolling the virtues of NoSQL and how even relational data doesn't have to be stored relationally if you just index lots of data. So, my questions are: was he blowing smoke? How does one craft an index to be more efficient than the last? Does indexing just store the information from a table in some logical order, e.g. alphabetically?

Well, relational storage is needed more for data integrity than for indexing. Speed is not the only consideration when choosing a database. SQL Server and other enterprise databases can perform very well if they are designed by people who know what they are doing. Unfortunately, most relational databases are designed by data amateurs and their performance reflects that.
NoSQL databases and relational databases are used for different things. I would never consider putting a financial application in NoSQL, for instance, because of the need for data integrity and internal controls to prevent fraud and ensure records are consistent and correct. However, for a website where data quality doesn't matter so much (think Google: who would notice if it failed to serve up every single website that mentions Bill Gates in a query?), then yes, it is a good choice.
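On the indexing part of the question: conceptually, an index is a separate sorted structure that maps a column's values back to row locations, so a lookup can use a binary search instead of scanning every row. Real engines typically use B-trees or similar structures; the following is only a minimal Python sketch of the idea using a sorted list, and the table contents and the `build_index` and `lookup` helpers are hypothetical names made up for illustration.

```python
import bisect

# Hypothetical table: rows are stored in insertion order, not sorted.
rows = [
    ("Mallory", "Oslo"),
    ("Alice", "Lima"),
    ("Trent", "Cairo"),
    ("Bob", "Quito"),
    ("Carol", "Lima"),
]


def build_index(rows, column):
    """Return a sorted list of (key, row_position) pairs for one column."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))


def lookup(index, rows, key):
    """Binary-search the index for key, then fetch the matching rows."""
    # (key,) sorts before any (key, pos), so this finds the first match.
    start = bisect.bisect_left(index, (key,))
    matches = []
    for i in range(start, len(index)):
        k, pos = index[i]
        if k != key:
            break
        matches.append(rows[pos])
    return matches


# One index per column we want to search quickly; the table itself is untouched.
name_index = build_index(rows, 0)
city_index = build_index(rows, 1)
```

The table keeps its original order; each index is an extra sorted copy of one column plus a pointer back to the row, which is why indexes speed up reads but add cost to writes and storage.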

Related

Database growth - How to handle with big tables [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
My SQL Server database keeps getting bigger. It is currently about 2 GB and growing quickly. Some tables hold a lot of data, on the order of millions of rows. This data is very important for SELECT queries that feed charts and reports.
I expect that in one more year I'll have about 5-6 million rows in one of the tables. I have indexes and the database is well organized; my only worry is the time it will take to generate some reports and so on.
How can I find data, SUM, COUNT, and check 'n' variables based on columns in such big tables?
What can you suggest? Is there a way to reorganize or split the tables? I want to make sure I'm always using the best approach and that everything keeps working well.
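If it's an ordinary transactional database, you can go for a data warehousing solution for your reporting purposes.
Data warehouses are usually more efficient in these types of situations, because reports run against data that is already organized for aggregation rather than against the live transactional tables.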
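To illustrate the general idea of keeping reporting data separate from the transactional tables, here is a minimal sketch. It uses Python's built-in `sqlite3` module rather than SQL Server, and the `sales` and `monthly_sales` tables and the `monthly_report` helper are hypothetical names chosen for the example. A real data warehouse involves far more (scheduled loads, dimensional modeling and so on); this only shows a pre-aggregated summary table that reports can read instead of scanning the detail rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical transactional table with one row per sale.
conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-01-05", "north", 100.0),
        ("2024-01-17", "north", 50.0),
        ("2024-01-20", "south", 75.0),
        ("2024-02-02", "north", 20.0),
    ],
)

# Reporting table: one row per (month, region), computed once in a batch
# step instead of on every report request.
conn.execute(
    """
    CREATE TABLE monthly_sales AS
    SELECT substr(sale_date, 1, 7) AS month,
           region,
           COUNT(*) AS sale_count,
           SUM(amount) AS total_amount
    FROM sales
    GROUP BY substr(sale_date, 1, 7), region
    """
)


def monthly_report(conn):
    """Read the small summary table rather than the large detail table."""
    return conn.execute(
        "SELECT month, region, sale_count, total_amount "
        "FROM monthly_sales ORDER BY month, region"
    ).fetchall()
```

The trade-off is freshness: the summary is only as current as the last time it was rebuilt, which is often acceptable for charts and periodic reports.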

Where does an app / website hold its data? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
For a small start-up mobile app/website, what options are there for storing its data? For example, a physical server or a cloud-hosted database such as Azure.
Any other options or insight would be helpful, thank you!
Edit:
For some background I'm looking at something that users could regularly upload data to and consumers could query to find results through an app or website.
I guess it depends on your workload and also on your choice of data store. Generally, SQL-based storage is costlier in cloud-based solutions because it can usually only be scaled vertically, whereas NoSQL stores tend to be cheaper.
So, in my view, you should first decide on your choice of data store, which depends on the following factors:
The type of data: is your data structured, or does it fall into the unstructured category?
The operations you will perform on the data: do you have any transactional use cases?
The write/read pattern: is it a read-heavy use case or a write-heavy one?
These factors should help you decide on an appropriate data store. Each database has its own set of advantages and disadvantages. The trick is to choose one based on your use cases and the factors mentioned above.
Hope it helps.

Should all tables be related in a Database or Is it ok to leave some of them? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Often we come across some small, insignificant (debatable) tables left as standalone. Although they are used in joins (sparingly), developers still don't bother to relate them.
Maybe too many references made the inserts slow.
This leads to this question:
As a rule of thumb, should we relate all the tables in the database? If not, where do we draw the line?
thanks
Foreign keys do not always hurt performance; they can help as well. Database relationships do more than just ensure referential integrity: they also help teach SQL Server about the nature of your data. The fact that two fields are related can give clues as to the cardinality of your queries, and the optimizer takes these relationships into consideration when it estimates the cost of your query.
In my opinion, if two fields are related in your database, they should have a defined relationship. In general, the more you can teach SQL Server about your data (not just relationships, but CHECK constraints as well), the better it will be at generating efficient query plans. Of course like anything in SQL Server, there are exceptions to the rule, but if you want a rule of thumb, I would lean toward defining all the relationships.
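As a small illustration of declaring relationships and CHECK constraints, here is a sketch using Python's built-in `sqlite3` module. Note that this is SQLite, not SQL Server, so it only demonstrates the integrity side (the engine rejecting bad rows); it says nothing about how SQL Server's optimizer uses the constraints. The `customers` and `orders` tables and the `try_insert` helper are hypothetical names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        quantity INTEGER NOT NULL CHECK (quantity > 0)
    )
    """
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")


def try_insert(conn, customer_id, quantity):
    """Attempt an insert and report whether the declared constraints allowed it."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)",
            (customer_id, quantity),
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"
```

With the relationship declared, an order for a non-existent customer or with a non-positive quantity is refused by the database itself, regardless of which application wrote it.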

Redmine - Database Structure/Normalization [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I am using Redmine for project management and issue tracking.
I was looking at the database tables and the underlying structure and was wondering if anyone who is VERY experienced with database architecture can comment on the structure.
I am concerned that once there are many users and hundreds (or thousands) of projects (each project containing many issues, with each issue containing many messages, etc.), the database structure could possibly turn out to be a weak point.
How is the performance impacted by this design?
I would like to hear about the pros and cons of how the tables are laid out and how the data is separated or normalized, and whether or not it might be worth restructuring.
What would be the benefits of separating the data out into more tables (with fewer columns per table)?
The database structure looks typical for an issue/project tracking system. If you can come up with a better structure, I would be very interested in seeing it :).
What you have to remember is that applying normalisation rules is all fine and dandy, but if you apply them too much you may hit performance problems (and the dreaded de-normalisation hacks start to creep in). In other words, there's a balancing act to be done between some normalisation and hardcore (too much) normalisation.
You would have to have a good reason to restructure that database model. For example, it could be that for some particular query the database design does not serve the answer in an efficient manner. You could then start asking yourself what other table(s) could be created to hold the data you need in a form that gives optimal query performance. You could also ask yourself what other indexes could be put in place to improve performance.
The fact is that until you have the very high number of users, projects and issues in this database that you predict, it is hard to answer those questions. Maybe you could generate data for some fake users and projects and test the database to back up your concerns? Remember the adage of Professor Donald Knuth: premature optimization is the root of all evil.
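The suggestion to generate fake data and test can be sketched roughly as follows. This uses Python's built-in `sqlite3` module with a deliberately simplified, hypothetical `projects`/`issues` schema (it is not Redmine's actual schema, and SQLite is not the database Redmine would run on in production), and the `build_fake_tracker` and `query_plan` helpers are made-up names. The point is only the workflow: load synthetic rows, look at the query plan, add an index, and look again.

```python
import random
import sqlite3


def build_fake_tracker(n_projects=200, n_issues=20000, seed=0):
    """Create an in-memory database filled with synthetic projects and issues."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE issues (id INTEGER PRIMARY KEY, project_id INTEGER, subject TEXT)"
    )
    conn.executemany(
        "INSERT INTO projects VALUES (?, ?)",
        [(i, f"project-{i}") for i in range(1, n_projects + 1)],
    )
    conn.executemany(
        "INSERT INTO issues (project_id, subject) VALUES (?, ?)",
        ((rng.randint(1, n_projects), f"issue-{i}") for i in range(n_issues)),
    )
    conn.commit()
    return conn


def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail text.
    return " | ".join(row[-1] for row in plan_rows)


conn = build_fake_tracker()
sql = "SELECT COUNT(*) FROM issues WHERE project_id = ?"

plan_before = query_plan(conn, sql, (42,))  # no index yet: expect a full scan
conn.execute("CREATE INDEX idx_issues_project ON issues (project_id)")
plan_after = query_plan(conn, sql, (42,))   # should now mention the new index
```

Printing `plan_before` and `plan_after` shows whether the engine switches from scanning the whole table to using the index. Scaling `n_issues` up and timing the real queries on the real database engine would give the evidence needed before restructuring anything.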

When to use the best data store and when to stick to relational? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I very frequently find myself having to decide between storing an object in the data store best suited to its nature (events, documents, graphs, etc.) or just sticking to the relational database and moving on with my life, and I bet some of you do too.
I'd like to know what criteria you use to make this decision; for example, when is using NoSQL with little data "premature optimization" and when is it "good engineering"?
So, when to use the best data store and when to stick to relational?
I see a lot of questions with the nosql tag that include the following:
They want to mix RDBMS and NoSQL systems.
They think NoSQL always outperforms RDBMS for large data.
They believe data modeling is always easier with NoSQL.
From personal experience I would consider the cost of discovery when picking a new database.
It is far from easy to move data between relational stores and NoSQL. It's definitely not always intuitive how to model data when working with a document store. Also, some of these databases are so new that their query optimization is nowhere near that of a relational system.
The things I mentioned above might not seem like a problem when you're doing a proof of concept or working with small amounts of data.
My recommendation would be to not let the hype get to you when picking a solution.
(I've worked with production implementations of Mongo, Couchbase, CouchDB and Redis.)
I'm working with Oracle and Couchbase (a NoSQL document-oriented DB). I think that in most cases using NoSQL is easier and less expensive. Every NoSQL DB is a mechanism for solving a rather narrow range of tasks, and if one of them suits your task, a NoSQL solution will be a better fit than a monstrously large Oracle or MS SQL Server. Often we use no more than ten percent of the capabilities of these powerful databases, not because we don't know them well, but because we just don't need all the leeway they provide.
