How do you handle collection and storage of new data in an existing system? [closed] - database

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
I am new to system design and have been asked to solve a problem.
Given a car rental service website, I need to work on a new feature.
The company has come up with some more data that they would like to capture and analyze along with the data that they already have.
This new data can be something like time and cost to assemble a car.
I need to understand the following:
1: How should I approach the problem, from API design perspective?
2: Is changing the schema of your tables going to do any good, if that is an option?
3: Which databases can be used?
The values once stored can be changed. For example, the time to assemble can reduce or increase, hence the users should be able to update the values.

To answer you question let's divide it in two parts, ideal architecture and Q&A's
Architecture:
A typical system would consist of many technologies working together to solve a practical problem. Problems can be solved in many ways and may have more than one solution. We are not talking about efficiency and effectively of any architecture here as it's whole new subject to explore. But it's always wise to choose what's best for your use case.
Since you already have existing software built, it's always helpful to follow it's existing design pattern which will help you understand existing code in detail and allow you to create logical blocks which will fit nicely and actually help in integrating functionality instead of working against it.
Since this clears the pre planning phase let's discuss on how this affects what solution is ideal for your use case in my opinion.
Q&A's
1. How should I approach the problem, from API design perspective?
There will be lots of assumption, anything but less system consisting of api should have basic functionality of authentication and authorization when ever needed. Apart from that, try to stick to full REST specification, which will allow API consumers to follow standard paths and integration would have minimal impact when deciding what endpoints would look like and what they expect from consumer.
Regardless, not all systems are ideal for such use case and thus it's in up to system designer how much of system is compatible with standard practices.
Name convention matters when newer version api will have api/v2
paths and old one having api/v1, which is good practice for routing
new functionality. Which allows system to expand seamlessly.
2: Is changing the schema of your tables going to do any good, if that is an option?
In short term when you do not have much data, it's relatively easy to migrate data. When it becomes huge, it's much more painful and resource intensive.
Good practices would allow you to prevent such scenarios where you might not need migrate data.
Database normalization becomes so crucial in such cases when potential data structure would grow rapidly and requires attention.
Regardless of using any sql or nosql solution, a good data structure will always be helpful in both data management and programming implementation.
In my opinion, getting data structures near perfect is always a good
idea, because it will reduce future costs of migration and
frustration it brings. Still some use cases requires addition on
columns and its okay to add them as long as it does not have much
impact on existing code. Otherwise it can always be decoupled in
separate table for additional fields.
3: Which databases can be used?
Typically any rdbms is enough for this kind of tasks. You might be surprised when you see case studies of large data creators still using mysql in clusters.
So answer is, as long as you have normal scenario, go ahead and pick any database of your choice, until you hit its single instance scalability limits. And those limits are pretty huge for small to mod scale apps.

How should I approach the problem, from API design perspective?
Design a good data model which is appropriate for the data it needs to store. The API design will follow from the data model.
Is changing the schema of your tables going to do any good, if that is an option?
Does the new data belong in the existing tables? Then maybe you should store it there. Except: can you add new columns without breaking any existing applications? Maybe you can but the regression testing you'll need to undertake to prove it may be ruinous for your timelines. Separate tables are probably the safer option.
Which databases can be used?
You're rather vague about the nature of the data you're working with, but it seems structured (numbers?). So that suggests a SQL with strong datatypes would be the best fit. Beyond that, use whatever data platform is currently being used. Any perceived benefits from a different product will be swept away by the complexities and hassle of deploying it.
Last word. Talk this over with your boss (or whoever set you this task). Don't rely on the opinions of some random stranger on the interwebs.

Related

What type of database is the best for storing array or object like data [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I'm just curious what the best method would be if I'm trying to have a bot running on my Node server that I could play Blackjack against.
But for multiple connected clients via sockets, each connected socket will have their own bot to play against but I need some way to keep the bots available cards for each time they send a POST request with whatever card they pull out of their deck.
I figured MySQL would get messy really quickly because I cannot just store an array or an object and splice out each card as it gets used, but I'm not really familiar with which database would specialize in this kind of use.
If I didn't make any sense, basically:
I need to store cards for the bot (but for each connected users session) not just 1 deck for 1 person but multiple decks for multiple people.
I'm not asking you to write any code for me, just point me in the direction of which database would be ideal for this kind of setup.
I was thinking maybe Redis or MongoDB?
Redis would probably be fastest, especially if you don't need a durability guarantee - most of the game can be played out using Redis' in-memory datastore, which is probably gonna be faster than writing to any disk in the world. Perhaps periodically, you can write the "entire game" to disk. If the project is not meant for commercial purposes, i.e. computer errors aren't gonna cause players to lose money, this is definitely an enticing choice.
MongoDB is popular, especially easy to get started with Node, and is definitely faster than most relational SQL solutions, but transactions may be a problem. For a prototype or proof-of-concept projec, it should do fine. But you may also want to look at other "NoSQL" solutions as well.
Cassandra is another popular document-oriented DB, and many people prefer it over MongoDB, for various reasons - most notably, for better scalability.
The choice really highly depends on how you model your data. In your current scenario, I know you want to simply store an object/array, which sounds like you are basically going the way of the aggregated document (MongoDB). You are, in effect, "denormalizing" the entire DB into an aggregate, and performing reads/writes on the entire object every single time in order to achieve consistency. This is a prevalent technique in MongoDB and other document-oriented DBs. But do note that this solution only works because you are not operating across partitions. Think about what happens when you have multiple servers serving the application writing to a separate DB cluster.
You've really got to analyze and decide for yourself what is the best way to model data, if scalability is a concern. Would it be a better model to NOT continually write to this array? For example, generate the sequence of cards once, store it in DB as a Game, and only do reads on it to draw cards? Then, each player's move can be stored as a very succinct data structure Hit referencing a card from the Game. Although the data becomes very relational (back to old school SQL), but the writes are much smaller, and your server never gets into a lock state waiting for players to release the Game object. It may or may not work for your use case, but think about how to model the data for maximum reads and minimum independent writes.
Personally (IMO), if this project is for fun, I'd go with Redis as an in-memory cache layer where most reads/writes happens, and write the game logs into Cassandra. But if this is serious business and I need some real consistency guarantees, I'd probably go back to relational DBs, with a Redis cache layer to speed up reads.
Because there is no one correct answer, the only advice anyone can give is to weigh your application's persistence needs against the strengths/weaknesses of each DB solution, and do a hell lot of research before making an important decision like "which technology to use for persistence". For example, there may be long-term problems with MongoDB that you overlooked - if you'd just Google "MongoDB problems" or "MongoDB sucks". Hell, there may even be long-term problems with all current NoSQL offerings with regards to transactions or consistency.

Good place to look for example Database Designs - Best practices [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Improve this question
I have been given the task to design a database to store a lot of information for our company. Because the task is rather big and contains multiple modules where users should be able to do stuff, I'm worried about designing a good data model for this. I just don't want to end up with a badly designed database.
I want to have some decent examples of database structures for contracts / billing / orders etc to combine those in one nice relational database. Are there any resources out there that can help me with some examples regarding this?
Barry Williams has published a library of about six hundred data models for all sorts of applications. Almost certainly it will give you a "starter for ten" for all your subsystems. Access to this library is free so check it out.
It sounds like this is a big "enterprise-y" application your organisation wants, and you seem to be a bit of a beginner with databases. If at all possible you should start with a single sub-system - say, Orders - and get that working. Not just the database tables build but some skeleton front-end for it. Once that is good enough add another, related sub-system such as Billing. You don't want to end up with a sprawling monster.
Also make sure you have a decent data modelling tool. SQL Power Architect is nice enough for a free tool.
Before you start read up on normalization until you have no questions about it at all. If you only did this in school, you probably don't know enough about it to design yet.
Gather your requirements for each module carefully. You need to know:
Business rules (which are specific to applications and which must be enforced in the database because they must be enforced on all records no matter the source),
Are there legal or regulatory concerns (HIPAA for instance or Sarbanes-Oxley requirements)
security (does data need to be encrypted?)
What data do you need to store and why (is this data available anywhere else)
Which pieces of data will only have one row of data and which will need to have multiple rows?
How do you intend to enforce uniqueness of the the row in each table? Do you have a natural key or do you need a surrogate key (suggest a surrogate key in almost all cases)?
Do you need replication?
Do you need auditing?
How is the data going to be entered into the database? Will it come from the application one record at a time (or even from multiple applications)or will some of it come from bulk inserts from an ETL tool or from another database.
Do you need to know who entered the record and when (highly likely this will be necessary in an enterprise system.
What kind of lookup tables will you need? Data entry is much more accurate when you can use lookup tables and restrict the users to the values.
What kind of data validation do you need?
Roughly how many records will the system have? You need to have an idea to know how big to create your test data.
How are you going to query the data? Will you be using stored procs or an ORM or dynamic queries?
Some very basic things to remember in your design. Choose the right data type for your data. Do not store dates or numbers you intend to do math on in string fields. Do store numbers that are not candidates for math (part numbers, zip codes, phone numbers, etc) as string data as you may need leading zeros. Do not store more than one piece of information in a field. So no comma-concatenated lists (these indicate the need for a related table) and while you are at it if you find yourself doing something like phone1, phone2, phone 3, stop right away and design a related table. Do use foreign keys for data integrity purposes.
All the way through your design consider data integrity. Data that has no integrity is meaningless and useless. Do design for performance, this is critical in database design and is NOT premature optimization. Database do not refactor easily, so it is important to get the most critical parts of the performance equation right the first time. In fact all databases need to be designed for data integrity, performance and security.
Do not be afraid to have multiple joins, properly indexed these will perform just fine. Do not try to put everything into an entity value type table. Use these as sparingly as possible. Try to learn to think in terms of handling sets of data, it will help your design. Databases are optimized to do things in sets.
There's more but this is enough to start digesting.
Try to keep your concerns separate here. Users being able to update the database is more of an "application design" problem. If you get your database design right then it should be a case of developing a nice front end for it.
First thing to look at is Normalization. This is the process of eliminating any redundant data from your tables. This will help keep your database neat, and only store information that is relevant to your needs.
The Data Model Resource Book.
http://www.amazon.com/Data-Model-Resource-Book-Vol/dp/0471380237/ref=dp_cp_ob_b_title_0
HEAVY stuff, but very well through out. 3 volumes all in all...
Has a lot of very well through out generic structures - but they are NOT easy, as they cover everything ;) Always a good starting point, though.
The database should not be the model. It is used to save informations between sessions of work.
You should not build your application upon a data model, but upon a good object oriented model that follows business logic.
Once your object model is done, then think about how you can save and load it, with all the database design that goes with it.
(but apparently your company just want you to design a database ? not an application ?)

Is ORM fit for complex projects? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I've not started the ORM trip yet,
because I'm not sure how it works when the project becomes very complex.
What's your opinion or experience?
This question is a difficult one to answer sometimes. You may have heard of the Object-Relational Impedance Mismatch before; that is the issue that ORM tools attempt to solve, but it is fraught with problems. It is one of those situations where you can solve 90% of the problem in a very short time, but every additional 1% from there on up seems to increase exponentially in complexity because of all the dependencies.
An ORM framework is an abstraction, and becomes a leaky abstraction at several points:
Complex queries/scripts involving concepts like UDTs, CTEs, query hints, temporary tables, windowing functions, etc.
Performance-optimized queries. As Quassnoi mentioned in his answer, ORMs are getting better at this but frequently generate sub-optimal queries, and sometimes the effect is extremely noticeable.
Transaction management - the Unit-of-Work pattern can only get you so far when you have to deal with large batch updates.
Cross-database or cross-server actions. There are workarounds, but they are just that - workarounds. No ORM I've seen really handles this well.
Multiple-table inheritance - this is the only form of inheritance that is actually normalized, and it is really not that hard to manage using pure SQL and manual mapping, but O/R mappers are lousy at it. For many of us, single-table inheritance is not an acceptable alternative.
Those are some of many areas where O/R mappers seem to fail us. Having said that, this does not mean that O/R Mappers are not "fit" for large projects.
In my opinion, ORM in general and O/R Mappers specifically are almost vital for large projects. They save enormous amounts of effort and can help you get an application out the door in a fraction of the time it would have taken you otherwise. They just do not solve the whole problem. You have to be prepared to profile your application to see what the ORM is really doing, and you have to be prepared to drop back down to pure SQL when the situation calls for it (i.e. in several of the situations above).
Some frameworks, such as Linq to SQL, expect you to do this and give you ready-made facilities for executing commands or stored procedures on the same connection and in the same transaction used for the mapper's "regular" duties. L2S is not the only framework that lets you do this, but several are more restrictive, and you end up jumping through many hoops to get what you need. When choosing an ORM, I think that the ability to bypass the abstraction is an important consideration, at least today.
I think the best answer to this question is: Yes, they are fit for large projects, as long as you do not rely exclusively on them. Know the limits of your ORM tool of choice, use it as a time-saver in the 90% of instances when you can, and make sure you and your team understand what's really going on under the hood for those instances when the abstraction leaks.
ORM. by defintion, is object-relational mapping.
This means that you should transform the data stored in a relational database into the objects usable by an object-oriented programming language.
The objects may supply some methods that may involve data processing and searching for the other objects.
This is where the problems begin.
The data processing may be implemented on the ORM side (which means loading the data from the database, applying the object wraparound and implementing the methods on the programming language you use), or on the database side (when the data processing commands are issued as a query to the database).
Compare this:
MyAccount->Transactions()->GetLast()
This can be implemented in two ways:
SELECT *
FROM Accounts
WHERE user_id = #me
into $MyAccount
, then
SELECT *
FROM Transactions
WHERE account_id = #myaccount
into a client-side array #Transactions
, then $Transactions[-1] to get the last.
This is inefficient way, and you'll notice it as you get more data.
Alternatively, a smart ORM can convert it into this:
SELECT TOP 1 Transactions.*
FROM Accounts
JOIN Transactions
ON Transactions.Account = Accounts.id
WHERE Accounts.UserID = #me
ORDER BY
Transactions.Date DESC
, but it has to be a really smart ORM.
So the answer to the question "whether to use an ORM or not" is the answer to the question "will my ORM allow me to issue set-based operations to the database should the need arise"?
This is subjective. My answer is specifically about automated ORM Tools.
I have a philosophical objection to ORM Tools for the following reasons:
1- A table is not and should not necessarily be a one-to-one mapping to a business object.
2- Base CRUD/Business Object code is boring to write, but it's critical to your application. I'd rather be in control and have knowledge of it. (a little NIH syndrome)
3- A new developer coming in is going to have an easier time learning a traditional object model versus whatever bizarre syntax is created by the ORM tool.
You don't mention what platform you are using, but if I wanted to read a record from a database in .NET without using an ORM, I would have to:
Read a Connection String
Open a Database Connection
Open a Command object against the connection
Read my Record (by execute a SQL statement against the Command object)
Transform that Record to an object
in my language of choice
Close my query
Close my connection
Sound complicated? An ORM does all of the same things automatically under the covers, and I only need a few lines of code. In addition, because the ORM has knowledge of your data model, it can sometimes perform optimizations such as caching and lazy loading.
When project becomes more complex it is even better, because it let's keep everything at the same level of abstraction, rather than jumping from objects to SQL. We once have written our own layer in paralell to developing the application (because we couldn't use any traditional ORM), and the more powerful it became, the easier managing application become.
Performance concerns are you usually overrated. It's usually in different place, then you would expect. We had some badass abstraction layer written in Python, and it working great. What sucked, was url library, which we had to rewrite in C. Really, you can always optimize queries, that are most important at the end, writing SQL by hand at the moment, when you see, that performance needs it. But in most times - you won't have to.
In my opinion ORM built for big projects, to minimize effort and development time.
But if you are developing an application which needs very high speed data access code, you need to avoid ORMs as you can because ORMs add a new layer in your application
ORMs are ideal for large projects because they provide a layer of protection from changes on the database side, and speed up the process of adding new features. If performance becomes an issue, you can use a different method to get to your data at the query where you encounter a bottleneck, rather than hand-optimizing every query in the application.
ORM's are cool if you want to pump out web app's quickly to do customer development and see if people actually use your product.
http://www.youtube.com/watch?v=uFLRc6y_O3s
The very last topic that Josh Berkus discusses is "Runaway ORM's". Check it out at 37:20.

Why should you use an ORM? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
If you are motivate to the "pros" of an ORM and why would you use an ORM to management/client, what are those reasons would be?
Try and keep one reason per answer so that we can see which one gets voted up as the best reason.
The most important reason to use an ORM is so that you can have a rich, object oriented business model and still be able to store it and write effective queries quickly against a relational database. From my viewpoint, I don't see any real advantages that a good ORM gives you when compared with other generated DAL's other than the advanced types of queries you can write.
One type of query I am thinking of is a polymorphic query. A simple ORM query might select all shapes in your database. You get a collection of shapes back. But each instance is a square, circle or rectangle according to its discriminator.
Another type of query would be one that eagerly fetches an object and one or more related objects or collections in a single database call. e.g. Each shape object is returned with its vertex and side collections populated.
I'm sorry to disagree with so many others here, but I don't think that code generation is a good enough reason by itself to go with an ORM. You can write or find many good DAL templates for code generators that do not have the conceptual or performance overhead that ORM's do.
Or, if you think that you don't need to know how to write good SQL to use an ORM, again, I disagree. It might be true that from the perspective of writing single queries, relying on an ORM is easier. But, with ORM's it is far too easy to create poor performing routines when developers don't understand how their queries work with the ORM and the SQL they translate into.
Having a data layer that works against multiple databases can be a benefit. It's not one that I have had to rely on that often though.
In the end, I have to reiterate that in my experience, if you are not using the more advanced query features of your ORM, there are other options that solve the remaining problems with less learning and fewer CPU cycles.
Oh yeah, some developers do find working with ORM's to be fun so ORM's are also good from the keep-your-developers-happy perspective. =)
Speeding development. For example, eliminating repetitive code like mapping query result fields to object members and vice-versa.
Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.
Supporting OO encapsulation of business rules in your data access layer. You can write (and debug) business rules in your application language of preference, instead of clunky trigger and stored procedure languages.
Generating boilerplate code for basic CRUD operations. Some ORM frameworks can inspect database metadata directly, read metadata mapping files, or use declarative class properties.
You can move to different database software easily because you are developing to an abstraction.
Development happiness, IMO. ORM abstracts away a lot of the bare-metal stuff you have to do in SQL. It keeps your code base simple: fewer source files to manage and schema changes don't require hours of upkeep.
I'm currently using an ORM and it has sped up my development.
So that your object model and persistence model match.
To minimise duplication of simple SQL queries.
The reason I'm looking into it is to avoid the generated code from VS2005's DAL tools (schema mapping, TableAdapters).
The DAL/BLL i created over a year ago was working fine (for what I had built it for) until someone else started using it to take advantage of some of the generated functions (which I had no idea were there)
It looks like it will provide a much more intuitive and cleaner solution than the DAL/BLL solution from http://wwww.asp.net
I was thinking about created my own SQL Command C# DAL code generator, but the ORM looks like a more elegant solution
Abstract the sql away 95% of the time so not everyone on the team needs to know how to write super efficient database specific queries.
I think there are a lot of good points here (portability, ease of development/maintenance, focus on OO business modeling etc), but when trying to convince your client or management, it all boils down to how much money you will save by using an ORM.
Do some estimations for typical tasks (or even larger projects that might be coming up) and you'll (hopefully!) get a few arguments for switching that are hard to ignore.
Compilation and testing of queries.
As the tooling for ORM's improves, it is easier to determine the correctness of your queries faster through compile time errors and tests.
Compiling your queries helps helps developers find errors faster. Right? Right. This compilation is made possible because developers are now writing queries in code using their business objects or models instead of just strings of SQL or SQL like statements.
If using the correct data access patterns in .NET it is easy to unit test your query logic against in memory collections. This speeds the execution of your tests because you don't need to access the database, set up data in the database or even spin up a full blown data context.[EDIT]This isn't as true as I thought it was as unit testing in memory can present difficult challenges to overcome. But I still find these integration tests easier to write than in previous years.[/EDIT]
This is definitely more relevant today than a few years ago when the question was asked, but that may only be the case for Visual Studio and Entity Framework where my experience lies. Plugin your own environment if possible.
.net tiers using code smith templates
http://nettiers.com/default.aspx?AspxAutoDetectCookieSupport=1
Why code something that can be generated just as well.
convince them how much time / money you will save when changes come in and you don't have to rewrite your SQL since the ORM tool will do that for you
I think one cons is that ORM will need some updation in your POJO. mainly related to schema, relation and query. so scenario where you are not suppose to make changes in model objects, might be because it is shared among more that on project or b/w client and server. so in such cases you will need to split it in two levels, which will require additional efforts .
i am an android developer and as you know mobile apps are usually not huge in size, so this additional effort to segregate pure-model and orm-affected-model does not seems worth full.
i understand that question is generic one. but mobile apps are also come inside generic umbrella.

What are the general guidelines and best practices to keep in mind while designing database for an application? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 3 years ago.
Improve this question
My questions is regarding Database Modeling. I tried to look for this question amongst other Database Designing questions on SO but haven't found it and so here am asking about:
What are the general guidelines and best practices to keep in mind while designing database for an application ?
What are the best resources/books/University Lectures available on Database Design Concepts ?
Thanks.
Just some things I've learned from experience (I'm sure some will disagree, but I've been querying and designing and programming databases for 30+years and have seen the effects of stupid design up close and personal):
There are three critical things to consider in all database design - data integrity (without this you essentially have no data), security and performance. All other considerations take a back seat to these three.
Never create a table without a way to uniquely identify a record.
There really are very few true natural keys that really work as a primary key, if you don't have control over whether it will change, do not use it as a primary key (you don't really want to change the company name through 27 child tables do you?). Use a surrogate key instead. Using a surrogate key does not exempt you from the need to set unique indexes if you could have used a unique composite key. Always set these indexes if you can determine a way to have a unique composite. Duplicate records are the bane of an application's existance. It seems obvious but never ever consider name to be a key field, names are not and never will be unique.
Do not use a GUID as your primary key as it can kill performance. If you need a guid for replication also consider having an int or big int primary key.
Do not design as if you will be changing database backends unless you know up front you will be doing so. Virtually all the good techniques for performance tuning are database specific, don't harm your own ability to tune your database for a non-existant requirement.
Avoid value-entity table structures. They are miserable to query.
Add all things you need to ensure data integrity into your database design, things like defaults, constraints, triggers, etc. are necessary to avoid having useless data. Do not rely on the application code to do this or you will be sorry.
Others mentioned normalization, I agree you must understand this thoroughly even if you later decide to denormalize.
Do not stack views on top of views if you want any kind of performance at all. Every database I've seen that does this is eventually a huge performance problem.
Consider what data you will need to manage the database as well as what the application needs. If you are going to be serious about databases you need to understand database auditing and your database should implement ways to find out who made what change and when and what the old data was. You'll thank me the first time someone malicious changes the data or someone deletes all the records in a table accidentally.
Really think through how the data will be queried when designing. It can make a huge difference in the design.
Do not store more than one piece of information in a field. It might look cool to put a comma delimited list into one field rather than add a related table but it is a really bad idea.
Elegance is often the enemy of performance in databases. Pick performance over elegance every time and you won't go wrong.
Avoid the use of database keywords in the naming of objects. Your programmers will thank you. Pick a naming convention and be consistent in always using it. If a field is in mulitple tables make sure it is the same name (exception if an id field has two possible foreign keys in the same table use the id field name and a prefix to identify the differnce between say Sales_person_id and Customer_person_id), same datatype and length, if applicable in all of them. Fix misspellings right away, you really don't want to spend the next ten years remembering that in tablea it is the persnoid instead of personid.
Read about database refactoring (search on amazon for some good books) and consider how to be able to do this in your design. Few databases are designed to be refactored and being able to do so is critical towards being able to fix database problems that arise from badly thought out designs or changes to business requirements.
While you are reading, read about performance tuning, you'll learn a tremendous amount about what to avoid in designing the database.
I'm sure there's more but this is enough to start with.
One addtional thing I wanted to add. Do not design your database as if the data entry application page is the most critical thing. Data is often queried more often than it is written even in a transactional database. Really think about how easy it will be to to get data back out of the database (Oh so that's why the EAV model is so bad!) and what effect the design will have on reporting. This is espcially critical as I often see that the people doing the reporting are not the people who design the database or reporting tasks are later in the project than createing the data entry. Databases are not easy to refactor, consider the whole life cycle of the data when designing a database. Think about things like storing moment in time values as you can't find out how much an order was for two years later by multiplying the quantity ordered by the price in the products table as that wasn't the price at the time of the order. Reporting needs this type if information, but it often too late to get it by the time the reports are written when the design is done badly. Stuff that works fine when you are handling one record at a time can be a disaster when you need to look at thousands or millions of records. Not every application is going to create a separate reporting datbase, so really think about this.
DEPENDS
this question is like saying "what is the best car to buy", it really depends on many factors including amount of data, number of concurrent users, what you are trying to do, etc. FYI, normalization is good for some database uses, but bad for others (data warehouse).
Give us a better idea of how you intend to use the data, and you'll get some better recommendations.
While I agree with others that your question right now is much too broad and can't really be answered (except for the "it depends" approach :-)), there is one book I would wholeheartedly recommend for anyone beginning database design in general:
Michael Hernandez: Database Design for Mere Mortals(R): A Hands-On Guide to Relational Database Design
It's a really hands-on, no-frills, down to earth book and introduces all the major and important concepts in a very understandable, very approachable fashion. Well written, interesting, very sound and useful - highly recommended!
Marc
your question is too broad. Normalization and denormalization are most used concepts.
The best thing to do is to start with a well normalized database. The wikipedia article has some good information on that along with some good references.
Typically you'll end up denormalizing parts of your database for better performance, but you almost always want to start with it in 4th normal form.
Look at wikipedia article about database normalization. There is also further reading section.
If you design a new database for brand new application you should try use ORM library (like JPA implementations in Java) that release you from database design, because these tools generate database from domain model. If you don't have any experience in this field - database generated with ORM tools will be much better of yours.
Consider all your use cases. Think about every single possible way someone might want to get to data, and plan for those. Wear your designer, developer, tester, and user hats.
Try to think of database tables as representing physical objects.
Normalize, as others have said.

Resources