When to use a smaller table - database

I'm working on a database design for an application that will manage Model Congress groups. I'm having particular trouble with the object meant to represent the congress. Right now, the field list looks like this:
+------------+
| Congress   |
+------------+
| congressID |
| adminID    |
| speakerID  |
| hopperID   |
| floorID    |
| rulesID    |
| (etc.)     |
+------------+
This is what each field is meant to represent. Tables/objects are all caps.
congressID: The primary key (obviously)
adminID: References the unique PERSON who runs the model congress, i.e., the teacher.
speakerID: References the unique REP (representative) who acts as speaker for the congress
hopperID: References the special COMMITTEE (a committee being any place a bill can be sent) to which new bills are initially sent.
floorID: References the COMMITTEE used to represent the congress's floor
rulesID: References the Rules COMMITTEE
As you can see, these fields are important to reference in the context of each model congress. The issue I am having is how to represent the foreign keys, primarily the last four.
It seems I have two choices:
Include them all in the Congress table as they are now, or
Make smaller tables for each field with composite primary keys, e.g. repID+congressID in Speaker, committeeID+congressID in Hopper, etc.
Is more granular necessarily better? Or does this needlessly complicate things? I've been skirting around my design with the first layout for a while, but whenever I try to draw the ERD from that point, the relationships appear hopelessly mangled.

GROUPS should be a table that simply records the number of the congress in its name or description (1st, 2nd, etc.), and each PERSON should have an ID, with fields for their name and biographical details. Then add an M:N table (call it, say, GROUPMEMBERS) with its own ID, an ID referencing GROUPS, and a reference to either PERSON or GROUPS (two separate fields, so that groups can be attached to other groups in a hierarchy). Finally, add a tag table (GROUPROLES) with its own ID, a begin date (and possibly an end date), a reference to GROUPMEMBERS, and a reference to a TYPES table that defines one role per row within the membership.
What does this allow you to do? A person's role can now change mid-congress with the dates attached to the roles. Committees can be part of congress (named groups with the # of congress in the description or name), and roles within a committee (like chairman) can be assigned.
Drawbacks? Reporting out involves a bit of gymnastics, but it is not impossible. The table names don't correspond directly to the domain terms, but they shouldn't need to be exposed to the application users. Constraints on the number of members per role per group would need to be either set up as custom functions in the database, or application-controlled (not recommended).
This also allows you to set up dependent infrastructure that tracks flow of items through committees and you can even report out how many times a single person saw a single thing by their membership to groups the item went through. It can also be used to track votes and denormalize information to large group votes (like passing or dying in committee based on votes) simply by running an aggregate on the vote in question and referencing the group. If this were the case, I'd probably make roles a subclass of a historical table and run everything out of a central history.
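The layout described above can be sketched roughly as follows. This is a minimal illustration using Python's built-in sqlite3 module, not a definitive schema: the table name grp (standing in for GROUPS), all column names, the sample people and dates, and the role_holder helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- "grp" stands in for GROUPS: congresses and committees alike (assumed name).
CREATE TABLE grp (group_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE types (type_id INTEGER PRIMARY KEY, role_name TEXT NOT NULL UNIQUE);
CREATE TABLE groupmembers (
    member_id       INTEGER PRIMARY KEY,
    group_id        INTEGER NOT NULL REFERENCES grp(group_id),
    person_id       INTEGER REFERENCES person(person_id),
    member_group_id INTEGER REFERENCES grp(group_id),
    -- exactly one of the two member references must be set
    CHECK ((person_id IS NULL) <> (member_group_id IS NULL))
);
CREATE TABLE grouproles (
    role_id    INTEGER PRIMARY KEY,
    member_id  INTEGER NOT NULL REFERENCES groupmembers(member_id),
    type_id    INTEGER NOT NULL REFERENCES types(type_id),
    begin_date TEXT NOT NULL,   -- ISO dates compare correctly as text
    end_date   TEXT             -- NULL means the role is still held
);

INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO grp VALUES (1, '1st Congress'), (2, 'Hopper (1st Congress)');
INSERT INTO types VALUES (1, 'speaker'), (2, 'hopper');
-- Alice and Bob are members of the congress; so is the hopper committee.
INSERT INTO groupmembers VALUES (1, 1, 1, NULL), (2, 1, 2, NULL), (3, 1, NULL, 2);
-- The speaker role changes hands mid-congress.
INSERT INTO grouproles VALUES
    (1, 1, 1, '2024-01-01', '2024-03-01'),
    (2, 2, 1, '2024-03-01', NULL),
    (3, 3, 2, '2024-01-01', NULL);
""")


def role_holder(conn, group_name, role_name, on_date):
    """Return the name of the person or sub-group holding a role on a date."""
    row = conn.execute(
        """
        SELECT COALESCE(p.name, mg.name)
        FROM grouproles r
        JOIN types t        ON t.type_id = r.type_id
        JOIN groupmembers m ON m.member_id = r.member_id
        JOIN grp g          ON g.group_id = m.group_id
        LEFT JOIN person p  ON p.person_id = m.person_id
        LEFT JOIN grp mg    ON mg.group_id = m.member_group_id
        WHERE g.name = ? AND t.role_name = ?
          AND r.begin_date <= ?
          AND (r.end_date IS NULL OR r.end_date > ?)
        """,
        (group_name, role_name, on_date, on_date),
    ).fetchone()
    return row[0] if row else None
```

With this shape, the speakerID, hopperID, floorID and rulesID columns of the original Congress table become rows in grouproles, which is also where the reporting "gymnastics" mentioned above show up: even a simple lookup needs several joins.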

Related

RDBMS: How to model a company having different products at multiple locations

I have an existing database that models all the products a company is either producing or consuming. The database is quite simple:
Table: companies {PK: company_id}
+------------+--------------+
| company_id | company_name |
+------------+--------------+
Table: products {PK: product_id}
+------------+------------+---------------+
| company_id | product_id | product_price |
+------------+------------+---------------+
Now, if I need to add location information to it, it starts to get complicated.
Basically, now a company has many locations and each location has many products.
To further complicate matters, some attributes of the product e.g. price may not be the same at each location. I would like to share other common attributes at all locations (Basically, I want to avoid creating three copies of product A that's used at all three locations).
I'm not sure what the best way to model this is. I can think of
Table: company_location
+------------+-------------+
| company_id | location_id |
+------------+-------------+
Table: location_product
+-------------+------------+
| location_id | product_id |
+-------------+------------+
But this design would not allow product attributes to change per location, without creating an entirely different product for each location. I also don't have a way to maintain a master product list per company.
Any help is appreciated.
PS: I'm using a PostgreSQL database.
The rules of normalization would tell you that you need your non-key attributes to depend on all of the key values (and nothing else).
If price is determined by:
- The company who makes it
- The location that sells it
- What the product actually is
Then that implies that PRICE needs a candidate key that specifies company, location and product.
The issue becomes what the relationships are between companies, products and locations. Also, what else do you know (what columns do you have) about these three kinds of things?
If they are all totally independent, for example, the products are commodities and don't depend at all on companies and the locations are independent distributors, which have nothing to do with either companies or what kind of products are sold there, then really a single three-way join is probably your best bet.
However, if there are some linkages between company, product and location, then you need to normalize these items out appropriately. At the end, you may still find yourself tempted to keep price as the only attribute in a three-way join. Alternatively, you may find that your data is actually more hierarchical (companies have locations which sell products that are fundamentally different in some meaningful way from similar products sold at other locations). In such a case the price might live on the leaf level of a tree structure.
It's really hard to say for sure what would work best for you without understanding your business rules better.
The bottom line is, you should aim for third normal form (3NF).
You probably want something like this:
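One possible shape, as a minimal sketch only: it assumes the hierarchical case (a company owns its locations and its master product list, and price depends on location and product). It uses Python's built-in sqlite3 module so it can be run as-is; the same DDL idea carries over to PostgreSQL. All table and column names beyond those in the question, the sample rows, and the add_offer and prices_for_product helpers are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE company (
    company_id   INTEGER PRIMARY KEY,
    company_name TEXT NOT NULL
);
CREATE TABLE location (
    location_id   INTEGER PRIMARY KEY,
    company_id    INTEGER NOT NULL REFERENCES company(company_id),
    location_name TEXT NOT NULL,
    UNIQUE (company_id, location_id)      -- target for the composite FK below
);
-- Master product list per company: shared attributes live here, once.
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    company_id   INTEGER NOT NULL REFERENCES company(company_id),
    product_name TEXT NOT NULL,
    UNIQUE (company_id, product_id)
);
-- Attributes that vary per location (here: price) live on the association.
CREATE TABLE location_product (
    company_id  INTEGER NOT NULL,
    location_id INTEGER NOT NULL,
    product_id  INTEGER NOT NULL,
    price_cents INTEGER NOT NULL,
    PRIMARY KEY (location_id, product_id),
    -- carrying company_id in both FKs keeps location and product
    -- from belonging to different companies
    FOREIGN KEY (company_id, location_id) REFERENCES location(company_id, location_id),
    FOREIGN KEY (company_id, product_id)  REFERENCES product(company_id, product_id)
);

INSERT INTO company  VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO location VALUES (10, 1, 'Downtown'), (11, 1, 'Uptown');
INSERT INTO product  VALUES (100, 1, 'Widget'), (200, 2, 'Gadget');
""")


def add_offer(conn, company_id, location_id, product_id, price_cents):
    """Record that a location carries a product at a location-specific price."""
    conn.execute(
        "INSERT INTO location_product VALUES (?, ?, ?, ?)",
        (company_id, location_id, product_id, price_cents),
    )


def prices_for_product(conn, product_id):
    """List (location_name, price_cents) for one product, ordered by location."""
    return conn.execute(
        """
        SELECT l.location_name, lp.price_cents
        FROM location_product lp
        JOIN location l ON l.location_id = lp.location_id
        WHERE lp.product_id = ?
        ORDER BY l.location_name
        """,
        (product_id,),
    ).fetchall()


add_offer(conn, 1, 10, 100, 999)    # Widget at Downtown
add_offer(conn, 1, 11, 100, 1099)   # same Widget, different price at Uptown
```

The product row exists once, so there are no per-location copies of product A, while the price is keyed by location and product. If your business rules turn out to be different (for example, independent distributors), the keys would change accordingly.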

What is the best way to represent/design groups and members in a database

Suppose that Group and Member are two Java objects. Each Member occurs only once in a Group, and a Group has zero or more Members. The attributes of these two objects are almost the same. A Member, however, inherits its values from the Group that it belongs to. But the values of the Member can be overridden as well.
What is the best way to design this in a database? The only thing I can think of is to have two tables that represent these two objects. But this means that the member table contains many similar values.
Group table
id | attr1  | attr2  |
---------------------
1  | value1 | value2 |
Member table
id | attr1  | attr2  | group_id | attr3 |
-----------------------------------------
1  | value1 | value2 | 1        | foo   |
2  | bar    | value2 | 1        | foo   |
As you can see the member 1 has "inherited" its values from group 1 and has its own attr3 value foo. Member 2 also "inherited" values from group 1 but its attr1 value has been overwritten by bar.
What is the best way to design this in a database?
The best way is to understand the scientific principles, that data and program elements (including objects) are completely different species, and each has quite different methods and rules for analysis, design, and implementation. A real man and a real woman make a great marriage, precisely because each is different. A confused or enmeshed partner makes a disastrous marriage.
Therefore I will address the data requirement in your question, using standards, such that the data is stable and easy to extend. And you are free to build any object from that, either using Standards, such that the objects are stable and easy to extend. Or not.
Here is the Normalised Relational database that supports your stated requirement.
Group vs Member Attribute Data Model
No Nulls. No duplicates of anything. No Update Anomalies.
In the default case, the Group's attributes are each Member's attributes. These defaults should not be stored per Member; that would be massive and unnecessary duplication.
You need Optional Columns for the Member attributes that, if set, override the default attributes.
I have given each attribute separately, allowing each of them to be set independently over time. If all of them were to be set together, you can merge them into one optional table.
Relational Keys are provided, which means you will have the highest level of Relational Integrity, power, and speed. Given the level of your question, you may not appreciate the value of that right now, but you will appreciate it once you start coding.
That is an IDEF1X data model. IDEF1X is the Standard for modelling Relational Databases. Please be advised that every little tick; notch; and mark; the crows foot; the solid vs dashed lines; the square vs round corners; means something very specific and important. Refer to the IDEF1X Notation. If you do not understand the Notation, you will not be able to understand or work the model.
Of course, a Member should occur just once in each Group (otherwise you would have row duplication, which is prohibited in the Relational Model).
"A Member, however, inherits its values from the Group that it belongs to"
That implies that each Member belongs to just one Group.
If Members can belong to more than one Group, we have to (a) specify which Group he receives default attributes from, and (b) change the model. It is easy.
If you would like the Predicates, please ask.
"What is the best way to design this in a database?"
That of course, leads to the next, and obvious, question: What is the best way to design the objects to use the database?
If I were in your position, I would use a View each, gather all the data for Group, and for Member, and then use that to load your objects. If you need the code for that, just ask.
Keep the objects simple, you do not need to mess around with trying to implement "inheritance" in the objects. That is, keep the data issues in the database, and the object issues in the objects, and do not scramble your eggs. We build software components for deployment, not pre-1970 style monolithic object layers.
And of course, use ACID Transactions to update the database, not OO or ORM "persistence".
It is 2015, after all, and we have had the Relational Model since 1970; SQL platforms including ACID since 1984. There is no need to regress to ancient filing systems. I give this warning because I am quite aware that the OO/ORM crowd advise the implementation of pre-relational filing systems.
Please feel free to ask questions or comment.
You can have the attr1 and attr2 fields in the Member table contain NULL initially. If attr1 or attr2 is NULL, you then need to query the Group table for the inherited value.
If attr1 or attr2 contains a value, that field was overridden.
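A minimal sketch of this NULL-as-"inherit" idea, using Python's built-in sqlite3 module. The table name grp (standing in for the Group table) and the member_effective view are assumptions made for the example; COALESCE picks the member's own value when it is set and falls back to the group's value otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grp (
    id    INTEGER PRIMARY KEY,
    attr1 TEXT NOT NULL,
    attr2 TEXT NOT NULL
);
CREATE TABLE member (
    id       INTEGER PRIMARY KEY,
    group_id INTEGER NOT NULL REFERENCES grp(id),
    attr1    TEXT,   -- NULL means "inherit from the group"
    attr2    TEXT,   -- NULL means "inherit from the group"
    attr3    TEXT
);
INSERT INTO grp VALUES (1, 'value1', 'value2');
INSERT INTO member VALUES (1, 1, NULL,  NULL, 'foo');   -- inherits both
INSERT INTO member VALUES (2, 1, 'bar', NULL, 'foo');   -- overrides attr1

-- Resolve the effective values in one place instead of in every query.
CREATE VIEW member_effective AS
SELECT m.id,
       COALESCE(m.attr1, g.attr1) AS attr1,
       COALESCE(m.attr2, g.attr2) AS attr2,
       m.attr3
FROM member m
JOIN grp g ON g.id = m.group_id;
""")

rows = conn.execute(
    "SELECT id, attr1, attr2, attr3 FROM member_effective ORDER BY id"
).fetchall()
```

One trade-off of this approach is that NULL can no longer mean "unknown" for these columns, since it is reserved for "not overridden".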

Database Design: Use mapping tables or include data directly in tables

I am writing a project management web app just for practice. The basic idea is that a user can add a project to the app and then manage their tasks and appointments related to the project through the interface. I'm currently designing the Database and I was wondering what best practice would dictate here.
I have 4 tables so far:
+----------+  +-------------+  +--------------+  +-------------+
|Users     |  |Projects     |  |Tasks         |  |Appointments |
+----------+  +-------------+  +--------------+  +-------------+
|id        |  |id           |  |id            |  |id           |
|username  |  |project_name |  |task_name     |  |appt_name    |
|fname     |  |project_desc |  |task_details  |  |appt_details |
|sname     |  |             |  |task_deadline |  |appt_date    |
+----------+  +-------------+  +--------------+  +-------------+
I'm taking the basic relationships as:
one user can have many projects,tasks, and appointments.
one project can have many users, tasks and appointments.
one task can have many users, but only be associated with one project. A task can't be associated with an appointment.
The rules for the tasks also apply to the appointments.
My question is: when is it suitable to use mapping tables and when is it suitable to include the data directly in the associated table? My take on my example would be:
have a mapping table for each of users-projects/tasks/appts because there can be many users for each type and many of each type per user
in the tasks and appointments tables include a project_id field that can be used to associate tasks and appointments with projects and thereby the users of that project.
Would this be the correct approach or is there a better solution? I'm fairly new to database design so I would really appreciate some constructive criticism
"I'm currently designing the Database and I was wondering what best practice would dictate here"
Best Practice dictates that the data must be modelled, as data, without regard to the use or the app. Without regard to the platform as well, but the world is upside-down and backwards these days, the platform is chosen first.
Modelling means that you identify and consider the entities first, before you consider what you are going to do with them second (such as "mapping").
No Option
"My question is: when is it suitable to use mapping tables"
It is the normal method.
Correct
theoretically founded
allows all functions and capabilities that users expect databases to have
eg. aggregation, single or multiple item (subset of the list) searches are very fast, etc
easy to expand
prevents preventable errors
gives you chips that you can cash in, in Heaven.
"and when is it suitable to include the data directly in the associated table?"
Never. That will create a comma-separated list in a single column.
Incorrect
No theoretical basis
breaks First Normal Form
beloved of the incompetent (they not only don't know the rules, they don't know when they are breaking the few rules they do know)
database features and functions cannot be used
eg. searching for, determining if, a specific user is working on a project will cause a tablescan
result is not a database, it is a Record Filing System
difficult to expand
you will spend half your life fixing preventable errors, and the other half thinking about how to replace it without anyone noticing
guarantees you a specific place in hell, sixth level, with the frauds and those who cheat workers out of their wages, one level below murderers, one above pædophiles and war-mongers
"have a mapping table for each of users-projects/tasks/appts because there can be many users for each type and many of each type per user"
Generally, yes. But that is not clear. "Type" rings alarm bells; it sounds like you intend to have one table that supports all possibilities, nullable Foreign Keys, etc. Refer to "Never" above.
There should be an Associative Table (not "mapping") between only those pairs of tables that need it, not between each and every possibility. And each such table relates ("links", "maps", "connects") just one discrete pair.
This will be resolved when the Normalisation is completed, next ...
Consideration
The requirement does sound a bit suspicious. I do not accept that those tables are all isolated, fragmentary facts. Consider:
First, Tasks are probably a child of Project (you've implied that; such a dependency should be explicit). Likewise, Appointments should be a child of Project. That is, a Task cannot exist except in the context of a Project. Likewise for Appointment.
Then you have to evaluate whether Users should be related to Projects (as given in the requirement). It seems to me that a User is assigned to a Task (and thus related to the Project because the Task belongs to one Project), and not to all Tasks in the Project. Likewise for User::Appointment.
If Users are related to Projects (and not to specific Tasks), as per the requirement, it does seem too general. Especially if an Appointment applies to a Project, and therefore to all Users assigned to the Project.
So it appears to me on the info received thus far, plus my suggestions (which have not been confirmed, so this one is thin ice), that Appointments are made at the lower level, the Task level, and may well apply to all Users assigned to the Task.
There may be a second type of Appointment, at the Project level, which applies to the distinct set of all Users assigned to all Tasks in the Project.
As long as my suggestions above are correct, particularly that Users are assigned to Tasks, if an Appointment is made at the Task level, it applies to all Users assigned to that Task, then there are no Associative ("mapping") Tables at all.
IDs cannot provide row uniqueness. How do you ensure row uniqueness, as demanded for relational databases ?
As you can see, stamping an ID column on every table that is perceived in the first draft of the model cripples, prevents, the data modelling exercise. You need 10 to 12 drafts. Somewhere around the fifth, you will assign Keys. At 9 or 10, you will assign IDs to the few tables (if any) that need them.
Assigning IDs first guarantees a first draft implementation in an RFS, which means no database integrity, no database capability.
Consider, confirm/deny, discuss, etc.
Here's a diagram to use as a discussion platform. Please use the link at the bottom of it, and familiarise yourself with the Notation, to whatever level you see fit.
Project Management ERD • Second Draft
One suggestion may not sound like a technical one, more like grammar. When describing your entities and their relationships with each other, do not mention or even think about tables, columns or whatever. At the beginning of the design process, they are entities -- not tables, attributes -- not columns. Don't influence the physical design too early.
Do use words that closely match the relationships. For example, I doubt that in the normal course of conversation, one user will ask another if they "have a relationship" with a project. It will be more like "Are you involved in this project?" or "Are you working on this project?" So a user can be involved in many projects and a project can have many users involved in it. Be specific in naming just what the relationship is but you don't have to get anal about it. There could be several close fits -- choose one and go on.
As for mapping tables, when you describe a many-to-many relationship, you don't really have much choice.
You do have a choice in a one-to-many relationship. A task, for example, is "performed for" only one project. This means that the FK to Project can be part of the Task tuple. But you can also implement a one-to-many mapping table. This is generally done when there seems to be at least a possibility that the relationship might evolve into a many-to-many sometime in the future.
The difference between a many-to-many and a one-to-many mapping table is trivial:
create table UserProjectMap(
    UserID    int not null,
    ProjectID int not null,
    constraint FK_UserProject_User foreign key( UserID )
        references Users( ID ),
    constraint FK_UserProject_Project foreign key( ProjectID )
        references Projects( ID ),
    -- many-to-many: the key spans both columns
    constraint PK_UserProjectMap primary key( UserID, ProjectID )
);
create table TaskProjectMap(
    TaskID    int not null,
    ProjectID int not null,
    constraint FK_TaskProject_Task foreign key( TaskID )
        references Tasks( ID ),
    constraint FK_TaskProject_Project foreign key( ProjectID )
        references Projects( ID ),
    -- one-to-many: a task can appear only once, so it maps to one project
    constraint PK_TaskProjectMap primary key( TaskID )
);
In case you missed it, it's the last line of each definition.
Converting a one-to-many mapping table to many-to-many is easy -- just drop the unique constraint on one side. Or, in the example above, redefine the PK to include both FK fields. That means no structural changes, which are extremely difficult to do when a design has been in use for any length of time -- unless you've prepared for them ahead of time.
But that's 500-level work.
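To see the effect of the two primary keys, here is a small sketch that recreates both mapping tables in an in-memory SQLite database via Python's built-in sqlite3 module. The sample IDs and the try_insert helper are assumptions for illustration; the table and column names follow the definitions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users    (ID INTEGER PRIMARY KEY);
CREATE TABLE Projects (ID INTEGER PRIMARY KEY);
CREATE TABLE Tasks    (ID INTEGER PRIMARY KEY);

-- many-to-many: the key is the pair
CREATE TABLE UserProjectMap (
    UserID    INTEGER NOT NULL REFERENCES Users(ID),
    ProjectID INTEGER NOT NULL REFERENCES Projects(ID),
    PRIMARY KEY (UserID, ProjectID)
);
-- one-to-many: the key is the task alone
CREATE TABLE TaskProjectMap (
    TaskID    INTEGER NOT NULL REFERENCES Tasks(ID),
    ProjectID INTEGER NOT NULL REFERENCES Projects(ID),
    PRIMARY KEY (TaskID)
);

INSERT INTO Users    VALUES (1);
INSERT INTO Projects VALUES (1), (2);
INSERT INTO Tasks    VALUES (1);
""")


def try_insert(conn, table, left_id, project_id):
    """Insert a mapping row; return False if a constraint rejects it."""
    try:
        # table is one of two fixed names here, never user input
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (left_id, project_id))
        return True
    except sqlite3.IntegrityError:
        return False


user_in_p1 = try_insert(conn, "UserProjectMap", 1, 1)
user_in_p2 = try_insert(conn, "UserProjectMap", 1, 2)   # allowed: many-to-many
task_in_p1 = try_insert(conn, "TaskProjectMap", 1, 1)
task_in_p2 = try_insert(conn, "TaskProjectMap", 1, 2)   # rejected: one project per task
```

The structure of the two tables is identical; only the key decides whether a second project for the same task is accepted.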
Oh, one more piece of advice. Don't be too quick to denormalize or make any changes for no better reason than it will make queries or DML easier for the developers. The sole purpose of the database (and your goal as the designer) is to serve the needs of the users, not the db developers. At the top of that list of needs is data integrity. Don't sacrifice data integrity for a little more performance or for ease of maintenance. The DBAs may grumble, but the users will appreciate it -- and it's the users who ultimately pay your salary.

How to better organise database to account for changing status in users

The users I am concerned with can either be "unconfirmed" or "confirmed". The latter means they get full access, where the former means they are pending on approval from a moderator. I am unsure how to design the database to account for this structure.
One thought I had was to have 2 different tables: confirmedUser and unconfirmedUser that are pretty similar except that unconfirmedUser has extra fields (such as "emailConfirmed" or "confirmationCode"). This is slightly impractical as I have to copy over all the info when a user does get accepted (although I imagine it won't be that bad - not expecting heavy traffic).
The second way I imagined this would be to actually put all the users in the same table and have a key towards a table with the extra "unconfirmed" data if need be (perhaps also add a "confirmed" flag in the user table).
What are the advantages and disadvantages of each approach and is there perhaps a better way to design the database?
The first approach means you'll need to write every query you have for two tables - for everything that's common. Bad (tm). The second option is definitely better. That way you can add a simple where confirmed = True (or False) as required for specific access.
What you could also ponder is whether the confirmation data (not the user, just the data) belongs in the same table. It might be cleaner and more normalized to keep all confirmation data in a separate table, so that you left join confirmation on confirmation.userid = users.id where confirmation.userid is not null (or simply inner join, or fetch everything and filter in a server-side script, etc.) to get only confirmed users. Additional data such as the confirmation email, date, etc. can be stored there.
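A minimal sketch of the separate-table variant, using Python's built-in sqlite3 module. It assumes a confirmation row exists only once a user has been confirmed; the column names, sample users, and the two helper functions are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
-- At most one confirmation row per user; its presence means "confirmed".
CREATE TABLE confirmation (
    userid       INTEGER PRIMARY KEY REFERENCES users(id),
    email        TEXT NOT NULL,
    confirmed_at TEXT NOT NULL
);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO confirmation VALUES (1, 'alice@example.com', '2024-01-15');
""")


def confirmed_users(conn):
    """Users that have a confirmation row (inner join)."""
    return [r[0] for r in conn.execute(
        """
        SELECT u.username
        FROM users u
        JOIN confirmation c ON c.userid = u.id
        ORDER BY u.username
        """
    )]


def pending_users(conn):
    """Users with no confirmation row (left join, then keep the misses)."""
    return [r[0] for r in conn.execute(
        """
        SELECT u.username
        FROM users u
        LEFT JOIN confirmation c ON c.userid = u.id
        WHERE c.userid IS NULL
        ORDER BY u.username
        """
    )]
```

Confirming a user is then a single insert into confirmation, with no copying of user data between tables.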
Personally I would go for your second option: 1 users table with a confirmed/pending column of type boolean. Copying over data from one table to another identical table is impractical.
You can then create groups and attach specific access rights to each group and assign each user to a specific group if the need arises.
Logically, this is inheritance (aka. category, subclassing, subtype, generalization hierarchy etc.).
Physically, inheritance can be implemented in three ways, as discussed in many other places on SO.
In this particular case, the strategy with all types in the same table seems most appropriate [1], since the hierarchy is simple and unlikely to gain new subclasses, subclasses differ by only a few fields and you need to maintain the parent-level key (i.e. unconfirmed and confirmed user should not have overlapping keys).
[1] I.e. the "second way" mentioned in your question. Whether to also put the confirmation data in the same table depends on the needed cardinality - i.e. is there a 1:N relationship there?
The best way to do this is to have a table for the users with a Status ID as a foreign key. The Status table would hold all the different types of confirmation, in all the combinations that you could have. This is the best way, in my opinion, to structure the database for normalization and for your programming needs.
So your Status table would look like this:
StatusID | Description
=============================================
1 | confirmed
2 | unconfirmed
3 | CC confirmed
4 | CC unconfirmed
5 | acct confirmed CC unconfirmed
6 | all confirmed
user table
userID | StatusID
=================
456 | 1
457 | 2
458 | 2
459 | 1
If you need the confirmation code, you can store it in the user table and program it to change after it is used, so that you can reuse the same field if the user needs to reset a password or whatever.
Maybe I am assuming too much?

How can you represent inheritance in a database?

I'm thinking about how to represent a complex structure in a SQL Server database.
Consider an application that needs to store details of a family of objects, which share some attributes, but have many others not common. For example, a commercial insurance package may include liability, motor, property and indemnity cover within the same policy record.
It is trivial to implement this in C#, etc, as you can create a Policy with a collection of Sections, where Section is inherited as required for the various types of cover. However, relational databases don't seem to allow this easily.
I can see that there are two main choices:
Create a Policy table, then a Sections table, with all the fields required, for all possible variations, most of which would be null.
Create a Policy table and numerous Section tables, one for each kind of cover.
Both of these alternatives seem unsatisfactory, especially as it is necessary to write queries across all Sections, which would involve numerous joins, or numerous null-checks.
What is the best practice for this scenario?
@Bill Karwin describes three inheritance models in his SQL Antipatterns book, when proposing solutions to the SQL Entity-Attribute-Value antipattern. This is a brief overview:
Single Table Inheritance (aka Table Per Hierarchy Inheritance):
Using a single table as in your first option is probably the simplest design. As you mentioned, many attributes that are subtype-specific will have to be given a NULL value on rows where these attributes do not apply. With this model, you would have one policies table, which would look something like this:
+------+---------------------+----------+----------------+------------------+
| id   | date_issued         | type     | vehicle_reg_no | property_address |
+------+---------------------+----------+----------------+------------------+
|    1 | 2010-08-20 12:00:00 | MOTOR    | 01-A-04004     | NULL             |
|    2 | 2010-08-20 13:00:00 | MOTOR    | 02-B-01010     | NULL             |
|    3 | 2010-08-20 14:00:00 | PROPERTY | NULL           | Oxford Street    |
|    4 | 2010-08-20 15:00:00 | MOTOR    | 03-C-02020     | NULL             |
+------+---------------------+----------+----------------+------------------+
\------ COMMON FIELDS -------/          \----- SUBTYPE SPECIFIC FIELDS -----/
Keeping the design simple is a plus, but the main problems with this approach are the following:
When it comes to adding new subtypes, you would have to alter the table to accommodate the attributes that describe these new objects. This can quickly become problematic when you have many subtypes, or if you plan to add subtypes on a regular basis.
The database will not be able to enforce which attributes apply and which don't, since there is no metadata to define which attributes belong to which subtypes.
You also cannot enforce NOT NULL on attributes of a subtype that should be mandatory. You would have to handle this in your application, which in general is not ideal.
Concrete Table Inheritance:
Another approach to tackle inheritance is to create a new table for each subtype, repeating all the common attributes in each table. For example:
--// Table: policies_motor
+------+---------------------+----------------+
| id   | date_issued         | vehicle_reg_no |
+------+---------------------+----------------+
|    1 | 2010-08-20 12:00:00 | 01-A-04004     |
|    2 | 2010-08-20 13:00:00 | 02-B-01010     |
|    3 | 2010-08-20 15:00:00 | 03-C-02020     |
+------+---------------------+----------------+
--// Table: policies_property
+------+---------------------+------------------+
| id   | date_issued         | property_address |
+------+---------------------+------------------+
|    1 | 2010-08-20 14:00:00 | Oxford Street    |
+------+---------------------+------------------+
This design will basically solve the problems identified for the single table method:
Mandatory attributes can now be enforced with NOT NULL.
Adding a new subtype requires adding a new table instead of adding columns to an existing one.
There is also no risk that an inappropriate attribute is set for a particular subtype, such as the vehicle_reg_no field for a property policy.
There is no need for the type attribute as in the single table method. The type is now defined by the metadata: the table name.
However this model also comes with a few disadvantages:
The common attributes are mixed with the subtype specific attributes, and there is no easy way to identify them. The database will not know either.
When defining the tables, you would have to repeat the common attributes for each subtype table. That's definitely not DRY.
Searching for all the policies regardless of the subtype becomes difficult, and would require a bunch of UNIONs.
This is how you would have to query all the policies regardless of the type:
SELECT date_issued, other_common_fields, 'MOTOR' AS type
FROM policies_motor
UNION ALL
SELECT date_issued, other_common_fields, 'PROPERTY' AS type
FROM policies_property;
Note how adding new subtypes would require the above query to be modified with an additional UNION ALL for each subtype. This can easily lead to bugs in your application if this operation is forgotten.
Class Table Inheritance (aka Table Per Type Inheritance):
This is the solution that @David mentions in the other answer. You create a single table for your base class, which includes all the common attributes. Then you would create specific tables for each subtype, whose primary key also serves as a foreign key to the base table. Example:
CREATE TABLE policies (
    policy_id int PRIMARY KEY,
    date_issued datetime
    -- other common attributes ...
);
CREATE TABLE policy_motor (
    -- the subtype's primary key doubles as the foreign key to the base table
    policy_id int PRIMARY KEY,
    vehicle_reg_no varchar(20),
    -- other attributes specific to motor insurance ...
    FOREIGN KEY (policy_id) REFERENCES policies (policy_id)
);
CREATE TABLE policy_property (
    policy_id int PRIMARY KEY,
    property_address varchar(20),
    -- other attributes specific to property insurance ...
    FOREIGN KEY (policy_id) REFERENCES policies (policy_id)
);
This solution solves the problems identified in the other two designs:
Mandatory attributes can be enforced with NOT NULL.
Adding a new subtype requires adding a new table instead of adding columns to an existing one.
No risk that an inappropriate attribute is set for a particular subtype.
No need for the type attribute.
Now the common attributes are not mixed with the subtype specific attributes anymore.
We can stay DRY, finally. There is no need to repeat the common attributes for each subtype table when creating the tables.
Managing an auto incrementing id for the policies becomes easier, because this can be handled by the base table, instead of each subtype table generating them independently.
Searching for all the policies regardless of the subtype now becomes very easy: No UNIONs needed - just a SELECT * FROM policies.
I consider the class table approach as the most suitable in most situations.
The names of these three models come from Martin Fowler's book Patterns of Enterprise Application Architecture.
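As a rough illustration of how the class table layout is queried, here is a sketch that runs against an in-memory SQLite database through Python's built-in sqlite3 module, rather than SQL Server. The column types, sample rows, and the orphan_rejected helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE policies (
    policy_id   INTEGER PRIMARY KEY,
    date_issued TEXT NOT NULL
);
CREATE TABLE policy_motor (
    policy_id      INTEGER PRIMARY KEY REFERENCES policies(policy_id),
    vehicle_reg_no TEXT NOT NULL          -- mandatory for this subtype
);
CREATE TABLE policy_property (
    policy_id        INTEGER PRIMARY KEY REFERENCES policies(policy_id),
    property_address TEXT NOT NULL
);
INSERT INTO policies VALUES
    (1, '2010-08-20 12:00:00'),
    (2, '2010-08-20 13:00:00'),
    (3, '2010-08-20 14:00:00');
INSERT INTO policy_motor    VALUES (1, '01-A-04004'), (2, '02-B-01010');
INSERT INTO policy_property VALUES (3, 'Oxford Street');
""")

# All policies regardless of subtype: no UNION, just the base table.
all_ids = [r[0] for r in conn.execute(
    "SELECT policy_id FROM policies ORDER BY policy_id")]

# Subtype details: join the base table to the one subtype table you need.
motor = conn.execute(
    """
    SELECT p.policy_id, p.date_issued, m.vehicle_reg_no
    FROM policies p
    JOIN policy_motor m ON m.policy_id = p.policy_id
    ORDER BY p.policy_id
    """
).fetchall()


def orphan_rejected(conn):
    """A subtype row without a base row should be refused by the foreign key."""
    try:
        conn.execute("INSERT INTO policy_motor VALUES (99, 'XX-00000')")
        return False
    except sqlite3.IntegrityError:
        return True
```

Note that this sketch does not stop the same policy_id from appearing in both subtype tables; if subtypes must be mutually exclusive, that needs an extra constraint or a type column on the base table.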
The 3rd option is to create a "Policy" table, then a "SectionsMain" table that stores all of the fields that are in common across the types of sections. Then create other tables for each type of section that only contain the fields that are not in common.
Deciding which is best depends mostly on how many fields you have and how you want to write your SQL. They would all work. If you have just a few fields then I would probably go with #1. With "lots" of fields I would lean towards #2 or #3.
In addition to Daniel Vassallo's solution, if you use SQL Server 2016+, there is another approach that I have used in some cases without a considerable loss of performance.
You can create a single table with only the common fields and add one column holding a JSON string that contains all the subtype-specific fields.
I have tested this design for managing inheritance and I am very happy with the flexibility it gives the application.
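The idea can be sketched as follows. This is only an approximation of the approach: it uses SQLite through Python's built-in sqlite3 module and parses the JSON in application code with the json module, whereas on SQL Server you would more likely rely on the server's own JSON support. The details column name and both helper functions are assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE policies (
    policy_id   INTEGER PRIMARY KEY,
    date_issued TEXT NOT NULL,       -- common fields are real columns
    type        TEXT NOT NULL,
    details     TEXT NOT NULL        -- subtype-specific fields as a JSON string
)
""")


def add_policy(conn, date_issued, policy_type, details):
    """Store the common fields in columns and the rest as one JSON document."""
    cur = conn.execute(
        "INSERT INTO policies (date_issued, type, details) VALUES (?, ?, ?)",
        (date_issued, policy_type, json.dumps(details)),
    )
    return cur.lastrowid


def get_policy(conn, policy_id):
    """Load one policy and merge the JSON fields back into a flat dict."""
    row = conn.execute(
        "SELECT policy_id, date_issued, type, details FROM policies WHERE policy_id = ?",
        (policy_id,),
    ).fetchone()
    if row is None:
        return None
    policy = {"policy_id": row[0], "date_issued": row[1], "type": row[2]}
    policy.update(json.loads(row[3]))
    return policy


motor_id = add_policy(conn, "2010-08-20 12:00:00", "MOTOR",
                      {"vehicle_reg_no": "01-A-04004"})
property_id = add_policy(conn, "2010-08-20 14:00:00", "PROPERTY",
                         {"property_address": "Oxford Street"})
```

The flexibility comes at a price: the database no longer checks which fields a subtype must have, so that validation moves into the application.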
With the information provided, I'd model the database to have the following:
POLICIES
POLICY_ID (primary key)
LIABILITIES
LIABILITY_ID (primary key)
POLICY_ID (foreign key)
PROPERTIES
PROPERTY_ID (primary key)
POLICY_ID (foreign key)
...and so on, because I'd expect there to be different attributes associated with each section of the policy. Otherwise, there could be a single SECTIONS table and in addition to the policy_id, there'd be a section_type_code...
Either way, this would allow you to support optional sections per policy...
I don't understand what you find unsatisfactory about this approach - this is how you store data while maintaining referential integrity and not duplicating data. The term is "normalized"...
Because SQL is SET based, it's rather alien to procedural/OO programming concepts & requires code to transition from one realm to the other. ORMs are often considered, but they don't work well in high volume, complex systems.
Another way to do it is to use the INHERITS clause, which is specific to PostgreSQL. For example:
CREATE TABLE person (
id int ,
name varchar(20),
CONSTRAINT pessoa_pkey PRIMARY KEY (id)
);
CREATE TABLE natural_person (
social_security_number varchar(11),
CONSTRAINT pessoaf_pkey PRIMARY KEY (id)
) INHERITS (person);
CREATE TABLE juridical_person (
tin_number varchar(14),
CONSTRAINT pessoaj_pkey PRIMARY KEY (id)
) INHERITS (person);
Thus it's possible to define an inheritance relationship between tables.
Alternatively, consider using a document database (such as MongoDB), which natively supports rich data structures and nesting.
I lean towards method #1 (a unified Section table), for the sake of efficiently retrieving entire policies with all their sections (which I assume your system will be doing a lot).
Further, I don't know what version of SQL Server you're using, but in 2008+ Sparse Columns help optimize performance in situations where many of the values in a column will be NULL.
Ultimately, you'll have to decide just how "similar" the policy sections are. Unless they differ substantially, I think a more-normalized solution might be more trouble than it's worth... but only you can make that call. :)
