Database Architecture: Multiple Type Relations

I'm building a Laravel application that has "listings". These listings can be things like boats, planes, and automobiles; each with their own specific fields.
I will also have an images table that should relate to each type of listing and a users table that needs to map to each type of listing. I'm trying to determine the best way to map each listing type back to images and users.
One way I've thought of doing this was having separate boats, planes, and automobiles tables with their specific fields and then having specific boat_images, plane_images, and automobile_images tables to map to each respective type. But then relating each type to a user would be a bit tricky.
I don't think one giant listings table with all the fields I'd ever use across these three types (which could grow in number later) would make sense, and I also don't believe a general metadata field holding a JSON object full of specifications for each listing would work well either, since I want a searchable database.
I know of pivot tables, but I'm trying to grasp the overall architecture here. Any help would be greatly appreciated. Thanks!

You could have a listings table holding only id and name. Boats, planes, automobiles and any other types would each be a subset table.
Each table will have its own entity, and the Listing entity will have multiple hasMany relationships to its subset tables, named boats(), planes(), etc. Each subset listing entity will hold a single belongsTo relationship back to Listing.
Using these subset tables should also help to compartmentalize form validation.
You can have a single images table and use a polymorphic relationship towards the listings table. This one is a huge lifesaver.
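For illustration, a minimal Eloquent sketch of that structure (class, relationship and column names here are illustrative assumptions, not taken from the question):

```php
<?php

use Illuminate\Database\Eloquent\Model;

// listings table: id, name (plus a user_id if each listing belongs to a user).
class Listing extends Model
{
    public function boats()  { return $this->hasMany(Boat::class); }
    public function planes() { return $this->hasMany(Plane::class); }

    // One shared images table, attached through a polymorphic relation.
    public function images() { return $this->morphMany(Image::class, 'imageable'); }

    // Assumption: the owning user hangs off the parent listing row.
    public function user()   { return $this->belongsTo(User::class); }
}

// Each subset table (boats, planes, automobiles) keeps its own fields
// plus a listing_id foreign key back to the parent row.
class Boat extends Model
{
    public function listing() { return $this->belongsTo(Listing::class); }
}

// The images table needs imageable_id and imageable_type columns for morphTo().
class Image extends Model
{
    public function imageable() { return $this->morphTo(); }
}
```

Fetching a listing's pictures then becomes $listing->images regardless of whether the listing is a boat, plane or automobile.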

Related

A Master Category Table Where Records Have Various Categories OR There Should Be A Table For Each Category Type

Recently I encountered an application where a master table is maintained which contains the data of more than 20 categories. For example, it has categories named Country, State and City.
So my question is: is it better to move each category out into a separate table and fetch the data through joins, or should everything stay inside a single table?
P.S. In the future the category count might increase to 50+ or more.
P.S. The application is based on EF6 + SQL Server.
Edited Version
I just want to know, in the above scenario, what the best approach is: should one go with a single table with proper indexing, or follow the DB normalization approach, putting each category into a separate table and maintaining relationships through foreign keys?
Normally, categories are put into separate tables. This conforms more closely with normalized database structures and the definition of entities. In particular, it allows for proper foreign key relationships to be defined. That is a big win for data integrity.
Sometimes categories are put into a single table. This can, of course, be confusing; consider, for instance, "Florida, Massachusetts" or "Washington, Iowa" (these are real places).
Putting categories in one table has one major advantage: all the text is in a single location. That can be very handy for internationalization efforts. To be honest, that is the situation where I have seen this used.
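For what it's worth, the normalized shape looks something like this, sketched as a Laravel-style migration purely for consistency with the main question (the asker's stack is EF6 + SQL Server, and the table names are illustrative):

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

// Each category becomes its own lookup table, with real foreign keys
// between them instead of rows in one master category table.
return new class extends Migration {
    public function up(): void
    {
        Schema::create('countries', function (Blueprint $table) {
            $table->id();
            $table->string('name')->unique();
        });

        Schema::create('states', function (Blueprint $table) {
            $table->id();
            $table->string('name');
            $table->foreignId('country_id')->constrained(); // FK -> countries.id
        });

        Schema::create('cities', function (Blueprint $table) {
            $table->id();
            $table->string('name');
            $table->foreignId('state_id')->constrained();   // FK -> states.id
        });
    }
};
```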

Designing a database with similar, but different Models

I have a system whereby you can create documents. You select the document type to create and a form is displayed. Data is then added to the form, and the document can be generated. In Laravel things are done via Models. I am creating a new Model for each document but I don't think this is the best way. An example of my database:
So at the heart of it are projects. I create a new project; I can now create documents for this project. When I select project brief from a select box, a form is displayed whereby I can input:
Project roles
Project Data
Deliverables
Budget
It's three text fields and a standard input field. If I select reporting doc from the select menu, I have to input the data for this document (which is a couple of normal inputs, a couple of text fields, and a date). Although they are both documents, they expect different data (which is why I have created a Model for each document).
The problems: As seen in the diagram, I want to allow supporting documents to be uploaded alongside a document which is generated. I have a doc_upload table for this. So a document can have one or more doc_uploads.
Going back to the MVC structure, in my DocUpload model I can't say that DocUpload belongs to both ProjectBriefDoc and ProjectReportingDoc because it can only belong to one Model. So not only am I going to create a new model for every single document, I will have to create a new Upload model for each document as well. As more documents are added, I can see this becoming a nightmare to manage.
I am after a more generic Model which can handle different types of documents. My question relates to the different types of data I need to capture for each document, and how I can fit this into my design.
I have a design that can work, but I think it is a bad idea. I am looking for advice to improve this design, taking into account that each document requires different input, and each document will need to allow for file uploads.
You don't need to have a table/Model for each document type you'll create.
A more flexible approach would be to have a project_documents table, where you'd have a project_id and whatever data relates to the document, and then a doc_uploads table related to the project_documents table.
This way a project can have as many documents as your business will ever need, and each document can have as many files as it needs.
You could try something like this:
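For instance, a rough migration-style sketch of those two tables (column names are illustrative; the per-type form fields are shown here as a single JSON column, but they could equally be separate columns):

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration {
    public function up(): void
    {
        // One generic documents table per project, instead of one table per document type.
        Schema::create('project_documents', function (Blueprint $table) {
            $table->id();
            $table->foreignId('project_id')->constrained();
            $table->string('type');            // e.g. 'brief', 'reporting'
            $table->json('data')->nullable();  // the per-type form fields
            $table->timestamps();
        });

        // Uploads hang off the document, however many there are.
        Schema::create('doc_uploads', function (Blueprint $table) {
            $table->id();
            $table->foreignId('project_document_id')->constrained();
            $table->string('path');
            $table->timestamps();
        });
    }
};
```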
If you still want to keep both tables, your doc_upload table in your example can have two foreign keys and two belongsTo() Laravel Model declarations without conflicts (it's not a marriage, it's an open relationship).
Or you could use Polymorphic Relations to do the same thing, but it's a database design anti-pattern (because it won't ensure data integrity at the database level).
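A sketch of both options on the DocUpload model (the document class names come from the question; everything else is illustrative):

```php
<?php

use Illuminate\Database\Eloquent\Model;

class DocUpload extends Model
{
    // Option 1: keep both document tables and give doc_uploads two nullable
    // foreign keys, one per table, each with its own belongsTo().
    public function projectBriefDoc()
    {
        return $this->belongsTo(ProjectBriefDoc::class);
    }

    public function projectReportingDoc()
    {
        return $this->belongsTo(ProjectReportingDoc::class);
    }

    // Option 2 (polymorphic): replace the two relations above with a single
    // morphTo(), backed by documentable_id / documentable_type columns.
    // public function documentable() { return $this->morphTo(); }
}
```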
For a good reference about Database Design, google for "Bill Karwin" and "SQL Antipatterns".
This guy has a very good Slideshare presentation and a book written about this topic - he used to be an active SO user as well.
OK, I have a suggestion: you don't have to have such tight coupling on the doc_upload references. You can treat this as a standalone table in your model that is not pegged to a single entity. You can still use the ORM to CRUD your way through and manage this table.
What I would do is keep the doc_upload table, use it for all upload references for all documents no matter what table/model the document resides in, and have the following fields in the doc_upload table:
documenttype (which can be the object name of the target document object)
documentid_fk (this is now the generic key to a single row in the appropriate document type table(s))
So given a document in a given table (you can derive the documenttype based on the model object), and knowing the id of the document itself because you just pulled it from the db context, you should be able to pull all related rows in the doc_upload table that match those two values.
You may be able to use reflection in your model to know what entity (doc type) you are in, and the key is just the key.
You will still have to create a new model entity for each flavor of project document you wish to have, but that may not be too difficult if the rate of change is small.
You should be able to write a minimal amount of code to pull all related uploaded documents into your app.
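For example, a rough Laravel sketch of that lookup (the column names follow the fields suggested above; the model and variable names are illustrative):

```php
<?php

use Illuminate\Database\Eloquent\Model;

class DocUpload extends Model
{
    protected $table = 'doc_upload'; // one table for every document's uploads
}

// Given any document model instance ($doc), derive the type from its class
// and pull every upload row that points at that document:
$uploads = DocUpload::where('documenttype', get_class($doc))
    ->where('documentid_fk', $doc->getKey())
    ->get();
```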
You may use inheritance by a zero-or-one relation in the data model design.
IMO, having an abstract entity (table) called project-document containing the shared properties of all documents will serve you well.
project-brief, project-report and other types of documents will be children of the project-document table, having a zero-or-one relation; the primary key of project-document will be both the foreign key and the primary key of the children.
Now having a one-to-many relation between project-document and doc-upload will solve the problem.
I also suggest adding a unique constraint {project_id, doc_type} inside project-document as a cardinality check (if necessary).
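A minimal migration sketch of that shape (table and column names are illustrative): the child's primary key doubles as the foreign key to the parent, and doc-upload points only at the parent.

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration {
    public function up(): void
    {
        // Abstract parent: shared properties of every document.
        Schema::create('project_documents', function (Blueprint $table) {
            $table->id();
            $table->foreignId('project_id')->constrained();
            $table->string('doc_type');
            $table->unique(['project_id', 'doc_type']); // cardinality check
        });

        // Child (zero-or-one): its PK is also the FK to the parent row.
        Schema::create('project_briefs', function (Blueprint $table) {
            $table->foreignId('project_document_id')->constrained('project_documents');
            $table->primary('project_document_id');
            $table->text('deliverables')->nullable();
            $table->text('budget')->nullable();
        });

        // One-to-many: uploads reference the parent, not the concrete child.
        Schema::create('doc_uploads', function (Blueprint $table) {
            $table->id();
            $table->foreignId('project_document_id')->constrained();
            $table->string('path');
        });
    }
};
```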
As other answers are sort of alluding to, you probably don't want to have a different Model for different documents, but rather a single Model for "document" with different views on it for your different processes. Laravel seems to have a good "templating" system for implementing views:
http://laravel.com/docs/5.1/blade
http://daylerees.com/codebright-blade/

Database design for a product aggregator

I'm trying to design a database for a product aggregator. Each product has information about where it comes from, what it costs, what type of thing it is, price, color, etc. Users need to be able to search and filter results based on any of those product categories. I also expect to have a large number of users. My initial thought was having one big table with every product in it, with a column for each piece of information and an index on anything I need to be able to search by, but I think this might be inefficient with a lot of users pounding on this one table. My other thought was to organize the database to promote a tree-like navigation of tables, but because you can search by anything I'm not sure how I would organize the tables.
Any thoughts on some good practices?
One table of products - databases are designed to have lots of users pounding on tables.
(from the comments)
You need to model your data. This comes from looking at all the data you have, determining what is related to what (a table is called a relation because all the attributes in a row are related to a candidate key). You haven't really given enough information about the scope of what data (unstructured?) you have on these products and how it varies. Are you going to have difficulties because Shoes have brand, model, size and color, but Desks only have brand, model and finish? All this is going to inform your data model. Typically you have one products table, and other things link to it.
Some of those attributes will be foreign keys to lookup tables, others (price) would be simple scalars. With appropriate indexing you'll be fine. For advanced analytics, consider a dimensionally modeled star schema, but perhaps not for your live transaction system; it depends on what your data flow/workflow/transactions are. Or consider adopting some of its principles in your transactional database. Ralph Kimball is a good source of information on dimensional modeling.
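As a rough sketch of that "one products table plus lookups" shape (written as a Laravel-style migration only for consistency with the rest of the page; table names, columns and index choices are illustrative):

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration {
    public function up(): void
    {
        // Small lookup table; categorical attributes become foreign keys to these.
        Schema::create('brands', function (Blueprint $table) {
            $table->id();
            $table->string('name')->unique();
        });

        // One products table; scalars stay inline, searchable columns get indexes.
        Schema::create('products', function (Blueprint $table) {
            $table->id();
            $table->string('name');
            $table->foreignId('brand_id')->constrained();
            $table->string('color')->index();
            $table->decimal('price', 10, 2)->index();
        });
    }
};
```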
I don't see any need for the tree structure here. You can do this with a single table.
If you insist on a tree structure with a hierarchy, here is an example to get you started.
For text based search, and ease of startup & design, I strongly recommend Apache SOLR. The SOLR API is easy to use (especially JSON). Databases do text search poorly, and I would instead recommend that you just make sure that they respond to primary/unique key queries properly, and those are the fields you should index.
One table for the products, and another table for the product category hierarchy (you don't specifically say you have this but "tree-like navigation of tables" makes me think you might).
I can see you might be concerned about over-indexing causing problems if you plan to index almost every column. In that case, it might be best to index on the top 5 or 10 columns you think users are likely to search for, unless it's possible for a user to search on ANY column. In that case you might want to look at building a data warehouse. Maybe you'll want to look into data cubes to see if those will help...?
For hierarchical data, you need a PRODUCT_CATEGORY table looking something like this:
ID
PARENT_ID
NAME
Some sample data:
| ID | PARENT_ID | NAME             |
| 1  | (NULL)    | ROOT             |
| 2  | 1         | SOCKS            |
| 3  | 1         | HELICOPTER PARTS |
| 4  | 2         | ARGYLE           |
Some SQL engines (such as Oracle) allow you to write recursive queries to traverse the hierarchy in a single query. In this example, the root of the tree has a PARENT_ID of NULL, but if you don't want this column to be nullable, I've also seen -1 used for the same purposes.
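A sketch of such a recursive query, written as an ANSI recursive CTE (MySQL 8+ and PostgreSQL require the RECURSIVE keyword; SQL Server and Oracle accept plain WITH, and Oracle also offers CONNECT BY). It is wrapped in Laravel's raw-query helper only for consistency with the rest of the page:

```php
<?php

use Illuminate\Support\Facades\DB;

// Walk the PRODUCT_CATEGORY tree downwards from one starting node.
$rootId = 1; // e.g. ROOT

$rows = DB::select(
    'WITH RECURSIVE category_tree AS (
         SELECT id, parent_id, name
         FROM product_category
         WHERE id = ?
         UNION ALL
         SELECT c.id, c.parent_id, c.name
         FROM product_category c
         JOIN category_tree t ON c.parent_id = t.id
     )
     SELECT * FROM category_tree',
    [$rootId]
);
```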

Database design rules to follow for a programmer

We are working on a mapping application that uses Google Maps API to display points on a map. All points are currently fetched from a MySQL database (holding some 5M+ records). Currently all entities are stored in separate tables with attributes representing individual properties.
This presents following problems:
1. Every time there's a new property we have to make changes in the database, application code and the front-end. This is all fine, but some properties have to be added for all entities, and that's when it becomes a nightmare to go through 50+ different tables and add new properties.
2. There's no way to find all entities which share any given property, e.g. no way to find all schools/colleges or universities that have a geography dept (without querying schools, unis and colleges separately).
3. Removing a property is equally painful.
4. No standards for defining properties in individual tables. The same property can exist with a different name or data type in another table.
5. No way to link or group points based on their properties (somehow related to point 2).
We are thinking to redesign the whole database but without DBA's help and lack of professional DB design experience we are really struggling.
Another problem we're facing with the new design is that there are lot of shared attributes/properties between entities.
For example:
An entity called "university" has 100+ attributes. Other entities (e.g. hospitals, banks, etc.) share quite a few attributes with universities, for example ATM machines, parking, cafeteria, etc.
We don't really want to have properties in a separate table [and then link them back to entities with foreign keys] as it will require us to add/remove them manually. Also, generalizing properties will result in groups containing 50+ attributes. Not all records (i.e. entities) require those properties.
So with keeping that in mind here's what we are thinking about the new design:
Have separate tables for each entity containing some basic info e.g. id,name,etc etc.
Have 2 tables attribute type and attribute to store properties information.
Link each entity (or a table if you like) to attribute using a many-to-many relation.
Store addresses in a different table called addresses and link entities via foreign keys.
We think this will allow us to be more flexible when adding, removing or querying on attributes.
This design, however, will result in an increased number of joins when fetching data, e.g. to display all "attributes" for a given university we might have a query with 20+ joins to fetch all related attributes in a single row.
We desperately need to know some opinions or possible flaws in this design approach.
Thanks for your time.
In trying to generalize your question without more specific examples, it's hard to truly critique your approach. If you'd like some more in depth analysis, try whipping up an ER diagram.
If your data model is changing so much that you're constantly adding/removing properties and many of these properties overlap, you might be better off using EAV.
Otherwise, if you want to maintain a relational approach but are finding a lot of overlap with properties, you can analyze the entities and look for abstractions that link to them.
Ex) My Db has Puppies, Kittens, and Walruses all with a hasFur and furColor attribute. Remove those attributes from the 3 tables and create a FurryAnimal table that links to each of those 3.
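One way to read that linkage, as a hedged migration sketch (table and column names are illustrative; here the concrete animal rows point at the shared furry_animals row):

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration {
    public function up(): void
    {
        // Shared attributes pulled out of puppies/kittens/walruses.
        Schema::create('furry_animals', function (Blueprint $table) {
            $table->id();
            $table->boolean('has_fur');
            $table->string('fur_color')->nullable();
        });

        Schema::create('puppies', function (Blueprint $table) {
            $table->id();
            $table->string('name');
            $table->foreignId('furry_animal_id')->constrained();
        });
        // kittens and walruses would follow the same pattern.
    }
};
```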
Of course, the simplest answer is to not touch the data model. Instead, create Views on the underlying tables that you can use to address (5), (4) and (2).
(1) cannot be an issue. There is one place where your objects are defined. Everything else is generated/derived from that. Just refactor your code until this is the case.
(2) is solved by having a metamodel, where you describe which properties are where. This is probably needed for (1) too.
You might want to totally avoid the problem by programming this in Smalltalk with Seaside on a GemStone object-oriented database. Then you can just have objects with collections and don't need so many joins.

DB Design Pattern - Many to many classification / categorised tagging

I have an existing database design that stores Job Vacancies.
The "Vacancy" table has a number of fixed fields across all clients, such as "Title", "Description", "Salary range".
There is an EAV design for "Custom" fields that the Clients can setup themselves, such as, "Manager Name", "Working Hours". The field names are stored in a "ClientText" table and the data stored in a "VacancyClientText" table with VacancyId, ClientTextId and Value.
Lastly there is a many to many EAV design for custom tagging / categorising the vacancies with things such as Locations/Offices the vacancy is in, a list of skills required. This is stored as a "ClientCategory" table listing the types of tag, "Locations, Skills", a "ClientCategoryItem" table listing the valid values for each Category, e.g., "London,Paris,New York,Rome", "C#,VB,PHP,Python". Finally there is a "VacancyClientCategoryItem" table with VacancyId and ClientCategoryItemId for each of the selected items for the vacancy.
There are no limits to the number of custom fields or custom categories that the client can add.
I am now designing a new system that is very similar to the existing system, however, I have the ability to restrict the number of custom fields a Client can have and it's being built from scratch so I have no legacy issues to deal with.
For the Custom Fields my solution is simple, I have 5 additional columns on the Vacancy Table called CustomField1-5. This removes one of the EAV designs.
It is with the tagging / categorising design that I am struggling. If I limit a client to having 5 categories / types of tag, should I create 5 tables listing the possible values ("CustomCategoryItems1-5") and then an additional 5 many-to-many tables ("VacancyCustomCategoryItem1-5")?
This would result in 10 tables performing the same storage as the three tables in the existing system.
Also, should (heaven forbid) the requirements change in that I need 6 custom categories rather than 5 then this will result in a lot of code change.
Therefore, can anyone suggest any DB design patterns that would be more suitable for storing such data? I'm happy to stick with the EAV approach; however, the existing system has come across all the usual performance issues and complex queries associated with such a design.
Any advice / suggestions are much appreciated.
The DBMS system used is SQL Server 2005, however, 2008 is an option if required for any particular pattern.
Have you thought about using an XML column? You can enforce all your constraints declaratively through XSD.
Instead of EAV, have a single column with XML data validated by a schema (or a collection of schemas).
Take a look at this question/answer; it describes the observation pattern.
It uses five tables and can be implemented in a "standard" RDBMS -- Sql Server 2005 will do.
No limit on number of custom properties (observations) that an entity can have.
EDIT
If tags (categories) are needed for properties, take a look at this one.
Why not store the custom fields in a key-value table?
| vacancy ID | CustomFieldType | CustomFieldValue |
Then have auxiliary tables listing possible values per type (1 table) and maybe possible types per vacancy type (this seems to be the original ClientCategory).
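A rough sketch of that layout (written as a Laravel-style migration only for consistency with the rest of the page; the new system is on SQL Server, and all names here are illustrative):

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration {
    public function up(): void
    {
        // The field types a client can define (roughly the old ClientText/ClientCategory).
        Schema::create('custom_field_types', function (Blueprint $table) {
            $table->id();
            $table->string('name'); // e.g. "Manager Name", "Locations"
        });

        // Allowed values per type, for the tag/category style fields.
        Schema::create('custom_field_allowed_values', function (Blueprint $table) {
            $table->id();
            $table->foreignId('custom_field_type_id')->constrained();
            $table->string('value'); // e.g. "London", "C#"
        });

        // One row per custom field value on a vacancy (the key-value table).
        Schema::create('vacancy_custom_fields', function (Blueprint $table) {
            $table->id();
            $table->foreignId('vacancy_id')->constrained();
            $table->foreignId('custom_field_type_id')->constrained();
            $table->string('value');
        });
    }
};
```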
