Dimensional modelling few questions - data-modeling

I am getting familiar with dimensional modelling, so I started looking at a health claims process. I am trying to achieve the following:
1) ability to report claims by patient by speciality and service provider (monthly, quarterly and yearly)
2) claims by referring provider by service provider
3) claims by monthly payments received for (1) and (2)
4) claims by month of services for (1) and (2)
Here is the dimensional model:
FactClaims
Charge Amount
Payment Amount
Service Date Key (FK)
Payment Date Key (FK)
Patient Key (FK)
Service Provider Key (FK)
Facility Key (FK)
Referred Provider Key (FK)
Dimension Tables:
DimServiceProvider
ServiceProviderID (SK)
Service Provider Name
Speciality
DimPatient
PatientID (SK)
Name
Address
DimDate
DimFacility
FacilityID (SK, PK)
FacilityName
FacilityRegion
FacilityState
Questions:
1) Should I have separate fact tables for Charges and Payments?
2) I am not sure whether I am thinking correctly about Referred Provider Key (which also points to DimServiceProvider).
3) Is there a rule of thumb for when to combine dimension tables and when to keep them separate?

Whether to separate payments and charges depends on what kind of reports you are going to run. Also, did you consider payments/charges to the insurance, to secondary insurance, if applicable, and to the patient/person responsible for the patient?
If you keep Referred Provider Key, you should provide a special value for self-referred patients.
There are no dimensions in your model that might be considered for consolidation.
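As a rough illustration of the Referred Provider Key point, the sketch below joins DimServiceProvider to the fact table twice, once per role, and uses a placeholder row for self-referred patients. It runs on SQLite through Python purely for convenience; the key value 0, the sample names, and the ServiceMonth text column (standing in for a DimDate lookup) are assumptions made for this example, not part of the original model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimServiceProvider (
    ServiceProviderKey INTEGER PRIMARY KEY,
    ServiceProviderName TEXT NOT NULL,
    Speciality TEXT NOT NULL
);
CREATE TABLE FactClaims (
    ChargeAmount REAL NOT NULL,
    PaymentAmount REAL NOT NULL,
    ServiceMonth TEXT NOT NULL,  -- assumption: stand-in for a DimDate key
    ServiceProviderKey INTEGER NOT NULL
        REFERENCES DimServiceProvider(ServiceProviderKey),
    ReferredProviderKey INTEGER NOT NULL
        REFERENCES DimServiceProvider(ServiceProviderKey)
);
-- Key 0 is a hypothetical placeholder row for self-referred patients.
INSERT INTO DimServiceProvider VALUES
    (0, 'Self-referred', 'N/A'),
    (1, 'Dr. Adams', 'Cardiology'),
    (2, 'Dr. Baker', 'General Practice');
INSERT INTO FactClaims VALUES
    (200.0, 150.0, '2024-01', 1, 2),
    (100.0, 100.0, '2024-01', 1, 0),
    (300.0,   0.0, '2024-02', 1, 2);
""")

# The same dimension plays two roles: service provider and referring provider.
rows = conn.execute("""
    SELECT ref.ServiceProviderName, svc.ServiceProviderName,
           f.ServiceMonth, SUM(f.ChargeAmount)
    FROM FactClaims f
    JOIN DimServiceProvider svc
      ON svc.ServiceProviderKey = f.ServiceProviderKey
    JOIN DimServiceProvider ref
      ON ref.ServiceProviderKey = f.ReferredProviderKey
    GROUP BY ref.ServiceProviderName, svc.ServiceProviderName, f.ServiceMonth
    ORDER BY ref.ServiceProviderName, f.ServiceMonth
""").fetchall()

for row in rows:
    print(row)
```

Because the placeholder row exists, the referring-provider join never drops self-referred claims, which is the main reason to prefer it over a NULL key.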

Related

System that keeps track of inventory with a history table

I have a system that lets members rent equipment, and the system should keep a history of each item that was rented and by whom. The system should also track who has what equipment rented/checked out and should be able to sort the equipment by type, status, name, etc. Lastly, it should send out notification emails for equipment that is overdue.
I'm trying to understand the relationships and how I should model this. My current tables and thinking are something like this:
Member Table:
Id (PK)
MemberId
FirstName
LastName
Email
EquipmentItem Table:
Id (PK)
EquipmentName
EquipmentType (FK)
EquipmentStatus (FK)
TotalQuantity
RemainingQuantity
EquipmentStatus Table:
Id (PK)
StatusName
EquipmentType Table:
Id (PK)
TypeName
EquipmentRentalHistory Table:
Id (PK)
MemberId (FK)
EquipmentId (FK)
CheckOutDate
ReturnedDate
1) I want to know the relationships between these. Would the rental history be a many-to-many relationship between the Member table and the EquipmentItem table?
2) Would the EquipmentItem table have a one-to-many relationship with the status and type tables? The way I see it, EquipmentItems can have many statuses or types, but each status or each type can only belong to one EquipmentItem.
3) Does it make sense to have a quantity field in EquipmentItem? I used to work in a grocery store, so I'm basing the logic on barcodes, where the same product would usually have the same barcode, e.g. all Cheetos Puff chips would have the same barcode but would have a quantity value on it. Or would it be better to have each item unique, regardless of whether it's the same product/model?
My logic would be:
1) A member rents out an item.
2) The system logs it into the history table.
3) The system then checks how many of the same item have been checked out so far, say the item has a total quantity of 4 and 3 members have checked it out.
4) We update the remaining quantity field to the difference, so in this case to 1.
5) The system can then track who has what checked out by returning all records with a returned date of null.
6) The system will then check all records with a returned date of null and do a date range on the checked out date to determine if the equipment is overdue.
7) Send notifications to the member emails associated with the records from step 6.
I would just like some help better understanding the relationship between these and if I have modelled my tables correctly, if not, it would be great if someone can point me in the right direction of improving upon this.
To answer your questions
1) With respect to modelling in an ERD, I don't think that qualifies as a many-to-many relationship. Rather, EquipmentRentalHistory is its own entity with a many-to-one relationship to both Member and EquipmentItem: each history row points to exactly one member and one item.
A many-to-many would be more like, "a Member has access to 0...n EquipmentItems, and each EquipmentItem can be accessed by 0...n Members".
2) I would disagree with that reading of the one-to-many relationships: a status or type is shared by many EquipmentItems, not owned by a single one.
An oxygen tank and a pair of flippers can both be classified as 'Scuba Gear' and have the status 'Checked Out'.
You could have multiple 'Scuba Gear' tags and assign each unique 'Scuba Gear' tag to its very own EquipmentItem, but then you'll just be creating new tags for every new EquipmentItem, rather than reusing existing ones.
3) That really depends on whether you want to identify exactly which piece of equipment a member rented (maybe something is damaged and you want to track down everyone who rented that specific one). If you do differentiate, then every item will just be its own row. You should also add a new column as an external identifier, but there would be no need to keep a tally.
If it's all the same to you, then I would only keep the total but not the available. If you kept the available column, then you would constantly have to update it whenever something is logged in EquipmentRentalHistory, and it would be annoying if the tables fell out of sync. You could just query EquipmentRentalHistory for the Id of the equipment and count the entries where ReturnedDate IS NULL to get the number of items currently in use.
Additional Note
It might be good to have a 'due date' column in the rental history rather than hard-coding the date calculation, in case you want varying due dates. This way you can also grant extensions.
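To make the "derive it, don't store it" suggestion concrete, here is a minimal sketch using SQLite from Python. It assumes the quantity-based design (one EquipmentItem row per product), adds the suggested DueDate column, and stores dates as ISO text so that plain string comparison orders them correctly. The helper names remaining and overdue_members and the sample rows are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EquipmentItem (
    Id INTEGER PRIMARY KEY,
    EquipmentName TEXT NOT NULL,
    TotalQuantity INTEGER NOT NULL
);
CREATE TABLE EquipmentRentalHistory (
    Id INTEGER PRIMARY KEY,
    MemberId INTEGER NOT NULL,
    EquipmentId INTEGER NOT NULL REFERENCES EquipmentItem(Id),
    CheckOutDate TEXT NOT NULL,
    DueDate TEXT NOT NULL,   -- the suggested 'due date' column
    ReturnedDate TEXT        -- NULL while the item is still out
);
INSERT INTO EquipmentItem VALUES (1, 'Oxygen tank', 4);
INSERT INTO EquipmentRentalHistory
    (MemberId, EquipmentId, CheckOutDate, DueDate, ReturnedDate) VALUES
    (10, 1, '2024-03-01', '2024-03-08', NULL),
    (11, 1, '2024-03-02', '2024-03-09', '2024-03-05'),
    (12, 1, '2024-03-10', '2024-03-17', NULL);
""")

def remaining(conn, equipment_id):
    """Total quantity minus open rentals, computed instead of stored."""
    row = conn.execute("""
        SELECT e.TotalQuantity - COUNT(h.Id)
        FROM EquipmentItem e
        LEFT JOIN EquipmentRentalHistory h
               ON h.EquipmentId = e.Id AND h.ReturnedDate IS NULL
        WHERE e.Id = ?
        GROUP BY e.Id
    """, (equipment_id,)).fetchone()
    return row[0]

def overdue_members(conn, today):
    """Members holding an unreturned item whose due date has passed."""
    rows = conn.execute("""
        SELECT MemberId FROM EquipmentRentalHistory
        WHERE ReturnedDate IS NULL AND DueDate < ?
        ORDER BY MemberId
    """, (today,)).fetchall()
    return [r[0] for r in rows]

print(remaining(conn, 1))
print(overdue_members(conn, '2024-03-12'))
```

Granting an extension then becomes an UPDATE of DueDate on one history row, with no change to the overdue query.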

Generalization in Database

I need to design a database for a system where there are Customers and Vendors, both related to an entity called Users, where every user is either a customer or a vendor.
The problem is that Vendors are related to other entities that Customers aren't.
So how can I design such a database?
The other entities will store the ID of the Vendor as a foreign key. And Vendors and Customers are not going to be in the same table anyway*, so it's not like the two have IDs that might be used at the same time for that.
Also, to add, the Foreign Key you require for User could be managed as an add/edit trigger if your DB of choice allows it. That way you can make sure that the Vendor id used for those related entities isn't a User ID linked to Customers. (...WHERE userid NOT IN (SELECT userid FROM users WHERE customer = TRUE))
* Customers and Vendors have different properties/fields so shouldn't be in the same table.
You could have Vendors and Customers have a relationship to a User table.
user
===========
userId
name
vendor
===========
vendorId
companyName
userId
customer
===========
customerId
source
userId
Then you can link to both customers and vendors from the same table, yet they can still share the same common data in the user table. In fact, a customer could also be a vendor.
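The layout above can be sketched in SQLite (driven from Python here) to show how a vendor-only entity stays vendor-only. The product table is a hypothetical example of such an entity, and the UNIQUE constraint on userId is an assumption that each user has at most one vendor row and one customer row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE user (
    userId INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE vendor (
    vendorId INTEGER PRIMARY KEY,
    companyName TEXT NOT NULL,
    userId INTEGER NOT NULL UNIQUE REFERENCES user(userId)
);
CREATE TABLE customer (
    customerId INTEGER PRIMARY KEY,
    source TEXT,
    userId INTEGER NOT NULL UNIQUE REFERENCES user(userId)
);
-- Hypothetical vendor-only entity: it references vendor, never user.
CREATE TABLE product (
    productId INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    vendorId INTEGER NOT NULL REFERENCES vendor(vendorId)
);
INSERT INTO user VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO vendor VALUES (100, 'Alice Ltd', 1);
INSERT INTO customer VALUES (200, 'newsletter', 2);
INSERT INTO product VALUES (1, 'Widget', 100);
""")

# 200 is a customerId, not a vendorId, so the foreign key rejects it.
try:
    conn.execute("INSERT INTO product VALUES (2, 'Gadget', 200)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

With this shape the foreign key alone keeps customers out of vendor-only tables, so no trigger is needed for that particular check.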
Your question could be generalized as follows: how do I express subclasses in relational tables?
For the generic answer, see this:
https://stackoverflow.com/tags/class-table-inheritance/info

Using Multiple Databases

A company is hired by another company for helping in a certain field.
So I created the following tables:
Companies: id, company name, company address
Administrators: (in relation with companies) id, company_id, username, email, password, fullname
Then, each company has some workers in it, and I store data about those workers.
Each worker has a profession, a signed Agreement Type and some other common attributes.
Now, the parent tables and data in it for workers (Agreement Types, Professions, Other Common Things) are going to be the same for each company.
Should I create 1 new database for each company? Or store All data into the same database?
Thanks.
Since "Agreement Types" and "Professions" are going to be the same for each company, I would suggest having a lookup table like "AgreementTypes" with columns such as "ID" and "Type", and referencing its "ID" column from the "Workers" table. I don't think a new database is required; relational databases are used to eliminate data redundancy and create appropriate relationships between entities.
If you imagine having one database per company, you end up with a single record in the "Company" table of each database. "Administrators" and "Workers" are associated with that single record, and the common entities such as "AgreementTypes" are repeated in every database.
So, if there is any addition or modification to an agreement type, it has to be applied in all databases, which is difficult. Similarly, if any new entity has to be linked to the "Company" entity, all databases need to be revisited, assuming these entities belong to ONE application.
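A small sketch of the single-database approach, using SQLite from Python: one shared AgreementTypes lookup table serves every company, so changing a type is one UPDATE. The column names follow the answer where it gives them ("ID", "Type"); the rest (FullName, the sample rows, the workers_of helper) are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Companies (
    ID INTEGER PRIMARY KEY,
    CompanyName TEXT NOT NULL
);
CREATE TABLE AgreementTypes (
    ID INTEGER PRIMARY KEY,
    Type TEXT NOT NULL UNIQUE
);
CREATE TABLE Workers (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID),
    FullName TEXT NOT NULL,
    AgreementTypeID INTEGER NOT NULL REFERENCES AgreementTypes(ID)
);
INSERT INTO Companies VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO AgreementTypes VALUES (1, 'Full-time'), (2, 'Contract');
INSERT INTO Workers VALUES
    (1, 1, 'Ann', 1), (2, 1, 'Ben', 2), (3, 2, 'Cid', 2);
""")

# Renaming a shared agreement type is a single statement for all companies.
conn.execute("UPDATE AgreementTypes SET Type = 'Contractor' WHERE ID = 2")

def workers_of(company_id):
    """Workers of one company with their agreement type, filtered by CompanyID."""
    return conn.execute("""
        SELECT w.FullName, a.Type
        FROM Workers w
        JOIN AgreementTypes a ON a.ID = w.AgreementTypeID
        WHERE w.CompanyID = ?
        ORDER BY w.ID
    """, (company_id,)).fetchall()

print(workers_of(1))
print(workers_of(2))
```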
You should have one single database, with a structure something like this (this is somewhat over-simplified, but you get the idea):
Companies
CompanyID PK
CompanyName
CompanyAddress
OtherCompanySpecificData
Workers
WorkerID PK
CompanyID FK
LastName
FirstName
DOB
AgreementTypeID FK
ProfessionID FK
UserID FK - A worker may need more than one user account
Other UserSpecificData
Professions
ProfessionID PK
Profession
OtherProfessionStuff
AgreementType
AgreementTypeID PK
AgreementTypeName
Description
OtherAgreementStuff
Users
UserID PK -- A Worker may need more than 1 user account
WorkerID FK
UserName
Password
AccountStatus
Groups
GroupID PK
GroupName
OtherGroupSpecificData
UserGroups --Composite Key with UserID and GroupID
UserID PK
GroupID PK
Obviously, things will grow a little more complex, and I don't know your requirements or business model. For example, if companies can have different departments, you may wish to create a CompanyDepartment table, and then be able to assign workers to various departments.
And so on.
The more atomic you can make your data structures, the more flexible your database will be as it grows. Google the term Database Normalization, and specifically the Third Normal Form (3NF) for a database (Considered the minimum for efficient database design).
Hope that helps. Feel free to elaborate if you are stuck - there is a lot of great help here on SO.

Three-way Referential Integrity - SQL Server 2008

I am building a database using SQL Server 2008 to store prices of securities that are traded on multiple markets.
For a given market, all the securities have the same holiday calendar. However, the holiday calendars are different from market to market.
I want the following four tables: Market, GoodBusinessDay, Security, and SecurityPriceHistory, and I want to enforce that SecurityPriceHistory does not have rows for business days when the market on which a security was traded was closed.
The fields in the tables are as follows:
Market: MarketID (PK), MarketName
GoodBusinessDay: MarketID (FK), SettlementDate (the pair is the PK)
Security: SecurityID (PK), MarketID (FK), SecurityName
SecurityPriceHistory: This is the question - my preference is SecurityID, SettlementDate, SecurityPrice
How can I define the tables this way and guarantee that for every row in SecurityPriceHistory, there is a corresponding row in GoodBusinessDay?
If I added a column for MarketID to SecurityPriceHistory, I could see how I could do this with two foreign keys (one pointing back to Security and one pointing to GoodBusinessDay), but that doesn't seem like the right way to do it.
This model should do. The relationship between Market and BusinessDay is identifying, that is, a businessday does not exist outside the context of the market to which it belongs.
Similarly, the relationship between BusinessDay and SecurityPriceHistory is identifying, as is the relationship between Security and SecurityPriceHistory.
This means that the primary key of SecurityPriceHistory is composite: security_id, market_id and settlement_date.
This will enforce the constraint that each security may have no more than one row in SecurityPriceHistory for a given market/business day. It does, however, allow for the same security to trade in multiple markets, despite the security's relationship to a particular market: to restrict that, the relationship between Market and Security needs to be identifying, thus:
SecurityPriceHistory could have a FK to GoodBusinessDay and NO FK to Market. You could figure out the market by joining to the GoodBusinessDay table. I don't really like this option, but it's a possibility.
You could also use a trigger to check and make sure that there's a proper GoodBusinessDay record on insert/update, otherwise reject the transaction.
I think you're going to need the marketID field in your securityPriceHistory table, assuming the same securityID could be sold in different markets on the same goodBusinessDay.
Setting it up the way you're thinking is fine. You have a parent, two related children, then a grandchild that has relationships with both children.
Even if a security can only be sold on one market, I'd still include the marketID in the securityPriceHistory table, with two compound FKs, MarketDay and MarketSecurity. Clearer that way, IMO.
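The two-compound-foreign-key idea can be sketched as follows. This runs on SQLite through Python rather than SQL Server 2008, so the DDL dialect differs slightly, but composite foreign keys work the same way in principle. The UNIQUE (SecurityID, MarketID) constraint is an addition needed so the compound key has something to reference; the market names and dates are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Market (
    MarketID INTEGER PRIMARY KEY,
    MarketName TEXT NOT NULL
);
CREATE TABLE GoodBusinessDay (
    MarketID INTEGER NOT NULL REFERENCES Market(MarketID),
    SettlementDate TEXT NOT NULL,
    PRIMARY KEY (MarketID, SettlementDate)
);
CREATE TABLE Security (
    SecurityID INTEGER PRIMARY KEY,
    MarketID INTEGER NOT NULL REFERENCES Market(MarketID),
    SecurityName TEXT NOT NULL,
    UNIQUE (SecurityID, MarketID)  -- target of the MarketSecurity FK
);
CREATE TABLE SecurityPriceHistory (
    SecurityID INTEGER NOT NULL,
    MarketID INTEGER NOT NULL,
    SettlementDate TEXT NOT NULL,
    SecurityPrice REAL NOT NULL,
    PRIMARY KEY (SecurityID, SettlementDate),
    -- MarketSecurity: the price row must use the security's own market
    FOREIGN KEY (SecurityID, MarketID)
        REFERENCES Security(SecurityID, MarketID),
    -- MarketDay: that market must have been open on the settlement date
    FOREIGN KEY (MarketID, SettlementDate)
        REFERENCES GoodBusinessDay(MarketID, SettlementDate)
);
INSERT INTO Market VALUES (1, 'Market A'), (2, 'Market B');
INSERT INTO GoodBusinessDay VALUES
    (1, '2024-07-03'), (2, '2024-07-03'), (2, '2024-07-04');
INSERT INTO Security VALUES (10, 1, 'ACME');
INSERT INTO SecurityPriceHistory VALUES (10, 1, '2024-07-03', 42.5);
""")

def try_insert(row):
    """Return True if the price row is accepted, False if an FK rejects it."""
    try:
        conn.execute("INSERT INTO SecurityPriceHistory VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Market 1 was closed on 2024-07-04, so the MarketDay FK fails.
holiday_ok = try_insert((10, 1, '2024-07-04', 43.0))
# Security 10 belongs to market 1, so the MarketSecurity FK fails.
wrong_market_ok = try_insert((10, 2, '2024-07-04', 43.0))
print(holiday_ok, wrong_market_ok)
```

The MarketID column in SecurityPriceHistory is redundant as data, but it is what lets the two foreign keys agree on the same market without a trigger.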

Modeling a 1 to 1..n relationship in the database

How would you model booked hotel room to guests relationship (in PostgreSQL, if it matters)? A room can have several guests, but at least one.
Sure, one can relate guests to bookings with a foreign key booking_id. But how do you enforce on the DBMS level that a room must have at least one guest?
Maybe it's just impossible?
Actually, if you read the question, it states booked hotel rooms. This is quite easy to do as follows:
Rooms:
room_id primary key not null
blah
blah
Guests:
guest_id primary key not null
yada
yada
BookedRooms:
room_id primary key foreign key (Rooms:room_id)
primary_guest_id foreign key (Guests:guest_id)
OtherGuestsInRooms:
room_id foreign key (BookedRooms:room_id)
guest_id foreign key (Guests:guest_id)
That way, you can enforce a booked room having at least one guest while the OtherGuests is a 0-or-more relationship. You can't create a booked room without a guest and you can't add other guests without the booked room.
It's the same sort of logic you follow if you want an n-to-n relationship, which should be normalized to a separate table containing a 1-to-n and an n-to-1 with the two tables.
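A minimal sketch of the BookedRooms / OtherGuestsInRooms layout, run on SQLite from Python for convenience (the question is about PostgreSQL, where the same NOT NULL and foreign key constraints exist). The name column and the try_sql helper are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Rooms (room_id INTEGER PRIMARY KEY);
CREATE TABLE Guests (
    guest_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE BookedRooms (
    room_id INTEGER PRIMARY KEY REFERENCES Rooms(room_id),
    -- NOT NULL is what guarantees "at least one guest"
    primary_guest_id INTEGER NOT NULL REFERENCES Guests(guest_id)
);
CREATE TABLE OtherGuestsInRooms (
    room_id INTEGER NOT NULL REFERENCES BookedRooms(room_id),
    guest_id INTEGER NOT NULL REFERENCES Guests(guest_id),
    PRIMARY KEY (room_id, guest_id)
);
INSERT INTO Rooms VALUES (101), (102);
INSERT INTO Guests VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO BookedRooms VALUES (101, 1);
INSERT INTO OtherGuestsInRooms VALUES (101, 2);
""")

def try_sql(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A booked room without a primary guest violates NOT NULL.
no_guest_ok = try_sql("INSERT INTO BookedRooms VALUES (102, NULL)")
# Room 102 is not booked, so nobody can be added to it as an extra guest.
unbooked_ok = try_sql("INSERT INTO OtherGuestsInRooms VALUES (102, 2)")
print(no_guest_ok, unbooked_ok)
```

One thing this sketch does not prevent is listing the primary guest again in OtherGuestsInRooms; that would need an extra check outside these constraints.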
In this context I suggest that the entity you are modeling is in fact a BOOKING - a single entity - rather than two entities of room and guest.
So the table would be something like
BOOKING
-------
booking id
room id
guest id (FK to table of guests for booking)
first date of occupancy
last date of occupancy
Where guest id is not nullable,
and you have another table to hold guests per booking...
GUESTS
------
guest id
customer id (FK to customer table)
You could designate one of the guests as the "primary" guest and have it map to a column on the Rooms table. Of course, this is a ridiculous rule for a hotel, where it's perfectly valid to have a room with 0 guests (I very well could pay for a room and not stay there)...
I think what you mean is that a room BOOKING is for at least one guest. ANSI standard SQL would allow you to express the constraint as an ASSERTION something like:
create assertion x check
(not exists (select * from booking b
where not exists
(select * from booking_guest bg
where bg.booking_id = b.booking_id)));
However, I don't suppose Postgres supports that (I'm not sure any current DBMS does).
There is a way using materialized views and check constraints, but I've never seen this done in practice:
1) Create a materialised view as
select booking_id from booking b
where not exists
(select * from booking_guest bg
where bg.booking_id = b.booking_id);
2) Add a check constraint to the materialized view:
check (booking_id is null)
This constraint will fail if ever the materialized view is not empty, i.e. if there is a booking with no associated guest. However, you would need to be careful about the performance of this approach.
What about a room which has not been rented out? What you're looking for are reservations and a reservation presumably needs at least one guest on it.
I think what you're asking is whether you can guarantee that a reservation record is not added unless you have at least one guest for it, and that you can't add a guest without a reservation. It's a bit of a Catch-22 for most DBMSs.
I'd say you should create a bookings table with a three-column composite primary key. But instead of referring to rooms, you can refer to a beds table.
bookings:
bed_id: foreign_key primary
guest_id: foreign_key primary
day: date primary
bill_id: foreign_key not null
beds:
room_id: foreign_key primary
Since being the primary implies being required, and since this is the only way a guest and a room can be related, it makes sure that there cannot be a booking without a guest.
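Notice that there is only one day field. This requires that you create a booking for every day a guest will stay in a room, but also helps ensure that nothing will be accidentally booked twice. A bed can be booked by only one customer on any given day (which is not true for rooms); to guarantee that, (bed_id, day) must also be unique, because the three-column key alone only stops the same guest from booking the same bed twice on one day.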
The bill_id is there so that you can refer a booking to a specific record for a bill, which can also be referenced by other things such as minibar expenses.
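The per-day bookings table might look like the sketch below, run on SQLite from Python. Two things here are assumptions beyond the answer: the explicit bed_id column on beds, and the UNIQUE (bed_id, day) constraint, which is what actually stops two different guests from taking the same bed on the same day. Foreign keys to guests and bills are omitted to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE beds (
    bed_id INTEGER PRIMARY KEY,
    room_id INTEGER NOT NULL
);
CREATE TABLE bookings (
    bed_id INTEGER NOT NULL REFERENCES beds(bed_id),
    guest_id INTEGER NOT NULL,   -- a booking row cannot exist without a guest
    day TEXT NOT NULL,           -- one row per night, ISO date text
    bill_id INTEGER NOT NULL,
    PRIMARY KEY (bed_id, guest_id, day),
    UNIQUE (bed_id, day)         -- assumption: one guest per bed per day
);
INSERT INTO beds VALUES (1, 101), (2, 101);
INSERT INTO bookings VALUES (1, 7, '2024-05-01', 500);
""")

def try_book(bed_id, guest_id, day, bill_id):
    """Return True if the booking row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO bookings VALUES (?, ?, ?, ?)",
                     (bed_id, guest_id, day, bill_id))
        return True
    except sqlite3.IntegrityError:
        return False

double_booked = try_book(1, 8, '2024-05-01', 501)  # same bed, same day
next_night = try_book(1, 7, '2024-05-02', 500)     # same guest, next day
other_bed = try_book(2, 8, '2024-05-01', 501)      # same room, other bed
print(double_booked, next_night, other_bed)
```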
