Database Design avoid redundancy - database

Database design is new to me, sorry for potential basic question.
The database I am designing is supposed to represent the following :
My users can be either contractors or administrators
contractors can issue withdrawals
administrators can issue refunds (to refund withdrawals)
a refund is linked to 1 or many (non cancelled) withdrawals
a withdrawal is linked to 0 or 1 refund
contractors can cancel withdrawals before they are linked to a refund
I therefore created 3 tables, USER ,WITHDRAWAL and REFUND.
As property of WITHDRAWAL records I created status , which represents if the concerned withdrawal is ( pendingrefund or cancelled or refunded).
Aiming to make a good database design, I would like to avoid redundancy and potential data missmatches.
Is creating that status property a good idea and why?
Indeed, one could deduce that withdrawal is refunded if it is linked to a refund (null or not).
On the other side, I should as well create a "isCancelled" field for withdrawal to know if it is cancelled. But that could lead to withdrawals being linked to refund while being cancelled (which is not possible)
Would you have any advice to help me making good design choices for this situation?

Related

Database design for Workflow based system

I am designing a database for a sales company having a complex workflow. The flow starts at Sales Officer, then go to Team Lead and finally Manager. Before approving a proposal, manager will send it to the Division Business Analyst. After getting remarks from dba, he may send the proposal back to sales officer for modification in the proposal. The manager may also reject the proposal. If satisfied, manager will forward it to Director, Sales. The tables designed so far as follows:-
Table: ProposalBasicData
Id, Title, ProposalDate, Scope, Objective
Table: ProposalState
Id, Name
(Values - Forwarded , Approved , Returned , Rejected)
Table: UserType
Id, Name
(Values - SalesOfficer, TeamLead, Manager , DBA, DirectorSales)
Table: WorkFlow
Id, StartUserType, NextUserType, StateId, IsActive
Table: RequestAction
Id, ProposalId, WorkFlowId, UserId, ActionDate
Please suggest regarding the design.
There are many questions raised by such problems. Ex:
in your tables, Workflow defines transitions from state to state as changing the assignment from one user to the next.
this can be an issue. Lets say the user is sick, leaving, on vacation, ... Then your flow is blocked.
It does not allow for the group concept either.
others (like I) would define a transitions table. StartState, NextState.
The workflow would be a list of transitions.
Each transition requires the user to be of a certain type. Or from a user management point of view, have a certain role, or be member of a group.
If your workflow is fixed and is not subject to change, your method could be ok. But if the workflow is flexible or might be changed / adapted, you should go with something more flexible.
The type of setup you are talking about make me think of the Jira software (form Atlassian), where you define tickets, with status, workflows and users. Is it possible for you to use (i.e purchasse or OpenSource) a workflow management tool? It might be cheaper in the long run than building one.
Your model will potentially be expanded to include:
clients. Which client is the proposal for?
who is the sales representative or account manager who is responsible for auditing the workflow and moving it forward?
link to other systems: how does it link to purchasing, accounts receivable, ...
All this to day, this requires an in-depth analysis which is hard to do on a medium such as this (SO).
EDIT: 20181004
I added the following model following your comment. I decided to put the workflow(s) in the database:
Notes (tables in alphabetical order):
Employee
Each employee can be linked to n number of EmployeeRole via the Employee_has_EmployeeRole table.
Proposal
An Employee is linked as the Sales Officer, since he initiates a proposal.
A workflow is linked since many workflows could exist for different proposals.
Transition
Linked to State twice. The start state and the end state.
An EmployeeRole is linked, to identify which role an employee must have to perform this transition.
Enforcing that will be done by the application.
Workrlow_has_Transition
Links Transitions to Workflows.
The Employee who completed the transition is recorded here.
The date it was done is also kept here.
The OrderInWorkflow is just a number that allows you to order Workflow_has_Transition entries.
The application will have to make sure a Transition is not done before the others of lower order are done (i.e. DoneDate is null).
It will also validate that the Employee trying to complete it has the proper EmployeeRole.
Now, the employee group concept. You can say that a group are employees with the same EmployeeRole. Therefore, when a notification needs to be sent by your application, send it to all users with the required role for the Transition. This avoids you having to create an EmployeeGroup table, which links employees together.
Application scenarios:
- Start a Proposal
- Verify that the user trying to start a new one has the role "Sales Officer"
- Collect basic information.
- Link the Sales Officer to it (current user).
- Link a Workflow to it. Only propose the workflows which have at least 1 Workflow_has_Transition.
- Send a notification to the Employee(s) which have the EmployeeRole for the first Workflow_has_Transition for this new Workflow.
- These employees receive a notification.
- Progressing through the workflow
- An employee receives a notification about a Proposal and it's "todo" Transition.
- Employee views Proposal and Workflow (use the OrderInWorkflow to ORDER BY Transitions).
- Employee approves if he has the proper EmployeeRole, fill DoneBy_idEmployee and DoneDate.
While going through your application scenarios, you will find gaps or missing items.
Ex.1 do you want to record the rejection of a Transition? How would that be handled then? You send a notification to the employees with the role for that Transition to review it?
Ex.2 do you want to keep the complete history of the proposal? Ex. it Transition X is rejected twice, but approved the third time around.
There are many scenarios like this which will show the weaknesses of your model, which you fix as you complete this analysis. Now it is not perfect, I did not put a lot of time on it. But it is a starting point and illustrates my idea.
I would suggest you structure some where along this
ProposalBasicData {PBID,Title, ProposalDate,Scope,Objective}
ProposalState {PSID,Name}
UserType {UTID,Name}
User {UID,Name,UTID(Usertype UTID FK) }
Request{ RID, StartUID, StartDate ,PSID, IsActive }
RequestAction {AID,RID, RequesterUID, ReceiverUID, Date, Comments, IsCompleted }
as I think there could be multiple users of the same type, more over you would want to have comments on why it a RequestAction was rejected, A requester would make a requestAction to a receiver and if its completed it would show in the system make life easier when handling multiple requestActions of the same Request.
Hope this helps but i suggest you make a flow chart and look at what are all the possible use cases.

Can or Should an ERD Action involve more than 2 Entities?

This is an problem about drawing ERD in one of my course:
A local startup is contemplating launching Jungle, a new one stop
online eCommerce site.
As they have very little experience designing and implementing
databases, they have asked you to help them design a database for
tracking their operations.
Jungle will sell a range of products, and they will need to track
information such as the name and price for each. In order to sell as
many products as possible, Jungle would like to display short reviews
alongside item listings. To conserve space, Jungle will only keep
track of the three most recent reviews for each product. Of course, if
an item is new (or just unpopular), it may have less than three
reviews stored.
Each time a customer buys something on Jungle, their details will be
stored for future access. Details collected by Jungle include
customer’s names, addresses, and phone numbers. Should a customer buy
multiple items on Jungle, their details can then be reused in future
transactions.
For maximum convenience, Jungle would also like to record credit card
information for its users. Details stored include the account and BSB
numbers. When a customer buys something on Jungle, the credit card
used is then linked to the transaction. Each customer may be linked to
one or more credit cards. However, as some users do not wish to have
their credit card details recorded, a customer may also be linked to
no credit cards. For such transactions, only the customer and product
will be recorded.
And this is the solution:
The problem is the Buys action connect with 3 others entities: Product, Customer, and Card. I find this very hard to read and understand.
Is an action involving more than 2 entities common in production? If it is, how should I understand and use it? Or if it's not, what is the better way of design for this problem?
While the bulk of relationships in practice are binary relationships, ternary and higher relationships are normal elements of the entity-relationship model. Some examples are supplies (supplier_id, product_id, region_id) or enrolled (student_id, course_id, semester_id). However, they often get converted into entity sets via the introduction of a surrogate identifier, due to dislike of composite keys or confusion with network data models in which only directed binary relationships are supported.
Reading cardinality indicators on non-binary relationships are a common source of confusion. See my answer to designing relationship between vehicle,customer and workshop in erd diagram for more info on how I handle this.
Your solution has some problems. First, Buys is indicated as an associative entity, but is used like a ternary relationship with an optional role. Neither is correct in my opinion. See my answer to When to use Associative entities? for an explanation of associative entities in the ER model.
Modeling a purchase transaction as a relationship is usually a mistake, since relationships are identified by the (keys of the) entities they relate. If (CustomerID, ProductID) is identifying, then a customer can buy a product only once, and only one product per transaction. Adding a date/time into the relationship's key is better, but still problematic. Adding a surrogate identifier and turning it into a regular entity set is almost certainly the best course of action.
Second, the Crow's foot cardinality indicators are unclear. It looks like customers and products are optional in the Buys relationship, or even as if multiple customers could be involved in the same transaction. There are three different concepts involved here - optionality, participation and cardinality - which should preferably be indicated in different ways. See my answer to is optionality (mandatory, optional) and participation (total, partial) are same? for more on the topic.
A card is optional for a purchase transaction. From the description, it sounds as if cards may participate totally, meaning we won't store information about a card unless it's used in a transaction. Furthermore, only a single card can be related to each transaction.
A customer is required for a purchase transaction, and it sounds like customers may participate totally, meaning we won't store information about customers unless they purchase something. Only a single customer can be related to each transaction.
Products are required for a purchase transaction, and since we'll offer products before they're bought, products will participate partially in transactions. However, multiple products can be related to each transaction.
I would represent transactions for this problem with something like the following structure:
I'm not saying converting a ternary or higher relationship into an entity set is always the right thing to do, but in this case it is.
Physically, that would require two tables to represent (not counting Customer, Product, Card or ProductReview) since we can denormalize TransactionCustomer and TransactionCard into Transaction, but TransactionProduct is a many-to-many relationship and requires its own table (as do ternary and higher relationships).
Transaction (TransactionID PK, TransactionDateTime, CustomerID, CardID nullable)
TransactionProduct (TransactionID PK, ProductID PK, Quantity, Price)

Database design for a wedding photography booking system website

I'm really stuck on the best way to implement this database.
Here is my problem: The database is to store information on wedding photography clients.
A user can sign up to my site, enter the details of their wedding and get their own "wedding profile page". They can do this without having to have us shoot their wedding.
At any point the user can book a meetup, wedding, or engagement shoot. The website will check if we are available. (weddings take preference over meetups so a client that wants to book a wedding on a day we have a meetup, the meetup will be flagged for a reschedule)
We must also be able to book days off. On these days a wedding/meetup/engagement shoot can not be booked.
engagement shoots cost and upfront fee. With weddings, a deposit is due within 14 days or the date is freed up again. Meetups are free.
I am so stuck with how to implement this system. I just keep going round in circles, the best way I can think is to have a "dates" table, that links all the other tables but I'm sure this is not the most efficient way.
I think what is putting me off is the fact that there can be multiple weddings on the same day (for people who just want a wedding profile), but only one BOOKED wedding per day.
So have I got this completely wrong? or do I store all appointments in one table and use a "type of appointment" table.
SO SO Stuck, I hope you can help me!
P.S. I have missed most fields out to make it simpler to understand.
Keep it simple to begin with. You've identified many of the nouns that should correspond to entities:
USERS
WEDDINGS
ENGAGEMENT_SHOOTS
MEETUPS
UNAVAILABLE
DATES
PAYMENTS
I'd suggest that WEDDINGS, ENGAGEMENT_SHOOTS, MEETUPS and UNAVAILABLE are all type of bookings. You could have just:
USERS
BOOKINGS - this has a date and perhaps a status
BOOKING_TYPE (wedding, meetup, engagement_shoots, unavailable)
PAYMENTS
You may wish to package up all on the things a use has bought relating to a single wedding into a TRANSACTION entity etc which will allow a single payment to be mapped to multiple bookings. When one of your staff isn't available they could have a booking of type holiday etc.
There may be 2 types of USERS - your photography staff who are assigned to a booking and your customer.
Based on the status of the booking, eg. CONFIRMED, PENDING, COMPLETED and the date your business logic could send out payment demands, followups etc.
Create a simple model and start adding data you will soon see the gaps and issues if any. You should have test cases up front to ensure that the data model supports all the scenarios for your application, best to do this with a set of use cases.

Suggestion on database design - multiple tables involved into a relation

My application needs to implement a one to one relation between multiple tables. I have a table which store companies (which can be customers and suppliers, or both). There are twi Bit fields, Customer and Supplier.
Then I have different tables for various operations: Invoices, Bank operations, Cashdesk operations. An I need to pair payments with invoices. A payment is not exact amount of an invoice, but it can be split over each number of invoices. Also, an invoice can be split over multiple payments. Payments can be from both bank or cashdesk operations
My original approach was to have a table, PaymentRelations, with Foreign Keys InvoiceID, BankOpID, CashOpID and Amount, and for any payment between between them, I create a record with only two foreign ID's filled, and the corresponding amount. This way in any moment I can know for each operation (invoice or payment) how much was paid.
Also there are RI requirements, so if a document is involved in payment relation, it cannot be deleted (or there is cascade delete, so if a payment of invoice document is deleted, the related PaymentRelations records are deleted, so the counterpart operations are freed - they are no longer involved into payment relations so their amount can be fully used into other payment relations).
But appeared another situation. Since partners can be both customers and suppliers, it is possible to compensate between same type of operation on customer and supplier side of the same partner (e.g. a partner is both customer and supplier, he made an invoice as supplier for 100 and received an invoice as customer for 150, 50 was compensated between the received and the sent invoice and the rest of each is paid through one or multiple payment operations).
This can also happen for the other operations (e.g. he paid through a bank operation 100, he received through another bank operation 200, and 50 needs to be compensated between those two operations; same apply for caskdesk operations).
What approach would you use to model this kind of relations?
I would buy accounting software instead of writing it. Some wheels are worth reinventing; this isn't one of them.
But if you must . . .
Bitfields are the wrong way to identify customers and suppliers. This SO answer should get you over the issues with customers and suppliers.
If I had to design an accounting system, I think I'd start with a spreadsheet. I'd design a table of transactions in that spreadsheet, so I could get the feel of how certain transactions were alike, and how others were different. At this stage, I wouldn't worry about NULLs, about repeating groups, about transitive dependencies, or anything else like that.
Having developed a working(ish) model in the spreadsheet, I'd then try to normalize it to 5NF.

Accounting database design question "accounts" and "transactions"

Option 1) user can have multiple "accounts" (eg payment, deposits, withdraws etc)
Option 2) user can only 1 single account, and transaction has types (payment, deposit, withdraw)
Both option will work just fine! They both can produce the same result! However option 1 uses more resources, but it's more flexible, option 1 is not flexible but uses less resources!
What is the question ?
Option 1 is a piece of garbage, that no accountant can use, and no auditor will pass. payment, deposits, withdraws eta re transactions, not "accounts". So what if it uses less resources. SO do cave men.
Option 2 starts to look like an accounting system, with (a) accounts and (b) transactions against accounts, as expected in most developed countries.
So there is no choice.
Start from a journal table, that link to a chart of accounts table. In your journal table is where you will store fieldnames for transaction date, account code, description, amount. chart of account table is where you will store fieldnames like account code, account type (balance sheet or profit and loss),account status (active or inactive). For greater details on the accounting database schema, download derek liew's book on accounting database design.

Resources