Database Design for a Person's Availability - sql-server

I am currently working on a web application that stores information about cooks in the user table. The application lets users search for cooks. If a cook is not available on May 3, 2016, we want to show a Not-Bookable or Not-Available message for that cook when a user searches for May 3, 2016. The solution we have come up with is to create a table named CooksAvailability with the following fields:
ID, //Primary key, auto increment
IDCook, //foreign key to user's table
Date, //date he is available on
AvailableForBreakFast, //bool field
AvailableForLunch, //bool field
AvailableForDinner, //bool field
BreakFastCookingPrice, //decimal nullable
LunchCookingPrice, //decimal nullable
DinnerCookingPrice //decimal nullable
With this schema, we can tell whether a cook is available on a specific date. The problem with this approach is that it requires a lot of database space: if a cook is available for 280 days a year, there have to be 280 rows to reflect just that one cook's availability.
This is too much space, given that we may potentially have thousands of cooks registered with our application. As you can see from the CookingPrice fields for breakfast, lunch and dinner, a cook can charge different cooking rates on different dates and for different meals.
Currently, we are looking for a smart solution that fulfils our requirements and consumes less space than our current design.

You are storing a record for each day. The main mistake that led you to this redundant design is that you did not separate the concepts enough.
I do not know whether a cook has an expected rate for a given meal, that is, a price one can assume in general when there is no additional information. If that is the case, you can store these default prices in the table where you store the cooks.
Let's store the availability and the specific prices in different tables. If the availability table does not have to store the prices, it can store availability intervals instead of single days. In the other table, where you store the prices, you only need the prices that deviate from the expected price. So you will have availability intervals in one table, specific prices (only where the price differs from the expected one) in the other, and default meal prices in the cook table. If there is no special price, the default price is used.
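The three-table split described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than SQL Server, and all table, column and function names (cook, cook_availability, cook_price_override, price_for) are hypothetical, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cook (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    default_breakfast_price NUMERIC,   -- expected prices live with the cook
    default_lunch_price NUMERIC,
    default_dinner_price NUMERIC
);
CREATE TABLE cook_availability (       -- one row per interval, not per day
    id INTEGER PRIMARY KEY,
    cook_id INTEGER NOT NULL REFERENCES cook(id),
    from_date TEXT NOT NULL,           -- ISO dates compare correctly as text
    to_date TEXT NOT NULL
);
CREATE TABLE cook_price_override (     -- only prices that deviate from the default
    id INTEGER PRIMARY KEY,
    cook_id INTEGER NOT NULL REFERENCES cook(id),
    price_date TEXT NOT NULL,
    meal TEXT NOT NULL,
    price NUMERIC NOT NULL,
    UNIQUE (cook_id, price_date, meal)
);
INSERT INTO cook VALUES (1, 'Sample cook', 20, 30, 40);
-- Available all of May 2016 except May 3: two intervals instead of 30 rows.
INSERT INTO cook_availability (cook_id, from_date, to_date) VALUES
    (1, '2016-05-01', '2016-05-02'),
    (1, '2016-05-04', '2016-05-31');
INSERT INTO cook_price_override (cook_id, price_date, meal, price)
    VALUES (1, '2016-05-14', 'dinner', 55);
""")

# Whitelist of column names, so the meal never reaches the SQL text unchecked.
DEFAULT_COLUMN = {
    "breakfast": "default_breakfast_price",
    "lunch": "default_lunch_price",
    "dinner": "default_dinner_price",
}

def price_for(conn, cook_id, day, meal):
    """Return the price for a meal on a day, or None if the cook is unavailable."""
    available = conn.execute(
        "SELECT 1 FROM cook_availability "
        "WHERE cook_id = ? AND ? BETWEEN from_date AND to_date",
        (cook_id, day),
    ).fetchone()
    if available is None:
        return None
    override = conn.execute(
        "SELECT price FROM cook_price_override "
        "WHERE cook_id = ? AND price_date = ? AND meal = ?",
        (cook_id, day, meal),
    ).fetchone()
    if override is not None:
        return override[0]
    column = DEFAULT_COLUMN[meal]
    return conn.execute(
        f"SELECT {column} FROM cook WHERE id = ?", (cook_id,)
    ).fetchone()[0]
```

Per-meal availability (the AvailableForBreakFast style flags) is left out here to keep the sketch short; it could be added as columns on the interval row if it does not change day by day.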

To answer your question, I would need to know more about the structure of the information.
For example, if most cooks are available for continuous periods, it could help to organize your availability table with avail_from_date and avail_to_date columns instead of a row for each day. This would reduce the number of rows.
The different prices for breakfast, lunch and dinner would be better stored in the cooks table, provided the prices are not different each day. The same goes for the availability for breakfast, lunch and dinner, if it does not change from day to day.
But if your information structure makes it necessary to keep a record for every cook for every day, even a figure such as 365 * 280 = 102,200 records per year is not very much for a SQL database, in my view. With indexes in the right places, this will perform well.

There are a few questions that would help with the overall answer.
How often does availability change?
How often does price change?
Are there general patterns, e.g. cook X is available for breakfast and lunch, Monday - Wednesday each week?
Is there a normal availability / price over a certain period of time, but with short-term overrides / differences?
If availability and price change at different speeds, I would suggest you model them separately. That way you only need to show what has changed, rather than duplicating data that is constant.
Beyond that, there's a space / complexity trade-off to make.
At one extreme, you could have a hierarchy of configurations that override each other. So, for cook X there's set A that says they can do breakfast Monday - Wednesday between dates 1 and 2. Then also for cook X there's set B that says they can do lunch on Thursday between dates 3 and 4. Assuming that dates go 1 -> 3 -> 4 -> 2, you can define whether set B overrides set A or adds to it. This is the most concise, but has quite a lot of business logic to work through to interpret it.
At the other extreme, you just say for cook X between date 1 and 2 this thing is true (an availability for a service, a price). You find all things that are true for a given date, possibly bringing in several separate records e.g. a lunch availability for Monday, a lunch price for Monday etc.
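The second, flatter extreme could look roughly like the sketch below: every record states one thing that is true for a cook between two dates, and a lookup collects everything that holds on a given day. The Fact class, its fields and the facts_on function are illustrative assumptions, not something the answer prescribes.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Fact:
    cook: str
    kind: str      # hypothetical labels, e.g. "lunch_available" or "lunch_price"
    value: object
    start: date
    end: date      # inclusive

def facts_on(facts, cook, day):
    """Collect every fact that holds for a cook on a given day.

    If two facts of the same kind overlap, the one later in the list wins;
    that is a simplification chosen for this sketch.
    """
    return {
        f.kind: f.value
        for f in facts
        if f.cook == cook and f.start <= day <= f.end
    }

facts = [
    Fact("X", "lunch_available", True, date(2016, 5, 1), date(2016, 5, 31)),
    Fact("X", "lunch_price", 25, date(2016, 5, 1), date(2016, 5, 15)),
    Fact("X", "lunch_price", 30, date(2016, 5, 16), date(2016, 5, 31)),
]
```

Here facts_on(facts, "X", date(2016, 5, 3)) brings together the availability record and the price record that both cover that day.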

Related

How do I persist sales price in an orders details table?

I'm writing a simple transactional database to practice my T-SQL skills.
If I sell an umbrella in my sales.orderdetails table and it's getting the current retailprice of that umbrella from the items table and putting it in the invoice, how do I keep from having incorrect historical report data 6 months from now when I jack up the retail price of the umbrella by $10?
How do I store that umbrella's sold price in the orderdetails table so it's unaffected by any future changes in the items table?
I know you can use an SCD in a data warehouse for this kind of issue, but I was wondering how to do it in an OLTP system. A persisted computed column? I can't seem to get that to work in the object explorer when I try to enter items.retailprice as the computed value for the salesorderdetails.cost column.
The way I have seen this done in the past, without using a technique like SCD, was to have the order detail have the price that was charged and then use a foreign key to another table, possibly products or productprices, that contains the current price.
In a full-on transactional system, you'd want the order detail row to record full retail (MSRP, or what have you), current price (in case you had the item posted at a discount that day), and price charged (in case the customer used a promo/coupon code to reduce the price themselves). Unless you log all three, you're at the mercy of whatever the price changes to tomorrow or next week or next year, which makes for bad analytics.
You probably also want to capture current cost of goods, too, since that's subject to change over time, especially in an average costing scenario. Otherwise, margin calculations will be suspect.
But then, yes, a foreign key or keys to any other descriptive tables for those less ephemeral characteristics of the product.
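A minimal sketch of the price snapshot idea follows, using Python's sqlite3 module instead of T-SQL so it can run anywhere. The table and column names (items, order_details, unit_price_charged) and the sell function are assumptions for illustration; the point is only that the price is copied into the order row at sale time rather than looked up later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    retail_price NUMERIC NOT NULL
);
CREATE TABLE order_details (
    id INTEGER PRIMARY KEY,
    item_id INTEGER NOT NULL REFERENCES items(id),
    quantity INTEGER NOT NULL,
    unit_price_charged NUMERIC NOT NULL   -- copied from items at sale time
);
INSERT INTO items (id, name, retail_price) VALUES (1, 'umbrella', 15);
""")

def sell(conn, item_id, quantity):
    """Record a sale, snapshotting the item's current retail price."""
    conn.execute(
        "INSERT INTO order_details (item_id, quantity, unit_price_charged) "
        "SELECT id, ?, retail_price FROM items WHERE id = ?",
        (quantity, item_id),
    )

sell(conn, 1, 2)

# Six months later the retail price goes up by $10 ...
conn.execute("UPDATE items SET retail_price = retail_price + 10 WHERE id = 1")

# ... but the historical order row still carries the price that was charged.
sold_price = conn.execute(
    "SELECT unit_price_charged FROM order_details WHERE item_id = 1"
).fetchone()[0]
current_price = conn.execute(
    "SELECT retail_price FROM items WHERE id = 1"
).fetchone()[0]
```

The same pattern extends to the other values mentioned above (full retail, posted price, cost of goods): each becomes its own column on the order detail row.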

Database Design Advice For Fees and Payments

Below is the current design of school fees and payment I have created.
I'm just a little stuck right now because I can't model the payment/transaction table. Also, I would like to know your thoughts and comments with my current design. This is the first time I'll create a database for fees and payments.
Main tables of my concern are schoolyearfee_lt , student_fee_lt and Payment
I'm thinking of using the Payment table to store the sum of all fees, divided according to whichever payment term was chosen (monthly, quarterly, annual, cash).
Let's say for instance, Monthly was chosen as payment plan.
amountToPayPerMonth = (sumOfAllAssignedFees / paymentterm) - downpayment
This means 11 inserts of amountToPayPerMonth into the payment table, plus 1 insert for the downpayment: 11 + 1 = 12 months.
How do I mark it as paid? Should I use another Transaction table?
Is this a good design? Any thoughts or advice?
Thank you.
Some (personal) thoughts about your design and question.
1- schoolyearfee_lt. It appears to be 1-to-n with fee_mt. If I understand correctly, the same fee can be applied to several school years, categories, etc., but the amount of a fee does not change. That means, for example, that in every year in which fee amounts change I would have to create at least a new fee and some schoolyearfee rows. I believe something can be reviewed here. You could, for example, move some of its fields (schoolyear?) to the fee_mt table, and/or move the amount from fee_mt to schoolyearfee_lt. There are more possibilities, e.g. a table fee_years_lt that stores the year and amount (and maybe other factors that change the fees), and so on. Maybe you could make some of these changes, maybe none; it depends on your design and requirements. The questions are: does the amount change by year (I believe yes), gradelevel, feetype or feecategory? Do you want a master fee that applies forever, or are your fees recreated from scratch each year?
2- Payment. I would name it for exactly what it is: payment_plan. I would add a field paid, a field payment_date and a field schoolyear (in the current design).
3- Student_fee_lt and schoolyear. In the current design you had better add the year too. Depending on how fee_mt is managed (see above), I would put it in the PK too. If you move the year to fee_mt, you don't need it here. Is student_fee_lt really needed (it looks like the result of a query plus the field date_effective)?
4- Payment formula. The downpayment, and consequently the formula, is a little unclear to me. Is it a kind of discount applied to every payment, or is it a fixed amount? In the latter case you should review your formula. Why 11 payments?
5- Chosen payment plan storage. I would have a table that stores the payment plan chosen by the student (and some other data). This should not be student_fee_lt, because that table stores the individual fees assigned to each student.
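To make points 2 and 4 concrete, here is one possible reading of the payment plan, sketched in Python. It assumes the downpayment is a fixed first payment and that the remainder is split evenly over the following installments, which is an interpretation and not necessarily what the question intends. The Installment class and build_plan function are hypothetical; the paid and payment_date fields follow the suggestion in point 2.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, ROUND_HALF_UP
from typing import List, Optional

@dataclass
class Installment:
    sequence: int                      # 0 is the downpayment
    amount: Decimal
    paid: bool = False
    payment_date: Optional[date] = None

def build_plan(total_fees: Decimal, downpayment: Decimal,
               installments: int) -> List[Installment]:
    """Split (total_fees - downpayment) evenly over the given installments.

    Assumes the downpayment is a fixed amount paid up front.
    """
    remaining = total_fees - downpayment
    base = (remaining / installments).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    plan = [Installment(0, downpayment)]
    plan.extend(Installment(i, base) for i in range(1, installments + 1))
    # Put any rounding difference on the last installment so the plan adds up.
    plan[-1].amount += remaining - base * installments
    return plan

plan = build_plan(Decimal("10000"), Decimal("1000"), 11)

# Marking a row as paid is then an update of that row, not a new table.
plan[0].paid = True
plan[0].payment_date = date(2016, 6, 1)
```

With this shape, each installment maps to one row of the payment plan table, and "how do I mark it as paid" becomes setting paid and payment_date on that row. A separate transaction table would only be needed if partial payments or multiple payments per installment must be recorded.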

Dimensional model to capture Sales weighting on different date schedules

We have a requirement to come up with a strategy to show Sales revenue data weighted by dates differently on different schedules.
We currently have a FactSales table with a grain of one row per order with the measure of sales amount. We have separate DimDate and DimTime dimensions,and a DimBusinessUnit dimension with one row for each entity within the organization.
In DimDate we have a flag for the major US holidays so we know reduced sales revenue may be expected. This flag would apply globally.
The ask is that different business units might have slow revenue days. For example, Mondays might be slow in one business unit, and Fridays slow in another. For analysis it is desirable to capture these different schedules with a flag or a weighting.
Ultimately this will probably be reflected as a projected sales amount in a calculated measure.
How can I best add this weighting? Does it belong in the Date dimension, Business Unit dimension, or maybe a degenerate dimension in the Fact table, or something else altogether?
DimDate is probably not a good place to keep this information. Each Business Unit (BU) may have a different schedule, so you would quite possibly need a flag on each date for every combination of BU and slow day. For example, if BU1 and BU2 both have a slow day on Monday, each Monday in your DimDate would need a way of showing that it is slow for BU1 and for BU2.
The BU dimension might be a better place, as the schedule is specific to each unit. You could extend the dimension by adding the 7 days of the week as attributes and flag each as slow or not, for example with true/false flags. You could also have a single attribute holding a bit mask, e.g. 0100000, where the position of each value corresponds to a day (M T W T F S S), 0 means not slow and 1 means slow. In this example Tuesday is the slow day.
This will also allow you to track history, if you wish, by selecting the relevant SCD process.
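Reading such a bit mask is straightforward. The sketch below assumes the mask is stored as a 7-character string in Monday-to-Sunday order, as in the example above; the function name is_slow_day is hypothetical.

```python
from datetime import date

def is_slow_day(mask: str, day: date) -> bool:
    """Return True if the mask flags the weekday of `day` as slow.

    The mask has one character per day, Monday first (M T W T F S S);
    '1' means slow and '0' means not slow.
    """
    if len(mask) != 7 or set(mask) - {"0", "1"}:
        raise ValueError("mask must be 7 characters of '0' or '1'")
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return mask[day.weekday()] == "1"
```

For the mask 0100000, any Tuesday (for instance May 3, 2016) is reported as slow and every other weekday is not.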
Another option may be a separate Dimension i.e. DimSchedule and Factless Fact Table.
http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/factless-fact-table/
I hope this helps.
Your situation seems to be the same as the Multiple National Calendars problem described by Kimball:
http://www.kimballgroup.com/1998/12/think-globally-act-locally/
Where Kimball is describing holidays in the left-most table, you could also add a "slow day" flag.

Searching based on a changing value in Solr

Imagine you're creating a website that allows people to search for rental cars based on price, amongst other things. Some rental cars are more popular at certain times of the year than others, so their price varies based on date. For instance, a car might cost $90/day most of the year except for December & March, when it costs $110/day, and in January & February it costs $130/day. Is it possible to have a calculated field in Solr, so you can search for a car that costs less than $X/day over the duration of your rental? I'm new to Solr, so have absolutely no idea whether this is possible or not - sorry if I'm asking a dumb question.
One possibility that I've come across would be to index the item once for each price, and have start and end dates for each of these. This copes with the price changes, but won't work for rentals that cross price boundaries; for example, a customer might want to rent a car for one week in February and two weeks in March - we'd end up not finding the car in this case.
I'm using Solr 3.5. Is it possible to do this using a FunctionQuery? I've seen some documentation on them, but all the examples I've seen are using them to return a computed value, rather than performing a search on that computed value. If I can't do this using a FunctionQuery, how could I do it?
I think this might be possible:
In your index you can have separate fields holding the prices for the different seasons. In your query, you need to multiply the price during one season by the number of rental days that fall in that season, and add that to the price during the other season multiplied by the number of days in that season. Yes, you'll have to use a FunctionQuery. Working out how many days fall into which season is something you'll probably have to do on the client that calls the Solr web service. Furthermore, you can try applying a filter query to the result to pick out the amounts the user is willing to pay.
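The client-side arithmetic can be sketched as below. This is not Solr code and does not use any Solr API; it only illustrates the sum of (price in season × days in season) that the answer describes, using the rates from the question. The names DAILY_RATE_BY_MONTH, rental_cost and within_budget are made up for the example.

```python
from datetime import date, timedelta

# Rates from the question: $90/day by default, $110 in December and March,
# $130 in January and February.
DEFAULT_RATE = 90
DAILY_RATE_BY_MONTH = {12: 110, 3: 110, 1: 130, 2: 130}

def rental_cost(start: date, days: int) -> int:
    """Total cost of a rental, summing the daily rate for each rented day."""
    return sum(
        DAILY_RATE_BY_MONTH.get((start + timedelta(days=i)).month, DEFAULT_RATE)
        for i in range(days)
    )

def within_budget(start: date, days: int, max_per_day: int) -> bool:
    """True if the average daily cost over the rental is at most max_per_day."""
    return rental_cost(start, days) <= max_per_day * days

# One week in February followed by two weeks in March (2012 is a leap year,
# so February 23 to 29 is seven days).
cost = rental_cost(date(2012, 2, 23), 21)
```

For that rental the cost is 7 × 130 + 14 × 110 = 2450, so the car matches a budget of $120/day on average but not one of $110/day.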

How to store timetables?

I'm working on a project that must store employees' timetables. For example, one employee works Monday through Thursday from 8am to 2pm and from 4pm to 8pm. Another employee may work Tuesday through Saturday from 6am to 3pm.
I'm looking for an algorithm or a method to store this kind of data in a MySQL database. The data will rarely be accessed, so performance is not a concern.
I've thought of storing it as a string, but I don't know any algorithm to "encode" and "decode" this string.
As many of the comments indicate, it's usually a poor idea to encode all the data into a string that is basically meaningless to the data base. It's usually better to define the data elements and their relations and represent these structures in the data base. The Wikipedia article on data models is a good overview of what's involved (although it's way more general than what you need). The problem you are describing seems simple enough that you could do this with pencil and paper.
One way to start is to write down a list of logical relationships between the concepts in your problem. For instance, the list might look like this (your rules may be different):
Every employee follows a single schedule.
Every employee has a first and last name, as well as an employee ID. Different employees may have the same name, but each employee's ID is unique to that employee.
A schedule has a start and stop day of the week and a start and stop time of day.
The start and stop time is the same for every day of the schedule.
Several employees may be on the same schedule.
From this, you can list the nouns used in the rules. These are candidates for entities (columns) in the data base:
Employee
Employee ID
Employee first name
Employee last name
Schedule
Schedule start day
Schedule start time
Schedule end day
Schedule end time
For the rules I listed, schedules seem to exist independently of employees. Since there needs be a way of identifying which schedule an employee follows, it makes sense to add one more entity:
Schedule ID
If you then look at the verbs in the rules ("follows", "has", etc.), you start to get a handle on the relationships. I would group everything so far into two relationships:
Employees
ID
first_name
last_name
schedule_ID
Schedules
ID
start_day
start_time
end_day
end_time
That seems to be all that's needed by way of data structures. (A reasonable alternative to start_day and end_day for the Schedules table would be a boolean field for each day of the week.) The next step is to design the indexes. This is driven by the queries you expect to make. You might expect to look up the following:
What schedule is employee with ID=xyz following?
Who is at work on Mondays at noon?
What days have nobody at work?
Since employees and schedules are uniquely identified by their respective IDs, these should be the primary fields of their respective tables. You also probably want to have consistency rules for the data. (For instance, you don't want an employee on a schedule that isn't defined.) This can be handled by defining a "foreign key" relationship between the Employees.schedule_ID field and the Schedules.ID field, which means that Employees.schedule_ID should be indexed. However, since employees can share the same schedule, it should not be a unique index.
If you need to look up schedules by day of week and time of day, those might also be worth indexing. Finally, if you want to look up employees by name, those fields should perhaps be indexed as well.
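The two tables and the "who is at work on Mondays at noon?" lookup can be sketched as follows. The sketch uses Python's sqlite3 module rather than MySQL, encodes days as integers (0 = Monday through 6 = Sunday) and times as 'HH:MM' text, and follows the simplified rules listed above (one schedule per employee, one time range per schedule). Those encodings, the sample rows and the at_work function are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE schedules (
    id INTEGER PRIMARY KEY,
    start_day INTEGER NOT NULL,    -- 0 = Monday ... 6 = Sunday
    end_day INTEGER NOT NULL,
    start_time TEXT NOT NULL,      -- 'HH:MM', compares correctly as text
    end_time TEXT NOT NULL
);
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    schedule_id INTEGER NOT NULL REFERENCES schedules(id)
);
-- Non-unique index: several employees may share one schedule.
CREATE INDEX idx_employees_schedule ON employees(schedule_id);

INSERT INTO schedules VALUES (1, 1, 5, '06:00', '15:00');  -- Tuesday-Saturday
INSERT INTO schedules VALUES (2, 0, 3, '08:00', '14:00');  -- Monday-Thursday
INSERT INTO employees VALUES (1, 'Ana', 'Lopez', 1);
INSERT INTO employees VALUES (2, 'Ben', 'Smith', 2);
""")

def at_work(conn, day, time):
    """First names of employees whose schedule covers the given day and time."""
    rows = conn.execute(
        "SELECT e.first_name FROM employees e "
        "JOIN schedules s ON s.id = e.schedule_id "
        "WHERE s.start_day <= ? AND s.end_day >= ? "
        "AND s.start_time <= ? AND s.end_time > ? "
        "ORDER BY e.id",
        (day, day, time, time),
    ).fetchall()
    return [row[0] for row in rows]
```

Split shifts such as the 8am to 2pm and 4pm to 8pm example in the question do not fit this simplified schedule row; they would need something like several time ranges per schedule.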
Assuming you're using PHP:
Store the timetable in a PHP array, then use the serialize function to turn it into a string; to get the array back, use unserialize.
However, this form of storage is almost never a good idea.

Resources