I'd like to get your feedback on my current strategy for updating my office hours scheduling system. I'm rewriting it to avoid laborious entry by allowing employees to have a set schedule rather than individually inputting office hours. Is there an easier or more standard way to do this than what I am planning below?
I am using a simple calendar setup that receives events in an array from an events table and then displays them in a calendar. I want to rework the way events are added into the events table.
Currently: Employees can select a day, and then add individual office hours (Monday 9:00, 10:00, 11:00, 12:00, 1:00, 2:00, etc.)
Goal: Employees select a set schedule (ex. Mon-Wed-Fri, from 8:00-5:00PST), and the system automatically adds future dates into the events table for display (one month in advance).
Current plan: Add a schedules table with employee ID, and fields for Monday through Sunday. A CRON job runs daily and checks what day of the week falls 28 days from now. It then queries the schedules to find all employees who have hours scheduled on that day of the week, and adds events into the events table 4 weeks in advance.
Is this a satisfactory way of doing it? Thanks! I'm new to coding, so your feedback before I spend a lot of time implementing is helpful!
If their schedules are consistent, it makes good sense for each employee to have a default schedule. And it makes sense to have a cron job update the calendar automatically.
But you might give some thought to what should happen when someone wants to schedule vacation time (or make any arbitrary change to their schedule) more than a month into the future. The same issue arises when the owner wants to see who's scheduled to work over the summer and finds a blank calendar. (The point is that some people will invariably need to know something that the calendar doesn't yet cover. You need to plan for that.)
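The cron step from the question can be sketched roughly as follows. This is only an illustration in Python: the `SCHEDULES` dictionary stands in for the proposed schedules table, and the function names and tuple layout are assumptions rather than part of the original system. The `backfill` helper is one possible way to fill the whole window when a schedule is first saved, so the calendar isn't empty for the first four weeks.

```python
from datetime import date, timedelta

# Hypothetical default schedules: employee id -> {weekday: (start, end)},
# where weekday follows date.weekday() (Monday = 0 ... Sunday = 6).
SCHEDULES = {
    1: {0: ("08:00", "17:00"), 2: ("08:00", "17:00"), 4: ("08:00", "17:00")},
    2: {1: ("09:00", "12:00")},
}


def events_for(target, schedules):
    """Return the event rows that the default schedules imply for one date."""
    rows = []
    for employee_id, week in sorted(schedules.items()):
        hours = week.get(target.weekday())
        if hours is not None:
            start, end = hours
            rows.append((employee_id, target.isoformat(), start, end))
    return rows


def daily_job(today, schedules, lookahead_days=28):
    """Rows the daily cron run would add: events for today + lookahead_days."""
    return events_for(today + timedelta(days=lookahead_days), schedules)


def backfill(today, schedules, lookahead_days=28):
    """One-off run covering every day from today through the lookahead date."""
    rows = []
    for offset in range(lookahead_days + 1):
        rows.extend(events_for(today + timedelta(days=offset), schedules))
    return rows
```

In a real system the returned rows would be inserted into the events table; making that insert idempotent (for example with a unique key on employee and date) would let the job be re-run safely.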
Related
My scenario is that I have a list of different companies. I need to find a date within the month to upgrade them onto a new system. However, for each company, the assigned date must meet certain criteria.
Example
I want to populate the Upgrade date column with an ideal date.
The upgrade date cannot fall between the Last Old Input date and the Last Old Pay date.
The upgrade date must be between the Last Old Pay date and the First New Input date.
Complications:
1. All companies that have the same value under Parent column must have the same upgrade date.
2. There are ~4 weeks in a month. I want the upgrade dates to be spread as evenly as possible throughout the weeks, so as close as possible to 25% of all companies per week. Therefore, if there is more than one viable upgrade date, try to fill out the month as evenly as possible. (If COUNT = 1/4 of total companies, go to the next week, that type of deal.)
3. Within the week itself, I want the upgrade date to be the same day if possible. Example: all on Monday of that week, all on Friday of that week.
So to start:
Is there a way to create an array formula that can read each company one by one and produce a list of possible dates for that company? So for company 112, I know that 8/27-8/30 are blacked out. I know that the date should be between 8/30 and 9/10. How can I get Excel to auto-populate all the options (8/31, 9/1, 9/2, 9/3, etc.)?
And then be able to group all companies together that are under the same Parent, and find a common upgrade date that works for all of the companies combined?
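To make the two steps concrete, here is a rough sketch of the logic in Python rather than as an Excel formula. It assumes both bounds are exclusive (matching the 8/31 start in the example above), and the field names, the second company, and the year 2021 are made up for illustration since the question gives none of them.

```python
from datetime import date, timedelta


def candidate_dates(last_old_pay, first_new_input):
    """All dates strictly between the two bounds (bounds assumed exclusive)."""
    span = (first_new_input - last_old_pay).days
    return {last_old_pay + timedelta(days=n) for n in range(1, span)}


def common_dates_by_parent(companies):
    """Intersect the candidate dates of all companies sharing a Parent."""
    common = {}
    for company in companies:
        dates = candidate_dates(company["last_old_pay"], company["first_new_input"])
        parent = company["parent"]
        if parent in common:
            common[parent] &= dates  # keep only dates every sibling allows
        else:
            common[parent] = dates
    return {parent: sorted(dates) for parent, dates in common.items()}


# Hypothetical rows: company 112 from the question plus an invented sibling.
COMPANIES = [
    {"company": 112, "parent": "A",
     "last_old_pay": date(2021, 8, 30), "first_new_input": date(2021, 9, 10)},
    {"company": 113, "parent": "A",
     "last_old_pay": date(2021, 9, 2), "first_new_input": date(2021, 9, 8)},
]

options = common_dates_by_parent(COMPANIES)
```

Spreading the parents evenly across weeks would then be a separate step that picks one date per parent from `options`.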
This isn't exactly a programming question, as I don't have an issue writing the code, but a database design question. I need to create an app that tracks sales goals vs. actual sales over time. The thing is that a person's goal can change (let's say daily at most).
Also, a location can have multiple agents with different goals that need to be added together for the location.
I've considered basically running a timed task to save daily goals per agent into a field. It seems that over the years that will be a lot of data, but it would let me simply query a date range and add all the daily goals up to get a goal for that date range.
Otherwise, I guess I could simply write changes (i.e. March 2nd - 15 sales / week, April 12th, 16 sales per week) which would be less data, but much more programming work to figure out goals based on a time query.
I'm assuming there is probably a best practice for this - anyone?
Put a date range on your goals. The start of the range is when you set that goal. The end of the range starts off as max-collating date (often 9999-12-31, depending on your database).
Treat this as "until forever" or "until further notice".
When you want to know what goals were in effect as of a particular date, you would have something like this in your WHERE clause:
...
WHERE effective_date <= #AsOfDate
AND expiry_date > #AsOfDate
...
When you change a goal, you need two operations. First, update the existing record (if it exists) and set its expiry_date to the new as-of date. Then insert a new record with an effective_date of the new as-of date and an expiry_date of forever (e.g. '9999-12-31').
This gives you the following benefits:
Minimum number of rows
No scheduled processes to take daily snapshots
Easy retrieval of effective records as of a point in time
Ready-made audit log of changes
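As a rough illustration of the update-then-insert pattern, here is a minimal sketch using Python's built-in sqlite3 module. The `goals` table, its `agent_id` and `weekly_goal` columns, the helper names, and the year 2024 are assumptions for the example; dates are stored as ISO strings so that plain string comparison orders them correctly.

```python
import sqlite3

FOREVER = "9999-12-31"  # "until further notice" marker

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE goals (
           agent_id       INTEGER NOT NULL,
           weekly_goal    INTEGER NOT NULL,
           effective_date TEXT NOT NULL,
           expiry_date    TEXT NOT NULL
       )"""
)


def set_goal(conn, agent_id, weekly_goal, as_of):
    """Close the agent's open goal row (if any), then open a new one."""
    with conn:  # both statements commit or roll back together
        conn.execute(
            "UPDATE goals SET expiry_date = ? "
            "WHERE agent_id = ? AND expiry_date = ?",
            (as_of, agent_id, FOREVER),
        )
        conn.execute(
            "INSERT INTO goals VALUES (?, ?, ?, ?)",
            (agent_id, weekly_goal, as_of, FOREVER),
        )


def goal_as_of(conn, agent_id, as_of):
    """Return the goal in effect on the given date, or None."""
    row = conn.execute(
        "SELECT weekly_goal FROM goals "
        "WHERE agent_id = ? AND effective_date <= ? AND expiry_date > ?",
        (agent_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


set_goal(conn, 1, 15, "2024-03-02")  # 15 sales / week from March 2nd
set_goal(conn, 1, 16, "2024-04-12")  # 16 sales / week from April 12th
```

Summing goals for a location would then be a join or GROUP BY over the rows that are in effect on each date in the range.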
tl;dr general question about handling database data and design:
Is it ever acceptable/are there any downsides to derive data from other data at some point in time, and then store that derived data into a separate table in order to keep a history of values at that certain time, OR, should you never store data that is derived from other data, and instead derive the required data from the existing data only when you need it?
My specific scenario:
We have a database where we record people's vacation days and vacation day statuses. We track how many days they have left, how many days they've taken, and things like that.
One design requirement has changed and now asks that I be able to show how many days a person had left on December 31st of any given year. So I need to be able to say, "Bob had 14 days left on December 31st, 2010".
We could do this two ways I see:
1. A SQL Server Agent job that, on December 31st, captures the days remaining for everyone at that time, and inserts them into a table like "YearEndHistories", which would have your EmployeeID, Year, and DaysRemaining at that time.
2. We don't keep a YearEndHistories table, but instead if we want to find out the amount of days possessed at a certain time, we loop through all vacations added and subtracted that exist UP TO that specific time.
I like the feeling of certainty that comes with #1 --- the recorded values would be reviewed by administration, and there would be no arguing about that number or possibility of it changing. With #2, I like the efficiency --- one less table to maintain, and there's no derived data present in the actual tables. But I have a weird fear about some unseen bug slipping by and people's historical value calculations starting to get screwed up or something. In 2020 I don't want to deal with, "I ended 2012 with 9.5 days, not 9.0! Where did my half day go?!"
One thing we have decided on is that it will not be possible to modify values in previous years. That means it will never be possible to go back to the previous calendar year and add a vacation day or anything like that. The value at the end of the year is THE value, regardless of whether or not there was a mistake in the past. If a mistake is discovered, it will be balanced out by rewarding or subtracting vacation time in the current year.
Yes, it is acceptable, especially if the calculation is complex or frequently called, or doesn't change very often (eg: A high score table in a game - it's viewed very often, but the content only changes on the increasingly rare occasions when a player does very well).
As a general rule, I would normalise the data as far as possible, then add in derived fields or tables where necessary for performance reasons.
In your situation, the calculation seems relatively simple - a sum of employee vacation days granted - days taken, but that's up to you.
As an aside, I would encourage you to get out of thinking about "loops" when data is concerned - try to think about the data as a whole, as a set. Something like
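SELECT StaffID, SUM(Vacation) AS DaysRemaining
FROM
(
    -- days granted up to the cutoff date
    SELECT StaffID, SUM(VacationAllocated) AS Vacation
    FROM Allocations
    WHERE AllocationDate <= CONVERT(datetime, '2010-12-31', 120)
    GROUP BY StaffID
    UNION ALL  -- plain UNION would drop a row if both halves happened to match
    -- days taken up to the cutoff date, as a negative number
    SELECT StaffID, -COUNT(DISTINCT HolidayDate)
    FROM HolidayTaken
    WHERE HolidayDate <= CONVERT(datetime, '2010-12-31', 120)
    GROUP BY StaffID
) totals
GROUP BY StaffID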
Derived data seems to me like a transitive dependency, which is avoided in normalisation.
That's the general rule.
In your case I would go for #1, which gives you a better "auditability", without performance penalty.
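Option #1 can be sketched as a set-based query whose result is written into the history table once, at year end. The example below is only an illustration using Python's sqlite3 module instead of a SQL Server Agent job: the table and column names follow the question and the earlier query, while the sample rows, the ISO-string date storage, and the `snapshot_year_end` helper are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Allocations (StaffID INTEGER, VacationAllocated REAL, AllocationDate TEXT);
CREATE TABLE HolidayTaken (StaffID INTEGER, HolidayDate TEXT);
CREATE TABLE YearEndHistories (
    EmployeeID INTEGER, Year INTEGER, DaysRemaining REAL,
    PRIMARY KEY (EmployeeID, Year)  -- one frozen value per person per year
);
-- Made-up sample data
INSERT INTO Allocations VALUES (1, 20, '2010-01-01'), (2, 15, '2010-01-01');
INSERT INTO HolidayTaken VALUES
    (1, '2010-07-05'), (1, '2010-07-06'), (2, '2010-08-02'), (1, '2011-01-10');
""")


def snapshot_year_end(conn, year):
    """Derive each person's balance as of Dec 31 and store it permanently."""
    cutoff = f"{year}-12-31"
    with conn:
        conn.execute(
            """
            INSERT INTO YearEndHistories (EmployeeID, Year, DaysRemaining)
            SELECT StaffID, ?, SUM(Vacation)
            FROM (
                SELECT StaffID, SUM(VacationAllocated) AS Vacation
                FROM Allocations WHERE AllocationDate <= ?
                GROUP BY StaffID
                UNION ALL
                SELECT StaffID, -COUNT(DISTINCT HolidayDate)
                FROM HolidayTaken WHERE HolidayDate <= ?
                GROUP BY StaffID
            ) AS totals
            GROUP BY StaffID
            """,
            (year, cutoff, cutoff),
        )


snapshot_year_end(conn, 2010)
rows = conn.execute(
    "SELECT EmployeeID, Year, DaysRemaining FROM YearEndHistories ORDER BY EmployeeID"
).fetchall()
```

Because of the primary key, running the snapshot twice for the same year fails instead of silently overwriting the reviewed values, which fits the rule that past years are never modified.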
I'm currently designing a database that:
1) Has a list of Tasks, such as:
Clean the floor.
Wipe the sink.
Take swabs.
2) Has a list of Areas, such as:
Kitchen.
Servery.
3) Tasks are scheduled against an area, either as "Hourly", "Daily", "Weekly", "Monthly" or "Annually". I'll call this AreaTask (Area, Task, Frequency) :-
Kitchen, Clean the floor, Daily
4) An AreaTask will become due either at the start of a working day (if it is Daily, Weekly, Monthly or Annually), or at the start of the hour if it is Hourly - based on the schedule. For example, if "Clean the floor" is scheduled "Weekly" on Wednesdays, then at the start of each Wednesday it will become Due, and remain Due for the day until it has been done (Worked, Signed off, etc) - or it will become OverDue if it goes beyond a certain time.
5) When work is done against an AreaTask, it is logged in the database (Area, Task, User [who did the work], DateTime [that the work was done]) : -
Kitchen, Clean the floor, Joe Bloggs, 2012-05-23 10:50:00
Here is what I'm trying to decide:
I can determine the various states of an AreaTask at any particular time by queries alone because all of the data is there (i.e. I can determine that an AreaTask will have become Due on Wednesday, and I can determine that it became overdue if no Work was done against that AreaTask before a set time). However, I'm wondering if instead I should have an AreaTaskDue table that is populated perhaps by a CRON job, or some other means.
This way I have a formal entry in the database to query and store data against, for example:
ScheduledTask (Area, Task, ScheduledDateTime)
Kitchen, Clean the floor, 2012-05-23 06:00:00
This would also allow a task to be scheduled manually should the need arise.
Then when work is done against a ScheduledTask, it can be logged against the ScheduledTask itself:
ScheduledTaskWork (Area,Task,ScheduledDateTime,User,DateTime)
Kitchen, Clean the floor, 2012-05-23 06:00:00, Joe Bloggs, 2012-05-23 11:30:00
I hope that makes some sense.
PS this is for a RDBMS based database - not OO. I tend to use Views to see data from different perspectives.
Thanks.
PS perhaps too the CRON job would mark ScheduledTask as OverDue too rather than determining that. I guess the question is about whether these formal states should be stored in the database, or determined. The only way I can store them is to have some kind of CRON job running (which is fine, as long as I know there isn't a better way).
EDIT: One argument against deriving the state is that the Schedule may change - however I do keep history in the database, so I could still derive - but the more I think about it the more I'm leaning towards using a CRON job to schedule tasks based on the schedule.
Take a look at this model:
Every time a task is started, insert a new row into WORK table. When it's finished, set WORK.COMPLETED_AT.
You can find daily tasks (and their areas) that have not yet been done today like this:
SELECT *
FROM SCHEDULE
WHERE
FREQUENCY = 'daily'
AND NOT EXISTS (
SELECT * FROM WORK
WHERE
SCHEDULE.AREA_ID = WORK.AREA_ID
AND SCHEDULE.TASK_ID = WORK.TASK_ID
AND DAY(COMPLETED_AT) = TODAY
)
Replace DAY and TODAY with whatever is specific to your database, and you'd probably want to use integers instead of strings for FREQUENCY.
Similar queries can be devised for other frequencies.
Manually scheduled tasks could be modeled through a table similar to SCHEDULE, but with FREQUENCY replaced by explicit time(s).
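For a concrete version of the "not yet done today" query, here is a small sketch using Python's sqlite3 module. The column lists for SCHEDULE and WORK are guesses (only AREA_ID, TASK_ID, FREQUENCY and COMPLETED_AT appear in the query above), the sample rows are invented, and SQLite's `date()` function plays the role of the database-specific DAY/TODAY placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed minimal shapes for the two tables
CREATE TABLE SCHEDULE (AREA_ID INTEGER, TASK_ID INTEGER, FREQUENCY TEXT);
CREATE TABLE WORK (
    AREA_ID INTEGER, TASK_ID INTEGER, USER_NAME TEXT,
    STARTED_AT TEXT, COMPLETED_AT TEXT
);
INSERT INTO SCHEDULE VALUES (1, 1, 'daily'), (1, 2, 'daily'), (2, 3, 'weekly');
INSERT INTO WORK VALUES
    (1, 1, 'Joe Bloggs', '2012-05-23 10:30:00', '2012-05-23 10:50:00'),
    (1, 2, 'Joe Bloggs', '2012-05-22 09:00:00', '2012-05-22 09:20:00');
""")


def daily_tasks_not_done(conn, today):
    """Daily (area, task) pairs with no WORK row completed on the given day."""
    return conn.execute(
        """
        SELECT AREA_ID, TASK_ID
        FROM SCHEDULE
        WHERE FREQUENCY = 'daily'
          AND NOT EXISTS (
              SELECT 1 FROM WORK
              WHERE WORK.AREA_ID = SCHEDULE.AREA_ID
                AND WORK.TASK_ID = SCHEDULE.TASK_ID
                AND date(WORK.COMPLETED_AT) = ?
          )
        ORDER BY AREA_ID, TASK_ID
        """,
        (today,),
    ).fetchall()
```

A WORK row that has been started but not completed has a NULL COMPLETED_AT, so it never matches the date comparison and the task still shows up as outstanding.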
I was recently asked an interview question on a hypothetical web based booking system and how I would design the database schema to minimize duplication and maximize flexibility.
The use case is that an admin would enter the availability of a property into the system. There could be multiple time periods set, for example 1st of April 2009 to 14th of April 2009 and 3rd of July 2009 to 21st of July 2009.
A user is then only able to place a booking that falls within one of the available periods, for the same length or shorter.
How would you store this information in a database?
Would you use something as simple (really simplified) as;
AVAILABILITY(property_id, start_date, end_date);
BOOKING(property_id, start_date, end_date);
Could you then easily construct a web page that showed a calendar of availability with periods that have been booked blanked out? Would it be easy to build reports from this database schema? Is it as easy as it seems?
It might be easier to work with a single table for both availability and booking, with a granularity of 1 day:
property_date (property_id, date, status);
Column status would have (at least) the following 2 values:
Available
Booked
Entering a period of availability e.g. 1st to 14th of April would entail (the application) inserting 14 rows into property_date each with a status of 'Available'. (To the user it should seem like a single action).
Booking the property for the period 3rd to 11th April would entail checking that an 'Available' row existed for each day, and changing the status to 'Booked'.
This model may seem a bit "verbose", but it has some advantages:
Checking availability for any date is easy
Adding a booking automatically updates the availability; there isn't a separate Availability table to keep in sync.
Showing availability in a web page would be very simple
It is easy to add new statuses to record different types of unavailability - e.g. closed for maintenance.
NB If "available" is the most common state of a property, it may be better to reverse the logic so that there is an 'Unavailable' status, and the absence of a row for a date means it is available.
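The one-row-per-day model can be sketched like this with Python's sqlite3 module. The `property_date` table and its two statuses come from the description above; the helper names, the primary key, and the choice to reject a booking unless every requested day is still 'Available' are assumptions made for the illustration.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE property_date (
           property_id INTEGER NOT NULL,
           date        TEXT NOT NULL,
           status      TEXT NOT NULL,
           PRIMARY KEY (property_id, date)
       )"""
)


def days(start, end):
    """ISO strings for every day from start to end, inclusive."""
    return [(start + timedelta(days=n)).isoformat()
            for n in range((end - start).days + 1)]


def add_availability(conn, property_id, start, end):
    """Insert one 'Available' row per day; a single action for the admin."""
    with conn:
        conn.executemany(
            "INSERT INTO property_date VALUES (?, ?, 'Available')",
            [(property_id, d) for d in days(start, end)],
        )


def book(conn, property_id, start, end):
    """Flip every day in the range to 'Booked', or change nothing at all."""
    wanted = days(start, end)
    with conn:  # an exception inside this block rolls the update back
        cur = conn.execute(
            "UPDATE property_date SET status = 'Booked' "
            "WHERE property_id = ? AND date BETWEEN ? AND ? "
            "AND status = 'Available'",
            (property_id, wanted[0], wanted[-1]),
        )
        if cur.rowcount != len(wanted):
            raise ValueError("not every requested day is available")
    return True


add_availability(conn, 1, date(2009, 4, 1), date(2009, 4, 14))
book(conn, 1, date(2009, 4, 3), date(2009, 4, 11))
```

Doing the availability check and the status change in one UPDATE inside a transaction avoids a gap between "check" and "book" in which another booking could slip in.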