Design question: How would you design a recurring event system? [closed] - calendar

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
If you were tasked to build an event scheduling system that supported recurring events, how would you do it? How do you handle the removal of a recurring event? How could you see when the future events will happen?
For example, when creating an event, you could pick "repeating daily" (or weekly, yearly, etc).
One design per response please. I'm used to Ruby/Rails, but use whatever you want to express the design.
I was asked this at an interview, and couldn't come up with a really good response that I liked.
Note: was already asked/answered here. But I was hoping to get some more practical details, as detailed below:
If it was necessary to be able to comment or otherwise add data to just one instance of the recurring event, how would that work?
How would event changes and deletions work?
How do you calculate when future events happen?

I started by implementing temporal expressions as outlined by Martin Fowler. This takes care of figuring out when a scheduled item should actually occur. It is a very elegant way of doing it. What I ended up with simply builds on what is in the article.
The next problem was figuring out how in the world to store the expressions. The other issue is when you read out the expression, how do those fit into a not so dynamic user interface? There was talk of just serializing the expressions into a BLOB, but it would be difficult to walk the expression tree to know what was meant by it.
The solution (in my case) is to store parameters that fit the limited number of cases the User Interface will support, and from there, use that information to generate the Temporal Expressions on the fly (could serialize when created for optimization). So, the Schedule class ends up having several parameters like offset, start date, end date, day of week, and so on... and from that you can generate the Temporal Expressions to do the hard work.
As for having instances of the tasks, there is a 'service' that generates tasks for N days. Since this is an integration to an existing system and all instances are needed, this makes sense. However, an API like this can easily be used to project the recurrences without storing all instances.
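To make the "store a few UI-level parameters, generate the temporal expression on the fly" idea concrete, here is a minimal sketch. The `Schedule` class and its field names are my own assumptions for illustration, not the original author's code, and it only covers the daily / every-N-days / selected-weekdays cases.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterator, Optional, Set


@dataclass
class Schedule:
    """Hypothetical parameter set matching what a simple UI could offer."""
    start: date
    end: Optional[date] = None            # None means "no end date"
    weekdays: Optional[Set[int]] = None   # 0=Monday .. 6=Sunday; None means any day
    interval_days: int = 1                # "every N days", counted from start

    def includes(self, day: date) -> bool:
        """Temporal-expression style test: does the schedule cover this day?"""
        if day < self.start or (self.end is not None and day > self.end):
            return False
        if (day - self.start).days % self.interval_days != 0:
            return False
        return self.weekdays is None or day.weekday() in self.weekdays

    def occurrences(self, window_start: date, window_end: date) -> Iterator[date]:
        """Project occurrences inside a window without storing them."""
        day = max(window_start, self.start)
        while day <= window_end:
            if self.includes(day):
                yield day
            day += timedelta(days=1)


# Mondays and Wednesdays, starting on Monday 2024-01-01, no end date.
weekly = Schedule(start=date(2024, 1, 1), weekdays={0, 2})
january = list(weekly.occurrences(date(2024, 1, 1), date(2024, 1, 14)))
```

Because `occurrences` is a generator over a bounded window, a schedule with no end date is not a problem: the caller always decides how far to look.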

I've had to do this before when I was managing the database end of the project. I requested that each event be stored as separate events. This allows you to remove just one occurrence or you could move a span. It's a lot easier to remove multiples than to try and modify a single occurrence and turn it into two. We were then able to make another table which simply had a recurrenceID which contained the information of the recurrence.

@Joe Van Dyk asked: "Could you look in the future and see when the upcoming events would be?"
If you wanted to see/display the next n occurrences of an event they would have to either a) be calculated in advance and stored somewhere or b) be calculated on the fly and displayed. This would be the same for any eventing framework.
The disadvantage with a) is that you have to put a limit on it somewhere and after that you have to use b). Easier just to use b) to begin with.
The scheduling system does not need this information, it just needs to know when the next event is.

When saving the event I would save the schedule to a store (let's call it "Schedules") and I'd calculate when the event was to fire the next time and save that as well, for instance in "Events". Then I'd look in "Events" and figure out when the next event was to take place and go to sleep until then.
When the app "wakes up" it would calculate when the event should take place again, store this in "Events" again and then perform the event.
Repeat.
If an event is created while sleeping the sleep is interrupted and recalculated.
If the app is starting or recovering from a sleep event or similar, check "Events" for passed events and act accordingly (depending on what you want to do with missed events).
Something like this would be flexible and would not take unnecessary CPU cycles.
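The sleep/wake cycle described above could be sketched roughly like this. It is only an illustration: the "Schedules" and "Events" stores are plain in-memory structures here, every schedule is assumed to be a fixed interval, and the function names are made up. A real app would call `time.sleep` (or wait on a condition that a new event can interrupt) for the number of seconds returned.

```python
import heapq
from datetime import datetime, timedelta

# "Schedules": event name -> repeat interval (assumed fixed-interval rules).
schedules = {"backup": timedelta(hours=24), "poll": timedelta(minutes=5)}

# "Events": min-heap of (next_fire_time, event_name).
events = []


def add_schedule(name, first_fire):
    heapq.heappush(events, (first_fire, name))


def seconds_until_next(now):
    """How long the app may sleep; None if nothing is scheduled."""
    if not events:
        return None
    return max(0.0, (events[0][0] - now).total_seconds())


def wake_up(now, perform):
    """Fire everything that is due, storing each event's next time first."""
    while events and events[0][0] <= now:
        fire_time, name = heapq.heappop(events)
        heapq.heappush(events, (fire_time + schedules[name], name))
        perform(name, fire_time)


start = datetime(2024, 1, 1, 0, 0)
add_schedule("backup", start + timedelta(hours=1))
add_schedule("poll", start + timedelta(minutes=5))
sleep_for = seconds_until_next(start)

# Simulate waking up late: both missed "poll" runs are performed.
fired = []
wake_up(start + timedelta(minutes=10), lambda name, when: fired.append((name, when)))
```

Here missed events are simply replayed one by one; skipping them or collapsing them into a single run is an equally valid policy.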

Off the top of my head (after revising a couple things while typing/thinking):
Determine the minimum recurrence-resolution needed; that's how often the app runs. Maybe it's daily, maybe every five minutes.
For each recurring event, store the most recent run time, the run-interval and other goodies like expiration time if that's desirable.
Every time the app runs, it checks all events, comparing (today/now + recurrenceResolution) to (recentRunTime + runInterval) and if they coincide, fire the event.
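A rough sketch of that polling check, assuming a five-minute resolution and events kept as plain dicts (both are illustrative choices, as is the exact "falls before the next tick" comparison):

```python
from datetime import datetime, timedelta

RESOLUTION = timedelta(minutes=5)  # assumed: how often the checker itself runs


def due_events(events, now):
    """Return names of events whose next run falls before the next tick."""
    due = []
    for ev in events:
        if ev.get("expires") is not None and now >= ev["expires"]:
            continue  # expired events never fire again
        next_run = ev["last_run"] + ev["interval"]
        if next_run < now + RESOLUTION:
            due.append(ev["name"])
            ev["last_run"] = next_run  # record the run for the following cycle
    return due


events = [
    {"name": "digest", "last_run": datetime(2024, 1, 1, 9, 0), "interval": timedelta(days=1)},
    {"name": "ping", "last_run": datetime(2024, 1, 2, 8, 58), "interval": timedelta(minutes=5)},
    {"name": "report", "last_run": datetime(2024, 1, 1, 9, 0), "interval": timedelta(days=7)},
]
due = due_events(events, datetime(2024, 1, 2, 9, 0))
```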

When I wrote a calendar app for myself mumble years ago, I basically just stole the scheduling mechanism from cron and used that for recurring events. For example, something taking place on the second Saturday of every month except January would include the instruction "repeat=* 2-12 8-14 6" (every year, months 2-12, the 2nd week runs from the 8th to the 14th, and 6 for Saturday because I used 0-based numbering for the days of the week).
While this makes it quite easy to determine whether the event occurs on any given date, it is not capable of handling "every N days" recurrence and is also rather less than intuitive for users who aren't unix-savvy.
To deal with unique data for individual event instances and removal/rescheduling, I just kept track of how far out events had been calculated for and stored the resulting events in the database, where they could then be modified, moved, or deleted without affecting the original recurrent event information. When a new recurring event was added, all instances were immediately calculated out until the existing "last calculated" date.
I make no claim that this is the best way to do it, but it is a way, and one which works quite well within the limitations I mentioned earlier.
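For illustration, checking a date against a rule like "* 2-12 8-14 6" might look something like the sketch below. The field order (year, month, day of month, day of week with 0 = Sunday) follows the description above; the helper names are made up, and all four fields must match, which differs from how real cron combines day-of-month and day-of-week.

```python
from datetime import date


def field_matches(spec, value):
    """Match one field: '*', a single number, a range 'a-b', or a comma list."""
    if spec == "*":
        return True
    for part in spec.split(","):
        low, _, high = part.partition("-")
        if int(low) <= value <= int(high or low):
            return True
    return False


def occurs_on(rule, day):
    """rule has four fields: year, month, day of month, day of week (0 = Sunday)."""
    year, month, dom, dow = rule.split()
    sunday_based = (day.weekday() + 1) % 7  # Python counts Monday as 0
    return (field_matches(year, day.year)
            and field_matches(month, day.month)
            and field_matches(dom, day.day)
            and field_matches(dow, sunday_based))


RULE = "* 2-12 8-14 6"  # second Saturday of every month except January

second_saturday_feb = occurs_on(RULE, date(2024, 2, 10))   # a second Saturday
second_saturday_jan = occurs_on(RULE, date(2024, 1, 13))   # January is excluded
```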

If you have a simple recurring event, such as daily, weekly or a couple of days a week, what's wrong with using the built-in scheduler/cron/at functionality? Create an executable/console app and set up when to run it. No complicated calendar, event or time management.
:)
//W


Recurring Events Database Model

I've been searching for a solution for recurring events, and so far I've found two approaches:
First approach:
Create an instance for each event, so if the user has a daily event for one year, 365 rows would be needed in the table.
It sounds plausible for a fixed time frame, but how do you deal with events that have no end date?
Second approach:
Create a recurring pattern table that creates future events at runtime using some kind of temporal expression (Martin Fowler).
Is there any reason to not choose the first approach instead of the second one?
The first approach is going to overpopulate the database and maybe affect performance, right?!
There's a quote about the approach number 1 that says:
"Storing recurring events as individual rows is a recipe for disaster." (https://github.com/bmoeskau/Extensible/blob/master/recurrence-overview.md)
What do you guys think about it? I would like some insights on why that would be a disaster.
I appreciate your help.
The proper answer is really both, and not either or.
Setting aside for a moment the issue of no end date for recurrence: what you want is a header that contains recurrence rules for the whole pattern. That way if you need to change the pattern, you've captured that pattern in a single record that can be edited without risking update anomalies.
Now, joining against some kind of recurrence pattern in SQL is going to be a great big pain in the neck. Furthermore, what if your rules allow you to tweak (edit, or even delete) specific instances of this recurrence pattern?
How do you handle this? You have to create an instance table with one row per recurring instance with a link (foreign key) back to the single rule that was used to create it. This lets you modify an individual child without losing sight of where it came from in case you need to edit (or delete) the entire pattern.
Consider a calendaring tool like Outlook or Google Calendar. These applications use this approach. You can move or edit an instance. You can also change the whole series. The apps ask you which you mean to do whenever you go into an editing mode.
There are some limitations to this. For example, if you edit an instance and then edit the pattern, you need to have a rule that says either (a) new parent wins or (b) modified children always win. I think Outlook and Google Calendar use approach (a).
As for why having each instance recorded explicitly, the only disastrous thing I can think of would be that if you didn't have the link back to the original recurrence pattern you would have a heck of a time cancelling the whole series in one action.
Back to no end date - This might be a case of discretion being the better part of valour and using some kind of rule of thumb that imposes a practical limit on how far into the future you extend such a series - or alternatively you could just not allow that kind of rule in a pattern. Force an end to the pattern and let the rule's creator worry about extending it at whatever future point it becomes necessary.
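As a rough sketch of the header-plus-instances layout, here is what it might look like with SQLite driven from Python. The table and column names are invented for illustration, and the pattern encoding is left as an opaque string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for cascades
conn.executescript("""
CREATE TABLE recurrence_rule (
    id      INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    pattern TEXT NOT NULL          -- e.g. 'WEEKLY'; the encoding is up to you
);
CREATE TABLE event_instance (
    id          INTEGER PRIMARY KEY,
    rule_id     INTEGER REFERENCES recurrence_rule(id) ON DELETE CASCADE,
    starts_on   TEXT NOT NULL,
    note        TEXT,              -- per-instance data such as a comment
    is_modified INTEGER NOT NULL DEFAULT 0
);
""")

conn.execute("INSERT INTO recurrence_rule (id, title, pattern) VALUES (1, 'Standup', 'WEEKLY')")
conn.executemany(
    "INSERT INTO event_instance (rule_id, starts_on) VALUES (1, ?)",
    [("2024-01-01",), ("2024-01-08",), ("2024-01-15",)],
)

# Edit a single instance without touching the rule or its siblings.
conn.execute(
    "UPDATE event_instance SET starts_on = '2024-01-09', note = 'moved', is_modified = 1 "
    "WHERE rule_id = 1 AND starts_on = '2024-01-08'"
)
moved = conn.execute("SELECT starts_on FROM event_instance WHERE is_modified = 1").fetchall()

# Cancel the whole series in one action via the link back to the rule.
conn.execute("DELETE FROM recurrence_rule WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM event_instance").fetchone()[0]
```

The `is_modified` flag is one possible way to implement the "modified children always win" rule: when the pattern is edited, regenerate only the rows where the flag is 0.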
Store the calendar's event as a rule rather than just as a materialized event.
Storing recurring events materialized as rows is a recipe for disaster for the apparent reason that the materialization would ideally be of infinite length. Since a table of endless length is not possible, the developer will try to mimic that behavior using some clever, incomprehensible trick, resulting in erratic behavior of the application.
My suggestion: Store the rules and materialize them and add as rows, only when queried - leading to a hybrid approach.
So you will have two tables store your information, first for storing rules, second, for storing rows materialized from any rule in the rules' table.
The general guidelines can be:
For a one-time event, add a row to the second table.
For a recurring event, add a row to the first table and materialize some of its instances into the second table.
For a query about a future date, materialize the rules and save them in the second table.
For a modification of a specific instance of a recurring event, materialize the event up till the instance you want to modify, and then modify the last instance and store it.
Further, if the event is too far in the future, do not materialize it. Instead save it as a rule also and execute it later when the time arrives.
Plain tables will not be enough to store what you are trying to save. Keeping this kind of information in the database is best maintained when supported with Stored Procedures for access and modifications.
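A minimal sketch of the "materialize only when queried" idea, using in-memory dicts in place of the two tables. The rule shape (a start date plus a fixed interval) and all names are assumptions; in a database this logic would live in the stored procedures mentioned above.

```python
from datetime import date, timedelta

# First "table": rules. Assumed shape: start date plus a fixed interval.
rules = {1: {"title": "Backup", "start": date(2024, 1, 1), "every": timedelta(days=7)}}

# Second "table": materialized rows keyed by (rule_id, original date).
instances = {}
# Last occurrence materialized so far, per rule.
materialized_until = {}


def query(rule_id, until):
    """Materialize a rule up to `until` on demand, then return its rows."""
    rule = rules[rule_id]
    last = materialized_until.get(rule_id)
    day = rule["start"] if last is None else last + rule["every"]
    while day <= until:
        # setdefault keeps rows that already exist, including edited ones
        instances.setdefault((rule_id, day), {"date": day, "title": rule["title"]})
        materialized_until[rule_id] = day
        day += rule["every"]
    return [row for (rid, d), row in sorted(instances.items(), key=lambda kv: kv[0])
            if rid == rule_id and d <= until]


first = query(1, date(2024, 1, 20))                              # creates 3 rows
instances[(1, date(2024, 1, 8))]["title"] = "Backup (to tape)"   # edit one instance
second = query(1, date(2024, 2, 1))                              # extends, keeps the edit
```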
From the answers in the blog post and the answers here, storing every occurrence as a row will:
1- consume DB storage and memory needlessly, with "no end date" as the extreme case
2- hurt performance (for queries, joins, updates, ...)
3- force you to update all rows whenever the recurrence has to be handled as a set rather than as individual occurrences (for example, on update)

CRISP-DM - Timing for each of the tasks?

I have what may be a simple question.
So, using CRISP-DM we have 6 tasks which have to be followed.
How to identify the amount of time needed for each of the tasks?
P.S. As assumption, for Data Collection we need 3 days.
This is what the question looks like.
There is no general rule.
Every project is very different.
For example, one project may already have all its data, and thus need 0 days to get the data.
Usually, there will be some manager preventing access to the data you need, and then it will take at least 6 months and C-level activity to get the data to you. And absolutely no progress will be possible before seeing the data.
So just plan 0-12 months on every step.
Also, don't forget that it is an iterative process, so you will need to restart again, anyway. In my opinion, CRISP-DM is dead. Business people love it because it gives them the impression of "managing" things, but it doesn't work that way in reality, it is just theater you do for the managers.

How to handle achievements/badges/awards for your APP with minimum hit to system?

I like the concept of badges and achievements for a website I am designing. They have been proven to improve usage/utilization rates and I think they could be a large motivator for an app I'd like to develop.
At a high level I can think of 3 ways to do this.
Check for members who meet requirements as a cron job: This doesn't seem like a good idea to me, as the membership grows, the cron job would take longer and longer to do.
Every time an action is completed that could meet the requirements for a badge, check to see if any badges should be awarded: This seems like a good way to do it, but it seems like I could potentially pound the server continuously checking on badges that have already been awarded or that the user may not even be close to.
Every time the user completes an action that could earn a badge, check to see if they already have it, then check if they meet the requirements: This seems alright as well, but if I'm storing the user as an object, it seems like it could get prohibitively large, or that I may end up hitting the database pretty hard checking for achievements all the time.
Are there any options I'm missing? Are my concerns for one or more approaches overblown?
Edit:
Is this a far less interesting question than I thought it was, or did I ask at a bad time? Did I leave something unclear?
Or combine two of your ideas:
Every time the user completes an action that could get a badge, put the user in a list (if he was not there already) and process this list frequently using cron.
This way you do not have to check each time the user completes an action and you can keep the cron job reasonable.
Of course there are variants: like processing the list when it reaches a certain amount. Or partially check the requirements before adding the user to the list.
I suppose this would depend on the amount of users, the available actions that can be completed, etc.
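Something along these lines, as a rough sketch. The badge names, thresholds and in-memory stores are all invented for illustration; in practice the pending list would be a table or queue and `process_pending` would be the cron job.

```python
# Users who did something badge-relevant since the last batch run.
pending_users = set()

# In-memory stand-ins for real tables.
post_counts = {}
awarded = {}  # user -> set of badge names

BADGES = {"first_post": 1, "regular": 10}  # badge -> required post count (made up)


def record_post(user):
    """Called on every action: cheap, no badge logic here."""
    post_counts[user] = post_counts.get(user, 0) + 1
    pending_users.add(user)


def process_pending():
    """Called from cron: only looks at users who were active."""
    newly_awarded = []
    while pending_users:
        user = pending_users.pop()
        have = awarded.setdefault(user, set())
        for badge, needed in BADGES.items():
            if badge not in have and post_counts[user] >= needed:
                have.add(badge)
                newly_awarded.append((user, badge))
    return sorted(newly_awarded)


for _ in range(10):
    record_post("ann")
record_post("bob")
first_run = process_pending()
second_run = process_pending()  # nobody was active in between
```

The cost of each cron run now scales with the number of recently active users rather than with total membership.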

How to calculate time left in a count down scenario

We have a requirement to count down while a user is taking a test. What would be the best way to track the time taken by a user while taking the test?
We do capture start time, end time. But the calculations go awry if the application server or the OS goes down during the test. We were thinking of using another variable to store the current time after the user submits an answer to the question. So (end time - current time) would reasonably account for the amount of time left.
Is there an effective way to calculate the "time left" in such cases other than the one mentioned above?
We would like the solution to be as database agnostic as possible.
To be specific, I'll continue with MYSQL.
As you stated, you have captured the start time. When the test is loaded by the user, write this timestamp to a DATETIME field. Another option is to use UNIX_TIMESTAMP. Then, when the user submits an answer, you can easily write the time to another DATETIME field.
Like other RDBMSs, MySQL has date-time manipulation functions.
SELECT CURRENT_TIMESTAMP(); query returns current timestamp. eg. '2007-12-15 23:50:26'
SELECT UNIX_TIMESTAMP(); query returns current unix timestamp which may be easy to calculate difference. eg. 1111885200
There are also DATE_ADD() and DATE_SUB() functions for addition and subtraction.
Please visit date-time manual page for details. I guess this information will lead you to a proper solution.
--
Added on Sep 18:
You may use javascript to track user behaviour. For instance, a function calls a server side script with a salt or something you have in session. That server side script records the current timestamp as "last update". Database parts same as above.
I have written such a set of exams using a countdown timer. As unfortunately there were times when power cuts happened frequently in the computer lab, I had to add code to handle this. Basically, the exam program saves its state (the answers and the time elapsed) in an .ini file every 30 seconds. When the exam program starts, it checks to see whether such an ini file exists - if so, it carries on from where the program ceased (in terms of which questions have yet to be answered and how long remains), otherwise the program begins anew.
In order to make the exams standalone and thus independent of any server, all the questions and options were exported to a resource file, which was then included in the build of the exam itself.
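The checkpoint-and-resume idea could look roughly like this. The file layout, the one-hour limit and the function names are assumptions for the sake of the example; the original program's format is not shown here.

```python
import configparser
import os
import tempfile

TIME_LIMIT = 3600  # assumed exam length in seconds


def save_state(path, elapsed, answers):
    """Write a checkpoint; call this periodically (e.g. every 30 seconds)."""
    cfg = configparser.ConfigParser()
    cfg["exam"] = {"elapsed": str(elapsed)}
    cfg["answers"] = {str(q): a for q, a in answers.items()}
    with open(path, "w") as fh:
        cfg.write(fh)


def load_state(path):
    """Return (elapsed, answers); a missing file means a fresh start."""
    if not os.path.exists(path):
        return 0, {}
    cfg = configparser.ConfigParser()
    cfg.read(path)
    answers = {int(q): a for q, a in cfg["answers"].items()}
    return cfg.getint("exam", "elapsed"), answers


def time_left(elapsed):
    return max(0, TIME_LIMIT - elapsed)


path = os.path.join(tempfile.mkdtemp(), "exam.ini")
fresh = load_state(path)                     # no checkpoint yet
save_state(path, 930, {1: "b", 2: "d"})      # checkpoint before a power cut
elapsed, answers = load_state(path)          # after the restart
remaining = time_left(elapsed)
```

Because only elapsed time is stored, downtime between the crash and the restart is not charged to the user; at most one checkpoint interval of work is lost.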

Best Practices: Storing a workflow state of an item in a database? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I have a question about best practices regarding how one should approach storing complex workflow states for processing tasks in a database. I've been looking online to no avail, so I figured I'd ask the community what they thought was best.
This question comes out of the same "BoxItem" example I gave in a prior question. This "BoxItem" is being tracked in my system as various tasks are performed on it. The task may take place over several days and with human interaction, so the state of the BoxItem must be persisted. Who did the task (if applicable), and when the task was done must also be tracked.
At first, I approached this by adding three fields to the "BoxItems" table for every human-interactive task that needed to be done:
IsTaskNameComplete
DateTaskNameComplete
UserTaskNameComplete
This worked when the workflow was simple... but now that it has grown to a complex process (> 10 possible human interactions in the flow... about half of which are optional, and may or may not be done for the BoxItem, which resulted in me beginning to add "DoTaskName" fields as well for those optional tasks), I've found that what should've been a simple table now has 40 or so fields devoted entirely to the retaining of this state information.
I find myself asking if there isn't a better way to do it... but I'm at a loss.
My first thought was to make a generic "BoxItemTasks" table which defined the tasks that may be done on a given box, but I still would need to save the Date and User information individually, so it didn't really help.
My second thought was that perhaps it didn't matter, and I shouldn't worry if this table has 40 or more fields devoted to state retaining... and maybe I'm just being paranoid. But it feels like that's a lot of information to retain.
Anyways, I'm at a loss as far as what a third option might be, or if one of the two options above is actually reasonable. I can see this workflow potentially getting even more complex in the future, and for each new task I'm going to need to add 3-4 fields just to support the tracking of it... it feels like it's spiraling out of control.
What would you do in this situation?
I should note that this is maintenance of an existing system, one that was built without an ORM, so I can't just leave it up to the ORM to take care of it.
EDIT:
Kev, are you talking about doing something like this:
BoxItems
    (PK) BoxItemID
    (Other irrelevant stuff)
BoxItemActions
    (PK) BoxItemID
    (PK) BoxItemTaskID
    IsCompleted
    DateCompleted
    UserCompleted
BoxItemTasks
    (PK) TaskType
    Description (if even necessary)
Hmm... that would work... it would represent a need to change how I currently approach doing SQL Queries to see which items are in what state, but in the long term something like this looks like it would work better (without having to make a fundamental design change like the Serialization idea represents... though if I had the time, I'd like to do it that way I think.).
So is this what you were mentioning Kin, or am I off on it?
EDIT: Ah, I see your idea as well with the "Last Action" to determine the current state... I like it! I think that might work for me... I might have to change it up a little bit (because at some point tasks happen concurrently), but the idea seems like a good one!
EDIT FINAL: So in summation, if anyone else is looking this up in the future with the same question... it sounds like the serialization approach would be useful if your system has the information pre-loaded into some interface where it's queryable (i.e. not directly calling the database itself, as the ad-hoc system I'm working on does), but if you don't have that, the additional tables idea seems like it should work well! Thank you all for your responses!
If I'm understanding correctly, I would add the BoxItemTasks table (just an enumeration table, right?), then a BoxItemActions table with foreign keys to BoxItems and to BoxItemTasks for what type of task it is. If you want to make it so that a particular task can only be performed once on a particular box item, just make the (Items + Tasks) pair of columns be the primary key of BoxItemActions.
(You laid it out much better than I did, and kudos for correctly interpreting what I was saying. What you wrote is exactly what I was picturing.)
As for determining the current state, you could write a trigger on BoxItemActions that updates a single column BoxItems.LastAction. For concurrent actions, your trigger could just have special cases to decide which action counts as the most recent.
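Here is a rough sketch of that layout with SQLite from Python, including the composite primary key and a LastAction trigger. Column types, the sample tasks and the trigger name are my own assumptions; IsCompleted is left out in this sketch because a row is only inserted once the task is done.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BoxItemTasks (TaskType INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE BoxItems (BoxItemID INTEGER PRIMARY KEY, LastAction INTEGER);
CREATE TABLE BoxItemActions (
    BoxItemID     INTEGER NOT NULL REFERENCES BoxItems(BoxItemID),
    BoxItemTaskID INTEGER NOT NULL REFERENCES BoxItemTasks(TaskType),
    DateCompleted TEXT NOT NULL,
    UserCompleted TEXT NOT NULL,
    PRIMARY KEY (BoxItemID, BoxItemTaskID)   -- each task at most once per box
);
CREATE TRIGGER trg_last_action AFTER INSERT ON BoxItemActions
BEGIN
    UPDATE BoxItems SET LastAction = NEW.BoxItemTaskID
    WHERE BoxItemID = NEW.BoxItemID;
END;
INSERT INTO BoxItemTasks VALUES (1, 'Weigh'), (2, 'Inspect');
INSERT INTO BoxItems (BoxItemID) VALUES (100);
""")

conn.execute("INSERT INTO BoxItemActions VALUES (100, 1, '2024-01-01', 'alice')")
conn.execute("INSERT INTO BoxItemActions VALUES (100, 2, '2024-01-02', 'bob')")
last_action = conn.execute(
    "SELECT LastAction FROM BoxItems WHERE BoxItemID = 100").fetchone()[0]

# The composite key rejects doing the same task twice on the same box.
try:
    conn.execute("INSERT INTO BoxItemActions VALUES (100, 2, '2024-01-03', 'carol')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Adding a new task type then means inserting a row into BoxItemTasks instead of adding three or four columns to BoxItems.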
As the previous answer suggested, I would break your table into several.
BoxItemActions, containing a list of actions that the work flow needs to go through, created each time a BoxItem is created. In this table, you can track the detailed dates \ times \ users of when each task was completed.
With this type of application, knowing where the Box is to go next can get quite tricky, so having a 'Map' of the remaining steps for the Box will prove quite helpful. As well, this table can grow like crazy, hundreds of rows per box, and it will still be very easy to query.
It also makes it possible to have 'different paths' that can easily be changed. A master data table of 'paths' through the work flow is one solution, where as each box is created, the user has to select which 'path' the box will follow. Or you could set it up so that when the user creates the box, they select which tasks are required for this particular box. It depends on your business problem.
How about a hybrid of the serialization and the database models? Have an XML document that serves as your master workflow document, containing a node for each step with attributes and elements that detail its name, order in the process, conditions for whether it's optional or not, etc. Most importantly, each step node can have a unique step id.
Then in your database you have a simple two table structure. The BoxItems table stores your basic BoxItem data. Then a BoxItemActions table much like in the solution you marked as the answer.
It's essentially similar to the solution accepted as the answer, but instead of a BoxItemTasks table to store the master list of tasks, you use an XML document that allows for some more flexibility for the actual workflow definition.
For what it's worth, in BizTalk they "dehydrate" long-running message patterns (workflows and the like) by binary serializing them to the database.
I think I would serialize the Workflow object to XML and store in the database with an ID column. It may be more difficult to report on, but it sounds like it may work in your case.
