I want to create a database to store process cycle time data. For example:
Suppose a particular process for a certain product, say welding, theoretically takes about 10 seconds (the process cycle time). Due to various issues, the machine's actual cycle time varies throughout the day. I would like to store the machine's actual cycle time throughout the day and analyze it over time (days, weeks, months). How would I go about designing the database for this?
I considered using a time series database, but I figured it isn't suitable: cycle time data has a start time and an end time, so basically I'm measuring time performance over time, if that even makes sense. At the same time, I was also worried that using a relational database to store and then display/analyze time-related data is inefficient.
Any thoughts on a good database structure would be greatly appreciated. Let me know if any more info is needed and I will gladly edit this question.
You are tracking the occurrence of an event. The event (weld) starts at some time and ends at some time. It might be tempting to model the event entity like so:
StationID StartTime StopTime
with each welding station having a unique identifier. Some sample data might look like this:
17 08:00:00 09:00:00
17 09:00:00 10:00:00
For simplicity, I've set the times to large values (1 hour each) and removed the date values. This tells you that welding station #17 started a weld at 8am and finished at 9am, at which time the second weld started which finished at 10am.
This seems simple enough. Notice, however, that the StopTime of the first entry matches the StartTime of the second entry. Of course it does, the end of one weld signals the start of the next weld. That's how the system was designed.
But this sets up what I call the Row Spanning Dependency antipattern: where the value of one field of a row must be synchronized with the value of another field in a different row.
This can create any number of problems. For example, what if the StartTime of the second entry showed '09:15:00'? Now we have a 15 minute gap between the end of the first weld and the start of the next. The system does not allow for gaps -- the end of each weld also starts the next weld. How should we interpret this gap? Is the StopTime of the first row wrong? Is the StartTime of the second row wrong? Are both wrong? Or was there another row between them that was somehow deleted? There is no way to tell which is the correct interpretation.
What if the StartTime of the second entry showed '08:45'? This is an overlap where the start of the second cycle supposedly started before the first cycle ended. Again, we can't know which row contains the erroneous data.
A row spanning dependency allows for gaps and overlaps, neither of which is allowed in the data. A large amount of database and application code would be required to prevent such a situation from ever occurring, and when it does occur (as assuredly it will) there is no way to determine which data is correct and which is wrong -- not from within the database, that is.
An easy solution is to do away with the StopTime field altogether:
StationID StartTime
17 08:00:00
17 09:00:00
Each entry signals the start of a weld. The end of the weld is indicated by the start of the next weld. This simplifies the data model, makes it impossible to have a gap or overlap, and more precisely matches the system we are modeling.
But we need the data from two rows to determine the length of a weld.
-- the earliest later start on the same station supplies the stop time
select w1.StartTime, w2.StartTime as StopTime
from Welds w1
join Welds w2
on w2.StationID = w1.StationID
and w2.StartTime =(
select Min( StartTime )
from Welds
where StationID = w1.StationID
and StartTime > w1.StartTime );
This may seem like a more complicated query than if the start and stop times were in the same row -- and, well, it is -- but think of all that checking code that no longer has to be written, and executed at every DML operation. And since the combination of StationID and StartTime would be the obvious PK, the query would use only indexed data.
There is one addition to suggest. What about the first weld of the day or after a break (like lunch) and the last weld of the day or before a break? We must make an effort not to include the break time as a cycle time. We could include the intelligence to detect such situations in the query, but that would increase the complexity even more.
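To try the start-only design out, here is a minimal sketch using Python's built-in sqlite3 module. The Welds table and its columns follow the answer; the in-memory database and the three sample rows are assumptions for illustration only. Each weld is paired with the earliest later start on the same station.

```python
import sqlite3

# In-memory database; sample rows are illustrative, not real measurements.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Welds ("
    " StationID INTEGER NOT NULL,"
    " StartTime TEXT NOT NULL,"
    " PRIMARY KEY (StationID, StartTime))"
)
conn.executemany(
    "INSERT INTO Welds VALUES (?, ?)",
    [(17, "08:00:00"), (17, "09:00:00"), (17, "10:00:00")],
)

# Pair each weld with the earliest later start on the same station.
cycles = conn.execute(
    """
    SELECT w1.StationID, w1.StartTime, w2.StartTime AS StopTime
    FROM Welds w1
    JOIN Welds w2
      ON w2.StationID = w1.StationID
     AND w2.StartTime = (
           SELECT MIN(StartTime)
           FROM Welds
           WHERE StationID = w1.StationID
             AND StartTime > w1.StartTime)
    ORDER BY w1.StationID, w1.StartTime
    """
).fetchall()

for row in cycles:
    print(row)
```

The most recent row has no successor yet, so it does not appear until the next weld starts.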
Another way would be to include a status value in the record.
StationID StartTime Status
17 08:00:00 C
17 09:00:00 C
17 10:00:00 C
17 11:00:00 C
17 12:00:00 B
17 13:00:00 C
17 14:00:00 C
17 15:00:00 C
17 16:00:00 C
17 17:00:00 B
So the first few entries each represent the start of a cycle, whereas the entries for noon and 5pm represent the start of a break. Now we just need to append the line
where w1.Status = 'C'
to the end of the query above. Thus the 'B' entries supply the end times of the previous cycle but do not start another cycle.
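A rough sketch of the status idea, again with Python's sqlite3 and an in-memory table. The sample rows mirror the table above; the helper seconds_between and the choice to compute durations in Python are my own assumptions, not part of the answer.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Welds ("
    " StationID INTEGER NOT NULL,"
    " StartTime TEXT NOT NULL,"
    " Status TEXT NOT NULL,"  # 'C' = cycle start, 'B' = break start
    " PRIMARY KEY (StationID, StartTime))"
)
# Hourly entries from 08:00 to 17:00, with breaks starting at noon and 5pm.
sample = [
    (17, f"{hour:02d}:00:00", "B" if hour in (12, 17) else "C")
    for hour in range(8, 18)
]
conn.executemany("INSERT INTO Welds VALUES (?, ?, ?)", sample)

# Only 'C' rows start a cycle; any following row (C or B) ends it.
rows = conn.execute(
    """
    SELECT w1.StartTime, w2.StartTime AS StopTime
    FROM Welds w1
    JOIN Welds w2
      ON w2.StationID = w1.StationID
     AND w2.StartTime = (
           SELECT MIN(StartTime)
           FROM Welds
           WHERE StationID = w1.StationID
             AND StartTime > w1.StartTime)
    WHERE w1.Status = 'C'
    ORDER BY w1.StartTime
    """
).fetchall()


def seconds_between(start: str, stop: str) -> int:
    """Hypothetical helper: length of one cycle in seconds."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(stop, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


cycle_seconds = [seconds_between(start, stop) for start, stop in rows]
print(len(rows), cycle_seconds)
```

The lunch hour (12:00 to 13:00) never shows up as a cycle because its row has status 'B'.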
I want to execute a job in CRON every 14 days, starting from a specific date and in a specific time zone.
For example, every 14 days starting from June 24th, in the CST time zone.
Run job every fortnight
The easy way
The easiest way to do this is simply to create the task to run every 14 days from when you want it to first run like:
CREATE TASK mytask_fortnightly
WAREHOUSE = general
SCHEDULE = '20160 MINUTE'
AS
SELECT 'Hello world'
How it works
A fortnight is 14 days of 24 hours of 60 minutes each, so 14 × 24 × 60 = 20,160 minutes.
Caveat
The above solution does not run the task every fortnight from a given date/time, but rather every fortnight from when the task is created.
Even though this is the simplest method, it does require you to be nominally present to create the task at the exact desired next scheduled time.
As a workaround however, you can create a one-shot task to do that for you the very first time at the exact correct date/time. This means you don't have to remember to be awake / alert / present to do it manually yourself, and you can clean up the creation task afterwards.
The harder way.
Other solutions will require you to create a task which gets run every Thursday (since 2021-06-24 is/was a Thursday, each subsequent Thursday will either be the off-week, or the fortnight week)
e.g. SCHEDULE = 'USING CRON 0 0 * * THU'
Then you will add specific logic to it to determine which one the correct fortnight is.
Using this method will also incur execution cost for the off-week as well to determine if it's the correct week.
Javascript SP
In JavaScript you can determine whether it's the correct week by subtracting the start date from the current date; if the difference is not a multiple of 14 days, use that as a conditional to short-circuit the SP.
const deltaMs = (new Date) - (new Date('2021-06-24'));
const deltaDays = ~~(deltaMs / 86400000);
const run = deltaDays % 14 === 0;
if (!run) return;
// ... continue to do what you want.
SQL
You can also check if it's a fortnight using the following SQL condition in a WHERE clause, or IFF / CASE functions.
DATEDIFF('day', '2021-06-24', CURRENT_DATE) % 14 = 0
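The same modulo check can be sketched outside the database. This is only an illustration in Python: the function name is_fortnight_day is hypothetical, and the guard against dates before the anchor is my own addition (the SQL condition above does not have it). The caller is assumed to pass the current date already converted to the desired time zone.

```python
from datetime import date

ANCHOR = date(2021, 6, 24)  # first run date, taken from the question


def is_fortnight_day(today: date, anchor: date = ANCHOR) -> bool:
    """True when `today` is a whole number of fortnights on or after `anchor`."""
    delta_days = (today - anchor).days
    # Assumption: dates before the anchor never count as run days.
    return delta_days >= 0 and delta_days % 14 == 0


print(is_fortnight_day(date(2021, 7, 8)))  # two weeks after the anchor
print(is_fortnight_day(date(2021, 7, 1)))  # the off-week Thursday
```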
I want to load one table with data for, say, 1 month, from 1 April to 30 April, one day after another.
That is, after loading the data for 1 April, the date should automatically increment to 2 April, load that data, increment again, and so on until it reaches 30 April.
Also, the data for 2 April depends on the data for 1 April, so I cannot give a date range and load it in arbitrary order.
How can I do it?
It would be preferable to get the loads done in a single session run, instead of running the session several times.
Sort the source data by date and use a Transaction Control transformation to enforce a commit every time the date changes.
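The idea of "sort by date, commit at each date change" can be sketched in plain Python. This is not Informatica code, just an illustration of the control flow: load_by_day, the load_date key, the 2024 dates and the load_batch callback are all hypothetical names and values.

```python
from itertools import groupby
from operator import itemgetter


def load_by_day(rows, load_batch):
    """Sort rows by date, then hand each day's rows to load_batch in date order."""
    ordered = sorted(rows, key=itemgetter("load_date"))
    for load_date, day_rows in groupby(ordered, key=itemgetter("load_date")):
        # One call per date stands in for one commit per date change.
        load_batch(load_date, list(day_rows))


loaded = []
load_by_day(
    [
        {"load_date": "2024-04-02", "value": 2},
        {"load_date": "2024-04-01", "value": 1},
        {"load_date": "2024-04-01", "value": 3},
    ],
    lambda day, batch: loaded.append((day, len(batch))),
)
print(loaded)
```

Because each day's batch is finished before the next one starts, the load for 2 April can rely on the data for 1 April already being in place.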
I have a bit of an interesting problem.
I need the cumulative sum over a set that is created from pieces of a Time dimension. The time dimension is based on hours and minutes. This dimension begins at hour 0, minute 0 and ends at hour 23, minute 59.
What I need to do is slice out portions from, say, 09:30 AM - 04:00 PM or 4:30 PM - 09:30 AM, and I need these values in order to perform my cumulative sums. I'm hoping that someone could suggest a means of doing this with standard MDX. If not, is my only alternative to write my own stored procedure which forms my periods-to-date set extraction using the logic described above?
Thanks in advance!
You can create a secondary hierarchy in your time dimension with only the hour and filter the query with it.
[Time].[Calendar] -> the hierarchy with year, months, day and hours level
[Time].[Hour] -> the 'new' hierarchy with only hours level (e.g.) 09:30 AM.
Then you can write an MDX query that adds your criteria as a filter:
SELECT
my axis...
WHERE ( SELECT { [Time].[Hour].[09:30 AM]:[Time].[Hour].[04:00 PM] } on 0 FROM [MyCube] )
You can also create a new dimension instead of a hierarchy; the difference is in the autoexists behaviour and the performance.
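Outside of MDX, the slicing logic in the question (including a window that wraps past midnight, such as 4:30 PM - 09:30 AM) can be sketched in Python. This is only an illustration of the logic, not a replacement for the hierarchy approach; in_window, cumulative_in_window and the sample values are all hypothetical.

```python
from datetime import time
from itertools import accumulate


def in_window(t: time, start: time, end: time) -> bool:
    """Inclusive window; wraps past midnight when start > end."""
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end


def cumulative_in_window(samples, start, end):
    """Running total of the values whose time falls inside the window."""
    picked = [(t, v) for t, v in samples if in_window(t, start, end)]
    # Order from the window start: evening part first when the window wraps.
    picked.sort(key=lambda pair: (pair[0] < start, pair[0]))
    totals = accumulate(v for _, v in picked)
    return [(t, total) for (t, _), total in zip(picked, totals)]


samples = [
    (time(9, 0), 1), (time(9, 30), 2), (time(12, 0), 3),
    (time(16, 0), 4), (time(16, 30), 5), (time(23, 0), 6),
]
day_part = cumulative_in_window(samples, time(9, 30), time(16, 0))
night_part = cumulative_in_window(samples, time(16, 30), time(9, 30))
print(day_part)
print(night_part)
```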
I'm writing an application that indexes data for our stores, some of which are open late (8 am - 2 am). We need to be able to search this database quickly -- basically, to run a query to find which stores are open at a given point in time (now, Sunday at 1 am, whatever).
In addition, the open/close times can vary day-by-day -- some stores are closed on Sundays, for example.
The obvious solution to me would be to make a table where I have a row with the store ID, day, open time, and close time. For something like Monday, 8 am - 2 am, that would actually be two rows, one for Monday 0800 - 2400, and one for Tuesday 0000 - 0200.
We have a lot of stores, so the search has to perform well (basically, the data has to be index-friendly), but I'll also have to display this data back out in a human-readable format. With my current solution, that'd look something like this:
Monday: 8:00 - Midnight
Tuesday: Midnight - 2:00 am; 8:00 am - Midnight
I'm just wondering if anybody else has alternative solutions before I jump right to an implementation. Thanks!
When PBS (the US Public Broadcasting System) faced this same problem a couple of years ago, they invented the idea of the "30 hour day" -- where 00:00 is midnight at the start of the day, 24:00 is midnight at the end of the day, 25:00 is 1am the next day, and 30:00 is 6am the next day. That way a Monday closing time of 26:00 is 2am on Tuesday morning.
Rather than two records representing a single store's times for a day, it may be more object oriented to think of the "store day" as the object. That way 1 record = 1 store's times for a day. If you want to store the two sets of open/close times, just use four fields in the record instead of two--and adjust your queries appropriately.
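Here is a minimal sketch of the "store day" record with a closing time past 24:00, written in Python. The StoreDay class, its field names and the minutes-since-midnight representation are assumptions for illustration; only one open/close pair per day is modelled.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class StoreDay:
    store_id: int
    weekday: int       # 0 = Monday ... 6 = Sunday (the store day, not the calendar day)
    open_minute: int   # minutes after midnight at the start of the store day
    close_minute: int  # may exceed 1440, e.g. 26:00 -> 1560


def is_open(day: StoreDay, weekday: int, minute_of_day: int) -> bool:
    """Check a calendar (weekday, minute) against one store-day record."""
    if weekday == day.weekday:
        return day.open_minute <= minute_of_day < day.close_minute
    if weekday == (day.weekday + 1) % 7:
        # Early hours of the next calendar day belong to the previous store day.
        shifted = minute_of_day + MINUTES_PER_DAY
        return day.open_minute <= shifted < day.close_minute
    return False


monday = StoreDay(store_id=1, weekday=0, open_minute=8 * 60, close_minute=26 * 60)
print(is_open(monday, 1, 60))   # Tuesday 1am, still Monday's store day
print(is_open(monday, 1, 120))  # Tuesday 2am, closing time
```

Displaying the hours back is then a matter of formatting one record per day, with anything past 1440 minutes shown as a time on the following morning.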
Remember that your queries should use a library/api that you write and publish. The library will then deal with the data store and its data layout. No one but your library should be looking at the db directly.
Time zones are very important in this sort of app too. (Hopefully) at some point, the store chain will expand to cover more than one time zone. You'll then need to determine the local time of the query. This may not be the same as the time zone of the server which is handling the queries.
Further thoughts--
I now see that you're standardizing to GMT. Good. You could also use datetime values (vs time values) and standardize to a given week in time. E.g., the open time is Sun Jan 1, 1995 10am - Mon Jan 2, 1995 2am (using Jan 1, 1995 as a base since it was a Sunday).
Then rationalize your "current time and date" to match the same point in the week of Jan 1, 1995. Then query to find open store days.
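A rough sketch of that rationalizing step in Python. The base date comes from the text above; to_base_week and is_open are hypothetical names, and the extra check one week later (so that a Saturday-night span running into the following Sunday still matches) is my own assumption about how to handle the week boundary.

```python
from datetime import datetime, timedelta

BASE_SUNDAY = datetime(1995, 1, 1)  # a Sunday, used as the reference week


def to_base_week(moment: datetime) -> datetime:
    """Map any datetime onto the same weekday and time in the week of 1995-01-01."""
    days_after_sunday = (moment.weekday() + 1) % 7  # Python: Monday=0 ... Sunday=6
    return BASE_SUNDAY + timedelta(
        days=days_after_sunday,
        hours=moment.hour,
        minutes=moment.minute,
        seconds=moment.second,
    )


def is_open(open_at: datetime, close_at: datetime, moment: datetime) -> bool:
    """open_at/close_at are stored in the base week; close_at may run past it."""
    t = to_base_week(moment)
    # Assumption: also test one week later to catch spans crossing Saturday night.
    return any(open_at <= c < close_at for c in (t, t + timedelta(days=7)))


sunday_hours = (datetime(1995, 1, 1, 10), datetime(1995, 1, 2, 2))
print(is_open(*sunday_hours, datetime(2024, 3, 4, 1, 0)))  # a Monday at 1am
```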
HTH,
Larry
I have 2 columns called record time and unload time, both in AM/PM time format, and I need a new column called total time that holds the difference between unload time and record time.
for example here is my table
record time unload time
11:37:05 PM 11:39:09 PM
11:44:56 PM 1:7:23 AM
For this I require a new column which finds the difference between these 2 columns.
Can anyone suggest a query for this please?
Why not use the DATEDIFF system function in SQL Server?
select datediff(mi,'11:37:05 PM','11:39:09 PM')
mi (or n) is the datepart abbreviation for minute
If you're doing timespan calculations within one 24 hour period, anishmarokey's response is correct. However, I'd add the date to the time field as well, if you're going to have cases where the load and unload might occur over midnight between two or more days.
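To illustrate why keeping the date with the time matters, here is a small Python sketch rather than T-SQL. The dates are made up (the question only gives times), and elapsed_seconds is a hypothetical helper; parsing of AM/PM assumes an English locale.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %I:%M:%S %p"  # date plus 12-hour clock time


def elapsed_seconds(record: str, unload: str) -> int:
    """Difference between two full date-times, in whole seconds."""
    start = datetime.strptime(record, FORMAT)
    end = datetime.strptime(unload, FORMAT)
    return int((end - start).total_seconds())


# Same evening: plain subtraction works with or without the date.
same_day = elapsed_seconds("2024-01-01 11:37:05 PM", "2024-01-01 11:39:09 PM")
# Over midnight: only correct because the unload value carries the next date.
over_midnight = elapsed_seconds("2024-01-01 11:44:56 PM", "2024-01-02 01:07:23 AM")
print(same_day, over_midnight)
```

With time-only values the second pair would come out negative, which is the case the date component is there to prevent.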