Calculate Facebook likes, comments, and shares for different time zones from saved UTC - sql-server

I've been struggling with this for a while and hope someone can give me an idea of how to tackle it.
We have a service that goes out and collects Facebook likes, comments, and shares for each status update multiple times a day. The table that stores this data is something like this:
PostId  EngagementTypeId  Value  CollectedDate
100     1 (likes)         10     1/1/2013 1:00
100     2 (comments)      2      1/1/2013 1:00
100     3 (shares)        0      1/1/2013 1:00
100     1                 12     1/1/2013 3:00
100     2                 3      1/1/2013 3:00
100     3                 5      1/1/2013 3:00
Value holds the total for each engagement type at the time of collection.
I got a requirement to create a report that shows new value per day at different time zones.
Currently, I'm doing the calculation in a stored procedure that takes in a time zone offset, and based on that I calculate the delta for each day. If this is for someone in California, the report will show 12 likes, 3 comments, and 5 shares for 12/31/2012. But for someone with a time zone offset of -1, the report will show 10 likes on 12/31/2012 and 2 likes on 1/1/2013.
The problem I'm having is that doing the calculation on the fly can be slow if we have a lot of data and a big date range. We're talking about having the delta pre-calculated for each day and stored in a table so I can just query from that (we're considering SSAS, but that's for the next phase). But doing this, I would need to store the data for each day for 24 time zones. Am I correct (and if so, this is not ideal), or is there a better way to approach this?
I'm using SQL 2012.
Thank you!

You need to convert the UTC datetime stored in your column to a date based on the user's UTC offset. This way you don't have to worry about populating any extra table with data. To get the user's date from your UTC column, you can use something like this:
SELECT CONVERT(DATE, DATEADD(mi, DATEDIFF(mi, GETUTCDATE(), GETDATE()), '01/29/2014 04:00'))
       AS MyLocalDate
The SELECT statement above figures out the local date based on the difference between UTC time and local time. You will need to replace GETDATE() with the user's local datetime that is passed in to your procedure, and replace '01/29/2014 04:00' with your column. This way, when you select any date from your table, it will be the date as it was in the user's local time. Then you can calculate the other fields accordingly.
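Building on that, here is a minimal sketch of the full daily-delta query, assuming a table named PostEngagement with the columns shown in the question and an @OffsetMinutes parameter for the user's UTC offset (both names are mine, not from the original), and assuming the collected totals never decrease:
DECLARE @OffsetMinutes INT = -480;  -- e.g. a Pacific (UTC-8) user

WITH Daily AS (
    -- Last (highest) running total per post, engagement type, and local day
    SELECT PostId,
           EngagementTypeId,
           CONVERT(DATE, DATEADD(mi, @OffsetMinutes, CollectedDate)) AS LocalDate,
           MAX(Value) AS EndOfDayValue
    FROM PostEngagement
    GROUP BY PostId, EngagementTypeId,
             CONVERT(DATE, DATEADD(mi, @OffsetMinutes, CollectedDate))
)
SELECT PostId,
       EngagementTypeId,
       LocalDate,
       -- New engagement for the day = today's closing total minus yesterday's
       EndOfDayValue - LAG(EndOfDayValue, 1, 0)
                       OVER (PARTITION BY PostId, EngagementTypeId
                             ORDER BY LocalDate) AS NewValue
FROM Daily
ORDER BY PostId, EngagementTypeId, LocalDate;
LAG is available in SQL 2012, which you're on, so no self-join is needed for the delta.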

Related

How to loop Informatica sessions

I want to load one table with data for one month, say 1 April to 30 April, in a successive manner.
That is, after loading the data for 1 April, the date should automatically increment to 2 April, that data should load, and so on, until 30 April.
Also, the data for 2 April depends on the 1 April data, so I cannot load an arbitrary date range.
How can I do it?
It would be preferable to get the loads done in a single session run, instead of running the session several times.
Sort the source data by date and use a Transaction Control transformation to enforce a commit every time the date changes.
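For reference, the commit condition inside the Transaction Control transformation might look something like this (a sketch; LOAD_DATE and v_PREV_DATE are hypothetical ports, with v_PREV_DATE being a variable port that still holds the previous row's date when the condition is evaluated):
-- Commit before the current row whenever the date changes
IIF(LOAD_DATE != v_PREV_DATE, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)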

Cumulative Sum - Choosing Portions of Hierarchy

I have a bit of an interesting problem.
I require a cumulative sum over a set that is created from pieces of a Time dimension. The time dimension is based on hours and minutes; it begins at hour 0, minute 0 and ends at hour 23, minute 59.
What I need to do is slice out portions, say 09:30 AM - 04:00 PM or 04:30 PM - 09:30 AM, and I need these values in order to perform my cumulative sums. I'm hoping that someone can suggest a means of doing this with standard MDX. If not, is my only alternative to write my own stored procedure, which forms my period-to-date set extraction using the logic described above?
Thanks in advance!
You can create a secondary hierarchy in your time dimension with only the hour level and filter the query with it.
[Time].[Calendar] -> the hierarchy with year, months, day and hours level
[Time].[Hour] -> the 'new' hierarchy with only hours level (e.g.) 09:30 AM.
Then you can write an MDX query adding your criteria as a filter:
SELECT
    my axis...
FROM ( SELECT { [Time].[Hour].[09:30 AM] : [Time].[Hour].[04:00 PM] } ON 0
       FROM [MyCube] )
You can also create a new dimension instead of a hierarchy; the difference is in the autoexists behaviour and the performance.
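To get the cumulative sum itself over such a slice, something along these lines should work (a sketch; [Measures].[MyMeasure] and [MyCube] are placeholder names, and the sum's range starts explicitly at the first member of the slice so the running total begins at 09:30 AM):
WITH MEMBER [Measures].[Cumulative] AS
    SUM( { [Time].[Hour].[09:30 AM] : [Time].[Hour].CurrentMember },
         [Measures].[MyMeasure] )
SELECT
    { [Measures].[Cumulative] } ON 0,
    { [Time].[Hour].[09:30 AM] : [Time].[Hour].[04:00 PM] } ON 1
FROM [MyCube]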

DATE lookup table (1990/01/01:2041/12/31)

I use a DATE master table for looking up dates and other values in order to control several events, intervals and calculations within my app. It has a row for every single day beginning 01/01/1990 and ending 12/31/2041.
One example of how I use this lookup table is:
A customer pawned an item on JAN-31-2010.
The customer returns on MAY-03-2010 to make an interest payment to avoid forfeiting the item.
If he pays 1 month's interest, the employee enters a "1", and the app looks up the pawn date (JAN-31-2010) in the date master table and puts FEB-28-2010 in the applicable interest payment date. FEB-28 is returned because FEB-31 doesn't exist! If 2010 were a leap year, it would've returned FEB-29.
If the customer pays 2 months, MAR-31-2010 is returned; 3 months, APR-30. If the customer pays more than 3 months, or another period not covered by the date lookup table, the employee manually enters the applicable date.
Here's what the date lookup table looks like:
{ Copyright 1990:2010, Frank Computer, Inc. }
{ DBDATE=YMD4- (correctly sorted for faster lookup) }
CREATE TABLE datemast
(
dm_lookup DATE, {lookup col used for obtaining values below}
dm_workday CHAR(2), {NULL=Normal Working Date,}
{NW=National Holiday(Working Date),}
{NN=National Holiday(Non-Working Date),}
{NH=National Holiday(Half-Day Working Date),}
{CN=Company Proclamated(Non-Working Date),}
{CH=Company Proclamated(Half-Day Working Date)}
{several other columns omitted}
dm_description CHAR(30), {NULL, holiday description or any comments}
dm_day_num SMALLINT, {number of elapsed days since beginning of year}
dm_days_left SMALLINT, {number of remaining days until end of year}
dm_plus1_mth DATE, {plus 1 month from lookup date}
dm_plus2_mth DATE, {plus 2 months from lookup date}
dm_plus3_mth DATE, {plus 3 months from lookup date}
dm_fy_begins DATE, {fiscal year begins on for lookup date}
dm_fy_ends DATE, {fiscal year ends on for lookup date}
dm_qtr_begins DATE, {quarter begins on for lookup date}
dm_qtr_ends DATE, {quarter ends on for lookup date}
dm_mth_begins DATE, {month begins on for lookup date}
dm_mth_ends DATE, {month ends on for lookup date}
dm_wk_begins DATE, {week begins on for lookup date}
dm_wk_ends DATE, {week ends on for lookup date}
{several other columns omitted}
)
IN "S:\PAWNSHOP.DBS\DATEMAST";
Is there a better way of doing this or is it a cool method?
This is a reasonable way of doing things. If you look into data warehousing, you'll find that those systems often use a similar design for their date dimension table. Since there are fewer than 20K rows in the fifty-two-year span you're using, there isn't a huge amount of data.
There's an assumption that the stored values give better performance than doing the computations; that most certainly isn't clear-cut, since the computations are not that hard (though neither are they trivial) and any disk access is very slow in computational terms. However, the convenience of having the information in one table may be sufficient to warrant maintaining it, rather than keeping track of an appropriate computation for each of the values stored in the table.
It depends on which database you are using. SQL Server has horrible support for temporal data, and I almost always end up using a date lookup table there. But databases like Oracle, Postgres and DB2 have really good support, and it is typically more efficient to calculate dates on the fly for OLTP applications.
For instance, Oracle has a last_day() function to get the last day of a month and an add_months() function to, well, add months. Typically in Oracle I'll use a pipelined function that takes start and end dates and returns a nested table of dates.
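For example (a quick sketch of those built-ins; note how add_months() handles the JAN-31 case from the question):
SELECT LAST_DAY(DATE '2010-02-01')      AS month_end,       -- 28-FEB-2010
       ADD_MONTHS(DATE '2010-01-31', 1) AS plus_one_month   -- 28-FEB-2010 (last day is preserved)
FROM dual;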
The cool way of generating a rowset of dates in Oracle is to use the hierarchical query functionality, connect by. I have posted an example of this usage in another thread.
It gives a lot of flexibility without the PL/SQL overhead of a pipelined function.
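The idiom looks something like this (a minimal sketch generating the days of January 2010):
SELECT DATE '2010-01-01' + LEVEL - 1 AS day_date
FROM dual
CONNECT BY LEVEL <= 31;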
OK, so I tested my app using 31 days/month to calculate interest rates, and the pawnshops are happy with it! Local law provides as follows: from the pawn or last interest pymt date up to 5 elapsed days, 5% interest on principal; 6 to 10 days = 10%; 11 to 15 days = 15%; and 16 days to 1 "month" = 20%.
So the interest table is now defined as follows:
NUMBER OF ELAPSED DAYS SINCE
PAWN DATE OR LAST INTEREST PYMT

FROM   TO     ACCUMULATED
DAY    DAY    INTEREST
-----  -----  -----------
  0      5       5.00%
  6     10      10.00%
 11     15      15.00%
 16     31      20.00%
 32     36      25.00%
 37     41      30.00%
 42     46      35.00%
 47     62      40.00%
[... until day 90 (forfeiture allowed)]
From day 91 to 999, prorate daily based on 20%/month.
Did something bad happen in the UK on MAR-13 or SEP-1752?

Storing and searching opening/closing times for stores

I'm writing an application that indexes data for our stores, some of which are open late (8 am - 2 am). We need to be able to search this database quickly -- basically, to run a query to find which stores are open at a given point in time (now, Sunday at 1 am, whatever).
In addition, the open/close times can vary day-by-day -- some stores are closed on Sundays, for example.
The obvious solution to me would be to make a table where I have a row with the store ID, day, open time, and close time. For something like Monday, 8 am - 2 am, that would actually be two rows, one for Monday 0800 - 2400, and one for Tuesday 0000 - 0200.
We have a lot of stores, so the search has to perform well (basically, the data has to be index-friendly), but I'll also have to display this data back out in a human-readable format. With my current solution, that'd look something like this:
Monday: 8:00 - Midnight
Tuesday: Midnight - 2:00 am; 8:00 am - Midnight
I'm just wondering if anybody else has alternative solutions before I jump right to an implementation. Thanks!
When PBS (the US Public Broadcasting System) faced this same problem a couple of years ago, they invented the idea of the "30 hour day" -- where 00:00 is midnight at the start of the day, 24:00 is midnight at the end of the day, 25:00 is 1 am the next day, and 30:00 is 6 am the next day. That way a Monday closing time of 26:00 is 2 am Tuesday morning.
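A minimal T-SQL sketch of the idea (table and column names are hypothetical):
CREATE TABLE StoreHours
(
    StoreId   INT      NOT NULL,
    DayOfWeek TINYINT  NOT NULL,  -- 1 = Monday ... 7 = Sunday
    OpenMin   SMALLINT NOT NULL,  -- minutes from midnight; the "30 hour day" allows values up to 1800
    CloseMin  SMALLINT NOT NULL   -- e.g. a 26:00 close is stored as 1560
);

-- Which stores are open Monday at 1:30 am (minute 90)?
-- Check Monday itself, plus Sunday's late hours (minute 90 + 1440 = 1530).
DECLARE @Day TINYINT = 1, @Min SMALLINT = 90;

SELECT StoreId
FROM StoreHours
WHERE (DayOfWeek = @Day AND @Min BETWEEN OpenMin AND CloseMin)
   OR (DayOfWeek = CASE WHEN @Day = 1 THEN 7 ELSE @Day - 1 END
       AND @Min + 1440 BETWEEN OpenMin AND CloseMin);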
Rather than two records representing a single store's times for a day, it may be more object oriented to think of the "store day" as the object. That way 1 record = 1 store's times for a day. If you want to store the two sets of open/close times, just use four fields in the record instead of two--and adjust your queries appropriately.
Remember that your queries should use a library/api that you write and publish. The library will then deal with the data store and its data layout. No one but your library should be looking at the db directly.
Time zones are very important in this sort of app too. (Hopefully) at some point the store chain will expand to cover more than one time zone. You'll then need to determine the local time of the query, which may not be the same as the time zone of the server handling the queries.
Further thoughts--
I now see that you're standardizing to GMT. Good. You could also use datetime values (vs. time values) and standardize to a given week in time, e.g. an open time of Sun Jan 1, 1995 10 am - Mon Jan 2, 1995 2 am (using Jan 1, 1995 as a base since it was a Sunday).
Then rationalize your "current time and date" to match the same point in the week of Jan 1, 1995, and query to find open store days.
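A minimal T-SQL sketch of that normalization (variable names are mine):
DECLARE @Now DATETIME = GETDATE();
-- '19950101' was a Sunday; shift @Now onto the same weekday in that reference week,
-- keeping the time of day (adding two DATETIMEs sums their offsets from 1900-01-01).
DECLARE @Normalized DATETIME =
    DATEADD(DAY, DATEDIFF(DAY, '19950101', @Now) % 7, CAST('19950101' AS DATETIME))
    + CAST(CAST(@Now AS TIME) AS DATETIME);
-- @Normalized can now be compared against the stored open/close datetimes.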
HTH,
Larry

How to get the difference between 2 columns which are in time format

I have 2 columns called record time and unload time, which are in time format (AM/PM), and I require a new column called total time where I need to find the difference between unload time and record time.
For example, here is my table:
record time    unload time
11:37:05 PM    11:39:09 PM
11:44:56 PM    1:07:23 AM
For this I require a new column which holds the difference between these 2 columns.
Can anyone suggest a query for this, please?
Why not just go with the DATEDIFF system function in SQL Server?
SELECT DATEDIFF(mi, '11:37:05 PM', '11:39:09 PM')
mi (or n) is the datepart for minutes.
If you're doing timespan calculations within one 24-hour period, anishmarokey's response is correct. However, I'd add the date to the time field as well if you're going to have cases where the load and unload might occur across midnight, spanning two or more days.
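For example (a sketch attaching made-up dates to the second sample row, where the unload crosses midnight):
SELECT DATEDIFF(mi,
                CAST('2013-01-01 23:44:56' AS DATETIME),
                CAST('2013-01-02 01:07:23' AS DATETIME)) AS total_minutes;  -- 83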
