Looping in Stata - loops

I am trying to run a regression for each id. I also need to narrow it down to a regression for each year within a particular id.
tsset id date
forvalues i=1/3 {
eststo:quietly arch rtr mon tue wed thu fri lag1r lag2r if id == `i' & Year==`i', noconstant arch(1/1) tarch(1/1) garch(1/1) distribution(t)
}
esttab using d:\Return_reg.csv, append cells("b(fmt(8))")
It returns the following error:
no observations.
I suspect it's because years are different within each id.
How should I change the code to achieve my goal?

As mentioned in the comments, it's a typo: the loop uses the same macro `i' for both id and Year, so the condition can only match if your Year variable really takes the values 1, 2 and 3. Furthermore, although tsset id date is also accepted for panel data, xtset is the usual way to declare a panel. Try the following:
xtset id date
levelsof Year, local(years) // create list containing all values of Year
levelsof id, local(ids) // create list containing all values of id
foreach id in `ids' {
    foreach yr in `years' {
        eststo: quietly arch rtr mon tue wed thu fri lag1r lag2r if id == `id' & Year == `yr', noconstant arch(1/1) tarch(1/1) garch(1/1) distribution(t)
    }
}
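One thing to watch with the nested loop: it visits every id-year combination, including combinations that have no rows, and an estimation command run on an empty subsample stops with "no observations" again. The sketch below uses Python rather than Stata, purely to illustrate grouping by the (id, Year) pairs that actually occur. The function name group_by_id_year and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_by_id_year(rows):
    """Group rows by the (id, Year) pairs that actually occur in the data."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["id"], row["Year"])].append(row)
    # Sort so the groups come back in a stable id-then-year order.
    return dict(sorted(groups.items()))


# Hypothetical sample: id 1 has data for 2001 and 2002, id 2 only for 2003.
sample = [
    {"id": 1, "Year": 2001, "rtr": 0.01},
    {"id": 1, "Year": 2002, "rtr": -0.02},
    {"id": 2, "Year": 2003, "rtr": 0.03},
    {"id": 1, "Year": 2001, "rtr": 0.00},
]

for (panel_id, year), subset in group_by_id_year(sample).items():
    # A per-group regression would go here; every subset is non-empty.
    print(panel_id, year, len(subset))
```

Looping over observed pairs means an empty cell such as (2, 2001) is never attempted.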

Using powerquery to unpivot table with multiple columns into a table with two columns that represent pairs of dates from original table?

Imagine I had a 'horizontal' data set that contained:
Unique Key
Multiple 'pairs' of dates across multiple columns (i.e. Event A Start, Event B Start, Event C Start, etc and separate columns for Event A End, Event B End, Event C End, etc).
A single date (not a pair) for a specific 'Event'.
In essence, looks something like this:
Data Set
Unique Key | Event A Start | Event A End | Single Date Event | Event B Start | Event B End | 2nd Single Date Event
Key 1 | 1 Jan 2021 | 3 Jan 2021 | 2 Jan 2021 | 5 Jan 2021 | 10 Jan 2021 | 10 Jan 2021
Key 2 | 7 Jan 2021 | 10 Jan 2021 | null | null | null | null
How would I convert the Data Set above into a table like this using PowerQuery?
Expected Output:
Unique Key | Event | Start Date | End Date
Key 1 | Event A | 1 Jan 2021 | 3 Jan 2021
Key 1 | Single Date Event | null | 2 Jan 2021
Key 1 | Event B | 5 Jan 2021 | 10 Jan 2021
Key 1 | 2nd Single Date Event | null | 10 Jan 2021
Key 2 | Event A | 7 Jan 2021 | 10 Jan 2021
I've tried:
Unpivot: I can't rename both "Event A Start" and "Event A End" to "Event A". I even tried renaming every "Event [x] Start" column to "Event [x]" and running 'unpivot selected' on those columns, then renaming every "Event [x] End" column to "Event [x]" and unpivoting those. Unfortunately, the Key and Event columns don't line up.
Merge Query: I created two separate queries (one with Key, Event and Start Date; another with Key, Event and End Date) and merged them, but this does not produce the desired output. I think this is because of the Single Date Events being 'null'?
I feel I am definitely doing something wrong, so I'm asking here whether the output I want is even achievable with PowerQuery given the input data.
You definitely have to do a bit of extra work on top of an unpivot.
Here's how I'd approach it:
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8k6tVDBU0lEyVPBKzFMwMjACcYyROUbIHFNkjqEBLl6sDsRkI6C4OW4teaU5Odip2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Unique Key" = _t, #"Event A Start" = _t, #"Event A End" = _t, #"Single Date Event" = _t, #"Event B Start" = _t, #"Event B End" = _t, #"2nd Single Date Event" = _t]),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {"Unique Key"}, "Event", "Value"),
    #"Changed Type" = Table.TransformColumnTypes(#"Unpivoted Columns",{{"Value", type date}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each if Text.Contains([Event], "Start") then "Start Date" else "End Date"),
    #"Transformed Text" = Table.TransformColumns(#"Added Custom",{{"Event", each if Text.EndsWith(_, "Start") or Text.EndsWith(_, "End") then Text.BeforeDelimiter(_, " ", {0, RelativePosition.FromEnd}) else _, type text}}),
    #"Pivoted Column" = Table.Pivot(#"Transformed Text", List.Distinct(#"Transformed Text"[Custom]), "Custom", "Value")
in
    #"Pivoted Column"
Steps:
1. Unpivot the date columns
2. Add a new column to tag each row as Start Date / End Date
3. Strip off " Start" / " End" suffix in the [Event] column
4. Pivot on the new column from Step 2
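For readers who want to see the same four steps outside Power Query, here is a rough sketch in plain Python. It is not a translation of the M code. The helper names split_column and reshape are made up for illustration, and the sample rows are typed in from the question's table.

```python
def split_column(name):
    """Return (event, kind), where kind is 'Start Date' or 'End Date'."""
    for suffix, kind in ((" Start", "Start Date"), (" End", "End Date")):
        if name.endswith(suffix):
            return name[: -len(suffix)], kind
    # Columns without a suffix are single-date events, stored as an end date.
    return name, "End Date"


def reshape(rows, key="Unique Key"):
    """Unpivot the date columns, tag each value, then pivot back per event."""
    out = {}  # (key value, event) -> record; dicts keep insertion order
    for row in rows:
        for col, value in row.items():
            if col == key or value is None:
                continue  # missing dates produce no row
            event, kind = split_column(col)
            record = out.setdefault(
                (row[key], event),
                {key: row[key], "Event": event, "Start Date": None, "End Date": None},
            )
            record[kind] = value
    return list(out.values())


data = [
    {"Unique Key": "Key 1", "Event A Start": "1 Jan 2021", "Event A End": "3 Jan 2021",
     "Single Date Event": "2 Jan 2021", "Event B Start": "5 Jan 2021",
     "Event B End": "10 Jan 2021", "2nd Single Date Event": "10 Jan 2021"},
    {"Unique Key": "Key 2", "Event A Start": "7 Jan 2021", "Event A End": "10 Jan 2021",
     "Single Date Event": None, "Event B Start": None,
     "Event B End": None, "2nd Single Date Event": None},
]

for record in reshape(data):
    print(record)
```

Keying the intermediate dictionary on (key, event) is what lines up the start and end values that the separate unpivots in the question could not match.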

Measure does not work for Month Threshold

I built these DAX measures:
_Access_Daily = CALCULATE(
    DISTINCTCOUNTNOBLANK(ApplicationAccessLog[ApplicationUserID]),
    FILTER('Date', 'Date'[DateId] = SELECTEDVALUE('DateSelector'[DateId], MAX('DateSelector'[DateId])))
) + 0
_Access__PreviousDay = CALCULATE(
    DISTINCTCOUNTNOBLANK(ApplicationAccessLog[ApplicationUserID]),
    FILTER('Date', 'Date'[DateId] = SELECTEDVALUE('DateSelector'[DateId], MAX('DateSelector'[DateId])) - 1)
) + 0
The Date Selector table is a disconnected table containing dates from the 20th Jan to now. Dateid is a whole number like 20200131.
The Date table is a standard date table with all the dates between 1970 and 2038. Date id is a whole number like 20200131.
However, it does not seem to work across the month boundary between Jan and Feb. If the selected date is 01/02/2020, it does not return the correct result for 31/01/2020.
As mentioned in the comments, the root problem here is that the whole numbers you use are not dates. As a result, when you subtract 1 and cross month (or year) boundaries, there is no calendar intelligence that can adjust the numbers properly.
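To make the boundary problem concrete, here is a small Python sketch (not DAX). It assumes the yyyymmdd integer layout described in the question. The helper name previous_day_id is hypothetical.

```python
from datetime import datetime, timedelta


def previous_day_id(date_id: int) -> int:
    """Convert a yyyymmdd integer to a real date, step back one day, convert back."""
    day = datetime.strptime(str(date_id), "%Y%m%d").date()
    return int((day - timedelta(days=1)).strftime("%Y%m%d"))


selected = 20200201

# Plain integer arithmetic knows nothing about calendars.
naive = selected - 1  # 20200200, which is not a real date

# Going through a real date crosses the month boundary correctly.
correct = previous_day_id(selected)  # 20200131

print(naive, correct)
```

Within a month the two approaches agree, which is why the problem only shows up on the first day of a month or year.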
Your solution (using 'Date'[DayDateNext]) might work, and if this design is required for other reasons, go with it. However, I'd suggest revisiting the overall approach and using real dates instead of "DateId". You will then be able to use built-in DAX time intelligence, and your code will be more elegant and faster.
For example, if your "Date" and "DateSelector" tables have regular date fields, your code can be re-written as follows:
_Access_Daily =
VAR Selected_Date = SELECTEDVALUE ( 'DateSelector'[Date], MAX ( 'DateSelector'[Date] ) )
VAR Result =
    CALCULATE (
        DISTINCTCOUNTNOBLANK ( ApplicationAccessLog[ApplicationUserID] ),
        'Date'[Date] = Selected_Date
    )
RETURN
    Result + 0
and:
_Access_PreviousDay =
CALCULATE ( [_Access_Daily], PREVIOUSDAY ( 'Date'[Date] ) )

SQL Server - extract dates from strings in several formats

I've inherited quite a mess of a database table column called DOB, of type nvarchar - here is just a sample of the data in this column:
DOB: 1998-09-04US
Sex: M Race: White Year of Birth: 1950
12/31/00
January 5th, 1998
Date of Birth: 12/19/1938
AGE; 46
DOB: 11-24-1967
May 31, 1942, Split, Croatia
DOB:   12/28/1986
D.O.B.31-OCT-92
D.O.B.: January 8, 1973
31/07/1974 (44 years old)
Date Of Birth: 08/01/1979
78  (DOB: 12/09/1940)
1961 (56 years old)
12/31/1985 (PRIMARY)
DOB: 05/27/67
8-Jun-43
9/9/78
12/31/84 0:00
NA
Birth Year 2018
nacido el 29 de junio de 1959
I am trying to determine whether there is any way to extract the dates from these fields, with so many varying formats, without using something like RegEx patterns for every single possible variation in this column.
The resulting extracted data would look like this:
1998-09-04
1950
12/31/00
January 5th, 1998
12/19/1938
11-24-1967
May 31, 1942
12/28/1986
31-OCT-92
January 8, 1973
31/07/1974
08/01/1979
12/09/1940
1961
12/31/1985
05/27/67
8-Jun-43
9/9/78
12/31/84
NA
2018
29 de junio de 1959
While it may be a complete pipe dream, I was wondering if this could be accomplished with SQL, with some kind of "if it looks like a date, attempt to extract it" method. And if not out-of-the-box, perhaps with a helper extension or plugin?
It is possible, but there are potential pitfalls. This will certainly have to be expanded and maintained.
This is a brute-force pattern match in which the longest matching pattern is selected. The #FindPattern table holds pattern templates in which A stands for a letter and 0 for a digit.
Example:
Select ID
      ,DOB
      ,Found
 From (
        Select *
              ,Found = substring(DOB,patindex(PatIdx,DOB),PatLen)
              ,RN = Row_Number() over (Partition By ID Order by PatLen Desc)
         From #YourTable A
         Left Join (
                    Select *
                          ,PatIdx = '%'+replace(replace(Pattern, 'A', '[A-Z]'), '0', '[0-9]') +'%'
                          ,PatLen = len(Pattern)
                     From #FindPattern
                   ) B
           on patindex(PatIdx,DOB)>0
      ) A
 Where RN=1
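The same longest-template-wins idea can be sketched in Python, which may help when prototyping the template list before loading it into a table. The templates below are hypothetical examples, not the contents of the original #FindPattern table, and extract_date is an illustrative name.

```python
import re

# Hypothetical templates: 'A' stands for one letter, '0' for one digit.
TEMPLATES = ["0000-00-00", "00/00/0000", "00-00-0000", "00-AAA-00", "00/00/00", "0000"]


def template_to_regex(template):
    """Translate a template into a compiled regular expression."""
    parts = []
    for ch in template:
        if ch == "A":
            parts.append("[A-Za-z]")
        elif ch == "0":
            parts.append("[0-9]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


# Longest templates first, so the most specific match is chosen.
COMPILED = [template_to_regex(t) for t in sorted(TEMPLATES, key=len, reverse=True)]


def extract_date(text):
    """Return the first hit of the longest matching template, or None."""
    for regex in COMPILED:
        match = regex.search(text)
        if match:
            return match.group(0)
    return None


for sample in ["DOB: 1998-09-04US", "D.O.B.31-OCT-92", "12/31/1985 (PRIMARY)", "NA"]:
    print(repr(sample), "->", extract_date(sample))
```

As with the SQL version, coverage depends entirely on the template list, so formats such as "January 5th, 1998" would need their own entries.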

Scala Slick Lifted Date GroupBy

I'm using Scala 2.10 with Slick 1.0.0 and trying to do a lifted query.
I have a table, "Logins", where I'm attempting to do a load, and groupBy on a Timestamp column. However, when I attempt to groupBy, I am running into an issue when I try and format the Timestamp field to extract only the day portion, to group the objects by the same day.
Given the objects:
id | requestTimestamp
1 | Jan 1, 2013 01:02:003
2 | Jan 1, 2013 03:04:005
3 | Jan 1, 2013 05:06:007
4 | Jan 2, 2013 01:01:001
I'd like to return a grouping from the database by day. For the sake of brevity, the formatted-timestamp-to-id relationship is shown below; the ids would actually be lists of objects:
Jan 1, 2013 -> (1, 2, 3)
Jan 2, 2013 -> (4)
I've got the following slick table object:
private implicit object Logins extends Table[(Int, Timestamp)]("LOGINS") {
  def id = column[Int]("ID", O.PrimaryKey)
  def requestTimeStamp = column[Timestamp]("REQUESTTIMESTAMP", O.NotNull)
  // The projection must use the columns defined above (id, not logId)
  def * = id ~ requestTimeStamp
}
The following Query method:
val q = for {
  l <- Logins if (l.id >= 1 && l.id <= 4)
} yield l
val dayGroupBy = new java.text.SimpleDateFormat("MM/dd/yyyy")
val q1 = q.groupBy(l => dayGroupBy.format(l.requestTimeStamp))
db.withSession {
  q1.list
}
However, instead of getting the expected grouping, I get an exception on the line where I attempt the groupBy:
java.lang.IllegalArgumentException: Cannot format given Object as a Date
Does anyone have any suggestions on properly grouping by Timestamps out of the database?
Timestamp and Date are not the same thing! Try converting the Timestamp to human-readable text using Calendar or SimpleDateTime. I'm not so sure about the second one, though!
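One workaround, assuming the result set is small enough to load first, is to fetch the rows and group them by formatted day on the client. The sketch below shows that idea in Python rather than Scala. It does not use Slick, and whether Slick 1.0.0 can push the date formatting into the database query is not covered here. The name group_ids_by_day is hypothetical, and the sample rows mirror the question's table.

```python
from collections import defaultdict
from datetime import datetime


def group_ids_by_day(rows, fmt="%m/%d/%Y"):
    """Group row ids by the formatted day of their timestamp."""
    groups = defaultdict(list)
    for row_id, timestamp in rows:
        # Formatting happens on a real timestamp value, after the rows are loaded.
        groups[timestamp.strftime(fmt)].append(row_id)
    return dict(groups)


logins = [
    (1, datetime(2013, 1, 1, 1, 2, 3)),
    (2, datetime(2013, 1, 1, 3, 4, 5)),
    (3, datetime(2013, 1, 1, 5, 6, 7)),
    (4, datetime(2013, 1, 2, 1, 1, 1)),
]

print(group_ids_by_day(logins))
```

The format string plays the role of the "MM/dd/yyyy" pattern in the question.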

find the Day of week of a particular date

I have an object with two dates, startdate_c and enddate_c.
I need a way to find the days of the week these dates fall on.
For example:
startdate = 1 Jun 2012 and enddate = 3 Jun 2012
I need to know which days of the week the days between these dates fall on.
In this example:
Mon = false, Tue = false, Wed = false, Thu = false, Fri = true, Sat = true, Sun = true
I want to use this in a VF page to render some fields based on the boolean values.
Any pointers would be of great help.
Date has a method called toStartOfWeek which you could leverage. Assuming your two dates lie within the same week, you could simply do something like this:
date weekStart = startdate.toStartOfWeek();
list<boolean> days = new list<boolean>();
// Index 0 is the first day of the week in the running user's locale
for (integer i = 0; i < 7; i++)
{
    days.add(weekStart.addDays(i) >= startdate && weekStart.addDays(i) <= enddate);
}
A little bit crude, but it'll give you an array of 7 boolean values. For longer/unknown ranges you could use a date cursor and increment that instead of an integer here, but this should get you started. Note, I've not tested this code ;)
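For ranges that may span more than one week, the date-cursor idea mentioned above can be sketched as follows. This is Python rather than Apex, shown only to illustrate the loop. The name weekdays_covered and the lowercase day keys are illustrative choices.

```python
from datetime import date, timedelta

DAY_NAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def weekdays_covered(start, end):
    """Return a flag per weekday telling whether any date in [start, end] falls on it."""
    flags = dict.fromkeys(DAY_NAMES, False)
    cursor = start
    # Stop early once every weekday has been seen (after at most seven steps).
    while cursor <= end and not all(flags.values()):
        flags[DAY_NAMES[cursor.weekday()]] = True  # weekday(): Monday is 0
        cursor += timedelta(days=1)
    return flags


print(weekdays_covered(date(2012, 6, 1), date(2012, 6, 3)))
```

Because the flags are keyed by weekday name, the result does not depend on which day a locale treats as the start of the week.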
