I am using Power Query to access my PostgreSQL database and filter my data by certain date parameters. However, for one of my database tables the date format is YYYYMM (201510).
Is it possible to convert this format to an actual date format?
Power Query recognizes YYYY-MM or YYYYMMDD as valid date formats, but not YYYYMM. Here's a solution inserting a hyphen then changing types:
Split the text by number of characters, 4
Delete the automatic number type inference step.
Merge the columns using a custom separator -
Change type to date
Here's a simple example:
let
Source = Csv.Document("201510
201502"),
SplitColumnByPosition = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByPositions({0, 4}, false),{"Column1.1", "Column1.2"}),
MergedColumns = Table.CombineColumns(SplitColumnByPosition,{"Column1.1", "Column1.2"},Combiner.CombineTextByDelimiter("-", QuoteStyle.None),"Merged"),
ChangedType = Table.TransformColumnTypes(MergedColumns,{{"Merged", type date}})
in
ChangedType
Please try:
=DATE(LEFT(A1,4),RIGHT(A1,2),1)
select to_date('201510', 'YYYYMM');
to_date
------------
2015-10-01
In this case I prefer to create a new custom column and delete the original column afterwards. In your case the formula for the custom column would look like this:
=Text.Range([DateColumn],0,4) & "-" & Text.Range([DateColumn],4,2) & "-01"
After adding the custom column you can change its type to Date.
For further reference check the M Formula reference: https://msdn.microsoft.com/en-us/library/mt211003.aspx
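For comparison outside Power Query, here is a minimal Python sketch of the same idea: split the six-digit value into year and month and build a date on the first of that month. The function name `yyyymm_to_date` is hypothetical and only serves the illustration.

```python
from datetime import date


def yyyymm_to_date(value: str) -> date:
    """Turn a 'YYYYMM' string such as '201510' into the first day of that month."""
    text = value.strip()
    if len(text) != 6 or not text.isdigit():
        raise ValueError(f"expected six digits in YYYYMM form, got {value!r}")
    # Same split as Text.Range([DateColumn],0,4) and Text.Range([DateColumn],4,2)
    year, month = int(text[:4]), int(text[4:])
    # date() raises ValueError if the month is outside 1..12
    return date(year, month, 1)


print(yyyymm_to_date("201510"))  # 2015-10-01
```

Like the to_date and custom-column answers, this pins the day to 01, because a YYYYMM value carries no day of its own.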
I am receiving dates in two different formats, MM/dd/yyyy and yyyy-MM-dd, and there may be more formats in the CSV, where every value arrives as a string.
Below is the data as it currently arrives from the CSV, as strings:
1/14/2022 0:00
2021-12-31 00:00:00
I am using a Data Flow task in ADF to load the data into Azure SQL, where the format should be yyyy-MM-dd HH:mm:ss.
How can I do this?
OK, I managed to build a quick demo.
Main idea of my solution:
You need to differentiate between rows that are already valid and rows that need to be modified.
To do so, I used a case condition.
The idea is to add a derived column named 'Date' and modify only the rows that need it.
Input Data:
I created a CSV file and saved my data as a dataset in ADF.
ADF:
In the source, I selected my dataset as the input.
In a derived column activity:
I added a new derived column named 'Date' with this value:
case(contains(split(Date,''),#item=='/'), toString(toTimestamp(Date,'MM/dd/yyyy H:mm'),'yyyy-MM-dd HH:mm:ss'), Date)
In the toTimestamp call I pass the format of the input Date first, and in toString I pass the format I want the date converted to.
Output:
P.S.
You can handle every date format that appears in your data this way, with one condition per format.
You can read more about it here:
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-expressions-usage#toTimestamp
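The same "detect the layout, then reformat" idea can be sketched in Python for checking sample rows outside ADF. This is only an illustration under stated assumptions: the format list `KNOWN_FORMATS` and the function name `normalize_timestamp` are hypothetical, and the two layouts are the ones shown in the question.

```python
from datetime import datetime

# Hypothetical list of input layouts; extend it as new formats show up in the CSV.
KNOWN_FORMATS = ("%m/%d/%Y %H:%M", "%Y-%m-%d %H:%M:%S")
# Python spelling of the target layout yyyy-MM-dd HH:mm:ss
TARGET_FORMAT = "%Y-%m-%d %H:%M:%S"


def normalize_timestamp(raw: str) -> str:
    """Try each known layout in turn and return the value in the target layout."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime(TARGET_FORMAT)
        except ValueError:
            continue  # not this layout, try the next one
    raise ValueError(f"unrecognized date layout: {raw!r}")


for sample in ("1/14/2022 0:00", "2021-12-31 00:00:00"):
    print(normalize_timestamp(sample))
```

Unlike the case expression above, which passes unmatched rows through unchanged, this sketch raises an error for an unknown layout so that new formats are noticed early.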
I am loading a flat file into a database table and need to change the date format from YYYY-MM-DD in the flat file to MM/DD/YYYY in the database table. I tried the following expression in a Derived Column transformation, but I was not sure how to write it, and SSIS reported that it could not parse the expression.
Derived Column Name: EFF_DATE
Derived Column: Replace EFF_DATE
Expression: TOKEN( MONTH([EFF_DATE]),"//|",DAY([EFF_DATE]),"//|",YEAR([Copy of EFF_DATE]) )
Data Type: database timestamp [DT_DBTIMESTAMP]
Can anyone help me determine how to change the format of the column in Derived Column? Otherwise, please let me know if there is another way to do it. Thank you.
This question was different from the last one. In the last question, the date column was data type DateTime. But in this question, the date is a string, and when I used the Derived Column to change the date from YYYY-MM-DD to MM/DD/YYYY, it kept the leading zeroes in MM and DD. The issue then became, not just changing the date format, but also removing the leading zeroes from the Month and Day.
However, I researched and came up with a better solution in SSIS for changing the date value with data type string, as the database I am working with stores the date in that format.
I removed the Derived Column from my Data Source Task and added an Execute SQL Task to the Control Flow with the following UPDATE statement, which not only changes the format from YYYY-MM-DD to MM/DD/YYYY but also removes the leading zeroes from the month and day. The CONCAT function in the sample SQL below rearranges the parts from YYYY-MM-DD into MM/DD/YYYY, while the CONVERT function casts the MM and DD values to INT, which removes any leading zeroes. This solution allowed the date to remain a string, as that was the table format I had to work with.
UPDATE [StagingTable]
SET START_DATE =
CONCAT( CONVERT(INT, SUBSTRING(START_DATE, 6,2)), '/', CONVERT(INT, RIGHT(START_DATE, 2)),'/', LEFT(START_DATE,4) )
Thanks to everyone for their comments, as it helped me to think outside the box and determine this solution.
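To make the string manipulation in that UPDATE easier to follow, here is a small Python sketch of the same transformation. It assumes the input is always a well-formed YYYY-MM-DD string; the function name `iso_to_us_date` is hypothetical.

```python
def iso_to_us_date(value: str) -> str:
    """Rearrange 'YYYY-MM-DD' into 'M/D/YYYY', dropping leading zeroes."""
    year, month, day = value.strip().split("-")
    # int() plays the role of CONVERT(INT, ...): '03' becomes 3
    return f"{int(month)}/{int(day)}/{year}"


print(iso_to_us_date("2021-03-05"))  # 3/5/2021
```

As in the SQL version, the result stays a string; no date type is involved at any point.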
I am using a flat file data provider in SSIS to import data from an external system. I don't have any control over the file, it is pushed out on a weekly basis, and I pick it up from a common folder.
The first two columns of the CSV are dates. Part of the way through the file, the date format has changed from date to numeric as follows:
Service_Date, Event_Datetime
2018-04-30,2018-04-30 21:18
43220,43220.92412
As you can see, the format changed from date to numeric. Other date columns not shown here also have changed.
Obviously, this is breaking the data flow task.
Aside from going into Excel and saving the CSV with the proper column format, is there any way SSIS can convert these values on the fly so that the job doesn't fail and require manual intervention?
Values such as 43220 and 43220.92412 are called date serials. You can get the date value in several ways:
(1) Using A derived Column
You can convert this column to float then to datetime using a derived column:
(DT_DATE)(DT_R8)[dateColumn]
References
convert Excel Date Serial Number to Regular Date
Is there a better way to parse [Integer].[Integer] style dates in SSIS?
(2) Using a script component
You can use the DateTime.FromOADate() function, for example (code in VB.NET):
If Not Row.ServiceDate_IsNull AndAlso Not String.IsNullOrEmpty(Row.ServiceDate) Then
Dim dblTemp As Double
If Double.TryParse(Row.ServiceDate, dblTemp) Then
' Numeric value: treat it as an OLE Automation date serial
Row.OutputDate = DateTime.FromOADate(dblTemp)
Else
' Otherwise parse it as a regular date string
Row.OutputDate = Date.Parse(Row.ServiceDate)
End If
End If
Reference
SSIS Script Task - VB - Date is extracting as INT instead of date string
I was able to solve the problem using a variation of the derived column. This expression detects a value that is already formatted as a date (it contains a hyphen) and converts it directly to a date; otherwise it converts the date serial to a float first, then to a date:
FINDSTRING(Date_Service,"-",1) != 0 ? (DT_DATE)Date_Service : (DT_DATE)(DT_R8)Date_Service
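The arithmetic behind a date serial can be sketched in Python: the serial counts days from 30 December 1899, and the fractional part is the time of day. This mirrors the branching in the expression above; the function name `parse_service_date` and the two text layouts are assumptions taken from the sample rows in the question.

```python
from datetime import datetime, timedelta

# Day zero of OLE Automation date serials, the scale DateTime.FromOADate uses
OLE_EPOCH = datetime(1899, 12, 30)


def parse_service_date(raw: str) -> datetime:
    """Accept either a dashed date string or a numeric date serial."""
    text = raw.strip()
    if "-" in text:
        # Already formatted as a date, with or without a time part
        for fmt in ("%Y-%m-%d %H:%M", "%Y-%m-%d"):
            try:
                return datetime.strptime(text, fmt)
            except ValueError:
                continue
        raise ValueError(f"unrecognized date layout: {raw!r}")
    # Numeric serial: whole part is days since the epoch, fraction is time of day
    return OLE_EPOCH + timedelta(days=float(text))


print(parse_service_date("2018-04-30 21:18"))
print(parse_service_date("43220"))
print(parse_service_date("43220.92412"))
```

Both numeric samples from the question land on 2018-04-30, the same day as the dashed row above them.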
I'm working on a PowerPivot project, and I need to convert a date to another format.
My SQL Server Analysis Cube provides me a Time dimension including a date attribute.
I want to convert it to a dd/mm/yyyy format to create a reference to another data source (Excel file).
I tried to convert it using the standard DAX date functions, but the value is not recognized as a date. It seems this is because the name of the day is added as a prefix.
How can I transform it to the dd/mm/yyyy format? How can I extract the substring after the comma?
Thanks !
I used the following data to test my solution out.
You can use the SEARCH function to find the first instance of a string. So I can parse just the date portion out with the following formula:
=right([Datefield],(LEN([Datefield])-SEARCH(",",[Datefield])-1))
It gets the substring starting at the character after the comma through the end of the string.
The DATEVALUE function takes a string that represents a date and turns it into a date. I can combine that with my previous function:
=datevalue(right([Datefield],(LEN([Datefield])-SEARCH(",",[Datefield])-1)))
In the picture below, the first column is the original data. The second column is the function that parses out the substring for the date. The third column takes the datevalue of that date string in the second column. The fourth column is the all in one formula with both the substring and the datevalue.
If you will load the data from the database regularly, then I suggest you use Power Query to load the data into the PowerPivot module.
Here is sample code that produces the result you need.
Data look like this (Table2):
Date
Monday, January 12, 2014
Tuesday, January 13, 2014
Wednesday, January 14, 2014
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Split Column by Delimiter" = Table.SplitColumn(Source,"Date",Splitter.SplitTextByEachDelimiter({","}, null, false),{"Date.1", "Date.2"}),
#"Change to Date Format" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Date.1", type text}, {"Date.2", type date}}),
#"Removed Columns" = Table.RemoveColumns(#"Change to Date Format",{"Date.1"})
in
#"Removed Columns"
As @mmarie explained, the following formula works well (with some changes, such as using a semicolon as the argument separator):
=datevalue(right([Datefield];(LEN([Datefield])-SEARCH(",";[Datefield])-1)))
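The two steps used in the answers above (keep the text after the first comma, then parse it as a date) can be sketched in Python as follows. The function name `date_after_comma` is hypothetical, and the `%B` month-name parsing assumes an English locale.

```python
from datetime import date, datetime


def date_after_comma(value: str) -> date:
    """Drop the weekday prefix and parse the rest, e.g. 'Monday, January 12, 2014'."""
    # Keep everything after the first comma, like RIGHT(...; LEN - SEARCH - 1)
    _, _, remainder = value.partition(",")
    # %B parses a full month name; this depends on the current locale
    return datetime.strptime(remainder.strip(), "%B %d, %Y").date()


parsed = date_after_comma("Monday, January 12, 2014")
print(parsed.strftime("%d/%m/%Y"))  # 12/01/2014
```

Once the value is a real date, the dd/mm/yyyy form is only a display choice made at formatting time.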
The date stored in my database is 01-01-1900 for field emp_doj(Data Type DATE).
But while retrieving the data, the value is 01-jan-00, even though formatted with dd-mm-yyyy.
I am comparing retrieved date field with some other date in SQL query.
select *
from persons
where per_join = to_date(to_char(i.emp_doj,'DD-MM-YYYY'),'DD-MM-YYYY')
Here the value of to_date(to_char(i.emp_doj,'DD-MM-YYYY'),'DD-MM-YYYY') results in 01-01-00, instead of 01-01-1900
I suspect it's your default NLS settings for the IDE you are using that is merely displaying the dates with a two digit year. Oracle does not store dates in any specific format. It stores them in a binary format internally.
Therefore I suggest you check the settings for your IDE and/or database and you will see the NLS date format set to DD-MM-YY, alter this and set it to DD-MM-YYYY and you will see the dates as you wish to see them.
As for your query, why use TO_CHAR and then TO_DATE? If emp_doj is a date field and per_join is also a date then just compare them directly for equality. If you are trying to cut the time portion off your emp_doj values then just use TRUNC().
For examples and reference on NLS: http://www.orafaq.com/wiki/NLS
More here: http://www.oracle.com/technetwork/database/globalization/nls-lang-099431.html
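The difference between a stored date and its displayed form can be illustrated with a short Python sketch. It is only an analogy: Python's rule for interpreting two-digit years is not the same as Oracle's YY or RR handling, and the variable names are hypothetical.

```python
from datetime import date, datetime

joined = date(1900, 1, 1)

# The stored value is the same; only the display mask differs
print(joined.strftime("%d-%m-%y"))  # 01-01-00
print(joined.strftime("%d-%m-%Y"))  # 01-01-1900

# Going through a two-digit-year string and back loses the century
round_tripped = datetime.strptime(joined.strftime("%d-%m-%y"), "%d-%m-%y").date()
print(round_tripped == joined)  # False
```

This is why comparing the date columns directly is safer than converting them to text and back under a two-digit-year format.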
select to_char(emp_doj,'dd-mm-yyyy') from yourtable
I found a temporary solution that currently works for me. I simply changed my select query to:
select *
from persons
where to_date(per_join,'DD-MM-YYYY')= to_date(i.emp_doj,'DD-MM-YYYY')