Flink: use allowedLateness in the Flink SQL API

I'm using the Flink SQL API and I have a query like this:
Table result2 = tableEnv.sqlQuery(
"SELECT user, SUM(amount) " +
"FROM Orders " +
"GROUP BY TUMBLE(proctime, INTERVAL '1' DAY), user"
);
Can I enable "allowedLateness" and get late data as a side output?

Late data handling is not supported in Flink SQL yet (version 1.5.0). Late rows are just dropped.

Related

org.apache.flink.table.api.TableException: Unsupported query: Merge Into

I am working on a Flink streaming job where I need to upsert data into a Hudi table. I am using a MERGE INTO query to do the upsert.
Table table = tableEnv.fromDataStream(KafkaStreamTableDataStreamStream);
tableEnv.createTemporaryView("table1", table);
tableEnv.executeSql("Merge into target " +
"USING table1 s0 " +
"ON target.id = s0.id " +
"WHEN MATCHED THEN UPDATE SET amount=s0.amount");
This query works fine in spark-shell, but in Flink it throws: Exception in thread "main" org.apache.flink.table.api.TableException: Unsupported query: Merge into ..
Does the MERGE INTO statement work in a Flink job?
Flink doesn't support MERGE statements. This has been brought up for discussion but nothing has happened since then.

Converting PostgreSql timestamp query to MS SQL/Azure SQL

I'm currently converting some basic SQL scripts from PostgreSQL to Azure SQL. I'm a newbie to SQL, and I really can't understand epoch/Unix time in Azure SQL / SQL Server. My application uses a bigint epoch time as the event timestamp, and I want to write a housekeeping script for the table where events are stored.
How do you work with epoch time intervals in MS or Azure SQL? What would be the equivalent in Azure SQL to the following query?
SELECT count(*) FROM info_event WHERE event_time < (SELECT cast(EXTRACT(epoch FROM current_timestamp - INTERVAL '1 MONTH') AS bigint) * 1000);
The solution was:
SELECT * FROM table WHERE epoch_column < (SELECT CAST(DATEDIFF(second, '1970-01-01 00:00:00', DATEADD(month, -1, GETDATE())) AS bigint) * 1000);
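To sanity-check the arithmetic in that query, here is a minimal Python sketch of the same calculation: whole seconds between 1970-01-01 and "now minus one month", multiplied by 1000. The helper names `add_months` and `cutoff_epoch_ms` are hypothetical, and the sketch assumes `now` is a naive datetime expressed in UTC. Note that GETDATE() returns the server's local time, so GETUTCDATE() may be the closer match if the stored epoch values are UTC-based.

```python
import calendar
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def add_months(dt, months):
    """Shift dt by whole months, clamping the day to the end of the
    target month (e.g. March 31 minus one month gives Feb 28/29)."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))


def cutoff_epoch_ms(now):
    """Epoch milliseconds for one month before `now`, using whole seconds
    as in the SQL (seconds since 1970-01-01, then * 1000)."""
    cutoff = add_months(now, -1)
    seconds = (cutoff - EPOCH) // timedelta(seconds=1)
    return seconds * 1000


# Example: treat 2024-03-31 12:00:00 as "now"; the cutoff is 2024-02-29 12:00:00.
example_cutoff = cutoff_epoch_ms(datetime(2024, 3, 31, 12, 0, 0))
```

Rows whose epoch column is below `example_cutoff` would be the ones the housekeeping query selects.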
Thanks for the help!

SSIS: Variable from SQL to Data Flow Task

Pretty new to BI and SQL in general, but a few months ago I didn't even know what a model is and now here I am...trying to build a package that runs daily.
I'm currently running this in Excel via Power Query, but because there is so much data, I have to change the query manually every month, so I decided to move it into SSIS.
Required outcome: Pull the last date in my Database and use it as a variable in the model (as I have millions of rows, I only want to load lines with dates greater than what I have in my table already).
Here is my Execute SQL Task:
I set up a variable for the SQL query and am trying to use it in my OLE DB query like this.
Execute SQL Task (the results are fine; it returns the date as "dd/mm/yyyy hh24:mi:ss"):
SELECT MAX (CONVACCT_CREATE_DATE) AS Expr1 FROM GOMSDailySales
Variable for OLE DB SQL Query:
"SELECT fin_booking_code, FIN_DEPT_CODE, FIN_ACCT_NO, FIN_PROD_CODE, FIN_PROG_CODE, FIN_OPEN_CODE, DEBIT_AMT, CREDIT_AMT, CURRENCY_CODE, PART_NO, FIN_DOC_NO, CREATE_DATE
FROM cuown.converted_accounts
WHERE (CREATE_DATE > TO_DATE(@[User::GetMaxDate],'yyyy/mm/dd hh24:mi:ss'))
AND (FIN_ACCT_NO LIKE '1%')"
I'm currently getting a "missing expression" error. If I wrap @[User::GetMaxDate] in single quotes, I get a "year must be between 0 and xxxx" error instead.
What am I doing wrong / is there a cleaner way to get this done?
In the OLE DB source, change the data access mode to SQL command and use the following command:
SELECT fin_booking_code, FIN_DEPT_CODE, FIN_ACCT_NO, FIN_PROD_CODE, FIN_PROG_CODE, FIN_OPEN_CODE, DEBIT_AMT, CREDIT_AMT, CURRENCY_CODE, PART_NO, FIN_DOC_NO, CREATE_DATE
FROM cuown.converted_accounts
WHERE (CREATE_DATE > TO_DATE(?,'yyyy/mm/dd hh24:mi:ss'))
AND (FIN_ACCT_NO LIKE '1%')
Then click the Parameters button and map @[User::GetMaxDate] to the first parameter.
For more information, check the following answer: Parameterized OLEDB source query
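As a rough illustration of the positional `?` binding described above, here is a small Python sketch using the standard-library `sqlite3` module, which also uses `?` placeholders. This is not SSIS or Oracle: the in-memory table, its sample rows, and the `max_date` value are assumptions made up for the example, and SQLite has no TO_DATE, so the dates are compared as ISO-formatted strings.

```python
import sqlite3

# In-memory stand-in for cuown.converted_accounts (hypothetical sample data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE converted_accounts (fin_acct_no TEXT, create_date TEXT)"
)
conn.executemany(
    "INSERT INTO converted_accounts VALUES (?, ?)",
    [
        ("1001", "2019-01-10 08:00:00"),
        ("1002", "2019-02-20 09:30:00"),
        ("2001", "2019-02-21 10:00:00"),
    ],
)

# Plays the role of the User::GetMaxDate variable mapped to the first parameter.
max_date = "2019-02-01 00:00:00"

# The value is bound to the ? placeholder by the driver, so no quoting or
# string concatenation is needed in the query text.
rows = conn.execute(
    "SELECT fin_acct_no, create_date FROM converted_accounts "
    "WHERE create_date > ? AND fin_acct_no LIKE '1%'",
    (max_date,),
).fetchall()
```

Only the account starting with 1 and created after `max_date` comes back. Because the value never becomes part of the SQL text, the quoting problems from the question do not arise.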
Alternative method
If parameters are not supported in the OLE DB provider you are using, create a variable of type string and evaluate this variable as the following expression:
"SELECT fin_booking_code, FIN_DEPT_CODE, FIN_ACCT_NO, FIN_PROD_CODE, FIN_PROG_CODE, FIN_OPEN_CODE, DEBIT_AMT, CREDIT_AMT, CURRENCY_CODE, PART_NO, FIN_DOC_NO, CREATE_DATE
FROM cuown.converted_accounts
WHERE CREATE_DATE > TO_DATE('" + (DT_WSTR, 50)@[User::GetMaxDate] +
"' ,'yyyy/mm/dd hh24:mi:ss') AND FIN_ACCT_NO LIKE '1%'"
Then, in the OLE DB source, change the data access mode to SQL command from variable and select the string variable you created.
You're trying to use the SSIS variable like a variable inside the query. When constructing a SQL query in a string variable, you simply need to concatenate the strings together. The expression for your query string variable should look like this:
"SELECT fin_booking_code, FIN_DEPT_CODE, FIN_ACCT_NO, FIN_PROD_CODE, FIN_PROG_CODE, FIN_OPEN_CODE, DEBIT_AMT, CREDIT_AMT, CURRENCY_CODE, PART_NO, FIN_DOC_NO, CREATE_DATE
FROM cuown.converted_accounts
WHERE CREATE_DATE > " + @[User::GetMaxDate] +
" AND (FIN_ACCT_NO LIKE '1%')"

Parameterise a RecordSource query to ensure the data remains updateable?

I have a subform within a main form that is set to datasheet view and extracts data from a SQL Server database, based on the form's supplied parameters.
This means that selecting a different team from the combobox or date from the datepicker, pulls the relevant information into the datasheet.
The user needs to be able to change a boolean field within the data. So far, the only way I have found to load this data and keep it updateable is to write the full query into the .RecordSource property in VBA, like below:
Set mf = mainFrm
S = " SELECT t.Date, t.Team, s.Username, t.Reference, t.Status, t.Reason, t.Completed " & _
" FROM [ODBC;DRIVER=SQL Server;SERVER=<SERVERNAME>;Integrated_Security=SSPI;DATABASE=<DBNAME>].testTbl as t " & _
" INNER JOIN [ODBC;DRIVER=SQL Server;SERVER=SRVFOSABESQL01;Integrated_Security=SSPI;DATABASE=MO_Productivity].staffTbl as s ON t.emp_id = s.emp_id " & _
" WHERE t.Team = '" & tmName & "' AND t.Date = #" & wkEnd & "# " & _
" ORDER BY t.Reference; "
mf.subFrm.Form.RecordSource = S
The huge issue here is that the supplied variables are open to SQL injection, aren't escaped, and the connection string is exposed. All of this can be worked around, but it certainly doesn't feel like best practice.
I have tried using the .Recordset property to get data from the server with a parameterised pass-through query / stored procedure, but this seems to be a one-way, read-only operation: the data is placed in the form 'unlinked' and is not updateable.
What is the correct and more secure way to retrieve this data from SQL Server so it can be placed in the RecordSource to make it updateable?
I found this on fmsinc.com:
There are many reasons why a Query or Recordset may not be updateable. Some are pretty obvious:
The query is a Totals query (uses GROUP BY) or Crosstab query (uses TRANSFORM), so the records aren't individual records
The field is a calculated field, so it can't be edited
You don't have permissions/rights to edit the table or database
The query uses VBA functions or user defined functions and the database isn't enabled (trusted) to allow code to run
Some reasons are less obvious but can't be avoided:
The table being modified is a linked table without a primary key. For certain backend databases (e.g. Microsoft SQL Server), Access/Jet requires the table to be keyed to make any changes. This makes sense since Access wants to issue a SQL query for modifications but can't uniquely identify the record.
Less obvious are these situations:
Queries with some summary fields linked to individual records and the individual records still can't be edited
Queries with multi-table joins that are not on key fields
Union queries
If the query is updateable, then the recordset must be updateable. I solved this by using an ADO object.
See MS Access 2019 - SQL Server 2017 - Recordset cannot be updated here on Stack Overflow.

Executing SSRS Reports From SSIS

I need to execute SSRS reports from SSIS on a periodic schedule.
I saw a solution here:
https://www.mssqltips.com/sqlservertip/3475/execute-a-sql-server-reporting-services-report-from-integration-services-package/
But is there any other option in SSIS that doesn't use a Script Task? I don't quite understand the script, and I'm concerned it could become a support issue for me.
Database: SQL Server 2008 R2 Standard Edition
Any ideas? Thanks very much.
SSIS can control the running of an SSRS report through SQL Agent.
This assumes that the SSIS job will have updated a control record or written some other identifiable record to a database.
1. Create a subscription for the report.
2. Run this SQL to get the GUID of the report
SELECT c.Name AS ReportName
, rs.ScheduleID AS JOB_NAME
, s.[Description]
, s.LastStatus
, s.LastRunTime
FROM
ReportServer..[Catalog] c
JOIN ReportServer..Subscriptions s ON c.ItemID = s.Report_OID
JOIN ReportServer..ReportSchedule rs ON c.ItemID = rs.ReportID
AND rs.SubscriptionID = s.SubscriptionID
3. Create a SQL Agent job.
a. Step 1. A SQL statement to look for data in a table containing a flagged record where the Advanced setting is "on failure end job reporting success"
IF NOT exists ( select top 1 * from mytable where mykey = 'x'
and mycondition = 'y') RAISERROR ('No Records Found',16,1)
b. Step 2
USE msdb
EXEC sp_start_job @job_name = '1X2C91X5-8B86-4CDA-9G1B-112C4F6E450A'
Replacing the GUID with the one returned from your GUID query.
One thing to note, though: once the report subscription has been started, that step is complete as far as SQL Agent is concerned, even though the report has not necessarily finished running. I once had a clean-up job after the EXEC step which deleted some of my data before the report reached it!
You can create a subscription for the report that is never scheduled to run.
If you have the Subscription ID, you can fire the report subscription using a simple SQL Task in SSIS.
You can get the Subscription ID from the Report Server database. It is in the Subscriptions table.
Use this query to help locate the subscription:
SELECT Catalog.Path
,Catalog.Name
,SubscriptionID
,Subscriptions.Description
FROM Catalog
INNER JOIN Subscriptions
ON Catalog.ItemID = Subscriptions.Report_OID
In SSIS, you can use this statement, inside of a SQL Task, to fire the subscription:
EXEC reportserver.dbo.AddEvent @EventType='TimedSubscription', @EventData= [Your Subscription ID]
Hope this helps.
