query and update a stream in flink stream sql - apache-flink

I am looking for a solution based on flink, the situation is that I have a trans stream and some rules which can be expressed as SQL, I want to update the stream after query(if matched ruleSql1 set this transEvent respCode = 01; if matched ruleSql2 then set this transEvent respCode = 02; respCode has priority).
The question is:
By flink sql I can get a result, but how to feedback the result to original stream, the output I expected is original stream with different respCode.
I have a lot of rules, how to merge the result.

Flink's operators have streams coming in, and transformed streams coming out. It's not clear exactly what you want -- but whether you want to modify each event to add a field with the response code, or something else, it's easily done. If you are using SQL, simply describe the output you want in the SELECT clause.
You can use split/select to make n copies of your stream, and then apply one of your rules (expressed as a SQL query) to each of these parallel copies. Then you can use union to merge them back together (provided they are all of the same type).
You'll find the documentation on split, select, and union in this section of the docs.
The Flink training site has a sequence of hands-on exercises that you may find helpful in learning how the pieces of the API fit together, though none that use split/select/union.

Related

Convert FileTime to DateTime in Azure Logic App

I'm pretty new to Logic App so still learning my way around custom expressions. One thing I cannot seem to figure out is how to convert a FileTime value to a DateTime value.
FileTime value example: 133197984000000000
I don't have a desired output format as long as Logic App can understand that this is a DateTime value and can be able to run before/after date logic.
To achieve your requirement, I have converted the Windows file Time to Unix File Time then converted to File time by add them as seconds to a default date 1970-01-01T00:00:00Z. Here is the Official documentation that I followed. Below is the expression that worked for me.
addSeconds('1970-01-01T00:00:00Z', div(sub(133197984000000000,116444736000000000),10000000))
Results:
This isn't likely to float your boat but the Advanced Data Operations connector can do it for you.
The unfortunate piece of the puzzle is that (at this stage) it doesn't just work as is but be rest assured that this functionality is coming.
Meaning, you need to do some trickery if you want to use it to do what you want.
By this I mean, if you use the Xml to Json operation, you can use the built in functions that come with the conversion to do it for you.
This is an example of what I mean ...
You can see that I have constructed some XML that is then passed into the Data parameter. That XML contains your Windows file time value.
I have then setup the Map Object to then take that value and use the built in ado function FromWindowsFileTime to convert it to a date time value.
The Primary Loop at Element is the XPath query that will make the selection to return the relevant values to loop over.
The result is this ...
Disclaimer: I should point out, this is due to drop in preview sometime in the middle of Jan 2023.
They have another operation in development that will allow you to do this a lot easier but for now, this is your easier and cheapest option.
This kind of thing is also available in the Transform and Expert operations but that's the next tier level of pricing.

Zeppelin: How to see text results from JDBC query

On SQL queries zeppelin 0.8.1 provide table output and several visualizations of data out of the box:
And it is very useful most of the time.
But sometimes I want just select text for presentation.
Said for query SELECT version();. There table output is annoying:
What very interesting, there already implemented text output, for example for EXPLAIN:
Off course ideally for EXPLAIN query you may also expect more visualize for nods, cost and so on, but it is absolutely another question.
So, main question: How I can switch output to text form for some of my SQL queries except explain but in similar form?
Additionally, If I run some maintenance commands like VACUUM and ANALYZE I can see output in many IDE, but in zeppelin it is empty!
/*'EXPLAIN '*/ select version();
An ugly workaround can be used while JDBCInterpreter contains EXPLAIN_PREDICATE
private static final String EXPLAIN_PREDICATE = "EXPLAIN ";
String results = getResults(resultSet,
!containsIgnoreCase(sqlToExecute, EXPLAIN_PREDICATE), isComplete);
In future it's will be nice to manage output type via paragraph properties.
VACUUM and ANALYZE send messages which should be caught via Statement#getWarnings

Postgres: is there any row_to_json equivalent that returns values only?

In a project I'm working on, I need to stream potentially large data sets from a Postgres database to the client, for analytics purposes.
The application is built in Rails (irrelevant for this question) and after a bit of research I'm currently able to stream query results by using COPY in Postgres:
COPY (SELECT row_to_json(t) from (#{query}) t) TO STDOUT;
Sources (for who's interested):
https://shift.infinite.red/fast-csv-report-generation-with-postgres-in-rails-d444d9b915ab
https://github.com/brianhempel/stream_json_demo
This works, but it yields every row as a key-value pair, e.g.:
["{\"id\":403457,\"email\":\"email403457#example.com\",\"first_name\":\"Firstname403457\",\"last_name\":\"Lastname403457\",\"source\":\"adwords\",\"created_at\":\"2015-08-05T22:43:07.295796\",\"updated_at\":\"2017-01-19T04:48:29.464051\"}"]
In the spirit of minimising the size (in bytes) of the response and especially since this is getting served through the web, I want to return just an array of values for every row, i.e.:
["[403457, \"email403457#example.com\", \"Firstname403457\", \"Lastname403457\", \"adwords\", \"2015-08-05T22:43:07.295796\", \"2017-01-19T04:48:29.464051\"]"]
Is there a way to achieve this within Postgres, even by nesting functions, starting from the query above?
You could create a simple SQL function that converts a row into the desired format:
CREATE FUNCTION row2json(anyelement) RETURNS json
LANGUAGE sql STABLE AS
'SELECT json_agg(z.value) FROM json_each(row_to_json($1)) z';
Then you use that to transform the output:
SELECT row2json(mytab) FROM mytab;
If performance is more important than JSON output, just cast the result to a string:
SELECT CAST(mytab AS text) FROM mytab;

Remove Duplicate adjacent Sub-String from String in Microsoft SQL Server

I am using SQL Server 2008 and I have a column in a table, which has values like below. It basically shows departure and arrival information.
-->Heathrow/Dublin*Dublin/Heathrow
-->Gatwick/Liverpool*Liverpool/Carlisle *Carlisle/Gatwick
-->Heathrow/Dublin*Liverpool/Heathrow
(The 3rd example shown above is slightly different where the person did not depart from Dublin, instead departed from a Liverpool).
This makes the column too lengthy, and I want to remove only the adjacent duplicates, so the information can be shown like below:
-->Heathrow/Dublin/Heathrow
-->Gatwick/Liverpool/Carlisle/Gatwick
-->Heathrow/Dublin***Liverpool/Heathrow
So, this would still show the correct travel route, but omits only the contiguous duplicates. Also, in the 3rd case, since the departure and arrival information location is not the same, Iwould like to show it as ***.
I found a post here that removes all duplicates (Find and Remove Repeated Substrings) but this is slightly different from the solution that I need.
Could someone share any thoughts please?
The first step is to adapt the process defined in the following link so that it splits based on /:
T-SQL split string
This returns a table which you would then loop through checking if the value contains an *. In that case you would get the text values before and after the * and compare them. Use CHARINDEX to get the position of the *, and SUBSTRING to get the values before and after. Once you have those check both values and append to your output string accordingly.
So you have a database column that contains this text string? Is your concern to display the data to the user in a new format, or to update the data in your database table with a new value?
Do you have access to the original data from which this text string was built? It would probably be easier to re-create the string in the format you desire than it would be to edit the existing string programmatically.
If you don't have access to this data, it would probably be a lot simpler to update your data (or reformat it for display) if you do the string manipulation in a high-level language such as c# or java.
If you're reformatting it for display, write the string manipulation code in whatever language is appropriate, right before displaying it. If you're updating your table, you could write a program to process the table, reading each record, building the replacement string, and updating the record before moving on to the next one.
The bottom line is that T-SQL is just not a good language for doing this sort of string examination and manipulation. If you can build a fresh string from the original data, or do your manipulation in a high-level language, you'll have an easier job of it and end up with more maintainable code.
I wrote a code for the first example you gave. You still need to
improve it for the rest ...
DECLARE #STR VARCHAR(50)='Heathrow/Dublin*Dublin/Heathrow'
IF (SELECT SUBSTRING(#STR,CHARINDEX('/',#STR)+1,CHARINDEX('*',#STR)-CHARINDEX('/',#STR)-1)) =
(SELECT SUBSTRING(#STR,CHARINDEX('*',#STR)+1,LEN(SUBSTRING(#STR,CHARINDEX('/',#STR)+1,CHARINDEX('*',#STR)-CHARINDEX('/',#STR)-1))))
BEGIN
SELECT STUFF(#STR,CHARINDEX('*',#STR),LEN(SUBSTRING(#STR,CHARINDEX('/',#STR)+1,CHARINDEX('*',#STR)-CHARINDEX('/',#STR)-1))+1,'')
END
ELSE
BEGIN
SELECT STUFF(#STR,CHARINDEX('*',#STR),LEN(SUBSTRING(#STR,CHARINDEX('*',#STR)+1,LEN(SUBSTRING(#STR,CHARINDEX('/',#STR)+1,CHARINDEX('*',#STR)-CHARINDEX('/',#STR)-1)))),'***')
END

store one date and two time fields

I need to store schedule date and times. Scheduale contains one date field and two time fields.
Is there any possibility to store schedule in one db field and not in two (datetime + datetime)?
I am using SQL Server 2005.
Thanks!
Whether it is "start"+"stop", or "start"+""duration", you have 2 pieces of information = store 2 pieces of information.
Using a string or XML makes no sense: this requires take more space, more processing, more code to search and use.
Why would you want to store what are effectively two datetimes in one field rather than two? Are there no cases where the schedule might have times that cross days? (ie. 01/03/2011 23:59, 02/03/2011 01:35)? Do you not mind having to parse out the information rather than having it immediately ready for query?
If you really want to, there's no reason you can't store it as a string type, comma separated possibly, maybe XML as suggested, but I can't say it's recommended as date/time fields are more space efficient, nice and fast/flexible for searching purposes, and there are many useful T-SQL functions which can easily be used on date/time types which you'd be hard pushed to use on a string without some parsing and casting/converting.
If you can come up with a good reason for not using two datetime fields, I'll have another Donut! (ps. happy Fat Thursday).
One quick, and horribly evil thought ... you could use part of the datetime to store the "difference" ... sneak it into the "seconds" and "milliseconds" values, and apply it to the main date/time to get the new value. A bit hacky, but it'd could do the job, depending on your range requirements.
-- Example: 01/03/2011 12:30:02
-- Translates into - first of March 2011, 12:30 to 14:30 (12:30 + (seconds * hours))
set #ModifiedDatetime =
DATEADD(hour, DATEPART(second, #originalDateTime), #originalDateTime);
Beware of rounding errors with milliseconds ... and please think about the consequences of what you're doing. God kills a kitten each time someone abuses a type :)
You can try using the XML field type and store an XML snippet in there, similar to the following:
<schedule date="2011-01-01" fromTime="12:00" toTime="14:00" />
You can then use XQuery in a select to transform the result set back to a "normal" row-based result set. A sample query implementing XQuery, based on my example's XML schema, could be as follows:
SELECT
[...]
, Schedule.value('(/schedule/#date)[1]','datetime') as [Date]
, Schedule.value('(/schedule/#fromTime)[1]','char(5)') as [FromTime]
, Schedule.value('(/schedule/#toTime)[1]','char(5)') as [ToTime]
FROM [TABLE]
I'm not saying that storing it as XML is the best way to do it (as the other answers rightfully state), but you asked IF it is possible and I propose a solution...

Resources