Flink nested JSON parsing with complex schema - apache-flink

We need to parse a very complex JSON event (around 25 KB per event) against a predefined schema (a nested schema split across multiple schema files), load it into a temporary table, apply CASE expressions on some fields (e.g. to compute success counts, failure counts and status codes), and aggregate the results over 1-second intervals.
We tried the built-in JSON_VALUE function to retrieve the field values and then applied the CASE expressions, but because I call JSON_VALUE five or six times per event, the application runs very slowly.
For another, filter-only use case we can process more than 1600 events/sec, but for this case we only reach around 300 events/sec on 1 core.
Below is the query example:
Query 1:
select
    JSON_VALUE(message, '$.eventRecordHeader.Result' RETURNING INT) AS `result1`,
    JSON_VALUE(message, '$.eventRecordHeader.Cause.ErrorCode') AS errorCode,
    JSON_VALUE(message, '$.eventRecordHeader.Cause.SubCause') AS subCause,
    JSON_VALUE(message, '$.eventRecordHeader.Cause.SubCause.SubProtocol') AS subProtocol,
    JSON_VALUE(message, '$.eventRecordHeader.Cause.SubCause.SubError') AS subError,
    TO_TIMESTAMP_LTZ(cast(JSON_VALUE(message, '$.eventRecordHeader.StartTime') as bigint)/1000, 3) AS eventTime,
    proctime() as proctime
from kafkaJsonSource;
Query 2:
select
    count(case when result1=1 then 1 else null end) failed_result,
    count(case when result1=0 then 1 else null end) successful_result,
    count(case when errorCode like '4%' then 1 else null end) err_starts_4,
    count(case when errorCode like '5%' then 1 else null end) err_starts_5,
    count(case when errorCode like '6%' then 1 else null end) err_starts_6,
    count(case when subCause is not null then 1 else null end) has_sub_cause,
    count(case when subProtocol='DNS' then 1 else null end) protocol_dns,
    count(case when subProtocol='Diameter' then 1 else null end) protocol_diameter,
    count(case when (subProtocol='Diameter' and subError like '3%') then 1 else null end) protocol_diameter_err_starts_3,
    count(case when (subProtocol='Diameter' and subError like '4%') then 1 else null end) protocol_diameter_err_starts_4,
    count(case when (subProtocol='Diameter' and subError like '5%') then 1 else null end) protocol_diameter_err_starts_5
FROM TABLE(TUMBLE(TABLE filter_transformed, DESCRIPTOR(proctime), INTERVAL '1' SECOND))
GROUP BY window_start, window_end;
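One possible explanation for the slowdown, which is an assumption and not something confirmed in the question, is that each JSON_VALUE call parses the 25 KB message again. A common way around that is to parse the message once (for example in a user-defined function) and pull out all fields from the parsed document. The sketch below only illustrates that parse-once idea in plain Python; `FIELD_PATHS`, `walk` and `extract_fields` are hypothetical names, and the sample event is made up to match the paths used in Query 1.

```python
import json

# Hypothetical field paths, mirroring the JSON paths used in Query 1.
FIELD_PATHS = {
    "result1": ("eventRecordHeader", "Result"),
    "errorCode": ("eventRecordHeader", "Cause", "ErrorCode"),
    "subProtocol": ("eventRecordHeader", "Cause", "SubCause", "SubProtocol"),
    "subError": ("eventRecordHeader", "Cause", "SubCause", "SubError"),
    "startTime": ("eventRecordHeader", "StartTime"),
}


def walk(node, path):
    """Follow a tuple of keys through nested dicts; return None if a key is missing."""
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def extract_fields(message):
    """Parse the JSON text once and extract every field of interest."""
    doc = json.loads(message)  # single parse per event
    return {name: walk(doc, path) for name, path in FIELD_PATHS.items()}


# Made-up sample event, shaped like the paths in the query.
sample = json.dumps({
    "eventRecordHeader": {
        "Result": 1,
        "StartTime": 1700000000000,
        "Cause": {
            "ErrorCode": "403",
            "SubCause": {"SubProtocol": "Diameter", "SubError": "3002"},
        },
    }
})

fields = extract_fields(sample)
print(fields)
```

Whether this helps in a given Flink job would need to be measured; the sketch only shows that one parse can serve all extracted columns.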

Related

need help for select function in sql

I have a table with 914 rows and a column that contains "Yes" or "No" values (Yes = 193, No = 721, total 914).
I want to create a function, to use in a SELECT statement, that returns how many Yes and No values there are.
I wrote this query:
create function TSS(
@string as nvarchar(20)
)
returns int
begin
declare @result int
if (@string='NO')
select @result=sum(case Re_engaged when 'NO' then 1 else null end) from TVS_PRE
else if (@string='YES')
select @result=sum(case Re_engaged when 'YES' then 1 else null end) from TVS_PRE
return @result
end
select [dbo].[TSS]('Yes') as columns_with_Yes,[dbo].[TSS]('No') as columns_with_No from TVS_PRE
And the result I got is:
Columns_with_Yes  Columns_with_No
1                 193
2                 193
3                 193
upto...           ...
914               193
but I need this:
Columns_with_Yes  Columns_with_No
1                 193
You should combine this into one query
SELECT
sum(case Re_engaged when 'NO' then 1 end) Columns_with_No,
sum(case Re_engaged when 'YES' then 1 end) Columns_with_Yes
FROM TVS_PRE;
If you really want this as a function, you can turn it into an inline table-valued function
CREATE FUNCTION TSS()
RETURNS TABLE
AS RETURN (
SELECT
sum(case Re_engaged when 'NO' then 1 end) Columns_with_No,
sum(case Re_engaged when 'YES' then 1 end) Columns_with_Yes
FROM TVS_PRE
);
GO
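The single-query approach above can be checked with a small, self-contained sketch. It uses SQLite through Python's `sqlite3` module as a stand-in for SQL Server (an assumption made only so the example runs anywhere), and the table is filled with the 193 "YES" and 721 "NO" rows described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TVS_PRE (Re_engaged TEXT)")

# Data shaped like the question: 193 YES rows and 721 NO rows (914 total).
rows = [("YES",)] * 193 + [("NO",)] * 721
conn.executemany("INSERT INTO TVS_PRE VALUES (?)", rows)

# Conditional aggregation: one pass over the table, one result row.
query = """
SELECT
    SUM(CASE Re_engaged WHEN 'NO' THEN 1 END)  AS Columns_with_No,
    SUM(CASE Re_engaged WHEN 'YES' THEN 1 END) AS Columns_with_Yes
FROM TVS_PRE
"""
result = conn.execute(query).fetchall()
print(result)  # [(721, 193)]
```

Because the aggregates have no GROUP BY, the query returns exactly one row, which is what the question asks for.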
I don't understand your need, but if you want only one row, then don't use "from":
select [dbo].[TSS]('Yes') as columns_with_Yes,[dbo].[TSS]('No') as columns_with_No
Use this syntax
SELECT DISTINCT COUNT(CASE WHEN Re_engaged='No' then 1 else NULL end) OVER () AS No_col,
COUNT(CASE WHEN Re_engaged='Yes' then 1 else NULL end) OVER () AS Yes_col
FROM TVS_PRE

SQL Server PIVOT Function To Switch Rows, Columns

I'm trying to switch rows and columns with PIVOT (or another method). The documentation is pretty confusing to me. Thanks
DECLARE @CallCenterID Int;
DECLARE @BeginDateofReview SmallDateTime;
DECLARE @EndDateofReview SmallDateTime;
SELECT
COUNT(case when Score_Greeting = 'Yes' then Score_Greeting END) AS Score_Greeting_Passed,
SUM(CASE WHEN Score_Greeting IS NOT NULL THEN 1 ELSE 0 END) AS Score_Greeting_Reviewed,
ROUND(CONVERT(decimal(4,1), (COUNT(CASE WHEN Score_Greeting = 'Yes' THEN Score_Greeting END) * 100.0) / NULLIF(SUM(CASE WHEN Score_Greeting IS NOT NULL THEN 1 ELSE 0 END),0),0),0) AS Score_Greeting_PctngPassed,
COUNT(CASE WHEN Score_Authentication = 'Yes' THEN Score_Authentication END) AS Score_Authentication_Passed,
SUM(CASE WHEN Score_Authentication IS NOT NULL THEN 1 ELSE 0 END) AS Score_Authentication_Reviewed,
ROUND(CONVERT(decimal(4,1), (COUNT(CASE WHEN Score_Authentication = 'Yes' THEN Score_Authentication END) * 100.0) / NULLIF(SUM(CASE WHEN Score_Authentication IS NOT NULL THEN 1 ELSE 0 END), 0), 0), 0) AS Score_Authentication_PctngPassed
FROM
Calls
WHERE
CallCenterID = @CallCenterID AND
(DateofReview >= @BeginDateofReview AND DateofReview <= @EndDateofReview)
Desired results:
Score_Greeting_Passed 5
Score_Greeting_Reviewed 9
Score_Greeting_PctngPassed 56
Score_Authentication_Passed 6
Score_Authentication_Reviewed 9
Score_Authentication_PctngPassed 67
You can use the UNPIVOT operator to transpose the rows and columns. If we assume your query above is stored in #YourData, it's going to be something like:
SELECT
UnpivotedData.ScoreType
, UnpivotedData.ScoreValue
FROM
#YourData
UNPIVOT (ScoreValue FOR ScoreType IN ( [Score_Greeting_Passed], [Score_Greeting_Reviewed], [Score_Greeting_PctngPassed],
[Score_Authentication_Passed], [Score_Authentication_Reviewed], [Score_Authentication_PctngPassed] )) AS UnpivotedData
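To see the shape of the transposed result, here is a small runnable sketch. SQLite (used here through Python's `sqlite3` so the example is self-contained) has no UNPIVOT operator, so the sketch swaps in UNION ALL, a portable way to get the same rows; it is not the UNPIVOT syntax from the answer. The table name `YourData` and the values are taken from the desired results in the question.

```python
import sqlite3

columns = [
    "Score_Greeting_Passed", "Score_Greeting_Reviewed", "Score_Greeting_PctngPassed",
    "Score_Authentication_Passed", "Score_Authentication_Reviewed",
    "Score_Authentication_PctngPassed",
]

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE YourData ({', '.join(c + ' INTEGER' for c in columns)})")
# One wide row, using the numbers from the desired results.
conn.execute("INSERT INTO YourData VALUES (5, 9, 56, 6, 9, 67)")

# One SELECT per column, stacked with UNION ALL: columns become rows.
unpivot_sql = "\nUNION ALL\n".join(
    f"SELECT '{c}' AS ScoreType, {c} AS ScoreValue FROM YourData" for c in columns
)
unpivoted = dict(conn.execute(unpivot_sql).fetchall())

for score_type in columns:
    print(score_type, unpivoted[score_type])
```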

Get counters as columns in SQL Server

I have many product records, each with a status.
I want to count the records by status and show the counts as columns. I tried CASE, but I get duplicated rows that differ only in the status.
          Received StandBy Stocked Pending
Product-1 2        NULL    NULL    NULL
Product-2 NULL     25      NULL    NULL
Product-1 NULL     5       NULL    NULL
I would like something like this
          Received StandBy Stocked Pending
Product-1 2        5       NULL    NULL
Product-2 NULL     25      NULL    NULL
This is the query that I tried without success:
SELECT
--COALESCE(StatusID, 0) AS StatusID, --=1,2
ProductID,
ProductNumber,
DATEPART(hour,p.ArrivalDate) as ArrivalHour,
DATEPART(minute,p.ArrivalDate) as ArrivalMinute,
ProductWarehouseID,
SUM(CASE WHEN StatusID = 1 THEN 1 END) AS Received,
SUM(CASE WHEN StatusID = 2 THEN 1 END) AS StandBy,
SUM(CASE WHEN StatusID = 3 THEN 1 END) AS Stocked,
SUM(CASE WHEN StatusID = NULL THEN 1 END) AS Pending
FROM Product AS p
GROUP BY
ProductID,
ProductNumber,
DATEPART(hour, p.ArrivalDate),
DATEPART(minute, p.ArrivalDate),
ProductWarehouseID
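A side note on the Pending column in the query above: a comparison written as `StatusID = NULL` does not evaluate to true under standard NULL semantics (the SQL Server default with ANSI_NULLS ON), so that CASE branch never fires; `IS NULL` is the test that matches. The sketch below shows the difference using SQLite through Python's `sqlite3`, with a made-up four-row table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (StatusID INTEGER)")
# Made-up data: two rows with a status, two rows without one.
conn.executemany("INSERT INTO Product VALUES (?)", [(1,), (None,), (None,), (2,)])

eq_null, is_null = conn.execute("""
SELECT
    SUM(CASE WHEN StatusID = NULL THEN 1 END)  AS pending_eq_null,
    SUM(CASE WHEN StatusID IS NULL THEN 1 END) AS pending_is_null
FROM Product
""").fetchone()

print(eq_null)  # None: '= NULL' never matches, so nothing is summed
print(is_null)  # 2
```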
I believe you have given partial query output to simplify your question. If my understanding is correct, you may need to apply an aggregate over your result. Here is an outline:
;WITH Temp AS
(
--Keep your current query here
)
SELECT ProductID, MAX(Received), MAX(StandBy), MAX(Stocked), MAX(Pending)
FROM Temp
GROUP BY ProductID --and any other grouping columns
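The outline above can be tried on the three sample rows from the question. This sketch uses SQLite through Python's `sqlite3` as a stand-in for SQL Server, and a plain table named `Temp` in place of the CTE; both are assumptions made only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Temp (
    ProductID TEXT, Received INTEGER, StandBy INTEGER,
    Stocked INTEGER, Pending INTEGER
)
""")
# The duplicated rows shown in the question.
conn.executemany("INSERT INTO Temp VALUES (?, ?, ?, ?, ?)", [
    ("Product-1", 2, None, None, None),
    ("Product-2", None, 25, None, None),
    ("Product-1", None, 5, None, None),
])

# MAX ignores NULLs, so each product collapses to one row that
# keeps the non-NULL counter of every status column.
collapsed = conn.execute("""
SELECT ProductID, MAX(Received), MAX(StandBy), MAX(Stocked), MAX(Pending)
FROM Temp
GROUP BY ProductID
ORDER BY ProductID
""").fetchall()

for row in collapsed:
    print(row)
```

This works on the sample because each product has at most one non-NULL value per column; if several rows carried counts for the same status, SUM would be the aggregate to use instead of MAX.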
I think your key stumbling point is that NULL + 1 = NULL when the CONCAT_NULL_YIELDS_NULL setting is at its default in SQL Server versions after SQL Server 2000. You may be able to solve your issue by changing that setting, or by including the "else 0" branch that others have posted before this answer.

How To Get Non-Null,Non-0 for Average with SqlServer Select

(Note: I need to go a little further and add NULLIF(0 or 5). I wrote a short post about my answer here:
http://peterkellner.net/2013/10/13/creating-a-compound-nullif-in-avg-function-with-sqlserver/
but I am not happy with my solution.)
I've got a table of results where attendees type in the estimated attendance for a course. If they type 0 or leave it empty, I want to ignore that and get the average of the values that were typed in. I can't figure out how to add that constraint to my AVG function without adding a WHERE clause to the entire query. Is that possible? My code looks like this (EstimatedNumberAttendees is what I'm going after):
SELECT dbo.SessionEvals.SessionId,
AVG(Cast (dbo.SessionEvals.CourseAsWhole as Float)) AS CourseAsWholeAvg,
COUNT(*),
COUNT(case
when dbo.SessionEvals.InstructorPromptness = 'On Time' then 1
else null
end) AS SpeakerOnTime,
COUNT(case
when dbo.SessionEvals.InstructorPromptness = 'Late' then 1
else null
end) AS SpeakerLate,
COUNT(case
when dbo.SessionEvals.InstructorPromptness = 'NoShow' then 1
else null
end) AS SpeakerNoShow,
COUNT(case
when dbo.SessionEvals.PercentFull = '10% to 90%' then 1
else null
end) AS PercentFull10to90,
COUNT(case
when dbo.SessionEvals.PercentFull = '> 90%' then 1
else null
end) AS PercentFullGreaterThan90,
COUNT(case
when dbo.SessionEvals.PercentFull = ' < 10% Full ' then 1
else null
end) AS PercentFullLessThan10,
AVG(Cast (dbo.SessionEvals.EstimatedNumberAttendees as Float)) AS
EstimatedAttending
FROM dbo.Sessions
INNER JOIN dbo.SessionEvals ON (dbo.Sessions.Id =
dbo.SessionEvals.SessionId)
WHERE dbo.Sessions.CodeCampYearId = 8
GROUP BY dbo.SessionEvals.SessionId
AVG omits NULLs. Therefore make it treat 0s as NULLs. Use NULLIF for that:
...
AVG(NULLIF(Cast (dbo.SessionEvals.CourseAsWhole as Float), 0)) AS CourseAsWholeAvg,
...
AVG(NULLIF(Cast (dbo.SessionEvals.EstimatedNumberAttendees as Float), 0)) AS EstimatedAttending
...
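The effect of wrapping the value in NULLIF can be seen on a handful of numbers. The sketch below runs on SQLite through Python's `sqlite3` (a stand-in for SQL Server) with a made-up, simplified `SessionEvals` table. The last expression nests two NULLIF calls to skip both 0 and 5; that is only one possible reading of the "NULLIF(0 or 5)" note in the question, not the author's confirmed solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SessionEvals (EstimatedNumberAttendees REAL)")
# Made-up answers: a zero, an empty value, and three real estimates.
conn.executemany(
    "INSERT INTO SessionEvals VALUES (?)",
    [(0,), (None,), (5,), (10,), (20,)],
)

plain, no_zero, no_zero_or_five = conn.execute("""
SELECT
    AVG(EstimatedNumberAttendees),                        -- NULL skipped, 0 counted
    AVG(NULLIF(EstimatedNumberAttendees, 0)),             -- 0 turned into NULL
    AVG(NULLIF(NULLIF(EstimatedNumberAttendees, 0), 5))   -- 0 and 5 turned into NULL
FROM SessionEvals
""").fetchone()

print(plain)            # 8.75  (0 + 5 + 10 + 20) / 4
print(no_zero)          # (5 + 10 + 20) / 3
print(no_zero_or_five)  # 15.0  (10 + 20) / 2
```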
You can try to use an inner query to get the same sessions but exclude zero and null:
SELECT dbo.SessionEvals.SessionId,
(
SELECT AVG(SE1.CourseAsWhole)
FROM dbo.SessionEvals SE1
WHERE SE1.SessionId = dbo.SessionEvals.SessionId
AND ISNULL(SE1.CourseAsWhole, 0) <> 0
) AS CourseAsWholeAvg,
COUNT(*),
COUNT(case
when dbo.SessionEvals.InstructorPromptness = 'On Time' then 1
else null
end) AS SpeakerOnTime,
COUNT(case
when dbo.SessionEvals.InstructorPromptness = 'Late' then 1
else null
end) AS SpeakerLate,
COUNT(case
when dbo.SessionEvals.InstructorPromptness = 'NoShow' then 1
else null
end) AS SpeakerNoShow,
COUNT(case
when dbo.SessionEvals.PercentFull = '10% to 90%' then 1
else null
end) AS PercentFull10to90,
COUNT(case
when dbo.SessionEvals.PercentFull = '> 90%' then 1
else null
end) AS PercentFullGreaterThan90,
COUNT(case
when dbo.SessionEvals.PercentFull = ' < 10% Full ' then 1
else null
end) AS PercentFullLessThan10,
AVG(Cast (dbo.SessionEvals.EstimatedNumberAttendees as Float)) AS
EstimatedAttending
FROM dbo.Sessions
INNER JOIN dbo.SessionEvals ON (dbo.Sessions.Id =
dbo.SessionEvals.SessionId)
WHERE dbo.Sessions.CodeCampYearId = 8
GROUP BY dbo.SessionEvals.SessionId
SQL AVG function will by default ignore null values so you need to only exclude the 0s. Your AVG code can be changed to below:
AVG(nullif(Cast(dbo.SessionEvals.CourseAsWhole as Float), 0)) AS CourseAsWholeAvg

SQL Server - need assistance with sum and case statements

I'm using SQL Server 2005. I want to count how many of the columns (AM, Midday, Evening) contain the value "YES", and then take that total and multiply it by the rate for each row for a client.
Here is the query I have so far:
Select
Sum(Case When morning = 'yes' Then 1 Else 0 End) am_total,
Sum(Case When midday = 'yes' Then 1 Else 0 End) midday_total
From services
where client_id = 24
with the following output
am_total midday_total
45 49
When I introduce the rate variable, my query starts telling me I need a GROUP BY clause, and I don't think I'm ready for that, since I still need to add am_total and midday_total together first and then multiply that by the rate.
Ultimately, all I'm looking for is the grand total.
If I understand your question, maybe this is what you need
declare @rate int
set @rate = 2 /* whatever the rate is */
select am_total * @rate as am, midday_total * @rate as midday
from (
Select
Sum(Case When morning = 'yes' Then 1 Else 0 End) am_total,
Sum(Case When midday = 'yes' Then 1 Else 0 End) midday_total
From services
where client_id = 24
) as totals -- a derived table needs an alias in SQL Server
You can also join another table and use its columns in calculation
Select
Sum(Case When morning = 'yes' Then 1 * u.rate Else 0 End) am_total,
Sum(Case When midday = 'yes' Then 1 * u.rate Else 0 End) midday_total
From services srv inner join users u on srv.id = u.service_id -- assuming this is the relation
Pay attention to u.rate above.
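Since the question ultimately asks for a single grand total, the derived-table idea can be taken one step further by adding the two totals before multiplying by the rate. The sketch below is a minimal illustration on SQLite through Python's `sqlite3`; the sample rows, the rate of 2 and the alias `totals` are assumptions, and the real `services` table may look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE services (client_id INTEGER, morning TEXT, midday TEXT)")
# Made-up rows: client 24 has two 'yes' mornings and two 'yes' middays.
conn.executemany("INSERT INTO services VALUES (?, ?, ?)", [
    (24, "yes", "yes"),
    (24, "yes", "no"),
    (24, "no", "yes"),
    (24, "no", "no"),
    (7, "yes", "yes"),   # another client, filtered out below
])

# The inner query aggregates first, so the outer query needs no GROUP BY.
grand_total = conn.execute("""
SELECT (am_total + midday_total) * :rate AS grand_total
FROM (
    SELECT
        SUM(CASE WHEN morning = 'yes' THEN 1 ELSE 0 END) AS am_total,
        SUM(CASE WHEN midday = 'yes' THEN 1 ELSE 0 END)  AS midday_total
    FROM services
    WHERE client_id = :client_id
) AS totals
""", {"rate": 2, "client_id": 24}).fetchone()[0]

print(grand_total)  # 8, i.e. (2 + 2) * 2
```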
