CannotPlanException after "CROSS JOIN UNNEST" - apache-flink

When I create a VIEW from a "CROSS JOIN UNNEST" query and then use a condition in the WHERE clause of a second VIEW built on top of it, it throws an "org.apache.calcite.plan.RelOptPlanner$CannotPlanException".
Why am I getting this exception, and what is the right way to handle it?
The following test code reproduces the error.
it should "filter with object_key" in {
tEnv.executeSql(
s"""CREATE TABLE s3_put_event (
| Records ARRAY<
| ROW<
| s3 ROW<
| bucket ROW<name STRING>,
| object ROW<key STRING, size BIGINT>
| >
| >
| >
|) WITH (
| 'connector' = 'datagen',
| 'number-of-rows' = '3',
| 'rows-per-second' = '1',
| 'fields.Records.element.s3.bucket.name.length' = '8',
| 'fields.Records.element.s3.object.key.length' = '15',
| 'fields.Records.element.s3.object.size.min' = '1',
| 'fields.Records.element.s3.object.size.max' = '1000'
|)
|""".stripMargin
)
tEnv.executeSql(
s"""CREATE TEMPORARY VIEW s3_objects AS
|SELECT object_key, bucket_name
|FROM (
| SELECT
| r.s3.bucket.name AS bucket_name,
| r.s3.object.key AS object_key,
| r.s3.object.size AS object_size
| FROM s3_put_event
| CROSS JOIN UNNEST(s3_put_event.Records) AS r(s3)
|) rs
|WHERE object_size > 0
|""".stripMargin
)
tEnv.executeSql(
s"""CREATE TEMPORARY VIEW filtered_s3_objects AS
|SELECT bucket_name, object_key
|FROM s3_objects
|WHERE object_key > ''
|""".stripMargin)
val result = tEnv.sqlQuery("SELECT * FROM filtered_s3_objects")
tEnv.toChangelogStream(result).print()
env.execute()
}
If I move the condition object_key > '' from the "filtered_s3_objects" VIEW into the "s3_objects" VIEW, no exception is thrown.
However, my actual query is complicated, so moving WHERE conditions around like this is not easy, especially when I need to split the output into separate streams.

I'm not sure that you can use a CROSS JOIN UNNEST on an array with a nested hierarchy (given that you have a ROW in your ARRAY). Either way, could you file a Jira ticket for this? https://issues.apache.org/jira/projects/FLINK/issues/
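For reference, the variant the questioner says does plan successfully (moving the filter into the view that contains the UNNEST, so only one view sits on top of the join) would look roughly like this:
CREATE TEMPORARY VIEW s3_objects AS
SELECT object_key, bucket_name
FROM (
  SELECT
    r.s3.bucket.name AS bucket_name,
    r.s3.object.key AS object_key,
    r.s3.object.size AS object_size
  FROM s3_put_event
  CROSS JOIN UNNEST(s3_put_event.Records) AS r(s3)
) rs
WHERE object_size > 0 AND object_key > ''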

Related

Is there a documented list of Snowflake query types?

I am working with the view SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY. It would be extremely helpful to have an exhaustive list of the query types that might appear in the QUERY_TYPE column, along with the commands that generate them. For example, does a PUT command generate a PUT query type? Or is it something like "LOAD"?
If anyone knows where such a list can be found, please post a link. Snowflake's documentation of the view does not provide any list.
Thanks to all who have answered so far. Since the consensus is that no such list exists, here is a merge of the entries provided so far with the values found in my own database. Please keep posting additional answers if your DB contains entries not found below; this way, sooner or later, we will have a fairly complete list:
QUERY_TYPE
CREATE_USER
REVOKE
DROP_CONSTRAINT
RENAME_SCHEMA
UPDATE
CREATE_VIEW
CREATE_TASK
RENAME_TABLE
INSERT
ALTER_TABLE_ADD_COLUMN
RENAME_COLUMN
MERGE
BEGIN_TRANSACTION
ALTER_VIEW_MODIFY_SECURITY
GRANT
ALTER_SESSION
DELETE
DROP_ROLE
DESCRIBE
UNKNOWN
TRUNCATE_TABLE
DROP
SHOW
ALTER_WAREHOUSE_SUSPEND
GET_FILES
UNLOAD
CREATE_NETWORK_POLICY
ALTER_TABLE_DROP_COLUMN
CREATE
REMOVE_FILES
ALTER
ALTER_USER
PUT_FILES
COPY
ALTER_ACCOUNT
DROP_TASK
CREATE_CONSTRAINT
DESCRIBE_QUERY
SELECT
RENAME_USER
COMMIT
RENAME_VIEW
USE
CREATE_TABLE
ALTER_NETWORK_POLICY
CREATE_ROLE
ALTER_TABLE_MODIFY_COLUMN
SET
ALTER_USER_ABORT_ALL_JOBS
ROLLBACK
LIST_FILES
UNSET
CREATE_TABLE_AS_SELECT
DROP_USER
ALTER_WAREHOUSE_RESUME
ALTER_PIPE
ALTER_ROLE
ALTER_TABLE
ALTER_TABLE_DROP_CLUSTERING_KEY
ALTER_USER_RESET_PASSWORD
CREATE_EXTERNAL_TABLE
CREATE_MASKING_POLICY
CREATE_SEQUENCE
CREATE_STREAM
DROP_STREAM
RENAME_DATABASE
RENAME_FILE_FORMAT
RENAME_ROLE
RENAME_WAREHOUSE
RESTORE
By the looks of it, there is no complete list of the query types that show up in this view. The best I can do is give you a list from my own database, which still doesn't contain things like ALTER_ROLE etc. To answer your other question: a PUT command actually shows up as PUT_FILES, by the looks of it:
select distinct query_type from SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY;
+-------------------------+
|QUERY_TYPE |
+-------------------------+
|ALTER |
|ALTER_SESSION |
|ALTER_TABLE_ADD_COLUMN |
|ALTER_TABLE_DROP_COLUMN |
|ALTER_TABLE_MODIFY_COLUMN|
|ALTER_USER |
|ALTER_WAREHOUSE_RESUME |
|ALTER_WAREHOUSE_SUSPEND |
|BEGIN_TRANSACTION |
|COMMIT |
|COPY |
|CREATE |
|CREATE_CONSTRAINT |
|CREATE_EXTERNAL_TABLE |
|CREATE_MASKING_POLICY |
|CREATE_ROLE |
|CREATE_SEQUENCE |
|CREATE_STREAM |
|CREATE_TABLE |
|CREATE_TABLE_AS_SELECT |
|CREATE_USER |
|CREATE_VIEW |
|DELETE |
|DESCRIBE |
|DESCRIBE_QUERY |
|DROP |
|DROP_CONSTRAINT |
|DROP_STREAM |
|DROP_USER |
|GET_FILES |
|GRANT |
|INSERT |
|LIST_FILES |
|MERGE |
|PUT_FILES |
|REMOVE_FILES |
|RENAME_COLUMN |
|RENAME_DATABASE |
|RENAME_TABLE |
|RESTORE |
|REVOKE |
|ROLLBACK |
|SELECT |
|SET |
|SHOW |
|TRUNCATE_TABLE |
|UNKNOWN |
|UNLOAD |
|UPDATE |
|USE |
+-------------------------+
Added ours ... 16 extras ... pass it on :-)
QUERY_TYPE
ALTER
ALTER_ACCOUNT
ALTER_PIPE
ALTER_ROLE
ALTER_SESSION
ALTER_TABLE
ALTER_TABLE_ADD_COLUMN
ALTER_TABLE_DROP_CLUSTERING_KEY
ALTER_TABLE_DROP_COLUMN
ALTER_TABLE_MODIFY_COLUMN
ALTER_USER
ALTER_USER_ABORT_ALL_JOBS
ALTER_USER_RESET_PASSWORD
ALTER_WAREHOUSE_RESUME
ALTER_WAREHOUSE_SUSPEND
BEGIN_TRANSACTION
COMMIT
COPY
CREATE
CREATE_CONSTRAINT
CREATE_EXTERNAL_TABLE
CREATE_MASKING_POLICY
CREATE_NETWORK_POLICY
CREATE_ROLE
CREATE_SEQUENCE
CREATE_STREAM
CREATE_TABLE
CREATE_TABLE_AS_SELECT
CREATE_TASK
CREATE_USER
CREATE_VIEW
DELETE
DESCRIBE
DESCRIBE_QUERY
DROP
DROP_CONSTRAINT
DROP_ROLE
DROP_STREAM
DROP_TASK
DROP_USER
GET_FILES
GRANT
INSERT
LIST_FILES
MERGE
PUT_FILES
REMOVE_FILES
RENAME_COLUMN
RENAME_DATABASE
RENAME_FILE_FORMAT
RENAME_ROLE
RENAME_SCHEMA
RENAME_TABLE
RENAME_USER
RENAME_VIEW
RENAME_WAREHOUSE
RESTORE
REVOKE
ROLLBACK
SELECT
SET
SHOW
TRUNCATE_TABLE
UNKNOWN
UNLOAD
UNSET
UPDATE
USE
Here are some additional ones:
ALTER_AUTO_RECLUSTER
ALTER_SET_TAG
ALTER_TABLE_MODIFY_CONSTRAINT
ALTER_UNSET_TAG
CALL
DROP_SESSION_POLICY
RECLUSTER

Is there a way to get Azure AD user's information using KQL

I am trying to get user's information from Azure AD directly, like DisplayName and UserPrincipalName, using KQL. Is there a way to do so?
I ended up exporting the needed user attributes using PowerShell and copying the output into a blob container, then ran the KQL query below to join the file contents with the query results:
let UserAtt = externaldata (UserPrincipalName:string, DisplayName:string) [
    h@"<URL to the file location in the blob storage>?sp=<secret token>"
] with (format="csv", ignoreFirstRecord=true);
UserAtt
| join kind=inner (
OfficeActivity
| where TimeGenerated > ago(1h)
| where (Operation =~ "Set-Mailbox" and Parameters contains 'ForwardingSmtpAddress')
or (Operation =~ 'New-InboxRule' and Parameters contains 'ForwardTo')
| extend parsed=parse_json(Parameters)
| extend fwdingDestination_initial = (iif(Operation=~"Set-Mailbox", tostring(parsed[1].Value), tostring(parsed[2].Value)))
| where isnotempty(fwdingDestination_initial)
| extend fwdingDestination = iff(fwdingDestination_initial has "smtp", (split(fwdingDestination_initial,":")[1]), fwdingDestination_initial )
| parse fwdingDestination with * '@' ForwardedtoDomain
| parse UserId with *'@' UserDomain
| extend subDomain = ((split(strcat(tostring(split(UserDomain, '.')[-2]),'.',tostring(split(UserDomain, '.')[-1])), '.') [0]))
| where ForwardedtoDomain !contains subDomain
| extend Result = iff( ForwardedtoDomain != UserDomain ,"Mailbox rule created to forward to External Domain", "Forward rule for Internal domain")
| extend ClientIPAddress = case( ClientIP has ".", tostring(split(ClientIP,":")[0]), ClientIP has "[", tostring(trim_start(@'[[]',tostring(split(ClientIP,"]")[0]))), ClientIP )
| extend Port = case(
ClientIP has ".", (split(ClientIP,":")[1]),
ClientIP has "[", tostring(split(ClientIP,"]:")[1]),
ClientIP
)
| project TimeGenerated, UserId, UserDomain, subDomain, Operation, ForwardedtoDomain, ClientIPAddress, Result, Port, OriginatingServer, OfficeObjectId, fwdingDestination
| extend timestamp = TimeGenerated, AccountCustomEntity = UserId, IPCustomEntity = ClientIPAddress, HostCustomEntity = OriginatingServer)
on $left.UserPrincipalName == $right.AccountCustomEntity
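If you just want to sanity-check the externaldata part on its own, a minimal sketch looks like this (the storage URL and SAS token below are placeholders, not from the original query):
let UserAtt = externaldata (UserPrincipalName:string, DisplayName:string) [
    h@"https://<storageaccount>.blob.core.windows.net/<container>/users.csv?<SAS token>"
] with (format="csv", ignoreFirstRecord=true);
UserAtt
| take 10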

Unable to insert SQL table values from Powershell into SQL Server

A script gets data from an API, and I'm trying to import that data into SQL Server using PowerShell.
$params = @{
    ServerInstance = "SQLDB1"
    Database = "Stage"
}
$InsertResults = @"
INSERT INTO [Stage].[dbo].[ImportTable]([roleID],[roleName])
VALUES ('$roleId','$rolename')
"@
foreach($r in $roles) {
    [int]$roleId = $r.id
    $rolename = $r.name
    Invoke-Sqlcmd @params -Query $InsertResults
}
Here, the API returns $r in $roles, where $r.id is a number (which I convert to int) and $r.name is a string; the goal is to put them into a single table side by side as [roleID] and [roleName].
Well, that's the goal. When checking the table in SQL Server, all I get is
|roleID|roleName|
-----------------
| 0 | |
That's if I set roleID as a primary key. If I don't, it repeats that same row as many times as there are lines of data from the API. If I don't include "$rolename = $r.name", then the roleName column just says ".name" and that's that.
What I need looks like
|roleID|roleName|
-----------------
| 1 | role1 |
| 2 | role2 |
| 3 | role3 |
etc.
I see a logic mistake in your code:
You define the query string, with its parameters still uninitialized, outside the loop. PowerShell expands the variables at the moment the string is defined, so your query string never updates, and after the cmdlet executes you see the default values: 0 for the int and NULL (an empty string) for the string.
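A quick illustration of the interpolation timing, using nothing beyond core PowerShell string expansion:
$x = 1
$s = "value is $x"   # $x is expanded right now, when the string is created
$x = 2
$s                   # still prints "value is 1"; later changes to $x have no effect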
So the correct code is:
foreach($r in $roles) {
    [int]$roleId = $r.id
    $rolename = $r.name
    # assign first: the here-string expands $roleId and $rolename at the moment it is defined
    $InsertResults = @"
INSERT INTO [Stage].[dbo].[ImportTable]([roleID],[roleName])
VALUES ('$roleId','$rolename')
"@
    Invoke-Sqlcmd @params -Query $InsertResults
}
I created an array for testing:
$roles = @([pscustomobject]@{id=5; name="test2"},[pscustomobject]@{id=6; name="test"})
I tested your code and saw this result:
|roleID|roleName|
-----------------
| 6 | test |
| 6 | test |
After moving the query string initialization inside the loop, I saw:
|roleID|roleName|
-----------------
| 5 | test2 |
| 6 | test |
Your code can also be modified like this, using sqlcmd scripting variables:
$InsertResults = #"
INSERT INTO [Stage].[dbo].[ImportTable]([roleID],[roleName])
VALUES (`$(roleId),`$(rolename))
"#
foreach($r in $roles) {
$variables = #(
"roleId=$($r.id)",
"rolename='$($r.name)'"
)
Invoke-sqlcmd #params -Query $InsertResults -Variable $variables}
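One more option, if role names might contain quote characters: a parameterized insert avoids string interpolation entirely. A minimal sketch using System.Data.SqlClient (the connection string and the NVARCHAR(100) length are my assumptions, not from your post):
# Parameterized INSERT via System.Data.SqlClient (sketch; adjust connection string and sizes)
$conn = New-Object System.Data.SqlClient.SqlConnection("Server=SQLDB1;Database=Stage;Integrated Security=True")
$conn.Open()
$cmd = $conn.CreateCommand()
$cmd.CommandText = "INSERT INTO dbo.ImportTable (roleID, roleName) VALUES (@roleId, @roleName)"
$null = $cmd.Parameters.Add("@roleId", [System.Data.SqlDbType]::Int)
$null = $cmd.Parameters.Add("@roleName", [System.Data.SqlDbType]::NVarChar, 100)
foreach ($r in $roles) {
    $cmd.Parameters["@roleId"].Value = [int]$r.id
    $cmd.Parameters["@roleName"].Value = [string]$r.name
    $null = $cmd.ExecuteNonQuery()   # one row inserted per role
}
$conn.Close()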

Can a PS Custom Object be created from a variable?

I have a 100-column table in SQL Server, and I want to make it so that not all of the columns need to be present in the file being loaded. I have stored the expected column names in a table, and I compare the file's columns against them in a hash table to find matching columns. Based on the matches, I then generate the code for the array I want to use to insert the data from the file. The problem is that it doesn't like calling the one variable to create the custom object.
I store lines like the following in an array (up to 100 of these; a few are shown below as a sample, and notice, for example, that sqlcolumn2 is skipped):
sqlcolumn1 = if ([string]::IsNullOrEmpty($obj.P1) -eq $true) {$null} else {"$obj.P1"}
sqlcolumn3 = if ([string]::IsNullOrEmpty($obj.P2) -eq $true) {$null} else {"$obj.P2"}
sqlcolumn4 = if ([string]::IsNullOrEmpty($obj.P3) -eq $true) {$null} else {"$obj.P3"}
sqlcolumn5 = if ([string]::IsNullOrEmpty($obj.P4) -eq $true) {$null} else {"$obj.P4"}
Here is how the array is built:
foreach($line in $Final)
{
$DataRow = "$($line."TableColumnName") = if ([string]::IsNullOrEmpty(`$obj.$($line."PName")) -eq `$true) {`$null} else {`"`$obj.$($line."PName")`"}"
$DataArray += $DataRow
}
I then try to use it in a final array that would be looped through for each row of data, after which I would perform the insert from the array. Even though each "string" value in the array above would be correct if it were hand-coded, I can't get it to recognize the rows and run.
foreach ($obj in $data2)
{
$test = [PSCustomObject] @{
$DataArray = Invoke-Expression $DataArray
}
}
If I just use $DataArray on its own, it doesn't like it, because it wants the = sign, which I already have built into the string.
Is what I am trying to do even possible?
I was attempting to create a template for the various ways we receive this data: some people send us 30 of the 100 columns, others more or fewer, and nobody uses exactly the same set of columns. The goal is to cut down on having individual scripts for everything.
Adding more code:
Function ArrayCompare() {
[CmdletBinding()]
PARAM(
[Parameter(Mandatory=$True)]$Array1,
[Parameter(Mandatory=$True)]$A1Match,
[Parameter(Mandatory=$True)]$Array2,
[Parameter(Mandatory=$True)]$A2Match)
$Hash = @{}
foreach ($Data In $Array1) {
$Hash[$Data.$A1Match] += ,$Data
}
foreach ($Data In $Array2) {
$Hash[$Data.$A2Match] += ,$Data
}
foreach ($KeyValue In $Hash.GetEnumerator()){
$Match1, $Match2 = $KeyValue.Value.Where( {$_.$A1Match}, 'Split')
[PSCustomObject]@{
MatchValue = $KeyValue.Key
A1Matches = $Match1.Count
A2Matches = $Match2.Count
TablePosition = [int]$Match2.TablePosition
TableColumnName = $Match2.TableColumnName
# PName is the generic ascending column name (P1, P2, ...) produced by the Import-Excel module: ColumnA = P1, ColumnB = P2, etc., until no data is detected. Allows flexibility without having to know how many columns there are.
PName = $Match1.Name}
}
}
$Server = 'ServerName'
$Catalog = 'DBName'
$DestinationTable = 'ImportIntoTableName'
$FileIdentifierID = 10
$FileName = 'Test.xlsx'
$FilePath = 'C:\'
$FullFilePath = $FilePath + $FileName
$data = Import-Excel -Path $FullFilePath -NoHeader -StartRow 1 # Import-Excel module for working with xlsx files
$data2 = Import-Excel -Path $FullFilePath -NoHeader -StartRow 2 # Import-Excel module for working with xlsx files
$ExpectedHeaderArray = @()
$HeaderArray = @()
$DataArray = @()
$HeaderDetect = @()
$HeaderDetect = $data | Select-Object -First 1 # Header Row In File
$HeaderDetect |
ForEach-Object {
$ColumnValue = $_
$ColumnValue |
Get-Member -MemberType *Property |
Select-Object -ExpandProperty Name |
ForEach-Object {
$HeaderValues = [PSCustomObject]@{
Name = $_
Value = $ColumnValue.$_}
$HeaderArray += $HeaderValues
}
}
# The query below provides a list of all expected file headers and the table column name they map to
$Query = "SELECT TableColumnName, FileHeaderName, TablePosition FROM dbo.FileHeaders WHERE FileIdentifierID = $($FileIdentifierID)"
$ds = Invoke-Sqlcmd -ServerInstance $Server -Database $Catalog -Query $Query -OutputAs DataSet
$ExpectedHeaderArray = foreach($Row in $ds.Tables[0].Rows)
{
new-object psObject -Property @{
TableColumnName = "$($row.TableColumnName)"
FileHeaderName = "$($row.FileHeaderName)"
TablePosition = "$($row.TablePosition)"
}
}
#Use Function Above
#Bring it together so we know what P(##) goes with which header in file/mapped to table column name
$Result = ArrayCompare -Array1 $HeaderArray -A1Match Value -Array2 $ExpectedHeaderArray -A2Match FileHeaderName
$Final = $Result | sort TablePosition
foreach($Line in $Final)
{
$DataRow = "$($Line."TableColumnName") = if ([string]::IsNullOrEmpty(`$obj.$($Line."PName")) -eq `$true) {`$null} else {`"`$obj.$($Line."PName"))`"}"
$DataArray += $DataRow
}
# The output below is what the code inside the final array looks like; it is what I would use when importing the Excel data.
# The goal is to be dynamic: match headers in the file to the stored header values and import into the table (mapped from header column to table column name).
# The reason for this is that before I was here, there were many different "versions" of the layout being handed out. In the end it is all one and the same,
# but some send all 100 columns, some only send a handful, some send 80, etc. I am trying to have everything flow through here vs. 60+ pieces of code/stored procedures/SSIS packages.
Write-Output $DataArray
# Output Sample -- Note how in the sample, P2 and subsequent skip SQLColumn2 because P2 maps to the header value of position 3 in the sql table and each after is one off.
# In this example, SqlColumn2 would not be populated
# SqlColumn1 = if ([string]::IsNullOrEmpty($obj.P1) -eq $true) {$null} else {"$obj.P1"}
# SqlColumn3 = if ([string]::IsNullOrEmpty($obj.P2) -eq $true) {$null} else {"$obj.P2"}
# SqlColumn4 = if ([string]::IsNullOrEmpty($obj.P3) -eq $true) {$null} else {"$obj.P3"}
# SqlColumn5 = if ([string]::IsNullOrEmpty($obj.P4) -eq $true) {$null} else {"$obj.P4"}
# I know this doesn't work. This is where I'm stuck, how to build an array now off of this output from above
foreach ($obj in $data2)
{
$test = [PSCustomObject] @{
$DataArray = Invoke-Expression $DataArray}
}
I'm going to re-state your question first, just to make sure I understand it properly (it's possible I don't!)...
You've got an excel file that looks something like this:
+---+---------+---------+---------+
| | A | B | C |
+---+---------+---------+---------+
| 1 | HeaderA | HeaderB | HeaderC |
+---+---------+---------+---------+
| 2 | Value P | Value Q | Value R |
+---+---------+---------+---------+
| 3 | Value S | Value T | Value U |
+---+---------+---------+---------+
You've also got a database table which looks like this:
+---------+---------+---------+---------+
| ColumnW | ColumnX | ColumnY | ColumnZ |
+---------+---------+---------+---------+
| ....... | ....... | ....... | ....... |
+---------+---------+---------+---------+
and a column mapping table like this (note, ColumnX isn't mapped in this example):
+-----------------+----------------+---------------+
| TableColumnName | FileHeaderName | TablePosition |
+-----------------+----------------+---------------+
| ColumnW | HeaderA | 1 |
+-----------------+----------------+---------------+
| ColumnY | HeaderB | 2 |
+-----------------+----------------+---------------+
| ColumnZ | HeaderC | 3 |
+-----------------+----------------+---------------+
You want to insert the values from the spreadsheet into the database table, using the data in your mapping table so you get this:
+---------+---------+---------+---------+
| ColumnW | ColumnX | ColumnY | ColumnZ |
+---------+---------+---------+---------+
| Value P | null    | Value Q | Value R |
+---------+---------+---------+---------+
| Value S | null    | Value T | Value U |
+---------+---------+---------+---------+
So let's load the spreadsheet (letting the header row generate meaningful property names this time):
$data = Import-Excel -Path ".\MySpreadsheet.xlsx";
write-host ($data | ft | out-string);
# HeaderA HeaderB HeaderC
# ------- ------- -------
# Value P Value Q Value R
# Value S Value T Value U
and get your column mapping data (I'm programmatically creating an in-memory dataset, but you obviously read yours from your database instead):
$mappings = new-object System.Data.DataTable;
$null = $mappings.Columns.Add("TableColumnName", [string]);
$null = $mappings.Columns.Add("FileHeaderName", [string]);
$null = $mappings.Columns.Add("TablePosition", [int]);
@(
    @{ "TableColumnName"="ColumnW"; "FileHeaderName"="HeaderA"; "TablePosition"=1 },
    @{ "TableColumnName"="ColumnY"; "FileHeaderName"="HeaderB"; "TablePosition"=2 },
    @{ "TableColumnName"="ColumnZ"; "FileHeaderName"="HeaderC"; "TablePosition"=3 }
) | % {
$row = $mappings.NewRow();
$row.TableColumnName = $_.TableColumnName;
$row.FileHeaderName = $_.FileHeaderName;
$row.TablePosition = $_.TablePosition;
$mappings.Rows.Add($row);
}
$ds = new-object System.Data.DataSet;
$ds.Tables.Add($mappings);
write-host ($ds.Tables[0] | ft | out-string)
# TableColumnName FileHeaderName TablePosition
# --------------- -------------- -------------
# ColumnW HeaderA 1
# ColumnY HeaderB 2
# ColumnZ HeaderC 3
Now we can build the "mapped" objects:
$values = @();
foreach( $row in $data )
{
$properties = [ordered] @{};
foreach( $mapping in $mappings )
{
$properties.Add($mapping.TableColumnName, $row."$($mapping.FileHeaderName)");
}
$values += new-object PSCustomObject -Property $properties;
}
write-host ($values | ft | out-string)
# ColumnW ColumnY ColumnZ
# ------- ------- -------
# Value P Value Q Value R
# Value S Value T Value U
The tricksy bit is $properties.Add($mapping.TableColumnName, $row."$($mapping.FileHeaderName)"); - basically, you can access object properties in PowerShell using a dotted string literal or variable (I'm not sure of the exact feature name) - e.g.
PS> $myValue = new-object PSCustomObject -Property @{ "aaa"="bbb"; "ccc"="ddd" }
PS> $myValue."aaa"
bbb
PS> $myProperty = "aaa"
PS> $myValue.$myProperty
"bbb"
so $row."$($mapping.FileHeaderName)" is an expression that evaluates to the value of the property of $row named in $mapping.FileHeaderName.
And then finally you can insert the objects into your database using your existing process...
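If it helps, here's a minimal sketch of that last step, reusing the $Server, $Catalog and $DestinationTable variables from your own script; it's a guess at your existing process, not a drop-in:
foreach ($value in $values) {
    # Build the column list and a quoted value list from the mapped object's properties.
    # Note: null values become empty strings here; adjust if you need real NULLs.
    $columns = ($value.PSObject.Properties.Name) -join ", ";
    $quoted = ($value.PSObject.Properties.Value | % { "'" + ("$_" -replace "'", "''") + "'" }) -join ", ";
    Invoke-Sqlcmd -ServerInstance $Server -Database $Catalog -Query "INSERT INTO $DestinationTable ($columns) VALUES ($quoted);";
}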
Note that I couldn't quite work out what your ArrayCompare is actually doing so it's possible the above doesn't solve your problem 100%, but it's hopefully close enough that you can either work the difference out yourself, or leave a comment with where it differs from your desired solution.
Hope this helps.

splunk query taking long time to return the value, can we eliminate append

I initially used inputlookup to get the output, and the query returned in a fraction of a second. Now I want to use the source as input and run the Splunk query below, but it is taking a long time to return output.
Please suggest a solution to optimize the run time.
I am thinking of removing the multiple appends.
index=csvlookups source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_usage.csv" OR source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_dpt_capacity.csv" OR source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_forecasts.csv"
| eval Date=strftime(strptime(Date,"%m/%d/%Y"),"%Y-%m-%d")
| sort Date, CLLI
| rename CLLI as Office
| search Office="CLGRAB21DS1"
| stats sum(Usage) as Usage by Office, Date
| append
[ search index=csvlookups source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_usage.csv" OR source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_dpt_capacity.csv" OR source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_forecasts.csv"
| eval Date=strftime(strptime(Date,"%m/%d/%Y"),"%Y-%m-%d")
| reverse
| search Office="CLGRAB21DS1" AND Type="SIP PBX"
| fields Date NB_RTU
| fields - _raw _time ]
| sort Date
| fillnull value="CLGRAB21DS1" Office
| filldown Usage
| filldown NB_RTU
| fillnull value=0 Usage
| eval _time = strptime(Date, "%Y-%m-%d")
| eval latest_time = if("now" == "now", now(), relative_time(now(), "now"))
| where ((_time >= relative_time(now(), "-3y#h")) AND (_time <= latest_time))
| fields - latest_time Date
| append
[ gentimes start=-1
| eval Date=strftime(mvrange(now(),now()+60*60*24*365*3,"1mon"),"%F")
| mvexpand Date
| fields Date
| append
[ search index=csvlookups source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_usage.csv" OR source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_dpt_capacity.csv" OR source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_forecasts.csv"
| rename "Expected Date of Addition" as edate
| eval edate=strftime(strptime(edate,"%m/%d/%Y"),"%Y-%m-%d")
| rename edate as "Expected Date of Addition"
| table Contact Customer "Expected Date of Addition" "Number of Channels" Switch
| reverse
| search Customer = "Regular Usage" AND Switch = "CLGRAB21DS1"
| rename "Number of Channels" as val
| return $val ]
| reverse
| filldown search
| rename search as Usage
| where Date != ""
| reverse
| append
[ search index=csvlookups source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_usage.csv" OR source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_dpt_capacity.csv" OR source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_forecasts.csv"
| rename "Expected Date of Addition" as edate
| eval edate=strftime(strptime(edate,"%m/%d/%Y"),"%Y-%m-%d")
| rename edate as "Expected Date of Addition"
| table Contact Customer "Expected Date of Addition" "Number of Channels" Switch
| reverse
| search Customer != "Regular Usage" AND Switch = "CLGRAB21DS1"
| rename "Expected Date of Addition" as Date
| eval _time=strptime(Date, "%Y-%m-%d")
| rename "Number of Channels" as Forecast
| stats sum(Forecast) as Forecast by Date]
| sort Date
| rename Switch as Office
| eval Forecast1 = if(isnull(Forecast),Usage,Forecast)
| fields - Usage Forecast
| streamstats sum(Forecast1) as Forecast
| fields - Forecast1
| eval Date=strptime(Date, "%Y-%m-%d")
| eval Date=if(Date < now(), now(), Date) ]
| filldown Usage
| filldown Office
| eval Forecast = Forecast + Usage
| eval Usage = if(Forecast >= 0,NULL,Usage)
| eval _time=if(isnull(_time), Date, _time)
| timechart limit=0 span=1w max(Usage) as Usage, max(NB_RTU) as NB_RTU, max(Forecast) as Forecast by Office
| rename "NB_RTU: CLGRAB21DS1" as "RTU's Purchased", "Usage: CLGRAB21DS1" as "Usage", "Forecast: CLGRAB21DS1" as "Forecast"
| filldown "RTU's Purchased" |sort -Forecast
Definitely an expensive query that you don't want to run often or over large time ranges. In your first append, why are you using reverse? Are you trying to get the latest and earliest times, and is that why you used the append? You could use earliest and latest for this and eliminate the first subsearch. You could also consider eventstats instead of stats in that first search, since you'd still retain the raw data.
You're also summing by _time, so you should think about binning your _time spans (i.e. | bin Date span=1h). Also, why are you using filldown? I'm guessing you want to grab values from different rows and need the rows to match up? If so, use streamstats for this.
If inputlookup was working well, you should stick with that, as you won't get much faster.
It's hard to give specific advice about your query without knowing more about the data and your end goals. In general:
Filter early. Make your base query (before the first '|') as specific as possible, and run your where and search clauses as soon as you can (see the sketch after this list).
Use fields instead of table. It's more efficient.
Sort only when necessary. Usually, it's not necessary.
Fewer appends is better.
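For example, "filter early" applied to the first leg of your query might look like this (just a sketch, reusing the fields from your search; filtering on CLLI in the base search replaces the later rename-then-search on Office):
index=csvlookups source="F:\\SplunkMonitor\\csvlookups\\Core_Network\\lookup_table_sip_pbx_usage.csv" CLLI="CLGRAB21DS1"
| eval Date=strftime(strptime(Date,"%m/%d/%Y"),"%Y-%m-%d")
| stats sum(Usage) as Usage by CLLI, Date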
