How to get the SQL output using Splunk DB Connect for 2 DBs?

I get payment details from one DB and use those values in a second DB to get the results. However, the SQL IN list is limited: payment_id IN (<limited to 999 payment IDs from the 1st DB>). How can this be done with Splunk DB Connect for more than 30K payment IDs? I have the connection strings for both DBs.
1st sql with connection:
| streamstats count as "rc"
| table payment_id rc
| eval rc1 = rc%10
| table payment_id rc1
| eval payment_id = "'"+payment_id+"',"
| stats values(payment_id) as payment_id by rc1
| mvcombine delim="" payment_id
| nomv payment_id
| rex field=payment_id "(?<payment_id>.*)."
| map search="|dbxquery connection = ""
<2nd sql here where using the above payment_id in where condition as below>
(payment_id IN ("$payment_id$"))>
This splits the payment_id values into 10 groups (rc % 10) and sends each group to map as one combined list. I use 10 groups because map runs at most 10 searches by default (maxsearches=10). For example, with 9K payment IDs, 10 lists of about 900 IDs each are formed and sent to the 2nd SQL query. The limitation is that this only works up to about 10K records, since each list must stay within 999 IDs. It won't work for more than 10K payment_ids.
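The batching idea above can be sketched outside SPL. This is a minimal Python illustration, not a DB Connect feature: it splits any number of IDs into lists of at most 999 and builds one IN clause per list. The function names and the made-up IDs are assumptions for the example.

```python
from typing import Iterator, List

MAX_IN_LIST = 999  # the per-IN-list limit quoted in the question


def chunked(ids: List[str], size: int = MAX_IN_LIST) -> Iterator[List[str]]:
    """Yield consecutive slices of ids, each holding at most `size` items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def in_clauses(ids: List[str], size: int = MAX_IN_LIST) -> List[str]:
    """Build one quoted, comma-separated IN clause per chunk of ids."""
    clauses = []
    for chunk in chunked(ids, size):
        # Double any single quote so the ID stays a valid SQL string literal.
        quoted = ",".join("'" + pid.replace("'", "''") + "'" for pid in chunk)
        clauses.append(f"payment_id IN ({quoted})")
    return clauses


# Hypothetical IDs, only to show how many second-DB queries would be needed.
payment_ids = [f"P{n:05d}" for n in range(30000)]
clauses = in_clauses(payment_ids)
print(len(clauses))
```

The number of batches grows with the number of IDs instead of being fixed at 10. In SPL terms this would presumably mean deriving the group from the row count rather than from a fixed rc%10, and raising map's maxsearches to match; that part is untested here.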


Related

Postgresql - Find overlapping time ranges for different users in the same session and present them as pairs

I have a table that records the sessions players have taken part in during group music play (musical instruments).
When a user joins a session and then leaves, one row is created. If they join the same session twice, two rows are created.
Table: music_sessions_user_history
| Column | Type | Default |
| --- | --- | --- |
| id | character varying(64) | uuid_generate_v4() |
| user_id | user_id | |
| created_at | timestamp without time zone | now() |
| session_removed_at | timestamp without time zone | |
| max_concurrent_connections | integer | |
| music_session_id | character varying(64) | |
This table essentially records how long a user was in a given session, so you can think of each row as a time range (a tsrange in PG). max_concurrent_connections is a count of the number of users who were in the session at once.
So the query, at its heart, needs to find overlapping time ranges for different users in the same session and then count them up as pairs that played together.
The query should report each user who played in a music session with others, and who those other users were.
For example, if userA played with userB, and that's the only data in the database, then two rows would be returned:
| User | Other users in the session |
| --- | --- |
|userA | [userB] |
|userB | [userA] |
But if userA played with both userB and userC, then three rows would be returned:
| User | Other users in the session |
| --- | --- |
|userA | [userB, userC]|
|userB | [userA, userC]|
|userC | [userA, userB]|
Any help constructing this query is much appreciated.
Update:
I am able to get the overlapping records using this query:
select m1.user_id, m1.created_at, m1.session_removed_at, m1.max_concurrent_connections, m1.music_session_id
from music_sessions_user_history m1
where exists (select 1
              from music_sessions_user_history m2
              where tsrange(m2.created_at, m2.session_removed_at, '[]') && tsrange(m1.created_at, m1.session_removed_at, '[]')
                and m2.music_session_id = m1.music_session_id
                and m2.id <> m1.id);
I need to find a way to convert these results into pairs.
Create a cursor and, for each fetched record, determine which other records intersect by comparing their start and end times.
Append the intersecting results to a temporary table.
Select the results from the temporary table.
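As an alternative to a cursor, the pairing can be sketched as a self-join. The example below is only an illustration: it uses Python's built-in sqlite3 instead of PostgreSQL, a cut-down copy of the table, and made-up rows. Two rows pair up when they share a session, belong to different users, and their closed intervals overlap.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE music_sessions_user_history (
           id TEXT PRIMARY KEY,
           user_id TEXT,
           created_at TEXT,
           session_removed_at TEXT,
           music_session_id TEXT)"""
)
rows = [  # made-up data: userA overlaps userB and userC, who never overlap each other
    ("1", "userA", "2024-01-01 10:00", "2024-01-01 11:00", "s1"),
    ("2", "userB", "2024-01-01 10:30", "2024-01-01 10:45", "s1"),
    ("3", "userC", "2024-01-01 10:50", "2024-01-01 11:30", "s1"),
    ("4", "userD", "2024-01-01 12:00", "2024-01-01 13:00", "s1"),
]
conn.executemany("INSERT INTO music_sessions_user_history VALUES (?,?,?,?,?)", rows)

# Closed intervals overlap when start1 <= end2 AND start2 <= end1.
# Timestamps are stored as same-format text, so string comparison orders them.
pairs = conn.execute(
    """SELECT DISTINCT m1.user_id, m2.user_id
       FROM music_sessions_user_history m1
       JOIN music_sessions_user_history m2
         ON m2.music_session_id = m1.music_session_id
        AND m2.user_id <> m1.user_id
        AND m1.created_at <= m2.session_removed_at
        AND m2.created_at <= m1.session_removed_at"""
).fetchall()

# Collect the "other users" per user, as in the desired output.
played_with = defaultdict(set)
for user, other in pairs:
    played_with[user].add(other)

result = {user: sorted(others) for user, others in sorted(played_with.items())}
print(result)
```

In PostgreSQL the same join could presumably keep the tsrange && test from the question and aggregate with array_agg grouped by user_id, instead of collecting the pairs in Python.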

SQL Server - same table for multiple customers

I have a need to manage a dataset for multiple customers - each customer manages a small table to update procedure volumes for the next five years. The table is structured like so:
+-------------+--------+--------+--------+--------+--------+
| | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 |
+-------------+--------+--------+--------+--------+--------+
| Procedure A | 5 | 10 | 14 | 12 | 21 |
+-------------+--------+--------+--------+--------+--------+
| Procedure B | 23 | 23 | 2 | 3 | 4 |
+-------------+--------+--------+--------+--------+--------+
| Procedure C | 5 | 6 | 7 | 8 | 12 |
+-------------+--------+--------+--------+--------+--------+
The values in this table will be managed by each customer via MS PowerApps.
This same structure exists for every single customer. What is the best way to put all of these in one dataset?
Should I just add a column for CUSTOMER ID and just put all the data in there?
The process:
Utilizing PowerApps, a new customer deal will be generated and a row will be added for them in the SQL DB in a customer records table.
Simultaneously, the blank template of the above table should be generated for them.
Now, the customer can interface with this SQL table within PowerApps and add their respective procedure volumes.
The question isn't explained well but:
I would assume all of the customer specific data has at least one column that is the same. For instance CustomerName. You could create your own table with CustomerId, CustomerName, (any other fields you would like to see). If there isn't a concept of CustomerId on the customer's tables, you would have to join them on CustomerName. You could populate your own CustomerId for the new table.
I would be happy to help more if you could clarify the question and show a few examples.
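The "add a customer ID column" idea from the question can be sketched as one long-format table shared by all customers. This is a minimal illustration using Python's sqlite3 rather than SQL Server and PowerApps; the table names, column names, and the add_customer helper are assumptions for the example.

```python
import sqlite3

PROCEDURES = ["Procedure A", "Procedure B", "Procedure C"]
YEARS = range(1, 6)  # Year 1 .. Year 5

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE procedure_volumes (
           customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
           procedure_name TEXT NOT NULL,
           year_no INTEGER NOT NULL,
           volume INTEGER,
           PRIMARY KEY (customer_id, procedure_name, year_no))"""
)


def add_customer(name):
    """Insert a customer and a blank template row for every procedure and year."""
    cur = conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
    customer_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO procedure_volumes VALUES (?, ?, ?, NULL)",
        [(customer_id, proc, year) for proc in PROCEDURES for year in YEARS],
    )
    return customer_id


acme = add_customer("Acme")  # hypothetical customer
# The customer later fills in one cell of their own grid.
conn.execute(
    "UPDATE procedure_volumes SET volume = ? "
    "WHERE customer_id = ? AND procedure_name = ? AND year_no = ?",
    (5, acme, "Procedure A", 1),
)
template_rows = conn.execute(
    "SELECT COUNT(*) FROM procedure_volumes WHERE customer_id = ?", (acme,)
).fetchone()[0]
print(template_rows)
```

Storing one row per customer, procedure, and year keeps every customer in a single table. The five-column grid from the question can then be rebuilt per customer by filtering on customer_id and pivoting on year_no.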

SQL Server 2008 exec stored procedure on 3rd strike

I've been asked to create a process that triggers a stored procedure when an employee hits their 3rd strike. The strikes relate to absence, so if an employee is off 3 times in a 3-month period, it hits the trigger.
However, this only applies to separate instances of absence. If a person is off on, for example, 11/01/2016, 12/01/2016 and 13/01/2016, that is one instance, which means I can't simply count the number of days off sick.
The data I have available comes from a fixed process that I can't change:
Date | EmpID | EmpName
01/01/2016 | JS01 | John Spartan
02/01/2016 | JS01 | John Spartan
03/01/2016 | JS01 | John Spartan
08/01/2016 | JS01 | John Spartan
19/02/2016 | JS01 | John Spartan
12/02/2016 | JS01 | John Spartan
Based on the above, there are more than 2 instances, so this would trigger the procedure:
IF EXISTS (<Query Here>)
BEGIN
EXEC usp_ThreeStrikes
END
Is there a way to do this in T-SQL?
If you can't add columns to help with the query (e.g. an InstanceID column that the query could group by to count the instances), I think the best solution would be to create an aggregate CLR function for the task.
https://msdn.microsoft.com/en-us/library/ms131056.aspx
You can try the below approach:
Add an additional column to your table to differentiate if the record should be considered for next strike or not (means after 1 instance, it should not be considered the 2nd time)
Create a SQL Update Trigger to call the procedure, based on the below condition:
Get the records whose column is considered for next strike (same column what you have created in first step)
For those particular records, check whether the count is greater than or equal to 3 and, if so, call the stored procedure
For those particular records, update the additional column (created in step 1) to not consider it for the subsequent strikes
Hope this helps.
Here is a query that lists the empids that were absent three or more times per quarter. You can modify this query in your trigger to select only the empids/quarters that are present in the inserted table of your trigger.
PS: I've added some random absences to show that the query only selects an employee when the number of absences is three or more.
CREATE TABLE #absences(dt DATE,empid NVARCHAR(128),empname NVARCHAR(128));
INSERT INTO #absences(dt,empid,empname)VALUES
('20151212','JS02','John Spartan2'),
('20151213','JS02','John Spartan2'),
('20151010','JS01','John Spartan'),
('20151011','JS01','John Spartan'),
('20151217','JS02','John Spartan2'),
('20151219','JS02','John Spartan2'),
('20160101','JS01','John Spartan'),
('20160102','JS01','John Spartan'),
('20160103','JS01','John Spartan'),
('20160108','JS01','John Spartan'),
('20160201','JS02','John Spartan2'),
('20160203','JS02','John Spartan2'),
('20160219','JS01','John Spartan'),
('20160212','JS01','John Spartan');
SELECT
    empid,
    [quarter]=DATEADD(QUARTER,DATEDIFF(QUARTER,0,o.dt),0)
FROM
    #absences AS o
WHERE
    NOT EXISTS (
        SELECT 1
        FROM #absences AS i
        WHERE i.empid=o.empid AND
              DATEDIFF(QUARTER,0,i.dt)=DATEDIFF(QUARTER,0,o.dt) AND
              i.dt=DATEADD(DAY,-1,o.dt)
    )
GROUP BY
    empid,
    DATEDIFF(QUARTER,0,o.dt)
HAVING
    COUNT(*)>=3;
DROP TABLE #absences;
Result:
+-------+-------------------------+
| empid | quarter |
+-------+-------------------------+
| JS02 | 2015-10-01 00:00:00.000 |
| JS01 | 2016-01-01 00:00:00.000 |
+-------+-------------------------+
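The counting rule behind the query (a day starts a new instance unless the same employee was also absent the day before) can be checked with a short Python sketch. This is an illustration only, not T-SQL: the function name is an assumption, and the sample rows are the ones from the question.

```python
from collections import Counter
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def strikes_per_quarter(absences, threshold=3):
    """Count absence instances per (empid, year, quarter).

    A day starts a new instance unless the same employee was also absent on
    the previous calendar day. Unlike the query above, a run that crosses a
    quarter boundary is counted once, in the quarter where it starts.
    """
    seen = set(absences)
    counts = Counter()
    for empid, day in seen:
        if (empid, day - ONE_DAY) not in seen:
            quarter = (day.month - 1) // 3 + 1
            counts[(empid, day.year, quarter)] += 1
    # Keep only the employees/quarters that reached the strike threshold.
    return {key: n for key, n in counts.items() if n >= threshold}


absences = [  # the rows from the question
    ("JS01", date(2016, 1, 1)),
    ("JS01", date(2016, 1, 2)),
    ("JS01", date(2016, 1, 3)),
    ("JS01", date(2016, 1, 8)),
    ("JS01", date(2016, 2, 19)),
    ("JS01", date(2016, 2, 12)),
]
flagged = strikes_per_quarter(absences)
print(flagged)
```

Like the query, this groups by calendar quarter rather than a rolling 3-month window, and it treats only consecutive calendar days as one instance, so an absence that spans a weekend would count as two.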

How do I show a timestamp member as a date in SQL Server MDX queries

I have a scenario where the dimension has a series of date/time members, but I want to show them grouped by day instead. How do I do that?
Example cube query:
select {[Measures].[Count]} on columns,
[Date].[Date].[Date] on rows
from [Cube]
and this query returns:
| count
2014-03-03 15:50:24.000 | 1
2014-03-03 16:05:10.000 | 1
2014-03-03 16:05:21.000 | 1
2014-03-02 16:30:13.000 | 1
I want to be able to show
| count
2014-03-03 | 3
2014-03-02 | 1
I'm using Microsoft Analysis Services 2008 R2 and its MDX queries.
Maybe this is a little similar to what you're trying to achieve:
WITH
MEMBER [Measures].[countDays] AS
Count((EXISTING [Date].[Calendar].[Date]))
SELECT
{[Measures].[countDays]} ON COLUMNS
,[Date].[Calendar].[Month] ON ROWS
FROM [Adventure Works];
It returns the following:

How to analyze/display raw web analytics data?

I've created a web tracking system that simply inserts event information (a click or a page view) into a simple SQL Server table:
Column | Type | NULL?
-------------------------------------
RequestId | bigint | NOT NULL
PagePath | varchar(50) | NOT NULL
EventName | varchar(50) | NULL
Label | varchar(50) | NULL
Value | float | NULL
UserId | int | NOT NULL
LoggedDate | datetime | NOT NULL
How can I harvest/analyze/display this raw information?
First decide what trends you are most interested in. Perhaps look at some existing web analytics software (free software is available) to see what options exist.
If your requirements are simple, you have enough data. If you want a breakdown of which countries are accessing your website, you need to log IP addresses and get a database that ties IP ranges to countries - these are not 100% reliable but will get you fairly good accuracy.
Some simple examples of reporting you can do with your current data:
Number of hits per hour, day, week, month
Top 20 accessed pages
Top Users
Number of users accessing the site per hour, day, week, month
etc.
Most of these you can pull with a single SQL query using the group by clause and date functions.
Example MS SQL Server query to achieve hits per day (untested):
SELECT COUNT(RequestID) AS NumberOfHits,
YEAR(LoggedDate) AS EventYear,
MONTH(LoggedDate) AS EventMonth,
DAY(LoggedDate) AS EventDay
FROM MyTable
GROUP BY YEAR(LoggedDate), MONTH(LoggedDate), DAY(LoggedDate)
ORDER BY YEAR(LoggedDate), MONTH(LoggedDate), DAY(LoggedDate)
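Two of the other reports from the list, top pages and users per day, follow the same GROUP BY pattern. The sketch below runs against Python's built-in sqlite3 with a cut-down copy of the table and made-up rows, so the date function and LIMIT are SQLite syntax and not SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE MyTable (
           RequestId INTEGER NOT NULL,
           PagePath TEXT NOT NULL,
           UserId INTEGER NOT NULL,
           LoggedDate TEXT NOT NULL)"""
)
events = [  # made-up sample events
    (1, "/home", 1, "2014-03-03 15:50:24"),
    (2, "/home", 2, "2014-03-03 16:05:10"),
    (3, "/about", 1, "2014-03-03 16:05:21"),
    (4, "/home", 1, "2014-03-02 16:30:13"),
]
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", events)

# Top 20 accessed pages: count rows per page, most hits first.
top_pages = conn.execute(
    """SELECT PagePath, COUNT(*) AS Hits
       FROM MyTable
       GROUP BY PagePath
       ORDER BY Hits DESC, PagePath
       LIMIT 20"""
).fetchall()

# Users per day: count distinct users per calendar day.
users_per_day = conn.execute(
    """SELECT date(LoggedDate) AS Day, COUNT(DISTINCT UserId) AS Users
       FROM MyTable
       GROUP BY Day
       ORDER BY Day"""
).fetchall()

print(top_pages)
print(users_per_day)
```

On SQL Server the first query would use SELECT TOP (20) instead of LIMIT, and the day could be derived with the YEAR/MONTH/DAY grouping from the query above.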
Maybe Logparser is sufficient for your needs: http://www.microsoft.com/downloads/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07&displaylang=en
