We have a SQL Server database with a table of tickers, something like:
Ticker | description
-------+-------------
USDHY | High yield
USDIG | Investment grade ...
Now we have a lot of other tables with data (time series) corresponding to these tickers. We want to be able to create a report showing which of these tickers are queried frequently and which are not queried at all. That would allow us to selectively run some procedures on a regular basis for the frequently used tickers and skip the others.
Is there some way to achieve this in SQL - some report which could generate this statistic over a period of time, say n months?
Any help is much appreciated
Seems like no answers so far. As I mentioned, one possibility is to use Extended Events like below:
CREATE EVENT SESSION [TestTableSelectLog]
ON SERVER
ADD EVENT sqlserver.sp_statement_completed (
WHERE [statement] LIKE '%SELECT%TestTable%' --Capture all SELECTs from TestTable
AND [statement] NOT LIKE '%XEStore%' --filter extended event queries
AND [statement] NOT LIKE '%fn_xe_file_target_read_file%'),
ADD EVENT sqlserver.sql_statement_completed (
WHERE [statement] LIKE '%SELECT%TestTable%'
AND [statement] NOT LIKE '%XEStore%'
AND [statement] NOT LIKE '%fn_xe_file_target_read_file%')
ADD TARGET package0.event_file (SET FILENAME=N'C:\Temp\TestTableSelectLog.xel');--log to file
ALTER EVENT SESSION [TestTableSelectLog] ON SERVER STATE=START;--start capture
You can then read from the file using sys.fn_xe_file_target_read_file:
CREATE TABLE TestTable
(
Ticker varchar(10),
[Description] nvarchar(100)
)
SELECT * FROM TestTable
SELECT *, CAST(event_data AS XML) AS 'event_data_XML'
FROM sys.fn_xe_file_target_read_file('C:\Temp\TestTableSelectLog*.xel', NULL, NULL, NULL)
The SELECT statement should be captured.
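To turn the raw capture into the per-ticker report, something like the following sketch could work (my addition, not part of the original session: it shreds the standard Extended Events XML and does a crude LIKE match of each statement's text against the ticker list):
;WITH captured AS
(
    SELECT CAST(event_data AS XML) AS x
    FROM sys.fn_xe_file_target_read_file('C:\Temp\TestTableSelectLog*.xel', NULL, NULL, NULL)
)
SELECT t.Ticker, COUNT(*) AS QueryCount
FROM captured c
CROSS APPLY (SELECT
    c.x.value('(event/data[@name="statement"]/value)[1]', 'nvarchar(max)') AS stmt,
    c.x.value('(event/@timestamp)[1]', 'datetime2') AS event_time) e --XE timestamps are UTC
JOIN TestTable t ON e.stmt LIKE '%' + t.Ticker + '%' --crude match: statement mentions the ticker
WHERE e.event_time >= DATEADD(MONTH, -3, SYSUTCDATETIME()) --n-month window, here 3 months
GROUP BY t.Ticker
ORDER BY QueryCount DESC;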
Extended Events can also be configured from the GUI (Management/Extended Events/Sessions in Management Studio).
Related: Is there any way to calculate the amount of read per second on a Postgres table?
But what I need is to know whether a table has any reads at the moment. (If not, then I can safely drop it.)
Thank you
To figure out if the table is currently in use, run
SELECT pid
FROM pg_locks
WHERE relation = 'mytable'::regclass;
That will return the process ID of all backends using it.
To measure whether a table is used at all, run this query:
SELECT seq_scan + idx_scan + n_tup_ins + n_tup_upd + n_tup_del
FROM pg_stat_user_tables
WHERE relname = 'mytable';
Then repeat the query in a day. If the numbers haven't changed, nobody has used the table.
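If you'd rather automate that check than eyeball it, a minimal sketch (the snapshot table and its columns are my invention, not part of the answer above) is to record the counter on a schedule, e.g. from cron, and diff consecutive rows:
-- hypothetical snapshot table; run the INSERT once a day
create table usage_snapshot (
    taken_at  timestamptz not null default now(),
    relname   name not null,
    activity  bigint not null
);

insert into usage_snapshot (relname, activity)
select relname,
       seq_scan + coalesce(idx_scan, 0) + n_tup_ins + n_tup_upd + n_tup_del
from pg_stat_user_tables
where relname = 'mytable';
-- idx_scan is coalesced in case the table has no indexes
If the activity value is identical in consecutive snapshots, nobody touched the table in between.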
Audit SELECT activity
My suggestion is to wrap mytable in a view (called the_view_to_use_instead in the example) which invokes a logging function on every select, and then select from the view, i.e.
select <whatever you need> from the_view_to_use_instead ...
instead of
select <whatever you need> from mytable ...
So here it is
create table audit_log (table_name text, event_time timestamptz);
create function log_audit_event(tname text) returns void language sql as
$$
insert into audit_log values (tname, now());
$$;
create view the_view_to_use_instead as
select mytable.*
from mytable, log_audit_event('mytable') as ignored;
Every time someone queries the_view_to_use_instead an audit record with a timestamp appears in table audit_log. You can then examine it in order to find out whether and when mytable was selected from and make your decision. Function log_audit_event can be reused in other similar scenarios. The average number of selects per second over the last 24 hours would be
select count(*)::numeric/86400
from audit_log
where event_time > now() - interval '86400 seconds';
I have a table in Teradata named ABC_XXX, where XXX changes on a monthly basis.
For example: ABC_1902, ABC_1812, ABC_1904, etc.
I need to access this table in my application without modifying the code every month.
Is there any way to do this in Teradata, or any alternate solution?
Please help
Can you try using DBC.TABLES in a subquery like below:
with tbl as (
  select 'select * from ' || databasename || '.' || tablename as tb
  from dbc.tables
  where tablename like 'ABC\_%' escape '\' --escape the underscore, which is otherwise a LIKE wildcard
)
select * from tbl;
If you can get the final query executed in your application, you will be able to query the required table without editing the query.
The above solution assumes that the previous month's table is dropped whenever a new month's table is created.
However, if the previous table is not being dropped, then you can try the below approach:
select 'select * from db.ABC_' ||to_char(current_date,'YYMM')
Output will be
select * from db.ABC_1902
Execute the output in your application and you will be able to query the dynamic table.
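If the application can only run a single static query, another option is to repoint a fixed-name view once a month so the application always queries a stable name. A rough sketch (the procedure and view names are hypothetical, and it assumes dynamic DDL via SysExecSQL is permitted in your environment):
CREATE PROCEDURE db.Refresh_ABC_View()
BEGIN
    DECLARE vsql VARCHAR(1000);
    -- e.g. in February 2019 this builds: REPLACE VIEW db.ABC_CURRENT AS SELECT * FROM db.ABC_1902
    SET vsql = 'REPLACE VIEW db.ABC_CURRENT AS SELECT * FROM db.ABC_' ||
               TO_CHAR(CURRENT_DATE, 'YYMM');
    CALL DBC.SysExecSQL(vsql);
END;
The application then always runs SELECT * FROM db.ABC_CURRENT.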
I'm using SQL Server 2012.
SSMS 11.0.6248.0.
I want to create an automated way of inserting data [using a T-SQL insert statement] into a SQL Server table before users start using the system [a third-party business system] each morning.
I do a lot of SSRS reporting and subscription creation; I know how to do inserts using T-SQL, and I am familiar with stored procedures, but I have not had to automate something like this strictly within SQL Server.
Can I make this happen on a schedule - strictly in the SQL Server realm [i.e. using SSRS ... or a stored procedure ... or a function]?
Example Data to read:
Declare @t Table
(
DoctorName Varchar(1),
AppointmentDate Date,
Appointments Int
)
Insert Into @t select 'A','2018-10-23', 5
Insert Into @t select 'B','2018-10-23', 5
Insert Into @t select 'C','2018-10-23', 5
Insert Into @t select 'D','2018-10-23', 5
Insert Into @t select 'E','2018-10-23', 5
Insert Into @t select 'F','2018-10-23', 5
Insert Into @t select 'G','2018-10-23', 5
Insert Into @t select 'H','2018-10-23', 5
Insert Into @t select 'I','2018-10-23', 5;
Select * From @t
The value in Appointments changes through the day as Doctors see patients. Patients may cancel. Patients may walk in. Typically, at the end of the day Doctors end up seeing more patients than they have scheduled at the start of the day. [I set the number at 5 for all Doctors at the start of the above day].
I want to capture the data as it is at the start of each day - before the Clinic opens and the numbers change - and store it in another Table for historic reporting.
I hope this simplified example clarifies what I want to do.
I would appreciate any suggestions on how I might best go about doing this.
Thanks!
This sounds like a job for the SQL Server Agent. A more specific suggestion will require a more detailed description of what you're doing (with sample data, preferably).
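For example, here is a rough T-SQL sketch of an Agent job that snapshots the counts into a history table each morning (the database/table names and the 6 AM schedule are placeholders; the same job can be built through the SSMS GUI under SQL Server Agent > Jobs):
USE msdb;
EXEC dbo.sp_add_job @job_name = N'MorningAppointmentSnapshot';
EXEC dbo.sp_add_jobstep
    @job_name = N'MorningAppointmentSnapshot',
    @step_name = N'Capture snapshot',
    @subsystem = N'TSQL',
    @database_name = N'YourClinicDB', --placeholder
    @command = N'INSERT INTO dbo.AppointmentHistory (DoctorName, AppointmentDate, Appointments, CapturedAt)
                 SELECT DoctorName, AppointmentDate, Appointments, GETDATE()
                 FROM dbo.Appointments;'; --placeholder table names
EXEC dbo.sp_add_jobschedule
    @job_name = N'MorningAppointmentSnapshot',
    @name = N'Daily before opening',
    @freq_type = 4,             --daily
    @freq_interval = 1,         --every day
    @active_start_time = 60000; --06:00:00
EXEC dbo.sp_add_jobserver @job_name = N'MorningAppointmentSnapshot';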
You can use SSIS to create a package that you can then schedule. Since you are familiar with stored procedures, you would create your SP first, then in SSIS add an Execute SQL Task to the Control Flow and configure it according to your needs.
If that doesn't work for you, you could create an application that runs on a timer and executes your SP; however, since you want to stay in the SQL realm, SSIS is the place to look.
I query some data from table A (source) based on certain conditions and insert it into a temp table (destination) before upserting into CRM.
If the data already exists in CRM, I don't want to query it from table A and insert it into the temp table (I want that table to be empty) unless the data was updated or new data was created. So basically I want to query only new data, or modified data from table A that already exists in CRM. At the moment my data flow is like this:
clear temp table - delete sql statement
Query from source table A and insert into temp table.
From temp table insert into CRM using a script component.
In source table A I have audit columns: createdOn and modifiedOn.
I found one way to do this, SSIS DataFlow - copy only changed and new records, but I'm not really clear on how to do it.
What is the best and simplest way to achieve this?
The link you posted is basically saying to stage everything and use a MERGE to update your table (essentially an UPDATE/INSERT).
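For reference, that staged upsert looks roughly like this (a sketch: the table names and the BusinessKey/SomeColumn columns are placeholders; only modifiedOn comes from your question):
MERGE dbo.CrmStagingTarget AS tgt
USING dbo.TempTable AS src
    ON tgt.BusinessKey = src.BusinessKey
WHEN MATCHED AND src.modifiedOn > tgt.modifiedOn THEN
    UPDATE SET tgt.SomeColumn = src.SomeColumn,
               tgt.modifiedOn = src.modifiedOn
WHEN NOT MATCHED BY TARGET THEN
    INSERT (BusinessKey, SomeColumn, modifiedOn)
    VALUES (src.BusinessKey, src.SomeColumn, src.modifiedOn);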
The only way I can really think of to make your process significantly quicker by partially selecting from table A would be to add a "last updated" timestamp to table A and enforce that it is always kept up to date.
One way to do this is with a trigger; see here for an example.
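A minimal sketch of such a trigger (the ID key and LastUpdated column are assumptions about table A):
CREATE TRIGGER trg_TableA_SetLastUpdated
ON dbo.TableA
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- stamp every inserted/updated row with the current time
    -- (no infinite loop with the default RECURSIVE_TRIGGERS OFF setting)
    UPDATE a
    SET LastUpdated = SYSDATETIME()
    FROM dbo.TableA a
    INNER JOIN inserted i ON i.ID = a.ID; --hypothetical key
END;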
You could then select based on that timestamp, perhaps keeping a record of the last timestamp used each time you run the SSIS package, and then adding a margin of safety to that.
Edit: I just saw that you already have a modifiedOn column, so you could use that as described above.
Examples:
There are a few different ways you could do it:
ONE
Include the modifiedOn column in your final destination table.
You can then build a dynamic query for your data flow source in an SSIS string variable, something like:
"SELECT * FROM [table A] WHERE modifiedOn >= DATEADD(DAY, -1, '" + @[User::MaxModifiedOnDate] + "')"
@[User::MaxModifiedOnDate] (string variable) would come from an Execute SQL Task, where you would write the result of the following query to it:
SELECT FORMAT(CAST(MAX(modifiedOn) AS date), 'yyyy-MM-dd') MaxModifiedOnDate FROM DestinationTable
The DATEADD part, as well as the CAST to a certain degree, represent your margin of safety.
TWO
If this isn't an option, you could keep a data load history table that would tell you when you need to load from, e.g.:
CREATE TABLE DataLoadHistory
(
DataLoadID int PRIMARY KEY IDENTITY
, DataLoadStart datetime NOT NULL
, DataLoadEnd datetime
, Success bit NOT NULL
)
You would begin each data load with this (Execute SQL Task):
CREATE PROCEDURE BeginDataLoad
@DataLoadID int OUTPUT
AS
INSERT INTO DataLoadHistory
(
DataLoadStart
, Success
)
VALUES
(
GETDATE()
, 0
)
SELECT @DataLoadID = SCOPE_IDENTITY()
You would store the returned DataLoadID in an SSIS integer variable, and use it when the data load is complete as follows:
CREATE PROCEDURE DataLoadComplete
@DataLoadID int
AS
UPDATE DataLoadHistory
SET
DataLoadEnd = GETDATE()
, Success = 1
WHERE DataLoadID = @DataLoadID
When it comes to building your query for table A, you would do it the same way as before (with the dynamically generated SQL query), except MaxModifiedOnDate would come from the following query:
SELECT FORMAT(CAST(MAX(DataLoadStart) AS date), 'yyyy-MM-dd') MaxModifiedOnDate FROM DataLoadHistory WHERE Success = 1
So the DataLoadHistory table, rather than your destination table.
Note that this would fail on the first run, as there'd be no successful entries in the history table, so you'd need to insert a dummy record, or find some other way around it.
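The dummy record could be as simple as (the dates are arbitrary, just early enough to cover all existing data):
INSERT INTO DataLoadHistory (DataLoadStart, DataLoadEnd, Success)
VALUES ('20000101', '20000101', 1);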
THREE
I've seen it done a lot where, if your data load runs every day, you just stage the last 7 days or something like that - some margin of safety that you're pretty sure will never be exceeded (because the process is being monitored for failures).
It's not my preferred option, but it is simple, and can work if you're confident in how well the process is being monitored.
I am fairly new to SSIS. I came across a situation where I have to use a Data Flow Task. The data source is MS SQL Server 2008 and the destination is a SharePoint list. I used a SQL query for the data source object:
SELECT Customer_ID, Project_ID, Project_Name, Project_Manager_ID, Project_Manager_Name, DeliveryManager1_ID, DM1NAME FROM dbo.LifeScience_Project
WHERE (Customer_ID IN ('1200532', '1200632', '1207916', '1212121', '1217793', '1219351', '1219417', '1219776'))
Now, this is the problem: the customer IDs in the WHERE clause need to come from a different data source. That would make it look something like this:
SELECT Customer_ID, Project_ID, Project_Name, Project_Manager_ID, Project_Manager_Name, DeliveryManager1_ID, DM1NAME FROM dbo.LifeScience_Project
WHERE Customer_ID IN (select customer_id from [Database2].Customer_Master)
Please guide me on how to implement this.
I would run a query in an Execute SQL Task to build a list of customer IDs, and store the result in a variable:
Declare @s varchar(max)
Set @s = ''
SELECT @s = @s + '''' + Cast(customer_id as varchar(20)) + ''','
FROM (select customer_id from [Database2].Customer_Master) As T
Set @s = LEFT(@s, LEN(@s) - 1) --trim the trailing comma so the list fits an IN (...) clause
Select @s
Then in your data flow task, the source query would be built from that variable. Note that a single OLE DB parameter marker (?) cannot expand to a list of values, so set the source's data access mode to "SQL command from variable" (or build the command with an expression), producing:
SELECT Customer_ID, Project_ID, Project_Name, Project_Manager_ID, Project_Manager_Name, DeliveryManager1_ID, DM1NAME FROM dbo.LifeScience_Project
WHERE (Customer_ID IN (<list from the variable>))
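A sketch of that expression in SSIS expression syntax (assuming the list from the first part was stored in a hypothetical variable named @[User::CustomerIdList]):
"SELECT Customer_ID, Project_ID, Project_Name, Project_Manager_ID,
        Project_Manager_Name, DeliveryManager1_ID, DM1NAME
 FROM dbo.LifeScience_Project
 WHERE Customer_ID IN (" + @[User::CustomerIdList] + ")"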
You would have to use a Lookup component within your Data Flow Task. So, for example, write your first query within an OLE DB Source component by choosing SQL Command as the Data Access Mode, just pulling everything, i.e.,
SELECT Customer_ID, Project_ID, Project_Name, Project_Manager_ID, Project_Manager_Name, DeliveryManager1_ID, DM1NAME
FROM dbo.LifeScience_Project
Then connect this source to a Lookup Component and within the Lookup component, instead of choosing a lookup table, select the SQL option and type your second query in there:
select customer_id
from [Database2].Customer_Master
Now do a lookup matching the two Customer_ID fields and only direct output on successful matches. If there are specific IDs you want, you can add them to the second query like so:
select customer_id
from [Database2].Customer_Master
WHERE customer_id IN('someid', 'someid2',...)
That's how I would do it.
EDIT:
In response to Sushant's additional questions for clarification.
(1) You want to use the MATCH output when directing your rows, NOT the NO MATCH output.
(2) Yes, that is correct; make sure your data source points to the other database (the non-SQL 2008 one).
(3) In the Columns tab you simply match Customer_ID to Customer_ID.
(4) In the Advanced tab you don't need to do anything. In the Error tab you can choose to ignore errors, since you don't care about the non-matches.
(5) Yes, that is the correct flow - the prompt when connecting to SharePoint asks whether you want to redirect the MATCHED output from the Lookup or the UNMATCHED output. You want to choose the MATCHED output, as I mentioned in point (1).