How to get data from a distributed table

If I have a table whose structure was updated (e.g. system.query_log after the latest upgrade), but the distributed "view" over it somehow still has the old structure, how can I query the new columns across the entire cluster?
What I mean:
If you have a Distributed table, it can be done easily with:
select count(1) from distributed_query_log where event_date = '2019-01-24'
But select Settings.names, Settings.values from distributed_query_log where event_date = '2019-01-24' limit 1\G will fail, because the distributed table does not have those fields, whereas system.query_log does:
select Settings.names, Settings.values from system.query_log where event_date = '2019-01-24' limit 1\G

The cluster table function was added in ClickHouse release 1.1.54362, so you can do it with:
select Settings.names, Settings.values from cluster('CLUSTER_TITLE', 'system.query_log') where event_date = '2019-01-24' limit 1\G
where CLUSTER_TITLE is your cluster's name.
Thanks: Alexander Bocharov

In the general case: after changing the underlying table, you need to recreate (or ALTER) the Distributed table.
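If you would rather fix the Distributed table in place, here is a minimal sketch of the ALTER route, assuming distributed_query_log is the Distributed table from the question and is only missing the Settings nested columns (the types below match system.query_log):
-- Add the missing nested columns so the Distributed table's
-- structure matches the underlying system.query_log again:
ALTER TABLE distributed_query_log
    ADD COLUMN `Settings.names` Array(String),
    ADD COLUMN `Settings.values` Array(String)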

Related

How to get the most recent queries in Oracle DB

I have a web application and I suspect some records have been deleted manually. Upon enquiry, nobody is admitting the mistake. How can I find out at what time those records were deleted? Is it possible to get a history of DELETE queries?
If you have access to the v$sql view, you can use the following query; the FIRST_LOAD_TIME column contains the time.
select *
from v$sql v
where upper(sql_text) like '%DELETE%';
If flashback query is enabled for your database (try it with select * from table as of timestamp sysdate - 1), then it may be possible to determine the exact time the records were deleted. Use the as of timestamp clause and adjust the timestamp as necessary to narrow down the window in which the records went from existing to deleted.
For example
select *
from table
as of timestamp to_date('21102016 09:00:00', 'DDMMYYYY HH24:MI:SS')
where id = XXX; -- indicates record still exists
select *
from table
as of timestamp to_date('21102016 09:00:10', 'DDMMYYYY HH24:MI:SS')
where id = XXX; -- indicates record does not exist
-- conclusion: record was deleted in this 10 second window

Merge query using two tables in SQL Server 2012

I am very new to SQL and SQL Server, and would appreciate any help with the following problem.
I am trying to update a share price table with new prices.
The table has three columns: share code, date, price.
The share code + date = PK
As you can imagine, if you have thousands of share codes and 10 years' data for each, the table can get very big. So I have created a separate share ID table, and use a share ID instead in the first table (I was reliably informed this would speed up queries, since searching by integer is faster than by string).
So, to summarise, I have two tables as follows:
Table 1 = Share_code_ID (int), Date, Price
Table 2 = Share_code_ID (int), Share_name (string)
So let's say I want to update the table/s with today's price for share ZZZ. I need to:
Look for the Share_code_ID corresponding to 'ZZZ' in table 2
If it is found, update table 1 with the new price for that date, using the Share_code_ID I just found
If the Share_code_ID is not found, insert new rows into both tables
Let's ignore for now how the Share_code_ID is generated for a new code, I'll worry about that later.
I'm trying to use a merge query loosely based on the following structure, but have no idea what I am doing:
MERGE INTO [Table 1]
USING (VALUES (1,23-May-2013,1000)) AS SOURCE (Share_code_ID,Date,Price)
{ SEEMS LIKE THERE SHOULD BE AN INNER JOIN HERE OR SOMETHING }
ON Table 2 = 'ZZZ'
WHEN MATCHED THEN UPDATE SET Table 1.Price = 1000
WHEN NOT MATCHED THEN INSERT { TO BOTH TABLES }
Any help would be appreciated.
http://msdn.microsoft.com/library/bb510625(v=sql.100).aspx
You use Table1 as the target table and Table2 as the source table.
You want to take action when a given ID is not found in Table2, i.e. in the source table.
In the documentation that you have already read, this corresponds to the clause
WHEN NOT MATCHED BY SOURCE ... THEN <merge_matched>
and the latter corresponds to
<merge_matched>::=
{ UPDATE SET <set_clause> | DELETE }
Ergo, you cannot insert into the source table there.
You could use triggers for auto-insertion when you insert something into Table1, but the trigger would not be able to fill in the proper Share_name - it simply won't know it.
So you have two options, I guess:
1) Write a T-SQL code block - look into stored procedures (a sketch of this option follows below). I think there is also a construct for executing an anonymous code block in MS SQL, like the EXECUTE BLOCK command in Firebird SQL Server, but I don't know it for sure.
2) Create an updatable SQL VIEW joining Table1 and Table2 to show the most current data, so that when you insert a row into the view, the view's INSTEAD OF INSERT trigger actually inserts rows into both tables, and when you update data in the view, the INSTEAD OF UPDATE trigger modifies the underlying data.
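A minimal sketch of option 1, assuming Table 1 is called dbo.Prices, Table 2 is called dbo.Shares, and Share_code_ID in dbo.Shares is an IDENTITY column (all three names are hypothetical; the column names follow the question):
CREATE PROCEDURE dbo.UpsertSharePrice
    @ShareName VARCHAR(20),
    @Date      DATE,
    @Price     MONEY
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @ShareCodeID INT;

    -- Look for the Share_code_ID in the lookup table (Table 2)
    SELECT @ShareCodeID = Share_code_ID
    FROM dbo.Shares
    WHERE Share_name = @ShareName;

    -- If not found, add the share to Table 2 first
    IF @ShareCodeID IS NULL
    BEGIN
        INSERT INTO dbo.Shares (Share_name) VALUES (@ShareName);
        SET @ShareCodeID = SCOPE_IDENTITY();
    END

    -- Upsert the price row in Table 1
    MERGE dbo.Prices AS target
    USING (VALUES (@ShareCodeID, @Date, @Price))
          AS source (Share_code_ID, [Date], Price)
       ON target.Share_code_ID = source.Share_code_ID
      AND target.[Date] = source.[Date]
    WHEN MATCHED THEN
        UPDATE SET target.Price = source.Price
    WHEN NOT MATCHED THEN
        INSERT (Share_code_ID, [Date], Price)
        VALUES (source.Share_code_ID, source.[Date], source.Price);
END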

Using Triggers in SQL Server to keep a history

I am using SQL Server 2012
I have a table called AMOUNTS and a table called AMOUNTS_HIST
Both tables have identical columns:
CHANGE_DATE
AMOUNT
COMPANY_ID
EXP_ID
SPOT
UPDATE_DATE [system date]
The Primary Key of AMOUNTS is COMPANY_ID and EXP_ID.
The Primary Key of AMOUNTS_HIST is COMPANY_ID, EXP_ID and CHANGE_DATE
Whenever I add a row in the AMOUNTS table, I would like to create a copy of it in the AMOUNTS_HIST table. [Theoretically, each time a row is added to 'AMOUNTS', the COMPANY_ID, EXP_ID, CHANGE_DATE will be unique. Practically, if they are not, the relevant row in AMOUNTS_HIST would need to be overridden. The code below does not take the overriding into account.]
I created a trigger as follows:
CREATE TRIGGER [MYDB].[update_history] ON [MYDB].[AMOUNTS]
FOR UPDATE
AS
INSERT MYDB.AMOUNTS_HIST (
    CHANGE_DATE,
    AMOUNT,
    COMPANY_ID,
    EXP_ID,
    SPOT,
    UPDATE_DATE
)
SELECT e.CHANGE_DATE,
    e.AMOUNT,
    e.COMPANY_ID,
    e.EXP_ID,
    e.SPOT,
    e.UPDATE_DATE
FROM MYDB.AMOUNTS e
JOIN inserted ON inserted.company_id = e.company_id
    AND inserted.exp_id = e.exp_id
I don't understand why it does nothing at all in my AMOUNTS_HIST table.
Can anyone help?
Thanks,
Probably because the trigger, the way it is currently written, only fires when an UPDATE is done, not an INSERT.
Try changing it to:
CREATE TRIGGER [MYDB].[update_history] ON [MYDB].[AMOUNTS]
FOR UPDATE, INSERT
I just wanted to chime in: have you looked at CDC (Change Data Capture)?
http://msdn.microsoft.com/en-us/library/bb522489(v=sql.105).aspx
"Change data capture is designed to capture insert, update, and delete activity applied to SQL Server tables, and to make the details of the changes available in an easily consumed relational format. The change tables used by change data capture contain columns that mirror the column structure of a tracked source table, along with the metadata needed to understand the changes that have occurred.
Change data capture is available only on the Enterprise, Developer, and Evaluation editions of SQL Server."
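For reference, a minimal sketch of enabling CDC for the table in question (assuming AMOUNTS lives in the MYDB schema, as the trigger definition suggests, and that you are on an edition that supports CDC):
-- Run inside the target database, with sufficient privileges:
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'MYDB',
    @source_name   = N'AMOUNTS',
    @role_name     = NULL;  -- NULL: no gating role for change-data access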
As far as your trigger goes, does it throw any errors when you update [MYDB].[AMOUNTS]?
Also, I believe you can get all your data from the inserted table without needing to join back to mydb.AMOUNTS.
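A minimal sketch of that simplification, assuming the column list from the question (and including the FOR UPDATE, INSERT fix from the answer above):
CREATE TRIGGER [MYDB].[update_history] ON [MYDB].[AMOUNTS]
FOR UPDATE, INSERT
AS
    -- Read the new rows straight from the inserted pseudo-table;
    -- no join back to AMOUNTS is needed.
    INSERT INTO MYDB.AMOUNTS_HIST (CHANGE_DATE, AMOUNT, COMPANY_ID,
                                   EXP_ID, SPOT, UPDATE_DATE)
    SELECT i.CHANGE_DATE, i.AMOUNT, i.COMPANY_ID,
           i.EXP_ID, i.SPOT, i.UPDATE_DATE
    FROM inserted i;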

Populating a table with fields from two other tables

I have two tables in Filemaker:
tableA (which includes fields idA (e.g. a123), date, price) and
tableB (which includes fields idB (e.g. b123), date, price).
How can I create a new table, tableC, with an id field populated from both idA and idB (with the other fields being used for calculations on the combined data of both tables)?
The only way is to script it (for repeating uses) or do it 'manually', if this is an ad-hoc process. Details depend on the situation, so please clarify.
Update: Sorry, I actually forgot about the question. I assume the ID fields do not overlap even across tables, and that you do not need to add the same record more than once but update it instead. In that case, the simplest script would be something like:
Set Variable[ $self, Get( FileName ) ]
Import Records[ $self, Table A -> Table C, sync on ID, update and add new ]
Import Records[ $self, Table B -> Table C, sync on ID, update and add new ]
The Import Records step is managed via a rather elaborate dialog, but the idea is that you import from the same file (you can just type file:<YourFileName> there), the format is FileMaker Pro, and then set the field mapping. Make sure to choose the Update matching records and Add remaining records options and select the ID fields as the key fields to sync by.
It would be a FileMaker script. It could be run as a script trigger, but then it's not going to be seamless to the user. Your best bet is to create the tables, then just run the script as needed (manually) to build Table C. If you have FileMaker Server, you could schedule this script to be run periodically to keep Table C up-to-date.
Maybe you can use the SELECT INTO statement.
I'm unsure whether you wish to use calculated fields from both tableA and tableB, or whether your intention was to calculate fields only from the same table.
If the IDs in tableA.idA also exist in tableB, you can join the two tables and SELECT INTO.
Otherwise, you run the statement once for each table.
Select into statement
Select tableA.idA, tableA.field1A, tableA.field2A,
    tableA.field1A * tableB.field2A as calcField
into New_Table
from tableA
join tableB on tableB.idB = tableA.idA
Edit: I missed the part where you mentioned FileMaker.
But maybe you could script this on the DB side and just drop the table afterwards.

Updating redundant/denormalized data automatically in SQL Server

I use a high level of redundant, denormalized data in my DB designs to improve performance. I'll often store data that would normally need to be joined or calculated. For example, if I have a User table and a Task table, I would store the Username and UserDisplayName redundantly in every Task record. Another example is storing aggregates, such as storing the TaskCount in the User table.
User
UserID
Username
UserDisplayName
TaskCount
Task
TaskID
TaskName
UserID
UserName
UserDisplayName
This is great for performance, since the app has many more reads than insert, update or delete operations, and since some values like Username change rarely. However, the big drawback is that integrity has to be enforced via application code or triggers. This can be very cumbersome with updates.
My question is: can this be done automatically in SQL Server 2005/2008, maybe via a persisted/permanent view? Would anyone recommend another possible solution or technology? I've heard document-based DBs such as CouchDB and MongoDB can handle denormalized data more effectively.
You might want to first try an Indexed View before moving to a NoSQL solution:
http://msdn.microsoft.com/en-us/library/ms187864.aspx
and:
http://msdn.microsoft.com/en-us/library/ms191432.aspx
Using an Indexed View would allow you to keep your base data in properly normalized tables and maintain data-integrity while giving you the denormalized "view" of that data. I would not recommend this for highly transactional tables, but you said it was heavier on reads than writes so you might want to see if this works for you.
Based on your two example tables, one option is:
1) Add a column to the User table defined as:
TaskCount INT NOT NULL DEFAULT (0)
2) Add a Trigger on the Task table defined as:
CREATE TRIGGER UpdateUserTaskCount
ON dbo.Task
AFTER INSERT, DELETE
AS
-- Increment the counts for users who gained tasks
;WITH added AS
(
    SELECT ins.UserID, COUNT(*) AS [NumTasks]
    FROM INSERTED ins
    GROUP BY ins.UserID
)
UPDATE usr
SET usr.TaskCount = (usr.TaskCount + added.NumTasks)
FROM dbo.[User] usr
INNER JOIN added
    ON added.UserID = usr.UserID

-- Decrement the counts for users who lost tasks
;WITH removed AS
(
    SELECT del.UserID, COUNT(*) AS [NumTasks]
    FROM DELETED del
    GROUP BY del.UserID
)
UPDATE usr
SET usr.TaskCount = (usr.TaskCount - removed.NumTasks)
FROM dbo.[User] usr
INNER JOIN removed
    ON removed.UserID = usr.UserID
GO
3) Then do a View that has:
SELECT u.UserID,
u.Username,
u.UserDisplayName,
u.TaskCount,
t.TaskID,
t.TaskName
FROM User u
INNER JOIN Task t
ON t.UserID = u.UserID
And then follow the recommendations from the links above (WITH SCHEMABINDING, a unique clustered index, etc.) to make it "persisted". While recomputing an aggregate on every read would normally be inefficient, this specific case is intentionally denormalized for a workload with far more reads than writes, so the Indexed View keeps the entire structure, including the aggregated TaskCount, physically stored, and each read does not recalculate it.
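Following those recommendations, a minimal sketch of the persisted version (the view and index names are illustrative; indexed views require SCHEMABINDING and two-part table names):
-- Schema-bound view over the two base tables
CREATE VIEW dbo.UserTaskView
WITH SCHEMABINDING
AS
SELECT u.UserID,
       u.Username,
       u.UserDisplayName,
       u.TaskCount,
       t.TaskID,
       t.TaskName
FROM dbo.[User] u
INNER JOIN dbo.Task t
    ON t.UserID = u.UserID;
GO

-- The unique clustered index is what physically materializes the view;
-- TaskID is unique per row of this join, so it can serve as the key.
CREATE UNIQUE CLUSTERED INDEX IX_UserTaskView
    ON dbo.UserTaskView (TaskID);
GO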
Now, if a LEFT JOIN is needed because some Users might not have any Tasks, then the Indexed View will not work, due to the long list of restrictions on creating them (OUTER JOINs are not allowed, for one). In that case, you can create a real table (UserTask) that holds your denormalized structure and have it populated either via a Trigger on just the User table (assuming you use the Trigger I show above, which updates the User table based on changes in the Task table), or you can skip the TaskCount field in the User table and just have Triggers on both tables populate the UserTask table. In the end, this is basically what an Indexed View does, just without you having to write the synchronization Trigger(s).
