I have a table used for a chat. Among others there is a field called userid and a field called timesent. I need to know the latest timesent for each userid in the table, so that I can delete them from the table if they haven't said anything for 3 minutes, in which case I will assume they are gone.
I can't really crack this nut... How do I query this?
I could of course split it up and first select all the userids and then loop through them and select top 1 timesent in my method, but I was wondering if sql alone can do the trick, so I don't need to execute tons of queries.
To get the latest timesent per userid you can use MAX
SELECT userid, MAX(timesent) AS timesent
FROM your_table
GROUP BY userid
Or to do the specified delete you can use
DELETE y1
FROM your_table y1
WHERE NOT EXISTS(SELECT *
FROM your_table y2
WHERE y2.userid = y1.userid
AND y2.timesent >= DATEADD(MINUTE, -3, GETDATE()))
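As a rough way to try both statements without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name chat, the sample rows and the fixed "now" value are made up for illustration, and SQLite's datetime(?, '-3 minutes') stands in for DATEADD(MINUTE, -3, GETDATE()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chat (userid INTEGER, timesent TEXT)")
conn.executemany(
    "INSERT INTO chat VALUES (?, ?)",
    [
        (1, "2024-01-01 12:00:00"),
        (1, "2024-01-01 12:09:30"),  # user 1 spoke 30 seconds before "now"
        (2, "2024-01-01 12:01:00"),
        (2, "2024-01-01 12:05:00"),  # user 2 has been silent for 5 minutes
    ],
)

# Hypothetical fixed clock so the outcome is reproducible.
now = "2024-01-01 12:10:00"

# Latest timesent per userid.
latest = conn.execute(
    "SELECT userid, MAX(timesent) FROM chat GROUP BY userid ORDER BY userid"
).fetchall()

# Delete every row of users that have no message in the last 3 minutes.
conn.execute(
    """
    DELETE FROM chat
    WHERE NOT EXISTS (
        SELECT 1 FROM chat AS y2
        WHERE y2.userid = chat.userid
          AND y2.timesent >= datetime(?, '-3 minutes')
    )
    """,
    (now,),
)
remaining = conn.execute(
    "SELECT DISTINCT userid FROM chat ORDER BY userid"
).fetchall()
```

With this sample data only user 1 should survive the delete, because user 2's newest message is older than the 3 minute cutoff.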
I have a SQL Server table with an expirydate column, and I want to update the rows with the nearest expirydate. Running two queries (select, then update) won't work because two users may update the same row at the same time, so it has to be one query.
The following query:
Update Top(5) table1
Set col1 = 1
Output deleted.* Into table2
This query runs fine but it doesn't sort by expirydate
This query:
WITH q AS
(
SELECT TOP 5 *
FROM table1
ORDER BY expirydate
)
UPDATE table1
SET col1 = 1
OUTPUT deleted.* INTO table2
WHERE table1.id IN (SELECT id FROM q)
It works but again I run the risk of two users updating the same row at the same time
What options do I have to make this work?
Thanks for the help
In these types of scenarios, if you want a more optimistic concurrency approach, you need to include an ORDER BY and/or a WHERE clause to filter the rows.
In application design it is common to use SELECT TOP (@count) FROM... style queries to fill the interface; to execute DELETE or UPDATE statements, however, you would use the primary key to specifically identify the rows to modify.
As long as you are not executing a delete, you could use a timestamp or other date-based discriminator column to ensure that your updates only affect rows that haven't been changed since the last select.
So you could query the current time as part of the select query:
SELECT TOP 5 *, SYSDATETIMEOFFSET() as [Selected]
FROM table1
ORDER BY expirydate
Or query for the timestamp first, and add a created column to the table to track new records so that you do not include them in deletes. Either way, you need to ensure that the query that selects the rows will always return the same records, even if it is run tomorrow. That means no one may modify the expirydate column; if it can be modified, you can't use it as your primary sort or filter key.
DECLARE @selectedTimestamp DateTimeOffset = (SELECT SYSDatetimeoffset())
SELECT TOP 5 *, SYSDATETIMEOFFSET() as [Selected]
FROM table1
WHERE CREATED < @selectedTimestamp
ORDER BY expirydate
Then in your update, make sure you only update the rows if they have not changed since the time they were selected. This requires either a standard audit trigger on the table to keep the created and modified columns up to date, or managing them manually in your update statement:
WITH q AS
(
SELECT TOP 5 *
FROM table1
WHERE CREATED < @selectedTimestamp
ORDER BY expirydate
)
UPDATE table1
SET col1 = 1, MODIFIED = SYSDatetimeoffset()
OUTPUT deleted.* INTO table2
WHERE table1.id IN (SELECT id FROM q)
AND MODIFIED < @selectedTimestamp
In this way we are effectively ignoring our change if another user has already updated records that were in the same or similar initial selection range.
Ultimately you could combine my initial advice to UPDATE based on the primary key AND the modified dates if you are genuinely concerned about the rows being updated twice.
If you need a more pessimistic approach, you could lock the rows with a specific user based flag so that other users cannot even select those rows, but that requires a much more detailed explanation.
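To make the optimistic check above a bit more concrete, here is a small sketch in Python with the built-in sqlite3 module. It is only an approximation of the SQL Server version: LIMIT 5 replaces TOP 5, there is no OUTPUT clause or CREATED filter, and the timestamps, row data and the "other user" update are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, expirydate TEXT, "
    "col1 INTEGER DEFAULT 0, modified TEXT)"
)
conn.executemany(
    "INSERT INTO table1 (id, expirydate, modified) VALUES (?, ?, ?)",
    [(i, f"2024-02-{i:02d}", "2024-01-01 00:00:00") for i in range(1, 8)],
)

# Hypothetical moment at which the first user read the top 5 rows.
selected_at = "2024-01-02 00:00:00"

# Another user touches row 2 after that read.
conn.execute(
    "UPDATE table1 SET col1 = 1, modified = '2024-01-02 00:00:05' WHERE id = 2"
)

# The first user's update skips any row modified since selected_at.
cur = conn.execute(
    """
    UPDATE table1
    SET col1 = 1, modified = ?
    WHERE id IN (SELECT id FROM table1 ORDER BY expirydate LIMIT 5)
      AND modified < ?
    """,
    ("2024-01-02 00:00:10", selected_at),
)
updated = cur.rowcount
row2_modified = conn.execute(
    "SELECT modified FROM table1 WHERE id = 2"
).fetchone()[0]
```

In this run the update should touch four of the five candidate rows; row 2 keeps the other user's modified value, which is the "ignore our change" behaviour described above.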
I have a logging table that is live which saves my value to a table frequently.
My plan is to take those values and put them on a temporary table with
SELECT * INTO #temp from Block
From there I guess my block table is empty and the logger can keep on logging new values.
The next step is that I want to save them in an existing table. I wanted to use
INSERT INTO TABLENAME(COLUMN1,COLUMN2...) SELECT (COLUMN1,COLUMN2...) FROM #temp
The problem is that the #temp table has duplicate primary keys, and I only want to store the last row for each ID.
I have tried DISTINCT but it didn't work. Could not get ROW_Count to work. Are there any ideas on how I should do it? I want to do it with as few reads as possible.
Also, in the future I plan to send them to another database. How do I do that on SQL Server? I guess it's something like FROM Table [in database]?
I couldn't get the blocks to copy. But here goes:
create TABLE Product_log (
Grade char(64),
block_ID char(64) PRIMARY KEY NOT NULL,
Density char(64),
BatchNumber char(64) NOT NULL,
BlockDateID Datetime
);
That is the table I want to store the data in. I do not want duplicates on the id there. The problem is that I get duplicates while logging, since I log on change. Let's say the batch id is 1 and it becomes 2 while logging: I will get the block id twice, once with batch number 1 and once with 2. How do I pick the latter?
Hope I explained enough for guidance. While logging they look like this:
id | SiemensTiaV15_s71200_BatchTester_NewBatchIDValue_VALUE | SiemensTiaV15_s71200_BatchTester_TestWriteValue_VALUE | SiemensTiaV15_s71200_BatchTester_TestWriteValue_TIMESTAMP | SiemensTiaV15_s71200_MainTank_Density_VALUE | SiemensTiaV15_s71200_MainTank_Grade_VALUE
1 | 00545 | S0047782 | 2020-06-09 11:18:44.583 | 0 | xxxxx
2 | 00545 | S0047783 | 2020-06-09 11:18:45.800 | 0 | xxxxx
Please use the query below:
select * from
(select id, SiemensTiaV15_s71200_BatchTester_NewBatchIDValue_VALUE,SiemensTiaV15_s71200_BatchTester_TestWriteValue_VALUE, SiemensTiaV15_s71200_BatchTester_TestWriteValue_TIMESTAMP, SiemensTiaV15_s71200_MainTank_Density_VALUE,SiemensTiaV15_s71200_MainTank_Grade_VALUE,
row_number() over (partition by SiemensTiaV15_s71200_BatchTester_NewBatchIDValue_VALUE order by SiemensTiaV15_s71200_BatchTester_TestWriteValue_TIMESTAMP desc) as rnk
from table_name) qry
where rnk=1;
SELECT * INTO #temp FROM Block;
INSERT INTO Product_log(Grade, block_ID, Density, BatchNumber, BlockDateID)
select NewBatchIDValue_VALUE, TestWriteValue_VALUE, TestWriteValue_TIMESTAMP,
Density_VALUE, Grade_VALUE from
(select NewBatchIDValue_VALUE, TestWriteValue_VALUE,
TestWriteValue_TIMESTAMP, Density_VALUE, Grade_VALUE, row_number() over
(partition by BatchTester_NewBatchIDValue_VALUE order by
BatchTester_TestWriteValue_VALUE) as rnk from #temp) qry
where rnk = 1;
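For anyone who wants to see the ROW_NUMBER() idea work end to end, here is a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The table and column names (block_log, product_log, block_id, batch, logged_at) and the rows are simplified stand-ins, not the real logger schema, and I am assuming that "latest" means the highest timestamp per block id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE block_log (block_id TEXT, batch TEXT, logged_at TEXT)"
)
conn.executemany(
    "INSERT INTO block_log VALUES (?, ?, ?)",
    [
        ("S0047782", "1", "2020-06-09 11:18:44.583"),
        ("S0047782", "2", "2020-06-09 11:18:45.800"),  # later row, same block
        ("S0047783", "2", "2020-06-09 11:18:46.100"),
    ],
)
conn.execute(
    "CREATE TABLE product_log "
    "(block_id TEXT PRIMARY KEY, batch TEXT, logged_at TEXT)"
)

# Number the rows of each block newest-first and keep only row 1,
# so the primary key on product_log is never violated.
conn.execute(
    """
    INSERT INTO product_log (block_id, batch, logged_at)
    SELECT block_id, batch, logged_at
    FROM (
        SELECT block_id, batch, logged_at,
               ROW_NUMBER() OVER (
                   PARTITION BY block_id ORDER BY logged_at DESC
               ) AS rnk
        FROM block_log
    ) AS qry
    WHERE rnk = 1
    """
)
rows = conn.execute(
    "SELECT block_id, batch FROM product_log ORDER BY block_id"
).fetchall()
```

Each block id should end up in product_log exactly once, carrying the batch value from its most recent log row.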
(Submitting on behalf of a Snowflake User)
We have wrong duplicate ids loaded in the table and need to correct them. The rule for updating the id is: whenever there is a time difference of more than 30 minutes, the id should be new/unique. I have written a query to filter those rows out, but the update is not happening.
The query below finds the ids to be updated. For testing I have used one particular id.
select id,
BEFORE_TIME,
TIMESTAMP,
datediff(minute,BEFORE_TIME,TIMESTAMP) time_diff,
row_number() over (PARTITION BY id ORDER BY TIMESTAMP) rowno,
concat(id,to_varchar(rowno)) newid from
(SELECT id,
TIMESTAMP,
LAG(TIMESTAMP_EST) OVER (PARTITION BY visit_id ORDER BY TIMESTAMP) as BEFORE_TIME
FROM table_name t
where id = 'XX1X2375'
order by TIMESTAMP_EST)
where BEFORE_TIME is not NULL and time_diff > 30
order by time_diff desc
;
I can see the 12 records with the same id and a time difference of more than 30 minutes. However, when I try to update, the query succeeds but nothing gets updated.
update table_name t
set t.id = c.newid
from
(select id ,
BEFORE_TIME,
TIMESTAMP,
datediff(minute,BEFORE_TIME,TIMESTAMP) time_diff,
row_number() over (PARTITION BY id ORDER BY TIMESTAMP) rowno,
concat(id,to_varchar(rowno)) newid from
(SELECT id,
TIMESTAMP,
LAG(TIMESTAMP) OVER (PARTITION BY visit_id ORDER BY TIMESTAMP) as BEFORE_TIME
FROM table_name t
where id = 'XX1X2375'
order by TIMESTAMP_EST)
where BEFORE_TIME is not NULL and time_diff > 30
order by time_diff desc) c
where t.id = c.id
and t.timestamp = c.BEFORE_TIME
;
Please note:
I even created a temp table t1 from the above subquery.
I can see the records in table t1.
When doing a select with a join to the main table, I can see the records in the main table as well.
But when I try to update using the new t1, it just shows zero records updated.
I even tried MERGE, but hit the same issue.
MERGE INTO snowplow_data_subset_temp t
USING t1
ON (trim(t.visit_id) = trim(t1.visit_id) and trim(t1.BEFORE_DATE) = trim(t.TIMESTAMP_EST))
WHEN MATCHED THEN UPDATE SET visit_number = newid;
Any recommendations, ideas, or work-arounds? Thanks!
It looks like they may be running into two things:
Was the table t1 that you created a transient or cloned table? Run
SELECT GET_DDL('TABLE', 'schemaname.t1');
to check whether there are any constraints on the temp table in the session where you work on this next. Or you can query the table constraints view:
"Alternatively, retrieve a list of all table constraints by schema (or across all schemas in a database) by querying the TABLE_CONSTRAINTS view in the Information Schema." from: https://docs.snowflake.net/manuals/user-guide/table-considerations.html#referential-integrity-constraints
Since the subquery works fine, the MERGE and UPDATE statements are the clues for what to look for. This is what I found in the documentation for more general info:
Limitations on subqueries:
https://docs.snowflake.net/manuals/user-guide/querying-subqueries.html#limitations
You can also check to see if there are any errors for the update query by altering the session: https://docs.snowflake.net/manuals/sql-reference/sql/update.html#usage-notes
ALTER SESSION SET ERROR_ON_NONDETERMINISTIC_UPDATE=TRUE;
Here is an example of how to use an update with a Temp table:
https://snowflakecommunity.force.com/s/question/0D50Z00008P7BznSAF/can-you-use-a-cte-or-temp-table-with-an-update-statement-to-update-a-table
I am looking forward to seeing how they ended up solving the issue.
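For readers following along, the gap-detection part of the question (LAG plus a 30 minute filter) can be reproduced on a small scale. This is only a sketch in Python with the built-in sqlite3 module (SQLite 3.25+ for window functions), not Snowflake: the visits table, its rows and the strftime-based minute arithmetic are assumptions made for the demo, standing in for DATEDIFF(minute, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("XX1X2375", "2024-03-01 10:00:00"),
        ("XX1X2375", "2024-03-01 10:10:00"),
        ("XX1X2375", "2024-03-01 11:00:00"),  # 50 minutes after previous row
        ("XX1X2375", "2024-03-01 11:20:00"),
        ("XX1X2375", "2024-03-01 12:30:00"),  # 70 minutes after previous row
    ],
)

gaps = conn.execute(
    """
    WITH lagged AS (
        -- previous timestamp within the same id
        SELECT id, ts,
               LAG(ts) OVER (PARTITION BY id ORDER BY ts) AS prev_ts
        FROM visits
    ),
    diffs AS (
        -- whole minutes between consecutive rows
        SELECT id, ts, prev_ts,
               (CAST(strftime('%s', ts) AS INTEGER)
                - CAST(strftime('%s', prev_ts) AS INTEGER)) / 60 AS gap_min
        FROM lagged
        WHERE prev_ts IS NOT NULL
    )
    SELECT ts, gap_min FROM diffs WHERE gap_min > 30 ORDER BY ts
    """
).fetchall()
```

Here only the 11:00 and 12:30 rows qualify. Note that each qualifying row carries its own timestamp in ts and the earlier one in prev_ts, which is worth keeping in mind when choosing the join column for a later UPDATE or MERGE.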
I need to select 10 random rows from a table, but it has to be done in the WHERE clause, because the query is executed by another application that only allows this part to be modified.
I searched for a lot of solutions (select top 10, RAND(), ORDER BY NEWID(), ...), but none work in the WHERE clause.
Is there an option to do that, or some kind of workaround?
Try this:
SELECT *
FROM Test
WHERE Id IN (SELECT TOP 10 Id FROM Test ORDER BY NewId())
If your table has a unique column you can do something like :
SELECT * FROM TABLE WHERE PRIMARYCOLUMN IN (SELECT TOP(10) PRIMARYCOLUMN FROM TABLE ORDER BY NEWID())
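The same "random ids in a subquery" trick can be tried quickly with Python's built-in sqlite3 module. This is just a sketch: ORDER BY RANDOM() LIMIT 10 is SQLite's counterpart to TOP 10 ... ORDER BY NEWID(), and the Test table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Test VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 101)],
)

# Only the WHERE clause carries the randomness; the outer query is fixed.
rows = conn.execute(
    """
    SELECT * FROM Test
    WHERE Id IN (SELECT Id FROM Test ORDER BY RANDOM() LIMIT 10)
    """
).fetchall()
picked_ids = {row[0] for row in rows}
```

Because Id is unique, the outer query returns exactly as many rows as the subquery picked, and a different set on each run.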
I need some help to create a new column in a database in SQL Server 2008.
I have the following data table
Please have a look at a snapshot of my table
In the blank column I would like to put the difference between the current status date and the next status's date. For the last ID_Status of each ID_Ticket, I would like the difference between the current date and its date.
I hope that you got an idea about my problem.
Please share if you have any ideas about how to do .
Many thanks
kind regards
You didn't specify your RDBMS, so I'll post an answer for both SQL Server and MySQL since they are almost identical:
SQL-Server :
SELECT ss.id_ticket, ss.id_status, ss.date_status,
DATEDIFF(day, ss.date_status, COALESCE(ss.next_date, GETDATE())) as diffStatus
FROM (
SELECT t.*,
(SELECT TOP 1 s.date_status FROM YourTable s
WHERE t.id_ticket = s.id_ticket and s.date_status > t.date_status
ORDER BY s.date_status ASC) as next_date
FROM YourTable t) ss
MySQL :
-- MySQL DATEDIFF(a, b) returns a - b in days, so the later date goes first
SELECT ss.id_ticket, ss.id_status, ss.date_status,
DATEDIFF(COALESCE(ss.next_date, NOW()), ss.date_status) as diffStatus
FROM (
SELECT t.*,
(SELECT s.date_status FROM YourTable s
WHERE t.id_ticket = s.id_ticket and s.date_status > t.date_status
ORDER BY s.date_status ASC limit 1) as next_date
FROM YourTable t) ss
This first uses a correlated subquery to fetch the next date using LIMIT/TOP, and then wraps it in another SELECT to calculate the difference between them using DATEDIFF().
It can be done without the wrapping query, but it would be less readable since the correlated query would sit inside the DATEDIFF() function, so I prefer this way.
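If it helps to see the correlated subquery approach run, here is a small sketch with Python's built-in sqlite3 module. SQLite has no DATEDIFF, so the day difference is computed with julianday() instead; the tickets table, its rows and the fixed "today" value are made up so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id_ticket INTEGER, id_status INTEGER, date_status TEXT)"
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        (1, 1, "2024-01-01"),
        (1, 2, "2024-01-04"),
        (1, 3, "2024-01-10"),  # last status of ticket 1
        (2, 1, "2024-01-05"),  # only status of ticket 2
    ],
)

# Hypothetical "current date", used where a status has no successor.
today = "2024-01-15"

rows = conn.execute(
    """
    SELECT ss.id_ticket, ss.id_status,
           CAST(julianday(COALESCE(ss.next_date, ?))
                - julianday(ss.date_status) AS INTEGER) AS diff_days
    FROM (
        SELECT t.*,
               (SELECT s.date_status FROM tickets s
                WHERE s.id_ticket = t.id_ticket
                  AND s.date_status > t.date_status
                ORDER BY s.date_status ASC LIMIT 1) AS next_date
        FROM tickets t
    ) ss
    ORDER BY ss.id_ticket, ss.id_status
    """,
    (today,),
).fetchall()
```

Statuses that have a successor get the days until the next status; the last status of each ticket falls back to the days until "today".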