Cursor variable not updated - sql-server

I'm not understanding why the variable, @NextURLId, in this cursor is not being updated. Here is the code:
DECLARE @NextURLId INT = 1
DECLARE @varContact_Id INT
DECLARE GetURL_Cursor CURSOR FOR
SELECT DISTINCT(contact_id)
FROM obp.Contacts
OPEN GetURL_Cursor
FETCH NEXT FROM GetURL_Cursor INTO @varContact_id
WHILE @@FETCH_STATUS = 0
BEGIN
-- Available URLs have the used value as NULL. Used has value of 1.
SET @NextURLId = (SELECT MIN(id) FROM obp.URL WHERE used IS NULL)
UPDATE obp.Contacts SET URL = (
SELECT url from obp.URL WHERE id = @NextURLId)
UPDATE obp.URL SET
used = 1,
contact_id = @varContact_Id,
date = GETDATE()
WHERE id = @NextURLId
FETCH NEXT FROM GetURL_Cursor INTO @varContact_id
END;
CLOSE GetURL_Cursor
DEALLOCATE GetURL_Cursor
The code is supposed to retrieve a unique URL from a table (obp.URL), enter that URL in the Contacts table and then update the URL to indicate that it has been used. It seems to me that after the URL table is updated with 'used = 1', the next iteration of the code should get a new URLId when I query for it.
However, when I run this code I get the same URL every time. No doubt I am missing something obvious but need some help to point it out.
As an aside, if there is a set-based solution for this, I'd be happy to hear it.
TIA

This
UPDATE obp.Contacts SET URL = (
SELECT url from obp.URL WHERE id = @NextURLId)
updates every row in obp.Contacts with the same URL, because it has no WHERE clause. Add a proper WHERE clause like
WHERE contact_id=@varContact_id
About the requirement for this: I understand that you want to associate a Contact with a URL and that there is no logical rule for which goes with which. At first sight I would consider a match table the right way to do this. It feels better to me to put such associations into a separate table, even if there is a strong belief in a 1:1 relationship between the two associated objects. obp.URL and obp.Contacts are dimensional tables (I assume/hope). Keeping the association in one separate table means a change requires only one action. In your model a change must be reflected in both of those tables.
Here is an idea for such a table:
create table Contact_URL_match
(ID int identity (1,1)
,URL_id int not null unique
,contact_id int not null unique
,created datetime)
The unique constraints disallow insertion of the same URL or the same contact_id twice. On each insert/update the existing rows are checked for duplicates, and if one is found the action is rejected, so uniqueness is protected.
For manifesting new matches in a first large initial action try this (haven't tested, just an idea)
INSERT INTO
Contact_URL_match
(URL_id
,contact_id
,created)
SELECT
urls.id
,contacts.contact_id
,getdate()
FROM
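(SELECT
contact_id
,ROW_NUMBER() over (ORDER BY contact_id asc) rn
FROM
-- deduplicate first: DISTINCT next to ROW_NUMBER() would not remove duplicates
(SELECT DISTINCT contact_id FROM obp.Contacts) c) contacts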
INNER JOIN
(SELECT
id
,ROW_NUMBER() over (ORDER BY id) rn
FROM
obp.URL) urls
ON
contacts.rn=urls.rn
Within the subqueries this creates a row number in both source tables based on the ORDER BY clauses. It then joins the result sets of the subqueries on that row number, so the n-th contact is paired with the n-th URL; the pairing is deliberately arbitrary. I hope you want that. The result of that join is inserted into the match table.
If later you want to manifest a single new association you could add WHERE clauses to the subqueries that specify what URL you want matched with what Contact. Before picking a URL or Contact check the match table with NOT EXISTS to make sure it is not used in there.
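A single later association could look like the following sketch. It assumes the Contact_URL_match table above; the variable @contact_id and its sample value are hypothetical, and taking the lowest unused URL id is just one possible rule (untested, like the statement above).

```sql
DECLARE @contact_id INT = 42; -- hypothetical contact that needs a URL

INSERT INTO Contact_URL_match (URL_id, contact_id, created)
SELECT TOP (1)
    u.id,
    @contact_id,
    GETDATE()
FROM obp.URL u
-- skip URLs that are already matched
WHERE NOT EXISTS (SELECT 1 FROM Contact_URL_match m WHERE m.URL_id = u.id)
-- insert nothing if this contact already has a URL
  AND NOT EXISTS (SELECT 1 FROM Contact_URL_match m WHERE m.contact_id = @contact_id)
ORDER BY u.id;
```

The unique constraints remain the final safeguard if two sessions run this at the same time; the NOT EXISTS checks only avoid the obvious violations.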
EDIT : syntax errors cleaned

Related

in-place update leads to forwarded records

I am well aware what a forwarded record within a heap is.
Since I want to keep forwarded records at 0, we decided to update only on columns that could not be extended.
Recently on my system I encountered forwarded records.
Table design is like this:
CREATE TABLE dbo.test (
HashValue BINARY(16) NOT NULL,
LoadTime DATETIME NOT NULL,
LoadEndTime DATETIME NULL,
[other columns that never get updates]
) WITH(DATA_COMPRESSION=PAGE);
The insert statements ALWAYS bring all the columns, so none is left NULL. I checked the query logs.
I insert a value of '9999-12-31' for the LoadEndTime.
Now the system performs an update on LoadEndTime like this.
;WITH CTE AS (
SELECT *, COALESCE(LEAD(LoadTime) OVER(PARTITION BY HashValue ORDER BY LoadTime) ,'9999-12-31') as EndTimeStamp
FROM dbo.test
)
UPDATE CTE SET LoadEndTime = EndTimeStamp;
Since the LoadEndTime column is always filled, there should be no extension of that column within the row when the update is executed. It should be an in-place update. Still, I always get forwarded records after that process... It doesn't make sense to me...
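To pin down which step creates the forwarded records, the count can be read before and after the update. A minimal sketch, assuming dbo.test is a heap in the current database:

```sql
-- index_id 0 addresses the heap itself.
-- forwarded_record_count is NULL in LIMITED mode, so DETAILED (or SAMPLED) is needed.
SELECT index_type_desc,
       alloc_unit_type_desc,
       page_count,
       record_count,
       forwarded_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.test'), 0, NULL, 'DETAILED');
```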

SQL Server - Update All Records, Per Group, With Result of SubQuery

If anyone could even just help me phrase this question better I'd appreciate it.
I have a SQL Server table, let's call it cars, which contains entries representing items and information about their owners including car_id, owner_accountNumber, owner_numCars.
We're using a system that sorts 'importantness of owner' based on number of cars owned, and relies on the owner_numCars column to do so. I'd rather not adjust this, if reasonably possible.
Is there a way I can update owner_numCars per owner_accountNumber using a stored procedure? Maybe some other more efficient way I can accomplish every owner_numCars containing the count of entries per owner_accountNumber?
Right now the only way I can think to do this is to (from the C# application):
SELECT owner_accountNumber, COUNT(*)
FROM mytable
GROUP BY owner_accountNumber;
and then foreach row returned by that query
UPDATE mytable
SET owner_numCars = <count result>
WHERE owner_accountNumber = <accountNumber result>
But this seems wildly inefficient compared to having the server handle the logic and updates.
Edit - Thanks for all the help. I know this isn't really a well set up database, but it's what I have to work with. I appreciate everyone's input and advice.
This solution takes into account that you want to keep the owner_numCars column in the CARs table and that the column should always be accurate in real time.
I'm defining table CARS as a table with attributes about cars, including each car's current owner. The number of cars owned by the current owner is de-normalized into this table. Say I, LAS, own three cars; then there are three entries in table CARS, as such:
car_id owner_accountNumber owner_numCars
1 LAS1 3
2 LAS1 3
3 LAS1 3
For owner_numCars to be used as an importance factor in a live interface, you'd need to update owner_numCars for every car every time LAS1 sells or buys a car or is removed from or added to a row.
Note you need to update CARS for both the old and new owners. If Sam buys car1, both Sam's and LAS' totals need to be updated.
You can use this procedure to update the rows. This SP is very context sensitive. It needs to be called after rows have been deleted or inserted for the deleted or inserted owner. When an owner is updated, it needs to be called for both the old and new owners.
To update in real time as cars change owners:
create procedure update_car_count
@p_acct nvarchar(50) -- use your actual datatype here
AS
update CARS
set owner_numCars = (select count(*) from CARS where owner_accountNumber = @p_acct)
where owner_accountNumber = @p_acct;
GO
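As a usage sketch, a change of ownership could call the procedure for both accounts inside one transaction. The account values LAS1 and SAM1 and the car_id are sample data only:

```sql
BEGIN TRANSACTION;

-- Hypothetical transfer: car 1 moves from LAS1 to SAM1.
UPDATE CARS
SET owner_accountNumber = N'SAM1'
WHERE car_id = 1;

-- Refresh the stored counts for both the old and the new owner.
EXEC update_car_count @p_acct = N'LAS1';
EXEC update_car_count @p_acct = N'SAM1';

COMMIT TRANSACTION;
```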
To update all account_owners:
create procedure update_car_count_all
AS
update C
set owner_numCars = (select count(*) from CARS where owner_accountNumber = C.owner_accountNumber)
from CARS C
GO
I think what you need is a View. If you don't know, a View is a virtual table that displays/calculates data from a real table and stays current as the table data changes. So if you want to see your table with owner_numCars added you could do:
SELECT a.*, b.owner_numCars
from mytable as a
inner join
(SELECT owner_accountNumber, COUNT(*) as owner_numCars
FROM mytable
GROUP BY owner_accountNumber) as b
on a.owner_accountNumber = b.owner_accountNumber
You'd want to remove the owner_numCars column from the real table since you don't need to actually store that data on each row. If you can't remove it you can replace a.* with an explicit list of all the fields except owner_numCars.
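Wrapped up as an actual view with an explicit column list, the query might look like this sketch. The view name is hypothetical, and only the columns named in the question are listed:

```sql
CREATE VIEW vCarsWithOwnerCount  -- hypothetical view name
AS
SELECT a.car_id,
       a.owner_accountNumber,
       b.owner_numCars          -- computed on read, not stored
FROM mytable AS a
INNER JOIN
    (SELECT owner_accountNumber, COUNT(*) AS owner_numCars
     FROM mytable
     GROUP BY owner_accountNumber) AS b
    ON a.owner_accountNumber = b.owner_accountNumber;
```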
You don't want to run SQL to update this value. What if it doesn't run for a long time? What if someone loads a lot of data and then runs the score and finds a guy that has 100 cars counts as a zero b/c the update didn't run. Data should only live in 1 place, updating has it living in 2. You want a view that pulls this value from the tables as it is needed.
CREATE VIEW vOwnersInfo
AS
SELECT o.*,
ISNULL(c.Cnt,0) AS Cnt
FROM OWNERS o
LEFT JOIN
(SELECT OwnerId,
COUNT(1) AS Cnt
FROM Cars
GROUP BY OwnerId) AS c
ON o.OwnerId = c.OwnerId
There are a lot of ways of doing this. Here is one way using the COUNT() OVER window function and an updatable Common Table Expression [CTE]. That way you won't have to worry about relating data back, ids etc.
;WITH cteCarCounts AS (
SELECT
owner_accountNumber
,owner_numCars
,NewNumberOfCars = COUNT(*) OVER (PARTITION BY owner_accountNumber)
FROM
MyTable
)
UPDATE cteCarCounts
SET owner_numCars = NewNumberOfCars
However, from a design perspective I would raise the question of whether this value (owner_numCars) should be on this table or on what I assume would be the owner table.
Rominus did make a good point of using a view if you want the data to always reflect the current value. You could also do it with a table valued function, which could be more performant than a view. But if you are simply showing it then you could simply do something like this:
SELECT
owner_accountNumber
,owner_numCars = COUNT(*) OVER (PARTITION BY owner_accountNumber)
FROM
MyTable
By adding a where clause to either the CTE or the SELECT statement you will effectively limit your dataset and the solution should remain fast. E.g.
WHERE owner_accountNumber = @owner_accountNumber
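The table valued function mentioned above could be sketched as an inline function like this. The function name and the parameter's data type are assumptions:

```sql
CREATE FUNCTION dbo.fnOwnerCarCount (@owner_accountNumber NVARCHAR(50)) -- hypothetical name, assumed type
RETURNS TABLE
AS
RETURN
(
    SELECT owner_accountNumber,
           COUNT(*) AS owner_numCars
    FROM MyTable
    WHERE owner_accountNumber = @owner_accountNumber
    GROUP BY owner_accountNumber
);
GO

-- Usage with a sample account number
SELECT owner_accountNumber, owner_numCars
FROM dbo.fnOwnerCarCount(N'LAS1');
```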

Updating column based on three tables

I know it's very unprofessional, but it's our business system so I can't change it.
I have three tables: t_posList, t_url, t_type. The table t_posList has a column named URL which is also stored in the table t_url (the ID of the table t_url is not saved in t_posList so I have to find it like posList.Url = t_url.Url).
The column t_posList.status of a data row should be updated to 'non-customer' (it will be a status id, but let's keep it simple) if the ID of the matching t_url row can NOT be found in t_type.url_id.
So the query has like two steps: first I have to get all of the data rows where t_posList.Url = t_url.Url. After this I have to check which ID's of the found t_url rows can NOT be found in t_type.url_id.
I really hope you know what I mean. Because our system is very unprofessional and my SQL knowledge is not that good I'm not able to make this query.
EDIT: I tried this:
UPDATE t_poslist SET status = (
SELECT 'non-customer'
FROM t_url, t_type
WHERE url in
(select url from t_url
LEFT JOIN t_type ON t_url.ID = t_type.url_id
WHERE t_type.url_id is null)
)
What about this?
UPDATE p
SET status = 'non-customer'
FROM t_poslist p
INNER JOIN t_url u ON u.url = p.url
WHERE NOT EXISTS
(
SELECT * FROM t_type t WHERE t.url_id = u.ID
)
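Before running the UPDATE, the same join and filter can be used as a plain SELECT to preview which rows would change. A sketch using only the tables and columns from the question:

```sql
SELECT p.url,
       p.status,        -- current status, about to become 'non-customer'
       u.ID AS url_id
FROM t_poslist p
INNER JOIN t_url u ON u.url = p.url
WHERE NOT EXISTS
(
    SELECT * FROM t_type t WHERE t.url_id = u.ID
);
```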

globally unique integer based ID (sequential) for a given location

I need to create a unique ID for a given location, and the location's ID must be sequential. So it's basically like a primary key, except that it is also tied to the locationID. So 3 different locations will all have IDs like 1,2,3,4,5,...,n
What is the best way to do this?
I also need a safe way of getting the nextID for a given location, I'm guessing I can just put a transaction on the stored procedure that gets the next ID?
One of the ways I've seen this done is by creating a table mapping the location to the next ID.
CREATE TABLE LocationID (
Location varchar(32) PRIMARY KEY,
NextID int DEFAULT(1)
)
Inside your stored procedure you can do an update and grab the current value while also incrementing the value:
...
UPDATE LocationID SET @nextID = NextID, NextID = NextID + 1 WHERE Location = @Location
...
The above may not be very portable and you may end up getting the incremented value instead of the current one. You can adjust the default for the column as desired.
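Put into a complete procedure, the statement might look like the sketch below. The procedure name and the sample location are hypothetical. In SQL Server, the form SET @variable = column, column = expression is documented to give the variable the pre-update value of the column.

```sql
CREATE PROCEDURE dbo.GetNextLocationID  -- hypothetical procedure name
    @Location VARCHAR(32),
    @nextID   INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    -- One atomic statement: hand out the current value and increment the counter.
    UPDATE LocationID
    SET @nextID = NextID,
        NextID  = NextID + 1
    WHERE Location = @Location;
END
GO

-- Usage with a sample location
DECLARE @id INT;
EXEC dbo.GetNextLocationID @Location = 'Warehouse-A', @nextID = @id OUTPUT;
SELECT @id AS AllocatedID;
```

If no row exists for the location, nothing is updated and @nextID stays NULL, so the caller should check for that.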
Another thing to be cautious of is how often you'll be hitting this and if you're going to hit it from another stored procedure, or from application code. If it's from another stored procedure, then one at a time is probably fine. If you're going to hit it from application code, you might be better off grabbing a range of values and then doling them out to your application one by one and then grabbing another range. This could leave gaps in your sequence if the application goes down while it still has a half allocated block.
You'll want to wrap both the code to find the next ID and the code to save the row in the same transaction. You don't want (pseudocode):
transaction {
id = getId
}
... other processing
transaction {
createRowWithNewId
}
Because another object with that id could be saved during "... other processing".
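In T-SQL, keeping both steps in one transaction could be sketched as follows. It assumes the LocationID table from the earlier answer; the Orders table, its columns and the sample location are hypothetical.

```sql
SET XACT_ABORT ON;  -- roll back the whole transaction on a runtime error

BEGIN TRANSACTION;

DECLARE @Location VARCHAR(32) = 'Warehouse-A'; -- sample value
DECLARE @nextID INT;

-- Take the next ID; the lock on this counter row is held until COMMIT,
-- so other callers for the same location wait their turn.
UPDATE LocationID
SET @nextID = NextID, NextID = NextID + 1
WHERE Location = @Location;

-- Save the row that uses the ID in the same transaction.
INSERT INTO Orders (Location, LocationSeq)  -- hypothetical table and columns
VALUES (@Location, @nextID);

COMMIT TRANSACTION;
```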
If this doesn't need to be persisted, you could always do this in your query versus storing it in the table itself.
select
locationID
,row_number() over (partition by locationID order by (select null)) as LocationPK
From
YourTable

SQL Server best way to calculate datediff between current row and next row?

I've got the following rough structure:
Object -> Object Revisions -> Data
The Data can be shared between several Objects.
What I'm trying to do is clean out old Object Revisions. I want to keep the first, active, and a spread of revisions so that the last change for a time period is kept. The Data might be changed a lot over the course of 2 days then left alone for months, so I want to keep the last revision before the changes started and the end change of the new set.
I'm currently using a cursor and temp table to hold the IDs and date between changes so I can select out the low hanging fruit to get rid of. This means using @LastID, @LastDate, updates and inserts to the temp table, etc...
Is there an easier/better way to calculate the date difference between the current row and the next row in my initial result set without using a cursor and temp table?
I'm on sql server 2000, but would be interested in any new features of 2005, 2008 that could help with this as well.
Here is example SQL. If you have an Identity column, you can use this instead of "ActivityDate".
SELECT DATEDIFF(HOUR, prev.ActivityDate, curr.ActivityDate)
FROM MyTable curr
JOIN MyTable prev
ON prev.ObjectID = curr.ObjectID
WHERE prev.ActivityDate =
(SELECT MAX(maxtbl.ActivityDate)
FROM MyTable maxtbl
WHERE maxtbl.ObjectID = curr.ObjectID
AND maxtbl.ActivityDate < curr.ActivityDate)
I could remove "prev", but have it there assuming you need IDs from it for deleting.
If the identity column is sequential you can use this approach:
SELECT curr.*, DATEDIFF(MINUTE, prev.EventDateTime,curr.EventDateTime) Duration FROM DWLog curr join DWLog prev on prev.EventID = curr.EventID - 1
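If the identity values have gaps, a row number can stand in for the sequential id on SQL Server 2005 and later. A sketch, reusing the DWLog table and columns from the query above:

```sql
;WITH Ordered AS (
    SELECT EventID,
           EventDateTime,
           -- gap-free sequence in event order
           ROW_NUMBER() OVER (ORDER BY EventDateTime, EventID) AS rn
    FROM DWLog
)
SELECT curr.EventID,
       DATEDIFF(MINUTE, prev.EventDateTime, curr.EventDateTime) AS Duration
FROM Ordered curr
JOIN Ordered prev ON prev.rn = curr.rn - 1;
```

On SQL Server 2012 and later, the LAG window function can replace the self-join entirely.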
Hrmm, interesting challenge. I think you can do it without a self-join if you use the new-to-2005 pivot functionality.
Here's what I've got so far, I wanted to give this a little more time before accepting an answer.
DECLARE @IDs TABLE
(
ID int ,
DateBetween int
)
DECLARE @OID int
SET @OID = 6150
-- Grab the revisions, calc the datediff, and insert into temp table var.
INSERT @IDs
SELECT ID,
DATEDIFF(dd,
(SELECT MAX(ActiveDate)
FROM ObjectRevisionHistory
WHERE ObjectID=@OID AND
ActiveDate < ORH.ActiveDate), ActiveDate)
FROM ObjectRevisionHistory ORH
WHERE ObjectID=@OID
-- Hard set DateBetween for special case revisions to always keep
UPDATE @IDs SET DateBetween = 1000 WHERE ID=(SELECT MIN(ID) FROM @IDs)
UPDATE @IDs SET DateBetween = 1000 WHERE ID=(SELECT MAX(ID) FROM @IDs)
UPDATE @IDs SET DateBetween = 1000
WHERE ID=(SELECT ID
FROM ObjectRevisionHistory
WHERE ObjectID=@OID AND Active=1)
-- Select out IDs for however I need them
SELECT * FROM @IDs
SELECT * FROM @IDs WHERE DateBetween < 2
SELECT * FROM @IDs WHERE DateBetween > 2
I'm looking to extend this so that I can keep at maximum so many revisions, and prune off the older ones while still keeping the first, last, and active. Should be easy enough through select top and order by clauses, um... and tossing in ActiveDate into the temp table.
I got Peter's example to work, but took that and modified it into a subselect. I messed around with both and the sql trace shows the subselect doing less reads. But it does work and I'll vote him up when I get my rep high enough.

Resources