For example, there is a table:
int type
int number
int value
How can I make it so that, when a value is inserted into the table, the numbering starts from 1 separately for each type?
type 1 => number 1, 2, 3...
type 2 => number 1, 2, 3...
That is, it should look like this:
type  number  value
1     1       -
1     2       -
1     3       -
2     1       -
1     4       -
2     2       -
3     1       -
6     1       -
1     5       -
2     3       -
6     2       -
Special thanks to @Larnu.
As a result, in my case, the best solution would be to create a table for each type.
As I mentioned in the comments, neither IDENTITY nor SEQUENCE supports using another column to denote which "identity set" they should draw from. You can have multiple SEQUENCEs used by a single table, but this doesn't scale. If you are specifically limited to 2 or 3 types, for example, you might choose to create 3 SEQUENCE objects and then use a stored procedure to handle your INSERT statements. Then, when a user/application wants to INSERT data, they call the procedure, and that procedure has logic to pick the SEQUENCE based on the value of the parameter for the type column.
As mentioned, however, this doesn't scale well. If you have an indeterminate number of values of type, then you can't easily get the right SEQUENCE, and handling new values for type would be difficult too. In this case, you would be better off using an IDENTITY and then a VIEW. The VIEW will use ROW_NUMBER to create your identifier, while the IDENTITY gives you your always-incrementing value.
CREATE TABLE dbo.YourTable (id int IDENTITY(1,1),
[type] int NOT NULL,
number int NULL,
[value] int NOT NULL);
GO
CREATE VIEW dbo.YourTableView AS
SELECT ROW_NUMBER() OVER (PARTITION BY [type] ORDER BY id ASC) AS Identifier,
[type],
number,
[value]
FROM dbo.YourTable;
Then, instead, you query the VIEW, not the TABLE.
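For instance (a minimal sketch using the columns defined above, with an illustrative type value), a query against the view would look like:

```sql
-- Identifier restarts at 1 within each [type]
SELECT Identifier, [type], number, [value]
FROM dbo.YourTableView
WHERE [type] = 2
ORDER BY Identifier;
```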
If you need the column (which I named Identifier) to stay consistent, you'll also need to ensure rows can't be DELETEd from the table, most likely by adding an IsDeleted column to the table defined as a bit (0 for not deleted, 1 for deleted), and then filtering those rows out in the VIEW:
CREATE VIEW dbo.YourTableView AS
WITH CTE AS(
SELECT id,
ROW_NUMBER() OVER (PARTITION BY [type] ORDER BY id ASC) AS Identifier,
[type],
number,
[value],
IsDeleted
FROM dbo.YourTable)
SELECT id,
Identifier,
[type],
number,
[value]
FROM CTE
WHERE IsDeleted = 0;
You could, if you wanted, even handle the DELETEs on the VIEW (the INSERT and UPDATEs would be handled implicitly, as it's an updatable VIEW):
CREATE TRIGGER trg_YourTableView_Delete ON dbo.YourTableView
INSTEAD OF DELETE AS
BEGIN
SET NOCOUNT ON;
UPDATE YT
SET IsDeleted = 1
FROM dbo.YourTable YT
JOIN deleted d ON d.id = YT.id;
END;
GO
db<>fiddle
For completeness, if you wanted to use separate SEQUENCE objects, it would look like this. Notice that this does not scale easily: you have to CREATE a SEQUENCE for every value of type. As such, for a small and known range of values this would be a solution, but if you are going to end up with more values for type, or already have a large range, it quickly becomes infeasible:
CREATE TABLE dbo.YourTable (identifier int NOT NULL,
[type] int NOT NULL,
number int NULL,
[value] int NOT NULL);
CREATE SEQUENCE dbo.YourTable_Type1
START WITH 1 INCREMENT BY 1;
CREATE SEQUENCE dbo.YourTable_Type2
START WITH 1 INCREMENT BY 1;
CREATE SEQUENCE dbo.YourTable_Type3
START WITH 1 INCREMENT BY 1;
GO
CREATE PROC dbo.Insert_YourTable @Type int, @Number int = NULL, @Value int AS
BEGIN
    DECLARE @Identifier int;

    IF @Type = 1
        SELECT @Identifier = NEXT VALUE FOR dbo.YourTable_Type1;
    IF @Type = 2
        SELECT @Identifier = NEXT VALUE FOR dbo.YourTable_Type2;
    IF @Type = 3
        SELECT @Identifier = NEXT VALUE FOR dbo.YourTable_Type3;

    INSERT INTO dbo.YourTable (identifier, [type], number, [value])
    VALUES (@Identifier, @Type, @Number, @Value);
END;
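A call to the procedure would then look like this (the values here are purely illustrative):

```sql
-- Inserts a row for type 2; the identifier comes from dbo.YourTable_Type2
EXEC dbo.Insert_YourTable @Type = 2, @Number = NULL, @Value = 10;
```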
I am moving a small database from MS Access into SQL Server. Each year, the users would create a new Access database and have clean data, but this change will put data across the years into one pot. The users have relied on the autonumber value in Access as a reference for records. That is very inaccurate if, say, 238 records are removed.
So I am trying to accommodate them with an id column they can control (somewhat). They will not see the real primary key in the SQL table, but I want to give them an ID they can edit, but still be unique.
I've been working with this trigger, but it has taken much longer than I expected.
Everything seems to work fine, except I don't understand why I have the same data in my INSERTED table as in the table the trigger is on. (See the note in the code.)
ALTER TRIGGER [dbo].[trg_tblAppData]
ON [dbo].[tblAppData]
AFTER INSERT,UPDATE
AS
BEGIN
SET NOCOUNT ON;
DECLARE @NewUserEnteredId int = 0;
DECLARE @RowIdForUpdate int = 0;
DECLARE @CurrentUserEnteredId int = 0;
DECLARE @LoopCount int = 0;
--*** Loop through all records to be updated because the values will be incremented.
WHILE (1 = 1)
BEGIN
    SET @LoopCount = @LoopCount + 1;
    IF (@LoopCount > (SELECT COUNT(*) FROM INSERTED))
        BREAK;
    SELECT TOP 1 @RowIdForUpdate = ID, @CurrentUserEnteredId = UserEnteredId FROM INSERTED WHERE ID > @RowIdForUpdate ORDER BY ID DESC;
    IF (@RowIdForUpdate IS NULL)
        BREAK;
    -- WHY IS THERE A MATCH HERE? HAS THE RECORD ALREADY BEEN INSERTED?
    IF EXISTS (SELECT UserEnteredId FROM tblAppData WHERE UserEnteredId = @CurrentUserEnteredId)
    BEGIN
        SET @NewUserEnteredId = (SELECT MAX(t1.UserEnteredId) + 1 FROM tblAppData t1);
    END
    ELSE
        SET @NewUserEnteredId = @CurrentUserEnteredId;
    UPDATE a
    SET UserEnteredId = @NewUserEnteredId
    FROM tblAppData a
    WHERE a.ID = @RowIdForUpdate;
END
END
Here is what I want to accomplish:
When new record(s) are added, it should increment values from the existing maximum.
When a user overrides a value, it should check for the existence of that value. If found, restore the prior value; otherwise allow the change.
This trigger allows for multiple rows being added at a time.
It would be great for this to be efficient for future use, but in reality they will only add about 1,000 records a year.
I wouldn't use a trigger to accomplish this.
Here is a script you can use to create a sequence (the OP didn't tag a version), create the primary key, use the sequence as your special id, and put a unique constraint on the column.
create table dbo.test (
testid int identity(1,1) not null primary key clustered
, myid int null constraint UQ_ unique
, somevalue nvarchar(255) null
);
create sequence dbo.myid
as int
start with 1
increment by 1;
alter table dbo.test
add default next value for dbo.myid for myid;
insert into dbo.test (somevalue)
select 'this' union all
select 'that' union all
select 'and' union all
select 'this';
insert into dbo.test (myid, somevalue)
select 33, 'oops';
select *
from dbo.test
insert into dbo.test (somevalue)
select 'oh the fun';
select *
from dbo.test
--| This should error
insert into dbo.test (myid, somevalue)
select 3, 'This is NO fun';
Here is the result set:
testid myid somevalue
1 1 this
2 2 that
3 3 and
4 4 this
5 33 oops
6 5 oh the fun
And at the very end a test, which will error.
I have a tree structure with some 10,000 nodes, translated into more than 20 languages, for a total of about 230,000 records.
Relevant Fields in the table are:
LanguageID, CategoryID, ParentCategoryID, CategoryLevel, isLeaf, Expired, TopLevelCategoryID
I need to populate TopLevelCategoryID with the proper value, namely the CategoryID of the ancestor where ParentCategoryID IS NULL (or where CategoryLevel = 1),
for all records where
isLeaf = 0 AND Expired = 0 AND TopLevelCategoryID IS NULL
After googling a lot I wrote this query
Declare @LanguageID nvarchar(10) = 'FR';
;WITH Explode AS
(
    SELECT categoryID AS major,
           categoryID AS minor,
           LanguageID,
           CAST(CategoryID AS nvarchar(max)) AS levels
    FROM dbo.Categories
    WHERE LanguageID = @LanguageID
      AND CategoryLevel = 1
    UNION ALL
    SELECT MJ.major, MN.categoryID, MN.LanguageID, MJ.levels + ',' + CAST(MN.CategoryID AS nvarchar(max)) AS levels
    FROM Explode AS MJ
    JOIN dbo.Categories AS MN ON MJ.minor = MN.ParentCategoryID
    WHERE MN.LanguageID = @LanguageID
      AND (',' + MJ.levels + ',' NOT LIKE '%,' + CAST(MN.CategoryID AS nvarchar(max)) + ',%')
      AND MN.Expired = 0
)
UPDATE c SET TopLevelCategoryID = e.major
FROM Explode e
JOIN dbo.Categories c ON c.categoryID = e.minor
WHERE c.LanguageID = @LanguageID
  AND c.TopLevelCategoryID IS NULL
OPTION (MAXRECURSION 0);
It works, but it is extremely slow even after I added the @LanguageID parameter.
Surely it is not efficient.
I was also wondering whether a recursive routine would be better:
for CategoryLevel = 6 down to 1,
where for each level you put into TopLevelCategoryID the value of the ParentCategoryID,
but I have not been able to manage the recursion :-(
Here is some sample data with one language (sorry, but I can't create a sqlfiddle with this data):
CREATE TABLE dbo.Categories (
languageID nvarchar(3) NULL,
CategoryID int NULL,
ParentCategoryID int NULL,
CategoryLevel int NULL,
isLeaf bit NULL,
Expired bit NULL,
TopLevelCategoryID int NULL
)
INSERT INTO
Categories
(
LanguageID,
CategoryID,
ParentCategoryID,
CategoryLevel,
isLeaf,
Expired,
TopLevelCategoryID
)
VALUES
('EN',10,NULL,1,0,0,NULL),
('EN',20,NULL,1,0,0,NULL),
('EN',30,NULL,1,0,0,NULL),
('EN',40,NULL,1,0,0,NULL),
('EN',107,20,2,0,0,NULL),
('EN',112,10,2,0,0,NULL),
('EN',145,20,2,0,0,NULL),
('EN',167,20,2,0,0,NULL),
('EN',182,30,2,0,0,NULL),
('EN',194,20,2,0,0,NULL),
('EN',199,145,3,0,0,NULL),
('EN',214,112,3,0,0,NULL),
('EN',345,182,3,1,0,NULL),
('EN',567,167,3,0,0,NULL),
('EN',682,194,3,0,0,NULL),
('EN',794,145,3,0,0,NULL),
('EN',814,199,4,0,0,NULL),
('EN',823,214,4,0,0,NULL),
('EN',846,214,4,1,0,NULL),
('EN',880,199,4,0,0,NULL),
('EN',896,567,4,1,0,NULL),
('EN',898,682,4,0,0,NULL),
('EN',1104,823,5,1,0,NULL),
('EN',1120,880,5,1,0,NULL),
('EN',1450,814,5,0,0,NULL),
('EN',1670,814,5,1,0,NULL),
('EN',1820,1450,6,0,0,NULL),
('EN',1940,1450,6,0,0,NULL)
Can somebody give some hints?
Thanks!
BREAKING NEWS
I tried a different approach, more suited to my skills: setting TopLevelCategoryID from ParentCategoryID and moving that value from the low CategoryLevel values up to the highest, with this query:
DECLARE @cnt int = 1;
WHILE @cnt < 8
BEGIN
    UPDATE c SET TopLevelCategoryID = COALESCE(c2.TopLevelCategoryID, c2.ParentCategoryID)
    FROM Categories c
    JOIN Categories c2
      ON c.ParentCategoryID = c2.CategoryID
     AND c.LanguageID = c2.LanguageID
    WHERE c.CategoryLevel = @cnt
      AND c.Expired = 0
      AND c2.Expired = 0;
    SET @cnt = @cnt + 1;
END
From preliminary tests it seems to work, and processing time drops from more than 4 hours to less than 1 second. Although that would be great, I presume there is something wrong: from what I have seen, every tree structure is processed with a CTE and recursion, so it cannot be as simple as it now seems to me.
Can somebody help me find what I did not consider or did wrongly?
Thanks!
Surely it is not the solution for everybody, but for my specific requirements it works well, even though it uses neither a CTE nor recursion, only iteration. It does require knowing the maximum nesting level in advance. But the advantage is that on more than 200,000 records it takes less than 1 second, compared to more than 250 minutes (not seconds!) for the CTE + recursion solution posted at the beginning:
DECLARE @cnt int = 1;
WHILE @cnt < 8
BEGIN
    UPDATE c SET TopLevelCategoryID = COALESCE(c2.TopLevelCategoryID, c2.ParentCategoryID)
    FROM Categories c
    JOIN Categories c2
      ON c.ParentCategoryID = c2.CategoryID
     AND c.LanguageID = c2.LanguageID
    WHERE c.CategoryLevel = @cnt
      AND c.Expired = 0
      AND c2.Expired = 0;
    SET @cnt = @cnt + 1;
END
I have a table with about 70,000,000 rows of phone numbers, and I use OFFSET to read those numbers 50 at a time.
But it takes a long time (about 1 minute).
There is a full-text index, but it is used for searching and has no effect on OFFSET.
How can I speed up my query?
SELECT *
FROM tblPhoneNumber
WHERE CountryID = @CountryID
ORDER BY ID
OFFSET ((@NumberCount - 1) * @PackageSize) ROWS
FETCH NEXT @PackageSize ROWS ONLY;
Throw a sequence on that table, index it and fetch ranges by sequence. You could alternatively just use the ID column.
SELECT *
FROM tblPhoneNumber
WHERE CountryID = @CountryID
  AND Sequence BETWEEN @NumberCount AND (@NumberCount + @PackageSize);
If you're inserting/deleting frequently, this can leave gaps, so depending on the code that utilizes these batches of numbers, this might be a problem, but in general a few gaps here and there may not be a problem for you.
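If gaps are a problem, another common approach is keyset (seek) pagination, which seeks directly to the next batch instead of counting past all skipped rows. This is a sketch assuming ID is the indexed key and the caller passes back the last ID it read (the @LastSeenID parameter is hypothetical):

```sql
SELECT TOP (@PackageSize) *
FROM tblPhoneNumber
WHERE CountryID = @CountryID
  AND ID > @LastSeenID   -- last ID of the previous batch; start with 0
ORDER BY ID;
```

Because the WHERE clause seeks straight to the start of the batch, the cost stays flat regardless of how deep into the table you are, unlike OFFSET, which must scan and discard every skipped row.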
Try using CROSS APPLY instead of OFFSET FETCH and do it all in one go. I grab TOP 2 to show you that you can grab any number of rows.
IF OBJECT_ID('tempdb..#tblPhoneNumber') IS NOT NULL
DROP TABLE #tblPhoneNumber;
IF OBJECT_ID('tempdb..#Country') IS NOT NULL
DROP TABLE #Country;
CREATE TABLE #tblPhoneNumber (ID INT, Country VARCHAR(100), PhoneNumber INT);
CREATE TABLE #Country (Country VARCHAR(100));
INSERT INTO #Country
VALUES ('USA'),('UK');
INSERT INTO #tblPhoneNumber
VALUES (1,'USA',11111),
(2,'USA',22222),
(3,'USA',33333),
(4,'UK',44444),
(5,'UK',55555),
(6,'UK',66666);
SELECT *
FROM #Country
CROSS APPLY(
SELECT TOP (2) ID,Country,PhoneNumber --Just change to TOP(50) for your code
FROM #tblPhoneNumber
WHERE #Country.Country = #tblPhoneNumber.Country
) CA
We have found that SQL Server uses an index scan instead of an index seek if the WHERE clause contains parametrized values instead of string literals.
Following is an example:
SQL Server performs an index scan in the following case (parameters in the WHERE clause):
declare @val1 nvarchar(40), @val2 nvarchar(40);
set @val1 = 'val1';
set @val2 = 'val2';
select
min(id)
from
scor_inv_binaries
where
col1 in (@val1, @val2)
group by
col1
On the other hand, the following query performs an index seek:
select
min(id)
from
scor_inv_binaries
where
col1 in ('val1', 'val2')
group by
col1
Has anyone observed similar behavior, and how did you fix it to ensure that the query performs an index seek instead of an index scan?
We are not able to use the FORCESEEK table hint, because FORCESEEK is not supported on SQL Server 2005.
I have updated the statistics as well.
Thank you very much for help.
Well, to answer why SQL Server is doing this: the query is not compiled in logical order; each statement is compiled on its own merit. So when the query plan for your select statement is generated, the optimiser does not know that @val1 and @val2 will become 'val1' and 'val2' respectively.
When SQL Server does not know the value, it has to make a best guess about how many times that variable will appear in the table, which can sometimes lead to sub-optimal plans. My main point is that the same query with different values can generate different plans. Imagine this simple example:
IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
DROP TABLE #T;
CREATE TABLE #T (ID INT IDENTITY PRIMARY KEY, Val INT NOT NULL, Filler CHAR(1000) NULL);
INSERT #T (Val)
SELECT TOP 991 1
FROM sys.all_objects a
UNION ALL
SELECT TOP 9 ROW_NUMBER() OVER(ORDER BY a.object_id) + 1
FROM sys.all_objects a;
CREATE NONCLUSTERED INDEX IX_T__Val ON #T (Val);
All I have done here is create a simple table and add 1,000 rows with values 1-10 for the column Val; however, 1 appears 991 times, and the other 9 values appear only once each. The premise is that this query:
SELECT COUNT(Filler)
FROM #T
WHERE Val = 1;
would be more efficient as a scan of the entire table than as an index seek followed by 991 bookmark lookups to get the value for Filler. With only 1 matching row, however, the following query:
SELECT COUNT(Filler)
FROM #T
WHERE Val = 2;
will be more efficient as an index seek and a single bookmark lookup to get the value for Filler (and running these two queries will confirm this).
I am pretty certain the cut-off between a seek with bookmark lookups and a scan varies depending on the situation, but it is fairly low. Using the example table, with a bit of trial and error, I found that the Val column needed 38 rows with the value 2 before the optimiser chose a full table scan over an index seek with bookmark lookups:
IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
DROP TABLE #T;
DECLARE @i INT = 38;
CREATE TABLE #T (ID INT IDENTITY PRIMARY KEY, Val INT NOT NULL, Filler CHAR(1000) NULL);
INSERT #T (Val)
SELECT TOP (991 - @i) 1
FROM sys.all_objects a
UNION ALL
SELECT TOP (@i) 2
FROM sys.all_objects a
UNION ALL
SELECT TOP 8 ROW_NUMBER() OVER(ORDER BY a.object_id) + 2
FROM sys.all_objects a;
CREATE NONCLUSTERED INDEX IX_T__Val ON #T (Val);
SELECT COUNT(Filler), COUNT(*)
FROM #T
WHERE Val = 2;
So for this example the limit is 3.7% of matching rows.
Since the query does not know how many rows will match when you are using a variable, it has to guess. The simplest way is to take the total number of rows and divide it by the total number of distinct values in the column, so in this example the estimated number of rows for WHERE Val = @Val is 1000 / 10 = 100. The actual algorithm is more complex than this, but for example's sake this will do. So when we look at the execution plan for:
DECLARE @i INT = 2;
SELECT COUNT(Filler)
FROM #T
WHERE Val = @i;
We can see here (with the original data) that the estimated number of rows is 100, but the actual number is 1. From the previous steps we know that with more than 38 matching rows the optimiser will opt for a clustered index scan over an index seek, so since the best guess for the number of rows is higher than this, the plan for an unknown variable is a clustered index scan.
Just to further prove the theory, if we create the table with 1000 rows of numbers 1-27 evenly distributed (so the estimated row count will be approximately 1000 / 27 = 37.037)
IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
DROP TABLE #T;
CREATE TABLE #T (ID INT IDENTITY PRIMARY KEY, Val INT NOT NULL, Filler CHAR(1000) NULL);
INSERT #T (Val)
SELECT TOP 27 ROW_NUMBER() OVER(ORDER BY a.object_id)
FROM sys.all_objects a;
INSERT #T (val)
SELECT TOP 973 t1.Val
FROM #T AS t1
CROSS JOIN #T AS t2
CROSS JOIN #T AS t3
ORDER BY t2.Val, t3.Val;
CREATE NONCLUSTERED INDEX IX_T__Val ON #T (Val);
Then run the query again, we get a plan with an index seek:
DECLARE @i INT = 2;
SELECT COUNT(Filler)
FROM #T
WHERE Val = @i;
So hopefully that pretty comprehensively covers why you get that plan. Now I suppose the next question is how do you force a different plan, and the answer is, to use the query hint OPTION (RECOMPILE), to force the query to compile at execution time when the value of the parameter is known. Reverting to the original data, where the best plan for Val = 2 is a lookup, but using a variable yields a plan with an index scan, we can run:
DECLARE @i INT = 2;
SELECT COUNT(Filler)
FROM #T
WHERE Val = @i;
GO
DECLARE @i INT = 2;
SELECT COUNT(Filler)
FROM #T
WHERE Val = @i
OPTION (RECOMPILE);
We can see that the latter uses the index seek and key lookup because it checks the value of the variable at execution time, and the most appropriate plan for that specific value is chosen. The trouble with OPTION (RECOMPILE) is that it means you can't take advantage of cached query plans, so there is an additional cost of compiling the query each time.
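If recompiling on every execution is too costly, one alternative worth considering is the OPTIMIZE FOR query hint, which compiles the plan for a representative value while still allowing the plan to be cached. This is a sketch against the example table; the chosen value of 2 is illustrative, and the right value depends on your data distribution:

```sql
DECLARE @i INT = 2;
SELECT COUNT(Filler)
FROM #T
WHERE Val = @i
OPTION (OPTIMIZE FOR (@i = 2));
-- OPTION (OPTIMIZE FOR UNKNOWN) is another variant (SQL Server 2008+),
-- which uses the average-density estimate instead of a specific value.
```

The trade-off is the mirror image of OPTION (RECOMPILE): you pay compilation once and reuse the plan, but the plan is only good for values that behave like the one you optimized for.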
I had this exact problem, and none of the query option solutions seemed to have any effect.
Turned out I was declaring an nvarchar(8) as the parameter and the table had a column of varchar(8).
Upon changing the parameter type, the query did an index seek and ran instantaneously. Must be the optimizer was getting messed up by the conversion.
This may not be the answer in this case, but something that's worth checking.
Try
declare @val1 nvarchar(40), @val2 nvarchar(40);
set @val1 = 'val1';
set @val2 = 'val2';
select
min(id)
from
scor_inv_binaries
where
col1 in (@val1, @val2)
group by
col1
OPTION (RECOMPILE)
What datatype is col1?
Your variables are nvarchar whereas your literals are varchar/char; if col1 is varchar/char it may be doing the index scan to implicitly cast each value in col1 to nvarchar for the comparison.
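As a sketch, declaring the variables to match the column's type avoids the implicit conversion entirely (this assumes col1 is varchar(40); check the table definition for the actual type and length):

```sql
declare @val1 varchar(40), @val2 varchar(40);  -- varchar to match col1, not nvarchar
set @val1 = 'val1';
set @val2 = 'val2';

select
min(id)
from
scor_inv_binaries
where
col1 in (@val1, @val2)
group by
col1
```

With matching types, the predicate is sargable, and the optimiser is free to seek on an index over col1 instead of converting every row.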
I guess the first query is using a Predicate and the second query is using a Seek Predicate.
A Seek Predicate is the operation that describes the b-tree portion of the seek; a Predicate is the operation that describes the additional filter using non-key columns. Based on these descriptions, a Seek Predicate is better than a Predicate, as it searches the index, whereas with a Predicate the search is on non-key columns, which implies the search reads the data pages themselves.
For more details please visit:-
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/36a176c8-005e-4a7d-afc2-68071f33987a/predicate-and-seek-predicate