How can the date a row was added be in a different order to the identity field on the table? - sql-server

I have a 'change history' table in my SQL Server DB called tblReportDataQueue that records changes to rows in other source tables.
There are triggers on the source tables in the DB which fire after INSERT, UPDATE or DELETE. The triggers all call a stored procedure that just inserts data into the change history table that has an identity column:
INSERT INTO tblReportDataQueue
(
[SourceObjectTypeID],
[ActionID],
[ObjectXML],
[DateAdded],
[RowsInXML]
)
VALUES
(
@SourceObjectTypeID,
@ActionID,
@ObjectXML,
GetDate(),
@RowsInXML
)
When a row in a source table is updated multiple times in quick succession, the triggers fire in the correct order and put the changed data into the change history table in the order in which it was changed. The problem is that I had assumed the DateAdded field would always be in the same order as the identity field, but somehow it is not.
So my table is in the order that things actually happened when sorted by the identity field but not when sorted by the 'DateAdded' field.
How can this happen?
[Screenshot of the example problem]
In the example image, the 'DateAdded' of the last row shown is earlier than that of the first row shown.

You are using a surrogate key. One very important characteristic of a surrogate key is that it cannot be used to determine anything about the tuple it represents, not even the order of creation. All systems which have auto-generated values like this, including Oracle's sequences, make no guarantee as to order, only that the next value generated will be distinct from previously generated values. That is all that is required, really.
We all do it, of course. We look at a row with ID of 2 and assume it was inserted after the row with ID of 1 and before the row with ID of 3. That is a bad habit we should all work to break because the assumption could well be wrong.
You have the DateAdded field to provide the information you want. Order by that field and you will get the rows in order of insertion (if that field is not updateable, that is). The auto generated values will tend to follow that ordering, but absolutely do not rely on that!

Try using a Sequence:
"Using the identity attribute for a column, you can easily generate auto-
incrementing numbers (which are often used as a primary key). With Sequence, it
will be a different object which you can attach to a table column while
inserting. Unlike identity, the next number for the column value will be
retrieved from memory rather than from the disk – this makes Sequence
significantly faster than Identity.
Unlike identity column values, which are generated when rows are inserted, an
application can obtain the next sequence number before inserting the row by
calling the NEXT VALUE FOR function. The sequence number is allocated when NEXT
VALUE FOR is called even if the number is never inserted into a table. The NEXT
VALUE FOR function can be used as the default value for a column in a table
definition. Use sp_sequence_get_range to get a range of multiple sequence
numbers at once."

Related

SQLite: can I reverse the order of row inserts with an AUTOINCREMENT Table?

I have a RecyclerView list of items that uses an SQLite database to store user input data. I use the traditional _id column as INTEGER PRIMARY KEY AUTOINCREMENT. If I understand correctly, newly inserted rows in the database are added below existing rows and the new ROWID takes the largest existing ROWID and increments it by +1. Therefore, a cursor search for the latest insert will have to scan down the entire set of rows to reach the bottom of the database. For example, after 10 inserts, the cursor has to search down from 1, 2, 3,... until it gets to row 10.
To avoid a lengthy search of the entire set of ROWIDs, is there any way to have new inserts be added to the top of the database and not the bottom? That way a cursor search for the latest insert using moveToFirst() will be very fast since the cursor will stop at the first row it searches, the top of the database. The cursor would search 10, 9, 8,...3,2,1 and therefore the search would be very fast since it would stop at 10, the first row at the top of the database.
You are thinking too much about the database internals. Indexes are designed for this kind of optimisation.
Make a new numeric column that holds the ordering you want and use ORDER BY on it in your SELECTs. Do not forget to create an index on this column and to verify that your SELECTs actually use the index (with EXPLAIN QUERY PLAN).
First, if you are concerned about overheads then use the recommended INTEGER PRIMARY KEY as opposed to INTEGER PRIMARY KEY AUTOINCREMENT. Both will result in a unique id, the latter has overheads as per :-
The AUTOINCREMENT keyword imposes extra CPU, memory, disk space, and
disk I/O overhead and should be avoided if not strictly needed. It is
usually not needed.
SQLite Autoincrement
If I understand correctly, newly inserted rows in the database are
added below existing rows and the new ROWID takes the largest existing
ROWID and increments it by +1.
Generally BUT not necessarily, there is no guarantee that the value will increment by 1.
AUTOINCREMENT utilises a table called sqlite_sequence that has a single row per table, storing the highest sequence number used so far along with the table name. The next sequence number will be that value + probably 1 UNLESS the highest rowid is greater than the value in the sqlite_sequence table.
Without AUTOINCREMENT then the next sequence is the highest rowid + probably 1.
AUTOINCREMENT guarantees a higher number. Without AUTOINCREMENT, SQLite can use a number lower than the current highest rowid (BUT not until the next number would be greater than 9223372036854775807). If AUTOINCREMENT would need a number higher than this, the insert fails with an SQLITE_FULL error.
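The difference is easiest to see when the newest row is deleted. The following is a minimal sketch using Python's built-in sqlite3 module; the table names plain_ids and auto_ids are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plain_ids (id INTEGER PRIMARY KEY, note TEXT);
    CREATE TABLE auto_ids  (id INTEGER PRIMARY KEY AUTOINCREMENT, note TEXT);
""")

def insert_delete_insert(table):
    """Insert three rows, delete the newest one, insert again; return the new id."""
    for note in ("a", "b", "c"):
        conn.execute(f"INSERT INTO {table} (note) VALUES (?)", (note,))
    conn.execute(f"DELETE FROM {table} WHERE id = 3")
    return conn.execute(f"INSERT INTO {table} (note) VALUES ('d')").lastrowid

plain_new_id = insert_delete_insert("plain_ids")  # highest remaining rowid (2) + 1
auto_new_id = insert_delete_insert("auto_ids")    # sqlite_sequence still remembers 3
print(plain_new_id, auto_new_id)  # 3 4
```

Without AUTOINCREMENT the deleted id 3 is handed out again, while the AUTOINCREMENT table moves on to 4.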
Again with regard to rowid's and searching :-
The data for rowid tables is stored as a B-Tree structure containing
one entry for each table row, using the rowid value as the key. This
means that retrieving or sorting records by rowid is fast. Searching
for a record with a specific rowid, or for all records with rowids
within a specified range is around twice as fast as a similar search
made by specifying any other PRIMARY KEY or indexed value.
ROWIDs and the INTEGER PRIMARY KEY
To avoid a lengthy search of the entire set of ROWIDs, is there any
way to have new inserts be added to the top of the database and not
the bottom?
Yes there is, simply specify the value for the rowid or typically the alias when inserting (but beware using an already used value and good luck with managing the numbering). However, I doubt that doing so would result in a faster search. Tables have a rowid by default largely due to the rowid being optimised for searching by rowid.
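Because the rowid is the key of the table's B-Tree, the latest insert can usually be fetched by asking for the highest rowid directly, with no need to change the insert order. A minimal sketch with Python's built-in sqlite3 module follows; the items table and its title column are assumptions for the example, and only the _id column comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (_id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items (title) VALUES (?)",
    [(f"item {n}",) for n in range(1, 11)],
)

# Ask for the row with the highest _id; the first row returned is the newest.
latest = conn.execute(
    "SELECT _id, title FROM items ORDER BY _id DESC LIMIT 1"
).fetchone()
print(latest)  # (10, 'item 10')
```

On Android the same SELECT can be passed to a cursor query, so that moveToFirst() lands on the newest row.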

How can I include NULLs in a PivotTable Count over SSAS?

I have a view that joins orders to web tracking data, which is being used as a fact table. I have lots of nulls because it takes a while for orders to obtain web tracking information. As you can see, I have a total row count of 86432. However, my measure count is showing 52,753 (a simple row count, as created when you build a measure group), even though it is using exactly the same view.
I believe my counts are going to be wrong due to the nulls in my data. How can I get SSAS to correctly count my null values? (I am limited to what I can do to the source database as I don't have access to change the core structure of the source system).
I understand what you are saying about counting a field vs all fields however as you can see by creating a new measure in SSAS you have the option of count of rows of a source table. This is the behaviour I would expect and I would expect the same count as SELECT * on the table as shown in my images...
I believe DimAd does not have a null or zero AdKey row. And I believe during processing you have to change the error configuration to have it discard or ignore any fact table rows where the foreign key is null.
My top recommendation is to change your fact table foreign keys to be not null. You will need to create a -1 key in each dimension and then use it in the fact table instead of null as described here.
If that's not feasible then add null or zero AdKey rows to any dimension where the fact table foreign key can be null. SSAS should convert the nulls to zero so either should work. Then during processing those rows won't be dropped because they join fine. And you won't have to change the error configuration during processing.
If that's not feasible or acceptable then you can turn on the Unknown member on all dimensions which could be nullable. Then in the Dimension Usage tab set each relationship to fallback to the Unknown member. This process is described here.
In order to get a true row count you need not to count the column, but instead use *.
COUNT(*) will count all rows, regardless of NULL
COUNT(Column) counts only NON-NULL values
Test Example
declare @table table (i int)
insert into @table (i) values
(1),(NULL),(NULL),(NULL)
select count(*) from @table --returns 4
select count(i) from @table --returns 1

Access Database Automatically sorting records! How to stop?

I am building a program that uses a database.
When I enter my data into the database, it doesn't stay as entered. It is automatically sorted into ascending order by a field that contains the ID number. The problem is that when I create a new record programmatically, it creates a record in another table with the same row number.
I need to stop Access automatically sorting the records. Any ideas?
A relational database does not have the concept of an order of records per se; instead retrieved records are ordered by the query used to retrieve them - either in your code, or behind the scenes in the Access gui. So if you want them to appear in a specific order, then write a query including an ORDER BY clause to suit.
Like in all relational databases, the rows in MS Access don't have a fixed order.
If you select data from a table without specifying an ORDER BY clause in your query, the database is free to return the rows in any order.
Often the order will look sensible (like in your case, ordered ascending by the ID column), and if you run the same query several times, the order might really be the same.
But there's no guarantee - you can't rely on this order, you have to specify one yourself by ordering by the ID column or any other column.
I think that your problem (besides apparently not knowing how ordering works in a relational database) is this:
The problem is that when I create a new record programmatically, it creates a record in another table with the same row number.
If I'm understanding you correctly:
When you need the record in the other table to have the same value, just take the value from the ID column after inserting into the first table (given that ID is the primary key) and use that value to save the data into the second table.
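The idea of reusing the generated key can be sketched as follows. This uses Python's built-in sqlite3 module as a stand-in, since the exact way of reading back the new key differs in Access; the customers and customer_details tables are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_details (
        customer_id INTEGER REFERENCES customers(id),
        phone TEXT
    );
""")

def add_customer(name, phone):
    """Insert into the first table, then reuse its new key in the second."""
    cur = conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
    new_id = cur.lastrowid  # key generated by the insert above
    conn.execute(
        "INSERT INTO customer_details (customer_id, phone) VALUES (?, ?)",
        (new_id, phone),
    )
    return new_id

first_id = add_customer("Ann", "555-0100")
second_id = add_customer("Bob", "555-0101")

# The two tables are tied together by the key, not by row position.
linked = conn.execute("""
    SELECT c.name, d.phone
    FROM customers c
    JOIN customer_details d ON d.customer_id = c.id
    ORDER BY c.id
""").fetchall()
print(linked)
```

The link between the tables then no longer depends on the order in which either table happens to display its rows.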
Removing any indexing from the table is the easy way. But then you have no index.
I would just create a new field in your table called "PretendRowNumber" and insert the would be row number into it. Then you can at least tie your two tables back to each other.

Avoiding gaps in an identity column

I have a table in MS SQL Server 2008 and I have set its primary key to increment automatically, but if I delete any row from this table and then insert some new rows, numbering continues from the next identity value, which creates a gap in the identity values. My program requires all the identities or keys to be in sequence.
Like:
Assignment Table has total 16 rows with sequence identities(1-16) but if I delete a value at 16th position
Delete From Assignment Where assignment_id=16;
and after this operation when I insert a new row
Insert into Assignment(assignment_title)Values('myassignment');
Rather than assigning 16 as a primary key to this new value it assigns 17.
How can I solve this problem?
Renaming or re-numbering primary key values is not a good database management practice. I suggest you keep the primary key as is, and create a separate column index with the values you require to be re-numbered. Then simply create a trigger to run a routine that will re-number every row in the order you expect, obviously by seeking the "gaps" and entering them with values incremented from their previous value.
This is SQL Server's standard behaviour. If you deleted a row with ID=8 in your example, you would still have a gap.
All you could do is write a function getSmallestFreeID in SQL Server that you call for every insert and that gets you the smallest unassigned ID. But you would have to take great care with transactions and ACID.
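A lookup of that kind might look like the sketch below. It runs the query through Python's built-in sqlite3 module rather than as a SQL Server function, the name smallest_free_id is hypothetical, and it does nothing about the transaction problem mentioned above: two sessions calling it at the same time could be given the same ID.

```python
import sqlite3

def smallest_free_id(conn):
    """Return the smallest positive assignment_id that is not in use."""
    return conn.execute("""
        SELECT CASE
            WHEN NOT EXISTS (SELECT 1 FROM Assignment WHERE assignment_id = 1)
                THEN 1
            ELSE (SELECT MIN(a.assignment_id) + 1
                  FROM Assignment a
                  WHERE NOT EXISTS (SELECT 1 FROM Assignment b
                                    WHERE b.assignment_id = a.assignment_id + 1))
        END
    """).fetchone()[0]

def make_table(ids):
    """Build an in-memory Assignment table holding the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Assignment (assignment_id INTEGER PRIMARY KEY, assignment_title TEXT)"
    )
    conn.executemany(
        "INSERT INTO Assignment VALUES (?, 'x')", [(i,) for i in ids]
    )
    return conn

gap_in_middle = smallest_free_id(make_table([1, 2, 4, 5]))  # 3 is missing
no_gap = smallest_free_id(make_table([1, 2, 3]))            # next after the end
print(gap_in_middle, no_gap)  # 3 4
```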
The behavior you desire isn't possible without some post processing logic to renumber the rows.
Consider this scenario:
Session 1 begins a transaction, inserts a row (id=16), but doesn't commit yet.
Session 2 begins a transaction, inserts a row (id=17) and commits.
Session 1 rolls back.
Whether 16 will or will not exist in the table is decided after 17 is committed.
And you can't renumber these in a trigger, you'll get deadlocked.
What you probably need to do is to query the data adding a row number that is a sequential integer.
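A query of that shape can be sketched with ROW_NUMBER(), which is also available in SQL Server. The example below runs it through Python's built-in sqlite3 module and assumes the bundled SQLite is 3.25 or newer (needed for window functions); the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Assignment (assignment_id INTEGER PRIMARY KEY, assignment_title TEXT)"
)
# Identity values with gaps, as left behind by deletes or rolled-back inserts.
conn.executemany(
    "INSERT INTO Assignment VALUES (?, ?)",
    [(1, "intro"), (2, "essay"), (5, "quiz"), (17, "myassignment")],
)

# seq_no is computed at query time, so it is always gap-free.
rows = conn.execute("""
    SELECT ROW_NUMBER() OVER (ORDER BY assignment_id) AS seq_no,
           assignment_id,
           assignment_title
    FROM Assignment
    ORDER BY assignment_id
""").fetchall()

for row in rows:
    print(row)
```

The stored keys keep their gaps; only the number shown to the program is sequential.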
Gaps in identity values aren't a problem.
Well, I recently faced the same problem: I need the ID values in an external C# application in order to retrieve files named exactly as the ID.
Here is what I did to avoid the identity property: I entered the ID values manually because it was a small table. If that is not feasible in your case, use a SEQUENCE (available in SQL Server 2014).
Use an UPDATE statement instead of DELETE to keep the ID values in order.

How can I get current autoincrement value

How can I get the last autoincrement value of a specific table right after I open the database? It's not last_insert_rowid() because there has been no insert on this connection. In other words, I want to know in advance which number autoincrement will choose when inserting a new row into a particular table.
It depends on how the autoincremented column has been defined.
If the column definition is INTEGER PRIMARY KEY AUTOINCREMENT, then SQLite will keep the largest ID in an internal table called sqlite_sequence.
If the column definition does NOT contain the keyword AUTOINCREMENT, SQLite will use its ‘regular’ routine to determine the new ID. From the documentation:
The usual algorithm is to give the newly created row a ROWID that is
one larger than the largest ROWID in the table prior to the insert. If
the table is initially empty, then a ROWID of 1 is used. If the
largest ROWID is equal to the largest possible integer
(9223372036854775807) then the database engine starts picking positive
candidate ROWIDs at random until it finds one that is not previously
used. If no unused ROWID can be found after a reasonable number of
attempts, the insert operation fails with an SQLITE_FULL error. If no
negative ROWID values are inserted explicitly, then automatically
generated ROWID values will always be greater than zero.
I remember reading that, for columns without AUTOINCREMENT, the only surefire way to determine the next ID is to VACUUM the database first; that will reset all ID counters to the largest existing ID for that table + 1. But I can’t find that quote anymore, so this may no longer be true.
That said, I agree with slash_rick_dot that fetching auto-incremented IDs beforehand is a bad idea, especially if there’s even a remote chance that another process might write to the database at the same time.
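For reference, both lookups described above can be sketched with Python's built-in sqlite3 module. The table names and the helper names last_autoincrement and largest_rowid are made up for the example, and the values are only a snapshot: another writer can change them before your own insert runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE with_auto (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT);
    CREATE TABLE without_auto (id INTEGER PRIMARY KEY, v TEXT);
    INSERT INTO with_auto (v) VALUES ('a'), ('b'), ('c');
    INSERT INTO without_auto (v) VALUES ('a'), ('b'), ('c');
    DELETE FROM with_auto WHERE id = 3;
    DELETE FROM without_auto WHERE id = 3;
""")

def last_autoincrement(table):
    """Largest id ever handed out for an AUTOINCREMENT table, or None."""
    row = conn.execute(
        "SELECT seq FROM sqlite_sequence WHERE name = ?", (table,)
    ).fetchone()
    return row[0] if row else None

def largest_rowid(table):
    """Largest rowid currently present in the table (None if it is empty)."""
    return conn.execute(f"SELECT MAX(rowid) FROM {table}").fetchone()[0]

print(last_autoincrement("with_auto"))  # 3, even though row 3 was deleted
print(largest_rowid("without_auto"))    # 2
```

Tables declared without AUTOINCREMENT have no row in sqlite_sequence, so last_autoincrement returns None for them.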
Different databases implement auto-increment differently. But as far as I know, none of them will answer the question you are asking.
The auto increment feature is intended to create a unique ID for a newly added table row. If a row hasn't been inserted yet, then the feature hasn't produced the id.
And it makes sense... If you did get the next auto increment number, what would you do with it? Likely the intent is to assign it as the primary key of the not-yet-inserted new row. But between the time you got the id, and the time you used it to insert the row, the database could have used that id to insert a row for another process.
Your choices are this: manage the creation of ids yourself, or wait until rows are inserted before using their auto-created identifiers.
