From learn.microsoft.com, "Populating a DataSet from a DataAdapter":
Pulling all of the table to the client also locks all of the rows on the server.
I couldn't find any information (in the System.Data namespace) about the possibility of putting a lock on records (or groups of records) in the DB that were read into a DataSet (DataTable), a lock that would affect all users of the DB, not only those who work with the database through my application.
Also from learn.microsoft.com, "Using UpdatedRowSource to Map Values to a DataSet":
The Update method resolves your changes back to the data source; however other clients may have modified data at the data source since the last time you filled the DataSet. To refresh your DataSet with current data, use the DataAdapter and Fill method. New rows will be added to the table, and updated information will be incorporated into existing rows. The Fill method determines whether a new row will be added or an existing row will be updated by examining the primary key values of the rows in the DataSet and the rows returned by the SelectCommand. If the Fill method encounters a primary key value for a row in the DataSet that matches a primary key value from a row in the results returned by the SelectCommand, it updates the existing row with the information from the row returned by the SelectCommand and sets the RowState of the existing row to Unchanged. If a row returned by the SelectCommand has a primary key value that does not match any of the primary key values of the rows in the DataSet, the Fill method adds a new row with a RowState of Unchanged.
If we have modified copies of DB records (locally, in memory) in a DataSet, and want to propagate these changes to the server, why must we refresh our local records first? That refresh could reject all our changes.
In general, I don't understand the strategy for organizing the modification of DB records through a DataSet:
make a local copy of the records (via "Adapter.Fill(DataSet)");
change the record (or records) locally (over some continuous period of time) and wait until the user clicks "Update", at which point:
save all modifications in a temp table?
re-read the records from the DB (again via "Adapter.Fill(DataSet)")?
compare the records from the temp table with the refreshed DataSet?
And if nothing has changed, quickly update the records in the DB (via "Adapter.Update(DataSet)")?
But even in that case, isn't it possible that someone is quicker than me and updates "my records" between my re-read and my update?
I re-read all the articles about ADO.NET (from learn.microsoft.com) again, found some additional information, and can now answer my own questions:
1) Reading records from the DB into a DataSet (via Adapter.Fill(DataSet)) does not put any lock on the records in the DB (the phrase quoted above from learn.microsoft.com, "Pulling all of the table to …", isn't correct).
2) “In a multiuser environment, there are two models for updating data in a database: optimistic concurrency and pessimistic concurrency. The DataSet object is designed to encourage the use of optimistic concurrency (ONLY) for long-running activities, such as remoting data and interacting with data.”
A transaction (types derived from the DbTransaction class), which consists of a single command or a group of commands executed as a package, can put different types of locks on rows in DB tables (see the DbTransaction.IsolationLevel property).
Incorrect use of transactions can badly affect how the DB behaves in a multi-user environment (transactions must be kept as short as possible).
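For illustration, a minimal sketch of a short transaction under an explicit isolation level (the Stock table and its columns are placeholders, not from this post); in ADO.NET the same idea maps to DbConnection.BeginTransaction(IsolationLevel):

-- The stricter the isolation level, the more rows stay locked until COMMIT/ROLLBACK,
-- so keep the transaction as short as possible.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
    SELECT Quantity FROM Stock WHERE ItemId = @ItemId;   -- a shared lock on this row is held ...
    UPDATE Stock SET Quantity = Quantity - 1 WHERE ItemId = @ItemId;
COMMIT TRANSACTION;  -- ... until here, so never hold a transaction open across user "think time"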
3) When we talk about locking records in the DB, we must clearly understand what the aim of the locking is and how it corresponds to the logic of our application.
For example, suppose we want to build a system for selling cinema tickets (to simplify, for one movie and one session only).
The application (which will run on multiple devices) must connect to the DB, fill a local DataSet, and show users all available seats (all rows whose dedicated DB field SeatStatus = "FREE").
After selecting a seat, the user must press "Make Reservation" (MakeReservation()). The MakeReservation() method must use the "Testing for Optimistic Concurrency Violations" approach (see the example from Microsoft; it is more elegant than mine from the first post) to try to change the value of the SeatStatus field to "RESERVED". Here optimistic concurrency is used: whoever presses first gets the seat.
Whoever is second receives the message "Sorry, place is reserved" and an update of the current list of available seats (UpdateSeats()).
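A minimal SQL sketch of the check MakeReservation() relies on (the Seats table and SeatId column are assumptions; only SeatStatus and its values come from this post):

UPDATE Seats
SET    SeatStatus = 'RESERVED'
WHERE  SeatId = @SeatId
  AND  SeatStatus = 'FREE';    -- the original value acts as the optimistic concurrency check

IF @@ROWCOUNT = 0
BEGIN
    -- someone else reserved the seat between our Fill and our Update:
    -- show "Sorry, place is reserved" and call UpdateSeats() to refresh the list.
    RAISERROR('Seat already reserved', 16, 1);
END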
Also, UpdateSeats() must run periodically on all active devices (once per second).
The user who pressed the button first must enter credit card information on the next screen and press "Pay" (PayTicket()). The PayTicket() method must connect to the bank, check the payment, and change SeatStatus to "OCCUPIED"; in this case (IMHO) it is more correct to use a transaction with pessimistic concurrency.
If the user hasn't pressed "Pay" within the required time (5 min.), the application returns to the previous screen and the SeatStatus field is changed back to "FREE".
P.S. Anyone who knows a more correct way to implement this task is welcome to share it.
This is similar to another question and I have given it the same name. But my situation is a bit different.
The first question for reference: Access Linked to SQL: Wrong data shown for a newly created record
I have an Access front end linked to tables in SQL Server. For all relevant tables, there is an autonumber (int with Identity Specification) as Primary Key. About half of the linked tables have the following issue, the others do not, despite being set up similarly:
When adding a new record to the table, the record is inserted in the SQL database, but then in the Access front end view, be it a table or a form, the added record is filled with data from another record.
In the other question, it was explained that Access is querying SQL Server with @@IDENTITY. I saw the same thing in a trace. In my case it tries SELECT @@IDENTITY twice, then attempts to pull the new record with sp_prepexec-generated SQL that I can't read, and consistently gets the wrong record, in certain tables but not in others, which are set up basically the same.
The wrong record being returned seems to be an earlier autonumber in the table, and if I do it several times in a row, it returns a series of autonumbers in sequence, for instance, 18347, 18348, 18349. (These are the incorrect autonumbers being displayed, along with all data from their records, instead of the newly created record.) But if I wait a few minutes, there will be a gap, it might return 18456 next, for instance.
Refreshing does bring the correct record into view.
The autonumber fields do show up in Access design view as Primary Keys.
The Access front end is an .mdb file. We are using Access for Microsoft 365 MSO 64 bit.
As a general rule, this issue should not show up.
However, there are two cases to keep in mind.
First case:
In Access, when you START typing in a record with an Access back end (BE), the autonumber is generated and displayed instantly, and this occurs EVEN before the record is saved.
And in fact the record may never be saved (the user hits the Esc key, or un-do from the menu, or even Ctrl-Z). At that point, the record is not dirty and will not be saved. And of course this means gaps can and will appear in the autonumber.
WHEN using a table linked to SQL Server? You can start typing, and the record becomes dirty, but the AUTONUMBER will NOT display and has NOT yet been generated. Thus your code cannot use the autonumber quite yet. The record has to be saved first before you can get/grab/use the autonumber.
Now for a form + sub form? Well, they work because Access (for SQL or Access tables) ALWAYS does a record save of the main form when focus moves to the child form. So these setups should continue to work.
I note and mention the above since SOME code that uses or requires the autonumber during a record-add process MIGHT exist in your application. That code will have to be changed. Now to be fair, even in a fairly large application, I tend to find few places where this occurs.
Often the simple solution is to modify the code, and simply force the record to be written, and then you have use of the autonumber.
You can do this:
If Me.NewRecord = True Then
    ' force a save of the dirty record so SQL Server generates the autonumber
    If Me.Dirty = True Then Me.Dirty = False
End If
' code here that needs the PK autonumber
lngNewID = Me!ID   ' the autonumber is now generated and available for use.
The next common issue (and likely YOUR issue).
The table(s) in question have triggers. You have to modify the stored procedures to re-select the PK id, and if you don't, you get the symptoms you are seeing. If the stored procedure updates other tables, it can still work, but its last line will need to re-select the PK id.
So, in the last line of the stored procedure attached to the table, you need to re-select the existing PK value.
eg:
SELECT @MyPK as ID
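For context, a minimal illustration of why a trigger on the table can break the SELECT @@IDENTITY lookup that Access performs after an insert (the Orders/AuditLog names here are hypothetical):

-- A trigger that inserts into a second identity table "steals" @@IDENTITY:
CREATE TABLE dbo.Orders   (OrderID int IDENTITY(1,1) PRIMARY KEY, Descr varchar(50));
CREATE TABLE dbo.AuditLog (AuditID int IDENTITY(1,1) PRIMARY KEY, Info varchar(50));
GO
CREATE TRIGGER trg_Orders_Insert ON dbo.Orders AFTER INSERT AS
    INSERT INTO dbo.AuditLog (Info) SELECT 'order added' FROM inserted;
GO
INSERT INTO dbo.Orders (Descr) VALUES ('test');
SELECT @@IDENTITY;        -- returns AuditLog.AuditID, the value Access picks up (the wrong row)
SELECT SCOPE_IDENTITY();  -- returns Orders.OrderID, the value you actually want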
If an ETL process attempts to detect data changes on system-versioned tables in SQL Server by selecting rows whose rowversion column falls within a rowversion "delta window", e.g.:
where row_version >= @previous_etl_cycle_rowversion
and row_version < @current_etl_cycle_rowversion
.. and the values for @previous_etl_cycle_rowversion and @current_etl_cycle_rowversion are selected from a logging table whose newest rowversion gets appended to said logging table at the start of each ETL cycle via:
insert into etl_cycle_logged_rowversion_marker (cycle_start_row_version)
select @@DBTS
... is it possible that a rowversion of a record falling within a given "delta window" (bounded by the 2 @@DBTS values) could be missed/skipped due to rowversion's behavior vis-à-vis transactional consistency? - i.e., is it possible that rowversion would be reflected on a basis of "eventual" consistency?
I'm thinking of a case where say, 1000 records are updated within a single transaction and somehow @@DBTS is "ahead" of the record's committed rowversion yet that specific version of the record is not yet readable...
(For the sake of scoping the question, please exclude any cases of deleted records or immediately consecutive updates on a given record within such a large batch transaction.)
If you make sure to avoid row versioning for the queries that read the change windows you shouldn't miss many rows. With READ COMMITTED SNAPSHOT or SNAPSHOT ISOLATION an updated but uncommitted row would not appear in your query.
But you can also miss rows that got updated after you query @@DBTS. That's not such a big deal usually as they'll be in the next window. But if you have a row that is constantly updated you may miss it for a long time.
But why use rowversion? If these are temporal tables you can query the history table directly. And Change Tracking is better and easier than using rowversion, as it tracks deletes and optionally column changes. The feature was literally built to replace the need to do this manually, which "usually involved a lot of work and frequently involved using a combination of triggers, timestamp columns, new tables to store tracking information, and custom cleanup processes".
Under SNAPSHOT isolation, it turns out the proper function for inspecting rowversion, one that ensures contiguous delta windows while not skipping rowversion values attached to long-running transactions, is MIN_ACTIVE_ROWVERSION() rather than @@DBTS.
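As a sketch of that change, assuming the same logging table as in the question, the window marker would be logged like this (the delta-window predicate itself keeps the same shape):

insert into etl_cycle_logged_rowversion_marker (cycle_start_row_version)
select MIN_ACTIVE_ROWVERSION();  -- lowest rowversion still active in the database
                                 -- (or @@DBTS + 1 if no transactions are active),
                                 -- so rows touched by in-flight transactions are not skipped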
So I am using a Postgres-type database and I have a function that updates rows in the database. For some reason, every time I change something, it "pushes" the row to the end of the table rather than keeping it in the position where it was.
This is an example of me updating the data (this is part of the function):
users.query.filter_by(username = user).update(dict(computer_id = assign_value, level=level))
db.session.commit()
But for some reason, whenever I look at the users table, I can see that whatever row I updated has been pushed to the end of the table.
There is no such thing as an ordering on the records of a table. Internally, updating a record is handled as inserting a newer version and, at some later time, deleting the older version (if the transaction completes, the older version should not be needed again, at least not by newer transactions). From this point of view, it even makes some sense that the record is "moved" to the end of the table (even though the table does not have any start or end).
If you want a certain ordering, consider querying the data with an appropriate ORDER BY (or whatever function or option your framework uses to do this). If you query data and do not specify an ordering, the retrieved records may be shuffled in any way. Never rely on things like "If I only insert into this table, the data will always be returned in the same sequence as I inserted it" (even though this might be true under some circumstances).
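For instance, a minimal sketch of asking for the order you want instead of relying on physical row order (the users table and username column come from the question; any other column would do as the sort key):

SELECT *
FROM   users
ORDER  BY username;   -- or ORDER BY whatever column defines "the same position" for you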
I have a database table which has more than 1 million records uniquely identified by a GUID column. I want to find out which of these records or rows were selected or retrieved in the last 5 years. The select query can happen from multiple places. Sometimes the row will be returned as a single row; sometimes it will be part of a set of rows. There is a select query that does the fetching over a JDBC connection from Java code, and a SQL procedure also fetches data from the table.
My intention is to clean up the database table. I want to delete all rows which were never used (retrieved via a select query) in the last 5 years.
Does Oracle DB have any built-in metadata which can give me this information?
My alternative solution was to add a LAST_ACCESSED column and update it whenever I select a row from this table. But this is a costly operation for me in terms of the time taken for the whole process: at least 1,000 - 10,000 records will be selected from the table for a single operation. Is there any efficient way to do this rather than updating the table after reading it? Mine is a multi-threaded application, so updating such a large data set may result in deadlocks or a long waiting period for the next read query.
Any elegant solution to this problem?
Oracle Database 12c introduced a new feature called Automatic Data Optimization that brings you Heat Maps to track table access (modifications as well as read operations). Be careful: the feature currently has to be licensed under the Advanced Compression Option or the In-Memory Option.
Heat Maps track whenever a database block has been modified or whenever a segment, i.e. a table or table partition, has been accessed. It does not track select operations per individual row, nor per individual block, because the overhead would be too heavy (data is generally read often and concurrently; having to keep a counter for each row would quickly become a very costly operation). However, if you have your data partitioned by date, e.g. creating a new partition for every day, you can over time easily determine which days are still read and which ones can be archived or purged. Partitioning is also an option that needs to be licensed.
Once you have reached that conclusion you can then either use In-Database Archiving to mark rows as archived or just go ahead and purge the rows. If you happen to have the data partitioned you can do easy DROP PARTITION operations to purge one or many partitions rather than having to do conventional DELETE statements.
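A minimal sketch of what that looks like, assuming Oracle 12c+ with the licensing noted above (the schema and table names are placeholders):

-- enable Heat Map tracking for the instance
ALTER SYSTEM SET HEAT_MAP = ON;

-- later, check when each segment/partition of the table was last read or written
SELECT object_name,
       subobject_name,      -- partition name, if the table is partitioned
       segment_read_time,   -- last read access recorded by Heat Map
       segment_write_time   -- last modification recorded by Heat Map
FROM   dba_heat_map_segment
WHERE  owner = 'MYSCHEMA'
AND    object_name = 'MY_TABLE';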
I couldn't use any built-in solutions. I tried the solutions below:
1) The DB audit feature for select statements.
2) Adding a trigger to update a date column whenever a select query is executed on the table.
Both were discarded: auditing uses up a lot of space and has a performance hit, and similarly the trigger also had a performance hit.
Finally, I resolved the issue by maintaining a separate table into which entries older than 5 years that are still used or selected in a query are inserted. While deleting, I cross-check this table and avoid deleting entries present in it.
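A hypothetical sketch of that approach (all table and column names here are placeholders, not from the original post):

-- side table the application writes to whenever an old row is actually read
CREATE TABLE still_used_rows (
    row_guid   RAW(16) PRIMARY KEY,
    last_seen  DATE DEFAULT SYSDATE
);

-- cleanup: only delete old rows that never showed up in the side table
DELETE FROM my_table t
WHERE  t.created_date < ADD_MONTHS(SYSDATE, -60)
AND    NOT EXISTS (SELECT 1 FROM still_used_rows u WHERE u.row_guid = t.guid_col);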
I'm trying to create an SSIS package which will periodically send data to another database. I want to send only new records (I need to keep the sent records), so I created a status column in my source table.
I want my package to update this column after successfully sending the data, but I can't update all rows with "unsent" status, because during package execution some rows may have been added. I also can't use transactions (I mean isolation levels that would solve my problem: I can't use Serializable because I mustn't prevent users from adding new rows, and the Sequence Container doesn't support Snapshot).
My next idea was to use a Recordset and, after sending the data to the other DB, use it to get the IDs of the sent rows, but I couldn't find a way to use it as a data source.
I don't think I should set the status to "to send" and then update it to "sent"; I believe it would be too costly.
Now I'm thinking about using a temporary table, but I'm not convinced that this is the right way to do it. Am I missing something?
The Recordset is a destination. You cannot use it as a source in a Data Flow task.
But since the data is saved to a variable, it is available in the Control flow.
After completing the Data Flow, come back to the control flow and create a Foreach component that can run on the ResultSet variable.
Read each Record Set value into a variable and use it to run an update query.
Also, see if the "Lookup Transform" can be useful to you. You can generate rows that match or don't match.
I will improve the answer based on the discussion.
What you have here is a very typical data mirroring problem. To start with, I would not simply have a boolean that signifies that a record was "sent" to the destination (mirror) database. At the very least, I would put a LastUpdated datetime column in the source table, and have triggers on that table, on insert and update, that put the system date into that column. Then, every day I would execute an SSIS package that reads the records updated in the last week, checks to see if those records exist in the destination, splitting the datastream into records already existing and records that do not exist in the destination. For those that do exist, if the LastUpdated in the destination is less than the LastUpdated in the source, then update them with the values from the source. For those that do not exist in the destination, insert the record from the source.
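A brief sketch of the LastUpdated plumbing described above, with the insert side handled by a default and the update side by a trigger (the dbo.SourceTable name and the Id key are placeholders):

ALTER TABLE dbo.SourceTable
    ADD LastUpdated datetime NOT NULL
        CONSTRAINT DF_SourceTable_LastUpdated DEFAULT (GETDATE());
GO
CREATE TRIGGER trg_SourceTable_TouchLastUpdated
ON dbo.SourceTable
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE s
    SET    LastUpdated = GETDATE()
    FROM   dbo.SourceTable AS s
    JOIN   inserted AS i ON i.Id = s.Id;   -- Id is the assumed primary key
END;
GO
-- the daily package then reads roughly:
-- SELECT ... FROM dbo.SourceTable WHERE LastUpdated >= DATEADD(day, -7, GETDATE());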
It gets a little more interesting if you also have to deal with record deletions.
I know it may seem wasteful to read and check a week's worth every day, but your database should hardly feel it, it provides a lot of good double checking, and it saves you a lot of headaches by providing a simple, error-tolerant algorithm. If some record does not get transferred because of some hiccup on the network, no worries; it gets picked up the next day.
I would still set up the SSIS package as a server task that sends me an email with any errors, so that I can keep track. Most days you get no errors, and when there are errors, you can wait a day or resolve the cause and let the next day's run pick up the problems.
I am doing a similar thing; in my case, I have a status on the source record.
I read in all records with a status of "New".
Then I use an OLE DB Command to execute SQL on each row, changing the status to "In Progress" (in your WHERE clause, enter a ? as the value on the Component Properties tab, and you can configure it as a parameter from the table row, such as an ID or some PK, on the Column Mappings tab).
Once the records are processed, you can change all "In Progress" records to "Success" or something similar using another OLE DB Command.
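A sketch of the two statements involved (the dbo.SourceTable, Status, and Id names are placeholders; the ? is the parameter mapped from the row's key on the Column Mappings tab):

-- per-row statement executed by the OLE DB Command inside the Data Flow
UPDATE dbo.SourceTable
SET    Status = 'In Progress'
WHERE  Id = ?;

-- one batch statement once the rows have been sent successfully
UPDATE dbo.SourceTable
SET    Status = 'Success'
WHERE  Status = 'In Progress';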
Depending on what you are doing, you can use the status to mark records that errored at some point, and require further attention.