I want to update hundreds (or even thousands) of records at a time with Peewee's FlaskDB (set up in an app factory).
Following the Peewee documentation, I have bulk_update working well (and it is very fast compared to other methods), but it fails when I try to update in batches.
For example, Ticket.bulk_update(selected_tickets, fields=[Ticket.customer]) works great, but when I use the following code to update in batches, I receive the error below.
Code:

with db.atomic():
    Ticket.bulk_update(selected_tickets, fields=[Ticket.customer], batch_size=50)

Error:

AttributeError: 'FlaskDB' object has no attribute 'atomic'
What is the recommended way of updating records in bulk with FlaskDB? Does FlaskDB support atomic?
You are trying to access peewee.Database methods on the FlaskDB wrapper class. Those methods do not exist on the wrapper; you need to go through the underlying Peewee database:
# Here we assume db is a FlaskDB() instance:
peewee_db = db.database
with peewee_db.atomic():
    ...
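
Putting the two together, a minimal sketch of the batched update might look like the following (assuming db is the FlaskDB instance created in the app factory, and Ticket/selected_tickets are the names from the question):

# Sketch only: `db` is assumed to be the FlaskDB() instance from the app factory.
peewee_db = db.database          # unwrap the real peewee.Database

with peewee_db.atomic():         # one transaction wrapping all batches
    Ticket.bulk_update(
        selected_tickets,
        fields=[Ticket.customer],
        batch_size=50,           # peewee's keyword is batch_size
    )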
Background: I have a few models which are materialized as tables. These tables are populated with a wipe (truncate) and load. Now I want to protect the existing data in a table if the query used to populate it returns an empty result set. How can I make sure an empty result set does not replace the existing data in the table?
My table lives in Snowflake, and I am using dbt to model the output table.
In a nutshell: commit the transaction only when the SQL statement used returns a non-empty result set.
Have you tried dbt's ref() function, which allows you to reference one model within another?
https://docs.getdbt.com/reference/dbt-jinja-functions/ref
If you are loading data in a way that is not controlled via dbt and then using that table, the table is called a source. You can read more about this here.
dbt does not control what you load into a source; everything else, the T in ELT, is controlled wherever you reference a model via the ref() function. A great example: if you have a source that changes, you load it into a table, and you want to make sure the incoming data does not "drop" already-recorded data, then the "incremental" materialization is what you want. I suggest you read more about it here.
Thinking incrementally takes time and practice, and it is also recommended to do a --full-refresh every now and then.
You can have pre-hooks and post-hooks that check your sources with clever macros, and you can add dbt tests. We would really need a bit more context about what you have and what you want to achieve to suggest a concrete answer.
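
For illustration only, a minimal incremental model might look like the sketch below (the model, source and column names are made up, not taken from the question). Because an incremental run only inserts the rows the query returns, an empty upstream result set simply adds nothing instead of wiping the table:

-- models/protected_table.sql (hypothetical model; names are assumptions)
{{ config(materialized='incremental', unique_key='id') }}

select id, payload, loaded_at
from {{ source('raw', 'staging_table') }}

{% if is_incremental() %}
  -- only pick up rows newer than what is already in this table
  where loaded_at > (select max(loaded_at) from {{ this }})
{% endif %}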
I am having a problem with Django subqueries. When I fetch the original QuerySet, I specify the database I need to use. My hunch is that a later subquery ends up using the 'default' database instead of the one the parent query used.
My models look approximately like this (I have several):
class Author(models.Model):
    author_name = models.CharField(max_length=255)
    author_address = models.CharField(max_length=255)

class Book(models.Model):
    book_name = models.CharField(max_length=255)
    author = models.ForeignKey(Author, null=True)
Now I fetch a QuerySet representing all books called 'Mark', like so:
b_det = Book.objects.using('some_db').filter(book_name = 'Mark')
Then later, somewhere in the code, I trigger a subquery by doing something like:
if b_det:
    auth_address = b_det[0].author.author_address
My problem is that, seemingly at random in some cases on my live server, the subquery fails even though there is valid data for that author's id. My suspicion is that the subquery is not using the same database, 'some_db'. Is this possible? Is the database choice not sticky for subqueries? It is just a hunch that this might be the problem. It is happening in the context of a Celery worker; is it possible that the combination of Celery and the Django ORM has a bug?
Each time this has occurred, I have solved it by doing a full fetch with select_related, like so:
b_det = Book.objects.using('some_db').select_related('author').filter(book_name = 'Mark')
So right now, the only way for me to solve the problem is to determine beforehand all the data I will need and make sure the top-level fetch pulls in all those related models with select_related. Any ideas why something like this would fail?
I am unable to recreate this locally, otherwise I would have debugged it. Like I said, it is pretty random.
OK, I have a handle on this now. My assumption that the subqueries would remain sticky to the original database was wrong. What Django does is consult the configured database router first; only if that returns nothing does it use the original database.
So if the configured database router returns a database, that database gets used. In my opinion this is backwards: the original database should be used first, and the router consulted only after that.
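
For what it's worth, one way to make related lookups follow the parent object's database is to have the router honour the instance hint that Django passes for such queries. This is only a sketch; the router name and the rest of the routing setup are assumptions, not code from the question:

# Hypothetical router: routes related-object lookups to the database the
# parent instance was loaded from (e.g. 'some_db' in the question).
class StickyRelationRouter:
    def db_for_read(self, model, **hints):
        instance = hints.get("instance")   # set by Django for related lookups
        if instance is not None and instance._state.db:
            return instance._state.db      # stay on the parent's database
        return None                        # defer to other routers / the default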
I'm working on a data conversion utility which can push data from one master database out to a number of different databases. The utility itself will have no knowledge of how data is kept in the destination (table structure), but I would like to allow writing a SQL statement that returns data from the destination using a complex query with multiple joins, as long as the data comes back in a standardized format (field names) that the utility can recognize in an ADO query.
What I would like to do then is modify the live data in this ADO query. However, since there are multiple joins, I'm not sure whether that is possible. I know that BDE (which I've never used) was very strict and required returning all fields (*) and such. ADO, I know, is more flexible, but I don't know quite how flexible in this case.
Is it supposed to be possible to modify data in a TADOQuery in this manner, when the results include fields from different tables? And even if so, suppose I want to append a new record to the end (TADOQuery.Append). Would it append to two different tables?
The actual primary table I'm selecting from has a complementary table joined on the same primary key field; one is a "Small" table (brief info) and the other is a "Detail" table (more info for each record in the Small table). So a typical statement would include something like this:
select ts.record_uid, ts.SomeField, td.SomeOtherField
from table_small ts
join table_detail td on td.record_uid = ts.record_uid
There are also a number of other joins to records in other tables, but I'm not worried about appending to those ones. I'm only worried about appending to the "Small" and "Detail" tables - at the same time.
Is such a thing possible in an ADO Query? I'm willing to tweak and modify the SQL statement in any way necessary to make this possible. I have a bad feeling though that it's not possible.
Compatibility:
SQL Server 2000 through 2008 R2
Delphi XE2
Editing fields which have no influence on the joins is usually no problem.
Appending is another matter; you can limit the Append to one of the tables with:
procedure TForm.ADSBeforePost(DataSet: TDataSet);
begin
  inherited;
  TCustomADODataSet(DataSet).Properties['Unique Table'].Value := 'table_small';
end;
but without a Requery you won't get much further.
A better way would be to set the values yourself by procedure, e.g. in BeforePost, followed by a Requery and an Abort.
If your view were a persistent (database) view, you would be able to use INSTEAD OF triggers.
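
To illustrate the INSTEAD OF idea (SQL Server syntax, since the question targets SQL Server 2000 through 2008 R2): define a persistent view over the two tables and let a trigger split each appended row. The view and trigger names below are made up; only the tables and columns come from the question's query.

create view v_small_detail as
    select ts.record_uid, ts.SomeField, td.SomeOtherField
    from table_small ts
    join table_detail td on td.record_uid = ts.record_uid;
go

-- Appends against v_small_detail (e.g. from a TADOQuery selecting the view)
-- are intercepted and written to both underlying tables.
create trigger tr_v_small_detail_insert
on v_small_detail
instead of insert
as
begin
    insert into table_small (record_uid, SomeField)
        select record_uid, SomeField from inserted;

    insert into table_detail (record_uid, SomeOtherField)
        select record_uid, SomeOtherField from inserted;
end;
go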
Jerry,
I encountered the same problem on Firebird, and from experience I can tell you that it can be done (with a little added complexity) by using cached updates. A very good resource is this one: http://podgoretsky.com/ftp/Docs/Delphi/D5/dg/11_cache.html. This article has the answers to all your questions.
I have abandoned the original idea of live ADO query updates, as it became more complex than I could wrap my head around. The scope of the data push project has changed, so this is no longer an issue for me, though it is still an interesting subject to know about.
The new structure of the application consists of attaching multiple "Field Links" to various fields from the original set of data. Each of these links references the original field name and a SQL statement which is to be executed when that field is being imported. Multiple field links can be attached to a single field and can therefore execute multiple statements, placing the value in various tables, etc. The end goal was an app with which I can easily and repeatedly export a common dataset from an original source to any outside destination with a different data structure, without having to recompile the app.
However, the concept of cached updates was not appealing to me, simply because of the fact pointed out in the link in RBA's answer: data can be changed in the database in the meantime. So I will instead integrate my own method of customizable data pushes.
I'm receiving an error message in VS2010 after I execute the following code to get values from a SQLite database via an automatically generated ADO.NET Entity Data Model.
using (Data.DbEntities ent = new Data.DbEntities())
{
    var r = from tt in ent.Template_DB select tt;
    r.First(); // Required to cause the error
}
The SQLite database table being accessed is called 'Template' (which was renamed to Template_DB for the model) with a few columns holding strings, longs and bits. All queries I've tried return exactly what's expected.
The message I receive is:
ReleaseHandleFailed was detected
A SafeHandle or CriticalHandle of type
'Microsoft.Win32.SafeHandles.SafeCapiHashHandle' failed to properly
release the handle with value 0x0D0DDCF0. This usually indicates that
the handle was released incorrectly via another means (such as
extracting the handle using DangerousGetHandle and closing it directly
or building another SafeHandle around it.)
This message comes up perhaps 60% of the time, up to 8 seconds after the code has completed. As far as I'm aware, the database is not encrypted and has no password. Until recently, I've been using similar MS-SQL databases with Entity Framework models and never seen an error like this.
Help!
EDIT:
I downloaded and installed "sqlite-netFx40-setup-bundle-x86-2010-1.0.81.0.exe" to install SQLite, from here. This included the System.Data.SQLite 1.0.81.0 (3.7.12.1) package (not 3.7.13 as stated in the comment below).
I am working on an interface for CRUD operations using JPA, and the following question came to mind.
If I use persist for my create method, then I can catch an EntityExistsException if the method is called with an object whose ID is already in the database.
In the update method I will then use merge to save changes. In order not to create something that does not yet exist, I was thinking about looking it up first and throwing an exception if it is not found in the database.
Now I am thinking that this might be overkill: why not just let merge create the entity if it does not exist and update it if it does? But then what do I need the create method for?
So what do you think? Is it best practice to have a create method that only creates and throws an exception when trying to create something already in the database, and an update method that only lets you update something that already exists and therefore never creates?
I'd use just merge. If you understand (and document in your code) what it is doing, it is a good option. I've used it on several projects without any problem.
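
For illustration, here is a minimal sketch of the two styles discussed above; the class and method names (GenericDao and friends) are made up, not from the question:

import javax.persistence.EntityManager;

// Hypothetical DAO contrasting strict create/update with a merge-only save.
public class GenericDao<T> {

    private final EntityManager em;
    private final Class<T> type;

    public GenericDao(EntityManager em, Class<T> type) {
        this.em = em;
        this.type = type;
    }

    // Strict create: persist only; may throw EntityExistsException
    // (or fail at flush time) if the entity already exists.
    public void create(T entity) {
        em.persist(entity);
    }

    // Strict update: refuse to silently create a missing row.
    public T update(T entity, Object id) {
        if (em.find(type, id) == null) {
            throw new IllegalArgumentException("No existing row with id " + id);
        }
        return em.merge(entity);
    }

    // "Just merge" style: insert-or-update in one call.
    public T save(T entity) {
        return em.merge(entity);
    }
}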