I've used the reFind utility to migrate a large (> 1200 TQuery components) BDE application to FireDAC. Since True is the default value for the RequestLive property of TFDQuery, many queries that were previously read-only are now writable. When such a query is connected to a grid, this is a problem, since users can edit data they should not be allowed to change. I expect I will need to fix these "manually". Question 1: Did I miss something in the conversion, or is this to be expected?
For the queries NOT connected to grids, the fact that they are writable is probably not a problem. Question 2: Is there any advantage (such as performance) to a query being read-only rather than writable?
This is a Postgres-specific question. I am in the middle of the classic design situation where I have to decide whether to use stored procedures or dynamic SQL (prepared statements). I have read a lot of blog posts on the subject and have come to the conclusion that, with current implementations of advanced database systems, there isn't any single attribute that would weigh one over the other.
Hence my question is PostgreSQL-specific.
What I want to ask is, are there advantages or disadvantages of using Stored Procedures in Postgres?
More about my design: As we are using Postgres-specific functions like width_bucket and relying on various other features Postgres provides, such as partitioning and inheritance, it is unlikely that we would switch to any other database provider in the future. Our queries will be complex, involving the building of graphs and reports from real-time and non-real-time data.
There will also be some analytics built on top. Moreover, we are planning to shard and partition our database.
I want viewpoints on the use of stored procedures with the type of system and environment I have described above, specific to PostgreSQL.
I would also like to understand how query optimization and execution work in Postgres.
OK, so your question is whether to create SQL on the client side and send it to the server, versus using stored procedures. Note that if you use stored procedures, you usually still have to create the SQL that calls them, so it is not purely an either/or. This is really about a relational interface versus stored procedures.
Additionally it is worth noting that a key question is whether this is a database owned by an application or something that many applications may use. In the former, you may not worry about encapsulation, but in the latter you want to think about your database as having a service interface.
So if it is "my application has a database and all material use goes through my application" then go with dynamic SQL against the underlying tables.
If more than one application uses your database, however, you want to make sure you can change the database structure without breaking those applications. This usually means encapsulating access behind some sort of abstract interface, which can be done with VIEWs or stored procedures.
Views have the advantage that they can be directly manipulated in SQL and are very flexible. This allows wide-open retrieval (and, with some work, storage) of the data behind them. The application does not need to know how data is physically stored, just how to access it.
Stored procedures have the same benefit of encapsulation but provide a much more limited interface. They also have the problem that people usually use them in ways that require a fixed number of arguments, so adding an argument requires closely coordinated updates to both the database and the application (Oracle's edition-based redefinition is a solution to this problem, but PostgreSQL has nothing similar). However, one can discover arguments and handle them appropriately at run-time with a little work.
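To make that concrete, here is a minimal JDBC sketch of the two styles (the table, function, and column names are all hypothetical). Note how the stored-function style still involves client-side SQL, and how declaring a new argument with a DEFAULT in PostgreSQL lets existing call sites keep working when the function grows:

    import java.sql.*;

    public class AccessStyles {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/mydb", "user", "secret")) {

                // Style 1: dynamic SQL against the underlying table.
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT id, total FROM orders WHERE status = ?")) {
                    ps.setString(1, "open");
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) System.out.println(rs.getLong("id"));
                    }
                }

                // Style 2: the same retrieval behind a stored function.
                // If the function was created as
                //   CREATE FUNCTION get_orders(p_status text,
                //                              p_limit int DEFAULT 100) ...
                // then callers that pass only p_status keep working after
                // p_limit is added.
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT id, total FROM get_orders(?)")) {
                    ps.setString(1, "open");
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) System.out.println(rs.getLong("id"));
                    }
                }
            }
        }
    }

Either way, the application still sends SQL; the stored-procedure route just narrows what that SQL can touch.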
All in all, this is a broad question, and the specifics will matter more than the generalities.
I maintain an application which has many domain entities that draw data from more than one database. The way this normally works is that the entities are loaded from Database A (in which most of their fields are stored). When a property corresponding to data in Database B is accessed, the entity fires off SQL to Database B to get all the relevant data.
I'm currently using a 'roll-your-own' ORM, which is ugly, but effective (and easy to understand). I've recently started using NHibernate for entities drawn solely from Database A, but I'm wondering how I might use NHibernate for entities drawn from both Databases A and B.
The best way I can think of to do this is as follows. I continue to use an NHibernate-based class library for entities in Database A. Those entities which also need data from Database B expose all their Database B data through a single class accessed via a property. When this property is called, it invokes the appropriate repository and the object is returned. The class library for accessing Database B would therefore need to be referenced from the class library for accessing Database A.
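For what it's worth, here is roughly what that shape looks like, sketched in Java rather than C# and with hypothetical names (BillingInfo standing in for the Database B data, and a stubbed repository in place of real SQL):

    // Hypothetical names throughout, and Java rather than C#, but the shape
    // is the same: the ORM maps Customer against Database A, and the
    // Database B data is pulled in the first time the property is touched.
    class BillingInfo {
        final String plan;
        BillingInfo(String plan) { this.plan = plan; }
    }

    class BillingRepository {
        // Would fire off SQL against Database B; stubbed here.
        static BillingInfo findByCustomerId(long id) {
            return new BillingInfo("basic");
        }
    }

    public class Customer {
        private long id;              // persisted in Database A via the ORM
        private String name;
        private BillingInfo billing;  // filled in lazily from Database B

        public BillingInfo getBilling() {
            if (billing == null) {
                billing = BillingRepository.findByCustomerId(id);
            }
            return billing;
        }
    }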
Does this make any sense, and is there a more established pattern for this situation (which must be fairly common)?
Thanks
David
I don't know how well it maps to your situation, or how mature the NHibernate port of it is at this point, but you might want to look into Shards.
If it doesn't work for you as-is, it might at least supply some interesting patterns to consider.
EDIT (based on comments):
This indeed doesn't seem to map to your situation, as Shards is about horizontal splitting of data.
If you need to split vertically, you'll probably need to define multiple persistence units. Queries and transactions involving both databases will probably get interesting. I'm afraid I can't really help much with this. This question is definitely related though.
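In JPA terms (this is the Java/Hibernate side of things; NHibernate has an analogous per-database configuration object), multiple persistence units would look something like the sketch below, where "unitA" and "unitB" are hypothetical names defined in persistence.xml:

    import javax.persistence.EntityManagerFactory;
    import javax.persistence.Persistence;

    public class TwoUnits {
        public static void main(String[] args) {
            // One persistence unit per database, configured in persistence.xml.
            EntityManagerFactory dbA = Persistence.createEntityManagerFactory("unitA");
            EntityManagerFactory dbB = Persistence.createEntityManagerFactory("unitB");

            // Entities from Database A and Database B are managed separately;
            // a single query or transaction cannot span both factories directly.
            dbA.close();
            dbB.close();
        }
    }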
SQLite3 uses dynamic typing rather than static typing, in contrast to other flavors of SQL. The SQLite website reads:
Most SQL database engines (every SQL database engine other than SQLite, as far as we know) uses static, rigid typing. With static typing, the datatype of a value is determined by its container - the particular column in which the value is stored.
SQLite uses a more general dynamic type system. In SQLite, the datatype of a value is associated with the value itself, not with its container.
It seems to me that this is exactly what you don't want, as it lets you store, for example, strings in integer columns.
The page continues:
...the dynamic typing in SQLite allows it to do things which are not possible in traditional rigidly typed databases.
I have two questions:
The use case question: What are some examples where SQLite3's dynamic typing is beneficial?
The historical/design question: What was the motivation for implementing SQLite with dynamic typing?
This is called type affinity in SQLite.
According to the SQLite website, they have done this "in order to maximize compatibility between SQLite and other database engines." (see the above link)
SQLite supports the concept of "type affinity" on columns. The type affinity of a column is the recommended type for data stored in that column. The important idea here is that the type is recommended, not required. Any column can still store any type of data. It is just that some columns, given the choice, will prefer to use one storage class over another. The preferred storage class for a column is called its "affinity".
My understanding is that SQLite is exactly what it's named for - a very lightweight, minimalistic database engine. The overhead associated with strong typing is probably beyond the scope of the project, and best left to the application that uses SQLite.
But again, according to their website, they've done this to maximize compatibility with other DB engines.
If you look at, say, Firefox's "about:config" page, I believe these settings are actually stored in an SQLite database (I'm not 100% sure, though). The benefit of SQLite's dynamic typing is that each value in the settings can be strongly typed (e.g. the "alerts.totalOpenTime" setting is an integer, while "app.update.channel" is a string) without having to have one separate column per type.
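It is easy to see this in action. The sketch below uses the xerial sqlite-jdbc driver and made-up preference names; typeof() is SQLite's built-in way to ask which storage class a value actually has:

    import java.sql.*;

    public class PrefsDemo {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:sqlite::memory:");
                 Statement st = conn.createStatement()) {

                // One value column with no declared type: each row keeps its own type.
                st.execute("CREATE TABLE prefs (name TEXT PRIMARY KEY, value)");
                st.execute("INSERT INTO prefs VALUES ('alerts.totalOpenTime', 12)");
                st.execute("INSERT INTO prefs VALUES ('app.update.channel', 'release')");

                // typeof() reports the storage class actually used per row.
                try (ResultSet rs = st.executeQuery(
                        "SELECT name, value, typeof(value) AS t FROM prefs")) {
                    while (rs.next()) {
                        System.out.printf("%s = %s (%s)%n",
                                rs.getString("name"), rs.getString("value"),
                                rs.getString("t"));  // prints: integer, then text
                    }
                }
            }
        }
    }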
It's basically the same argument as for programming languages, in the end: why have dynamic typing in a programming language over static typing?
I need to choose a database management system (DBMS) that uses the least main memory, since we are severely constrained. Given that a DBMS will use more and more memory to hold its indexes in main memory, how exactly do I tell which DBMS has the smallest memory footprint?
Right now I just have a memory monitor program open while I perform a series of queries we'll call X. Then I run the same set of queries X on a different DBMS and see how much memory is used in its lifetime and compare with the other memory footprints.
Is this a not-dumb way of going about it? Is there a better way?
Thanks,
Jbu
Just use SQLite. In a single process. With C++, preferably.
What you can do in the application is manage how you fetch data. If you fetch all rows from a given query, it may try to build a Collection in your application, which can consume memory very quickly if you're not careful. This is probably the most likely cause of memory exhaustion.
To solve this, open a cursor to a query and fetch the rows one by one, discarding the row objects as you iterate through the result set. That way you only store one row at a time, and you can predict the "high-water mark" more easily.
Depending on the JDBC driver (i.e. the brand of database you're connecting to), it may be tricky to convince the JDBC driver not to do a fetchall. For instance, some drivers fetch the whole result set to allow you to scroll through it backwards as well as forwards. Even though JDBC is a standard interface, configuring it to do row-at-a-time instead of fetchall may involve proprietary options.
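As an example of such a proprietary knob: with the PostgreSQL JDBC driver you have to turn autocommit off before a non-zero fetch size makes the driver use a server-side cursor instead of fetching everything. A sketch, with hypothetical connection details and table name:

    import java.sql.*;

    public class StreamRows {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/mydb", "user", "secret")) {

                // PostgreSQL only uses a server-side cursor when autocommit is off.
                conn.setAutoCommit(false);

                try (Statement st = conn.createStatement(
                        ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
                    st.setFetchSize(100);  // fetch 100 rows per round trip

                    try (ResultSet rs = st.executeQuery("SELECT * FROM big_table")) {
                        while (rs.next()) {
                            process(rs);   // handle one row, then let it go
                        }
                    }
                }
            }
        }

        static void process(ResultSet rs) throws SQLException {
            // ... row-at-a-time work; no Collection of all rows is ever built
        }
    }

With MySQL's Connector/J, if memory serves, the trick is different (a fetch size of Integer.MIN_VALUE to enable streaming), which is exactly the kind of proprietary variation meant above.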
On the database server side, you should be able to manage the amount of memory it allocates to index cache and so on, but the specific things you can configure are different in each brand of database. There's no shortcut for educating yourself about how to tune each server.
Ultimately, this kind of optimization is probably answering the wrong question.
Most likely the answers you gather through this sort of testing are going to be misleading, because the DBMS will react differently under "live" circumstances than during your testing. Furthermore, you're locking yourself in to a particular architecture. It's difficult to change DBMS down the road, once you've got code written against it. You'd be far better served finding which DBMS will fill your needs and simplify your development process, and then making sure you're optimizing your SQL queries and indices to fit the needs of your application.
With really small sets of data, the policy where I work is generally to stick them into text files, but in my experience this can be a development headache. Data generally comes from the database, and when it doesn't, the process for setting and storing it tends to be hidden in the code. With the database you can generally see all the data available to you and the ways in which it relates to other data.
Sometimes for really small sets of data I just store them in an internal data structure in the code (like a Perl hash), but then when a change is needed, it's in the hands of a developer.
So how do you handle small sets of infrequently changed data? Do you have set criteria for when to use a database table, a text file, or something else?
I'm tempted to just use a database table for absolutely everything but I'm not sure if there are any implications to this.
Edit: For context:
I've been asked to put a new contact form on the website for a handful of companies, with more to be added occasionally in the future. Except, companies don't have contact email addresses... the users inside these companies do (as they post jobs through their own accounts). Now, though, we want a "speculative application" type of functionality, and the form needs an email address to send these applications to. But we also don't want to expose an email address as a property of the form, or else spammers can just use it as an open email gateway. So clearly, we need an ID -> contact_email type of relationship with companies.
So, I can either add a column to a table with millions of rows, which will be used, literally, about 20 times, OR create a new table that is going to hold at most about 20 rows. Typically, the way we have handled this in the past is to create a nasty text file and read it from there. But that creates maintenance nightmares, and these text files are frequently overlooked when the data they depend on changes. Perhaps this is a fault with the process, but I'm just interested in hearing views on this.
Put it in the database. If it changes infrequently, cache it in your middle tier.
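Something as small as this is usually all the middle-tier cache needs to be (a sketch; the TTL, names, and stubbed loader are placeholders):

    import java.util.List;

    // Minimal time-based cache for a rarely changing lookup (sketch).
    class ContactCache {
        private static final long TTL_MS = 10 * 60 * 1000;  // refresh every 10 min
        private static List<String> cached;
        private static long loadedAt;

        static synchronized List<String> get() {
            long now = System.currentTimeMillis();
            if (cached == null || now - loadedAt > TTL_MS) {
                cached = loadFromDatabase();  // hits the tiny lookup table
                loadedAt = now;
            }
            return cached;
        }

        private static List<String> loadFromDatabase() {
            // would SELECT from the lookup table; stubbed here
            return List.of("jobs@example.com");
        }
    }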
The example that springs to mind immediately is what is appropriate to have stored as an enumeration and what is appropriate to have stored in a "lookup" database table.
I tend to "draw the line" with the rule that if it will result in a column in the database containing a "magic number" that maps to an enumeration value, then the enumeration should really exist as a lookup table. If it's unrelated to the data stored in the database (eg. Application configuration data rather than user generated data), then it's an enumeration all the way.
Surely it depends on the user of the software tool you've developed to consume the set of data, regardless of size?
It might just be that they know Excel, so your tool would have to parse a .csv file that they create.
If it's written for the developers, then who cares what you use. I'm not a fan of cluttering databases with minor or transient data however.
We have a standard config file format (key:value) and a class to handle it. We just use that on all projects. Mostly we're just setting persistent properties for our applications (mobile phone development) so that's an appropriate thing to do. YMMV
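(If the platform is Java, the stock java.util.Properties class already reads that key:value format; a sketch with a made-up file and key name:)

    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Properties;

    public class Config {
        public static void main(String[] args) throws IOException {
            Properties props = new Properties();
            // Properties accepts both "key=value" and "key: value" lines.
            try (FileReader in = new FileReader("app.properties")) {
                props.load(in);
            }
            // "server.endpoint" is a made-up key; the second argument is a default.
            System.out.println(props.getProperty("server.endpoint", "not set"));
        }
    }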
In cases where the program accesses a database, I'll store everything in there: easier for backup and moving data around.
For small programs without database access I store my data in the .NET settings, which are stored in an XML file. Of course, this is a feature of C#, so it might not apply to you.
Anyway, I make sure to store all data in one place. Usually a database.
Have you considered SQLite? It's file-based, which addresses your feeling that "just a file might do" (zero configuration), but it's a perfectly good database and scales remarkably well. It supports a number of APIs and there are numerous front ends for administering it.
If this is small, config-like data, I use some simple and common format. INI, JSON, and YAML are usually fine; Java and .NET fans also like XML. In short, use something that you can easily read into an in-memory object and forget about it.
I would add it to the database in the main table:
Backup and recovery (you do want to recover this text file, right?)
Ad-hoc querying (since you can do it with a SQL tool and join it to the other database data)
If the database column is empty, the storage requirements for it should be minimal (nothing, if it's a NULL column at the end of the table in Oracle)
It will be easier if you want to have multiple application servers as you will not need to keep multiple copies of some extra config file around
Putting it into a little child table only complicates the design without giving any real benefits
You may well already be going to that same row in the database as part of your processing anyway, so performance is not likely to be a problem. If you are not, you could cache it in memory.
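Concretely, the whole change is one nullable column plus ordinary SQL, sketched here with hypothetical names and connection details:

    import java.sql.*;

    public class AddContactColumn {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/mydb", "user", "secret");
                 Statement st = conn.createStatement()) {

                // Nullable column on the existing table: empty for the millions
                // of rows that don't use it, populated for the ~20 that do.
                st.execute("ALTER TABLE companies ADD COLUMN contact_email text");

                // The ad-hoc querying benefit: a plain SQL lookup, no extra join.
                try (ResultSet rs = st.executeQuery(
                        "SELECT name, contact_email FROM companies " +
                        "WHERE contact_email IS NOT NULL")) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + " -> " + rs.getString(2));
                    }
                }
            }
        }
    }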