multi_schema and side effect problems - database

I am working on a project in which we need to define agencies in other cities. We have the same application but a separate database schema for each agency. I use a single session factory. For each request we get the person's username, from which we can recognize which agency they belong to, and we change the PostgreSQL search_path accordingly (a sketch of the switch is below).
The problem now is with the cache: since we are changing the schema constantly, it seems the cache does not work.
Our jobs (written with Quartz scheduling) also seem to have problems because we are changing the schema constantly.
Any ideas?
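For reference, the per-request switch is essentially just the following; the schema name is made up for the example, in practice it comes from the username lookup.

-- Run at the start of each request, on the connection that will serve it.
SET search_path TO agency_berlin, public;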

Can you not use a session factory per schema?

Related

Domain Driven Design, should I use multiple databases or a single source of truth

I'm about to propose some fundamental changes to my employers and would like the opinion of the community (opinion because I know that getting a solid answer to something like this is a bit far-fetched).
Context:
The platform I'm working on was built by a tech consultancy before I joined. While I was being onboarded they explained that they used DDD to build it: there are 2 domains, the client side and the admin side, and each has its own database, its own GraphQL server, and its own back-end and front-end frameworks. The data in the tables is synchronized between the two through an HTTP service that's triggered by the GraphQL server on row insertions, updates, and deletes.
Problem:
All of the data present in the client domain is also found in the admin domain; there's no domain-specific data there. Synchronization is a mess and is buggy. The team isn't large enough to manage all the resources and keep track of the different schemas.
Proposal:
Remove the client database and GraphQl servers, have a single source of truth database for all the current and potentially future applications. Rethink the schema, split the tables that need to be split, consolidate the ones that should be joined, and create new tables according to the actual current business flow.
Am I justified in my proposal, or was the tech consultancy doing the right thing and I'm sending us backwards?
Normally you have a database, or schema, for each separate bounded context. That means the initial idea of the consultancy company was correct.
What's not correct is the way the consistency between the two is managed. You don't do it on table changes but with services inside one (or both) of the domains listening to events and performing the update actions. It's still a lot of work, because you have to update the event handlers on every change (to the events or the table structure).
This code is what's called an anti-corruption layer, and that's exactly what it does: it prevents any corruption between the copies of one domain's data held inside another domain.
That said, as you pointed out, your team is small, and maintaining such a layer (and hence the code) could cost a lot of energy. But you also have to remember that once it's done, you only have to update it when needed.
Anyway, back to the proposal: you could also take this route. What you should (must, I would say) ensure is that in each domain the external tables are accessed only by a few services, or queries, and that this code never, ever modifies the content it accesses. Never. But I suppose you already know this.
Nothing is written in stone; the rules should always be adapted to the real context. Two separate databases mean more work, but also a much better separation of the domains: it can never happen that someone accidentally modifies the content of the other domain's tables. On the other hand, one database means less work, but also much more care about what the code does.

asp.net code first automatic database updates

I am creating an application in C# ASP.NET using Code First Entity Framework that will be using a different database for each customer (in other words, every customer has its own database, which will be generated on first use).
I am trying to figure out a way to update all these databases automatically whenever I apply changes to my objects. In other words, how would I approach a clean, step-based upgrade system in Code First EF?
Currently I am using InitializerIfModelChange to define a simple database that allows me to test my application whenever a schema change occurs. However, this method drops the database, which obviously is unacceptable in the case of customer databases.
I must assume hundreds of customers, so updating all the databases by hand is not an option.
I do not mind writing code that copies the data into a new database.
I think the best solution would be a way to back up a database somehow and then reinsert all data into the newly created database. Even better would be a way that automatically updates the schema without dropping the database. However, I have no idea how to approach this. Can anyone point me in the right direction?
The link posted by Joakim was helpful. It requires you to update to EF 4.3.1 (don't forget your references in other projects if you have them), after which you can run the command that enables migrations. To automatically update the schema from code you can use:

// Configuration is the migrations Configuration class generated when you enable migrations.
Configuration configuration = new Configuration();
DbMigrator migrator = new DbMigrator(configuration);
migrator.Update();   // applies any pending migrations to the target database

// Disable the initializer so EF no longer tries to drop/recreate the database itself;
// typically your own context type goes here rather than the base DbContext.
Database.SetInitializer<DbContext>(null);

Proper change-logging impossible with Entity Framework?

I'd like to log all changes made to an SQL Azure database using Entity Framework 4.
However, I have failed to find a proper solution so far.
So far I can track the entities themselves by overriding SaveChanges() and using the ObjectStateManager to retrieve all added, modified, and deleted entities. This works fine. Unfortunately, I don't seem to be able to retrieve any useful information out of the RelationshipEntries. Our database model has some many-to-many relationships where I want to log new/modified/deleted entries too.
I want to store all changes in Azure Storage, to be able to follow changes made by a user and perhaps roll back to a previous version of an entity.
Is there any good way to accomplish this?
Edit:
Our scenario is that we're hosting a RESTful WebService that contains all the business logic and stores the data in the Azure SQL Database. A client must be authenticated as a user with the WebService, and I need to store which user changed the data.
See FrameLog, an Entity Framework logging library that I wrote for this purpose. It is open-source, including for commercial use.
Even if you don't want to use the library, you can look at the code to see one way of handling logging relationships. It handles all relationship multiplicities.
Particularly, see the code for the private methods logRelationshipChange and logForeignKeyChange in the ChangeLogger class.
You can do it with a tracing provider.
You may want to consider just using a database trigger for this. Whenever a value in a table is changed, copy the row to another Archive table. It has worked pretty well for me.
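As a rough sketch of that approach (the table, columns, and trigger name are made up for the example; SQL Azure supports ordinary T-SQL triggers), it could look something like this:

-- Hypothetical example: archive the previous version of every changed or deleted row.
CREATE TRIGGER trg_Orders_Archive
ON dbo.Orders
AFTER UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- "deleted" holds the pre-change version of each affected row.
    INSERT INTO dbo.Orders_Archive (OrderId, CustomerId, Total, ArchivedAt)
    SELECT OrderId, CustomerId, Total, SYSUTCDATETIME()
    FROM deleted;
END;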

Is there a way to prevent users from doing bulk entries in a Postgresql Database

I have 4 new data entry users who are using a particular GUI to create/update/delete entries in our main database. The "GUI" client allows them to see database records on a map and make modifications there, which is fine and the preferred way of doing it.
But lately a lot of them have been accessing the local database directly using pgAdmin and running bulk queries (update, insert, delete, etc.), which introduces a lot of problems, like people updating a lot of records without realizing it or making mistakes while setting values. It also affects our logging procedures, as we calculate averages and timestamps for reporting purposes, which are quite crucial to us.
So is there a way to prevent users from using pgAdmin (please remember a lot of these users are working from home and we do not have access to their machines) and running SQL queries directly against the database?
We still have to give them access to certain tables and allow them to execute SQL as long as it comes through a certain client, but deny access to the same user when they try to execute a query directly against the database.
The only sane way to control access to your database is to convert your database access to a 3-tier structure. You should build a middleware layer (maybe a REST API or something similar) and use this API from your app. The database should be hidden behind this middleware, so no direct access is possible. From the database's point of view, there is no way to tell whether a connection comes from your app or from some other tool (pgAdmin, plain psql, or some custom-built client). Your database should be accessible only from trusted hosts, and clients should not have access to those hosts.
This is only possible if you use a trick (which might get exploited, too, but maybe your users are not smart enough).
In your client app, set some harmless parameter like geqo_pool_size=1001 (if it is normally 1000).
Now write a trigger that checks whether this parameter is set and raises "No access through pgAdmin" if it is not set the way your app sets it (and the username is not your admin username).
Alternative: create a temporary table and check for its existence.
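A rough PostgreSQL sketch of that trick; the function, trigger, table, and admin role names are all made up for the example, and current_setting() is what reads the parameter the client set:

-- Hypothetical guard: reject writes unless the client set the "magic" parameter value.
CREATE OR REPLACE FUNCTION block_unofficial_clients() RETURNS trigger AS $$
BEGIN
    IF current_setting('geqo_pool_size') <> '1001'
       AND current_user <> 'admin_user' THEN
        RAISE EXCEPTION 'No access through pgAdmin';
    END IF;
    IF TG_OP = 'DELETE' THEN
        RETURN OLD;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER check_client_before_write
    BEFORE INSERT OR UPDATE OR DELETE ON some_protected_table
    FOR EACH ROW EXECUTE PROCEDURE block_unofficial_clients();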
I believe you should block direct access to the database and set up an application to which your clients (human and software ones) will be able to connect.
Let this application filter and pass only allowed commands.
Great care should be taken in the filtering; I would think carefully about whether raw SQL should be allowed at all. Personally, I would design some simplified API, which would assure me that a hypothetical client-attacker ("In God we trust, all others we monitor") could not find a way to sneak in some dangerous modification.
I suppose that from a security standpoint your current approach is very unsafe.
You should study advanced pg_hba.conf settings.
This file is the key point for user authorization. The basic settings only cover simple authentication methods like passwords and lists of IP addresses, but you can use more advanced solutions:
GSSAPI
Kerberos
SSPI
RADIUS server
any PAM method
So your official client can use a more advanced method, like something involving a third-tier API or some really complex authentication mechanism. Then, without using the application, it at least becomes difficult to redo these tasks; if the Kerberos key is encrypted in your client, for example.
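For example, a hypothetical pg_hba.conf fragment that only lets the data-entry role in via GSSAPI/Kerberos and rejects everything else (the role name, database, and network are made up; the first matching line wins):

# TYPE   DATABASE   USER        ADDRESS          METHOD
hostssl  maindb     data_entry  203.0.113.0/24   gss
host     all        all         0.0.0.0/0        reject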
What you want to do is REVOKE your users' write access, then create a new role with write access, and then as this role CREATE FUNCTION defined as SECURITY DEFINER, which updates the table in a way you allow, with integrity checks; then GRANT EXECUTE on this function to your users.
There is an answer on this topic on ServerFault which references the following blog entry with a detailed description.
I believe that using middleware, as other answers suggest, is unnecessary overkill in your situation. The above solution does not require the users to change the way they access the database; it just restricts them to modifying the data only through predefined server-side methods.
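A minimal PostgreSQL sketch of that setup; the role, table, column, and function names are made up for the example:

-- Take away direct write access from the data-entry role.
REVOKE INSERT, UPDATE, DELETE ON mytable FROM data_entry;

-- A role that does have write access and will own the function.
CREATE ROLE table_writer NOLOGIN;
GRANT INSERT, UPDATE, DELETE ON mytable TO table_writer;

-- SECURITY DEFINER makes the function run with its owner's privileges,
-- so callers can modify the table only through this code path.
CREATE OR REPLACE FUNCTION update_mytable(p_id integer, p_value text)
RETURNS void AS $$
BEGIN
    -- Integrity checks go here.
    IF p_value IS NULL THEN
        RAISE EXCEPTION 'value must not be null';
    END IF;
    UPDATE mytable SET value = p_value WHERE id = p_id;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

ALTER FUNCTION update_mytable(integer, text) OWNER TO table_writer;
GRANT EXECUTE ON FUNCTION update_mytable(integer, text) TO data_entry;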

Bring current user to the database layer

I have a classic 3-tier web application built with MySQL and Tomcat.
I want to save the creator ID of each database record, for all tables, in a creator_id column, or just log it somewhere.
The current user is stored in the HTTP session object.
Modifying all queries to pass a creator ID parameter is unacceptable.
Can I solve the problem using triggers, ALTER TABLE commands, etc.?
What is the best way to do that?
PS. Hacks are acceptable and welcome.
The database can't possibly know which site user is sending the query; all it knows is which database user. And if it's a web application, it's probably the same database user all the time, no matter who is logged in on the website.
The short answer is no; you're going to have to go with your "unacceptable" option, unless you want to create a database user for every site user and have the site open the database connection as that user instead of as one "shared" user. But that may end up causing more problems than it solves.
Based on what you say in your question, your logical application user ID is different from your database connection ID. If that is the case, how can the database possibly know what your logical application user ID is? Unless you pass it in, there is no way for it to know who is doing what. You say that it is unacceptable to modify all queries to pass this in; however, you would only need to modify the saves where you want to record this creator_id value. You will need to modify those tables as well. Hopefully you have a table that contains all of these users, so you can add a foreign key from the new column to that table.
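If a hack really is acceptable, one common compromise (a sketch, not something from the answers above) is to pass the user in exactly once per request by setting a MySQL session variable from the web tier and letting a trigger read it; the table, column, and variable names are made up:

-- Hypothetical: the application runs this once per request, right after taking
-- a connection from the pool, so the value cannot leak between users.
SET @app_user_id = 42;

-- A trigger then stamps the creator on insert, so individual INSERT statements need not change.
CREATE TRIGGER orders_set_creator
BEFORE INSERT ON orders
FOR EACH ROW
SET NEW.creator_id = @app_user_id;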

Resources