Event Sourcing SQL Populate Parent and Child Table - sql-server

Following up from question
CQRS Read Model Design when Event Sourcing with a Parent-Child-GrandChild… relationship:
We use event sourcing with SQL Server 2016. Example: a furniture company.
(1) We have a Parent and a Child table. Say a FurnitureDescriptionTable (Parent table: description of all furniture items) and FurnitureOrders (Child table: multiple customer orders, referring to the FurnitureDescription table). Should the join column between these be a Guid or an Integer Identity in SQL?
(2) If Guid, who generates the Guid, the API or SQL? Any reason?

Choosing a type for primary/foreign keys is a well-known problem in the RDBMS world, and a quick search will turn up plenty of discussion. But still:
Guids are usually generated on the application side. This option is popular with CQRS, since command handlers can then create complete domain objects, including the identity. Otherwise you need a unique identity generator, which might be non-trivial but is still feasible in some databases, for example with Oracle sequences.
Numbers are usually chosen for database-generated ids. The new id is then only known once the row has been inserted into a table. In an event-sourcing scenario this is not an option, since you only insert on the read side, while objects are created on the write side.
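A minimal sketch of the application-generated Guid option, written in Python with an in-memory SQLite database standing in for SQL Server. The event shapes, the column names and the `project` function are assumptions made for illustration, not part of the original question. The point is only that the write side picks the identifiers and the read side stores them as they arrive.

```python
import sqlite3
import uuid

def new_description_event(name):
    # Write side: the command handler creates the identity itself.
    return {"type": "FurnitureDescribed",
            "furniture_id": str(uuid.uuid4()), "name": name}

def new_order_event(furniture_id, customer):
    # The child event carries both its own id and the parent id.
    return {"type": "FurnitureOrdered", "order_id": str(uuid.uuid4()),
            "furniture_id": furniture_id, "customer": customer}

def project(conn, event):
    # Read side: rows are inserted with the ids carried by the events.
    if event["type"] == "FurnitureDescribed":
        conn.execute(
            "INSERT INTO FurnitureDescription (Id, Name) VALUES (?, ?)",
            (event["furniture_id"], event["name"]))
    elif event["type"] == "FurnitureOrdered":
        conn.execute(
            "INSERT INTO FurnitureOrders (Id, FurnitureId, Customer) "
            "VALUES (?, ?, ?)",
            (event["order_id"], event["furniture_id"], event["customer"]))

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE FurnitureDescription (Id TEXT PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute("""CREATE TABLE FurnitureOrders (
    Id TEXT PRIMARY KEY,
    FurnitureId TEXT NOT NULL REFERENCES FurnitureDescription(Id),
    Customer TEXT NOT NULL)""")

chair = new_description_event("Oak chair")
orders = [new_order_event(chair["furniture_id"], c) for c in ("Alice", "Bob")]
for event in [chair] + orders:
    project(conn, event)

order_count = conn.execute(
    "SELECT COUNT(*) FROM FurnitureOrders WHERE FurnitureId = ?",
    (chair["furniture_id"],)).fetchone()[0]
```

Because the order events already contain the parent id, the read model never has to ask the database which key it generated.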

Related

Oracle APEX - Data Modeling & Primary Keys

I'm creating a rather large APEX application which allows managers to go in and record statistics for associates in the company. Currently we have an Oracle database with data from AD which holds all the associates' information: Name, Manager, Employee ID, etc.
Now I'm responsible for creating and modeling a table that will house all the stats for each employee. The table I have created has more than 90 columns in it. Some contain data such as:
Documents Processed
Calls Received
Amount of Doc 1 Processed
Amount of Doc 2 Processed
and the list goes on for well over 90 attributes. So here is my question:
When creating this table in my application with so many different columns, how would I go about choosing an appropriate primary key? Should I link it to our employee table using the employee's identification, which is unique (each has an associate number)?
Secondly, how can I create these tables (and possibly the form) so that I can associate the statistic I am entering for an individual with the actual individual?
I have ordered two books from Amazon on data modeling since I am new to APEX and DBA design. Not a fresh chicken, but new enough to need some guidance. An additional problem I am running into is that each form can have only 60 fields. So I had thought about splitting my 90+ columns into separate tables for different functions.
Thanks
APEX 4.2 allows for 200 items per page; see the Oracle APEX component limits.
A couple of questions come to mind:
Are you sure that the employee ids are not recyclable? If these ids are unique and not recycled, you've found yourself a good primary key.
What do you plan on doing when you decide to add a new metric? It seems you would have to add a new column to your rather large and likely not normalized table.
I'd recommend a vertical table for your metrics; you can use Oracle's pivot function to make your data appear more like a horizontal table.
If you went this route you would store your employee Id in one column, your metric key in another, and value...
I'd recommend that you create a metric table consisting of a primary key, a metric label, an active indicator, creation timestamp, creation user id, modified timestamp, modified user id.
This metric table will allow you to add new metrics, change the name of the metric, deactivate a metric, and determine who changed what and when.
This would be a much more flexible approach in my opinion. You may also want to think about audit logs.
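A rough sketch of the vertical layout, in Python with an in-memory SQLite database rather than Oracle. The table and column names are assumptions, the audit columns mentioned above are left out for brevity, and because SQLite has no pivot function, conditional aggregation is swapped in to produce the horizontal view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metric (
    metric_id INTEGER PRIMARY KEY,
    label     TEXT NOT NULL,
    active    INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE employee_metric (
    employee_id INTEGER NOT NULL,
    metric_id   INTEGER NOT NULL REFERENCES metric(metric_id),
    value       INTEGER NOT NULL,
    PRIMARY KEY (employee_id, metric_id)
);
INSERT INTO metric (metric_id, label)
VALUES (1, 'Documents Processed'), (2, 'Calls Received');
INSERT INTO employee_metric
VALUES (1001, 1, 40), (1001, 2, 12), (1002, 1, 55);
""")

# Conditional aggregation turns the rows back into columns,
# giving one output row per employee.
rows = conn.execute("""
    SELECT employee_id,
           SUM(CASE WHEN metric_id = 1 THEN value END) AS documents_processed,
           SUM(CASE WHEN metric_id = 2 THEN value END) AS calls_received
    FROM employee_metric
    GROUP BY employee_id
    ORDER BY employee_id
""").fetchall()
```

Adding a new metric in this layout is an insert into `metric`, not a schema change; an employee with no value for a metric simply shows NULL in that column.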

How to send data from OLE DB source to Anchor model tables using ETL procedure?

I'm currently solving this task: some data should be sent from AdventureWorks2012 to Anchor model tables on the same server in MsSQL.
This is my Anchor Model
At this point I have a pretty simple Integration Services project in Visual Studio and it looks like this.
Control flow:
For example Load_territories is:
The main requirement is to fill all tables of the Anchor model in MsSQL, but I'm constantly facing a problem: the number of attributes differs between tables and some of them repeat.
In this picture, in the second table, TR_ID, TR_GRP_TR_ID, TR_TID_TR_ID and TR_TNM_TR_ID basically contain the same values from dwh_key, but it's impossible to create a one-to-many relation between attributes. My tutor has recommended that I use Lookup, but I cannot figure out how to implement it in this project.
This may be considered cheating, but if you insert data into the latest view rather than the separate 6NF tables, all of those ID fields will be populated by the underlying trigger logic. I suspect that this would defeat the purpose of using SSIS though, since you would effectively be loading attributes sequentially rather than in parallel.
Another option is to leave surrogate key management to the ETL tool. This would require that you switch the data type for your identities from integers to GUIDs. SSIS can then generate a GUID and you can use that very same GUID to populate all the attributes. Note that the anchor would have to be loaded first, or you will get a foreign key violation.
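The GUID option can be sketched as follows, in Python with SQLite standing in for both SSIS and SQL Server. The key column names come from the question; the table names, the value columns and the sample rows are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE TR_Territory (TR_ID TEXT PRIMARY KEY);
CREATE TABLE TR_TNM_Territory_Name (
    TR_TNM_TR_ID TEXT PRIMARY KEY REFERENCES TR_Territory(TR_ID),
    TR_TNM_Value TEXT NOT NULL
);
CREATE TABLE TR_GRP_Territory_Group (
    TR_GRP_TR_ID TEXT PRIMARY KEY REFERENCES TR_Territory(TR_ID),
    TR_GRP_Value TEXT NOT NULL
);
""")

# Hypothetical source rows: (territory name, territory group).
source_rows = [("Northwest", "North America"), ("France", "Europe")]

for name, group in source_rows:
    key = str(uuid.uuid4())  # one key per source row, generated by the ETL step
    # The anchor goes first, otherwise the attribute inserts violate the FK.
    conn.execute("INSERT INTO TR_Territory VALUES (?)", (key,))
    conn.execute("INSERT INTO TR_TNM_Territory_Name VALUES (?, ?)", (key, name))
    conn.execute("INSERT INTO TR_GRP_Territory_Group VALUES (?, ?)", (key, group))

joined = conn.execute("""
    SELECT n.TR_TNM_Value, g.TR_GRP_Value
    FROM TR_Territory t
    JOIN TR_TNM_Territory_Name n ON n.TR_TNM_TR_ID = t.TR_ID
    JOIN TR_GRP_Territory_Group g ON g.TR_GRP_TR_ID = t.TR_ID
    ORDER BY n.TR_TNM_Value
""").fetchall()
```

Once the key is attached to the row in the data flow, the attribute inserts no longer depend on each other, only on the anchor.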
The most common solution though, is to leave surrogate key management to the database (and use integers). You would have a step in which you populate the metadata column in the anchor with the desired number of new identities to be created. Using the metadata number you can then select the newly generated identities and merge them into your data flow. It doesn't matter which number gets assigned to which row. After that all attributes can be populated in parallel, including their ID columns.
Of course, if this is intended to be used for more than an initial load, you would also have to add steps to detect if the data you are loading is already known or not.
I can also recommend watching the video tutorial referenced in this blog post: https://clinthuijbers.wordpress.com/2013/06/14/ssis-anchor-modeling-example-tutorial/

A way to differentiate datas when synchronize db?

I have a web app in which I can create notes. Each time I create a new note, it is inserted into a table with an auto_increment id (quite obvious).
Now I want to develop an Android app in which I can create notes too (saving them locally in SQLite) and then synchronize those notes with the server.
The problem is that notes created on my phone get their own auto_increment ids, which will often be the same as the ids of notes on the server!
I don't mind having duplicated notes (actually I don't think there is a way to tell whether a new note is a duplicate, because the notes don't have any physical id). The problem is that if they have the same id (primary key), I won't be able to insert them on the server.
Any suggestion?
You could use a UUID as the key for your note.
That way, each entry should have a unique id, whether it is created on the server or on the client.
To create a UUID, you can use UUID.randomUUID().
The most obvious solution would be to give each note its own unique hash or GUID in addition to the database's auto_increment_id.
You'd then use these unique values as the basis for synchronisation in conjunction with a "last synced" timestamp in each of the tables so that you know what data needs to be synced and can easily determine if the data already exists in the destination (and should be updated) or whether it's a new note.
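A small sketch of that idea in Python, with plain dictionaries standing in for the two databases. The note fields, the integer timestamps and the "newest modification wins" rule are assumptions made for the example.

```python
import uuid

def new_note(text, modified_at):
    # Each note gets a UUID when it is created, on whichever device creates it.
    return {"uuid": str(uuid.uuid4()), "text": text, "modified_at": modified_at}

def sync_to_server(server_notes, client_notes, last_synced):
    """Copy client notes changed since last_synced into server_notes.

    server_notes is a dict keyed by UUID. Returns (inserted, updated).
    """
    inserted, updated = 0, 0
    for note in client_notes:
        if note["modified_at"] <= last_synced:
            continue  # unchanged since the previous sync
        existing = server_notes.get(note["uuid"])
        if existing is None:
            server_notes[note["uuid"]] = dict(note)  # new note
            inserted += 1
        elif note["modified_at"] > existing["modified_at"]:
            server_notes[note["uuid"]] = dict(note)  # newer version of a known note
            updated += 1
    return inserted, updated

server = {}
phone = [new_note("buy milk", 10), new_note("call Bob", 20)]
first = sync_to_server(server, phone, last_synced=0)

# Edit one note on the phone, then sync again.
phone[0]["text"] = "buy oat milk"
phone[0]["modified_at"] = 30
second = sync_to_server(server, phone, last_synced=20)
```

The local auto_increment id can stay as the primary key on each side; only the UUID is compared during synchronization.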
I'm sorry, but I think that your DB structure is wrong. You cannot use an autoincrement field this way across different DBs in a disconnected architecture. Autoincrement values are meant for a specific use; if you need to merge two tables like this, you have to implement different logic. Use a note_id that identifies a note uniquely, built from more data (e.g. the user id, the device id, etc.). In this scenario autoincrement will give you a messy architecture at best.

inserting into a view in SQL server

I have a SQL Server as backend and use MS Access as frontend.
I have two tables (persons and managers). managers is derived from persons (a 1:1 relation), so I created a view managersFull which is basically:
SELECT *
FROM managers m
INNER JOIN persons p
ON m.id = p.id
id in persons is autoincrementing and the primary key; id in managers is the primary key and a foreign key referencing persons.id.
Now I want to be able to insert a new record with a form in MS Access, but I can't get it to work. There is no error message, no status line, nothing. The new rows aren't inserted, and I have to press Escape to cancel my changes to get back to design view in MS Access.
I'm talking about a managers form, and I want to be able to enter manager AND person information at the same time in a single form.
My question is now: is what I want to do here possible? If not, is there a "simple" workaround using after insert triggers or some lines of VBA code?
Thanks in advance
The problem is that your view spans several tables. When a view accesses multiple tables, a single insert or update can modify only one of them.
Please also check the MSDN for more detailed information on restrictions and on proper strategies for view updates
Assuming ODBC, some things to consider:
make sure you have a timestamp field in the person table, and that it is returned in your managers view. You also probably need the real PK of the person table in the manager view (I'm assuming your view takes the FK used for the self-join and aliases it as the ID field -- I wouldn't do that myself, as it is confusing. Instead, I'd use the real foreign key name in the managers view, and let the PK stand on its own with its real name).
try the Jet/ACE-specific DISTINCTROW predicate in your recordsource. With Jet/ACE back ends, this often makes it possible to insert into both tables when it's otherwise impossible. I don't know for certain if Jet will be smart enough to tell SQL Server to do the right thing, though.
if neither of those things works, change your form to use a recordsource based on your person table, and use a combo box based on the managers view as the control with which you edit the record to relate the person to a manager.
Ilya Kochetov pointed out that you can only update one table, but the work-around would be to apply the updates to the fields on one table and then the other. This solution assumes that the only access you have to these two tables is through this view and that you are not allowed to create a stored procedure to take care of this.
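The "one table, then the other" workaround might look like the sketch below, written in Python with SQLite in place of SQL Server and Access. The `name` and `department` columns and the `add_manager` helper are hypothetical. On SQL Server the generated id would typically be read with `SCOPE_IDENTITY()` instead of `lastrowid`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE persons (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL
);
CREATE TABLE managers (
    id         INTEGER PRIMARY KEY REFERENCES persons(id),
    department TEXT NOT NULL
);
CREATE VIEW managersFull AS
    SELECT p.id, p.name, m.department
    FROM managers m
    INNER JOIN persons p ON m.id = p.id;
""")

def add_manager(conn, name, department):
    # One transaction: the parent row first, then the child row with the same id.
    with conn:
        cur = conn.execute("INSERT INTO persons (name) VALUES (?)", (name,))
        person_id = cur.lastrowid
        conn.execute(
            "INSERT INTO managers (id, department) VALUES (?, ?)",
            (person_id, department))
    return person_id

new_id = add_manager(conn, "Ada", "Engineering")
row = conn.execute("SELECT id, name, department FROM managersFull").fetchone()
```

The view is still useful for reading; only the writes go to the base tables.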
To model and maintain two related tables in Access, you don't use a query or view that joins both tables. Instead, you use a main form and drop in a sub-form that is based on the child table. If the link master and link child settings of the sub-form are set correctly, you do not need to write any code, and Access will insert the person's id into the link field.
So, don't use a joined table here. Simply use a form + sub-form setup and you will be able to edit and maintain the data in the parent table and in the related child table.
This means you base the form on the table, not on a view, and you base the sub-form on the child table. So, don't use a view here.

What would you do to avoid conflicting data in this database schema?

I'm working on a multi-user internet database-driven website with SQL Server 2008 / LinqToSQL / custom-made repositories as the DAL. I have run across a normalization problem which can lead to an inconsistent database state if exploited correctly and I am wondering how to deal with the problem.
The problem: Several different companies have access to my website. They should be able to track their Projects and Clients at my website. Some (but not all) of the projects should be assignable to clients.
This results in the following database schema:
**Companies:**
ID
CompanyName
**Clients:**
ID
CompanyID (not nullable)
FirstName
LastName
**Projects:**
ID
CompanyID (not nullable)
ClientID (nullable)
ProjectName
This leads to the following relationships:
Companies-Clients (1:n)
Companies-Projects (1:n)
Clients-Projects(1:n)
Now, if a user is malicious, he might for example insert a Project with his own CompanyID, but with a ClientID belonging to another user, leaving the database in an inconsistent state.
The problem occurs in a similar fashion all over my database schema, so I'd like to solve this in a generic way if at all possible. I had the following two ideas:
Check in the DAL for database writes that might lead to inconsistencies. This would be generic, but it requires additional database queries before each update or create query is performed, so it will cost some performance.
Create an additional table for the clients-Projects relationship and make sure the relationships created this way are consistent. This also requires some additional select queries, but far less than in the first case. On the other hand it is not generic, so it is easier to miss something in the long run, especially when adding more tables / dependencies to the database.
What would you do? Is there any better solution I missed?
Edit: You might wonder why the Projects table has a CompanyID. This is because I want users to be able to add projects with and without clients. I need to keep track of which company (and therefore which website user) a clientless project belongs to, which is why a project needs a CompanyID.
I'd go with the latter, having one or more tables that define the allowable relationships between entities.
Note, there's no circularity in the references you have, so the title is misleading.
What you have is the possibility of conflicting data, that's different.
Why do you have "CompanyID" in the project table? The ID of the company involved is implicitly given by the client you link to. You don't need it.
Remove that column and you've removed your problem.
Additionally, what is the purpose of the "name" column in the client table? Can you have a client with one name, differing from the name of the company?
Or is "client" the person at that company?
Edit: OK, with the clarification about projects without clients, I would separate out the references, but you're not going to get rid of the problem you're describing without constraints that prevent multiple references being made.
A simple constraint for your existing tables would be that the CompanyID and ClientID fields of a project row cannot both be non-null at the same time.
If you want to keep using the tables like this and avoid all the new queries, just put triggers on the table; when a user tries to insert a row with wrong data, the trigger will stop him.
Best Regards,
Iordan
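A trigger of that kind could look roughly like the sketch below, shown in Python with SQLite instead of SQL Server 2008 (the trigger syntax differs between the two). The trigger name, the error message and the sample rows are assumptions. Only inserts are covered here; updates to Projects would need a matching trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Companies (ID INTEGER PRIMARY KEY, CompanyName TEXT NOT NULL);
CREATE TABLE Clients (
    ID        INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID),
    FirstName TEXT,
    LastName  TEXT
);
CREATE TABLE Projects (
    ID          INTEGER PRIMARY KEY,
    CompanyID   INTEGER NOT NULL REFERENCES Companies(ID),
    ClientID    INTEGER REFERENCES Clients(ID),
    ProjectName TEXT NOT NULL
);
-- Reject a project whose client belongs to a different company.
CREATE TRIGGER Projects_client_company_insert
BEFORE INSERT ON Projects
WHEN NEW.ClientID IS NOT NULL
 AND NEW.CompanyID <> (SELECT CompanyID FROM Clients WHERE ID = NEW.ClientID)
BEGIN
    SELECT RAISE(ABORT, 'client belongs to another company');
END;
INSERT INTO Companies VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO Clients VALUES (10, 1, 'Ann', 'Lee'), (20, 2, 'Bo', 'Kim');
""")

def try_insert(project):
    # Returns True if the row was accepted, False if the trigger rejected it.
    try:
        conn.execute("INSERT INTO Projects VALUES (?, ?, ?, ?)", project)
        return True
    except sqlite3.DatabaseError:
        return False

ok_with_client = try_insert((1, 1, 10, "Website"))        # client 10 is in company 1
ok_without_client = try_insert((2, 1, None, "Internal"))  # clientless project
rejected = try_insert((3, 1, 20, "Sneaky"))               # client 20 is in company 2
```

The check runs inside the database, so it applies no matter which code path performs the insert.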
My first thought would be to create a special client record for each company with the name "No client". Then eliminate the CompanyId from the Project table, and if a project has no client, use the "No client" record rather than a "normal" client record. If such no-client projects need special processing, add a flag to the no-client record to identify it explicitly. (I'd hate to rely on the name being "No Client" or something like that -- too fuzzy.)
Then there would be no way to store inconsistent data so the problem would go away.
In the end I implemented a completely generic solution which solves my problem without much runtime overhead and without requiring any changes to the database. I'll describe it here in case someone else has the same problem.
First off, the approach only works because the only table that other tables are referencing through multiple paths is the Companies table. Since this is the case in my database, I only have to check whether all n:1 referenced entities of each entity that is to be created / updated / deleted are referencing the same company (or no company at all).
I am enforcing this by deriving all of my Linq entities from one of the following types:
SingleReferenceEntityBase - The norm. Only checks (via reflection) if there really is only one reference (no matter if transitive or intransitive) to the Companies table. If this is the case, the references to the companies table cannot become inconsistent.
MultiReferenceEntityBase - For special cases such as the Projects table above. Asks all directly referenced entities what company ID they are referencing. Raises an exception if there is an inconsistency. This costs me a few select queries per CRUD operation, but since MultiReferenceEntities are much rarer than SingleReferenceEntities, this is negligible.
Both of these types implement a "CheckReferences" method, and I call it whenever a Linq entity is written to the database by partially implementing the OnValidate(System.Data.Linq.ChangeAction action) method, which is automatically generated for all Linq entities.
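The multi-reference check can be illustrated with a small Python stand-in. This is not the LINQ to SQL code described above: the class names, `referenced_company_ids` and `check_references` are hypothetical, and the entities are plain in-memory objects rather than database-backed ones.

```python
class InconsistentCompanyError(Exception):
    """Raised when an entity references more than one company."""

class Client:
    def __init__(self, company_id):
        self.company_id = company_id

    def referenced_company_ids(self):
        return {self.company_id}

class Project:
    def __init__(self, company_id, client=None):
        self.company_id = company_id
        self.client = client

    def referenced_company_ids(self):
        # Collect the company reached directly and the one reached via the client.
        ids = {self.company_id}
        if self.client is not None:
            ids |= self.client.referenced_company_ids()
        return ids

    def check_references(self):
        # Would be called just before the entity is written to the database.
        ids = self.referenced_company_ids()
        if len(ids) > 1:
            raise InconsistentCompanyError(
                "references companies %s" % sorted(ids))

Project(company_id=1, client=Client(company_id=1)).check_references()  # consistent
Project(company_id=1).check_references()                               # clientless

try:
    Project(company_id=1, client=Client(company_id=2)).check_references()
    conflict_detected = False
except InconsistentCompanyError:
    conflict_detected = True
```

A project whose client points at a different company is rejected before it reaches the database, while clientless projects pass because only one company is referenced.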

Resources