How to maintain the relationship between more than two tables - database

Please help me with maintaining the relationships between more than two tables. I will explain a scenario that I am facing now.
Table 1 has all the codes I am using in the application. For example, for status codes, Active has the code 10 and InActive has the code 20; in this way I am maintaining all the codes.
Table 2 has 5 columns:
column 1 is an auto-generated key
column 2 - has the id of Table 1
column 3 - also has the id of Table 1
column 4 - also has the id of Table 1
column 5 - has a description.
So I need to perform multiple joins while retrieving the data from these two tables. My question is: is this the right way to maintain the tables?
Moreover, I am using Spring with Hibernate to fetch the data from the DB. Any ideas on how to do that using Hibernate?
Please advise me on this.
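For reference, here is a minimal sketch of the scenario above in plain SQL; the table and column names (code, item, status_code, type_code, source_code) are made up for illustration, not taken from the actual schema:

-- Table 1: every code used by the application, e.g. 10 = Active, 20 = InActive
CREATE TABLE code (
    code_id   INT PRIMARY KEY,
    code_name VARCHAR(50) NOT NULL
);

-- Table 2: an auto-generated key, three columns referencing Table 1, and a description
CREATE TABLE item (
    item_id     INT PRIMARY KEY,   -- auto-generated in practice
    status_code INT NOT NULL REFERENCES code (code_id),
    type_code   INT NOT NULL REFERENCES code (code_id),
    source_code INT NOT NULL REFERENCES code (code_id),
    description VARCHAR(255)
);

-- Reading a row back means joining the code table once per code column,
-- each time under a different alias:
SELECT i.item_id,
       s.code_name AS status_name,
       t.code_name AS type_name,
       o.code_name AS source_name,
       i.description
FROM item i
JOIN code s ON s.code_id = i.status_code
JOIN code t ON t.code_id = i.type_code
JOIN code o ON o.code_id = i.source_code;

This layout (one shared code/lookup table referenced by several foreign keys) is a common and valid design; the repeated joins are the normal price for it. On the Hibernate side it typically maps to three @ManyToOne associations from the Item entity to the Code entity, one per join column, and Hibernate generates joins equivalent to the ones above when it fetches them.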

Related

Transfer of primary keys from dimension tables to the fact table - cannot write values

My problem is understanding the relation of the primary keys to the fact table.
This is the structure I'm working in. The transfer works, but it says the values I set as primary keys cannot be NULL.
I'm using SSIS to transfer data from a CSV file to an OLE DB destination (SQL Server 2019, via SSMS).
The actual problem is where/how I can get the values in the same task. I tried to do it in two different tasks, but then the data ends up in the table one after another (this only worked when I allowed NULLs for the primary keys, which can't be the solution, I think).
Maybe the problem is that I have three transfers from the source:
First dimension table
To second dimension table
To fact table.
I think the primary keys are generated when I transfer the data to the DB, so I think I can't get them in the same task.
[Screenshots: dataflow 1, dataflow 2, input data, output data]
I added the column salesid to the input to use it for the saleskey. Is there a better solution, maybe with the third lookup you've mentioned?
You are attempting to load the fischspezi fact table as well as the product (produkt) and location (standort) dimensions. The problem is that you don't have the keys from the dimensions.
I assume the "key" columns in your dimension are autogenerated/identity values? If that's the case, then you need to break your single data flow into two data flows. Both will keep the Flat File source and the multicast.
Data Flow Dimensions
This is the existing data flow, minus the path that leads to the Fact table.
Data Flow Fact
This data flow will populate the Fact table. Remove the two branches to the dimension tables. What we need to do here is find the translated key values given our inputs. I assume produkt_ID and steuer_id should have been defined as NOT NULL and unique in the dimensions, but the concept here is that we need to be able to take a value that comes in our file, say product id 3892, and find the same row in the dimension table, which has a key value of 1.
The tool for this is the Lookup Transformation. You're going to want 2-3 of those in your data flow right before the destination. The first one will look up produktkey based on produkt_ID. The second will find standortkey based on steuer_id.
The third lookup you'd want here (and add back into the dimension load) would look up the current row in the destination table. If you ran the existing package 10 times, you'd have 10x the data (unless you have unique constraints defined). Guessing here, but I assume sales_id is a value in the source data, so I'd have a lookup here to ensure I don't double-load a row. If sales_id is a generated value, then for consistency I'd rename the suffix to key to be in line with the rest of your data model.
I also encourage everyone to read Andy Leonard's Stairway to Integration Services series. Levels 3 & 4 address using lookups and identifying how to update existing rows, which I assume will be some of the next steps in your journey.
Addressing comments
I would place them just over the fact destination and then join with a union all to fact table
No. There is no need to have either a join or a union all in your fact data flow. Flat File Source (Get our candidate data) -> Data Conversion(s) (Change data types to match the expected) -> Derived Columns (Manipulate the data as needed, add things like insert date, etc.) -> Lookups (Translate source values to destination values) -> Destination (Store new data).
Assume Source looks like:
produkt_ID | steuer_id | sales_id | umsatz
1234 | 1357 | 2468 | 12
2345 | 3579 | 4680 | 44
After dimension load, you'd have (simplified):
Product
produktkey | produkt_ID
1 | 1234
2 | 2345
Location
standortkey | steuer_id
7 | 1357
9 | 3579
The goal is to use that original data plus the lookups to produce a set like the one below (an equivalent SQL join follows the table):
produkt_ID | steuer_id | sales_id | umsatz | produktkey | standortkey
1234 | 1357 | 2468 | 12 | 1 | 7
2345 | 3579 | 4680 | 44 | 2 | 9
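Expressed as plain SQL (purely to illustrate what the two lookups accomplish; the actual translation happens in the data flow), that result set is what you would get by joining the staged source rows to the two dimensions. source_rows is an assumed name for the flat-file data; the other table and column names come from the post:

SELECT s.produkt_ID,
       s.steuer_id,
       s.sales_id,
       s.umsatz,
       p.produktkey,
       l.standortkey
FROM source_rows s
JOIN produkt  p ON p.produkt_ID = s.produkt_ID
JOIN standort l ON l.steuer_id  = s.steuer_id;

Each Lookup Transformation plays the role of one of these joins: match on the business key coming from the file, return the surrogate key from the dimension.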
The third lookup I propose (skip it for now) is to check whether sales_id exists in the destination. If it does, then you would want to see whether that existing record is the same as what we have in the file. If it's the same, then we do nothing. Otherwise, we likely want to update the existing row because we have new information - someone miskeyed the quantity and our sales figure should be 120 and not 12. The update is beyond the scope of this question, but it's covered nicely in the Stairway to Integration Services.

Reorganize IDs of tables in PostgreSQL so that the ID starts again at 1?

Hello, I am currently trying different data automation processes with Python and PostgreSQL. I automated the cleaning and upload of a dataset with 40,000 data entries into my database. Due to some flaws in my process I had to truncate some tables or data entries.
I am using: Python 3.9.7 / PostgreSQL 13.3 / pgAdmin 4 v5.7
Problem
Currently I have IDs of tables that start at 44700 instead of 1 (due to my editing).
For example, a table of train stations begins with the ID 41801 and ends with the ID 83599.
Question
How can I reorganize my IDs so that they start from 1 instead of 41801?
After looking online I found topics like "bloat" or "reindex". I tried VACUUM and REINDEX, but nothing really showed a difference in my tables. As of now my tables have no relations to each other. What would be the approach to solve my problem in PostgreSQL? Some hidden function I overlooked? Maybe it's not a problem at all, but it definitely looks weird. At some point I will end up with an ID of 250,000 while only having 40,000 data entries in my table.
Do you use a Sequence to generate the ID column of your table? You can check in pgAdmin whether you have a Sequence object in your database: Schemas -> public -> Sequences.
You can change the current sequence number by right-clicking the Sequence and setting it to '1'. But only do this if you have deleted all rows in the table, and before you start to import your data again.
As long as you do not have any other table that references the ID column of your train station table, you can even update the IDs with an update statement like:
UPDATE trainStations SET ID = ID - 41800;  -- 41801 becomes 1, 41802 becomes 2, and so on
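If the column is fed by a sequence, you can also reset the sequence itself so that future inserts start counting from 1 again. A minimal sketch, assuming the default sequence name trainstations_id_seq (check the real name under Schemas -> public -> Sequences):

-- Option 1: start over completely. Only safe when the table has been emptied first,
-- otherwise new inserts will collide with existing IDs.
TRUNCATE TABLE trainStations;
ALTER SEQUENCE trainstations_id_seq RESTART WITH 1;

-- Option 2: keep the renumbered rows and point the sequence at the current maximum ID,
-- so the next insert gets MAX(ID) + 1.
SELECT setval('trainstations_id_seq', (SELECT MAX(ID) FROM trainStations));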

Access 2010 Database - Count number of instances

I'm a complete Access noob and I'm creating a database to keep track of orders shipped for work. I've got two tables in the database, one keeping track of the units shipped and the other keeping track of each purchase order and how many more items until that order can be closed. What I want to do is, basically, something like a COUNTIF function from Excel on the entries from Table A, with the result transferred to Table B. An example:
Table A has:
PO123
PO123
PO234
PO123
What I want Table B to do is count the number of instances of each PO and display the count in a field, like so:
Table B:
Row 1 Field 1: PO123
Row 1 Field 2: 3
Row 2 Field 1: PO234
Row 2 Field 2: 1
Anyone have any ideas? Any help is greatly appreciated.
You didn't say what the name of the field was, so I'm going to assume it's POTYPE. You need to create a query in Access, and go to the SQL view. Then you can do a query like
Select POTYPE, Count(POTYPE) From TableA Group By POTYPE
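If you actually want the counts written into Table B rather than just returned by the query, a make-table query is one option. A minimal sketch, where TableA, TableB, and POTYPE are assumed names (it creates TableB, so delete the old one first if it already exists):

SELECT POTYPE, Count(POTYPE) AS POCount INTO TableB FROM TableA GROUP BY POTYPE;

In most cases, though, leaving this as a saved query is the better design: the query always reflects the current rows in Table A, while a copied table goes stale as soon as new shipments are entered.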

Correct approach to store in database

I'm developing a website (using Django and MySQL). I have a Tests table and a User table.
I have 50 tests within the table and each user completes them at their own pace.
How do I store the status of the tests in my DB?
One idea that came to my mind is to create an additional column in the User table, containing testids separated by a comma or some other delimiter.
userid | username | testscompleted
1 | john | 1, 5, 34
2 | tom | 1, 10, 23, 25
Another idea was to create a separate table to store userid and testid. Then I'll have only 2 columns, but thousands of rows (number of tests * number of users), and they will always continue to increase.
userid | testid
1 | 1
1 | 5
2 | 1
1 | 34
2 | 10
Your second option is vastly preferred... your first solution breaks normalization rules by trying to store multiple values in a single field, which will cause you headaches down the road.
The second table will not only be easier to maintain when trying to add or remove values, but will also likely perform better since you'll be able to effectively index those columns.
There are two phrases that should automatically disqualify any database design idea:
create one table per [anything]
a column containing [anything] separated by a comma
Separate table, two columns--you're on the right track there.
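A minimal sketch of that separate table in MySQL; the table and column names here (users, tests, user_tests) are assumptions, not taken from the real schema:

CREATE TABLE user_tests (
    userid INT NOT NULL,
    testid INT NOT NULL,
    PRIMARY KEY (userid, testid),          -- one row per user/test pair, no duplicates
    KEY idx_testid (testid),               -- supports "who completed test X?" queries
    FOREIGN KEY (userid) REFERENCES users (id),
    FOREIGN KEY (testid) REFERENCES tests (id)
);

-- "Which tests has user 1 completed?"
SELECT testid FROM user_tests WHERE userid = 1;

If you are using Django's ORM, a ManyToManyField between the user and test models creates an equivalent join table for you, so you may not need to write this DDL by hand.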

Create index on two unrelated tables in Solr-2

I need to create an index from two tables that are not related. But when I try to do so, I get the response that the number of records fetched equals the records of both tables, yet no index is created.
Please help
Creating 1 index from 2 unrelated tables sounds very strange. Why can't you create 2 indexes?
In case you don't have another choice, create a unique id field in the schema (look at http://wiki.apache.org/solr/UniqueKey). Also check in the DB logs which queries are actually run.
