I'm new to designing cubes with SSAS.
In my simple cube, I have one fact table with 3 dimension tables, as below. The fact table (table1) contains a list of client IDs and other columns linking to the 3 dimensions. This all works fine.
table1
client_id | dimension_link_1 | dimension_link_2 | dimension_link_3
AAAAA | xxx |zzz |bbb
BBBBB | yyy |aaa |ccc
I have another table (table2) that contains three columns - Client ID, Classification Type and Classification Name. A client may have 1-n classifications recorded against them (i.e. ethnicity, religion, allergies etc) so the Client ID may appear on multiple rows in table2. e.g.
table2
client_id | classification_type | classification_name
AAAAA | Ethnicity | Japanese
AAAAA | Allergy | Hayfever
AAAAA | Nationality | Russian
BBBBB | Ethnicity | Spanish
BBBBB | Allergy | Aspirin
BBBBB | Nationality | Spanish
BBBBB | Physical Support | Yes
I want to add table2 into my cube so that I can aggregate the list of client IDs by the existing fact table (table1) by Classification Type and Classification Name in table2.
However, I'm not sure what the correct approach for doing this is? I tried joining table2 to the fact table (table1) as a dimension linked on Client ID but I think this only joined the two objects together using the first occurrence of the Client ID in table2.
Help! :)
Thanks,
Hologram
Import table2 as both a fact table and a dimension. Then in the cube designer in the dimension usage tab, when specifying the relationship between the measure group formed from table1 and the dimension formed from table2, choose "Many to Many" as the relationship type and ensure the "intermediate measure group" is the one you formed from table2.
Related
I have a database which has the following many to many relation:
Group table:
group_id | group_name
User table:
user_id | user_name
User_group table:
user_id | group_id
I want to create a single csv file that will incorporate those 3 tables. I can ignore the ids from database. The csv should be easily parseable in order to recreate the database with the same data. I am thinking about the format that csv should have. Right now my idea is to do the following
user_name | group
-------------------
bob | group_a
| group_b
sam | group_a
However, I am not sure what is more conventional to do. Maybe list all the groups separated by comma/space?
The short version is I'm trying to map from a flat table to a new set of tables with a stored procedure.
The long version: I want to SELECT records from an existing table, and then for each record INSERT into a new set of tables (most columns will go into one table, but some will go to others and be related back to this new table).
I'm a little new to stored procedures and T-SQL. I haven't been able to find anything particularly clear on this subject.
It would appear I want to something along the lines of
INSERT INTO [dbo].[MyNewTable] (col1, col2, col3)
SELECT
OldCol1, OldCol2, OldCol3
FROM
[dbo].[MyOldTable]
But I'm uncertain how to get that to save related records since I'm splitting it into multiple tables. I'll also need to manipulate some of the data from the old columns before it will fit into the new columns.
Thanks
Example data
MyOldTable
Id | Year | Make | Model | Customer Name
572 | 2001 | Ford | Focus | Bobby Smith
782 | 2015 | Ford | Mustang | Bobby Smith
Into (with no worries about duplicate customers or retaining old Ids):
MyNewCarTable
Id | Year | Make | Model
1 | 2001 | Ford | Focus
2 | 2015 | Ford | Mustang
MyNewCustomerTable
Id | FirstName | LastName | CarId
1 | Bobby | Smith | 1
2 | Bobby | Smith | 2
I would say you have your OldTable Id to preserve in new table till you process data.
I assume you create an Identity column Id on your MyNewCarTable
INSERT INTO MyNewCarTable (OldId, Year, Make, Model)
SELECT Id, Year, Make, Model FROM MyOldTable
Then, join the new table and above table to insert into your second table. I assume your MyNewCustomerTable also has Id column with Identity enabled.
INSERT INTO MyNewCustomerTable (CustomerName, CarId)
SELECT CustomerName, new.Id
FROM MyOldTable old
JOIN MyNewCarTable new ON old.Id = new.OldId
Note: I have not applied Split of Customer Name to First Name and
Last Name as I was unsure about existing data.
If you don't want your OldId in MyNewCarTable, you can DELETE it
ALTER TABLE MyNewCarTable DROP COLUMN OldId
You are missing a step in your normalization. You do not need to duplicate your customer information per vehicle. You need three tables for 4th Normal form. This will reduce storage size and more importantly allow an update to the customer data to take place in one location.
Customer
CustomerID
FirstName
LastName
Car
CarID
Make
Model
Year
CustomerCar
CustomerCarID
CarID
CustomerID
DatePurchaed
This way you can have multiple owners per car, multiple cars per owner and only one record needs to be updated per car and or customer...4th Normal Form.
If I am reading this correctly, you want to take each row from table 1, and create a new record into table A using some of that row data, and then data from the same original row into Table B, Table C but referencing back to Table A again?
If that's the case, you will create TableA with an Identity and make thats the PK.
Insert the required column data into that table and use the #IDENTITY to retrieve the last identity value, then you will insert the remaining data from the original table into the other tables, TableB, TableC, etc. and use the identity you retrieved from TableA as the FK in the other tables.
By Example:
Table 1 has columns col1, col2, col3, col4, col5
Table A has TabAID, col1, col2
Table B has TabBID, TabAID, col3
TableC has TabCID, TabAID, col4
When the first row is read, the values for col1 & col2 are inserted into TableA.
The Identity is captured from that row inserted, and then value for col3 AND the identity are entered into TableB, and then value for col4 AND the identity are entered into TableC.
This is a standard data migration technique for normalizing data.
Hope this assists,
Any ideas on how this should be done with T-SQL queries?
I have two tables, Table A contain records I want to return but filter through. Table B contains the list of filters and class categories. New records are added to Table A all the time. The goal is to dynamically categorized records in Table A based on the filters listed in Table B.
Example:
Table A
Name
------------
John Doe
Mary Lamb
Peter Pan
Tom Sawyer
Suzie Lamb
Nancy Lamb
Josh Reddin
Table B:
Filter | Category
----------------------
John%Doe% | Team 1
%Lamb% | Team 2
Tom% | Team 1
Desired output:
Name | Category
John Doe | Team 1
Tom Sawyer | Team 1
Mary Lamb | Team 2
Suzie Lamb | Team 2
Nancy Lamb | Team 2
Peter Pan |
Josh Reddin |
I thought about doing the following but not sure if that's the best solution:
SELECT Filter, category from TableB (Get list of filters)
Using SQL Loop through filters returned in (1.) and find matches in Table A using LIKE.
Example:
SELECT name, Category
FROM Table A, Table B
WHERE Table A.Name Like (CURRENT filter FROM B)
Insert/append record(s) returned in (2.) into TempTable
SELECT *
FROM TempTable (this returns Names and categories as shown in the desired output)
UNION
SELECT *
FROM Table A
RIGHT OUTER JOIN TempTable on NAME
WHERE Category in null
(This returns rows with no categories found...Peter Pan and Josh Reddin)
Any ideas?
How about performance?
Thanks.
You can use combination of like and left join
select a.Name,b.Category
from tableA a left join tableB b on a.name like b.Filter
I have a question about the way Django models the one-to-one-relationship.
Suppose we have 2 models: A and B:
class B(models.Model):
bAtt = models.CharField()
class A(models.Model):
b = models.OneToOneField(B)
In the created table A, there is a field "b_id" but in the table B created by Django there is no such field as "a_id".
So, given an A object, it's certainly fast to retrieve the corresponding B object, simply through the column "b_id" of the A's row.
But how does Django retrieve the A object given the B object?
The worst case is to scan through the A table to search for the B.id in the column "b_id". If so, is it advisable that we manually introduce an extra field "a_id" in the B model and table?
Thanks in advance!
Storing the id in one of the tables is enough.
In case 1, (a.b) the SQL generated would be
SELECT * from B where b.id = a.b_id
and in case 2, the SQL would be:
SELECT * from A where a.b_id = b.id
You can see that the SQLs generated either way are similar and depend respectively on their sizes.
In almost all cases, indexing should suffice and performance would be good with just 1 id.
Django retrieves the field through a JOIN in SQL, effectively fusing the rows of the two tables together where the b_id matches B.id.
Say you have the following tables:
# Table B
| id | bAtt |
----------------
| 1 | oh, hi! |
| 2 | hello |
# Table A
| id | b_id |
-------------
| 3 | 1 |
| 4 | 2 |
Then a join on B.id = b_id will create the following tuples:
| B.id | B.bAtt | A.id |
------------------------
| 1 | oh, hi! | 3 |
| 2 | hello | 4 |
Django does this no matter which side you enter the relation from, and is thus equally effective :) How the database actually does the join depends on the database implementation, but in one way or another it has to go through every element in each table and compare the join attributes.
Other object-relational mappers require you to define relationships on both sides. The Django developers believe this is a violation of the DRY (Don't Repeat Yourself) principle, so Django only requires you to define the relationship on one end.
Always take a look at the doc ;)
I have a database table called A and now i have create a new table called B and create some columns of A in table B.
Eg: Suppose following columns in tables
Table A // The one already exists
Id, Country Age Firstname, Middlename, Lastname
Table B // The new table I create
Id Firstname Middlename Lastname
Now the table A will be look like,
Table A // new table A after the modification
Id, Country, Age, Name
In this case it will map with table B..
So my problem is now i need to kind of maintain the reports which were generated before the table modifications and my friend told me you need to have a data migration..so may i know what is data migration and how its work please.
Thank you.
Update
I forgot to address the reporting issue raised by the OP (Thanks Mark Bannister). Here is a stab at how to deal with reporting.
In the beginning (before data migration) a report to generate the name, country and age of users would use the following SQL (more or less):
-- This query orders users by their Lastname
SELECT Lastname, Firstname, Age, Country FROM tableA order by Lastname;
The name related fields are no longer present in tableA post data migration. We will have to perform a join with tableB to get the information. The query now changes to:
SELECT b.Lastname, b.Firstname, a.Country, a.Age FROM tableA a, tableB b
WHERE a.name = b.id ORDER BY b.Lastname;
I don't know how exactly you generate your report but this is the essence of the changes you will have to make to get your reports working again.
Original Answer
Consider the situation when you had only one table (table A). A couple of rows in the table would look like this:
# Picture 1
# Table A
------------------------------------------------------
Id | Country | Age | Firstname | Middlename | Lastname
1 | US | 45 | John | Fuller | Doe
2 | UK | 32 | Jane | Margaret | Smith
After you add the second table (table B) the name related fields are moved from table A to table B. Table A will have a foreign key pointing to the table B corresponding to each row.
# Picture 2
# Table A
------------------------------------------------------
Id | Country | Age | Name
1 | US | 45 | 10
2 | UK | 32 | 11
# Table B
------------------------------------------------------
Id | Firstname | Middlename | Lastname
10 | John | Fuller | Doe
11 | Jane | Margaret | Smit
This is the final picture. The catch is that the data will not move from table A to table B on its own. Alas human intervention is required to accomplish this. If I were the said human I would follow the steps given below:
Create table B with columns Id, Firstname, Middlename and Lastname. You now have two tables A and B. A has all the existing data, B is empty .
Add a foreign key to table A. This FK will be called name and will reference the id field of table B.
For each row in table A create a new row in table B using the Firstname, Middlename and Lastname fields taken from table A.
After copying each row, update the name field of table A with the id of the newly created row in table B.
The database now looks like this:
# Table A
-------------------------------------------------------------
Id | Country | Age | Firstname | Middlename | Lastname | Name
1 | US | 45 | John | Fuller | Doe | 10
2 | UK | 32 | Jane | Margaret | Smith | 11
# Table B
------------------------------------------------------
Id | Firstname | Middlename | Lastname
10 | John | Fuller | Doe
11 | Jane | Margaret | Smith
Now you no longer need the Firstname, Middlename and Lastname columns in table A so you can drop them.
voilĂ , you have performed a data migration!
The process I just described above is but a specific example of a data migration. You can accomplish it in a number of ways using a number of languages/tools. The choice of mechanism will vary from case to case.
Maintenance of the existing reports will depend on the tools used to write / generate those reports. In general:
Identify the existing reports that used table A. (Possibly by searching for files that have the name of table A inside them - however, if table A has a name [eg. Username] which is commonly used elsewhere in the system, this could return a lot of false positives.)
Identify which of those reports used the columns that have been removed from table A.
Amend the existing reports to return the moved columns from table B instead of table A.
A quick way to achieve this is to create a database view that mimics the old structure of table A, and amend the affected reports to use the database view instead of table A. However, this adds an extra layer of complexity into maintaining the reports (since developers may need to maintain the database view as well as the reports) and may be deprecated or even blocked by the DBAs - consequently, I would only recommend using this approach if a lot of existing reports are affected.