Self-Join in SSAS - sql-server

I have a table like this:
PersonId Job City ParentId
--------- ---- ----- --------
101 A C1 105
102 B C2 101
103 A C1 102
I need to find association rules between a person's job and their parent's city.
I used a self-referencing relationship and defined case and nested tables, but the resulting dependency graph does not distinguish between the person's job or city and the parent's job or city.
What is the best solution for this problem in an SSAS project?

SSAS Hierarchies should address your problem. However, it's tough to say exactly how to use them without knowing more about your particular situation.

I've run into a similar need in my own work. So far I have only investigated SQL Server Analysis Services Tabular models. I will update this answer with more information once I have finished looking into Multidimensional models.
Per Relationships (SSAS Tabular), SSAS Tabular models do not support self-joins (see below for the relevant quote). What you end up having to do is break out the group of parent elements and each level of their child elements as separate model tables. Once you have the model tables, you can use the diagram view to draw the relevant relationships.
Self-joins and loops
Self-joins are not permitted in tabular model tables. A self-join is a
recursive relationship between a table and itself. Self-joins are
often used to define parent-child hierarchies. For example, you could
join an Employees table to itself to produce a hierarchy that shows
the management chain at a business.
The model designer does not allow loops to be created among
relationships in a model. In other words, the following set of
relationships is prohibited.
Table 1, column a to Table 2, column f
Table 2, column f to Table 3, column n
Table 3, column n to Table 1, column a
If you try to create a relationship that would result in a loop being
created, an error is generated.

I'm not sure exactly what you are trying to achieve, but the following SQL would be a good starting point:
-- c is the child row, p is the matching parent row
select c.PersonId, p.City
from ptable c
join ptable p on c.ParentId = p.PersonId
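As a minimal sketch of how that self-join keeps the person's columns apart from the parent's, the snippet below loads the sample rows from the question into an in-memory SQLite database (an assumption made only so the example runs anywhere; the original question targets SQL Server and SSAS). The aliases PersonJob and ParentCity are illustrative names, not part of the original schema.

```python
import sqlite3

# In-memory database holding the three sample rows from the question.
# The table name "ptable" follows the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ptable ("
    "PersonId INTEGER PRIMARY KEY, Job TEXT, City TEXT, ParentId INTEGER)"
)
conn.executemany(
    "INSERT INTO ptable VALUES (?, ?, ?, ?)",
    [(101, "A", "C1", 105), (102, "B", "C2", 101), (103, "A", "C1", 102)],
)

# Self-join: c is the child row, p is the parent row.
# Distinct aliases (hypothetical names) keep the two roles separate.
rows = conn.execute(
    """
    SELECT c.PersonId, c.Job AS PersonJob, p.City AS ParentCity
    FROM ptable AS c
    JOIN ptable AS p ON c.ParentId = p.PersonId
    ORDER BY c.PersonId
    """
).fetchall()

for row in rows:
    print(row)
```

Person 101 drops out of the result because its parent (105) is not in the sample data; a LEFT JOIN would keep it with an empty parent city. A flattened result like this, exposed as a view, is one possible input for a mining structure in which person and parent attributes are separate columns.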

Related

What is the best way to design a database to store record with a lot of values?

I want to design a database for events and track a lot of statistics about them.
Option 1
Create one table for Events and put all my statistics columns in it, such as number of males, number of females, number of people of unidentified gender, temperature that day, start time, whether there were any fights, whether the police were called, etc.
The query would be a very simple select * from events
Option 2
Create two tables, one for Events and one for EventsAttributes. In the Events table I would store important stuff like id, event title, and start/end time.
In EventsAttributes I would store all the event statistics and link them back to Events with an eventId foreign key.
The query would look like the one below (attributeType = 1 would represent the number of males).
select e.*,
(select ev.value from EventsAttributes ev where ev.eventId = e.id and ev.attributeType = 1) as NumberOfMale
from Events e
The query would not be as straightforward as in option 1, but I want to design it the right way and live with the messy query.
So which option is the right way to do it, and why? (I'm not a database admin, just curious.)
Thank you for your time.
I prefer option 2 for the database design.
With option 2, you apply the best practice of database normalization.
There are three main reasons to normalize a database:
The first is to minimize duplicate data.
The second is to minimize or avoid data modification issues.
The third is to simplify queries.
For more details, read Designing a Normalized Database.
You can create views (queries) on top of this normalized database to support option 1.
This way, the database will be ready for future scaling.
Update:
You can use the PIVOT operator and common table expressions (CTEs) to get eventAttributes1, eventAttributes2, ...
Suppose your tables are events and event_attributes, as described below (# marks a primary key column):
events
----------
# event_id
event_title
start_date
end_date

event_attributes
----------------
# event_id
# att_type
att_value
-- using a common table expression (it works like an inline view)
with query as (
select e.event_id, e.event_title, a.att_type, a.att_value
from events e
join event_attributes a on e.event_id = a.event_id
)
select event_id, event_title,
[1] as eventAttributes1, -- one column per attribute type: [1], [2], ...
[2] as eventAttributes2,
[3] as eventAttributes3
from query
pivot (sum(att_value) for att_type in ([1],[2],[3])) as pvt
For details on PIVOT, read: Using PIVOT
For details on CTEs, read: Using Common Table Expressions
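PIVOT is specific to T-SQL. For readers on other engines, here is a rough sketch of the same reshaping done with conditional aggregation (a swapped-in, portable technique, not the PIVOT operator itself), run against an in-memory SQLite database. The event titles and attribute values are made up for illustration, and start_date/end_date are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (
        event_id    INTEGER PRIMARY KEY,
        event_title TEXT
    );
    CREATE TABLE event_attributes (
        event_id  INTEGER REFERENCES events(event_id),
        att_type  INTEGER,
        att_value INTEGER,
        PRIMARY KEY (event_id, att_type)
    );
    -- Illustrative data only.
    INSERT INTO events VALUES (1, 'Concert'), (2, 'Fair');
    INSERT INTO event_attributes VALUES (1, 1, 120), (1, 2, 95), (2, 1, 40);
    """
)

# One CASE expression per attribute type turns rows into columns.
# LEFT JOIN (unlike the inner join above) keeps events that have no
# attributes at all; missing attributes come back as NULL (None).
pivoted = conn.execute(
    """
    SELECT e.event_id,
           e.event_title,
           SUM(CASE WHEN a.att_type = 1 THEN a.att_value END) AS eventAttributes1,
           SUM(CASE WHEN a.att_type = 2 THEN a.att_value END) AS eventAttributes2,
           SUM(CASE WHEN a.att_type = 3 THEN a.att_value END) AS eventAttributes3
    FROM events AS e
    LEFT JOIN event_attributes AS a ON a.event_id = e.event_id
    GROUP BY e.event_id, e.event_title
    ORDER BY e.event_id
    """
).fetchall()

for row in pivoted:
    print(row)
```

As with PIVOT, the list of attribute types has to be spelled out in the query, so adding a new type means adding a new column expression.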

Laravel is it correct to set up a Model to get data from SQL's views?

Imagine I have my DB structure with tables A, B, C ... N
Tables have their relations. Nothing new...
Would it be a good idea to set up a model that implements a group of methods to retrieve views of my DB (related to a concept, e.g. statistics)? This model would not be directly connected to a real table; it would be a set of queries joining two or more tables (with subqueries on other tables), or calls to stored functions/procedures or SQL views.
Or is it better style to stick to the classical model setup, with one model per table, and place my (stats) methods in those models?
Or maybe Laravel 5 has some functionality or software layer for this kind of non-standard, complex cross-table query?
An example query might look like:
SELECT
id_quote, h, AVG(numquote)
FROM
(SELECT
id_quote, COUNT(*) numquote, HOUR(datetime) h
FROM
MyDB.A
WHERE
id_quote IS NOT NULL
GROUP BY DATE(datetime) , HOUR(datetime)) AS C
GROUP BY id_quote , h;

Database Table repeat of data in rows

I want to compare two tables to see if an employee has a high enough "proficiency level" (basic, intermediate, advanced) in the correct "competencies" required for a job role. Each job role will have 10 competencies. I don't think a table with the following columns is right:
jobroleID, competence1, proficiency1, competence2, proficiency2 ... competence10, proficiency10
But my alternative below also seems wrong, since it repeats jobroleID across 10 rows.
table 1: job role requirements
jobroleID  JRCompetence  JRProficiencyLevel
001        205           intermediate
001        207           basic
001        301           advanced
etc
002
table 2: employee current capability
EmployeeID  EmployeeCompetence  EmployeeProficiencyLevel
E1234       205                 intermediate
E1234       207                 basic
E1234       555                 basic
etc
I appreciate any advice on this.
I think your basic design is fine. Repeating rows is a natural consequence when you normalize a database so that each table only holds information about an entity and you model a many-to-many relationship such as employees and competencies/levels, and jobs and competencies w/ proficiency levels. This design makes it easy to add new requirements - you only have to add new rows in the job requirements table.
Your alternative design would require you to add new columns whenever you need to add new skills, and modify all queries that depend on the table - clearly this is not ideal.
I would however change the design so that the proficiency levels are stored in a separate table, which would make ordinal comparisons easier (so that 1=basic, 2=intermediate, 3=advanced).
A query to find which employees have the skills needed for a particular job could then look like:
-- list employees who can do job 001:
SELECT EmployeeID
FROM employee_current_capability ecc1
WHERE NOT EXISTS (
SELECT *
FROM job_role_requirements jrr
WHERE jrr.jobroleID = 001
AND NOT EXISTS (
SELECT *
FROM employee_current_capability ecc2
WHERE ecc1.EmployeeID = ecc2.EmployeeID
AND ecc2.EmployeeCompetence = jrr.JRCompetence
AND ecc2.EmployeeProficiencyLevel >= jrr.JRProficiencyLevel
)
)
GROUP BY ecc1.EmployeeID;
See this SQL Fiddle for some examples.
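To make the suggestion about a separate proficiency table concrete, here is a small sketch using an in-memory SQLite database. The proficiency_level table, its column names, and employee E5678 are assumptions added for illustration; the other table and column names follow the query above. Storing levels as integers is what lets the >= comparison work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical lookup table: 1=basic, 2=intermediate, 3=advanced.
    CREATE TABLE proficiency_level (
        level_id   INTEGER PRIMARY KEY,
        level_name TEXT UNIQUE
    );
    INSERT INTO proficiency_level VALUES
        (1, 'basic'), (2, 'intermediate'), (3, 'advanced');

    CREATE TABLE job_role_requirements (
        jobroleID          TEXT,
        JRCompetence       INTEGER,
        JRProficiencyLevel INTEGER REFERENCES proficiency_level(level_id),
        PRIMARY KEY (jobroleID, JRCompetence)
    );
    INSERT INTO job_role_requirements VALUES
        ('001', 205, 2), ('001', 207, 1), ('001', 301, 3);

    CREATE TABLE employee_current_capability (
        EmployeeID               TEXT,
        EmployeeCompetence       INTEGER,
        EmployeeProficiencyLevel INTEGER REFERENCES proficiency_level(level_id),
        PRIMARY KEY (EmployeeID, EmployeeCompetence)
    );
    -- E1234 comes from the question; E5678 is made up for the example.
    INSERT INTO employee_current_capability VALUES
        ('E1234', 205, 2), ('E1234', 207, 1), ('E1234', 555, 1),
        ('E5678', 205, 3), ('E5678', 207, 1), ('E5678', 301, 3);
    """
)

# Relational division: keep employees for whom no requirement of the
# job role is left unmet at the required level or higher.
qualified = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT ecc1.EmployeeID
        FROM employee_current_capability AS ecc1
        WHERE NOT EXISTS (
            SELECT 1
            FROM job_role_requirements AS jrr
            WHERE jrr.jobroleID = ?
              AND NOT EXISTS (
                  SELECT 1
                  FROM employee_current_capability AS ecc2
                  WHERE ecc2.EmployeeID = ecc1.EmployeeID
                    AND ecc2.EmployeeCompetence = jrr.JRCompetence
                    AND ecc2.EmployeeProficiencyLevel >= jrr.JRProficiencyLevel
              )
        )
        ORDER BY ecc1.EmployeeID
        """,
        ("001",),
    )
]

print(qualified)
```

E1234 is excluded because competence 301 is missing entirely, while E5678 qualifies even though one level (205) exceeds the requirement.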

Creating Data Warehouse

I am creating a data warehouse using a star schema. I have successfully built all the dimension tables, but I'm stuck on the fact table. I need to make a Sales table as the fact table. It has SalesKey, OrderKey, ProductKey, etc. Every order is a sale, so each order will have a unique SalesKey; however, each sale can have more than one product.
What would be the best way to build this table?
Should I create something like this?
SalesKey OrderKey ProductKey
-------- -------- ----------
s1 o1 p1
s1 o1 p2
s2 o2 p1
In general, when you design a star schema, it is preferred that each dimension is single-valued for each fact record (that is, having a 1:M relation between fact and dimension).
The trick is to include an ORDER-LINE dimension so that 1 order (= 1 sale) can contain many order lines. Each order line then contains 1 product.
So basically you will be using a snowflake schema where the fact table is linked to the ORDER-LINE dimension in a 1:M relation. The ORDER-LINE dimension is then linked to the PRODUCT dimension in a M:1 relation.
With this, the original problem of an M:M relation between the Sales fact and the PRODUCT dimension is solved by using the ORDER-LINE dimension as a bridge table.
I would add that order items/lines can be tricky. There are multiple ways to handle it.
Add a column "order line item" or "transaction control id" to the fact table.
This will allow you to have SalesKey, OrderKey, ProductKey all on your fact, with an "OrderLineItem" degenerate dimension key, which is often the transaction control number or order line number from the source system.
One issue that you may encounter when using this method is when you have order-level measures that don't exist at the order-line (tax, cashier id, etc). Kimball's preferred approach is to distribute these measures down to the order line if at all possible.
Here's a good article by Kimball on degenerate dimensions:
http://www.kimballgroup.com/html/designtipsPDF/DesignTips2003/KimballDT46AnotherLook.pdf
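The degenerate-dimension approach can be sketched as follows, again using an in-memory SQLite database. The names fact_sales, dim_product, OrderLineNumber, Quantity and SalesAmount, as well as the product names and amounts, are assumptions for illustration; only SalesKey, OrderKey and ProductKey come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (
        ProductKey  TEXT PRIMARY KEY,
        ProductName TEXT
    );
    INSERT INTO dim_product VALUES ('p1', 'Widget'), ('p2', 'Gadget');

    -- Grain: one row per order line. OrderLineNumber is a degenerate
    -- dimension: it lives on the fact row and has no dimension table.
    CREATE TABLE fact_sales (
        SalesKey        TEXT NOT NULL,
        OrderKey        TEXT NOT NULL,
        OrderLineNumber INTEGER NOT NULL,
        ProductKey      TEXT NOT NULL REFERENCES dim_product(ProductKey),
        Quantity        INTEGER NOT NULL,
        SalesAmount     REAL NOT NULL,
        PRIMARY KEY (OrderKey, OrderLineNumber)
    );
    INSERT INTO fact_sales VALUES
        ('s1', 'o1', 1, 'p1', 2, 20.0),
        ('s1', 'o1', 2, 'p2', 1, 15.0),
        ('s2', 'o2', 1, 'p1', 3, 30.0);
    """
)

# Order-level figures are recovered by aggregating the line-level rows.
order_totals = conn.execute(
    """
    SELECT OrderKey, COUNT(*) AS LineCount, SUM(SalesAmount) AS OrderAmount
    FROM fact_sales
    GROUP BY OrderKey
    ORDER BY OrderKey
    """
).fetchall()

for row in order_totals:
    print(row)
```

With this grain every fact row points at exactly one product, and order-level measures such as tax would need to be allocated down to the lines, as the answer above notes.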

Merge data object table with associated attributes table in a view

Here's the setup: I have several tables that hold information for data objects which have the potential to have various and sundry bits of data associated with them. Each of these tables has an associated attributes table, which holds 3 bits of information:
the id (integer) of the row the attribute is associated with
a short attribute name ( < 50 chars )
a value (varchar)
The object table will have any number of columns of varying data types, but will always have an integer primary key. If possible, I would like to set up a view that will allow me to select a row from the object table, and all of its associated attributes at one go.
****EDIT****
Ideally, the form I'd like this to take is having columns in the view with the names of the matched attribute from the attributes table, and the value as the value of the attribute.
So for example, if I have table Foo with columns 'Bar', 'Bat', and 'Baz' the view would have those columns, and additionally, columns for any attributes that a row might have.
****END EDIT****
Now, I know (or think I do) that SQL doesn't allow using variables as an alias for a column name. Is there a clean, practical way of doing what I want, or am I chasing a pipe dream?
The obvious solution is to handle all of this in the application code, but I'm curious if it can be done in SQL.
The answer depends on what you are actually seeking. Will the output of the view have one row per attribute per object or one column per attribute per object? If the former, then I'm not sure why you need a view:
Select ...
From ObjectTable
Join AttributeTable
On AttributeTable.Id = ObjectTable.Id
However, I suspect what you want is the latter, something like:
Select ...
, ... As Attribute1
, ... As Attribute2
, ... As Attribute3
...
From ObjectTable
In this scenario, the columns that would be generated are not known at execution because the attribute names are dynamic. This is commonly known as a dynamic crosstab. In general, the SQL language is not designed for dynamic column generation. The only way to do this in T-SQL is to use some fugly dynamic SQL. Thus, it is better done in a reporting tool or in middle-tier code.
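As a rough illustration of the middle-tier route, the sketch below reads the distinct attribute names first and then builds the crosstab query from them. It uses Python with an in-memory SQLite database purely as a stand-in; the tables Foo and FooAttribute, their columns, and the helper quote_identifier are hypothetical names, and conditional aggregation is just one way to produce the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Foo (Id INTEGER PRIMARY KEY, Bar TEXT);
    CREATE TABLE FooAttribute (
        Id    INTEGER REFERENCES Foo(Id),
        Name  TEXT,
        Value TEXT,
        PRIMARY KEY (Id, Name)
    );
    -- Illustrative data only.
    INSERT INTO Foo VALUES (1, 'first'), (2, 'second');
    INSERT INTO FooAttribute VALUES
        (1, 'color', 'red'), (1, 'size', 'L'), (2, 'color', 'blue');
    """
)


def quote_identifier(name):
    """Quote a value for use as a column alias (doubles embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'


# Step 1: discover which attribute names exist right now.
names = [
    row[0]
    for row in conn.execute("SELECT DISTINCT Name FROM FooAttribute ORDER BY Name")
]

# Step 2: build one column per attribute name. The names are bound as
# parameters for the comparison; only the quoted alias is spliced in.
select_list = ["o.Id AS Id", "o.Bar AS Bar"] + [
    "MAX(CASE WHEN a.Name = ? THEN a.Value END) AS " + quote_identifier(name)
    for name in names
]
sql = (
    "SELECT " + ", ".join(select_list) + " "
    "FROM Foo AS o "
    "LEFT JOIN FooAttribute AS a ON a.Id = o.Id "
    "GROUP BY o.Id, o.Bar "
    "ORDER BY o.Id"
)

# Step 3: run the generated query and read the column names back.
cursor = conn.execute(sql, names)
header = [column[0] for column in cursor.description]
rows = cursor.fetchall()

print(header)
for row in rows:
    print(row)
```

Because the column list depends on the data at the time the query is built, this cannot be frozen into an ordinary view, which is the limitation described above.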
It sounds like you want a view for each of your 'object' tables as well as its 'attributes' table. Correct me if I am wrong in my reading. It's not clear what your intentions are with 'using variables as an alias for a column name'. Were you hoping to merge ALL your objects, with their different columns, into one view?
I suggest creating one view per entity table, joined to its relevant 'attributes' table.
Question though - why is there one matching attributes table for each entity table? Why are they split out? Perhaps you've made the question simpler or obfuscated, so perhaps my question is rhetorical.
CREATE VIEW Foo AS
SELECT O.ID
,O.EverythingElse
,A.ShortName
,A.SomeVarcharValue
FROM
ObjectTable AS O --customer, invoice, whathaveyou
INNER JOIN
ObjectAttribute AS A ON A.ObjectID = O.ID
To consume from this, you could:
SELECT * FROM Foo WHERE ID = 4
or
SELECT * FROM Foo WHERE ShortName = 'Ender'
