I am doing a small exercise in which I need to create a small dimensional design covering the tsunamis that have occurred in different countries over the years. I have created a "Country" dimension and a "Location" dimension. Each record of the provided table may or may not include the longitude and latitude of the place. My question is where to put these attributes: in the fact table or in the Location dimension. My understanding is that the fact table should contain only metrics and the foreign keys of the dimensions. However, I am not sure how correct it is to add the longitude and latitude to the Location dimension, because the values span such a wide range that many records are created in the "Location" dimension table. Would it be more appropriate to put those attributes in the fact table?
Thanks.
You should merge Location and Country into a single Location dimension (country is an attribute of location) and hold the latitude and longitude in that dimension.
Put them on the fact table. This is exactly like a fact that has a time attribute with sub-second resolution: it's not necessary to create a dimension table containing every possible point in time, or every possible lat/long.
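A minimal sketch of the "keep lat/long on the fact table" option, using Python's built-in sqlite3 as a stand-in for the warehouse. The table names (`dim_location`, `fact_tsunami`), the `max_water_height_m` measure, and all sample rows are hypothetical and only illustrate the shape of the design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Country is an attribute of the location dimension (assumed layout).
CREATE TABLE dim_location (
    location_key INTEGER PRIMARY KEY,
    country      TEXT NOT NULL,
    place_name   TEXT NOT NULL
);
-- Coordinates live on the fact row and are nullable,
-- because the source records may not carry them.
CREATE TABLE fact_tsunami (
    tsunami_key        INTEGER PRIMARY KEY,
    location_key       INTEGER NOT NULL REFERENCES dim_location(location_key),
    event_year         INTEGER NOT NULL,
    latitude           REAL,
    longitude          REAL,
    max_water_height_m REAL
);
-- Made-up sample rows, not real events.
INSERT INTO dim_location VALUES (1, 'Country A', 'Place 1'), (2, 'Country B', 'Place 2');
INSERT INTO fact_tsunami VALUES
    (1, 1, 2001, 10.5, 20.25, 3.0),
    (2, 1, 2005, 10.6, 20.30, 1.5),
    (3, 2, 2010, NULL, NULL, 2.0);
""")

# Slicing by country still goes through the dimension.
events_per_country = conn.execute("""
    SELECT d.country, COUNT(*)
    FROM fact_tsunami f
    JOIN dim_location d ON d.location_key = f.location_key
    GROUP BY d.country
    ORDER BY d.country
""").fetchall()

# The dimension stays small: one row per place, not one per coordinate pair.
dimension_rows = conn.execute("SELECT COUNT(*) FROM dim_location").fetchone()[0]
```

With this layout the dimension grows with the number of distinct places, while slightly different coordinates for the same place only add fact rows.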
We received some generic training related to TM1 and dimension creation and we were informed we'd need separate dimensions for the same values.
Let me describe: we transport goods, so we'd have an origin province and a destination province. In typical database design I'd expect one "province" reference table, but we were informed we'd need an "origin" dimension and a "destination" dimension. This seems cumbersome, and it seems we'd encounter the same issue with customers, services, etc.
Can someone clarify how this could work for us?
Again, I'd expect to see a "lookup" table in the database which contains all possible provinces (assumption is values in both columns would be the same), then you'd have an ID value in any column that used the "province" and join to the "lookup" table based on ID.
in typical database design I'd expect we'd have one "province" reference table, but we were informed we'd need an "origin" dimension and a "destination" dimension
Following regular DB design, it makes sense to keep the two data entities separate: one defines the source, the other defines the target. I think we'd both agree on this. More details would help.
Imagine two drop-down lists that are populated from one single "source" but represent two different values in the DB.
assumption is values in both columns would be the same
If destination = origin, then you don't need two dimensions? :) This point needs clarification.
Besides your solution (every combination of source and destination in a table with a unique ID, which could be one way of solving this), it seems this can be resolved by changing the cube or dimension structure.
If in some dimension you used e.g. ProvinceOrigin and ProvinceDestination as string-type elements and populated them from one single dimension (a dynamic attribute), then whenever you save the cube these two fields would be populated from that single dimension.
Obviously the best solution for you depends on your system architecture.
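On the relational side, the "one lookup table, two roles" idea from the question can be sketched as below. This is plain SQL run through Python's sqlite3, not TM1 syntax. The `province` and `shipment` tables, the `weight_kg` column, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One shared lookup table for provinces.
CREATE TABLE province (
    province_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
-- The fact references the same lookup twice, once per role.
CREATE TABLE shipment (
    shipment_id             INTEGER PRIMARY KEY,
    origin_province_id      INTEGER NOT NULL REFERENCES province(province_id),
    destination_province_id INTEGER NOT NULL REFERENCES province(province_id),
    weight_kg               REAL NOT NULL
);
INSERT INTO province VALUES (1, 'Ontario'), (2, 'Quebec'), (3, 'Alberta');
-- Illustrative shipments only.
INSERT INTO shipment VALUES (1, 1, 2, 100.0), (2, 1, 3, 50.0), (3, 2, 2, 25.0);
""")

# Join the lookup twice under different aliases: "o" plays the origin role,
# "d" plays the destination role.
routes = conn.execute("""
    SELECT o.name, d.name, SUM(s.weight_kg)
    FROM shipment s
    JOIN province o ON o.province_id = s.origin_province_id
    JOIN province d ON d.province_id = s.destination_province_id
    GROUP BY o.name, d.name
    ORDER BY o.name, d.name
""").fetchall()
```

The two aliases behave like two dimensions at query time even though only one table is stored, which may be what the training meant by separate "origin" and "destination" dimensions.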
I want to manage some data by intervals in my database, like this:
Is it possible to do that in a single table, or do I need 3 tables, one for each color (with FKs)?
Real example:
Currently, in my app, I use this in a dataGridView and in my database:
It is possible to set or modify everything on three databases. I add the equivalency (green) manually, but numbers that differ only slightly have the same equivalency, so for me it is interesting to use numeric intervals.
I'm not an expert on modeling databases, but this is how I would solve your scenario.
I'd create two range tables, one for storing the column values and the other for the row values. Both tables have the same structure, but since you need to present the final values as a matrix, I decided to use two tables (instead of merging them into one; that is possible, but showing data from "Values" would then take more effort). As you can see, I've included an IdEquivalency column, which will be useful for showing the data as needed.
Finally, the Values table (for the green values) has two FKs (one for each range table) and the stored value.
This is still a basic idea, but I'm sure you get the point.
Considerations:
Change the table names according to what their values represent.
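The two-range-tables idea can be sketched roughly as follows with Python's sqlite3. The names `row_range`, `col_range`, and `cell_value`, the half-open interval convention (`low <= x < high`), and the sample data are all assumptions. The IdEquivalency column mentioned above is left out because its exact meaning is not recoverable from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table of intervals for the matrix rows, one for the columns.
CREATE TABLE row_range (
    row_range_id INTEGER PRIMARY KEY,
    low  REAL NOT NULL,
    high REAL NOT NULL,
    CHECK (low < high)
);
CREATE TABLE col_range (
    col_range_id INTEGER PRIMARY KEY,
    low  REAL NOT NULL,
    high REAL NOT NULL,
    CHECK (low < high)
);
-- The "green" values: one cell per (row interval, column interval) pair.
CREATE TABLE cell_value (
    row_range_id INTEGER NOT NULL REFERENCES row_range(row_range_id),
    col_range_id INTEGER NOT NULL REFERENCES col_range(col_range_id),
    value        TEXT NOT NULL,
    PRIMARY KEY (row_range_id, col_range_id)
);
-- Illustrative intervals and values.
INSERT INTO row_range VALUES (1, 0, 10), (2, 10, 20);
INSERT INTO col_range VALUES (1, 0, 5), (2, 5, 10);
INSERT INTO cell_value VALUES (1, 1, 'A'), (1, 2, 'B'), (2, 1, 'C'), (2, 2, 'D');
""")

def lookup(row_number, col_number):
    """Return the value whose intervals contain both numbers, or None."""
    row = conn.execute("""
        SELECT v.value
        FROM cell_value v
        JOIN row_range r ON r.row_range_id = v.row_range_id
        JOIN col_range c ON c.col_range_id = v.col_range_id
        WHERE r.low <= ? AND ? < r.high
          AND c.low <= ? AND ? < c.high
    """, (row_number, row_number, col_number, col_number)).fetchone()
    return row[0] if row else None
```

For example, `lookup(12, 7)` falls in the second row interval and the second column interval, so numbers that differ only slightly resolve to the same stored value.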
I have designed a relational database that keeps track of various assets and their owners over time. One of the most important pieces of analysis I want to do is to track the value of those assets over time: expected original cost, actual original cost, actual cost, etc. So I have been putting data relating to cost / value in a separate table called “Support_Value”. To complicate things, some of the assets I’m tracking are in countries with foreign currencies, so I’m collecting cost / value data in US Dollars but also in local currencies (“LC”), which ends up doubling the number of columns in this table. I also use this table to keep track of the value of the asset owners themselves in a similar fashion.
- The columns of this table are the following:
My initial plan was to carve out separate tables to deal with (1) the various “qualities” of entries relative to cost and value (i.e. “planned”, “upper” bound, “lower” bound, “estimated” by analysts, and “actual”) and (2) currencies. But I realize this is likely to break, as it doesn’t allow for an initial “planned” cost that is subsequently revised, unless we make that explicit by creating new columns for the revised figures, and even then there can be more than one revision. So it is still not perfect.
What I’m now envisaging is to create a different value table that would have the following columns:
ID (PK representing individual instances of cost / value estimates)
Currency (FK to my currency table)
Asset (FK to my assets table) - i.e. what this cost or value is referring to
Date (FK to my date table) - i.e. to track revisions actually
Type (i.e. “cost” or “value”)
Quality (i.e. “planned”, “upper”, “lower”, “estimated”, “actual”)
Valuation - i.e. the actual absolute amount in the currency designated in the second column
What do you think of this approach? Is this an improvement?
Thanks for any suggestion you could have!
Both approaches are fine.
But if you think you may need additional similar columns,
then the second approach is more extensible.
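To see why the second approach extends well, here is a rough sketch in Python's sqlite3 of the proposed table and a "latest revision" query. The column names `asset_id`, `as_of_date`, `entry_type`, and `amount` are stand-ins for the question's Asset, Date, Type, and Valuation. Currency and date are stored inline instead of as FKs to keep the sketch short, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE valuation (
    id         INTEGER PRIMARY KEY,
    currency   TEXT NOT NULL,      -- would be an FK to the currency table
    asset_id   INTEGER NOT NULL,   -- would be an FK to the assets table
    as_of_date TEXT NOT NULL,      -- ISO date; would be an FK to the date table
    entry_type TEXT NOT NULL CHECK (entry_type IN ('cost', 'value')),
    quality    TEXT NOT NULL CHECK (quality IN
                   ('planned', 'upper', 'lower', 'estimated', 'actual')),
    amount     REAL NOT NULL
);
-- A planned cost, a later revision of it, and the eventual actual cost.
INSERT INTO valuation (currency, asset_id, as_of_date, entry_type, quality, amount) VALUES
    ('USD', 1, '2020-01-01', 'cost', 'planned', 100.0),
    ('USD', 1, '2020-06-01', 'cost', 'planned', 120.0),
    ('USD', 1, '2021-01-01', 'cost', 'actual',  130.0);
""")

# Latest entry per (asset, type, quality, currency): a revision is just a new
# row with a later date, so no extra columns are needed.
latest_costs = conn.execute("""
    SELECT v.asset_id, v.quality, v.as_of_date, v.amount
    FROM valuation v
    WHERE v.entry_type = 'cost' AND v.currency = 'USD'
      AND v.as_of_date = (
          SELECT MAX(v2.as_of_date)
          FROM valuation v2
          WHERE v2.asset_id = v.asset_id
            AND v2.entry_type = v.entry_type
            AND v2.quality = v.quality
            AND v2.currency = v.currency
      )
    ORDER BY v.asset_id, v.quality
""").fetchall()
```

Any number of revisions fits without a schema change, and the full history stays queryable by dropping the `MAX` filter.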
Your second approach does look overnormalized, though;
I suggest splitting the "Quality" column back into its parts.
Something like:
"ID"
"Currency"
"Asset"
"Date"
"Type"
"Planned"
"Lower"
"Upper"
"Estimated"
"Actual"
"Valuation"
Cheers.
I'm trying to create a multidimensional database from a preexisting database using SQL Server Analysis Services. My problem is that the original database stores all information in a varchar field called "value". What's in that field depends on another field that holds the type of statistic. So I can have, for example, a fact with statistic_type "number of products sold" and value 1000, and another with type "cost of material bought" and value 5000. The values can have completely different meanings: some are numeric values, others are percentages, and others are strings.
How do I turn those into measures? Should statistic_type be a dimension of the cube, with value as a measure? Does a measure always need to have a numeric value? Should I split the fact table into several tables, one for each type of statistic? Or is there some sensible way to create a cube using just the one table?
It's the first time I'm working with multidimensional databases and SSAS, so I'm a little lost.
A measure always needs to have a numeric value. In fact, you will probably have to cast the value column as a numeric datatype in your Data Source View in order for it to even be a candidate for a measure in your cube.
You should make statistic_type a dimension and "value" a measure. It's ok to just use the one table, although it might be easier to work with if you make a lookup table of the distinct statistic_types.
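As a rough illustration of that shape, the sketch below builds a lookup of distinct statistic types and a view that casts `value` to a number. It uses Python's sqlite3 rather than SQL Server, so the cast you would write in an actual Data Source View will differ. The names `raw_stat`, `dim_statistic_type`, `fact_stat`, the `is_numeric` flag, and the "region label" row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the preexisting table: everything stored as text.
CREATE TABLE raw_stat (statistic_type TEXT NOT NULL, value TEXT NOT NULL);
INSERT INTO raw_stat VALUES
    ('number of products sold', '1000'),
    ('cost of material bought', '5000'),
    ('region label',            'north'),
    ('number of products sold', '250');

-- Lookup table of distinct statistic types; this becomes the dimension.
CREATE TABLE dim_statistic_type (
    type_key       INTEGER PRIMARY KEY,
    statistic_type TEXT NOT NULL UNIQUE,
    is_numeric     INTEGER NOT NULL      -- assumed flag, maintained by hand
);
INSERT INTO dim_statistic_type (statistic_type, is_numeric)
    SELECT DISTINCT statistic_type, 1 FROM raw_stat;
UPDATE dim_statistic_type SET is_numeric = 0 WHERE statistic_type = 'region label';

-- Only numeric types are cast and exposed as a measure; string-valued
-- statistics are filtered out instead of being cast to a meaningless number.
CREATE VIEW fact_stat AS
    SELECT d.type_key, CAST(r.value AS REAL) AS measure_value
    FROM raw_stat r
    JOIN dim_statistic_type d ON d.statistic_type = r.statistic_type
    WHERE d.is_numeric = 1;
""")

totals = conn.execute("""
    SELECT d.statistic_type, SUM(f.measure_value)
    FROM fact_stat f
    JOIN dim_statistic_type d ON d.type_key = f.type_key
    GROUP BY d.statistic_type
    ORDER BY d.statistic_type
""").fetchall()
```

Note that a sum across different statistic types is not meaningful, so the measure only makes sense when sliced by the statistic_type dimension.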
I have what seems like a simple question but might be more difficult than anticipated. Let's say I'm trying to track the latitude and longitude of Users and Businesses. Right now, I have a table called locTable that contains 3 columns: Index, Latitude, Longitude.
The tables that store information for the Users and Businesses each contain a FK to locTable. This allows me to use one table to store the location data; however, I've noticed that querying this data might be difficult.
Now, I could store Latitude and Longitude information in each table for Users and Businesses, however if I need to make changes regarding the data, I would have to update the queries along with two (or more) different tables.
What would you all suggest? Shared table or store the information separately?
First off, establish whether lat/long is a "lookup" table. I would not consider Lat/Long to be a lookup. What that means is you would not store an exhaustive list of every possible Lat/Long combo. They are often specified to 4 decimal places. In theory there could be infinite Lat/Long combos if you have infinite scale.
I would not consider it duplication to store lat/long in both tables. Think of BirthDates. To avoid BirthDate duplication you could have a "BirthDates" table with a row for every day in the last 300 years. This would avoid BirthDay duplication of people, dogs, and companies. But it is not duplication in the "lookup" sense.
I am not suggesting it is wrong to store Lat/Long in its own table. I'm just suggesting it may not be considered duplication to store Lat/Long in both tables.
Are the locations changing? If so, reverse the foreign key: give the location table a foreign key to the users and businesses tables. Add a date column to the location table, and you can also track them over time just by adding new locations.
With your model, you potentially have to update two tables in order to update the location. By switching the foreign key around, all you have to do is add new rows to the location table when the location changes.
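A minimal sketch of the reversed foreign key, again in Python's sqlite3. The two nullable owner columns with a CHECK constraint are one possible way to point at either a user or a business, not something the answer prescribes. The table and column names and the sample coordinates are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users      (user_id     INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE businesses (business_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- The location row points at its owner, and carries a date.
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    user_id     INTEGER REFERENCES users(user_id),
    business_id INTEGER REFERENCES businesses(business_id),
    recorded_at TEXT NOT NULL,            -- ISO date
    latitude    REAL NOT NULL,
    longitude   REAL NOT NULL,
    -- Exactly one owner column must be set.
    CHECK ((user_id IS NULL) <> (business_id IS NULL))
);
INSERT INTO users VALUES (1, 'alice');
-- A move is recorded by inserting a new row; nothing is updated.
INSERT INTO location (user_id, recorded_at, latitude, longitude) VALUES
    (1, '2024-01-01', 1.0, 2.0),
    (1, '2024-02-01', 3.0, 4.0);
""")

# Current location = most recent row for that owner.
current = conn.execute("""
    SELECT latitude, longitude
    FROM location
    WHERE user_id = ?
    ORDER BY recorded_at DESC
    LIMIT 1
""", (1,)).fetchone()

# Full history comes for free.
history_count = conn.execute(
    "SELECT COUNT(*) FROM location WHERE user_id = ?", (1,)
).fetchone()[0]
```

If history is not needed, the same table works with at most one row per owner, at the cost of going back to updates.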