I'm developing a web GIS based on PostGIS. When I create a GIS-enabled table in PostGIS I have to specify the SRID of the data, for example a UTM zone (e.g. UTM 33N, SRID 32633).
Is there a way to keep the project scalable and store data with different SRIDs in the same table (e.g. 32633 and 32632 to cover all of Italy) without using a global geographic SRID (like WGS84 and similar)?
Thank you
I think you'd end up in a world of pain if you try to combine different SRIDs in one column (how do you know how to display the layer properly?).
However, you could create a table with two geometry columns. To make things more maintainable you could e.g. have triggers that update the "other" column's geometry when one column changes...
create table table_with_2_geoms (
    geom_32633 geometry(geometry, 32633),  -- UTM 33N
    geom_32632 geometry(geometry, 32632)   -- UTM 32N
);
I am struggling to figure out how to create a star schema from multiple source tables. I work at a trading firm, so the data is related to user trading activity. The issue I am having is that our datasets do not have primary IDs for every field that could be a dimension. Instead, we usually relate our data together using the combination of date and account number. Here is an example of 3 source tables...
I would like to turn this into a star schema, something that looks like ...
Is my only option to denormalize my source tables into one wide table (joining trades to positions on account number and date, and joining the users table on account number), create keys for each dimension, then re-normalize it into the star schema? Are star schemas ever built from multiple source tables?
Star schemas are almost always created from multiple source tables.
The normal process is (see the SQL sketch after this list):
1. Populate your dimension tables
2. Create a temporary/virtual fact record using your source data
3. Using this fact record, look up the relevant dimension keys
4. Write the actual fact record to your target fact table
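A hedged sketch of steps 2-4 in SQL; all table and column names here are hypothetical:

insert into fact_trade (date_key, account_key, quantity, price)
select d.date_key,          -- looked-up dimension keys
       a.account_key,
       s.quantity,          -- measures carried over from the source
       s.price
from   staging_trades s     -- the temporary/virtual fact record
join   dim_date    d on d.full_date      = s.trade_date
join   dim_account a on a.account_number = s.account_number;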
Data-warehousing is about query speed. The data-warehouse should not be concerned with data integrity. IT SHOULD NOT CLEAN OR CORRECT BAD DATA. It only needs to gather all the data together into a single record to present to the model for analysis. Denormalizing the data is how this is done.
In a star schema, dimensions do not know about each other and have no relationships with other dimensions. In a snowflake, dimensions are related to other dimensions. That is the primary difference between star and snowflake.
All the metadata options for events are rolled up into dimensions and used for slicing/filtering. All the measurable/calculation data for an event are in the event fact, along with a reference to the dimension(s) containing the relevant metadata. The Metadata/Dimension is reused across multiple fact records.
Based on the limited example you've provided, I'd suggest you research degenerate dimensions and junk dimensions. Your Trade and Position data may need to be turned into a fact and a dimension (degenerate), and some of your flag attributes may be best placed into a junk dimension.
You should also make sure your dimension keys are clear. You should not have multiple paths to a dimension (account number: trade -> position -> user and trade -> user), as that will cause inconsistent results when querying, depending on which relationship you traverse.
I have a fact table Service1 (open or latest data) from Mart1 and another fact table Service2 (historic data) from Mart2. These tables share a few common measures and dimensions, but the underlying datasets are mutually exclusive.
Now the business wants to merge these two facts into one table in Tabular model to do Year over Year comparison.
Is it possible to combine these two facts? If so, what should the approach be?
If not, how else can this be achieved?
Things to note:
Records in Fact table Service2 will never change
The dimension keys between Mart1 and Mart2 are not guaranteed to be the same
Are these Data Marts different databases? If so, you can create a calculated table that brings the two tables together. To do this, in 2016, on the bottom of the designer, there is a little plus sign on the far right next to the last table tab defined. When you hover over it, it will say "Create new table from DAX formula". Create the DAX that selects from the first table union the second table.
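For instance, a hedged sketch of such a calculated table in DAX, with hypothetical table and column names (UNION requires the two column lists to line up):

MergedService =
UNION (
    SELECTCOLUMNS ( 'Service1', "ServiceDate", 'Service1'[ServiceDate], "Amount", 'Service1'[Amount] ),
    SELECTCOLUMNS ( 'Service2', "ServiceDate", 'Service2'[ServiceDate], "Amount", 'Service2'[Amount] )
)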
If the marts are in the same database, you can create a partition for each on the table properties. To do this you would create a tabular model, open a connection to the data source, and bring in the changing data. Then click on the table, then Partitions, click New, and grab the archived data. You would have to make sure the column definitions line up in order to do this.
As far as the other issues you describe, it sounds as though you are using the Data Marts as your warehouses. Do you have access to the data before it was transformed into the Data Mart, where the surrogate keys were applied? I generally keep a Persisted Staging Area around for these cases. If you employ a Data Vault (http://learndatavault.com/) prior to your Data Mart creation, you could simply create a new Data Mart with both sets of data, and all of the dimension keys will be intact.
For example, my school has a longitude and latitude; do I need to create a new table for them?
Or should I just combine these two values into a string and then insert that string into an existing key-value settings table?
Any advice is welcome.
It depends on how you need to use it later. If you need the latitude and longitude in some logical processing, then it is advisable to store them separately in two columns. But if you just need to show it somewhere (e.g. a school info page), then you can store them together as a string.
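A minimal sketch of the two-column approach, using an illustrative schools table:

create table schools (
    id        integer primary key,
    name      text,
    latitude  double precision,   -- kept separate so they can be used in calculations
    longitude double precision
);

-- e.g. a crude bounding-box filter becomes a plain WHERE clause:
select name
from schools
where latitude  between 45.0 and 46.0
  and longitude between  9.0 and 10.0;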
I would store them apart... and I would consider making them primary keys too.
The better way is to have them in separate columns as these values will be used for calculations.
The question is: what database design should I apply in this situation?
main table:
ID | name | number_of_parameters | parameters
parameters table:
parameter | parameter | parameter
The number of elements in the parameters table does not change. The number_of_parameters cell defines how many parameters tables should be stored in the next cell.
I have trouble moving from object thinking to database design. In object terms, one row has as many parameters as number_of_parameters says.
I hope the description of the requirements is clear. What is the correct way to design such a database? If someone can provide some SQL statements to obtain it, that would be nice. But the main goal of this question is to understand how to build such an architecture.
I want to use SQLite to create this database.
The relational way is to have two tables. The main table has an ID, name and as many other universally-present parameters as possible. The parameters table contains a mapping from an ID in the main table to a parameter name and a parameter value; the main table ID should be a foreign key, and the combination of ID and name should be unique.
The number of parameters can be found by just counting the number of rows with a particular ID.
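A sketch of this in SQLite, with illustrative names:

create table main (
    id   integer primary key,
    name text not null
);

create table parameters (
    main_id integer not null references main(id),
    name    text    not null,
    value   text,
    unique (main_id, name)   -- one value per parameter name per main row
);

-- number_of_parameters is derived rather than stored:
select count(*) from parameters where main_id = 1;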
If you can serialize the data while saving it to the database and deserialize it back when you get the record, it will work. You can get the total number of objects in the serialized container and save that count to the number_of_parameters field and the serialized data to the parameters field.
There isn't one perfectly correct way, but if you want to use a relational database, you should preferably use relational tables.
If you have a key-value database, you place your serialized data as a document attached to your key.
If you want a hybrid solution, both human-editable and single-table, you can serialize your data to a human-readable format such as YAML, which sees heavy use in the configuration files of open source projects.
Given the table tblProject. This has a myriad of properties, for example width, height, etc. Dozens of them.
I'm adding a new module which lets you specify settings for your project for mobile devices. This is a 1-1 relationship, so all the mobile settings should be stored in tblProject. However, the list is getting huge, and there would be some ambiguity amongst properties (i.e. I would have to prefix all mobile fields with Mobile_ so that Mobile_width isn't confused with width).
How bad is it to denormalise and store the mobile settings in another table? Or is there a better way to store the settings? The properties are becoming unwieldy and hard to modify/find in the table.
I want to respond to @Alexander Sobolev's suggestion and provide my own.
@Alexander Sobolev suggests an EAV model. This trades maximum flexibility for poor performance and complexity, as you need to join multiple times to get all the values for an entity. The way you typically work around those issues is to keep all the entity metadata in memory (i.e. tblProperties) so you don't join to it at runtime, and to denormalize the values (i.e. tblProjectProperties) as a CLOB (e.g. XML) off the root table. Thus you only use the values table for querying and sorting, but not to actually retrieve the data. You also usually end up caching the actual entities by ID, so you don't pay the cost of deserialization each time. The issues you run into there are cache invalidation of the entities and their metadata. So overall it's a non-trivial approach.
What I would do instead is create a separate table, perhaps more than one depending on your data, with a discriminator/type column:
create table properties (
    root_id int,
    type_id int,   -- discriminator: which kind of settings this row holds
    height  int,
    width   int,
    ...etc...
);
Make the unique key a combination of root_id and type_id, where type_id would be representative of mobile, for instance - assuming a separate lookup table in my example.
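A hedged illustration of that lookup table and unique combination; the names are mine:

create table property_types (
    type_id int primary key,
    name    varchar(32)        -- e.g. 'default', 'mobile'
);

alter table properties
    add constraint uq_root_type unique (root_id, type_id);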
There is nothing bad about storing the mobile section in another table. It could even bring some savings, depending on how much this information is used.
You can store it in another table, or use an even more elaborate version with three tables: your tblProject, a tblProperties, and a tblProjectProperties.
create table tblProperties (
    id int identity(1,1) not null primary key,   -- surrogate key for each property
    prop_name nvarchar(32),
    prop_description nvarchar(1024)
)

create table tblProjectProperties (
    ProjectUid int not null,       -- which project
    PropertyUid int not null,      -- which property
    PropertyValue nvarchar(256)    -- the value, stored as text
)
with foreign key tblProjectProperties.ProjectUid -> tblProject.uid
and foreign key tblProjectProperties.PropertyUid -> tblProperties.id
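Expressed as DDL (assuming tblProject has a uid primary key, as described above):

alter table tblProjectProperties
    add constraint fk_projprops_project foreign key (ProjectUid) references tblProject (uid)

alter table tblProjectProperties
    add constraint fk_projprops_property foreign key (PropertyUid) references tblProperties (id)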
The thing is, if you have different types of projects which use different properties, you have no need to store all those unused NULLs; you store only the properties you really need for a given project. The above schema gives you some flexibility. You can create views for the different project types and use them to avoid too many joins in user selects.
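For example, a hedged sketch of such a view, pivoting two illustrative property names:

create view vwMobileSettings as
select pp.ProjectUid,
       max(case when p.prop_name = 'Mobile_width'  then pp.PropertyValue end) as Mobile_width,
       max(case when p.prop_name = 'Mobile_height' then pp.PropertyValue end) as Mobile_height
from tblProjectProperties pp
join tblProperties p on p.id = pp.PropertyUid
group by pp.ProjectUid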