QGIS trouble setting a one to many relation - postgis

For the first time I tried to set up a one-to-many relationship in QGIS.24 using a PostGIS spatial table and a view. The view includes a foreign key field referring to the table's primary key.
CREATE TABLE paddocks(
    id serial NOT NULL,
    name character varying(45) NOT NULL,
    area numeric(5,2) NOT NULL,
    sectionid smallint NOT NULL,
    geom geometry (POLYGON,5347),
    CONSTRAINT paddocks_pkey PRIMARY KEY (id));
CREATE VIEW es_cattle_existence_by_paddock_category AS
SELECT p.name,
    cc.category,
    sum(COALESCE(ci.move_in, 0)) - sum(COALESCE(ci.move_out, 0)) AS quantity,
    ci.paddockid
FROM cattle_inventory ci
    JOIN cattle_category cc ON ci.categoryid = cc.id
    JOIN paddocks p ON ci.paddockid = p.id
GROUP BY p.name, cc.category, ci.paddockid
ORDER BY p.name, cc.category;
Add Relation (QGIS Project Properties)
Referenced (parent): paddocks
Field: id
Referencing (child) : es_cattle_existence_by_paddock_category
Field: paddockid
The relation, set up under Project Properties - Relations, works fine and displays the child fields (from the view) in the paddock attribute table, but they do not appear in Layer Properties - Labels (the value field), in the expression dialog, or in the field calculator.
The relation (test_paddock) is listed under Relations in the Attribute Forms section of Layer Properties, but the child fields (from the view) do not show in the Form Layout when using the Drag and Drop Designer.
I followed the instructions in the QGIS user manual (15.2.6 Creating one or many to many relations). I also tried on a different Windows PC using QGIS.22, and with tables that have fully defined foreign keys and constraints, always with the same result. My goal is to use the child's field values in the label of the paddock (parent) layer.
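To make the shape of the child layer concrete, here is a minimal sketch of the view above, run against SQLite through Python's sqlite3 module instead of PostGIS. The cattle_inventory and cattle_category columns are inferred from the view definition, the geometry column is left out, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE paddocks(id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE cattle_category(id INTEGER PRIMARY KEY, category TEXT NOT NULL);
-- Column layout assumed from the view; not shown in the question.
CREATE TABLE cattle_inventory(
    paddockid INTEGER NOT NULL REFERENCES paddocks(id),
    categoryid INTEGER NOT NULL REFERENCES cattle_category(id),
    move_in INTEGER,
    move_out INTEGER);
CREATE VIEW es_cattle_existence_by_paddock_category AS
SELECT p.name, cc.category,
       sum(COALESCE(ci.move_in, 0)) - sum(COALESCE(ci.move_out, 0)) AS quantity,
       ci.paddockid
FROM cattle_inventory ci
JOIN cattle_category cc ON ci.categoryid = cc.id
JOIN paddocks p ON ci.paddockid = p.id
GROUP BY p.name, cc.category, ci.paddockid;
-- Invented sample data.
INSERT INTO paddocks VALUES (1, 'North'), (2, 'South');
INSERT INTO cattle_category VALUES (1, 'cows'), (2, 'calves');
INSERT INTO cattle_inventory VALUES
    (1, 1, 30, NULL), (1, 1, NULL, 5), (1, 2, 12, NULL), (2, 1, 8, 8);
""")

rows = conn.execute(
    "SELECT name, category, quantity, paddockid "
    "FROM es_cattle_existence_by_paddock_category "
    "ORDER BY name, category"
).fetchall()
for row in rows:
    print(row)
```

Paddock 1 appears on two rows of the view (one per category), which is why the view is the "many" side of the relation: a label on the parent layer would have to combine several child rows into one value.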

Modeling a data-store browser

I have a connection-object browser in which I want to let a user view the various data sources they are connected to. The object viewer looks something like this:
Connection: Remote.1234.MySQL (3 level source)
    Database: Sales
        Table: User
            Field: Name -- CHAR(80)
            Field: Age -- INT32
        Table: Product
            ...
        Table: Purchase
            ...
    Database: Other
        ...
Connection: Remote.abc.ElasticSearch (2 level source)
    Index: Inventory
        Field: ID -- INTEGER
        Field: Product -- STRING
        ...
Connection: Local.xyz.MongoDB (3 level source)
    Database: Mail
        Collection: Users
            Field: MailboxID -- INTEGER
            Field: Name -- STRING
        Collection: Documents
            ...
Connection: Local.xyz.SQLServer (4 level source)
    Database: Main
        Schema: Public
            Table: user
                Field: Name -- STRING
    Database: History
        ...
In other words, a 'Source' is a hierarchy with a known number of levels and a known 'name' for each level. While the hierarchy varies from source to source, the hierarchy of any given source will always have the same number of levels and the same name for each level. What might be a good way to model this relationally? My thought was to have the following:
Connection:
    id
    host
    (other details)
SourceType:
    id
    Name
SourceTypeLevelMapping:
    SourceTypeID
    level (int)
    name
ThreeLevelSource_Level1: # e.g., Database
    ID
    ParentID (ConnectionID)
    Name
    (other details)
ThreeLevelSource_Level2: # e.g., Table
    ID
    ParentID (Level1ID)
    Name
    (other details)
ThreeLevelSource_Level3: # e.g., Field
    ID
    ParentID (Level2ID)
    FieldName
    FieldType
    (other details)
Then do the same for the other level-ed hierarchies:
TwoLevelSource_Level1, TwoLevelSource_Level2
FourLevelSource_Level1, FourLevelSource_Level2, FourLevelSource_Level3, FourLevelSource_Level4
So basically define the known hierarchies and for each new source that we add we would attach it to one of the known hierarchy levels. The alternative approach I was thinking of doing is to create a new hierarchy for each new source, but then we would be looking at literally hundreds of tables if we were to allow access to 25-50 sources.
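As a rough illustration of the SourceType / SourceTypeLevelMapping part of the proposal, the sketch below stores the level names of the four sources from the browser example and reads them back in order. It uses SQLite through Python's sqlite3 module; the column types, the keys and the level_names helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SourceType(id INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE SourceTypeLevelMapping(
    SourceTypeID INTEGER NOT NULL REFERENCES SourceType(id),
    level INTEGER NOT NULL,
    name TEXT NOT NULL,
    PRIMARY KEY (SourceTypeID, level));
""")

# Level names taken from the browser example in the question.
LEVELS = {
    "MySQL": ["Database", "Table", "Field"],
    "ElasticSearch": ["Index", "Field"],
    "MongoDB": ["Database", "Collection", "Field"],
    "SQLServer": ["Database", "Schema", "Table", "Field"],
}
for type_name, names in LEVELS.items():
    cur = conn.execute("INSERT INTO SourceType(Name) VALUES (?)", (type_name,))
    conn.executemany(
        "INSERT INTO SourceTypeLevelMapping VALUES (?, ?, ?)",
        [(cur.lastrowid, level, name) for level, name in enumerate(names, start=1)],
    )


def level_names(source_type):
    """Return the ordered level names for one source type (hypothetical helper)."""
    rows = conn.execute(
        "SELECT m.name FROM SourceTypeLevelMapping m "
        "JOIN SourceType t ON t.id = m.SourceTypeID "
        "WHERE t.Name = ? ORDER BY m.level",
        (source_type,),
    ).fetchall()
    return [name for (name,) in rows]


print(level_names("SQLServer"))
```

The mapping only names the levels; which physical tables hold the nodes of each level is the part the rest of the question is about.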
What might be a good way to model this type of hierarchical data?
(Also, yes, I am familiar with the existing general approaches for modeling hierarchical data as delineated here -- What are the options for storing hierarchical data in a relational database?, How can you represent inheritance in a database? -- this question is not a duplicate of those.)
Relational Solution
Responding to the relational-database and hierarchic-data tags, the latter being pedestrian in the former.
1.1 Preliminary
Because this solution must cater for, and distinguish between:
the genuine SQL Platforms (conformance to the Standard; server architecture, unified language; etc) and
the pretend "SQL" programs (no architecture; bits of language spread across those programs; no Transactions; no ACID; etc) that provide no compliance to the Standard, and therefore use the term incorrectly, and
the non-SQLs
I use Record and Field to cover all possibilities, instead of the Relational terms, which would convey Relational definitions.
All possibilities are catered for, but a Relational and SQL-compliant approach (eg. MS SQL Server) is taken as the best method, due to its 40-year establishment and maturity, and the absence of alternatives.
The collection of SQL Platforms; pretend "SQL" applications; and non-SQL suites, are labelled DataSource.
1.2 Compliance
This solution is 100% Relational: Codd's Relational Model, not the substandard alternatives marketed by the academics as "relational":
It can be implemented in any SQL compliant Platform
It has Relational Integrity (which is logical, beyond Referential Integrity, which is SQL and physical); Relational Power; and Relational Speed.
All update interaction is simple, via SQL ACID Transactions.
No warranties are given for pretend "SQLs" and non-SQLs.
2 Solution
2.1 Concept
I appreciate that as a developer, your focus is on the data values and how to retrieve them. However, two levels of definition are required first, in order to support the third, data level:
Catalogue Potential
Blue (Reference cluster).
The DataSources and definitions that are available in the market, which the organisation might use. Let's say 42, as per your descriptions.
I would entrust this only to a developer, not a user_admin, because the setup is critical (the lower levels depend on it), and it describes the physical capability and limitations of each DataSource.
Catalogue Actual
Green (Identification cluster).
The DataSources and definitions that are actually contracted and used by the organisation. Let's say 12. At this point we have connection addresses; ports; and users. It is constrained by CataloguePotential, directly, and via CHECKs that call Functions.
This level defines the content (the tables that actually exist); it contains no data values.
Maintaining an SQL mindset is the most prudent course, given that SQL is an established Standard with 40 years of maturity, and it gives us the most flexibility: the CatalogueActual forms the SQL Catalogue.
Likewise, I have used the terms Record and Field for the objects in the collective, rather than Table and Column, which would imply Relational and SQL meanings.
SQL Platform
This level can be populated automatically by the program querying the SQL Catalogue.
"SQL" applications and non-SQL suites
Population is manual due to the absence of a Catalogue. It can be done by a user_admin. The constraint would be your program attempting a trial query to validate the user-supplied table definition.
Current Data
Yellow (Transaction cluster)
The current data that the user has queried from the DataSources, via his Connection, for the webpage. I have assumed the user::webpage to be central and governing (one user per Connection; one user per webpage), not the OO Object.
If the OO Objects are not reliable (depends on the library you use), or there is one set of Objects across all user-webpages, more Constraints need to be added.
2.2 Method
You need:
Simple Hierarchy
a single-parent hierarchy to replicate the fixed levels of definition in the Catalogue in the SQL servers, as well as the variable levels in the constructed catalogue for the pretend "SQLs" and the non-SQLs.
Relational Hierarchies are fully defined, along with SQL implementation details, in the Hierarchy doc. The simple or single-parent model is given in [§ 2.2].
The Root level (not the Anchor) is the Potential DataSource
The Leaf level is that which contains data, either a Record or a Struct (for those in the collective that allow one).
In the Potential Datasource, it is representative, truly a RecordType and FieldType
In the Actual DataSource, it is an actual Record, which is an instance of RecordType, and actual Field, which is a narrower definition of FieldType.
Method/Struct
In order to handle a Struct, which in definition terms is identical to a Record, and to allow a Struct to contain a Struct, we need a level of abstraction, which is ...
Article
is either
a Field, which is the atomic unit of storage, xor
a Struct, which contains Articles
that requires an Exclusive Subtype cluster, fully defined along with SQL implementation details, in the Subtype doc
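The Subtype doc referred to above is not reproduced here, so the following is only a sketch of one common way to get "Field xor Struct" in plain SQL, not necessarily the construction that document uses. It swaps in a discriminator column that is carried into each subtype table through a composite foreign key. It runs on SQLite through Python's sqlite3 module; all table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Supertype: every Article declares exactly one kind.
CREATE TABLE article(
    article_id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL CHECK (kind IN ('field', 'struct')),
    name TEXT NOT NULL,
    parent_struct_id INTEGER REFERENCES struct(article_id),
    UNIQUE (article_id, kind));
-- Subtypes repeat the discriminator, so the composite key pins the kind.
CREATE TABLE field(
    article_id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL DEFAULT 'field' CHECK (kind = 'field'),
    datatype TEXT NOT NULL,
    FOREIGN KEY (article_id, kind) REFERENCES article(article_id, kind));
CREATE TABLE struct(
    article_id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL DEFAULT 'struct' CHECK (kind = 'struct'),
    FOREIGN KEY (article_id, kind) REFERENCES article(article_id, kind));
""")

# A struct 'address' that contains a field 'city'.
conn.execute("INSERT INTO article VALUES (1, 'struct', 'address', NULL)")
conn.execute("INSERT INTO struct(article_id) VALUES (1)")
conn.execute("INSERT INTO article VALUES (2, 'field', 'city', 1)")
conn.execute("INSERT INTO field(article_id, datatype) VALUES (2, 'str')")


def try_insert(sql):
    """Run one statement; report whether the constraints accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Article 1 is a struct, so it cannot also be registered as a field.
accepted = try_insert("INSERT INTO field(article_id, datatype) VALUES (1, 'int')")
print("struct accepted as field:", accepted)
```

This variant guarantees that an Article has at most one subtype row of the matching kind. Requiring that the subtype row actually exists needs an additional check, which is outside this sketch.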
Method/Array
To support an Array of Fields:
These are multi-valued dependencies on Field, thus implemented as child tables.
For scalars the NumElement is 1.
That makes the Exclusive Subtype cluster on Field that is otherwise required for scalars redundant.
2.3 Relational Data Model
This is the progress after seven iterations.  It shows the Table-Relation level (the Attribute level is too large for an inline graphic).
Assumption
That the JS (or whatever) objects are local to the webpage/user.  If your objects are global, the value tables need to be constrained to Connection.
The data model is given in a single PDF:
Table Relation level
Table Relation level + sample data
Table Attribute level + sample data.
2.4 Notation
All my data models are rendered in IDEF1X, available from the early 1980's, the one and only notation for Relational Data Modelling, the Standard since 1993.
The IDEF1X Introduction is essential reading for those who are new to Codd's Relational Model, or its modelling method. Note that IDEF1X models are complete: they are rich in detail and precision, showing all required details, whereas a home-grown model, being unaware of the imperatives of the Standard, has far less definition. This means the notation needs to be fully understood.
Here are three working SQLite-flavored implementations (since SQLite is being used, the lack of enforced column types is acceptable; only the integer primary keys are typed, so that they act as rowid):
In all cases, sqlite foreign key PRAGMA is set to true: PRAGMA foreign_keys = 1;
Simple implementation - one fixed table for each source/level (constrained by foreign keys)
The following design uses one table for each type of database and level. Tables reference each other with foreign keys to ensure correctness. For example, a mongo collection can't be a child of a mysql database. Only at the connection level do all database types share the same table, but that could be split too if different properties are expected for each kind of connection.
create table databasetype(name primary key) without rowid;
insert into databasetype values ('mysql'),('elasticsearch'),('mongo'),('sqlserver');
create table datatype(name primary key) without rowid;
insert into datatype values ('int'),('str'); -- you can differentiate varchar if you will
create table connection(id integer, hostname, databasetype, primary key(id), foreign key(databasetype) references databasetype(name));
create table mysqldatabase(id integer, connectionid, name, primary key(id), foreign key(connectionid) references connection(id));
create table mysqltable(id integer, databaseid, name, primary key(id), foreign key(databaseid) references mysqldatabase(id));
create table mysqlfield(id integer, tableid, name, datatype, datalength, primary key(id), foreign key(tableid) references mysqltable(id), foreign key(datatype) references datatype(name));
create table elasticsearchindex(id integer, connectionid, name, primary key(id), foreign key(connectionid) references connection(id));
create table elasticsearchfield(id integer, indexid, name, datatype, datalength, primary key(id), foreign key(indexid) references elasticsearchindex(id), foreign key(datatype) references datatype(name));
create table mongodatabase(id integer, connectionid, name, primary key(id), foreign key(connectionid) references connection(id));
create table mongocollection(id integer, databaseid, name, primary key(id), foreign key(databaseid) references mongodatabase(id));
create table mongofield(id integer, collectionid, name, datatype, datalength, primary key(id), foreign key(collectionid) references mongocollection(id), foreign key(datatype) references datatype(name));
create table sqlserverdatabase(id integer, connectionid, name, primary key(id), foreign key(connectionid) references connection(id));
create table sqlserverschema(id integer, databaseid, name, primary key(id), foreign key(databaseid) references sqlserverdatabase(id));
create table sqlservertable(id integer, schemaid, name, primary key(id), foreign key(schemaid) references sqlserverschema(id));
create table sqlserverfield(id integer, tableid, name, datatype, datalength, primary key(id), foreign key(tableid) references sqlservertable(id), foreign key(datatype) references datatype(name));
Loading data representing the first table:
insert into connection(hostname, databasetype) values ('remote:1234', 'mysql');
insert into mysqldatabase(connectionid, name) select id, 'sales' from connection where hostname='remote:1234';
insert into mysqltable(databaseid, name) select id, 'user' from mysqldatabase where name='sales';
insert into mysqlfield(tableid, name, datatype, datalength) select id, 'name', 'str', 80 from mysqltable where name='user';
insert into mysqlfield(tableid, name, datatype) select id, 'age', 'int' from mysqltable where name='user';
Trying invalid manipulations of data:
insert into mysqlfield(tableid, name, datatype) values (2, 'newfield', 'qubit');
-- Error: FOREIGN KEY constraint failed
In order to pretty-print the whole tree it is necessary to do a manual join of all tables involved.
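A minimal sketch of such a manual join for the mysql branch, run through Python's sqlite3 module. The table definitions are shortened versions of the ones above, and tree_lines is a hypothetical helper that prints each level once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
create table connection(id integer primary key, hostname, databasetype);
create table mysqldatabase(id integer primary key, connectionid references connection(id), name);
create table mysqltable(id integer primary key, databaseid references mysqldatabase(id), name);
create table mysqlfield(id integer primary key, tableid references mysqltable(id), name, datatype, datalength);
insert into connection(hostname, databasetype) values ('remote:1234', 'mysql');
insert into mysqldatabase(connectionid, name) values (1, 'sales');
insert into mysqltable(databaseid, name) values (1, 'user');
insert into mysqlfield(tableid, name, datatype, datalength)
    values (1, 'name', 'str', 80), (1, 'age', 'int', null);
""")

# One join per level: connection -> database -> table -> field.
rows = conn.execute("""
    select c.hostname, d.name, t.name, f.name, f.datatype
    from connection c
    join mysqldatabase d on d.connectionid = c.id
    join mysqltable t on t.databaseid = d.id
    join mysqlfield f on f.tableid = t.id
    order by c.hostname, d.name, t.name, f.id
""").fetchall()


def tree_lines(rows):
    """Turn the flat join result into indented lines, one per tree node."""
    lines, seen = [], set()
    for host, db, table, field, datatype in rows:
        for depth, key, label in (
            (0, (host,), host),
            (1, (host, db), db),
            (2, (host, db, table), table),
        ):
            if key not in seen:
                seen.add(key)
                lines.append("  " * depth + label)
        lines.append("  " * 3 + f"{field} ({datatype})")
    return lines


print("\n".join(tree_lines(rows)))
```

Because these are inner joins, databases without tables and tables without fields drop out of the result; left joins would be needed to show them. Every other source type needs its own query of this shape.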
Graph-like implementation - one table representing the tree, another the hierarchy (constrained by triggers)
Here the element table is used to represent each element/node in the tree. Its level column explicitly classifies each element as a database, table, etc. Here sqlite's rowid is used as the primary key, but it is easy to change it to a regular id.
In the previous implementation, foreign keys were used to ensure model correctness. Now triggers are used for this job. They decide which parent level accepts which child level, as allowed for the respective dbtype; those rules are specified in the element_type table.
Lastly, an extra table, element_property, is used to allow extra properties to be attached to any element, such as the field type.
create table db_type(name primary key) without rowid;
insert into db_type values ('mysql'),('elasticsearch'),('mongo'),('sqlserver');
create table element_type(parentlevel, childlevel, dbtype, primary key(parentlevel, childlevel, dbtype), foreign key(dbtype) references db_type(name)); --not using without rowid to be able to have null parent level
insert into element_type values
(null, 'connection', 'mysql'),
('connection', 'database', 'mysql'),
('database', 'table', 'mysql'),
('table', 'field', 'mysql'),
(null, 'connection', 'elasticsearch'),
('connection', 'index', 'elasticsearch'),
('index','field', 'elasticsearch'),
(null, 'connection', 'mongo'),
('connection', 'database', 'mongo'),
('database', 'collection', 'mongo'),
('collection', 'field', 'mongo'),
(null, 'connection', 'sqlserver'),
('connection', 'database', 'sqlserver'),
('database', 'schema', 'sqlserver'),
('schema', 'table', 'sqlserver'),
('table', 'field', 'sqlserver');
create table element(id integer, parentid, name, level, dbtype, primary key(id), foreign key(parentid) references element(id), foreign key(dbtype) references db_type(name));
create table element_property(parentid, name, value, primary key(parentid, name), foreign key(parentid) references element(id)) without rowid;
-- trigger to guarantee that new elements will conform to the hierarchy
-- ("is" instead of "=" so that a null parent level matches the root rules)
create trigger element_insert before insert on element
begin
select iif(count(*)>0, 'ok', raise(abort,'invalid parent-child insertion')) from element_type etc where etc.dbtype=new.dbtype and etc.childlevel=new.level and etc.parentlevel is (select level from element ei where ei.rowid=new.parentid);
end;
-- trigger to guarantee that updated elements will conform to the hierarchy
create trigger element_update before update on element
begin
select iif(count(*)>0, 'ok', raise(abort,'invalid parent-child update')) from element_type etc where etc.dbtype=new.dbtype and etc.childlevel=new.level and etc.parentlevel is (select level from element ei where ei.rowid=new.parentid);
end;
-- trigger to guarantee that hierarchy removal must respect existing elements (no delete cascade used)
create trigger element_type_delete before delete on element_type
begin
select iif(count(*)>0, raise(abort,'can''t remove, entries found in the element table using this relationship'), 'ok') from element etc join element etp on etp.rowid=etc.parentid and etp.dbtype=etc.dbtype where etc.dbtype=old.dbtype and (etp.level,etc.level)=(old.parentlevel, old.childlevel);
end;
-- trigger to guarantee that hierarchy changes must respect existing elements
create trigger element_type_update before update on element_type
begin
select iif(count(*)>0, raise(abort,'can''t change, entries found in the element table using this relationship'), 'ok') from element etc join element etp on etp.rowid=etc.parentid and etp.dbtype=etc.dbtype where etc.dbtype=old.dbtype and (etp.level,etc.level)=(old.parentlevel, old.childlevel) and (etp.level,etc.level)!=(new.parentlevel, new.childlevel);
end;
Loading data representing the first table:
insert into element(name, level, dbtype) values ('remote:1234', 'connection', 'mysql');
insert into element(name, level, dbtype, parentid) values ('sales', 'database', 'mysql', (select id from element where (level, name, dbtype)=('connection', 'remote:1234', 'mysql')));
insert into element(name, level, dbtype, parentid) values ('user', 'table', 'mysql', (select id from element where (level, name, dbtype)=('database', 'sales', 'mysql')));
insert into element(name, level, dbtype, parentid) values ('name', 'field', 'mysql', (select id from element where (level, name, dbtype)=('table', 'user', 'mysql')));
insert into element(name, level, dbtype, parentid) values ('age', 'field', 'mysql', (select id from element where (level, name, dbtype)=('table', 'user', 'mysql')));
insert into element_property(name, value, parentid) values ('fieldtype', 'varchar', (select id from element where (level, name, dbtype)=('field', 'name', 'mysql')));
insert into element_property(name, value, parentid) values ('fieldlength', 80, (select id from element where (level, name, dbtype)=('field', 'name', 'mysql')));
insert into element_property(name, value, parentid) values ('fieldtype', 'integer', (select id from element where (level, name, dbtype)=('field', 'age', 'mysql')));
Trying invalid manipulations of data:
insert into element(name, level, dbtype, parentid) values ('documents', 'collection', 'mysql', (select id from element where (level, name, dbtype)=('database', 'sales', 'mysql')));
-- Error: invalid parent-child insertion
update element_type set childlevel='specialfield' where dbtype='mysql' and (parentlevel, childlevel)=('table','field');
-- Error: can't change, entries found in the element table using this relationship
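The rule that the element triggers are meant to enforce can also be sketched as a plain lookup against element_type, which makes it easy to try out combinations. The snippet below uses Python's sqlite3 module with a subset of the rules above; is_allowed is a hypothetical helper, not part of the answer's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table element_type(parentlevel, childlevel, dbtype);
insert into element_type values
    (null, 'connection', 'mysql'),
    ('connection', 'database', 'mysql'),
    ('database', 'table', 'mysql'),
    ('table', 'field', 'mysql'),
    (null, 'connection', 'mongo'),
    ('connection', 'database', 'mongo'),
    ('database', 'collection', 'mongo'),
    ('collection', 'field', 'mongo');
""")


def is_allowed(dbtype, parentlevel, childlevel):
    """True when element_type lists this parent/child pair for dbtype.

    `is` (rather than `=`) lets a None parentlevel match the NULL stored
    for root rules, because `NULL = NULL` is not true in SQL.
    """
    row = conn.execute(
        "select count(*) from element_type "
        "where dbtype = ? and childlevel = ? and parentlevel is ?",
        (dbtype, childlevel, parentlevel),
    ).fetchone()
    return row[0] > 0


print(is_allowed("mysql", None, "connection"))        # root rule
print(is_allowed("mysql", "database", "collection"))  # the rejected insert
```

The second call corresponds to the rejected 'documents' insert: a collection under a mysql database is not listed, while the same pair is listed for mongo.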
Pretty-printing the tree:
create view elementree(path) as
with recursive cte(id, name, depth, dbtype, level) as (
select id, name, 0 as depth, dbtype, level from element where parentid is null
union all
select el.id, el.name, cte.depth+1 as depth, el.dbtype, el.level from element el join cte on el.parentid=cte.id
order by depth desc
)
select substr('                    ',1,2*depth)||name||' ('||dbtype||'-'||level||')' from cte;
select * from elementree;
-- remote:1234 (mysql-connection)
-- sales (mysql-database)
-- user (mysql-table)
-- documents (mysql-table)
-- name (mysql-field)
-- age (mysql-field)
Minimalist DRY graph-like implementation - one table with only names representing the tree and only one auxiliary table
Here again an element table is used to represent each element in the tree. Unlike the previous case, the table holds less information, and the type of each element (whether it is a database or a table) is inferred implicitly instead of being set explicitly in a column. By simply adding user as a child of sales, it is inferred that user is a mysql table, since it is a child of a mysql database, sales, which is a database because it is a child of a mysql connection, which in turn is a child of the mysql root element. Dbtypes are the root elements in this tree; all their descendants are inferred to be of that dbtype.
Here the hierarchypath table tells which hierarchy has to be followed in the element tree. For the user's comfort, they only have to insert a (> separated) string representing the hierarchy path, starting from the dbtype. The hierarchy view deconstructs this string into the hierarchy structure. One example of a hierarchy path would be: mysql>connection>database>table>field.
Note that, again, sqlite's rowid is used as the table id. Remember that rowid is hidden by default: select * from table; does not show it, so it has to be selected explicitly: select rowid,* from table;.
create table element(name, parentrowid, foreign key(parentrowid) references element(rowid));
-- dbtypes are the root elements
insert into element(name) values ('mysql'),('elasticsearch'),('mongo'),('sqlserver');
create table hierarchypath(path);
insert into hierarchypath values
('mysql>connection>database>table>field'),
('elasticsearch>connection>index>field'),
('mongo>connection>database>collection>field'),
('sqlserver>connection>database>schema>table>field');
Loading data:
insert into element select 'remote:1234',rowid from element where (name,coalesce(parentrowid,-1))=('mysql',-1); --returning rowid; -- returning only works for sqlite 3.35+
insert into element select 'sales',rowid from element where rowid=5;
insert into element select 'user',rowid from element where rowid=6;
insert into element select 'name',rowid from element where rowid=7;
insert into element select 'age',rowid from element where rowid=7;
Pretty-printing:
create view hierarchy(root, depth, name) as
with recursive hierarchycte(root, depth, name, remaining) as (
select substr(path, 0, instr(path, '>')) as root, 0 as depth, substr(path, 0, instr(path, '>')) as name, substr(path, instr(path, '>')+1)||'>' as remaining from hierarchypath
union all
select root, depth+1 as depth, substr(remaining, 0, instr(remaining, '>')) as name, substr(remaining, instr(remaining, '>')+1) as remaining from hierarchycte where instr(remaining, '>') > 0
)
select root, depth, name from hierarchycte where depth>=0;
create view elementhierarchy(root, depth, name) as
with recursive elementcte(root, depth, name, rowid, parentrowid) as (
select name as root, 0 as depth, name, rowid, parentrowid from element where parentrowid is null
union all
select elcte.root, elcte.depth+1, el.name, el.rowid, el.parentrowid from elementcte elcte join element el on el.parentrowid=elcte.rowid
order by depth desc
)
select root, depth, name from elementcte;
create view elementree as
with recursive elementcte(root, depth, name, rowid, parentrowid) as (
select name as root, 0 as depth, name, rowid, parentrowid from element where parentrowid is null
union all
select elcte.root, elcte.depth+1, el.name, el.rowid, el.parentrowid from elementcte elcte join element el on el.parentrowid=elcte.rowid
order by depth desc
)
select substr('                    ',1,2*h.depth-2)||eh.name||' ('||h.root||'-'||h.name||')' from (select *,row_number() over () as originalorder from elementhierarchy) eh join hierarchy h on (eh.root,eh.depth)=(h.root,h.depth) where h.depth>0 order by originalorder;
select * from elementree;
-- remote:1234 (mysql-connection)
-- sales (mysql-database)
-- user (mysql-table)
-- age (mysql-field)
-- name (mysql-field)
Triggers were not implemented here, but it would be good to do so. One example would be to avoid inserting more levels than allowed.
It would be wiser to store the hierarchy in the deconstructed form seen in the hierarchy view, by doing the deconstruction at insertion time instead of in every select query, to save CPU. Here it was left this way to differentiate it more from the other implementations.
Here the last-level entity, the field, has no properties, unlike in the previous implementations. In this model it would be necessary to add one or two extra levels to the hierarchy: ...table>field>fieldpropertyandvalue or ...table>field>fieldproperty>fieldpropertyvalue. In the first case an example of fieldpropertyandvalue would be datatype=integer; in the second, the separate property and value would be datatype and integer respectively. This approach, where every property is a new node in the graph, is closer to the approach used by RDF stores.
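The suggestion to deconstruct the path at insertion time could look roughly like the sketch below, which splits the string once in application code and stores one row per level, in the same (root, depth, name) shape that the hierarchy view produces. It uses Python's sqlite3 module; the hierarchylevel table and both helper functions are hypothetical names introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table hierarchylevel(root, depth integer, name, primary key(root, depth))"
)


def add_hierarchy_path(path):
    """Split 'root>level1>level2...' once and store one row per level."""
    names = path.split(">")
    root = names[0]
    conn.executemany(
        "insert into hierarchylevel(root, depth, name) values (?, ?, ?)",
        [(root, depth, name) for depth, name in enumerate(names)],
    )


def max_depth(root):
    """Deepest level defined for this root; a depth-limit check could use it."""
    return conn.execute(
        "select max(depth) from hierarchylevel where root = ?", (root,)
    ).fetchone()[0]


add_hierarchy_path("mysql>connection>database>table>field")
add_hierarchy_path("elasticsearch>connection>index>field")

mysql_levels = conn.execute(
    "select depth, name from hierarchylevel where root = 'mysql' order by depth"
).fetchall()
print(mysql_levels)
print(max_depth("elasticsearch"))
```

With the levels stored this way, a trigger or application check that rejects elements deeper than max_depth for their root becomes a simple comparison rather than string parsing.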
To conclude, it would be possible to use specialized graph databases with their own query languages, such as Cypher in Neo4j and SPARQL in others, but since the graph design is simple overall, a relational database suffices for our needs.

How to implement relationships with inherits at the parent table to the children tables

I'm trying to implement a database that has inheritance between some tables. There are three tables involved in the problem: Customers, Users and Addresses (actually there are more tables involved, but with the same problem, so..).
The Customers table inherits from the Users table, and the Users table has a relationship with the Addresses table (1 to many, respectively).
So my problem is that I want the 'Customers' table to have the same relationship that 'Users' has with 'Addresses', because Customers inherits from it. I also tried to insert data into 'Addresses' with an ID from 'Customers', but this gives a foreign key constraint violation error: the value doesn't exist in table "myDb.users".
This is an image of my model:
(Note: I'm actually using PostgreSQL; I'm only using ADO.NET for the modeling. I know a way to get around this, but if there is no way to do it with inheritance I will change the entire DB to a fully relational design.)
I assume that you're using PostgreSQL table inheritance which, unfortunately, doesn't work quite as we would expect. In particular, although records from child tables appear in selects from the parent table, they are not physically stored there, and thus their ids can't be used in foreign keys referencing the parent table.
You may consider implementing inheritance using the classic approach:
CREATE TABLE Users(id INT PRIMARY KEY, user_property INT);
CREATE TABLE Customers(id INT PRIMARY KEY REFERENCES Users, customer_property INT);
CREATE TABLE Addresses(user_id INT REFERENCES Users, address TEXT);
This way you physically store properties of Customer in two tables, and you are sure that for every Customer there is a record in Users table which can be referenced from other tables.
-- inserting customer with id=1, user_property=10, customer_property=20
INSERT INTO Users(id, user_property) VALUES (1, 10);
INSERT INTO Customers(id, customer_property) VALUES (1, 20);
-- Inserting address
INSERT INTO Addresses(user_id, address) VALUES (1, 'Wall Street');
The drawback is that you need to join Users and Customers if you want to get all properties of a single customer from both tables:
-- All customer properties
SELECT * FROM Customers JOIN Users USING(id) WHERE Customers.id=1;
-- Customer and address
SELECT * FROM Customers JOIN Users USING(id) JOIN Addresses ON Users.id=Addresses.user_id WHERE Customers.id=1;
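As a quick check of how the shared-key layout behaves, here is a sketch that runs the same three tables on SQLite through Python's sqlite3 module (SQLite standing in for PostgreSQL, which is an assumption of the example). add_customer is a hypothetical helper that writes both rows in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users(id INTEGER PRIMARY KEY, user_property INT);
CREATE TABLE Customers(id INTEGER PRIMARY KEY REFERENCES Users(id), customer_property INT);
CREATE TABLE Addresses(user_id INT REFERENCES Users(id), address TEXT);
""")


def add_customer(user_id, user_property, customer_property):
    """Insert the Users row first, then the Customers row that shares its id."""
    with conn:  # one transaction: both rows or neither
        conn.execute("INSERT INTO Users VALUES (?, ?)", (user_id, user_property))
        conn.execute("INSERT INTO Customers VALUES (?, ?)", (user_id, customer_property))


add_customer(1, 10, 20)
# The customer's id exists in Users, so the address is accepted.
conn.execute("INSERT INTO Addresses VALUES (1, 'Wall Street')")

# An id that is not in Users is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Addresses VALUES (99, 'Nowhere')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

customer = conn.execute(
    "SELECT Users.id, user_property, customer_property, address "
    "FROM Customers JOIN Users USING(id) "
    "JOIN Addresses ON Users.id = Addresses.user_id WHERE Customers.id = 1"
).fetchone()
print(customer, orphan_rejected)
```

Wrapping the two inserts in one transaction keeps a Customers row from ever existing without its Users row, which is the guarantee the foreign key from Addresses relies on.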

SQL Server Database Design - multiple columns in one table reference single column is another table

I'm designing my first SQL Server database and wonder if there's a better way to accomplish what I'm trying to do.
The goal is to be able to create one of 14 documents based on 200+ different document sections (titles, headings, paragraphs, lists, etc). Each document section is part of 1 or more documents.
My application does a single database lookup for a particular document and retrieves the data stored in the 50 text fields.
To do this, I first stored each unique document section in a "sections" table, giving each section a unique identifier (sectionID) and made the identifier a primary key, for example:
dbo.sections
sectionID(pk)  sectionText
iv1            this is the text for the first section
AHv1           this text is for another section
APv2           more text to include
...
EFv3           another text section
GHv2           this is the last section text in the table
I then created a second table called "documents" to store each document name and the sections that belong to it. There are 51 columns in this table. The first column is the document name and the other 50 columns store the ids of the sections (they're named section1, section2, ...) that make up that particular document. Each of the section columns is a foreign key that references the primary key in the "sections" table, for example:
dbo.documents
docID    section1(fk)  section2(fk)  ...  section50(fk)
option1  iv1           AHv1          ...  GHv2
option2  iv1           APv2          ...  EFv3
All of this seems straightforward to me. However, in order to get the text for each document to be part of a given record, I have to create a view that does 50 joins of the sections table. By doing that, each document id and its text are stored in one row of a table.
Is there a better way to get the same end result? Or a better design? It seems like there may be a lot of overhead to join the data between tables.
Any input would be greatly appreciated!
Let's say you have one table, document, with a one-to-many relationship with a second table, documentSection. Document has a PK field documentID, documentSection's PK is compound, documentID and sectionID, so when the two tables are joined, it's only on the documentID field. Then you won't need one column for each section.
Actually, it sounds like you have all of the document section text stored in your section table, which can be used in multiple documents. Maintenance nightmares aside, you can have Section be your primary table and sectionDocument have the one-to-many relationship, but you may need to introduce a sectionSequence field to keep the sections of your document in sequence. You'll actually need the sequence field regardless of which table is primary.
Regarding your comment: let's say you have a section table with a PK field sectionID. Then you can have a sectionDocument table with a compound PK, sectionID and documentID, which will probably need to include a sequence number. You're currently using the ordinal position of the column to identify the sequence of the section in the document, but as you say, you don't want 50 relationships to the section table. The way to handle that is to have the sections defined vertically instead of horizontally: in rows instead of columns. You can also have a document table with PK documentID and the document name.
Building on (and maybe clarifying) what Beth is talking about, you might consider a three-table approach. The lords-of-data generally refer to normalization rules or normal forms to describe patterns in your data that result in great flexibility and performance.
At first blush, these rules seem to spread your data out, but it's very worthwhile learning about these patterns. You don't have to worry about your database "joining a lot", as this is what relational databases are really good at, and normalized databases are really easy to join up.
For example, in order to select all the section texts in order for a given document, you would do something like this:
select
s.SectionText
from
Documents d
inner join
DocumentSections ds
on
d.DocumentId=ds.DocumentId
inner join
Sections s
on
ds.SectionId = s.SectionId
where
d.DocumentId = 'MyDoc'
order by
ds.Position
Basically, this converts your 50 columns in documents to an unlimited number of rows in DocumentSections.
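To make the rows-instead-of-columns idea concrete, here is a minimal sketch that runs the same join against an in-memory SQLite database from Python. SQLite is only a stand-in for whatever engine you actually use, and the sample rows and the section_texts helper are made up for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the three-table design (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sections (
        SectionId   TEXT PRIMARY KEY,
        SectionText TEXT NOT NULL
    );
    CREATE TABLE Documents (
        DocumentId TEXT PRIMARY KEY,
        Name       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE DocumentSections (
        DocumentId TEXT NOT NULL REFERENCES Documents(DocumentId),
        SectionId  TEXT NOT NULL REFERENCES Sections(SectionId),
        Position   INTEGER NOT NULL,
        PRIMARY KEY (DocumentId, SectionId),
        UNIQUE (DocumentId, Position)
    );
""")

# Hypothetical sample data: three reusable sections and one document.
conn.executemany(
    "INSERT INTO Sections VALUES (?, ?)",
    [("intro", "Introduction text"), ("body", "Body text"), ("legal", "Legal notice")],
)
conn.execute("INSERT INTO Documents VALUES ('MyDoc', 'My document')")
# Rows are inserted out of order on purpose; Position decides the final order.
conn.executemany(
    "INSERT INTO DocumentSections VALUES (?, ?, ?)",
    [("MyDoc", "legal", 3), ("MyDoc", "intro", 1), ("MyDoc", "body", 2)],
)


def section_texts(conn, document_id):
    """Return the section texts of one document, ordered by Position."""
    rows = conn.execute(
        """
        SELECT s.SectionText
        FROM Documents d
        JOIN DocumentSections ds ON d.DocumentId = ds.DocumentId
        JOIN Sections s ON ds.SectionId = s.SectionId
        WHERE d.DocumentId = ?
        ORDER BY ds.Position
        """,
        (document_id,),
    ).fetchall()
    return [row[0] for row in rows]


texts = section_texts(conn, "MyDoc")
print(texts)
```

Adding a fourth or fiftieth section to a document is then just another row in DocumentSections, with no schema change.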
Here's how you'd define such a system in SQL Server:
create table dbo.Sections
(
SectionId
nvarchar(8) not null
constraint [Sections.SectionId.PrimaryKey]
primary key clustered,
SectionText
nvarchar( max ) not null
)
create table dbo.Documents
(
DocumentId
nvarchar(8) not null
constraint [Documents.DocumentId.PrimaryKey]
primary key clustered,
Name
nvarchar( 255 ) not null
constraint [Documents.Name.Unique]
unique nonclustered
)
create table dbo.DocumentSections
(
DocumentId
nvarchar(8) not null
constraint [DocumentSections.to.Documents]
foreign key references dbo.Documents( DocumentId )
on delete cascade,
SectionId
nvarchar(8) not null
constraint [DocumentSections.to.Sections]
foreign key references dbo.Sections( SectionId )
on delete cascade,
Position
int not null,
constraint [DocumentSections.DocumentId.SectionId.PrimaryKey]
primary key clustered( DocumentId, SectionId ),
constraint [DocumentSections.DocumentId.Position.Unique]
unique ( DocumentId, Position )
)
There are a couple of things worth noting:
In this code, if you delete a row from Documents, the DocumentSections rows also go away (but not the Sections that were used in the Documents row). Likewise, if you delete a Sections row, the DocumentSections rows for that deleted Sections row go away, leaving the Documents unmolested. This is done with the on delete cascade clauses in the foreign key constraints. They're totally optional, but I showed it just for fun. This is often very handy.
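I added a restriction (again optional) that prevents a Section from being used more than once in a Document: that's the primary key on ( DocumentId, SectionId ). If that's not what you want, you can just remove that whole constraint, although you'd then probably want some other primary key, such as ( DocumentId, Position ).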
I picked nvarchar(8) for the size of the key fields - for no particular reason. If you make these bigger, be sure to increase the width in the referring tables, too.
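If you want to see the cascade and uniqueness behaviour without a SQL Server instance, the following is a rough sketch using Python's sqlite3 module. It assumes SQLite's ON DELETE CASCADE behaves like SQL Server's for this simple case, the sample keys (D1, D2, a, b) are invented, and note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign keys unless this pragma is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Sections (
        SectionId   TEXT PRIMARY KEY,
        SectionText TEXT NOT NULL
    );
    CREATE TABLE Documents (
        DocumentId TEXT PRIMARY KEY,
        Name       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE DocumentSections (
        DocumentId TEXT NOT NULL
            REFERENCES Documents(DocumentId) ON DELETE CASCADE,
        SectionId  TEXT NOT NULL
            REFERENCES Sections(SectionId) ON DELETE CASCADE,
        Position   INTEGER NOT NULL,
        PRIMARY KEY (DocumentId, SectionId),
        UNIQUE (DocumentId, Position)
    );
    -- Hypothetical sample data.
    INSERT INTO Sections VALUES ('a', 'Section A'), ('b', 'Section B');
    INSERT INTO Documents VALUES ('D1', 'First'), ('D2', 'Second');
    INSERT INTO DocumentSections VALUES ('D1', 'a', 1), ('D1', 'b', 2), ('D2', 'a', 1);
""")


def count(table):
    """Count rows in one of the three tables (table name is trusted here)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Deleting a document removes its link rows but leaves the sections alone.
conn.execute("DELETE FROM Documents WHERE DocumentId = 'D1'")
links_after_doc_delete = count("DocumentSections")
sections_after_doc_delete = count("Sections")

# The primary key rejects using the same section twice in one document.
try:
    conn.execute("INSERT INTO DocumentSections VALUES ('D2', 'a', 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting a section removes its link rows but leaves the documents alone.
conn.execute("DELETE FROM Sections WHERE SectionId = 'a'")
links_after_section_delete = count("DocumentSections")
documents_after_section_delete = count("Documents")

print(links_after_doc_delete, sections_after_doc_delete, duplicate_rejected,
      links_after_section_delete, documents_after_section_delete)
```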

How to design a Tag table from other tables?

I'm designing software that will use tags to identify posts, similar to StackOverflow but with some differences.
The tags must be loaded dynamically (like here), but they come from several different tables because they will be used for identification.
Example: the tag Brazil identifies a country in the country table, and the tag Monday identifies a day in the weekday table.
I need an idea of how to design this in the database: how to load the tags from all the tables while still identifying the table each tag belongs to.
This might do what you want:
CREATE TABLE countries (
name VARCHAR PRIMARY KEY,
...
);
CREATE TABLE weekdays (
name VARCHAR PRIMARY KEY,
...
);
CREATE VIEW tags AS
(SELECT name AS tag, 'countries' AS source
FROM countries)
UNION ALL
(SELECT name AS tag, 'weekdays' AS source
FROM weekdays)
UNION ALL ...;
Then you can make additional tables and add them to the view. When you tag some other table, you'll treat the name and source of the tag as the primary key and refer to this view, like so:
CREATE TABLE foo (
id SERIAL PRIMARY KEY,
...
);
CREATE TABLE foo_tags (
foo_id INTEGER REFERENCES foo,
tag_name VARCHAR,
tag_source VARCHAR
);
Unfortunately, it isn't possible to define a foreign key from the table foo_tags to the view tags defined above.
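Since the database can't enforce that reference, one option is to check the view in application code before inserting. The sketch below tries this with Python's sqlite3 module as a stand-in database. The add_tag helper, the primary key on foo_tags, and the sample rows are assumptions for illustration, and the elided columns from the tables above are simply left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (name TEXT PRIMARY KEY);
    CREATE TABLE weekdays  (name TEXT PRIMARY KEY);

    -- One row per tag, labelled with the table it comes from.
    CREATE VIEW tags AS
        SELECT name AS tag, 'countries' AS source FROM countries
        UNION ALL
        SELECT name AS tag, 'weekdays' AS source FROM weekdays;

    CREATE TABLE foo (id INTEGER PRIMARY KEY);
    CREATE TABLE foo_tags (
        foo_id     INTEGER REFERENCES foo(id),
        tag_name   TEXT,
        tag_source TEXT,
        PRIMARY KEY (foo_id, tag_name, tag_source)  -- assumed, not in the original
    );

    -- Hypothetical sample data.
    INSERT INTO countries VALUES ('Brazil');
    INSERT INTO weekdays VALUES ('Monday');
    INSERT INTO foo VALUES (1);
""")


def add_tag(conn, foo_id, tag_name, tag_source):
    """Attach a tag to a foo row only if (tag, source) exists in the tags view.

    This application-level check stands in for the foreign key that cannot
    be declared against a view. Returns True if the tag was attached.
    """
    exists = conn.execute(
        "SELECT 1 FROM tags WHERE tag = ? AND source = ?",
        (tag_name, tag_source),
    ).fetchone()
    if exists is None:
        return False
    conn.execute(
        "INSERT INTO foo_tags VALUES (?, ?, ?)",
        (foo_id, tag_name, tag_source),
    )
    return True


ok = add_tag(conn, 1, "Brazil", "countries")   # tag exists in that source
bad = add_tag(conn, 1, "Brazil", "weekdays")   # wrong source, so it is refused
all_tags = conn.execute("SELECT tag, source FROM tags ORDER BY tag").fetchall()
print(ok, bad, all_tags)
```

A check like this is not atomic with later deletes from countries or weekdays, so a tag can still become orphaned; triggers on the source tables would be one way to tighten that, at the cost of more moving parts.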

Defining View in Entity Data Model

I am trying to add a view to an entity data model but I get the error below.
The view is a GROUP BY with a count. I don’t understand this, because a view does not have a primary key by its nature.
I modified the original post because I figured out how to add a key to the view. But I still have the same problem.
warning 6013: The table/view
'fmcsa.dbo.vieFMCSADocumentCount' does
not have a primary key defined and no
valid primary key could be inferred.
This table/view has been excluded. To
use the entity, you will need to
review your schema, add the correct
keys, and uncomment it.
Here is the View
CREATE VIEW [dbo].[vieFMCSADocumentCount] with SCHEMABINDING
AS
SELECT COUNT_BIG(*) AS CountOfDocs, ROLE_ID, OWNER_ID
FROM dbo.FMCSA_DOCUMENT
GROUP BY ROLE_ID, OWNER_ID
Then I can add a key:
CREATE UNIQUE CLUSTERED INDEX [MainIndex] ON [dbo].[vieFMCSADocumentCount]
(
[OWNER_ID] ASC,
[ROLE_ID] ASC
)
Still not working.
You didn't specify, but I'm assuming you're using EF4. I've come across this before--you want to either define a key manually or recreate your view WITH SCHEMABINDING and reimport it.
Schema binding effectively tells SQL to track dependencies for your view. It's both a blessing and a curse (try adding a column to FMCSA_DOCUMENT once this view has schema binding), so you might want to read up on the effects.
CREATE VIEW [dbo].[vieFMCSADocumentCount] WITH SCHEMABINDING
AS
SELECT COUNT(ID) AS CountOfDocs, ROLE_ID, OWNER_ID
FROM dbo.FMCSA_DOCUMENT GROUP BY ROLE_ID, OWNER_ID
Alternatively, in the EF Model Browser, go to the Entity Types folder and find your view (right-click and choose Show in Designer). Then, on the view, highlight the column(s) that make up your primary key, right-click, and choose "Entity Key".

Resources