How to best represent items with variable # of attributes in a database? - database

Let's say you want to create a listing of widgets.
The widget manufacturers all create widgets with different numbers and types of attributes, and the widget sellers all have different preferences on what type and number of attributes they want to store in the database and display.
The problem is that each time you add a new widget, it may have attributes that do not currently exist for any other widget. Currently you handle this by modifying the table to add a new column for that attribute, and then modifying all forms and reports to reflect the change.
How do you go about creating a database that takes into account that the attributes on a widget are fluid and can change from widget to widget?
Ideally, the widget attributes should be something the user can define according to his/her preferences and needs.

I would have a table for widgets and one for widget attributes. For example:
Widgets
- Id
- Name
WidgetAttributes
- Id
- Name
Then, you would have another table that records which widgets have which attributes:
WidgetAttributeMap
- Id
- WidgetId (a value from the Id column in the Widgets table)
- WidgetAttributeId (a value from the Id column in the WidgetAttributes table)
This way, you can add attributes to widgets by adding or modifying rows in the WidgetAttributeMap table, not by modifying the structure of your Widgets table.

casperOne is showing the way, although I would personally add yet one more table for the attribute values, ending up with
Widgets
-WidgetID (pk)
-Name
WidgetAttributes
-AttributeID (pk)
-Name
WidgetHasAttribute
-WidgetID (pk)
-AttributeID (pk)
WidgetAttributeValues
-ValueID (pk)
-WidgetID
-AttributeID
-Value
In order to retrieve the results, you want to join the tables and perform an aggregate concatenation, so you can end up with data looking like (for example):
Name Properties
Widget1 Attr1:Value1;Attr2:Value2;...etc
Then you could split the Properties string in your Business Logic Layer and use as you wish.
A suggestion on how to join the data:
SELECT w.Name, wa.Name + ':' + wav.Value
FROM ((Widgets w
    INNER JOIN WidgetHasAttribute wha
        ON w.WidgetID = wha.WidgetID)
    INNER JOIN WidgetAttributes wa
        ON wha.AttributeID = wa.AttributeID)
    INNER JOIN WidgetAttributeValues wav
        ON (w.WidgetID = wav.WidgetID AND wa.AttributeID = wav.AttributeID)
You can read more on aggregate concatenation here.
As far as performance is concerned, it shouldn't be a problem as long as you make sure to index all columns that will be read frequently, that is:
- All the ID columns, as they will be compared in the join clauses
- WidgetAttributes.Name and WidgetAttributeValues.Value, as they will be concatenated
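To make the schema above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows ('Color', 'Weight', and so on) are invented for illustration, the WidgetHasAttribute table is left out for brevity, and the grouping is done in Python rather than with an aggregate concatenation in SQL, so treat it as one possible way to read the data back, not the only one.

```python
import sqlite3

# Hypothetical in-memory database; table and column names follow the answer above,
# the sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Widgets (WidgetID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE WidgetAttributes (AttributeID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE WidgetAttributeValues (
    ValueID INTEGER PRIMARY KEY,
    WidgetID INTEGER NOT NULL REFERENCES Widgets(WidgetID),
    AttributeID INTEGER NOT NULL REFERENCES WidgetAttributes(AttributeID),
    Value TEXT
);
INSERT INTO Widgets VALUES (1, 'Widget1'), (2, 'Widget2');
INSERT INTO WidgetAttributes VALUES (1, 'Color'), (2, 'Weight');
INSERT INTO WidgetAttributeValues (WidgetID, AttributeID, Value) VALUES
    (1, 1, 'red'), (1, 2, '3kg'), (2, 1, 'blue');
""")


def widget_properties(conn):
    """Return {widget name: {attribute name: value}} from one joined query."""
    rows = conn.execute("""
        SELECT w.Name, wa.Name, wav.Value
        FROM Widgets w
        JOIN WidgetAttributeValues wav ON wav.WidgetID = w.WidgetID
        JOIN WidgetAttributes wa ON wa.AttributeID = wav.AttributeID
        ORDER BY w.WidgetID, wa.AttributeID
    """).fetchall()
    result = {}
    for widget, attribute, value in rows:
        # Group the one-row-per-attribute result by widget.
        result.setdefault(widget, {})[attribute] = value
    return result


props = widget_properties(conn)
```

Adding a brand-new attribute is then just an INSERT into WidgetAttributes plus rows in WidgetAttributeValues; no ALTER TABLE is needed.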

Related

Query of records based on relationships

I have an object Contract that has a lookup to another object, Indexationtype. I have another object, IndexationEntry, that has a master-detail relationship to Indexationtype. Now I would like to get the value of the percentage field in IndexationEntry onto Contract based on the year fields: the year in IndexationEntry matches the year in Contract. How should I achieve this?
From Contract "up" to IndexationType__c, then "down" to IndexationEntry__c?
If there's no direct link between them it's not going to be pretty. One way would be something like this
SELECT Id, Name,
(SELECT Id, ContractNumber FROM Contracts__r WHERE Year__c = '2021'),
(SELECT Id, Percent__c FROM IndexationEntries__r WHERE Year__c = '2021')
FROM IndexationType__c
You'd have to run it once for each year. Or (since you tagged it Apex) maybe you can prepare the reference data a bit, query Indexation Types + Entries and build something like Map<Id, Map<Integer, IndexationEntry__c>> (1st key is by Indexation Type Id, then by year). Query them, populate the Map, then loop through your contracts and use map.get() to fetch your values.
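The nested-map idea is not specific to Apex, so here is a language-neutral sketch of it in Python. The records are plain dictionaries with invented field names and values standing in for the queried IndexationEntry__c and Contract rows; only the two-level lookup (type Id first, then year) is the point.

```python
from collections import defaultdict

# Hypothetical stand-ins for the queried records (field names are illustrative).
entries = [
    {"indexation_type_id": "T1", "year": 2020, "percent": 1.5},
    {"indexation_type_id": "T1", "year": 2021, "percent": 2.0},
    {"indexation_type_id": "T2", "year": 2021, "percent": 3.25},
]
contracts = [
    {"number": "C-001", "indexation_type_id": "T1", "year": 2021},
    {"number": "C-002", "indexation_type_id": "T2", "year": 2020},
]

# First key: indexation type id; second key: year.
by_type_and_year = defaultdict(dict)
for entry in entries:
    by_type_and_year[entry["indexation_type_id"]][entry["year"]] = entry


def percent_for(contract):
    """Look up the matching entry's percentage, or None if there is no entry."""
    per_year = by_type_and_year.get(contract["indexation_type_id"], {})
    entry = per_year.get(contract["year"])
    return entry["percent"] if entry else None


percents = {c["number"]: percent_for(c) for c in contracts}
```

This way the reference data is read once and every contract is resolved with two dictionary lookups instead of one query per year.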

Access Form doesn't display the add new record (empty record) - Recordsource is a View

In my forms I usually have an empty line at the end of the records with which I can add a new record. In one of my forms, though, I don't have this empty line.
After reading about similar issues, I think it probably has something to do with the record source of my form being a view (which I can't edit or add records to).
Some background:
My Access application has linked tables and views from my SQL Server. The old project was an ADP project, and in an ADP project you could always set a unique table for a view so that you could add records to it.
Sadly, Access 2013 doesn't have this feature, and I fixed this (or at least I thought I did) by linking the views in the same way as the tables. For the views that had to be editable (add records to them), I set primary keys on the view fields (these primary keys were the same as the primary keys of the unique table, the table in which records are added/edited through the view).
This seems to work, since when I open the view directly in Access I can add records to it, and these records also get added to the connected table (the unique table).
But as I said before, in the form I can't add records (neither manually with the empty line nor with an add-new-record button), while I can do it directly in the view.
The following Stack Overflow question, "Microsoft Access form - cannot add new record", made me wonder whether the list provided in the answer (http://allenbrowne.com/ser-61.html) still applies when you can add records directly in the view. If it does, the problem probably has something to do with my view (since it contains a GROUP BY, which is also listed as making a query read-only). If so, what is the problem in my view: is it only the GROUP BY, or is it something else as well?
The view in question:
SELECT dbo.tblInkreg.becode, dbo.tblInkreg.ionummer, dbo.tblInkreg.iovolgnr, dbo.tblInkreg.ioregel, dbo.tblInkreg.arcode, dbo.tblInkreg.eenhedenbesteld, dbo.tblInkreg.eenheidbesteld, dbo.tblInkreg.aantalbesteld,
dbo.tblInkreg.eenhedengeleverd, dbo.tblInkreg.eenheidgeleverd, dbo.tblInkreg.aantalgeleverd, dbo.tblInkreg.crArtNr, dbo.tblInkreg.aromschrijving, dbo.tblInkreg.stukprijs, dbo.tblInkreg.prijs, dbo.tblInkreg.irbtwcode,
dbo.tblInkreg.type, dbo.tblInkreg.bocheckedtmpacc, dbo.tblInkreg.regellevdat, dbo.tblInkreg.plandat, sub.voorraad - sub.gereserveerd AS vrijevoorraad, a.arLocatie, dbo.tblInkreg.verpakbelastprijs, a.arPALocatie
FROM dbo.tblInkreg LEFT OUTER JOIN
dbo.tblPLInkoop ON dbo.tblInkreg.becode = dbo.tblPLInkoop.becode AND dbo.tblInkreg.arcode = dbo.tblPLInkoop.ArCode AND dbo.tblPLInkoop.Prioriteit = 1 LEFT OUTER JOIN
dbo.tblArtikelEenheden ON dbo.tblInkreg.arcode = dbo.tblArtikelEenheden.arcode AND dbo.tblInkreg.eenheidbesteld = dbo.tblArtikelEenheden.eenheid LEFT OUTER JOIN
(SELECT dbo.tblArtikel.arcode, ISNULL(SUM(dbo.tblVoorraadMutaties.besteld), 0) AS inbestelling, ISNULL(SUM(dbo.tblVoorraadMutaties.voorraad), 0) AS voorraad,
ISNULL(SUM(dbo.tblVoorraadMutaties.gereserveerd), 0) AS gereserveerd
FROM dbo.tblArtikel LEFT OUTER JOIN
dbo.tblVoorraadMutaties ON dbo.tblArtikel.arcode = dbo.tblVoorraadMutaties.arcode
WHERE (dbo.tblArtikel.isvrdart = 1)
GROUP BY dbo.tblArtikel.arcode) AS sub ON dbo.tblInkreg.arcode = sub.arcode LEFT OUTER JOIN
dbo.tblArtikel AS a ON sub.arcode = a.arcode
First, the GROUP BY will make the query read-only.
You may not be able to remove it and still have the form operate as you expect, so you will likely have to redesign your concept and how the form operates.

Linq to Entities (join query) - WPF - 4 layer application

I'm new to all this and I'd like to know how experienced programmers handle the following situation.
Let's say I have a table "Supplier" with some common fields and also fields MailingAddress, BillingAddress which are 2 key fields to link to another table "Address"
TB Supplier
Id = 1
Name = MyNameFoo
MailingAddressId = 100
BillingAddress = 101
TB Address
Id = 100
Street = MailingStreetFoo
City = MailingCityFoo
Id = 101
Street = BillingStreetFoo
City = BillingCityFoo
I have a 4 layer application solution:
project xyz.WPF: holds all my WPF windows, which present the data to the user (ListBox with Binding)
project xyz.Business: sits between WPF and DataAccess (SupplierB.vb, AddressB.vb)
project xyz.DataAccess.SqlServer: data layer (SupplierDAL.vb, AddressDAL.vb)
project xyz.Entities: stores my entity classes, each representing a table and just holding properties. I use these to pass data from one layer to another (SupplierE.vb, AddressE.vb).
The SupplierDAL class gets data from SQL Server and returns a "List(Of SupplierE)" to the SupplierB class, which returns this "List(Of SupplierE)" to the WPF layer, where it is bound to a ListBox -> Listbox.ItemsSource = SupplierB.GetAllSuppliers().
Now I can get all suppliers and show them in a WPF window in a listbox (or datagrid, ...) Also I can get all addresses and show them.
But at some point I want to get all the addresses (Mailing, Billing) for one Supplier. I know how to write my query with LINQ (join the 2 tables), but what do I do with the result? It is neither a SupplierE nor an AddressE; it's a combination of the two. I want the result to be an object that I can pass around the layers, just like SupplierE.vb and AddressE.vb. Of course I could make another entity like SupplierAddressesE.vb; problem solved. But in my real-life solution I have dozens of join queries with joins between several different tables (not just Supplier and Address).
So, a lot of tables and a lot of join query results: does this mean I have to make an entity object (xxxE.vb) for each different join query, or is there another way to do this in a 4-layer application with Linq to Entities?
Thanks for any reply which can direct me to a good approach for this situation
Luc
If you are lazy loading the entities, then your supplier entity should have the attached address entity for each of Mailing/Billing. Why not just create a proper Supplier class whose Mailing/Billing properties are Address objects rather than Ids?
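As a rough illustration of that suggestion (in Python rather than VB, and without any ORM), a Supplier can simply hold two Address objects. The dictionary keys mirror the sample data in the question; `BillingAddressId` is an assumed name for the billing key, and `build_suppliers` is a hypothetical helper, not part of Linq to Entities.

```python
from dataclasses import dataclass


@dataclass
class Address:
    id: int
    street: str
    city: str


@dataclass
class Supplier:
    id: int
    name: str
    mailing_address: Address   # a full Address object instead of MailingAddressId
    billing_address: Address   # a full Address object instead of the billing key


def build_suppliers(supplier_rows, address_rows):
    """Hypothetical helper: resolve the address keys into Address objects once."""
    addresses = {
        row["Id"]: Address(row["Id"], row["Street"], row["City"])
        for row in address_rows
    }
    return [
        Supplier(
            row["Id"],
            row["Name"],
            addresses[row["MailingAddressId"]],
            addresses[row["BillingAddressId"]],
        )
        for row in supplier_rows
    ]


# Sample data taken from the question.
address_rows = [
    {"Id": 100, "Street": "MailingStreetFoo", "City": "MailingCityFoo"},
    {"Id": 101, "Street": "BillingStreetFoo", "City": "BillingCityFoo"},
]
supplier_rows = [
    {"Id": 1, "Name": "MyNameFoo", "MailingAddressId": 100, "BillingAddressId": 101},
]
suppliers = build_suppliers(supplier_rows, address_rows)
```

The same Supplier object can then be passed through all layers, and no extra "join result" entity is needed for this particular query.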

Best database design (model) for user tables

I'm developing a web application using Google App Engine and Django, but I think my problem is more general.
Users can create tables. Note that these tables are not represented as TABLES in the database. Here is an example:
First form:
Name of the table: __________
First column name: __________
Second column name: _________
...
The number of columns is not fixed, but there is a maximum (100, for example). The type of every column is the same.
Second form (after choosing a particular table the user can fill the table):
column_name1: _____________
column_name2: _____________
....
I'm using this solution, but it's wrong:
from google.appengine.ext import db
class Table(db.Model):
    name = db.StringProperty(required=True)
class Column(db.Model):
    name = db.StringProperty(required=True)
    number = db.IntegerProperty()
    # Each column belongs to one Table; reachable as table.columns
    table = db.ReferenceProperty(Table, collection_name="columns")
class Value(db.Model):
    time = db.TimeProperty()
    # Each value belongs to one Column; reachable as column.values
    column = db.ReferenceProperty(Column, collection_name="values")
When I want to list a table, I take its columns, and from every column I take its values:
# table is the Table entity being listed
data = []
for column in table.columns:
    column_data = []
    for value in column.values:
        column_data.append(value.time)
    data.append(column_data)
# Transpose the list of columns into a list of rows
data = zip(*data)
I think the problem is the order of the values, because there is no guarantee that the order for one column is the same as for the others. I'm expecting this bug (though so far I have never seen it):
Table as I want:    Table as I will get:
a z c               a e c
d e f               d h f
g h i               g z i
Better solutions? Maybe using ListProperty?
Here's a data model that might do the trick for you:
class Table(db.Model):
    name = db.StringProperty(required=True)
    owner = db.UserProperty()
    column_names = db.StringListProperty()
class Row(db.Model):
    values = db.ListProperty(yourtype)
    table = db.ReferenceProperty(Table, collection_name='rows')
My reasoning:
You don't really need a separate entity to store column names. Since all columns are of the same data type, you only need to store the name, and the fact that they are stored in a list gives you an implicit order number.
By storing the values in a list in the Row entity, you can use an index into the column_names property to find the matching value in the values property.
By storing all of the values for a row together in a single entity, there is no possibility of values appearing out of their correct order.
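The index-based lookup described in the reasoning above can be sketched with plain Python lists. These are stand-ins for Table.column_names and Row.values, not datastore code; the column names and values are invented, and `cell` and `as_dicts` are hypothetical helpers.

```python
# Plain-Python stand-ins for Table.column_names and the Row.values lists
# (no datastore involved; names and data are invented for illustration).
column_names = ["start", "lap", "finish"]
rows = [
    ["09:00", "09:30", "10:00"],
    ["11:00", "11:20", "11:45"],
]


def cell(row, column_name):
    """Find a value by using the column's position in column_names as the index."""
    return row[column_names.index(column_name)]


def as_dicts(rows):
    """Pair every row's values with the column names, in order."""
    return [dict(zip(column_names, row)) for row in rows]
```

Because each row keeps all of its values in one list, the values of a row cannot drift apart the way separately stored per-column values can.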
Caveat emptor:
This model will not work well if the table can have columns added to it after it has been populated with data. To make that possible, every time a column is added, every existing row belonging to that table would have to have a value appended to its values list. If it were possible to efficiently store dictionaries in the datastore, this would not be a problem, but lists can really only be appended to.
Alternatively, you could use Expando...
Another possibility is to define the Row model as an Expando, which allows you to dynamically create properties on an entity. You could set column values only for the columns that actually have values, and you could also add columns to the table after it has data in it without breaking anything:
class Row(db.Expando):
    table = db.ReferenceProperty(Table, collection_name='rows')
    @staticmethod
    def __name_for_column_index(index):
        return "column_%d" % index
    def __getitem__(self, key):
        # Allows one to get at the columns of Row entities with
        # subscript syntax:
        # first_row = Row.all().get()
        # col1 = first_row[1]
        # col12 = first_row[12]
        # getattr goes through Expando's dynamic-property handling and
        # returns None if the given column is not defined for this Row.
        return getattr(self, Row.__name_for_column_index(key), None)
    def __setitem__(self, key, value):
        # Allows one to set the columns of Row entities with
        # subscript syntax:
        # first_row = Row.all().get()
        # first_row[5] = "New values for column 5"
        setattr(self, Row.__name_for_column_index(key), value)
        # In order to allow efficient multiple column changes,
        # the put() can go somewhere else.
        self.put()
Why don't you add an IntegerProperty named rowNumber to Value, increment it every time you add a new row of values, and then reconstruct the table by sorting by rowNumber?
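A small sketch of that reconstruction, using plain tuples in place of Value entities. The (column number, row number, value) layout is an assumption for illustration; the letters are the "table as I want" from the question, deliberately stored out of order.

```python
from collections import defaultdict

# Hypothetical stored values: (column number, row number, value), in arbitrary order.
values = [
    (1, 2, "h"), (0, 0, "a"), (2, 1, "f"), (1, 0, "z"),
    (0, 1, "d"), (2, 0, "c"), (1, 1, "e"), (0, 2, "g"), (2, 2, "i"),
]


def rebuild_table(values):
    """Group values by row number, then order rows and cells by their numbers."""
    rows = defaultdict(dict)
    for column_number, row_number, value in values:
        rows[row_number][column_number] = value
    return [
        [cells[column] for column in sorted(cells)]
        for _, cells in sorted(rows.items())
    ]


table = rebuild_table(values)
```

Since every value carries its own row number, the result no longer depends on the order in which the datastore happens to return the values of each column.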
You're going to make life very hard for yourself unless your user's 'tables' are actually stored as real tables in a relational database. Find some way of actually creating tables and use the power of an RDBMS, or you're reinventing a very complex and sophisticated wheel.
This is the conceptual idea I would use:
I would create two classes for the datastore:
- table: this would serve as a dictionary, storing the structure of the pseudo-tables your app would create. It would have three fields: table_name, column_name, column_order, where column_order gives the position of the column within the table.
- data: this would store the actual data in the pseudo-tables. It would have four fields: row_id, table_name, column_name, column_data. row_id would be the same for data pertaining to the same row and would be unique across the various pseudo-tables.
Put the data in a LongBlob.
The power of a database is being able to search and organise data so that you can get only the part you want, for performance and simplicity: you don't want the whole database, you just want a part of it, and you want it fast. But from what I understand, when you retrieve a user's data, you retrieve all of it and display it. So you don't need to store the data in a normal "database" way.
What I would suggest is to simply format and store the whole data of a single user table in a single column with a suitable type (LongBlob, for example). The format would be an object holding a list of columns and a list of rows. You define the object in whatever language you use to communicate with the database.
The columns in your (real) database would be: User int, TableNo int, Table Longblob.
If user8 has 3 tables, you will have the following rows:
8, 1, objectcontainingtable1;
8, 2, objectcontainingtable2;
8, 3, objectcontainingtable3;
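Here is a minimal sketch of the blob approach, assuming SQLite and JSON as the serialization format (the answer names neither). The column names UserId and TableData are adjusted from the ones above because Table is a reserved word in SQL; `save_table` and `load_table` are hypothetical helpers.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UserTables (
        UserId INTEGER NOT NULL,
        TableNo INTEGER NOT NULL,
        TableData BLOB NOT NULL,
        PRIMARY KEY (UserId, TableNo)
    )
""")


def save_table(conn, user_id, table_no, columns, rows):
    """Serialize the whole user table into one blob and store it in one row."""
    payload = json.dumps({"columns": columns, "rows": rows}).encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO UserTables VALUES (?, ?, ?)",
        (user_id, table_no, payload),
    )


def load_table(conn, user_id, table_no):
    """Read the blob back and decode it, or return None if it does not exist."""
    row = conn.execute(
        "SELECT TableData FROM UserTables WHERE UserId = ? AND TableNo = ?",
        (user_id, table_no),
    ).fetchone()
    return json.loads(row[0].decode("utf-8")) if row else None


save_table(conn, 8, 1, ["a", "b"], [["1", "2"], ["3", "4"]])
loaded = load_table(conn, 8, 1)
```

The trade-off is the one the answer describes: reading or writing a whole table is a single row operation, but the database can no longer search inside the table's cells.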

Questions about DB modelling

How would you model these relationships in a db?
You have a Page entity that can contain PageElements.
A PageElement can, for instance, be an Article or a Picture. An Article table obviously has other members/columns than a Picture table. An article could have, e.g., "Title", "Lead", "Body" columns that are all of type nvarchar, while a Picture might have something like "AltText", "Path", "Width", "Height". I'd like this to be extensible; who knows what PageElements I might need in 3 months? So I guess I'd need a PageElementTypes table.
For the relationships, what about tables like these:
Pages with an Id, and other mumbo jumbo. (Create Date, Visible, what not)
Pages_PageElements with PageId and PageElementId.
PageElements with an Id and a PageElementTypeId and more mumbojumbo (SortOrder, Visibility etc.).
PageElementTypes with an Id and a Name (for instance "Article", "Picture", "AddressBlock")
Now, should I create a PageElementId column in every Articles, Pictures, AddressBlocks table to finish things up? That's where I'm a bit stuck, it's a simple 1:1 relationship so this should work, but somehow I might miss something.
Follow up:
The recommended solutions below with separate attributes would force me to store all attributes as the same type, wouldn't they? What if one PageElement has some attributes that are nvarchar(255) and some that are nvarchar(1000)? What if some are integers?
If I go the EAV way, I would have to create tons of tables to hold the attribute values for all the different data types out there.
The two common choices are Single Table Inheritance and Multi Table Inheritance. Other approaches include having tables for each concrete class which I've never used, and what I'd call a meta-table implementation, where the attribute definitions are moved into data rather than any sort of schema.
I've had generally good experiences with STI, and provided you don't expect a plethora of classes and attributes it's the simplest solution. Simple is very good in my book.
Unless new page element types need to be created by users at runtime, I'd avoid the meta-tables approach and anything that begins to look like it. In my experience such code quickly becomes a quagmire and rarely delivers much value compared to a more concrete implementation updated at regular intervals by developers.
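For readers unfamiliar with Single Table Inheritance, a minimal sketch with sqlite3 might look like the following. The column names come from the question where possible (LeadText is renamed from "Lead" as a precaution), and the ElementType discriminator column and sample rows are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PageElements (
    Id INTEGER PRIMARY KEY,
    PageId INTEGER NOT NULL,
    -- Discriminator column: says which kind of element this row is.
    ElementType TEXT NOT NULL CHECK (ElementType IN ('Article', 'Picture')),
    SortOrder INTEGER NOT NULL DEFAULT 0,
    -- Article columns (NULL for pictures)
    Title TEXT, LeadText TEXT, Body TEXT,
    -- Picture columns (NULL for articles)
    AltText TEXT, Path TEXT, Width INTEGER, Height INTEGER
);
""")
conn.execute(
    "INSERT INTO PageElements (PageId, ElementType, SortOrder, Title, LeadText, Body) "
    "VALUES (1, 'Article', 1, 'Hello', 'Lead text', 'Body text')"
)
conn.execute(
    "INSERT INTO PageElements (PageId, ElementType, SortOrder, AltText, Path, Width, Height) "
    "VALUES (1, 'Picture', 2, 'World Map', '/img/map.png', 640, 480)"
)


def elements_for_page(conn, page_id):
    """All elements of a page come from one table, in display order."""
    return conn.execute(
        "SELECT ElementType, Title, AltText FROM PageElements "
        "WHERE PageId = ? ORDER BY SortOrder",
        (page_id,),
    ).fetchall()


elements = elements_for_page(conn, 1)
```

Each column keeps its own data type, which addresses the typing concern from the follow-up; the cost is that every new element type adds nullable columns to the one table.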
Just as you have configured Page Elements, you need to configure the Attributes associated with the Page Elements.
So we have two items that are extensible: Page Elements and their Attributes.
I suggest the following tables:
Page : Page ID | ...
Page Elements : Page Element ID | Element Type ID | Page ID | ...
Page Element Type : Element Type ID | Page Element Type Label
Page Element Attribute Type : Attribute Type ID | Element Type ID | Attribute Label
Page Element Attributes : Page Element ID | Attribute Type ID | Attribute Value
The Page Element Attribute Type table will contain the list of attributes associated with an element type. Example:
Attribute Type ID 1 | Article | "Title"
Attribute Type ID 2 | Article | "Lead"
Attribute Type ID 3 | Picture | "AltText"
The Page Element Attributes table will store the actual values of the attributes associated with a page element. Example:
Page Element ID 1 | Attribute Type ID 1 | "Everybody Loves Raymond"
Page Element ID 2 | Attribute Type ID 3 | "World Map"
The universal solution would be:
PageElementType: ID, Name, [Mumbo Jumbo]
PageElementTypeParameter: ID, PageElementTypeID, [Mumbo Jumbo]
Page: ID, [Mumbo Jumbo]
PageElement: ID, PageElementTypeID, [Mumbo Jumbo]
PageElementParameters: ID, PageElementID, PageElementTypeParameterID, Value, [Mumbo Jumbo]
In human words: There is a table for page element types, and an associated table, which lists possible parameters for each page element (like SRC and ALT for an image; TEXT for an article, etc).
Then there is a table with all the pages; an associated table which lists elements in each page; and a table which lists parameter values for each element.
I use a different naming convention than you do, but this is essentially what I would do:
PageElementType(PageElementTypeID, PageElementTypeName)
PageElement(PageElementID, PageElementTypeID)
Article(ArticleID, PageElementID, ...)
Picture(PictureID, PageElementID, ...)
Page(PageID, ...)
PageHasPageElement(PageHasPageElementID, PageID, PageElementID) => {PageID, PageElementID} are unique
This is what I do, and it seems to be fairly well normalized and performs fine.
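The table layout above can be tried out with sqlite3 as in the following sketch. The columns hidden behind "..." in the answer are not known, so Title, Body, AltText and Path are assumptions borrowed from the question, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PageElementType (
    PageElementTypeID INTEGER PRIMARY KEY,
    PageElementTypeName TEXT NOT NULL UNIQUE
);
CREATE TABLE PageElement (
    PageElementID INTEGER PRIMARY KEY,
    PageElementTypeID INTEGER NOT NULL REFERENCES PageElementType(PageElementTypeID)
);
-- One table per concrete type, each row tied 1:1 to a PageElement.
CREATE TABLE Article (
    ArticleID INTEGER PRIMARY KEY,
    PageElementID INTEGER NOT NULL UNIQUE REFERENCES PageElement(PageElementID),
    Title TEXT, Body TEXT
);
CREATE TABLE Picture (
    PictureID INTEGER PRIMARY KEY,
    PageElementID INTEGER NOT NULL UNIQUE REFERENCES PageElement(PageElementID),
    AltText TEXT, Path TEXT
);
CREATE TABLE Page (PageID INTEGER PRIMARY KEY);
CREATE TABLE PageHasPageElement (
    PageHasPageElementID INTEGER PRIMARY KEY,
    PageID INTEGER NOT NULL REFERENCES Page(PageID),
    PageElementID INTEGER NOT NULL REFERENCES PageElement(PageElementID),
    UNIQUE (PageID, PageElementID)
);
INSERT INTO PageElementType VALUES (1, 'Article'), (2, 'Picture');
INSERT INTO PageElement VALUES (10, 1), (11, 2);
INSERT INTO Article VALUES (1, 10, 'Hello', 'Body text');
INSERT INTO Picture VALUES (1, 11, 'World Map', '/img/map.png');
INSERT INTO Page VALUES (1);
INSERT INTO PageHasPageElement (PageID, PageElementID) VALUES (1, 10), (1, 11);
""")


def page_elements(conn, page_id):
    """List a page's elements with their type name and a type-specific label."""
    return conn.execute("""
        SELECT pe.PageElementID, t.PageElementTypeName,
               COALESCE(a.Title, p.AltText) AS Label
        FROM PageHasPageElement ppe
        JOIN PageElement pe ON pe.PageElementID = ppe.PageElementID
        JOIN PageElementType t ON t.PageElementTypeID = pe.PageElementTypeID
        LEFT JOIN Article a ON a.PageElementID = pe.PageElementID
        LEFT JOIN Picture p ON p.PageElementID = pe.PageElementID
        WHERE ppe.PageID = ?
        ORDER BY pe.PageElementID
    """, (page_id,)).fetchall()


listing = page_elements(conn, 1)
```

The UNIQUE constraint on PageElementID in Article and Picture is what enforces the 1:1 relationship the question asks about; adding a new element type means adding one new table and one row in PageElementType.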
I guess I'll just go with what I got, EAV is no option for me. What I got now is a somewhat hybrid approach.

Resources