Make a MustOverride (abstract) member with EF 4? - sql-server

I have an abstract class Contact.
It leads to two subclasses:
Company (Title)
Person (FirstName, LastName)
I want to add a computed 'Title' col in the Person table, that return FirstName + ' ' + LastName, which will give me better search options.
So I want to create the Contact table have an abstract property Title, which each of these two implements, and so, I will be able to use:
Dim contacts = From c In context.Contacts
Where c.Title.Contains("Nash")
I am pretty sure this is impossible, the question is what is the efficient alternative way?
In my scenario I have a ListBox showing all the Contacts of both Company and Person types, I have a search TextBox and I want the Service query (GetContacts(searchQuery As String)) to query the filtered set against the DB.
Update
After Will's answer, I decided to create in the Person table a computed col as above.
The question is what what be the most efficient way to imlpement the WCF-RIA query method:
Public Function GetContacts(searchQuery As String) As IQueryable(Of Contact)
'Do here whatever it takes to retieve from Contacts + People
'and mix the results of both tables ordered by Title
End Function

Unfortunately, while there is a way to do this with partial classes, I am 99% sure you cannot mix linq queries that touch entity properties and "POCO" properties defined in a partial class.
The Linq to Entity context will actually convert these queries to sql, and it cannot handle situations where a particular method isn't directly supported by the context. A common example for L2E is the inability to use enums in your query. Like knowing how to handle enums, the context certainly doesn't know how to handle your POCO properties when converting to raw sql.
An option you might want to investigate is to create a the computed column within your database, or to run your queries, do the traditional ToArray() in order to trigger enumeration, and then examine the computed column in memory. This might not be a good solution, depending on the size of your table, however.
So, essentially, you wish to search two disparate types (backed by two different tables) and then combine the results for display to the user.
I would have to say that polymorphism is NOT the best solution. The desire to show them in the UI shouldn't force a design decision all the way down into your type definitions.
I have done something similar a few times before in WPF. I've done it two ways; by using polymorphism in the form of facade types which wrap the models and which can be treated by a common base type, and by treating all the different types in the collection as System.Object.
The first way is okay when you need type safety and the ability to treat different types the same way. The wrappers extend a common base class and are coded to "know" how to handle each of their wrapped types correctly.
The second way is okay when you don't need type safety, such as when exposing a collection to a WPF View where you are displaying them in an ItemsControl, which can figure out the correct DataTemplate to use by the type of each instance in the collection.
I'm not sure which way is best for you, but whichever it is, you should query both your Company and Person tables separately, Union the two result sets, then sort them appropriately.
Pseudocode:
//Wrapper version
var results = Company
.Where(x=>x.Title.Contains(searchTerm))
.Select(x=> new CompanyWrapper(x))
Cast<BaseWrapper>().Union(
Person
.Where(x=>x.ComputedTitle.Contains(searchTerm))
.Select(x=> new PersonWrapper(x))
.Cast<BaseWrapper>());
//System.Object version
var results = Company
.Where(x=>x.Title.Contains(searchTerm))
Cast<object>().Union(
Person
.Where(x=>x.ComputedTitle.Contains(searchTerm))
.Cast<object>());
In both cases, you may not have to downcast specifically. Again, the first gives you type safety if you need it in the UI, the second is simpler and requires less code on the backend but is only useful if you don't require type safety in the UI.
As for sorting, once you've searched and combined your result, you can OrderBy to sort the result, however you will have to provide a function which can perform the ordering. This function will differ depending on which version you choose.

Related

database structure for unconstant items

I have a database which own events as following :
event(id, name, type, startDate, endDate)
The type is another custom object (Type(id, name, icon)) which is linked as one-to-many.
I want to add another custom object named EventDetail containing details, but the fields would be different with the type of the event.
For instance, if my event is a travel type, fields could be a weblink to the wikipedia of destination, a photo, but if event is an internship for work they would evidently be different.
What kind of structure could handle this type of situation ?
If it is important, I'm using java for android, with an ORM library named SugarORM
Final answer
I finally headed to ORM libraries for managing my database. Using a one-to-one relation between Type and EventDetail and one-to-many between Type and Event, it works really well for now.
You can go a few different ways with this...
One would be to create a catch-all table for EventDetail that has every field would need for every type and use the type to determine which fields you read from in that table.
Another way, if you do not have a lot of different types, would be to create a table for each type that would be structured to the needs of that type, in your queries, you would need to join to each of the tables based on the type (probably with using UNION [ALL], one query per type - or create views).
A third, and probably more ideal depending on the situation and use, would be to remove that level of complexity from the database. Store your "Event Details" as a CLOB or BLOB or XMLTYPE, etc... and use the type to deserialize the data to an object on the client side.

Structuring Google App Engine for strong consistency

I want to run over this plan I have for achieving strong consistency with my GAE structure. Currently, here's what I have (it's really simple, I promise):
You have a Class (Class meaning Classroom not a programming "class") model and an Assignment model and also a User model. Now, a Class has an integer list property called memberIds, which is an indexed list of User ids. A class also has a string list of Assignment ids.
Anytime a new Assignment is created, its respective Class entity is also updated and adds the new Assignment id to its list.
What I want to do is get new Assignments for a user. What I do is query for all Classes where memberId = currentUserId. Each Class I get back has a list of assignment ids. I use those ids to get their respective Assignments by key. After months with this data model, I just realized that I might not get strong consistency with this (for the Class query part).
If user A posts an assignment (which consequently updates ClassA), user B who checks in for new assignments a fraction of a second later might not yet see the updated changes to ClassA (right?).
This is undesired. One solution would be to user ancestor queries, but that is not possible in my case, and entity groups are limited to 1 write per second, and I'm not sure if that's enough for my case.
So here's what I figured: anytime a new assignment is posted, we do this:
Get the respective Class entity
add the assignment id to the Class
get the ids of all this Class's members
fetch all users who are members, by keys
a User entity has a list of Classes that the user is a member of. (A LocalStructuredProperty, sort of like a dictionary:{"classId" : "242", "hasNewAssignment" : "Yes"} ). We flag that class as hasNewAssignment =
YES
Now when a user wants to get new assignments, instead of querying for groups that have a new assignment and which I am a member of, I
check the User objects list of Classes and check which classes have
new assignments.
I retrieve those classes by key (strongly consistent so far, right?)
I check the assignments list of the classes and I retrieve all assignments by key.
So throughout this process, I've never queried. All results should be strongly consistent, right? Is this a good solution? Am I overcomplicating things? Are my read/write costs skyrocketing with this? What do you think?
Queries are not strongly consistent. Gets are strongly consistent.
I think you did the right thing:
Your access is strongly consistent.
Your reads will be cheaper: one get is half cheaper as then query that returns one entity.
You writes will be more expensive: you also need to update all User entities.
So the cost depends on your usage pattern: how many assignment reads do you have vs new assignment creation.
I think using ancestor queries is a better solution.
Set the ancestor of the Assignment entities as the Class to which the assignment is allotted
Set the ancestor of a Student entity as the Class to which the student belongs.
This way all the assignments and students in a particular class belong to the same entity group. So Strong consistency is guaranteed w.r.t a query that has to deal with only a single class.
N.B. I am assuming that not too many people won't be posting assignments into a class at the same time. (but any number of people can post assignments into different classes, as they belong to different entity groups)

Custom Fields for a Form representing an object

I have an architectural question concerning custom fields in a view for an object. Let's say you have a User Object with some basic information like firstname, lastname, ... that can be used by all customers.
Now, often we get a question from a customer to add couple of custom fields typical for their domain. Our solution now is an xml data column where key value pairs are stored. This has been ok so far, but now we'll have to find a more architectural solution.
For instance, now, a customer wants a dropdown where it can select the value for its custom field. We could still store the selected value in the xml data column, but where do we store all those dropdown values...
I know that in sharepoint you can also add custom fields like dropdowns and I was wondering how to deal with this best. I want to avoid creating custom tables for customers, or having a table with 90 columns (10 basic and then 10 for each customer), ...
You get the idea, it should be generic and be able to deal with all sorts of problems in the future.
What I was thinking about is a Table UserConfiguration where each record has a Foreign Key to the Customer (Channel in our database), then a column FieldName, a column FieldType and a column Values. The column values should be an xml type column, because for a dropdown, we'll need to add multiple values. Also, each value can have extra data attached to it (not just a name). The other problem then is how to store the selected value. I don't like the idea of having foreign keys to xml in my database (read somewhere that Azure can't handle this all to well). Do you just store the name of the value (what if the value were to disappear out of the xml?)?
Any documentation, links on this kind of problems would also be great. I'm trying to find a design pattern that deals with this kind of problem in the database.
I want to answer your question in two parts:
1) Implementing custom fields in a database server
2) Restricting custom fields to an enumeration of values
Although common solutions to 1) are discussed in the question referenced by #Simon, maybe you are looking for a bit of discussion on what the problem is and why it hasn't been solved for us already.
databases are great for structured, typed data
custom fields are inherently less structured
therefore, custom fields are more difficult to work with in a database
some or many of the advantages of using a database are lost
some queries may be more difficult or impossible
type safety may be lost (in the database)
data integrity may no longer be enforced (by the database)
it's a lot more work for the implementers and maintainers
As discussed in the other question, there's no perfect solution.
But these benefits/features still need to be implemented somewhere, and so often the application becomes responsible for data integrity and type safety.
For situations like these, people have created Object-Relation Mapping tools, although, as Jeff Atwood says, even using an ORM could create more problems than it solved. However, you mentioned that it 'should be generic and be able to deal with all sorts of problems in the future' -- this makes me think an ORM might be your best bet.
So, to sum up my answer, this is a known problem with known solutions, none of which are completely satisfactory (because it's so hard). Pick your poison.
To answer the second part of (what I think is) your question:
As mentioned in the linked question, you could implement Entity-Attribute-Value in your database for custom fields, and then add an extra table to hold the legal values for each entity. Then, the attribute/value of the EAV table is a foreign key into the attribute-value table.
For example,
CREATE TABLE `attribute_value` ( -- enumerations go in this table
`attribute` varchar(30),
`value` varchar(30),
PRIMARY KEY (`attribute`, `value`)
);
CREATE TABLE `eav` ( -- now the values of attributes are restricted
`entityid` int,
`attribute` varchar(30),
`value` varchar(30),
PRIMARY KEY (`entityid`, `attribute`),
FOREIGN KEY (`attribute`, `value`) REFERENCES `attribute_value`(`attribute`, `value`)
);
Of course, this solution isn't perfect or complete -- it's only supposed to illustrate the idea. For instance, it uses varchars, and lacks a type column. Also, who gets to decide what the possible values for each attribute are? Can these be changed at any time by the user?
I'm doing something similar for a customer. I've create a JSON FieldType which holds the entire JSON stream of a complex object and a String containing the FQTN (FullQualifiedTypeName) of my C# model class.
By using custom New-, Edit- and Display-Forms we'd ensured that our custom objects are rendered the correct way for best user experience.
To promote fields from the complex C# model to the SharePoint list, we've build something like Microsoft did in InfoPath. Users are able to select Properties or MetaData from the Complex C# type, which will be automatically promoted to the hosting SharePoint list.
The big advantage of JSON is, that its smaller than XML and easier to work with in the web world. (JavaScript...)
When you let the users create the data models, I would recommend looking at an document database or 'NoSQL' since you want exactly that, to store schemaless data structures.
Also, sharePoint stores metadata the way you mentioned (10 columns for text, 5 for dates etc)
That said, in my current project (locked in SharePoint, so Framework 3.5 + SQL Server and all the constraints that follow) we use a somewhat similar structure as below:
Form
Id
Attribute (or Field)
Name
Type (enum) Text, List, Dates, Formulas etc
Hidden (bool)
Mandatory
DefaultValue
Options (for lists)
Readonly
Mask (for SSN etc)
Length (for text fields)
Order
Metadata
FormId
AttributeId
Text (the value for everything but dates)
Date (the value for dates)
Our formulas employ functions such as Increment: INC([attribute1][attribute2], 6) and this would produce something like 000999 for the 999th instance of the combined values for attribute 1 and attribute 2 for a form, this is stored as:
AttributeIncrementFormula
AtributeId
Counter
Token
Other 'formulas' (aka anything non-trivial) such as barcodes are stored as single metadata values. In the actual implementation, we would have something like this:
var form = formRepository.GetById(1);
form.Metadata["firstname"].Value
Value above is a readonly property that decides whether we should get the value from Text or Date and if some additional transform is required. Note that the database here is merely a storage, we hold all the domain complexity in the application.
We also let our customer decide which attribute is the form title for example, so if firstname is the form title, they'll set an in-memory param that spans the entire application to be something like Params.InMemory.TitleAttributeId = <user-defined-id>.
I hope this gives you some insight on a production impl of a similar scenario.
This is really more of a comment than an answer, but I need more space than SO will allow for comments, so here 'tis:
I think your UserConfiguration table approach is good, and would suggest only abstracting the "type" and "value" pieces of your design a bit more:
Since your application will need to validate user input, each notion of "type" will have an associated piece of evaluation logic. Obviously the more of this you can abstract into data the easier it will be to keep your code small. Enumerated lists are a good start, but if your "validator" logic can be extended to handle pattern matching for text strings and Boolean logical expressions (e.g. to describe/enforce constraints on input values), then you can express pretty much any "type" of input that your application may need to handle in terms of (relatively) simple "atoms" that you can map naturally to DB tables.
When storing a user-specified value, you can either store the "raw" data (e.g. in JSON) and a foreign key to the associated "type", or you can add an lookup/cache system that assigns an integer to each new value that is encountered by the system ("novelty" can be checked by checking a hash of the "raw" data, for example). The latter approach obviously scales better if you're expecting lots of data duplication (which of course you would in the case of a multiple-choice menu).

Database design rules to follow for a programmer

We are working on a mapping application that uses Google Maps API to display points on a map. All points are currently fetched from a MySQL database (holding some 5M + records). Currently all entities are stored in separate tables with attributes representing individual properties.
This presents following problems:
Every time there's a new property we have to make changes in the database, application code and the front-end. This is all fine but some properties have to be added for all entities so that's when it becomes a nightmare to go through 50+ different tables and add new properties.
There's no way to find all entities which share any given property e.g. no way to find all schools/colleges or universities that have a geography dept (without querying schools,uni's and colleges separately).
Removing a property is equally painful.
No standards for defining properties in individual tables. Same property can exist with different name or data type in another table.
No way to link or group points based on their properties (somehow related to point 2).
We are thinking to redesign the whole database but without DBA's help and lack of professional DB design experience we are really struggling.
Another problem we're facing with the new design is that there are lot of shared attributes/properties between entities.
For example:
An entity called "university" has 100+ attributes. Other entities (e.g. hospitals,banks,etc) share quite a few attributes with universities for example atm machines, parking, cafeteria etc etc.
We dont really want to have properties in separate table [and then linking them back to entities w/ foreign keys] as it will require us adding/removing manually. Also generalizing properties will results in groups containing 50+ attributes. Not all records (i.e. entities) require those properties.
So with keeping that in mind here's what we are thinking about the new design:
Have separate tables for each entity containing some basic info e.g. id,name,etc etc.
Have 2 tables attribute type and attribute to store properties information.
Link each entity (or a table if you like) to attribute using a many-to-many relation.
Store addresses in different table called addresses link entities via foreign keys.
We think this will allow us to be more flexible when adding, removing or querying on attributes.
This design, however, will result in increased number of joins when fetching data e.g.to display all "attributes" for a given university we might have a query with 20+ joins to fetch all related attributes in a single row.
We desperately need to know some opinions or possible flaws in this design approach.
Thanks for your time.
In trying to generalize your question without more specific examples, it's hard to truly critique your approach. If you'd like some more in depth analysis, try whipping up an ER diagram.
If your data model is changing so much that you're constantly adding/removing properties and many of these properties overlap, you might be better off using EAV.
Otherwise, if you want to maintain a relational approach but are finding a lot of overlap with properties, you can analyze the entities and look for abstractions that link to them.
Ex) My Db has Puppies, Kittens, and Walruses all with a hasFur and furColor attribute. Remove those attributes from the 3 tables and create a FurryAnimal table that links to each of those 3.
Of course, the simplest answer is to not touch the data model. Instead, create Views on the underlying tables that you can use to address (5), (4) and (2)
1 cannot be an issue. There is one place where your objects are defined. Everything else is generated/derived from that. Just refactor your code until this is the case.
2 is solved by having a metamodel, where you describe which properties are where. This is probably needed for 1 too.
You might want to totally avoid the problem by programming this in Smalltalk with Seaside on a Gemstone object oriented database. Then you can just have objects with collections and don't need so many joins.

How to get my SQL DB to match my Domain Driven Design

Okay, I'll be straight with you guys: I'm not sure exactly how Domain Driven my Design is, but I did start by building Model objects and ignoring the persistence layer altogether. Now I'm having difficulty deciding the best way to build my tables in SQL Server to match the models.
I'm building a web application in ASP.NET MVC, although I don't think the platform matters that much. I have the following object model hierarchy:
Property - has properties such as Address and Postcode
which have one or more
Case - inherits from PropertyObject
Quote - inherits from PropertyObject
which have one or more
Message - simple class that has properties Reference, Text and SentDate
Case and Quote have a lot of similar properties, so I also have a PropertyObject abstract base class that they inherit from. So Property has an Items property of type List which can contain both Case and Quote objects.
So essentially, I can have a Property that has a few Quotes and Cases and a load of Messages that can belong to either of those.
A PropertyObject has a Reference property (and therefore so do Quote and Case) so any Message object can be related back to a Quote OR Case by it's Reference property.
I'm thinking of using the Entity Framework to get my Models in and out of the database.
My initial thoughts were to have four tables: Property, Case, Quote and Message.
They'd all have their own sequential IDs, and the Case and Quote would be related back to Property by a PropertyID field.
The only way I can think of to relate a Message table back to the Case and Quote tables is to have both a RelationID and RelationType field, but there's no obvious way to tell SQL server how that relationship works, so I won't have any referential integrity.
Any ideas, suggestions, help?
Thanks,
Anthony
I am assuming Property doesn't also inherit from PropertyObject.
Given that these tables, Property, Case, Quote and Message, leads to a Table per Concrete Class or TPC inheritance strategy, which I generally don't recommend.
My recommendation is that you use either:
Table per Hierarchy or TPH - Case and Quote are stored in the same table with one column used as a discriminator, with nullable columns for properties that are not shared.
Table per Type or TPT - add a PropertyObject table with the shared fields and Case and Quote tables with just the extra fields for those types
Both of these strategies will allow you to maintain referential integrity and are supported by most ORMs.
see this for more: Tip 12 - How to choose an inheritance strategy
Hope this helps
Alex
Ahhh... Abstraction.
The trick with DDD is to recognize that abstraction is not always your friend. In some cases, too much abstraction leads to a too-complex relational model.
You don't always need inheritance. Indeed, the major purpose of inheritance is to reuse code. Reusing a structure can be important, but less so.
You have a prominent is-a pair of relationships: Case IS-A Property and Quote IS-A Property.
You have several ways to implement class hierarchies and "is-a" relationships.
As you've suggested with type discriminators to show which subclass this really is. This works when you often have to produce a union of the various subclasses. If you need all properties -- a union of CaseProperty and QuoteProperty, then this can work out.
You do not have to rely on inheritance; you can have disjoint tables for each set of relationships. CaseProperty and QuoteProperty. You'd have CaseMessage and QuoteMessage also, to follow the distinction forward.
You can have common features in a common table, and separate features in a separate table, and do a join to reconstruct a single object. So you might have a Property table with common features of all properties, plus CaseProperty and QuoteProperty with unique features of each subclass of Property. This is similar to what you're proposing with Case and Quote having foreign keys to Property.
You can flatten a polymorphic class hierarchy into a single table and use a type discriminator and NULL's. A master Property table has type discriminator for Case and Quote. Attributes of Case are nulled for rows that are supposed to be a Quote. Similarly, attributes of Quote are nulled for rows that are supposed to be a Case.
Your question "[how] to relate a Message table back to the Case and Quote tables" stems from a polymorphic set of subclases. In this case, the best solution might be this.
Message has an FK reference to Property.
Property has a type discriminator to separate Quote from Case. The Quote and Case class definitions both map to Property, but rely on a type discriminator, and (usually) different sets of columns.
The point is that the responsibility for Property, CaseProperty and QuoteProperty belongs to that class hierarchy, and not Message.
This is where the DDD concept of Services would come in. The Repository for each of your concrete classes only persist that entity, not the related objects.
So you have Property(), and is the base for your CaseProperty() : Property(). This special-entity is accessed via CasePropertyService(). Within here is where you would do your JOINs and such to the related tables in order to generate your CaseProperty() special entity (which is not really Case() and Property on its own, but a combination).
OT: Due to limitation of .net of where you can't inherit multiple classes, this is my work around. DDD is meant to be a guideline to the overall understanding of your domain. I often give my DDD outline to friends, and have them try to figure out what it does/represent. If it looks clean and they figure it out, it's clean. If your friends look at it and say, "I have no idea what you are trying to persist here." then go back to the drawing board.
But, there's a catch about using any ORM to persist storage of DDD objects (linq, EntityFramework, etc). Have a look at my answer over here:
Stackoverflow: Question about Repositories and their Save methods for domain objects
The catch is all objects must have an identity in the database for ORM. So, this helps you plan your DB structure.
I have recently moved away from using ORM to control direct access, and just have a clean DDD layer. I let my repositories and services control access to the DB layer, and use Velocity to entity-cache my objects. This actually works very well for: 1) DB performance, you design however is most efficient not being coupled to your DOmain objects with direct ORM representation, and 2) your domain model becomes much cleaner with no forced identies on Value Objects and such. Free!

Resources