Can LINQ retrieve distinct entities that have text/CLOB properties?

I have an Entity that has varchar and text columns (Text/CLOB database type).
The following query doesn't work because MyEntity has CLOB columns (Oracle). Similar behavior is expected in SQL Server with its text columns.
var query = (from e in ctx.MyEntity select e).Distinct();
How can I use LINQ to retrieve distinct rows from this table/entity?
Note: I can't use SqlServer's "varchar(max)"; it must be 'text' type; because in other DBMS like Oracle there's no varchar(max) type.
My issue is due to SqlServer and Oracle compatibility with Linq / Entity Framework.
Note 2: I need objects of type MyEntity in the output, not generic types or any other kind of type.
I appreciate any suggestions. Thanks.

Related

Peewee - Optional fields in Model?

We have a few thousand databases, but the number of columns is not consistent.
Is it possible to define columns that may or may not appear in the table?
For example:
class ContactFields(Model):
    id = IntegerField()
    id_2 = IntegerField()
Sometimes id_2 does not exist. However, if I try to create a query, peewee errors out with:
InternalError: (1054, "Unknown column 't1.id_2' in 'field list'")
No, that would be magical as hell. You can try using reflection if you need to dynamically access tables. Or you can just explicitly select only those columns which are present across all databases.
http://docs.peewee-orm.com/en/latest/peewee/playhouse.html#generate_models
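The second suggestion above (select only the columns that are actually present) can be sketched without any ORM. This is a minimal illustration, not peewee's API: it uses the standard-library sqlite3 module as a stand-in database, and the helper names existing_columns and select_available are hypothetical. On MySQL, where the error in the question comes from, the column list would instead be read from information_schema.columns.

```python
import sqlite3

# Hypothetical list of columns we would like; "id_2" may be missing.
WANTED_COLUMNS = ["id", "id_2"]

def existing_columns(conn, table):
    """Return the set of column names the table actually has."""
    # PRAGMA table_info is SQLite-specific; row[1] holds the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1] for row in rows}

def select_available(conn, table, wanted=WANTED_COLUMNS):
    """Select only the wanted columns that exist; missing ones become None."""
    present = [c for c in wanted if c in existing_columns(conn, table)]
    sql = f"SELECT {', '.join(present)} FROM {table}"
    result = []
    for row in conn.execute(sql):
        record = dict.fromkeys(wanted)   # every wanted column defaults to None
        record.update(zip(present, row))
        result.append(record)
    return result

# Demo: this database has no id_2 column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contactfields (id INTEGER)")
conn.execute("INSERT INTO contactfields VALUES (1)")
rows = select_available(conn, "contactfields")
```

Here rows contains one record whose id_2 is None, so calling code can treat the column as optional without the query failing.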

SQL user-defined type field ignored in Entity Framework

I created a user-defined type in a SQL Server database and used it for one table column. When I add this table to Entity Framework, the field with this data type is ignored and the following message is shown:
Warning 2 Error 6005: The data type 'Type_Name' is currently not
supported for the target Entity Framework version; the column 'column_name'
in the table 'DB_Name' was excluded.
How can we map our SQL user-defined type in Entity Framework so that the field with that data type is included?
You'll have to create a complex type in the EF designer to accomplish this.
Read here for that process:
https://msdn.microsoft.com/en-us/data/jj680147.aspx

How to retrieve a record with a binary id

We have a table in SQL Server. Some genius made the identity field the type binary(8). Whatever anyone else may say, that's a bad idea. I've been trying for over an hour to get a record out of the table in Entity Framework using LINQ. I can do it in SQL Server Management Studio using the criteria "WHERE Contact_Id = 53147", but in EF the field is declared as type byte[], so it won't take an int parameter. I found a way to convert my int to byte[], but it still can't find the record. How can I construct a query in LINQ that will retrieve this simple record?
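One possible cause, offered as an assumption rather than a confirmed diagnosis: SQL Server casts an int to binary(8) by left-padding with zeros and putting the most significant byte first, while many int-to-byte[] helpers produce only 4 bytes, least significant byte first. The sketch below shows the two layouts for the value from the question. The function name int_to_binary8 is hypothetical, and it assumes the leading bytes of the stored key are zero.

```python
def int_to_binary8(value: int) -> bytes:
    """Encode an int as 8 bytes, most significant byte first (big-endian),
    matching the layout of CAST(value AS binary(8)) in SQL Server."""
    return value.to_bytes(8, byteorder="big")

# The key to compare against the binary(8) column.
key = int_to_binary8(53147)

# What a 4-byte little-endian conversion would produce instead.
little = (53147).to_bytes(4, byteorder="little")

print(key.hex())     # 000000000000cf9b
print(little.hex())  # 9bcf0000
```

If the byte[] passed to the query has the second layout, it will not equal the stored value, which would explain why the record is not found.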

Approach to generic database design

An application that I'm dealing with at a customer looks like this:
It allows end users to enter "materials".
To those materials, they can append any number of "properties".
Properties can have a value of any of these types: decimal, int, dateTime and varchar (length varying from 5 characters to large chunks of text).
Essentially, the Schema looks like this:
Materials
    MaterialID int not null PK
    MaterialName varchar(100) not null
Properties
    PropertyID
    PropertyName varchar(100)
MaterialsProperties
    MaterialID
    PropertyID
    PropertyValue varchar(3000)
An essential feature of the application is the search functionality:
end users can search materials by entering queries like:
[property] inspectionDate > [DateTimeValue]
[property] serialNr = 35465488
Guess how this performs over the MaterialsProperties-table with nearly 2 million records in it.
The database was initially created under SQL Server 2000 and later migrated to SQL Server 2005.
How can this be done better?
You could consider separating your MaterialsProperties table by type, e.g. into IntMaterialProperties, CharMaterialProperties, etc. This would:
Partition your data.
Allow for potentially faster look-ups on integer (or other numeric) values.
Potentially reduce storage costs.
You could also introduce a Type column to Properties, which you could use to determine which MaterialProperties table to query. The column could also be used to validate the user's input is of the correct type, eliminating the need to query given "bad" input.
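The idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the original application's schema: it uses the standard-library sqlite3 module instead of SQL Server, only two typed tables (IntMaterialProperties and a hypothetical DateMaterialProperties), and made-up sample rows. The Type column on Properties both selects the table to query and validates the user's input before any query runs.

```python
import sqlite3
from datetime import date

# Hypothetical mapping from Properties.Type to the typed table and a validator.
TYPED_TABLES = {"int": "IntMaterialProperties", "date": "DateMaterialProperties"}
CASTS = {
    "int": int,
    # ISO-8601 dates compare correctly as text, so store them in that form.
    "date": lambda s: date.fromisoformat(s).isoformat(),
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Materials (MaterialID INTEGER PRIMARY KEY, MaterialName TEXT NOT NULL);
CREATE TABLE Properties (PropertyID INTEGER PRIMARY KEY,
                         PropertyName TEXT UNIQUE, Type TEXT NOT NULL);
CREATE TABLE IntMaterialProperties (MaterialID INTEGER, PropertyID INTEGER,
                                    PropertyValue INTEGER);
CREATE TABLE DateMaterialProperties (MaterialID INTEGER, PropertyID INTEGER,
                                     PropertyValue TEXT);
CREATE INDEX ix_int ON IntMaterialProperties (PropertyID, PropertyValue);
CREATE INDEX ix_date ON DateMaterialProperties (PropertyID, PropertyValue);
INSERT INTO Materials VALUES (1, 'pump'), (2, 'valve');
INSERT INTO Properties VALUES (1, 'serialNr', 'int'), (2, 'inspectionDate', 'date');
INSERT INTO IntMaterialProperties VALUES (1, 1, 35465488), (2, 1, 111);
INSERT INTO DateMaterialProperties VALUES (1, 2, '2008-05-01'), (2, 2, '2009-02-15');
""")

def search(property_name, operator, raw_value):
    """Return material names whose property satisfies `operator raw_value`."""
    if operator not in ("=", "<", ">", "<=", ">="):
        raise ValueError("unsupported operator")
    row = conn.execute(
        "SELECT PropertyID, Type FROM Properties WHERE PropertyName = ?",
        (property_name,),
    ).fetchone()
    if row is None:
        return []
    property_id, type_name = row
    value = CASTS[type_name](raw_value)  # raises ValueError on bad input
    table = TYPED_TABLES[type_name]      # table name comes from our own mapping
    sql = (
        f"SELECT m.MaterialName FROM {table} p "
        "JOIN Materials m ON m.MaterialID = p.MaterialID "
        f"WHERE p.PropertyID = ? AND p.PropertyValue {operator} ? "
        "ORDER BY m.MaterialName"
    )
    return [r[0] for r in conn.execute(sql, (property_id, value))]
```

With the sample rows, search("serialNr", "=", "35465488") returns the pump and search("inspectionDate", ">", "2009-01-01") returns the valve, each comparing native values against an indexed typed column rather than parsing a varchar(3000).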
Since users can enter their own property names, I guess every query is going to involve a scan of the Properties table (in your example, I need to find the PropertyID of [inspectionDate]). If the Properties table is large, your join would also take a long time. You could try to optimize by denormalizing and storing the name alongside the PropertyID. This would be a denormalized column in the MaterialsProperties table.
You could try adding a property type (int, char etc) to the materialsproperty table and partition the table on the type.
Look at Object Relational Mapping/Entity Attribute Value Model techniques for query optimization.
Since you already have a lot of data (2 million records), do some data mining and see if there are repeating groups of properties for many materials. You can then put those in one schema and keep the rest in the EAV table. Look here for details: http://portal.acm.org/citation.cfm?id=509015&dl=GUIDE&coll=GUIDE&CFID=49465839&CFTOKEN=33971901

How to design a database for unknown amount of 'meta'-data

I want to store certain items in the database with a variable number of properties.
For example:
An item can have 'url' and 'pdf' properties, but others do not and instead have 'image' and 'location' properties.
So the problem is that some items have only a few properties and others have a lot.
How would you design this database? How would you make it searchable and performant?
What would the schema look like?
Thanks!
What you are after has a name - Entity Attribute Value (EAV). It is "a data model that is used in circumstances where the number of attributes (properties, parameters) that can be used to describe a thing (an "entity" or "object") is potentially very vast, but the number that will actually apply to a given entity is relatively modest."
If you are not necessarily tied to SQL, a triple store is designed for precisely this task. Most are designed to be queried with the SPARQL query language.
That sounds like a perfect job for a document database.
Start with your object (item) and create a table for items. Your item can have one or many attributes, or none at all, right? So set up a table of attributes with unique IDs. Now set up a table that holds many items (some can repeat) and many attributes (these can repeat as well):
Item
    ItemID
    ItemDescription
    ...
Attributes
    AttributeID
    AttributeDescription
    ...
ItemAttributes
    rowID
    ItemID
    AttributeID
Now when you want to query you can simply join the tables and filter however you desire...
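The join described above can be sketched like this. It is a minimal illustration using the standard-library sqlite3 module; the sample rows and the helper name items_with_attribute are hypothetical, and the elided columns ("...") in the schema are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, ItemDescription TEXT);
CREATE TABLE Attributes (AttributeID INTEGER PRIMARY KEY, AttributeDescription TEXT);
CREATE TABLE ItemAttributes (
    rowID INTEGER PRIMARY KEY,
    ItemID INTEGER REFERENCES Item(ItemID),
    AttributeID INTEGER REFERENCES Attributes(AttributeID)
);
INSERT INTO Item VALUES (1, 'brochure'), (2, 'photo');
INSERT INTO Attributes VALUES (1, 'url'), (2, 'pdf'), (3, 'image'), (4, 'location');
INSERT INTO ItemAttributes (ItemID, AttributeID) VALUES (1, 1), (1, 2), (2, 3), (2, 4);
""")

def items_with_attribute(name):
    """Return descriptions of all items linked to the named attribute."""
    sql = """
        SELECT i.ItemDescription
        FROM Item i
        JOIN ItemAttributes ia ON ia.ItemID = i.ItemID
        JOIN Attributes a ON a.AttributeID = ia.AttributeID
        WHERE a.AttributeDescription = ?
        ORDER BY i.ItemDescription
    """
    return [row[0] for row in conn.execute(sql, (name,))]
```

For example, items_with_attribute("pdf") finds the brochure and items_with_attribute("image") finds the photo. Storing an actual value per item and attribute would need an extra column on ItemAttributes, which the schema above does not show.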
The Entity Attribute Value (EAV) model is very flexible. The semantic web and its query language SPARQL are based on EAV too. But some people don't like it because there is a performance penalty with this model.
Start with doing some high load performance tests on your database. Don't do them when you are done coding, because then it is too late.
Edit: Focus on the speed of your select statements. Users expect quick results when they search.
I have designed tables like this in the past to have the following fields:
id
type
subtype
value
And then I would have another table that would define the types and subtypes used, and possibly give the datatype for each type and subtype combination so that you could programmatically enforce it.
It's not pretty, and you don't want to do it unless you have to. But it's the best way I have found when you do.
Update: even if you leave subtype blank, I find it's a good thing to have, because all too often you want to subcategorize something that already exists. For example, you create the type address, and now you need mailing address, billing address, and physical address.
For this kind of scenario I use the XML-type column in MS SQL 2005.
You'll have all the advantages of XML + SQL; that is, you can use an XPath expression as part of an SQL statement.
It's a feature of MS SQL 2005, I am not sure which other RDBMS support this.
I am not sure what the implications are performance wise.
Create a properties table with the following fields:
item_id int (or whatever the ID type is in the item table)
property_name varchar(500)
property_value varchar(500)
Set a foreign key between item_id and the item's id field, and you're done.
That's how you do a many-to-one relationship in SQL.
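A minimal sketch of that table, assuming an item table whose key column is called id. It uses the standard-library sqlite3 module, where foreign keys are enforced only after enabling them with a PRAGMA; the sample values and the helper name properties_of are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY)")
conn.execute("""
CREATE TABLE properties (
    item_id INTEGER NOT NULL REFERENCES item(id),
    property_name VARCHAR(500) NOT NULL,
    property_value VARCHAR(500)
)""")
conn.execute("INSERT INTO item VALUES (1)")
conn.executemany(
    "INSERT INTO properties VALUES (?, ?, ?)",
    [(1, "url", "http://example.com"), (1, "pdf", "doc.pdf")],
)

def properties_of(item_id):
    """Return all properties of one item as a name -> value dict."""
    rows = conn.execute(
        "SELECT property_name, property_value FROM properties WHERE item_id = ?",
        (item_id,),
    )
    return dict(rows)

# The foreign key rejects properties that point at a non-existent item.
try:
    conn.execute("INSERT INTO properties VALUES (99, 'url', 'x')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Each item can own any number of rows in properties, and the foreign key keeps those rows from outliving or preceding their item.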
Looks like an "items" table with primary key "item_id", a "properties" table with primary key "property_id" and a foreign key "item_id" with the "items" table. "properties" will have columns "name" and "value", both of type varchar.
Performant? Don't know.
