How do i model multiple photos (for a Hotel) with schema.org? - data-modeling

I am new to schema.org. Currently i am trying to use it as our internal data model for imports as it offers a good "common ground" for all source systems.
The Hotel schema (https://schema.org/Hotel) offers a "photo" (singular) property, it inherits from Place. It used to have a "photos" (plural) property in the past.
When using schema.org for markup, this would not matter, as i can just mark up multiple elements as "photo".
However, when using it as a data class, how should i model it?
Should i just make it an array of Photograph?
If yes, does schema.org actually assume on ANY property that it may be multiple (amenityFeature, availableLanguage, etc. suspiciously look like that)?
Does that mean, i have to actually model every property as an array?

After some additional research i have to assume schema.org is not meant as a full data model. It is mostly about providing a common vocabulary and a hierarchy of information. Its primary use case seems to be markup, so types definitions are very vague since they have to work on content that is actually meant to be presented to a user. So i will have to specify my own schema and let my decisions and my naming be guided by schema.org.

Related

Does extensive use of ndb models affect performance?

I'm new to GAE and I'm still trying to figure things out. We're developing an Android app which uses Cloud Datastore to store images, videos, text, audios, etc. So we have now over 15 types of content objects.
I've been modelling each type of object as a distinct ndb Model class, but I'm wondering if this kind of design could affect performance.
Specifically, wouldn't it be better to write a simple class (e.g ContentObject) which simply had a content_type, and a few generic fields as string, number and blob?
I guess I'd go for the latter if I had to worry about creating/maintaining tables (or simply knowing that there are regular db tables behind).
I really like the first option, but I had to ask, just in case.
There are no performance differences to worry about between the 2 approaches.
With dedicated models you'll have to write a bit more code - each model needs to be handled separately. But it's simpler code, especially if eventually you will have some properties which only exist for some entities or are handled differently, which would require conditional logic with a generic model.
Building queries is also simpler with dedicated models if there are property differences, using a single model may require filling in unused properties (maybe by using default values) if they are used for sorting/filtering query results (entities with missing properties aren't indexed by the respective properties so they won't show up in the results).
On the other hand you'll need separate queries for each model, you can't obtain results for different kinds in the same query. And you'll need to maintain separate composite indexes for each kind (with a total limit of 200 such indexes per application).
If you're worrying about code duplication, which could also be a reason for which you'd consider a shared model, it's also possible to combine the common properties in a single ndb model class, with a single/common implementation for handling those common properties, and inherit that class in dedicated subclasses handling the differences. Something like this:
class Content(ndb.Model):
type = ndb.StringProperty() # not really needed, cls._get_kind() can be used instead
blob = ndb.StringProperty()
# other generic/common content properties and related methods
class Video(Content):
has_cc = ndb.BooleanProperty()
# other video-specific content properties and related methods
But this is just an implementation approach, from the datastore perspective you're still using dedicated models - in the above example a video entity will have a Video kind, not a Content kind.
There are no tables with the datastore, the only thing shared between entities of the same kind is their ndb model (which is specific just for the more performant ndb client library, other client libraries don't have one) and the search indexes definitions.

Class between database and UI

I have a class that handles writing and reading data from my database. What is a proper name to call this class?
There are a couple of conventions. Assuming a Person model, you could use:
PersonDataAccessObject,
PersonDao,
PersonRepository,
PersonDataAccess,
...
It is also dependent on the technology you are using. I mean, who knows what conventions exist for the language you are using. Let us know what language and what data access framework and the answer may vary.
I used to append "Dao" because it's short and clear. But then I moved over more to Martin Fowler's vocabulary and patterns, so now I use Repository. A little more long winded, but I'm long winded by nature, so it fits my style. In the end, that's the key. It's stylistic and there is no across the board standard that I'm aware of. What's most important is that you pick something that is clear and you use it consistently. If you decide, later on, to switch to something else, have mercy on any programmers that may follow you and rename everything so that all your data access components are consistently named.
Edit: in rereading this, I realized I am assuming you are going to have multiple such classes, one for each of your model entities. Who knows what your setup is. If you aren't going to do it like that, and you're just looking for a standard name for a single point of entry to all data access, you could use:
DataMapper
Gateway
Typically, the assumption is that you are going to have several of these around, one for each of your "tables"/model entities. More than a naming convention, that is probably a standard coding convention. This way, when you change or add some aspect of how you interact with your "persons" table, you don't have to modify a class in which you have code to access the "addresses" table. Check out Martin Fowler's Patterns of Enterprise Application Architecture (PofEAA), for more
PofEAA catalog of patterns (check out Data Source Architectural Patterns
and
Domain Driven Design Quickly (free pdf) esp. Ch. 3
Depending on the entity this class represents it could be for example Person. Then you design a PersonViewModel which is passed to the GUI. So the Person you got from the database is mapped to a PersonViewModel which is passed to the UI layer for being shown under some form. The view model is just a representation of the domain model you fetched from the database and containing only the necessary information that you need to display on the given UI.

Django models generic modelling

Say, there is a Page that has many blocks associated with it. And each block needs custom rendering, saving and data.
Simplest it is, from the code point of view, to define different classes (hence, models) for each of these models. Simplified as follows:
class Page(models.Model):
name = models.CharField(max_length=64)
class Block(models.Model):
page = models.ForeignKey(Page)
class Meta():
abstract = True
class BlockType1(Block):
other_data = models.CharField(max_length=32)
def render(self):
"""Some "stuff" here """
pass
class BlockType2(Block):
other_data2 = models.CharField(max_length=32)
def render(self):
"""Some "other stuff" here """
pass
But then,
Even with this code, I can't do a query like page.block_set.all() to obtain all the different blocks, irrespective of the block type.
The reason for the above is that, each model defines a different table; Working around to accomplish it using a linking model and generic foreign keys, can solve the problem, but it still leaves multiple database tables queries per page.
What would be the right way to model it? Can the generic foreign keys (or something else) be used in some way, to store the data preferably in the same database table, yet achieve inheritance paradigms.
Update:
My point was, How can I still get the OOP paradigms to work. Using a same method with so many ifs is not what I wanted to do.
The best solution, seems to me, is to create separate standard python class (Preferably in a different blocks.py), that defines a save which saves the data and its "type" by instantiating the same model. Then create a template tag and a filter that calls the render, save, and other methods based on the model's type.
Don't model the page in the database. Pages are a presentation thing.
First -- and foremost -- get the data right.
"And each block needs custom rendering, saving and data." Break this down: you have unique data. Ignore the "block" and "rendering" from a model perspective. Just define the data without regard to presentation.
Seriously. Just define the data in the model without any consideration of presentation or rending or anything else. Get the data model right.
If you confuse the model and the presentation, you'll never get anything to work well. And if you do get it to work, you'll never be able to extend or reuse it.
Second -- only after the data model is right -- you can turn to presentation.
Your "blocks" may be done simply with HTML <div> tags and a style sheet. Try that first.
After all, the model works and is very simple. This is just HTML and CSS, separate from the model.
Your "blocks" may require custom template tags to create more complex, conditional HTML. Try that second.
Your "blocks" may -- in an extreme case -- be so complex that you have to write a specialized view function to transform several objects into HTML. This is very, very rare. You should not do this until you are sure that you can't do this with template tags.
Edit.
"query different external data sources"
"separate simple classes (not Models) that have a save method, that write to the same database table."
You have three completely different, unrelated, separate things.
Model. The persistent model. With the save() method. These do very, very little.
They have attributes and a few methods. No "query different external data sources". No "rendering in HTML".
External Data Sources. These are ordinary Python classes that acquire data.
These objects (1) get external data and (2) create Model objects. And nothing else. No "persistence". No "rendering in HTML".
Presentation. These are ordinary Django templates that present the Model objects. No external query. No persistence.
I just finished a prototype of system that has this problem in spades: a base Product class and about 200 detail classes that vary wildly. There are many situations where we are doing general queries against Product, but then want to to deal with the subclass-specific details during rendering. E.g. get all Products from Vendor X, but display with slightly different templates for each group from a specific subclass.
I added hidden fields for a GenericForeignKey to the base class and it auto-fills the content_type & object_id of the child class at save() time. When we have a generic Product object we can say obj = prod.detail and then work directly with the subclass object. Took about 20 lines of code and it works great.
The one gotcha we ran into during testing was that manage.py dumpdata followed by manage.py loaddata kept throwing Integrity Errors. Turns out this is a well-known problem and a fix is expected in the 1.2 release. We work around it by using mysql commands to dump/reload the test dataset.

Model-View-ViewModel pattern violation of DRY?

I read this article today http://dotnetslackers.com/articles/silverlight/Silverlight-3-and-the-Data-Form-Control-part-I.aspx about the use of the MVVM pattern within a silverlight app where you have your domain entities and view spesific entities which basically is a subset of the real entity objects. Isn't this a clear violation of the DRY principle? and if so how can you deal with it in a nice way?
Personally, I don't like what Dino's doing there and I wouldn't approach the problem the same way. I usually think of a VM as a filtered, grouped and sorted collections of Model classes. A VM to me is a direct mapping to the View, so I might create a NewOrderViewModel class that has multiple CollectionViews used by the View (maybe one CV for Customers and another CV for Products, probably both filtered). Creating an entirely new VM class for every class in the Model does violate DRY in my opinion. I would rather use derivation or partial classes to augment the Model where necessary, adding in View specific (often calculated) properties. IMO .NET RIA Services is an excellent implementation of combining M and VM data with the added bonus that it's usable in on both the client and the server. Dino's a brilliant guy, but way to call him out on this one.
DRY is a principle, not a hard rule. You are a human and can differentiate.
E.g. If DRY really was a hard rule you would never assign the same value to two different variables. I guess in any non trivial program you would have more than one variable containing the value 0.
Generally speaking: DRY does usually not apply to data. Those view specific entities would probably only be data transfer objects without any noteworthy logic. Data may be duplicated for all kinds of reasons.
I think the answer really depends on what you feel should be in the ViewModel. For me the ViewModel represents the model of the screen currently being displayed.
So for something like a ViewCategoryViewModel, I don't have a duplication of the fields in Category. I expose a Category object as a property on the ViewModel (under say "SelectedCategory"), any other data the view needs to display and the Commands that screen can take.
There will always be some similarity between the domain model and the view model, but it all comes down to how you choose to create the ViewModel.
It's the same as with Data Transfer Objects (DTO).
The domain for those two object types is different, so it's not a violation of DRY.
Consider the following example:
class Customer
{
public int Age
}
And a corsponding view model:
class CustomerViewModel
{
public string Age;
// WPF validation code is going to be a bit more complicated:
public bool IsValid()
{
return string.IsNullOrEmpty(Age) == false;
}
}
Differnt domains - differnet property types - different objects.

Teaching: Field, Class & Package Relationships

In general I think I can convey most programming related concepts quite well.
Yet, I still find it hard to summarise the relationship between Fields, Classes and Packages.
How do You summarise "Fields", "Classes" and "Packages" and "Their Relationship" ?
I've faced a similar problem since I taught C, C++, and Java.
Here is what I do:
First, I keep packages separately and explain them in the end.
Ideally, in my opinion, students should first learn about ADTs, preferably in C. They have the struct, they have the separate operations on it. Fields are then simply the "slots" in the struct and you can even show the memory layout to demonstrate it. Functions are separate entities that operate on those structs.
You then make the transition to classes, methods, and fields and show that in essence (barring inheritance and some anecdotes) they are in many ways syntactic sugar for ADTs.
If you need, you can then teach object layouts, inheritance, and virtual tables (in my experience it helps students understand inheritance better to see the memory layout).
Finally you get to the topic of how to organize classes together. If you teach C++, you don't really have packages but you can explain namespaces and discuss organization and separate compilation.
If you are in Java, then you just explain that these are collections of classes in the same namespace, that have special access rules and show them. The package system in Java is kind of broken anyway so I usually go through patterns (e.g., separating a UI package from the C).
So in summary: Classes form the basis for objects that are a memory arrangement of several fields and associated methods that operate on them. Packages are collections of classes that have one more access restriction mechanism.
The way I describe it is:
Objects are collections of slots, slots holding data are fields, slots holding code are methods. Public slots are on the outside of the object, private slots are on the inside. Methods should be mostly public because an object offers services to clients, fields should be private so clients don't know how the services work. Fields are therefore an implementation detail of objects.
Class names need to be unique, so that you can combine your code with third party libraries. Simple/short class names are insufficient, since there are probably thousands of classes called 'List', 'Customer' etc... Hence classes are placed in packages to create longer, harder to duplicate names. Only a subset of the classes in the package need to be visible to clients, hence the two access levels of public and default. This allows a package to function as a library.
So fields are an implementation detail of objects, whose classes live in packages to guarantee unique names and provide library-like modularity.
Depending on the age of the person you're trying to explain it to, there's a simple analogy that can be used: tax forms. A tax form (such as the 1040EZ, for instance) is like a class, and each space to be filled in on the form is a field of the form. The tax form even contains instructions on what to be done with the information in the fields, just as a class includes member functions to be performed on the data in the fields. And just as a complete set of tax forms includes not just the main tax form, but others that may need to be filled out (additional schedules, for example) so a package contains not just the main classes but other classes it may need to interact with.
Fields are variables that belong to the class, or to object instances of the class. The difference between a local variable and a field is that fields have a broader scope.
Classes are templates for user-defined data types. Classes are more advanced than the primitive data types because they have both state and behavior.
Packages are used to group classes and to resolve potential naming conflicts. With multiple developers and publicly available code libraries it's very likely that some of us will name our classes the same (Math, LinkedList, FileUtils, etc.). Having a unique package name prefixing the class name allows the compiler (and other developers) to determine which class you intend to use.
Interestingly, you tackled OO programming without mentioning objects. I think that may be your problem.
Here's what I use.
Objects are things. They have attributes (measurements, states of being, etc.) Attributes can be called fields. [I often use things I find in the classroom -- cups, markers, hats, coats, etc., to illustrate this.]
Objects also engage in behaviors, called methods, method functions or operations.
The features (attributes and operations, fields and methods, whatever) of an object provide a way to classify objects.
The features that are common to a class of objects is -- well -- can be collected into a class definition. A class definition describes the attributes and methods of the objects that are members of the class.
A package is a collection of class definitions. While -- ideally -- the classes in a package have something in common, that isn't a requirement and isn't a helpful distinction.

Resources