Teaching: Field, Class & Package Relationships - theory

In general I think I can convey most programming related concepts quite well.
Yet, I still find it hard to summarise the relationship between Fields, Classes and Packages.
How do you summarise "Fields", "Classes" and "Packages" and their relationship?

I've faced a similar problem since I taught C, C++, and Java.
Here is what I do:
First, I keep packages separate and explain them at the end.
Ideally, in my opinion, students should first learn about ADTs, preferably in C. They have the struct, they have the separate operations on it. Fields are then simply the "slots" in the struct and you can even show the memory layout to demonstrate it. Functions are separate entities that operate on those structs.
You then make the transition to classes, methods, and fields and show that in essence (barring inheritance and some anecdotes) they are in many ways syntactic sugar for ADTs.
If you need, you can then teach object layouts, inheritance, and virtual tables (in my experience it helps students understand inheritance better to see the memory layout).
Finally you get to the topic of how to organize classes together. If you teach C++, you don't really have packages but you can explain namespaces and discuss organization and separate compilation.
If you are in Java, then you just explain that these are collections of classes in the same namespace, that have special access rules and show them. The package system in Java is kind of broken anyway so I usually go through patterns (e.g., separating a UI package from the C).
So in summary: Classes form the basis for objects that are a memory arrangement of several fields and associated methods that operate on them. Packages are collections of classes that have one more access restriction mechanism.
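A minimal sketch of the ADT-to-class transition described above, written in Python rather than C or Java so that both styles fit in one runnable file. The names PointRecord, point_translate and Point are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


# ADT style: a plain record plus a free function that operates on it.
@dataclass
class PointRecord:
    x: float
    y: float


def point_translate(p: PointRecord, dx: float, dy: float) -> None:
    """Free function: the record is passed in explicitly."""
    p.x += dx
    p.y += dy


# Class style: the same fields, with the operation attached as a method.
class Point:
    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y

    def translate(self, dx: float, dy: float) -> None:
        """Method: 'self' plays the role of the explicit record argument."""
        self.x += dx
        self.y += dy


record = PointRecord(1.0, 2.0)
point_translate(record, 3.0, 4.0)

point = Point(1.0, 2.0)
point.translate(3.0, 4.0)
```

Both versions end up with the same field values; the method call is mostly a different spelling of the function call, which is the "syntactic sugar" point.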

The way I describe it is:
Objects are collections of slots, slots holding data are fields, slots holding code are methods. Public slots are on the outside of the object, private slots are on the inside. Methods should be mostly public because an object offers services to clients, fields should be private so clients don't know how the services work. Fields are therefore an implementation detail of objects.
Class names need to be unique, so that you can combine your code with third party libraries. Simple/short class names are insufficient, since there are probably thousands of classes called 'List', 'Customer' etc... Hence classes are placed in packages to create longer, harder to duplicate names. Only a subset of the classes in the package need to be visible to clients, hence the two access levels of public and default. This allows a package to function as a library.
So fields are an implementation detail of objects, whose classes live in packages to guarantee unique names and provide library-like modularity.
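A small sketch of "public methods, private fields", using a hypothetical Counter class. Note the assumption: Python marks privacy only by naming convention (a leading underscore), whereas Java enforces it with access modifiers.

```python
class Counter:
    """Clients see the services (methods); the field stays an implementation detail."""

    def __init__(self) -> None:
        self._count = 0  # leading underscore: private by convention only

    def increment(self) -> None:
        self._count += 1

    def value(self) -> int:
        return self._count


counter = Counter()
counter.increment()
counter.increment()
```

A client only needs increment() and value(); how the count is stored could change without the client noticing.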

Depending on the age of the person you're trying to explain it to, there's a simple analogy that can be used: tax forms. A tax form (such as the 1040EZ, for instance) is like a class, and each space to be filled in on the form is a field of the form. The tax form even contains instructions on what to be done with the information in the fields, just as a class includes member functions to be performed on the data in the fields. And just as a complete set of tax forms includes not just the main tax form, but others that may need to be filled out (additional schedules, for example) so a package contains not just the main classes but other classes it may need to interact with.

Fields are variables that belong to the class, or to object instances of the class. The difference between a local variable and a field is that fields have a broader scope.
Classes are templates for user-defined data types. Classes are more advanced than the primitive data types because they have both state and behavior.
Packages are used to group classes and to resolve potential naming conflicts. With multiple developers and publicly available code libraries it's very likely that some of us will name our classes the same (Math, LinkedList, FileUtils, etc.). Having a unique package name prefixing the class name allows the compiler (and other developers) to determine which class you intend to use.
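The naming point can be seen in Python's standard library, where modules play the role that packages play in Java. This sketch only inspects two existing classes that share the short name Queue.

```python
import asyncio
import queue

# Both classes have the short name "Queue"; the module path tells them apart.
short_names = (queue.Queue.__name__, asyncio.Queue.__name__)
module_names = (queue.Queue.__module__, asyncio.Queue.__module__)

same_class = queue.Queue is asyncio.Queue
```

The short names are identical, the module names differ, and the two are distinct classes, so code has to use the qualified name to say which one it means.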

Interestingly, you tackled OO programming without mentioning objects. I think that may be your problem.
Here's what I use.
Objects are things. They have attributes (measurements, states of being, etc.) Attributes can be called fields. [I often use things I find in the classroom -- cups, markers, hats, coats, etc., to illustrate this.]
Objects also engage in behaviors, called methods, method functions or operations.
The features (attributes and operations, fields and methods, whatever) of an object provide a way to classify objects.
The features that are common to a class of objects can be collected into a class definition. A class definition describes the attributes and methods of the objects that are members of the class.
A package is a collection of class definitions. While -- ideally -- the classes in a package have something in common, that isn't a requirement and isn't a helpful distinction.


How do I model multiple photos (for a Hotel) with schema.org?

I am new to schema.org. Currently I am trying to use it as our internal data model for imports, as it offers a good "common ground" for all source systems.
The Hotel schema (https://schema.org/Hotel) offers a "photo" (singular) property, which it inherits from Place. It used to have a "photos" (plural) property in the past.
When using schema.org for markup, this would not matter, as I can just mark up multiple elements as "photo".
However, when using it as a data class, how should I model it?
Should I just make it an array of Photograph?
If yes, does schema.org actually assume that ANY property may have multiple values (amenityFeature, availableLanguage, etc. suspiciously look like that)?
Does that mean I have to actually model every property as an array?
After some additional research I have to assume schema.org is not meant as a full data model. It is mostly about providing a common vocabulary and a hierarchy of information. Its primary use case seems to be markup, so type definitions are very vague since they have to work on content that is actually meant to be presented to a user. So I will have to specify my own schema and let my decisions and my naming be guided by schema.org.
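One way such a self-defined schema could look, as a sketch only: the property name "photo" is kept from schema.org, but modelling it as a list is my own decision, not something schema.org prescribes. The field names content_url and caption are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Photograph:
    content_url: str  # hypothetical field, loosely modelled on schema.org "contentUrl"
    caption: Optional[str] = None


@dataclass
class Hotel:
    name: str
    # Singular name kept from schema.org's "photo"; a list by our own choice.
    photo: List[Photograph] = field(default_factory=list)


hotel = Hotel(name="Example Hotel")
hotel.photo.append(Photograph("https://example.com/lobby.jpg", caption="Lobby"))
hotel.photo.append(Photograph("https://example.com/pool.jpg"))
```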

Does extensive use of ndb models affect performance?

I'm new to GAE and I'm still trying to figure things out. We're developing an Android app which uses Cloud Datastore to store images, videos, text, audios, etc. So we have now over 15 types of content objects.
I've been modelling each type of object as a distinct ndb Model class, but I'm wondering if this kind of design could affect performance.
Specifically, wouldn't it be better to write a simple class (e.g ContentObject) which simply had a content_type, and a few generic fields as string, number and blob?
I guess I'd go for the latter if I had to worry about creating/maintaining tables (or simply knowing that there are regular db tables behind).
I really like the first option, but I had to ask, just in case.
There are no performance differences to worry about between the 2 approaches.
With dedicated models you'll have to write a bit more code - each model needs to be handled separately. But it's simpler code, especially if eventually you will have some properties which only exist for some entities or are handled differently, which would require conditional logic with a generic model.
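To illustrate the "conditional logic" trade-off, here is a sketch using plain Python dataclasses, not ndb, so it runs anywhere. All class, field and function names are hypothetical.

```python
from dataclasses import dataclass


# Generic model: one class for everything, so behaviour branches on content_type.
@dataclass
class ContentObject:
    content_type: str
    text: str = ""
    has_cc: bool = False  # only meaningful for videos


def describe_generic(obj: ContentObject) -> str:
    if obj.content_type == "video":
        return "video (captions)" if obj.has_cc else "video (no captions)"
    if obj.content_type == "text":
        return f"text of {len(obj.text)} characters"
    return "unknown content"


# Dedicated models: more classes, but each one only knows its own properties.
@dataclass
class Video:
    has_cc: bool = False

    def describe(self) -> str:
        return "video (captions)" if self.has_cc else "video (no captions)"


@dataclass
class Text:
    text: str = ""

    def describe(self) -> str:
        return f"text of {len(self.text)} characters"
```

With 15+ content types, the if-chain in describe_generic grows with every type, while each dedicated class stays small.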
Building queries is also simpler with dedicated models if there are property differences. Using a single model may require filling in unused properties (maybe by using default values) if they are used for sorting/filtering query results, because entities with missing properties aren't indexed by the respective properties and so won't show up in the results.
On the other hand you'll need separate queries for each model; you can't obtain results for different kinds in the same query. And you'll need to maintain separate composite indexes for each kind (with a total limit of 200 such indexes per application).
If you're worrying about code duplication, which could also be a reason for which you'd consider a shared model, it's also possible to combine the common properties in a single ndb model class, with a single/common implementation for handling those common properties, and inherit that class in dedicated subclasses handling the differences. Something like this:
class Content(ndb.Model):
    type = ndb.StringProperty()  # not really needed, cls._get_kind() can be used instead
    blob = ndb.StringProperty()
    # other generic/common content properties and related methods

class Video(Content):
    has_cc = ndb.BooleanProperty()
    # other video-specific content properties and related methods
But this is just an implementation approach, from the datastore perspective you're still using dedicated models - in the above example a video entity will have a Video kind, not a Content kind.
There are no tables with the datastore, the only thing shared between entities of the same kind is their ndb model (which is specific just for the more performant ndb client library, other client libraries don't have one) and the search indexes definitions.

Class between database and UI

I have a class that handles writing and reading data from my database. What is a proper name to call this class?
There are a couple of conventions. Assuming a Person model, you could use:
PersonDataAccessObject,
PersonDao,
PersonRepository,
PersonDataAccess,
...
It is also dependent on the technology you are using. I mean, who knows what conventions exist for the language you are using. Let us know what language and what data access framework and the answer may vary.
I used to append "Dao" because it's short and clear. But then I moved over more to Martin Fowler's vocabulary and patterns, so now I use Repository. A little more long winded, but I'm long winded by nature, so it fits my style. In the end, that's the key. It's stylistic and there is no across the board standard that I'm aware of. What's most important is that you pick something that is clear and you use it consistently. If you decide, later on, to switch to something else, have mercy on any programmers that may follow you and rename everything so that all your data access components are consistently named.
Edit: in rereading this, I realized I am assuming you are going to have multiple such classes, one for each of your model entities. Who knows what your setup is. If you aren't going to do it like that, and you're just looking for a standard name for a single point of entry to all data access, you could use:
DataMapper
Gateway
Typically, the assumption is that you are going to have several of these around, one for each of your "tables"/model entities. More than a naming convention, that is probably a standard coding convention. This way, when you change or add some aspect of how you interact with your "persons" table, you don't have to modify a class in which you have code to access the "addresses" table. Check out Martin Fowler's Patterns of Enterprise Application Architecture (PofEAA), for more
PofEAA catalog of patterns (check out Data Source Architectural Patterns)
and
Domain Driven Design Quickly (free pdf), esp. Ch. 3
Depending on the entity this class represents it could be for example Person. Then you design a PersonViewModel which is passed to the GUI. So the Person you got from the database is mapped to a PersonViewModel which is passed to the UI layer for being shown under some form. The view model is just a representation of the domain model you fetched from the database and containing only the necessary information that you need to display on the given UI.
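A rough sketch of how the pieces named in these answers could fit together: a PersonRepository for data access and a PersonViewModel for the UI. The repository here is an in-memory stand-in, and every field and method name is an assumption for illustration, not a convention of any particular framework.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Person:
    """Domain model as loaded from storage."""
    person_id: int
    first_name: str
    last_name: str
    password_hash: str  # stored, but never shown in the UI


@dataclass
class PersonViewModel:
    """Only what the UI needs to display."""
    display_name: str


class PersonRepository:
    """In-memory stand-in for the class that reads and writes the database."""

    def __init__(self) -> None:
        self._rows: Dict[int, Person] = {}

    def save(self, person: Person) -> None:
        self._rows[person.person_id] = person

    def find_by_id(self, person_id: int) -> Optional[Person]:
        return self._rows.get(person_id)


def to_view_model(person: Person) -> PersonViewModel:
    """Map the domain model to the reduced representation for the UI."""
    return PersonViewModel(display_name=f"{person.first_name} {person.last_name}")


repo = PersonRepository()
repo.save(Person(1, "Ada", "Lovelace", password_hash="not-a-real-hash"))
loaded = repo.find_by_id(1)
view = to_view_model(loaded) if loaded is not None else None
```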

Game player object design

I'm supporting a server for an online card game and while thinking about refactoring it into a better state I have found myself unable to decide what is a proper object model for my needs.
I have a Player class which has a lot of attributes. The first problem is just that - the class is too big. The second problem is that I don't know how to refactor it. I will list some of the attributes and issues with these.
Some attributes are very tightly bound to a player: nick, email, last login &c. These, I suppose, are to be kept directly in the player class and in the same table in the DB.
Now, some attributes are a little more difficult, like money and gold amount. The problem with these is that they are historically stored in a different table, there might be some more currencies later on and that they MUST be synched into the database at their own pace.
Third category of attributes are loosely coupled to the player, like status string, experience, achievements, statistics &c. These are stored in different tables in the DB and MUST be stored, retrieved, cached and synchronized at their own pace.
Note that one of the big problems here is that we have to implement relatively complex database synchronization schemes because we have a lot of online players and our game is soft-realtime and we have to make load on the DB as low as possible.
My questions are:
1. How to determine which attributes to store within a player class and which not to? Say, experience, nickname, money amount?
2. When one has some attributes that may be grouped together like (strength, agility, endurance, &c.) and (handItem, headItem, feetItem, weapon), when should they be grouped and when not?
3. What to do with complex database synchronization schemes? Make a separate model for each attribute that needs to be synched independently or make some DataManager classes to take them apart and work with them?
4. What to do with the need for a class to have several different "data representations" for external consumers? Like XML, Json, another XML for some external service, human-readable string, &c.
I'm sorry if my questions are bogus, I'm not really good at OOP design, I'm more an FP guy. And my English is not very good =).
1. There is no "limit" to what you can store in a player class. As long as it concerns the player and the player only, it belongs in that class. But one thing you should consider is making several player classes. The idea is: if you don't need it, don't query it. You may have PlayerView_Small, PlayerBuying, PlayerFighting, PlayerSettings (depending on your game, they may not fulfil exactly the same purpose)... This way, for each "need" for info on a player, you load only the player data you need and can handle it properly. Also, you may use inheritance if one class is only a more detailed version of another.
2. If you are talking about attribute groups within the class, they can live in a helper class PlayerAttributes, an instance of which is contained in PlayerFighting and PlayerView_Detailed. In the database, it might be interesting to store it as a string (conveniently output by the class, and accepted in its constructor) to avoid having too many fields, but you will lose the ability to sort on it. That's probably not a problem in this case, but might be in some others.
3. Blank for now; I don't understand where the synchronization comes in. Will edit when informed.
4. In your PlayerViewDetailInfo (or in your PlayerAllData, depending on what you need), you place some methods such as ToXmlClient1(), ToJson(), ToHumanReadableString() (although that might be a bit confusing to the eye, you should consider HTML^^). The class having the method should be the class with the least data that is still sufficient to provide the answer. When requested, you load the Player... class which has the method giving the correct output, and you write it directly in the response.
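A sketch of the "several player classes" idea with output methods placed on the smallest class that has enough data. The class names loosely follow the ones suggested above (adapted to Python naming); the fields and the JSON shape are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PlayerAttributes:
    """Grouped attributes, held by composition in the classes that need them."""
    strength: int
    agility: int
    endurance: int


@dataclass
class PlayerViewSmall:
    """Smallest player representation: just enough for a lobby list."""
    nick: str

    def to_human_readable_string(self) -> str:
        return f"Player {self.nick}"


@dataclass
class PlayerFighting(PlayerViewSmall):
    """A more detailed version, so it inherits from the small one."""
    attributes: PlayerAttributes

    def to_json(self) -> str:
        # asdict() also converts the nested PlayerAttributes to a dict.
        return json.dumps(asdict(self), sort_keys=True)


fighter = PlayerFighting(nick="ace", attributes=PlayerAttributes(5, 7, 3))
```

Code that only needs a display string can load a PlayerViewSmall and never touch the attribute tables.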

Is it a bad practice to have multiple classes in the same file?

I used to have one class for one file. For example car.cs has the class car. But as I program more classes, I would like to add them to the same file. For example car.cs has the class car and the door class, etc.
My question is good for Java, C#, PHP or any other programming language. Should I try not having multiple classes in the same file or is it ok?
I think you should try to keep your code to 1 class per file.
I suggest this because it will be easier to find your class later. Also, it will work better with your source control system (if a file changes, then you know that a particular class has changed).
The only time I think it's correct to use more than one class per file is when you are using inner classes... but inner classes are inside another class, and thus can be left inside the same file. The inner classes' roles are strongly related to the outer classes, so placing them in the same file is fine.
In Java, one public class per file is the way the language works. A group of Java files can be collected into a package.
In Python, however, files are "modules", and typically have a number of closely related classes. A Python package is a directory, just like a Java package.
This gives Python an extra level of grouping between class and package.
There is no one right answer that is language-agnostic. It varies with the language.
One class per file is a good rule, but it's appropriate to make some exceptions. For instance, if I'm working in a project where most classes have associated collection types, often I'll keep the class and its collection in the same file, e.g.:
public class Customer { /* whatever */ }
public class CustomerCollection : List<Customer> { /* whatever */ }
The best rule of thumb is to keep one class per file except when that starts to make things harder rather than easier. Since Visual Studio's Find in Files is so effective, you probably won't have to spend much time looking through the file structure anyway.
No, I don't think it's an entirely bad practice. What I mean by that is in general it's best to have a separate file per class, but there are definitely good exception cases where it's better to have a bunch of classes in one file. A good example of this is a group of Exception classes: if you have a few dozen of these for a given group, does it really make sense to have a separate file for each two-liner class? I would argue not. In this case having a group of exceptions in one file is much less cumbersome and simpler IMHO.
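For instance, a family of tiny exception classes kept in one file could look like this Python sketch. The module name errors.py, the class names and the charge function are all hypothetical.

```python
# errors.py (hypothetical module name): one file for a family of small exceptions.
class BillingError(Exception):
    """Base class so callers can catch the whole family at once."""


class CardDeclinedError(BillingError):
    pass


class InvoiceNotFoundError(BillingError):
    pass


def charge(amount: int) -> None:
    """Toy function that raises one of the exceptions defined above."""
    if amount <= 0:
        raise CardDeclinedError("amount must be positive")


try:
    charge(0)
    caught = None
except BillingError as exc:
    caught = exc
```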
I've found that whenever I try to combine multiple types into a single file, I always end up going back and separating them simply because it makes them easier to find. Whenever I combine, there is always ultimately a moment where I'm trying to figure out where I defined type x.
So now, my personal rule is that each individual type (except maybe for child classes, by which I mean a class inside a class, not an inherited class) gets its own file.
Since your IDE Provides you with a "Navigate to" functionality and you have some control over namespacing within your classes then the below benefits of having multiple classes within the same file are quite worth it for me.
Parent - Child Classes
In many cases I find it quite helpful to have inherited classes within their base class file.
It's quite easy then to see which properties and methods your child class inherits and the file provides a faster overview of the overall functionality.
Public: Small - Helper - DTO Classes
When you need several plain and small classes for a specific functionality, I find it quite redundant to have a file with all the references and includes for just a 4-8 line class.
Code navigation is also easier when you just scroll through one file instead of switching between 10 files. It's also easier to refactor when you have to edit just one reference instead of 10.
Overall breaking the Iron rule of 1 class per file provides some extra freedom to organize your code.
What happens then, really depends on your IDE, Language,Team Communication and Organizing Skills.
But if you want that freedom why sacrifice it for an iron rule?
The rule I always go by is to have one main class in a file with the same name. I may or may not include helper classes in that file depending on how tightly they're coupled with the file's main class. Are the support classes standalone, or are they useful on their own? For example, if a method in a class needs a special comparison for sorting some objects, it doesn't bother me a bit to bundle the comparison functor class into the same file as the method that uses it. I wouldn't expect to use it elsewhere and it doesn't make sense for it to be on its own.
If you are working on a team, keeping classes in separate files makes it easier to control the source and reduces the chance of conflicts (multiple developers changing the same file at the same time). I think it makes it easier to find the code you are looking for as well.
It can be bad from the perspective of future development and maintainability. It is much easier to remember where the Car class is if you have a Car.cs file. Where would you look for the Widget class if Widget.cs does not exist? Is it a car widget? Is it an engine widget? Oh maybe it's a bagel widget.
The only time I consider file locations is when I have to create new classes. Otherwise I never navigate by file structure. I Use "go to class" or "go to definition".
I know this is somewhat of a training issue; freeing yourself from the physical file structure of projects requires practice. It's very rewarding though ;)
If it feels good to put them in the same file, be my guest. Can't do that with public classes in Java though ;)
You should refrain from doing so, unless you have a good reason.
One file with several small related classes can be more readable than several files.
For example, when using 'case classes', to simulate union types, there is a strong relationship between each class.
Using the same file for multiple classes has the advantage of grouping them together visually for the reader.
In your case, a car and a door do not seem related at all, and finding the door class in the car.cs file would be unexpected, so don't.
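A rough Python analogue of the 'case classes' situation, as a sketch: a few small classes that only make sense together, plus one function that handles each alternative. The Shape example and all its names are assumptions for illustration.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Circle:
    radius: float


@dataclass(frozen=True)
class Rectangle:
    width: float
    height: float


# The "union type": a Shape is exactly one of the alternatives above.
Shape = Union[Circle, Rectangle]


def area(shape: Shape) -> float:
    """Handle every alternative of the union in one place."""
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Rectangle):
        return shape.width * shape.height
    raise TypeError(f"not a Shape: {shape!r}")
```

Reading area() requires seeing all the alternatives at once, which is why keeping them in one file helps here.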
As a rule of thumb, one class/one file is the way to go. I often keep several interface definitions in one file, though. Several classes in one file? Only if they are very closely related somehow, and very small (< 5 methods and members)
As is true so much of the time in programming, it depends greatly on the situation.
For instance, what is the cohesiveness of the classes in question? Are they tightly coupled? Are they completely orthogonal? Are they related in functionality?
It would not be out of line for a web framework to supply a general purpose widgets.whatever file containing BaseWidget, TextWidget, CharWidget, etc.
A user of the framework would not be out of line in defining a more_widgets file to contain the additional widgets they derive from the framework widgets for their specific domain space.
When the classes are orthogonal, and have nothing to do with each other, the grouping into a single file would indeed be artificial. Assume an application to manage a robotic factory that builds cars. A file called parts containing CarParts and RobotParts would be senseless... there is not likely to be much of a relation between the ordering of spare parts for maintenance and the parts that the factory manufactures. Such a joining would add no information or knowledge about the system you are designing.
Perhaps the best rule of thumb is don't constrain your choices by a rule of thumb. Rules of thumb are created for a first cut analysis, or to constrain the choices of those who are not capable of making good choices. I think most programmers would like to believe they are capable of making good decisions.
The Smalltalk answer is: you should not have files (for programming). They make versioning and navigation painful.
One class per file is simpler to maintain and much more clear for anyone else looking at your code. It is also mandatory, or very restricted in some languages.
In Java for instance, you cannot create multiple public top-level classes per file; each public class has to be in its own file, where the class name and the file name are the same.
(C#) Another exception (to one file per class) I'm thinking of is having List in the same file as MyClass. Where I envisage using this is in reporting. Having an extra file just for the List seems a bit excessive.
