How to populate nested DB without hurting encapsulation? - database

I have the following situation (code is python, but this is not language specific):
I need to populate a tabular DB in a way equivalent to:
# 'as' is a reserved word in Python, hence as_
for i, a in enumerate(as_):
    for j, b in enumerate(a.bs):
        for k, c in enumerate(b.cs):
            db[i, j, k] = c.data()
This method forces c to publish its private data, thus breaking its encapsulation.
I am aware some of this chain could be replaced by a.update_db, b.update_db and so on, but at the very least, c would still have to allow access to its private data.
Another way to tackle this would be to allow C to update the DB like so:
# 'as' is a reserved word in Python, hence as_
for i, a in enumerate(as_):
    for j, b in enumerate(a.bs):
        for k, c in enumerate(b.cs):
            c.update_db_at(i, j, k)
This would make C aware of i, j, k, which it shouldn't really care about; besides, C's responsibility is to represent some object, not to update the DB.
This looks like a very common problem to me, and I'm sure there is some standard best practice for this.
What is a good way to populate a nested DB from a corresponding nested object structure that doesn't break encapsulation?
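One possible direction, offered as a sketch under assumptions rather than as the standard answer: the loop keeps ownership of the indices and hands each c a narrow write-only callback, so c decides what to export without exposing its fields and without learning i, j, k. The names Leaf, Branch, Root, export_to and populate are hypothetical, and a plain dict stands in for the tabular DB.

```python
from functools import partial


class Leaf:
    """Plays the role of C: holds private data and decides what to export."""

    def __init__(self, value):
        self._value = value  # private; never read from outside the class

    def export_to(self, write):
        # 'write' is a one-argument callable; Leaf never sees i, j, k or the DB.
        write(self._value)


class Branch:
    """Plays the role of B."""

    def __init__(self, leaves):
        self.cs = list(leaves)


class Root:
    """Plays the role of A."""

    def __init__(self, branches):
        self.bs = list(branches)


def populate(db, roots):
    """Walk the nested structure; only this function knows about indices."""
    for i, a in enumerate(roots):
        for j, b in enumerate(a.bs):
            for k, c in enumerate(b.cs):
                # Bind the destination cell, then let c push its own data.
                c.export_to(partial(db.__setitem__, (i, j, k)))


db = {}
roots = [Root([Branch([Leaf("x"), Leaf("y")])]), Root([Branch([Leaf("z")])])]
populate(db, roots)
print(db)
```

The trade-off is that C still has to offer some export method, but what it pushes through the callback is its own decision, and the DB layout stays entirely in populate.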

Related

Ruby: Hash, Arrays and Objects for storage information

I am learning Ruby, reading a few books, tutorials, forums and so on... so, I am brand new to this.
I am trying to develop a stock system so I can learn by doing.
My questions are the following:
I created the following to store transactions (just a few parts of the code):
transactions.push type: "BUY", date: Date.strptime(date.to_s, '%d/%m/%Y'), quantity: quantity, price: price.to_money(:BRL), fees: fees.to_money(:BRL)
A colleague here suggested creating a Transaction class to store this.
So, for the next piece of information I had to store, I did:
@dividends_from_stock << DividendsFromStock.new(row["Approved"], row["Value"], row["Type"], row["Last Day With"], row["Payment Day"])
Now, FIRST question: which way is better? Hash in Array or Object in Array? And why?
This @dividends_from_stock is returned by the method 'dividends'.
I want to find all the dividends that were paid above a specific date:
puts ciel3.dividends.find_all {|dividend| Date.parse(dividend.last_day_with) > Date.parse('12/05/2014')}
I get the following:
#<DividendsFromStock:0x2785e60>
#<DividendsFromStock:0x2785410>
#<DividendsFromStock:0x2784a68>
#<DividendsFromStock:0x27840c0>
#<DividendsFromStock:0x1ec91f8>
#<DividendsFromStock:0x2797ce0>
#<DividendsFromStock:0x2797338>
#<DividendsFromStock:0x2796990>
OK, with this I am able to spot (I think) all the objects that have a date later than 12/05/2014. But (SECOND question) how can I get the information regarding the 'value' (or other information) stored inside the objects?
Generally it is always better to define classes. Classes have names. They will help you understand what is going on when your program gets big. You can always see the class of each variable like this: var.class. If you use hashes everywhere, you will be confused because these calls will always return Hash. But if you define classes for things, you will see your class names.
Define methods in your classes that return the information you need. If you define a method called to_s, Ruby will call it behind the scenes on the object when you print it or use it in an interpolation (puts "Some #{var} here").
You probably want a first-class model of some kind to represent the concept of a trade/transaction and a list of transactions that serves as a ledger.
I'd advise steering closer to a database for this instead of manipulating toy objects in memory. Sequel can be a pretty simple ORM if used minimally, but ActiveRecord is often a lot more beginner friendly and has fewer sharp edges.
Using naked hashes or arrays is good for prototyping and seeing if something works in principle. Beyond that it's important to give things proper classes so you can relate them properly and start to refine how these things fit together.
I'd even start with TransactionHistory being a class derived from Array where you get all that functionality for free, then can go and add on custom things as necessary.
For example, you have a pretty gnarly interface to DividendsFromStock which could be cleaned up by having that format of row be accepted to the initialize function as-is.
Don't forget to write a to_s or inspect method for any custom classes you want to be able to print or have a look at. These are usually super simple to write and come in very handy when debugging.
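The thread is about Ruby; purely as an illustration in Python, the two suggestions above (let the class accept the row as-is, and give it a printable form) could look like the sketch below. Dividend, from_row and the sample row values are made up for the example; `__str__` is Python's counterpart of Ruby's to_s.

```python
from dataclasses import dataclass


@dataclass
class Dividend:
    approved: str
    value: str
    kind: str
    last_day_with: str
    payment_day: str

    @classmethod
    def from_row(cls, row):
        # Accept the row as-is so callers do not unpack five columns by hand.
        return cls(row["Approved"], row["Value"], row["Type"],
                   row["Last Day With"], row["Payment Day"])

    def __str__(self):
        # Used by print() and f-strings, like to_s in Ruby.
        return f"{self.kind} of {self.value}, payable on {self.payment_day}"


# Sample row with invented values, shaped like the rows in the question.
row = {"Approved": "01/04/2014", "Value": "0.25", "Type": "Dividend",
       "Last Day With": "15/05/2014", "Payment Day": "30/05/2014"}
dividend = Dividend.from_row(row)
print(dividend)
```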
thank you!
I will answer my question, based on the information provided by tadman and Ilya Vassilevsky (and also B. Seven).
1- It is better to create a class and its objects. It will help me organize and debug my code, and see who is who and what each one does. It also seems to work better with a DB.
2- I am a little ashamed of my question after figuring out the solution. It is far simpler than I thought. It just needed two steps:
willpay = ciel3.dividends.find_all {|dividend| Date.parse(dividend.last_day_with) > Date.parse('10/09/2015')}
willpay.each do |dividend|
  puts "#{ciel3.code} has approved #{dividend.type} on #{dividend.approved} and will pay by #{dividend.payment_day} the value of #{dividend.value.format} per share, for those that had the asset on #{dividend.last_day_with}"
  puts
end

Using "using" statements for every object implementing IDisposable?

I'm currently skimming through some code that reads Active Directory entries and manipulates them. Since I haven't had to do with this kind of stuff, I F12'd the classes (DirectoryEntry, SearchResultCollection, ...), and I found out they all implement the IDisposable interface, but I couldn't see any using blocks in our code.
Are those even necessary in this case (i.e., should I blindly refactor them in)?
Another question of mine regarding this (there are very many instantiated IDisposable objects in the code): isn't IDisposable making stuff very "ugly" in this case? I mean, I like using statements as they basically free my mind from worrying about things, but in many cases the code has a layout similar to the following:
using (var one = myObject.GetSomeDisposableObject())
using (var two = myObject.GetSomeOtherDisposableObject())
{
    one.DoSomething();
    using (var foo = new DisposableFoo())
    {
        MyMethod(foo);
        using (...)
        using (...)
        {
            ...
        }
    }
}
I feel that this becomes quite unreadable due to high indentation levels (even stacking the using statements). But extracting some of this into new methods can lead to many parameters that need to be passed, since naturally the "inner" code often needs the objects created in the using statements.
What is an elegant way to solve this without losing readability?
For the first part, this question refers to 'memory used by the task increasing constantly' when AD references are not disposed of.
For the second, a using block is syntactic sugar for a try/finally with the Dispose call in the finally block. Writing the try/finally yourself is an alternative construct that lets you dispose of everything in one place and reduces indentation.
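The thread concerns C#, so the following is only an analogy in Python, where the with statement plays the role of using. The standard library's contextlib.ExitStack collects several resources and releases them all in one place, which is the same flattening idea as a single try/finally. The resource helper and the events list are hypothetical, added only to make the release order visible.

```python
from contextlib import ExitStack, contextmanager

events = []


@contextmanager
def resource(name):
    """Hypothetical disposable resource that logs acquire and release."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")


# Nested form: one indentation level per resource.
with resource("one") as one:
    with resource("two") as two:
        events.append(f"use {one},{two}")

# Flat form: the stack releases everything in reverse order on exit.
with ExitStack() as stack:
    one = stack.enter_context(resource("one"))
    two = stack.enter_context(resource("two"))
    events.append(f"use {one},{two}")

print(events)
```

Both forms produce the same open, use, close sequence; only the indentation differs.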

(De)serializing an object as an array in XStream

I'm trying to clean up some old code by replacing some arrays that were being passed around with proper objects to improve readability and to encapsulate some behaviour. I ran into a problem when it turned out the arrays were being run through XStream for persistence.
I need to retain the format of the serialization, and the arrays in question are inside various other objects being (de)serialized through XStream. Is there an easy way to handle this?
I'm hoping there's an Annotation I can apply or a simple XStream Converter I can write for my new classes and be done with it, but from what I can see it would require writing Converters for each of the containing classes instead. I'm not sure, as I'm not familiar with XStream. If there isn't an easy solution I'm just going to have to give up and leave the arrays in place, as I don't have the time budgeted for anything fancy or to learn the finer points of XStream.
Specifically, I have a TileLayer that has a member int[] metaTileFactors and I want to replace that with class MetaTiling which has members final int x and final int y and still have it serialize and deserialize to/from the same XML as before.

What's the difference between an object and a struct in OOP?

What distinguishes an object from a struct?
When and why do we use an object as opposed to a struct?
How does an array differ from both, and when and why would we use an array as opposed to an object or a struct?
I would like to get an idea of what each is intended for.
Obviously you can blur the distinctions according to your programming style, but generally a struct is a structured piece of data. An object is a sovereign entity that can perform some sort of task. In most systems, objects have some state and as a result have some structured data behind them. However, one of the primary functions of a well-designed class is data hiding — exactly how a class achieves whatever it does is opaque and irrelevant.
Since classes can be used to represent classic data structures such as arrays, hash maps, trees, etc, you often see them as the individual things within a block of structured data.
An array is a block of unstructured data. In many programming languages, every separate thing in an array must be of the same basic type (such as every one being an integer number, every one being a string, or similar) but that isn't true in many other languages.
As guidelines:
use an array as a place to put a large group of things with no other inherent structure or hierarchy, such as "all receipts from January" or "everything I bought in Denmark"
use structured data to compound several discrete bits of data into a single block, such as you might want to combine an x position and a y position to describe a point
use an object where there's a particular actor or thing that thinks or acts for itself
The implicit purpose of an object is therefore directly to associate tasks with the data on which they can operate and to bundle that all together so that no other part of the system can interfere. Obeying proper object-oriented design principles may require discipline at first but will ultimately massively improve your code structure and hence your ability to tackle larger projects and to work with others.
Generally speaking, objects bring the full object oriented functionality (methods, data, virtual functions, inheritance, etc, etc) whereas structs are just organized memory. Structs may or may not have support for methods / functions, but they generally won't support inheritance and other full OOP features.
Note that I said generally speaking ... individual languages are free to overload terminology however they want to.
Arrays have nothing to do with OO. Indeed, pretty much every language around supports arrays. Arrays are just blocks of memory, generally containing a series of similar items, usually indexable somehow.
What distinguishes an object from a struct?
There is no notion of "struct" in OOP. The definition of structures depends on the language used. For example, in C++ classes and structs are the same, but class members are private by default while struct members are public to maintain compatibility with C structs. In C#, on the other hand, struct is used to create value types while class is for reference types. C has structs and is not object oriented.
When and why do we use an object as opposed to a struct?
Again this depends on the language used. Normally structures are used to represent PODs (Plain Old Data), meaning that they don't specify behavior that acts on the data and are mainly used to represent records and not objects. This is just a convention and is not enforced in C++.
How does an array differ from both, and when and why would we use an array as opposed to an object or a struct?
An array is very different. An array is normally a homogeneous collection of elements indexed by an integer. A struct is a heterogeneous collection where elements are accessed by name. You'd use an array to represent a collection of objects of the same type (an array of colors, for example), while you'd use a struct to represent a record containing data for a certain object (a single color which has red, green, and blue elements).
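As a small illustration of the color example, here is a Python sketch; Color and palette are names chosen for the example, with a NamedTuple standing in for the struct and a list standing in for the array.

```python
from typing import NamedTuple


class Color(NamedTuple):
    """Struct-like record: fields are accessed by name."""
    red: int
    green: int
    blue: int


# Array-like collection: same-typed elements accessed by integer index.
palette = [Color(255, 0, 0), Color(0, 255, 0), Color(0, 0, 255)]

second = palette[1]        # element chosen by position
print(second.green)        # field chosen by name
```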
Short answer (for languages such as C#): structs are value types; classes (objects) are reference types.
By their nature, an object has methods; a struct doesn't.
(Nothing stops you from having an object without methods, just as nothing stops you from, say, storing an integer in a float-typed variable.)
When and why do we use an object as opposed to a struct?
This is a key question. I am using structs and procedural code modules to provide most of the benefits of OOP. Structs provide most of the data storage capability of objects (other than read-only properties). Procedural modules provide code completion similar to that provided by objects. I can enter module.function in the IDE instead of object.method. The resulting code looks the same. Most of my functions now return structs rather than single values. The effect on my code has been dramatic, with code readability going up and the number of lines being greatly reduced. I do not know why procedural programming that makes extensive use of structs is not more common. Why not just use OOP? Some of the languages that I use are only procedural (PureBasic) and the use of structs allows some of the benefits of OOP to be experienced. Other languages allow a choice of procedural or OOP (VBA and Python). I currently find it easier to use procedural programming, and in my discipline (ecology) I find it very hard to define objects. When I can't figure out how to group data and functions together into objects in a philosophically coherent collection, I don't have a basis for creating classes/objects. With structs and functions, there is no need for defining a hierarchy of classes. I am free to shuffle functions between modules, which helps me to improve the organisation of my code as I go. Perhaps this is a precursor to going OO.
Code written with structs can have higher performance than OOP-based code. OOP code has encapsulation, inheritance and polymorphism; however, I think that struct/function-based procedural code often shares these characteristics. A function returns a value only to its caller and only within scope, thereby achieving encapsulation. Likewise, a function can be polymorphic. For example, I can write a function that calculates the time difference between two places with two internal algorithms, one that considers the international date line and one that does not. Inheritance usually refers to methods inheriting from a base class. There is inheritance of sorts with functions that call other functions and use structs for data transfer. A simple example is passing an error message up through a stack of nested functions. As the error message is passed up, it can be added to by the calling functions. The result is a stack trace with a very descriptive error message. In this case a message is inherited through several levels. I don't know how to describe this bottom-up inheritance (event-driven programming?), but it is a feature of using functions that return structs that is absent from procedural programming using simple return values. At this point in time I have not encountered any situations where OOP would be more productive than functions and structs. The surprising thing for me is that very little of the code available on the internet is written this way. It makes me wonder if there is any reason for this.
Arrays are ordered collections of items that (usually) are of the same type. Items can be accessed by index. Classic arrays allow integer indices only; however, modern languages often provide so-called associative arrays (dictionaries, hashes, etc.) that allow using, e.g., strings as indices.
A structure is a collection of named values (fields) which may be of different types (e.g. field a stores integer values, field b stores string values, etc.). Structures (a) group together logically connected values and (b) simplify code changes by hiding details (e.g. changing the structure layout doesn't affect the signature of a function working with this structure). The latter is called 'encapsulation'.
Theoretically, an object is an instance of a structure that demonstrates some behavior in response to messages being sent (i.e., in most languages, having some methods). Thus, the very usefulness of an object is in this behavior, not its fields.
Different objects can demonstrate different behavior in response to the same messages (the same methods being called), which is called 'polymorphism'.
In many (but not all) languages objects belong to some classes and classes can form hierarchies (which is called 'inheritance').
Since an object's methods can work with its fields directly, fields can be hidden from access by any code except for these methods (e.g. by marking them as private). Thus the encapsulation level for objects can be higher than for structs.
Note that different languages add different semantics to these terms.
E.g.:
in CLR languages (C#, VB.NET, etc.) structs are value types, typically allocated on the stack or in registers, while objects are created on the heap.
in C++ structs have all fields public by default, and objects (instances of classes) have all fields private by default.
in some dynamic languages objects are just associative arrays which store values and methods.
I also think it's worth mentioning that the concept of a struct is very similar to an "object" in JavaScript, which is defined very differently from objects in other languages. They are both referenced like "foo.bar" and the data is structured similarly.
As I see it an object at the basic level is a number of variables and a number of methods that manipulate those variables, while a struct on the other hand is only a number of variables.
I use an object when I want to include methods; I use a struct when I just want a collection of variables to pass around.
An array and a struct are kind of similar in principle: they're both a number of variables. However, it's more readable to write myStruct.myVar than myArray[4]. You could use an enum to specify the array indexes to get myArray[indexOfMyVar] and basically get the same functionality as a struct.
Of course you can use constants or something else instead of variables, I'm just trying to show the basic principles.
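The enum-as-index idea from this answer can be sketched in Python as follows; Field, Person and the sample values are hypothetical names and data chosen for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Field(IntEnum):
    """Named positions for a list used as a makeshift record."""
    NAME = 0
    AGE = 1


person_array = ["Ada", 36]
print(person_array[Field.AGE])   # readable, but nothing ties the list to Field


@dataclass
class Person:
    name: str
    age: int


person_struct = Person("Ada", 36)
print(person_struct.age)         # the field name is part of the type itself
```

Both lookups return the same value; the struct-like version simply keeps the names and the data in one definition.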
This answer may need the attention of a more experienced programmer but one of the differences between structs and objects is that structs have no capability for reflection whereas objects may. Reflection is the ability of an object to report the properties and methods that it has. This is how 'object explorer' can find and list new methods and properties created in user defined classes. In other words, reflection can be used to work out the interface of an object. With a structure, there is no way that I know of to iterate through the elements of the structure to find out what they are called, what type they are and what their values are.
If one is using structs as a replacement for objects, then one can use functions to provide the equivalent of methods. At least in my code, structs are often used for returning data from user-defined functions in modules which contain the business logic. Structs and functions are as easy to use as objects, but functions lack support for XML comments. This means that I constantly have to look at the comment block at the top of the function to see just what the function does. Often I have to read the function source code to see how edge cases are handled. When functions call other functions, I often have to chase something several levels deep and it becomes hard to figure things out. This leads to another benefit of OOP vs structs and functions. OOP has XML comments which show up as tooltips in the IDE (in most but not all OOP languages), and in OOP there are also defined interfaces and often an object diagram (if you choose to make them). It is becoming clear to me that the defining advantage of OOP is the capability of documenting what code does what and how it relates to other code: the interface.

OOP Best Practices When One Object Needs to Modify Another

(This is a C-like environment.) Say I have two instance objects, a car and a bodyShop. The car has a color iVar and corresponding accessors. The bodyShop has a method named "paintCar" that will take in a car object and change its color.
As far as implementation, in order to get the bodyShop to actually be able to change a car object's color, I see two ways to go about it.
1. Use the "&" operator to pass in a pointer to the car. Then the bodyShop can either tell the car to perform some method that it has to change color, or it can use the car's accessors directly.
2. Pass in the car object by value, do the same sort of thing to get the color changed, then have the method return a car object with a new color. Then assign the original car object to the new car object.
Option 1 seems more straightforward to me, but I'm wondering if it is in line with OOP best practices. In general, for "maximum OOP", is the "&" operator good or bad? Or maybe I'm completely missing a better option that would make this super OOPer. Please advise :)
Option 1 is preferred:
The bodyShop can either tell the car to perform some method that it has to change color, or it can use the car's accessors directly.
Even better still: create an IPaintable interface. Have Car implement IPaintable. Have BodyShop depend on IPaintable instead of Car. The benefits of this are:
Now BodyShop can paint anything that implements IPaintable (Cars, Boats, Planes, Scooters)
BodyShop is no longer tightly coupled to Car.
BodyShop has a more testable design.
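A minimal Python sketch of the interface idea, assuming a typing.Protocol named Paintable stands in for IPaintable; Car, Boat, BodyShop and paint_item are illustrative names, not taken from the original code.

```python
from typing import Protocol


class Paintable(Protocol):
    """Anything with a paint(color) method; plays the role of IPaintable."""

    def paint(self, color: str) -> None:
        ...


class Car:
    def __init__(self, color: str) -> None:
        self.color = color

    def paint(self, color: str) -> None:
        self.color = color


class Boat:
    def __init__(self, hull_color: str) -> None:
        self.hull_color = hull_color

    def paint(self, color: str) -> None:
        self.hull_color = color


class BodyShop:
    def paint_item(self, item: Paintable, color: str) -> None:
        # Depends only on the paint() method, not on Car specifically.
        item.paint(color)


shop = BodyShop()
car, boat = Car("blue"), Boat("white")
shop.paint_item(car, "red")
shop.paint_item(boat, "red")
print(car.color, boat.hull_color)
```

For testing, any stub object with a paint method can be handed to BodyShop in place of a real Car.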
I would assume that the responsibility of the bodyShop is to modify car objects, so #1 seems like the right way to go to me. I've never used a language where the "&" operator is necessary. Normally, my bodyShop object would call car.setColor(newColor) and that would be that. This way you don't have to worry about the rest of the original car's attributes, including persistence issues - you just leave them alone.
Since you're interested in the best OOP practice, you should ignore the performance hit you get with option 2. The only things you should be interested in are whether either option unnecessarily increases coupling between the two classes, whether encapsulation is violated, and whether identity is preserved.
Given this, option 2 is less desirable since you can't determine which other objects are holding references to the original car or worse, contain the car. In short you violate the identity constraint since two objects in the system may have different ideas of the state of the car. You run the risk of making the overall system inconsistent.
Of course your particular environment may avoid this, but it certainly would be best practice to avoid it.
Last point: does your bodyShop object have state, behaviour and identity? I realise that you have explained only the minimum necessary, but possibly the bodyShop isn't really an object.
Functional v OO approaches
As an interesting aside, option 2 would be close to the approach in a functional programming environment: since state changes are not allowed, your only approach would be to create a new car if its colour changed. That's not quite what you're suggesting, but it's close.
That may sound like complete overkill but it does have some interesting implications for proving the correctness of the code and parallelism.
Option 1 wins for me. The & operator is implicit in many OO languages (like Java, Python, etc.). You don't often pass by value in those languages; only primitive types are passed that way.
Option 2 comes with multiple problems. You might have a collection of cars, and some function unaware of it might send a car to the bodyShop for painting, receive a new car in return, and not update your collection of cars. See? And from a more ideological point of view: you don't create a new object each time you want to modify it in the real world, so why should you do so in a virtual one? This will lead to confusion, because it's just counterintuitive. :-)
I am not sure what this "C-like environment" means. In C, you need this:
int paintCar(const bodyShop_t *bs, car_t *car);
where you modify the contents pointed to by car. For a big struct in C, you should always pass the pointer, rather than the value, to a function. So, use solution 1 (if by "&" you mean the C operator).
I too agree with the first option. I can't say it's best practice because I'm never really sure what best practice is in other people's minds... I can tell you that best practice in my mind is the simplest method that works for the job. I've also seen this approach taken in the hunspell win API and other C-ish APIs that I've had to use. So yeah, I agree with scott.
http://hunspell.sourceforge.net/
//just in case you're interested in looking at other people's code
It depends on whether the body shop's method can fail and leave the car in an indeterminate state. In that case, you're better off operating on a copy of the car, or a copy of all relevant attributes of the car. Then, only when the operation succeeds, you copy those values to the car. So you end up assigning the new car to the old car within the body shop method. Doing this correctly is necessary for exception safety in C++, and can get nasty.
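The copy-then-commit idea can be sketched in Python as below. This is an assumption-laden illustration: Car, PaintError, AVAILABLE_PAINTS and the coats counter are invented for the example, and the failure is simulated by asking for a paint that is not in stock.

```python
import copy

AVAILABLE_PAINTS = {"red", "green", "blue"}  # hypothetical stock list


class PaintError(Exception):
    pass


class Car:
    def __init__(self, color, coats=0):
        self.color = color
        self.coats = coats


def paint_car(car, color):
    """Paint a working copy; touch the real car only if every step succeeds."""
    draft = copy.copy(car)
    draft.coats += 1                      # first step already changes state
    if color not in AVAILABLE_PAINTS:     # a later step fails
        raise PaintError(f"no {color} paint in stock")
    draft.color = color
    # Commit: copy the finished values onto the original in one go.
    car.color, car.coats = draft.color, draft.coats


car = Car("blue")
try:
    paint_car(car, "plaid")
except PaintError:
    pass
print(car.color, car.coats)   # the failed attempt left the car untouched

paint_car(car, "red")
print(car.color, car.coats)
```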
It's also possible and sometimes desirable to use the other pattern - returning a new object on modification. This is useful for interactive systems which require Undo/Redo, backtracking search, and for anything involving modelling how a system of objects evolves over time.
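A small Python sketch of this return-a-new-object pattern, assuming an immutable Car built with a frozen dataclass; the history list used for undo is a hypothetical illustration, not part of the original question.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Car:
    color: str


def paint_car(car, color):
    """Return a new Car instead of mutating the one passed in."""
    return replace(car, color=color)


# Each paint job appends a new state; earlier states stay intact.
history = [Car("blue")]
history.append(paint_car(history[-1], "red"))
history.append(paint_car(history[-1], "green"))

history.pop()                 # undo the last paint job
print(history[-1].color)
```

Because no state is ever overwritten, undo is just a matter of stepping back through the list.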
In addition to the other opinions, option 1 lets the paintCar method return a completion code that indicates whether the car's color was changed successfully or there were problems with it.
