Is there a reason to encapsulate if I am the only one using my code?

I understand that we encapsulate data to prevent things from being accessed that don't need to be accessed by developers working with my code. However, I only program as a hobby and do not release any of my code for other people to use. I still encapsulate, but it mostly seems like I'm doing it for the sake of good policy and building the habit. So, is there any reason to encapsulate data when I know I am the only one who will be using my code?

Encapsulation is not only about hiding data.
It is also about hiding implementation details.
When those details are hidden, you are forced to go through the class's defined API, and only the class itself can change its internals.
Imagine a situation where you have opened all methods to any class interested in them, and you have a function that performs some calculation. Then you realize you want to replace it, because the logic is wrong or you need a more complicated calculation.
In such cases you sometimes have to change every place across your application to change the result, instead of changing it in only one place: the API you provided.
So don't make everything public; it leads to strong coupling and pain when you update.
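As a minimal sketch of that point (the class and names are hypothetical, not from the question), keeping a calculation behind one method means a later fix happens in exactly one place:

```python
# Hypothetical example: callers use fahrenheit(), never the stored value.
class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius  # internal detail, not part of the API

    def fahrenheit(self):
        # If this formula ever needs to change, fixing it here
        # fixes every caller at once.
        return self._celsius * 9 / 5 + 32

t = Temperature(100)
print(t.fahrenheit())  # 212.0
```

Had callers been doing `t._celsius * 9 / 5 + 32` themselves, every one of those sites would need the fix.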

Encapsulation is not only about creating "getters" and "setters"; it is also about exposing a sort of API for accessing the data (if needed).
Encapsulation lets you keep access to the data in one place and allows you to manage it in a more "abstract" way, reducing errors and making your code more maintainable.
If your personal projects are simple and small, you can do whatever you feel like in order to quickly produce what you need, but bear in mind the consequences ;)

I don't think unnecessary data access can only come from third-party developers. It can come from you as well, right? When you allow direct access to data through access rights on variables/properties, whoever works with it, be it you or someone else, may end up creating bugs by accessing the data directly.
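To illustrate that answer with a hypothetical example: a guarded method catches a mistake that direct field access would silently allow, even when you are the only author.

```python
# Hypothetical Account class: the invariant (no overdraft) is enforced
# in one method instead of being the caller's responsibility.
class Account:
    def __init__(self, balance=0):
        self._balance = balance

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")  # bug caught here, once
        self._balance -= amount

    @property
    def balance(self):
        return self._balance

acct = Account(100)
acct.withdraw(30)
print(acct.balance)  # 70
```

With a public field, `acct.balance -= 500` anywhere in your own code would corrupt the state without a trace.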

Related

Why do so many React/Flux tutorials advocate multiple stores?

I've been looking at the Baobab library and am very attracted to the "single-tree" approach, which I interpret as essentially a single store. But so many Flux tutorials seem to advocate many stores, even a "store per entity." Having multiple stores seems to me to present all kinds of concurrency issues. My question is, why is single store a bad idea?
It depends on what you want to do and how big your project is. There are a few reasons why having several stores is a good idea:
If your project is not so small after all, you may end up with a huge 2000-3000-line store, and you don't want that. That's the point of writing modules in general. You want to avoid files bigger than 1000 lines (and below 500 is even nicer :) ).
Writing everything in one store means you can't enjoy the dispatcher's dependency management via the waitFor function. It's going to be harder to check dependencies and potential circular dependencies between your models (since they are all in one store). I would suggest you take a look at https://facebook.github.io/flux/docs/chat.html for that.
It's harder to read. With several stores you can tell at a glance what types of data you have, and with a constants file for the dispatcher events you can see all your events.
So it's possible to keep everything in one store, and it may work perfectly, but if your project grows you may regret it badly and rewrite everything as several modules/stores. Just my opinion; I prefer to have clean modules and data workflows.
Hope it helps!
From my experience, working with a single store is definitely not a bad idea. It has some advantages, such as:
A single store to access all data can make it easier to query and make relationships about different pieces of data. Using multiple stores can make this a little bit more difficult (definitely not impossible though).
It will be easier to make atomic updates to the application state (aka data store).
But the way you implement the Flux pattern will influence your experience with a single data store. The folks at Facebook have been experimenting with this, and it seems they encourage the use of a single data store with their new Relay+GraphQL stuff (read more about it here: http://facebook.github.io/react/blog/2015/02/20/introducing-relay-and-graphql.html).
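The single-store idea is language-agnostic, so here is a minimal sketch of it in Python (illustrative names only, not the Flux or Relay API): all state lives in one tree, and an update touching several entities is applied in one atomic step.

```python
# Hypothetical single store: state is one tree, updates are functions
# from old state to new state, applied in one place.
class SingleStore:
    def __init__(self):
        self._state = {"users": {}, "posts": {}}
        self._listeners = []

    def dispatch(self, update):
        self._state = update(self._state)  # atomic swap of the whole tree
        for listener in self._listeners:
            listener(self._state)

    @property
    def state(self):
        return self._state

store = SingleStore()

def add_user_with_post(state):
    # Touches two "entities" in a single update; with one store per
    # entity this would need coordination between stores.
    return {
        "users": {**state["users"], 1: "alice"},
        "posts": {**state["posts"], 10: {"author": 1, "text": "hi"}},
    }

store.dispatch(add_user_with_post)
```

This is the atomicity advantage the answer above mentions: listeners never observe a state where the post exists but its author does not.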

Why use a cache

In AngularJS you can create a cache.
This is essentially nothing more than the equivalent of an ArrayList<T> in Java where you can add/remove items. In JavaScript, though, you have push and pop to add/remove items from an array.
So why would you want to use AngularJS' cache?
https://docs.angularjs.org/api/ng/service/$cacheFactory
You do it for the purpose of reuse and abstraction.
The cache will only exist once, but if you implement it in every controller or service, you are duplicating the same code over and over, making it harder to maintain.
It's also an abstraction: you are basically creating a module with an interface, which makes your services independent from the implementation behind it.
For example, you could have cache items that expire. You could either write the code/logic to remove the expired ones in every single service, or you could have it in a single module you call cache. This way you are reusing code and making it easy to maintain.
Whether you use theirs or make your own doesn't matter; the principles are the same.
One reason to use the Angular cache is so that you don't have to write the same boilerplate everyone else has already implemented a thousand times. You can go straight to your domain and business logic.
You also get certain benefits from using a cache module:
You don't have to care about the implementation behind the interface. (Program against an interface, not an implementation).
You can inject a different module with different logic behind it, but with the same interface, with DI.
It's easy to maintain.
You can easily extend and expand it, eg. add expiration.
It's easier to make test stubs (see point 2.).
You can easily reuse the module.
The logic is in its rightful place, and not scattered around.
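A minimal sketch of the "expiration logic lives in one module" point from this answer (hypothetical names, not the $cacheFactory API):

```python
import time

# Hypothetical cache module: every service reuses this one class
# instead of re-implementing expiry checks itself.
class ExpiringCache:
    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._items = {}  # key -> (value, stored_at)

    def put(self, key, value):
        self._items[key] = (value, time.monotonic())

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self._ttl:
            del self._items[key]  # expired: evict and report a miss
            return None
        return value
```

Swapping in a different implementation behind the same put/get interface (point 2 above) requires no changes in the services that use it.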

Clojure database interaction - application design/approach

I hope this question isn't too general or doesn't make sense.
I'm currently developing a basic application that talks to an SQLite database, so naturally I'm using the clojure.java.jdbc library (link) to interact with the DB.
The trouble is, as far as I can tell, the way you insert data into the DB using this library is by simply passing a map (e.g. {:id 1 :name "stackoverflow"}) and a table name (e.g. :website).
The thing I'm concerned about is how I can make this more robust in the wider context of my application. What I mean is that when I write data to the database and retrieve it, I want to use the same formatted map EVERYWHERE in the application, from the data access layer (returning or passing in maps) all the way up to the application layer, where it works on the data and passes it back down again.
What I'm trying to get at is, is there an 'idiomatic' clojure equivalent of JavaBeans?
The problem I'm having right now is having to repeat myself by defining maps manually with column names etc - but if I change the structure of my table in the DB, my whole application has to be changed.
As far as I know, there really isn't such a library. There are various systems that make it easier to write queries, but not AFAIK, anything that "fixes" your data objects.
I've messed around trying to write something like you propose myself but I abandoned the project since it became very obvious very quickly that this is not at all the right thing to do in a clojure system (and actually, I tend to think now that the approach has only very limited use even in languages that have really "fixed" data structures).
Issues with the clojure collection system:
All the map access/alteration functions are really functional. That means that alterations on a map always return a new object, so it's nearly impossible to create a forcibly fixed map type that's also easy to use in idiomatic Clojure.
General conceptual issues:
Your assumption that you can "use the same formatted map EVERYWHERE in the application, so from the data access layer (returning or passing in maps) all the way up to the application layer where it works on the data and passes it back down again" is wrong if your system is even slightly complex. At best, you can use the map from the DB up to the UI in some simple cases, but the other way around is pretty much always the wrong approach.
Almost every query will have its own result row "type"; you're probably not going to be able to re-use these "types" across queries even in related code.
Also, forcing these types on the rest of the program is probably binding your application more strictly to the DB schema. If your business logic functions are sane and well written, they should only access as much data as they need and no more; they should probably not use the same data format everywhere.
My serious answer is: don't bother. Write your DB access functions for the kinds of queries you want to run, and let those functions check the values moving in and out of the DB in as much detail as you find comforting. Do not try to forcefully keep the data coming from the DB "the same" in the rest of your application. Use assertions and pre/post conditions if you want to check your data format in the rest of the application.
Clojure favours the concept of a few data structures and lots of functions that work on those few data structures. There are a few ways to create new data structures (which I guess internally use the basic ones), like defrecord. But even if you use them, that won't really solve the problem that DB schema changes should affect the code less: you will eventually have to go through the layers to add or remove the effects of a schema change, because anywhere that data is read or created needs to be changed.
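The "check values at the DB boundary" advice above can be sketched as follows. This is a hypothetical Python illustration (the original question is about Clojure maps; dicts stand in for them here), with invented function and table names:

```python
# Hypothetical boundary check: access functions validate the map shape
# going in and out, so the rest of the app never depends on a fixed
# record type.
def check_website(row):
    assert isinstance(row.get("id"), int), "id must be an int"
    assert isinstance(row.get("name"), str), "name must be a string"
    return row

FAKE_DB = []  # stands in for the real table

def insert_website(row):
    FAKE_DB.append(check_website(row))

def fetch_websites():
    return [check_website(r) for r in FAKE_DB]

insert_website({"id": 1, "name": "stackoverflow"})
```

If the schema changes, only check_website and the access functions change; callers that read row["name"] are untouched as long as the logical fields survive.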

Has inheritance become bad?

Personally, I think inheritance is a great tool, that, when applied reasonably, can greatly simplify code.
However, it seems to me that many modern tools dislike inheritance. Let's take a simple example: serializing a class to XML. As soon as inheritance is involved, this can easily turn into a mess, especially if you're trying to serialize a derived class using the base class serializer.
Sure, we can work around that. Something like a KnownType attribute and stuff. Besides being an itch in your code that you have to remember to update every time you add a derived class, that fails, too, if you receive a class from outside your scope that was not known at compile time. (Okay, in some cases you can still work around that, for instance using the NetDataContract serializer in .NET. Surely a certain advancement.)
In any case, the basic principle still exists: Serialization and inheritance don't mix well. Considering the huge list of programming strategies that became possible and even common in the past decade, I feel tempted to say that inheritance should be avoided in areas that relate to serialization (in particular remoting and databases).
Does that make sense? Or am I messing things up? How do you handle inheritance and serialization?
There are indeed a few gotchas with inheritance and serialization. One is that it leads to an asymmetry between serialization and deserialization. If a class is subclassed, this works transparently during serialization, but fails during deserialization unless the deserializer is made aware of the new class. That's why we have annotations such as @XmlSeeAlso to annotate data for XML serialization.
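That asymmetry can be shown in a few lines. A hypothetical sketch (invented classes, JSON standing in for XML): serialization handles any subclass for free, while deserialization needs an explicit registry of known types.

```python
import json

class Shape:
    def to_json(self):
        # Serialization "just works" for subclasses: we record the
        # concrete type name and whatever attributes the instance has.
        return json.dumps({"type": type(self).__name__, **self.__dict__})

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

# The registry must be updated for every new subclass -- this is the
# moral equivalent of a KnownType/@XmlSeeAlso annotation.
KNOWN_TYPES = {"Circle": Circle}

def from_json(text):
    data = json.loads(text)
    cls = KNOWN_TYPES[data.pop("type")]  # KeyError for unknown subclasses
    return cls(**data)
```

A Circle round-trips fine, but a subclass added by a third party serializes happily and then fails on the way back in.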
These problems are, however, not new with inheritance. They're frequently discussed under the open/closed world terminology. Either you consider that you know the whole world and all its classes, or you might be in a case where new classes are added by third parties. Under a closed world assumption, serialization isn't much of a problem. It's more problematic under an open world assumption.
But inheritance and the open world assumption have other gotchas anyway. E.g. if you remove a protected method in your classes and refactor accordingly, how can you ensure that there isn't a third-party class that was using it? In an open world, the public and internal API of your classes must be considered frozen once made available to others, and you must take great care in evolving the system.
There are other more technical internal details of how serialization works that can be surprising. That's for Java, but I'm pretty sure .NET has similarities. E.g. Serialization Killer, by Gilad Bracha, or Serialization and security manager bug exploit.
I ran into this on my current project, and while this might not be the best way, I created a service layer of sorts for it, with its own classes. I think it ended up being named an ObjectToSerialized translator, plus a couple of interfaces. Typically this was one-to-one (the "object" and "serialized" had the exact same properties), so adding something to the interface would let you know "hey, add this over here too".
I want to say I had an IToSerialized interface with a simple method on it for generic purposes, and used AutoMapper for most of the conversions. Sure, it's a bit more code, but hey, whatever; it worked and doesn't gum up other things.

Is there value in producing code so flexible that it will never need to be updated?

I am currently involved in a debate with my coworkers about how I should design an API that will be used by my department. Specifically, I am tasked with writing an API that will serve as a wrapper facade for accessing Active Directory information, tailored to my company's/department's needs. I am aware that open-source wrapper facades already exist, but that is not the crux of this question; it is merely being used as an example.
When I presented my design proposal to my team, they shot me down because the API was not "configurable" enough. They claimed that they did not want the API to make the link between "Phone number" and <Obscure Active Directory representation of Phone number>. Every person in the meeting (except for me) agreed that they would prefer to ask around, "What is the correct field in Active Directory to use for the user's phone number?", and plug that into their respective apps (LOL!).
They asked me, "What if our company decides to use a different field for phone number and you weren't around to make the change in your source code?" They eventually admitted that they were afraid to be tasked with changing someone else's source code, even if the code was pristine and had extensive unit tests. Every senior IT person in my department was in agreement on this.
Is this really the correct attitude to have when designing software?!
http://en.wikipedia.org/wiki/Inner_platform_effect
While hard-coding too many assumptions into your program is bad, overzealously avoiding hard-coded assumptions can be just as bad. If you try to make code excessively flexible, it becomes essentially impossible to configure, as the configuration scheme becomes almost a programming language in itself. I think in general, phone number is a common enough thing that it can just be hard coded as a field.
If I understood correctly, they want to have the option of mapping the links outside the code, be it through a configuration file, a database, whatever. If that is correct, I think they have a valid point: why be forced to change any code at all if all you need to do is change a configuration mapping?
If possible, you should always err on the side of more configurable. It will save you headaches later.
Column Names
Specifically in your case, columns in tables are inherently non-static; they will commonly change as your needs change.
If you have a "phonenum" column and they add a second phone number, they change the column to "phonenum1" and "phonenum2", and the code needs to change. Then if they change them to "Home_Phone", "Work_Phone", and "Cell_Phone", the code has to change again. If, however, you had a mapping file (a key/value config file), then all these changes would be extremely simple to make.
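The key/value mapping-file idea can be sketched in a few lines (hypothetical field names; in practice the mapping would be loaded from a config file rather than defined inline):

```python
# Hypothetical mapping: logical names the application uses on the left,
# whatever the directory currently calls the column on the right.
FIELD_MAP = {
    "home_phone": "Home_Phone",   # was "phonenum1" before the rename
    "work_phone": "Work_Phone",
}

def get_field(record, logical_name):
    # Code everywhere asks for "home_phone"; only FIELD_MAP knows the
    # real column name, so a schema rename touches one entry, not code.
    return record[FIELD_MAP[logical_name]]

record = {"Home_Phone": "555-0100", "Work_Phone": "555-0101"}
```

When the columns are renamed again, only the right-hand side of FIELD_MAP changes.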
In General
I disagree with dsimcha that an application can be 'too configurable'. What he is describing is 'feature bloat', where there are so many intertwined configurables that it becomes impossible to change any one without futzing with all the others. This is a very real problem. However, the problem is not the number of configuration options; the problem is how they are presented to the user.
Present all the configuration options in a concise, clear, streamlined manner, with comments explaining each one and how it interacts with the others. Then you can have as many configuration variables as you want, because you have been careful to keep them segregated into singles or pairs, and have marked them as such.
You should be writing applications so that external (environmental) changes do NOT require code changes. Things such as
Database user password changes
Column names change
"Temp folder" location changes
Target Machine name/ip change
App needs to be run twice a day instead of once
Logging levels
None of those changes affect the function of the application and so there should be NO CODE CHANGES required. That is the metric you should use if you ever wonder whether hard-coding is all right.
If the functionality needs to change, it should be a code change. Otherwise, make it configurable.
It seems easy enough to do both: produce a flexible API which allows the field to be specified, and then a wrapper around it which knows about the obscure ActiveDirectory name.
Of course, you could build that flexible solution later and just hard code the name for the moment. If that's significantly easier than the two-pronged approach, it's reasonable to argue for it - but if you'd probably end up with that sort of separation internally anyway, then it doesn't do much harm to expose it.
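A minimal sketch of this two-pronged approach (illustrative names; "telephoneNumber" stands in for the obscure directory attribute):

```python
# Flexible core API: the caller specifies the raw field name.
def get_attribute(record, field_name):
    return record[field_name]

# Thin convenience wrapper: the only place that knows the obscure
# Active Directory attribute name.
def get_phone_number(record):
    return get_attribute(record, "telephoneNumber")

user = {"telephoneNumber": "555-0199"}
```

Callers who want the convenience use get_phone_number; callers who need flexibility (or a renamed attribute) use get_attribute directly, which is the separation argued for above.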
I can honestly say I have been in your position before and I agree with the argument they are presenting you. Especially with an in-house app you will see feature creep. The more useful your application, the worse the feature creep. It is possible your application could be used in another office and they will have fields mapped differently than your current office. If you hard code mappings you are then stuck with different versions for different locations. Maintaining separate versions of source code quickly becomes a nightmare for a programmer. If you design in configurability now and your application is forgotten you have lost very little, but if your application becomes a standard across the company you have saved yourself an immense amount of time in the future.
Fear of change, as well as fear of accountability for making a change, is not uncommon in IT software organizations. Often, the culture in the organization is what drives this attitude.
Having said that, in your specific example, you are describing an API that provides a facade on top of the ActiveDirectory service - one that appears to be intended to be used by different groups and/or projects in your organization.
In that particular scenario, it makes sense to make portions of your API support configurability, since you may ultimately find that the requirements for different projects or groups could be different, or change over time.
As a general practice, when I build code that involves a mapping of one programming interface to another and there are data mapping considerations involved, I try to make the mapping configurable. I've found that this helps both unit testing as well as dealing with evolving requirements or contradictory requirements from different consumers.
If you're saying "should I hard code everything", then I think it's not a good idea.
In 2 years you will be gone, and some programmer will waste a lot of time trying to update your legacy code when updating a configuration file would have been far easier.
In some cases it makes sense to hard-code information, but I don't think your situation is one of them. I'd need more knowledge of the situation to be sure; this is just my guess from what you said.
I think it depends on why the API is being created, and what problems you're aiming to solve. If the aim of the API is to be a service that lives on a server somewhere and manages requests from different applications, then I think your approach is probably the way to go, with the addition of a database or config files to perhaps customize the LDAP paths of certain properties.
However, if the goal of the API is simply to be a set of classes that abstract away the details of accessing Active Directory, but not which properties are being accessed, then what your coworkers have specified is the way to go.
Either approach isn't necessarily right or wrong, so it ultimately depends on your overall reasons for creating the API in the first place.
