Does a stack state in the visitor break the visitor pattern?

I need to process the AST of a language, and a visitor over the tree solves it nicely. However, some features would require me to keep some kind of stack (the stack of known variables) in the visitor's permanent context, one that is extended and reduced as the visit progresses. Does that break the visitor pattern?

Visitors can accumulate information during their visits – in fact, the visitor implementation is exactly where the additional state required by complex operations belongs (for example, when expression tree nodes are far apart yet still need to know about each other)…
So it is safe to say that you can store state (even in the form of a stack) in the visitor, as long as you don't store any kind of information on the processed/visited nodes themselves.
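As a minimal sketch of that advice (the node and visitor types below, BlockNode, VariableDeclNode and ScopeTrackingVisitor, are hypothetical and not from the question): the visitor pushes a new scope when it enters a block, records declarations in the current scope, and pops it on the way out, while the nodes themselves stay untouched.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical AST node types, just enough to show the idea.
interface AstVisitor {
    void visit(BlockNode block);
    void visit(VariableDeclNode decl);
}

interface AstNode {
    void accept(AstVisitor visitor);
}

class VariableDeclNode implements AstNode {
    final String name;
    VariableDeclNode(String name) { this.name = name; }
    public void accept(AstVisitor visitor) { visitor.visit(this); }
}

class BlockNode implements AstNode {
    final List<AstNode> children;
    BlockNode(List<AstNode> children) { this.children = children; }
    public void accept(AstVisitor visitor) { visitor.visit(this); }
}

// The stack of known variables lives in the visitor, not on the nodes.
class ScopeTrackingVisitor implements AstVisitor {
    private final Deque<Set<String>> scopes = new ArrayDeque<>();

    public void visit(BlockNode block) {
        scopes.push(new HashSet<>());   // extend the stack on entering a block
        for (AstNode child : block.children) {
            child.accept(this);
        }
        scopes.pop();                   // reduce it again on the way out
    }

    public void visit(VariableDeclNode decl) {
        scopes.peek().add(decl.name);   // record the variable in the current scope
    }
}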

Related

What's the difference between using the Visitor pattern and a separate class?

I would like to know what the difference is between the Visitor pattern and using a static method to execute code in separation.
Let's take a look at an example where I might use the Visitor pattern:
new AnalyticsVisitor().accept(myClass);
and this, when called from myClass for example, would move the work into a visitor to execute. It would even be garbage collected sooner if it's memory intensive.
Now let's take a look at using a simple method to achieve more or less the same thing:
new AnalyticsManager().execute(myClass);
Have I achieved the same thing?
I have code separation.
I can apply this to several data structures.
I can add info to legacy code without changing it.
So why use the Visitor pattern instead of just a class (unless for double dispatch)?
This question is still a little confused. I suspect you haven't understood the goal of the Visitor pattern.
As discussed here, the visitor pattern is useful when you have a complex data structure (such as a parse tree) that is relatively stable (in terms of development), but you want to be able to keep adding new operations on all of its elements. This is clumsy with standard OO techniques.
The technology the visitor pattern is based on is double-dispatch, so when you say "Why use the Visitor pattern unless for double-dispatch?" you are effectively saying "Why use the visitor pattern?"
Your example code only includes the client, so it isn't clear what your new technique actually offers.
The supplied code appears to be backwards for a real visitor pattern. It should be:
my_datastructure.accept(analytics_visitor);
where analytics_visitor inherits from MyDataStructureVisitor, and supplies individual methods for each of the element types that the data structure can hold.
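To make the double dispatch concrete, here is a minimal Java sketch; the element types (NumberNode, AdditionNode) are invented for illustration, and AnalyticsVisitor is only a guess at what yours might look like on the receiving end of accept.

// Each element type accepts a visitor and calls back the overload for its own concrete type.
interface Visitor {
    void visit(NumberNode node);
    void visit(AdditionNode node);
}

interface Node {
    void accept(Visitor visitor);   // first dispatch: on the node's dynamic type
}

class NumberNode implements Node {
    final int value;
    NumberNode(int value) { this.value = value; }
    public void accept(Visitor visitor) { visitor.visit(this); }   // second dispatch: 'this' is a NumberNode, so that overload is chosen
}

class AdditionNode implements Node {
    final Node left, right;
    AdditionNode(Node left, Node right) { this.left = left; this.right = right; }
    public void accept(Visitor visitor) { visitor.visit(this); }
}

// A new operation is a new visitor; the node classes never change.
class AnalyticsVisitor implements Visitor {
    public void visit(NumberNode node)   { /* e.g. count literals */ }
    public void visit(AdditionNode node) {
        node.left.accept(this);
        node.right.accept(this);
    }
}

A client would then write something like new AdditionNode(new NumberNode(1), new NumberNode(2)).accept(new AnalyticsVisitor()); the node picks the correct visit overload for its own type, which is exactly what a single static method cannot do without instanceof checks.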
As for the achievements:
"Code separation" is a vague term. The visitor pattern allows the data structure to be defined without all the the operations (putative methods) to be defined. Instead, they can be defined separately - with a cost of poorer encapsulation.)
It isn't clear what it means to apply a visitor pattern to several data structures. Each visitor class is associated with one data structure.
The goal isn't to add 'info' to legacy code. It is to add operations to legacy code.

How to store a stack or long array in a database?

I am implementing depth-first tree traversal code over a large tree. A single traversal can span several days because of the long processing time at each node, and in between the system might crash or shut down.
Therefore I want to make the whole process resumable if it stops partway for some reason. For that reason I am planning to back the whole process with a persistent datastore which essentially stores the state of the process.
As I figured out, for depth-first traversal I will need a stack type of data structure, which can be realized through a linked-list or array implementation. So my question is whether there is some datastore which provides the ability to persist a large array while maintaining the order of the entries, so that it can represent a stack. Or is there some other way through which I can maintain the state of my traversal in persistent storage?
Thanks.
IMHO: You can implement a custom class with stack behavior using a linked list. This custom class should be serializable, and you store the state of the object intermittently. That way, even if the system crashes you will lose only a little data, and you can recreate the complete structure by de-serializing the object from the persistent store.
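As a rough Java sketch of that suggestion (the file-based checkpointing and the use of node ids are my own assumptions): the pending stack lives in a serializable object that is written to disk every so often and reloaded after a crash.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.Deque;

// Resumable DFS state: the stack of nodes still to visit, checkpointed to disk.
public class ResumableTraversal implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Deque<Long> pendingNodeIds = new ArrayDeque<>();

    public void push(long nodeId)  { pendingNodeIds.push(nodeId); }
    public Long pop()              { return pendingNodeIds.poll(); }
    public boolean isEmpty()       { return pendingNodeIds.isEmpty(); }

    // Checkpoint the whole stack; call intermittently, e.g. every N processed nodes.
    public void saveTo(File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(this);
        }
    }

    // After a crash, restore the stack and continue from the last checkpoint.
    public static ResumableTraversal loadFrom(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (ResumableTraversal) in.readObject();
        }
    }
}

The same approach works with a database instead of a file: serialize the stack to a blob column, or store one row per stack entry with a position column to preserve the order.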

Real name of the "container_of" pattern

E.g. in Linux driver development one can find the container_of macro. In essence it is the reverse of the -> operator, yielding the pointer to the containing structure when you have a pointer to one of its members.
Besides Greg Kroah's blog, I found this pattern in the list and hash implementations of Pintos.
The real name of this pattern is "container_of()." Attempting to fit this C-ism into a Java or C++ design pattern taxonomy is futile. The point is not to chain responsibility, or to designate or delegate anything. If you must think in these terms then it's a "messy generalized inheritance." If you don't have to think in these terms then it's a lot less messy.
I'd say it's a not-very-featureful Chain of Responsibility. The only reason you need a pointer back to your parent container structure is to place parent-container functionality within reach of the contained elements. As such, it could be seen as an implementation detail required to allow a request to trickle up the "chain" until it gets handled at the correct "level".
With a container / contained relationship, that "correct" level is just one level up, and the trickle-up doesn't go through enough levels (since there is only one level) to generate much interest as an ideal example of the pattern. Still, the general ideas behind Chain of Responsibility hold: a request is made at a point in the chain which cannot handle it, and is handled at a different point in the chain which can.
With a small non-generic container / contained relationship, the coupling of this two-link chain can get quite tight. For example, your examples lack a generic "command" handling framework (since the command language set is small), and such a framework generally requires (for type safety) a Command / Message object. That's a lot of overhead for a list that just wants to let its elements notify, at the element level, that they want to be removed from the list.
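container_of itself is C pointer arithmetic and has no direct equivalent in a language like Java, but the back-pointer idea under discussion can be sketched roughly as follows (the Element / Container names are made up): the contained element keeps an explicit reference to its container, so an element-level request like "remove me" gets handled one level up.

import java.util.ArrayList;
import java.util.List;

class Element {
    private Container owner;                 // the explicit back-pointer to the container

    void attachTo(Container owner) { this.owner = owner; }

    void requestRemoval() {
        if (owner != null) {
            owner.remove(this);              // the request trickles up one level and is handled there
        }
    }
}

class Container {
    private final List<Element> elements = new ArrayList<>();

    void add(Element element) {
        elements.add(element);
        element.attachTo(this);
    }

    void remove(Element element) {
        elements.remove(element);
    }
}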
And yes, there is a C2 patterns page for it... if you agree with my reasoning.

Building a NetHack bot: is Bayesian Analysis a good strategy?

A friend of mine is beginning to build a NetHack bot (a bot that plays the Roguelike game: NetHack). There is a very good working bot for the similar game Angband, but it works partially because of the ease in going back to the town and always being able to scum low levels to gain items.
In NetHack, the problem is much more difficult, because the game rewards ballsy experimentation and is built basically as 1,000 edge cases.
Recently I suggested using some kind of naive bayesian analysis, in very much the same way spam is detected.
Basically the bot would at first build a corpus by trying every possible action with every item or creature it finds, and storing that information along with, for instance, how close to death, injury or a negative effect it was. Over time it seems like you could generate a reasonably playable model.
Can anyone point us in the right direction of what a good start would be? Am I barking up the wrong tree or misunderstanding the idea of bayesian analysis?
Edit: My friend put up a github repo of his NetHack patch that allows python bindings. It's still in a pretty primitive state but if anyone's interested...
Although Bayesian analysis encompasses much more, the Naive Bayes algorithm well known from spam filters is based on one very fundamental assumption: all variables are essentially independent of each other. So for instance, in spam filtering each word is usually treated as a variable, which means assuming that if the email contains the word 'viagra', that knowledge does not affect the probability that it will also contain the word 'medicine' (or 'foo' or 'spam' or anything else). The interesting thing is that this assumption is quite obviously false when it comes to natural language but still manages to produce reasonable results.
Now one way people sometimes get around the independence assumption is to define variables that are technically combinations of things (like searching for the token 'buy viagra'). That can work if you know specific cases to look for, but in general, in a game environment, it means that you can't generally remember anything. So each time you have to move, perform an action, etc., it's completely independent of anything else you've done so far. I would say for even the simplest games, this is a very inefficient way to go about learning the game.
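As a toy illustration of that assumption (all the words, likelihoods and priors below are invented numbers): a Naive Bayes filter simply multiplies the per-word likelihoods together, which is only justified if the words really are independent of each other.

import java.util.List;
import java.util.Map;

public class NaiveBayesToy {
    public static void main(String[] args) {
        // Invented per-word likelihoods: P(word | spam) and P(word | ham).
        Map<String, Double> pWordGivenSpam = Map.of("viagra", 0.20, "medicine", 0.05, "meeting", 0.01);
        Map<String, Double> pWordGivenHam  = Map.of("viagra", 0.001, "medicine", 0.02, "meeting", 0.10);
        double pSpam = 0.4, pHam = 0.6;      // invented priors

        List<String> email = List.of("viagra", "medicine");

        // The "naive" step: multiply likelihoods as if each word were independent of the others.
        double spamScore = pSpam, hamScore = pHam;
        for (String word : email) {
            spamScore *= pWordGivenSpam.getOrDefault(word, 1e-6);
            hamScore  *= pWordGivenHam.getOrDefault(word, 1e-6);
        }

        System.out.printf("P(spam | email) ~ %.3f%n", spamScore / (spamScore + hamScore));
    }
}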
I would suggest looking into using q-learning instead. Most of the examples you'll find are usually just simple games anyway (like learning to navigate a map while avoiding walls, traps, monsters, etc.). Reinforcement learning is a type of online unsupervised learning that does really well in situations that can be modeled as an agent interacting with an environment, like a game (or robots). It does this by trying to figure out what the optimal action is at each state in the environment (where each state can include as many variables as needed, much more than just 'where am I'). The trick then is to maintain just enough state to help the bot make good decisions, without having a distinct point in your state 'space' for every possible combination of previous actions.
To put that in more concrete terms, if you were to build a chess bot you would probably have trouble if you tried to create a decision policy that made decisions based on all previous moves, since the set of all possible combinations of chess moves grows really quickly. Even a simpler model of where every piece is on the board is still a very large state space, so you have to find a way to simplify what you keep track of. But notice that you do get to keep track of some state, so that your bot doesn't just keep trying to make a left turn into a wall over and over again.
The wikipedia article is pretty jargon heavy but this tutorial does a much better job translating the concepts into real world examples.
The one catch is that you do need to be able to define rewards to provide as the positive 'reinforcement'. That is, you need to be able to define the states that the bot is trying to get to, otherwise it will just continue forever.
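For reference, a bare-bones tabular Q-learning sketch (the string-keyed states, learning rate and discount factor are placeholder choices, not a recommendation for how to encode NetHack):

import java.util.HashMap;
import java.util.Map;

// Tabular Q-learning: Q(s,a) <- Q(s,a) + alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a))
public class QLearner {
    private final Map<String, double[]> qTable = new HashMap<>();
    private final int numActions;
    private final double alpha;   // learning rate
    private final double gamma;   // discount factor

    public QLearner(int numActions, double alpha, double gamma) {
        this.numActions = numActions;
        this.alpha = alpha;
        this.gamma = gamma;
    }

    private double[] q(String state) {
        return qTable.computeIfAbsent(state, s -> new double[numActions]);
    }

    // Call after every step: the bot saw 'state', did 'action', got 'reward', landed in 'nextState'.
    public void update(String state, int action, double reward, String nextState) {
        double bestNext = Double.NEGATIVE_INFINITY;
        for (double v : q(nextState)) bestNext = Math.max(bestNext, v);
        double[] qs = q(state);
        qs[action] += alpha * (reward + gamma * bestNext - qs[action]);
    }

    // Greedy action for a state; a real bot would mix in some exploration.
    public int bestAction(String state) {
        double[] qs = q(state);
        int best = 0;
        for (int a = 1; a < numActions; a++) if (qs[a] > qs[best]) best = a;
        return best;
    }
}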
There is precedent: the monstrous rog-o-matic program succeeded in playing rogue and even returned with the amulet of Yendor a few times. Unfortunately, rogue was only released as a binary, not source, so it has died (unless you can set up a 4.3BSD system on a MicroVAX), leaving rog-o-matic unable to play any of the clones. It just hangs cos they're not close enough emulations.
However, rog-o-matic is, I think, my favourite program of all time, not only because of what it achieved but because of the readability of the code and the comprehensible intelligence of its algorithms. It used "genetic inheritance": a new player would inherit a combination of preferences from a previous pair of successful players, with some random offset, then be pitted against the machine. More successful preferences would migrate up in the gene pool and less successful ones down.
The source can be hard to find these days, but searching "rogomatic" will set you on the path.
I doubt bayesian analysis will get you far because most of NetHack is highly contextual. There are very few actions which are always a bad idea; most are also life-savers in the "right" situation (an extreme example is eating a cockatrice: that's bad, unless you are starving and currently polymorphed into a stone-resistant monster, in which case eating the cockatrice is the right thing to do). Some of those "almost bad" actions are required to win the game (e.g. coming up the stairs on level 1, or deliberately falling in traps to reach Gehennom).
What you could try would be trying to do it at the "meta" level. Design the bot as choosing randomly among a variety of "elementary behaviors". Then try to measure how these bots fare. Then extract the combinations of behaviors which seem to promote survival; bayesian analysis could do that among a wide corpus of games along with their "success level". For instance, if there are behaviors "pick up daggers" and "avoid engaging monsters in melee", I would assume that analysis would show that those two behaviors fit well together: bots which pick daggers up without using them, and bots which try to throw missiles at monsters without gathering such missiles, will probably fare worse.
This somewhat mimics what learning gamers often ask in rec.games.roguelike.nethack. Most questions are similar to: "should I drink unknown potions to identify them?" or "what level should my character be before going that deep in the dungeon?". Answers to those questions heavily depend on what else the player is doing, and there is no good absolute answer.
A difficult point here is how to measure the success at survival. If you simply try to maximize the time spent before dying, then you will favor bots which never leave the first levels; those may live long but will never win the game. If you measure success by how deep the character goes before dying then the best bots will be archeologists (who start with a pick-axe) in a digging frenzy.
Apparently there are a good number of Nethack bots out there. Check out this listing:
In NetHack, unknown actions usually have a boolean effect -- either you gain or you lose. Bayesian networks are based around "fuzzy logic" values -- an action may give a gain with a given probability. Hence, you don't need a bayesian network, just a list of "discovered effects" and whether they are good or bad.
No need to eat the Cockatrice again, is there?
All in all it depends how much "knowledge" you want to give the bot as starters. Do you want him to learn everything "the hard way", or will you feed him spoilers 'till he's stuffed?

HCI: make the user wait through everything up front, or amortize?

I'm writing a Silverlight app that queries a web service to populate a tree control. Each element will have at least 2 levels of children, so something like this:
a
+-b
  +-c
d
+-g
  +-h
e
+-i
  +-j
f
+-k
  +-l
The web service API is such that I can only get one level of child nodes at a time, so the first trip, I can get a,d,e,f. To get b,g,i,k, I have to make 4 trips. Similarly, I have to make 4 more trips to get c,h,j,l. (The service does actually allow me to get all the nodes in one trip, but it doesn't give me parent-child relationships along with it :-()
My question is this: should I make the user wait for a while up front while I get all the nodes for the tree view, or should I just get the top few nodes, and get the other nodes on-demand, or in a background task? Also, the nodes can change asynchronously, so if I get all the nodes up front, I'll need a "refresh" button for the treeview, and if I do it on demand, I'll have to have a caching strategy.
Which is best for the user?
A compromise: load the first level up front, then load the remaining items in the background, overridden by on-demand loading as required. If you load the nodes breadth-first (e.g. a, d, e, f then b, g, i, k) rather than depth-first (e.g. a, d, e, f followed by b, c) you can redirect the background loading to focus on the most recently expanded node.
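A rough sketch of that compromise (the Node type and ChildService stand in for the real tree items and the web-service call): keep a queue of nodes whose children still need loading, drain it breadth-first on a background thread, and push a node to the front of the queue when the user expands it.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Background breadth-first loader that lets user expansions jump the queue.
class TreeLoader {
    interface ChildService { List<Node> fetchChildren(Node parent); }   // placeholder for the web service
    static class Node { final List<Node> children = new ArrayList<>(); boolean loaded; }

    private final Deque<Node> pending = new ArrayDeque<>();
    private final ChildService service;

    TreeLoader(ChildService service, List<Node> topLevel) {
        this.service = service;
        pending.addAll(topLevel);                // start breadth-first from the first level
    }

    // Call when the user expands a node: load it next instead of in queue order.
    synchronized void prioritize(Node node) {
        if (!node.loaded && pending.remove(node)) {
            pending.addFirst(node);
        }
    }

    // Run repeatedly on a background thread; one service trip per call.
    synchronized boolean loadNext() {
        Node next = pending.poll();
        if (next == null) return false;
        next.children.addAll(service.fetchChildren(next));
        next.loaded = true;
        pending.addAll(next.children);           // queue up the next level
        return true;
    }
}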
Personally, as a user, I would prefer all the data to be loaded up front, so that once the application finishes loading I can trust that I won't have to wait anymore (or at least very little).
But, I suppose it depends on several traits of your application / data:
How dynamic is the data? Does it update more often than the rate at which the user explores the nodes? If it does, then you will have to read the data as the user explores it; otherwise you can probably get away with only updating it occasionally and checking for the freshest data before performing important operations.
How much of the data will the user explore during normal use? If they are constantly exploring throughout the entire tree, then having the entire tree loaded is important. On the other hand, if most users will usually only expand a small portion of the tree, then maybe loading on demand is better so you don't waste their time loading data they will never see anyway.
How much effect will this have on performance? Does it really take a long time to load all the data? If the data is not too much, maybe the whole thing can be loaded in a matter of seconds, in which case the optimization will not be noticeable to the end user and the work to implement it will not have a good return on investment.
Most likely you don't have clear cut answers to these questions, but they're probably good to consider when you're attacking this interesting problem.
Short answer is to make the user wait for as little as possible. They will curse your name if they have to wait 10-20 seconds on application load, but not notice 0.1-0.2 seconds for a tree node to expand.
I have an app in production with a similar structure. I cannot load up-front because it'd be effectively loading the entire database. Here's my strategy:
The tree control starts with 1 level expanded below the root.
Each unexpanded node has a dummy child node in order to get the [+] expansion icon to show
When a node is expanded, it fires an event which is trapped by the app. If the only child node is the dummy one, the dummy is deleted and the children are loaded from the database.
Changes in the data are not reflected automatically by visible nodes, however the context menu for the tree has a Refresh item that can be used to refresh a node.
I have considered showing updates asynchronously, but have tended to avoid it because large amounts of data can be shown in the tree and I'm wary of DB load if I'm checking them all for changes.
The app is WinForms, written in C# using .NET 2.0.
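The dummy-child technique looks much the same in any tree control; since the original app is C#/WinForms, here is a rough Java Swing analogue of the idea (the ChildLoader callback stands in for the database query):

import javax.swing.JTree;
import javax.swing.event.TreeExpansionEvent;
import javax.swing.event.TreeWillExpandListener;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;
import javax.swing.tree.ExpandVetoException;
import java.util.List;

// Every unexpanded node carries a dummy child so the expand icon shows;
// real children are fetched the first time the node is expanded.
public class LazyTree {
    private static final String DUMMY = "loading...";

    interface ChildLoader { List<?> loadChildren(Object parent); }   // placeholder for the data access layer

    static DefaultMutableTreeNode withDummy(Object userObject) {
        DefaultMutableTreeNode node = new DefaultMutableTreeNode(userObject);
        node.add(new DefaultMutableTreeNode(DUMMY));
        return node;
    }

    static JTree build(DefaultMutableTreeNode root, ChildLoader loader) {
        DefaultTreeModel model = new DefaultTreeModel(root);
        JTree tree = new JTree(model);
        tree.addTreeWillExpandListener(new TreeWillExpandListener() {
            public void treeWillExpand(TreeExpansionEvent event) throws ExpandVetoException {
                DefaultMutableTreeNode node =
                        (DefaultMutableTreeNode) event.getPath().getLastPathComponent();
                // Only the dummy present? Replace it with the real children.
                if (node.getChildCount() == 1
                        && DUMMY.equals(((DefaultMutableTreeNode) node.getChildAt(0)).getUserObject())) {
                    node.removeAllChildren();
                    for (Object child : loader.loadChildren(node.getUserObject())) {
                        node.add(withDummy(child));
                    }
                    model.nodeStructureChanged(node);
                }
            }
            public void treeWillCollapse(TreeExpansionEvent event) { }
        });
        return tree;
    }
}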
