A question about the inner workings of ASP - logic-programming

I'm confused about the inner workings of ASP. Say I'm modelling a country's road network. For example, say we have
city(london).
city(manchester).
city(leeds).
city(newcastle).
city(sheffield).
road(a1).
road(b1).
connects(a1, london,sheffield).
connects(b1, sheffield, newcastle).
closed(b1).
etc. What I'm wondering is the following. Say I want to work out whether there is a route between two cities. So, I define a predicate like this:
% default negation: R is not known to be closed
is_road(A, B) :- connects(R, A, B),
                 not closed(R).
road(A, B) :- is_road(A, B).
road(A, C) :- is_road(A, B),
              road(B, C).
Then does ASP work out all the routes, or does it just work out that there is a route between two cities and stop there? This has become important for me because I'm trying to define a predicate for a slow road, and how I do this depends on whether ASP has access to all the possible routes or not. My intuition tells me that it should, because ASP works by grounding and solving (so surely all the correct variable instantiations should show up), but the algorithm ASP uses to compute answer sets is somewhat mysterious.
To clarify: say there's a road from London to Sheffield and one from Sheffield to Newcastle. Then there is a route from London to Newcastle using those two roads. But another route would be going back to London and then going on to Newcastle again, or, if there is also a road from Sheffield to Leeds and one from Leeds to Newcastle, a third route would be to go to Sheffield and then to Newcastle via Leeds.
Thank you for any help.
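One rough way to picture the "grounding and solving" intuition above is naive bottom-up evaluation: keep applying the two road rules until no new atom appears. The sketch below is plain Python, not how any particular ASP solver is implemented, and it assumes that closed(R) is meant as "not known to be closed". Under those assumptions, what gets derived is every reachable road(A, C) pair, not a list of distinct routes: a detour that loops back through London derives the same atom again and adds nothing new.

```python
# Facts from the question: connects(Road, From, To).
connects = {("a1", "london", "sheffield"), ("b1", "sheffield", "newcastle")}


def reachable_pairs(connects, closed):
    """Naive fixpoint of the two road/2 rules over roads that are not closed."""
    # is_road(A, B) :- connects(R, A, B), not closed(R).
    is_road = {(a, b) for (r, a, b) in connects if r not in closed}
    # road(A, B) :- is_road(A, B).
    road = set(is_road)
    while True:
        # road(A, C) :- is_road(A, B), road(B, C).
        derived = {(a, c) for (a, b) in is_road for (b2, c) in road if b == b2}
        if derived <= road:
            return road  # nothing new was derived: the fixpoint is reached
        road |= derived


print(sorted(reachable_pairs(connects, closed={"b1"})))
print(sorted(reachable_pairs(connects, closed=set())))
```

With b1 closed only the London to Sheffield pair is derived; with no closures the London to Newcastle pair appears as well, but only once, however many ways there are to get there.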

Related

What Database/ Datastructure to use for discrete 2D (landscape) map of values?

I hope this is not too stupid a question to ask. Grateful for any advice.
I want to store a 2D landscape consisting of discrete values in a database.
The value for each point represents what type of landscape it is (20 = tree).
Not every point has to be defined/stored.
I expect ~10,000 read operations per minute; the write operations are negligible (<1 per minute).
An optimal query might look like: "Return all values around point (x/y) with radius r." or "Return all values in a given rectangle."
I won't host the hardware myself and will rent whatever server/cloud solution is needed. In case it makes any difference: the preferred language for BE and FE is JS, but Java, Scala and C# are viable options, too.
What kind of database and data structure would you use for this use case?
What if clients should be able to subscribe to changes in certain areas - how would you tackle this problem in theory?
What if each point holds a JSON, not an integer?
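To make the two query shapes concrete, here is a toy in-memory sketch in Python. It is only an illustration of the requirements, not a database recommendation: the class name SparseGrid and its methods are made up for this example, and it scans every stored point, which a real store would avoid with some kind of spatial index.

```python
from math import hypot


class SparseGrid:
    """Sparse 2D map: only points that were explicitly set are stored."""

    def __init__(self):
        self._cells = {}  # (x, y) -> value, e.g. 20 for "tree"

    def set(self, x, y, value):
        self._cells[(x, y)] = value

    def in_rectangle(self, x_min, y_min, x_max, y_max):
        """All stored points inside the inclusive rectangle."""
        return {
            (x, y): v
            for (x, y), v in self._cells.items()
            if x_min <= x <= x_max and y_min <= y <= y_max
        }

    def in_radius(self, cx, cy, r):
        """All stored points within distance r of (cx, cy)."""
        # Narrow down with the bounding square first, then check the distance.
        box = self.in_rectangle(cx - r, cy - r, cx + r, cy + r)
        return {p: v for p, v in box.items() if hypot(p[0] - cx, p[1] - cy) <= r}
```

The value stored per point can be any object, so the "JSON instead of an integer" variant does not change the query logic in this sketch.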

Cleaning up variable with highly similar observations

So I have a dataset in Stata that has a variable called "program description" that has very similar observations although the observations don't follow any pattern. My objective is to clean the variable so that the observations which are very similar will have the same name.
Here is an example of what the variable looks like:
Variable Name
phys ed
physical education
phys ed k-12
learning disabilities
learn dis
learn disable
Therefore, I would like the first three to just be called "phys ed" (or some derivative of that) and the last three to just be called "learning disabilities".
I've been using the function strpos() to replace observations that contain certain phrases, but because the variable has 100k observations and a lot of different names, this takes a while.
You can use strgroup from SSC, but it's unlikely to get you all the way there. For example, this seems to work:
. strgroup string , gen(group) threshold(.7) normalize(longer)
. list, clean noobs
                   string   group
                  phys ed       1
       physical education       1
             phys ed k-12       1
    learning disabilities       2
                learn dis       2
            learn disable       2
However, "physics" would have been mapped to group 1 with these settings. Also, note that this command is case sensitive, so it might make sense to uppercase/lowercase everything first. The threshold is really a kind of tuning parameter.
I've also had some luck with OpenRefine (formerly Google Refine) with these problems. This is called reconciliation.
With all these approaches, some standardization goes a long way.
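For readers who want to see what threshold-based fuzzy grouping does, here is a minimal Python sketch. It is not a port of strgroup: it assumes that matching means "Levenshtein edit distance divided by the length of the longer string is at most the threshold", and the function names are made up for this example. It lowercases and trims first, in the spirit of the standardization advice above.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer string."""
    longer = max(len(a), len(b))
    return levenshtein(a, b) / longer if longer else 0.0


def group_strings(strings, threshold):
    """Label strings so that any pair within the threshold shares a label."""
    cleaned = [s.strip().lower() for s in strings]  # standardize first
    parent = list(range(len(cleaned)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(cleaned)):
        for j in range(i + 1, len(cleaned)):
            if normalized_distance(cleaned[i], cleaned[j]) <= threshold:
                parent[find(j)] = find(i)
    return [find(i) for i in range(len(cleaned))]
```

Under this definition "physics" is 3 edits away from "phys ed" (3/7, well under .7), which illustrates the false-match risk mentioned above. The pairwise loop is quadratic, so for 100k observations it would make sense to run it on the distinct values only.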

How to go about creating a prolog program that can work backwards to determine steps needed to reach a goal

I'm not sure what exactly I'm trying to ask. I want to be able to make some code that can easily take an initial and final state and some rules, and determine paths/choices to get there.
So think, for example, of a game like Starcraft. To build a factory I need to have a barracks and a command center already built. So if I have nothing and I want a factory I might say ->Command Center->Barracks->Factory. Each thing takes time and resources, and that should be noted and considered in the path. If I want my factory at 5 minutes there are fewer options than if I want it at 10.
Also, the engine should be able to calculate available resources and utilize them effectively. Those three buildings might cost 600 total minerals but the engine should plan the Command Center when it would have 200 (or w/e it costs).
This would ultimately have requirements similar to 10 marines @ 5 minutes, infantry weapons upgrade at 6:30, 30 marines at 10 minutes, Factory @ 11, etc...
So, how do I go about doing something like this? My first thought was to use some procedural language and make all the decisions from the ground up. I could simulate the system, branching and making different choices. Ultimately, some choices are going to quickly make it impossible to reach goals later (if I build 20 Supply Depots I'm probably not going to make that factory on time).
So then I thought, weren't functional languages designed for this? I tried to write some Prolog but I've been having trouble with stuff like time and distance calculations. And I'm not sure of the best way to return the "plan".
I was thinking I could write:
depends_on(factory, barracks).
depends_on(barracks, command_center).
builds_from(marine, barracks).
build_time(command_center, 60).
build_time(barracks, 45).
build_time(factory, 30).
minerals(command_center, 400).
...
build(X) :-
    depends_on(X, Y),
    build_time(X, T),
    minerals(X, M),
    ...
Here's where I get confused. I'm not sure how to construct this function and a query to get anything even close to what I want. I would have to somehow account for rate at which minerals are gathered during the time spent building and other possible paths with extra gold. If I only want 1 marine in 10 minutes I would want the engine to generate lots of plans because there are lots of ways to end with 1 marine at 10 minutes (maybe cut it off after so many, not sure how you do that in prolog).
I'm looking for advice on how to continue down this path or advice about other options. I haven't been able to find anything more useful than Towers of Hanoi and ancestry examples for AI, so even some good articles explaining how to use Prolog to DO REAL THINGS would be amazing. And if I somehow can get these rules set up in a useful way, how do I get the "plans" Prolog came up with (ways to solve the query) other than writing to stdout like all the Towers of Hanoi examples do? Or is that the preferred way?
My other question is: my main code is in Ruby (and potentially other languages), and the options to communicate with Prolog are calling my Prolog program from within Ruby, accessing a virtual file system from within Prolog, or some kind of database structure (unlikely). I'm using SWI-Prolog atm. Would I be better off doing this procedurally in Ruby, or would constructing this in a functional language like Prolog or Haskell be worth the extra effort integrating?
I'm sorry if this is unclear, I appreciate any attempt to help, and I'll re-word things that are unclear.
Your question is typical and very common for users of procedural languages who first try Prolog. It is very easy to solve: You need to think in terms of relations between successive states of your world. A state of your world consists for example of the time elapsed, the minerals available, the things you already built etc. Such a state can be easily represented with a Prolog term, and could look for example like time_minerals_buildings(10, 10000, [barracks,factory]). Given such a state, you need to describe what the state's possible successor states look like. For example:
state_successor(State0, State) :-
    State0 = time_minerals_buildings(Time0, Minerals0, Buildings0),
    Time is Time0 + 1,
    can_build_new_building(Buildings0, Building),
    building_minerals(Building, MB),
    Minerals is Minerals0 - MB,
    Minerals >= 0,
    % the new building joins the ones that were already built
    State = time_minerals_buildings(Time, Minerals, [Building|Buildings0]).
I am using the explicit naming convention (State0 -> State) to make clear that we are talking about successive states. You can of course also pull the unifications into the clause head. The example code is purely hypothetical and could look rather different in your final application. In this case, I am describing that the new state's elapsed time is the old state's time + 1, that the new amount of minerals decreases by the amount required to build Building, and that I have a predicate can_build_new_building(Bs, B), which is true when a new building B can be built assuming that the buildings given in Bs are already built. I assume it is a non-deterministic predicate in general, and will yield all possible answers (= new buildings that can be built) on backtracking, and I leave it as an exercise for you to define such a predicate.
Given such a predicate state_successor/2, which relates a state of the world to its direct possible successors, you can easily define a path of states that lead to a desired final state. In its simplest form, it will look similar to the following DCG that describes a list of successive states:
states(State0) -->
    (   { final_state(State0) } -> []
    ;   [State0],
        { state_successor(State0, State1) },
        states(State1)
    ).
You can then use for example iterative deepening to search for solutions:
?- initial_state(S0), length(Path, _), phrase(states(S0), Path).
Also, you can keep track of states you already considered and avoid re-exploring them etc.
The reason you get confused with the example code you posted is essentially that build/1 does not have enough arguments to describe what you want. You need at least two arguments: One is the current state of the world, and the other is a possible successor to this given state. Given such a relation, everything else you need can be described easily. I hope this answers your question.
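Since the question mentions that the surrounding code is not in Prolog, the same idea (a successor relation between states, searched by iterative deepening) can be sketched in Python for comparison. This is only an illustration under stated assumptions: the mineral costs other than the command center's 400 are made-up numbers, each step builds exactly one building, and there is no mineral income, as in the hypothetical Prolog clause above.

```python
from typing import NamedTuple

# depends_on is taken from the question; the costs for barracks and
# factory are invented for this example.
DEPENDS_ON = {"factory": "barracks", "barracks": "command_center"}
MINERALS = {"command_center": 400, "barracks": 150, "factory": 200}


class State(NamedTuple):
    time: int
    minerals: int
    buildings: frozenset


def successors(state):
    """Yield (building, next_state) for every building that can be built now."""
    for building, cost in MINERALS.items():
        prerequisite = DEPENDS_ON.get(building)
        if building in state.buildings:
            continue
        if prerequisite is not None and prerequisite not in state.buildings:
            continue
        if state.minerals >= cost:
            yield building, State(
                state.time + 1,
                state.minerals - cost,
                state.buildings | {building},
            )


def plan(state, goal, depth):
    """Depth-limited search: a list of build steps, or None if none fits."""
    if goal in state.buildings:
        return []
    if depth == 0:
        return None
    for building, next_state in successors(state):
        rest = plan(next_state, goal, depth - 1)
        if rest is not None:
            return [building] + rest
    return None


def shortest_plan(start, goal, max_depth=10):
    """Iterative deepening: try depth 0, 1, 2, ... until a plan is found."""
    for depth in range(max_depth + 1):
        steps = plan(start, goal, depth)
        if steps is not None:
            return steps
    return None


start = State(time=0, minerals=1000, buildings=frozenset())
print(shortest_plan(start, "factory"))
```

The plan comes back as an ordinary list rather than being written to stdout, which is the Python counterpart of getting the path as the Path variable in the Prolog query.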
Caveat: my Prolog is rusty and shallow, so this may be off base.
Perhaps a 'difference engine' approach would be appropriate:
given a goal like 'build factory',
backwards-chaining relations would check for has-barracks and tell you first to build-barracks,
which would check for has-command-center and tell you to build-command-center,
and so on,
accumulating a plan (and costs) along the way
If this is practical, it may be more flexible than a state-based approach... or it may be the same thing wearing a different t-shirt!
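The backward-chaining steps listed above can be sketched in a few lines of Python. This is a toy under simplifying assumptions: each building has at most one prerequisite (as in the question's depends_on facts), the times are the question's build_time values, and minerals are ignored.

```python
# Facts from the question: building -> what it depends on, and build times.
DEPENDS_ON = {"factory": "barracks", "barracks": "command_center"}
BUILD_TIME = {"command_center": 60, "barracks": 45, "factory": 30}


def build_order(goal, already_built=()):
    """Chain backwards from the goal, then return the steps in build order."""
    steps = []
    target = goal
    while target is not None and target not in already_built:
        steps.append(target)
        target = DEPENDS_ON.get(target)  # None once there is no prerequisite
    steps.reverse()
    # Accumulate the cost (here: total build time) along the way.
    return steps, sum(BUILD_TIME[s] for s in steps)


print(build_order("factory"))
print(build_order("factory", already_built={"command_center"}))
```

Once resources and timing constraints enter the picture, the chain of prerequisites alone no longer determines the plan, which is where a state-based search becomes useful.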

Getting win percentages for Texas hold'em poker without monte carlo/exhaustive enumeration

Sorry, I'm just starting this project and don't have any ideas or code; I'm asking more of a theoretical question than a programming one.
It seems that every Google search provides the same responses, and it's very hard to find an answer to this question:
Is there a way to calculate win percentages for texas holdem poker (the same way they do on poker after dark or other televised poker events) without using the monte carlo/exhaustive enumeration methods. Assuming all cards are face up and we know every card in the deck.
Every response on other forums just seems to be "use pokerstove" or something similar, I'm looking for the theory to write the code.
Thanks.
Is there a way to calculate win percentages for texas holdem poker
(the same way they do on poker after dark or other televised poker
events) without using the monte carlo/exhaustive enumeration methods.
In specific instances it is possible...
You can use a perfect lookup table for two-player heads-up preflop matchups: note that the "typical" 169 vs 169 approximation ain't good enough (say Jh Th vs 9h 8h ain't really "JTs vs 98s": I mean, that would be quite a gross approximation).
Besides that if you have a lot of memory and if you can live with gigantic cache misses, you technically could precompute gigantic lookup tables (say on the server side) and do lookups for other cases (e.g. for every possible three players all-in matchups preflop), but you'd really need a lot of memory : )
Note that "full enumeration" at flop and turn ain't an issue: at flop there's only 2 more cards to come, so there are typically only C(45,2) [two players all-in at flop, we know 2*2 holecards + 3 community cards -- hence leaving 990 possibilities] or C(43,2) [three players all-in at flop, we know 3*2 holecards + 3 community cards].
So an actual evaluator would not use one but several methods. For example:
lookup table for two players all-in preflop (the fastest)
full enum for any number of players all-in at flop or turn because it's tiny (max 990 possibilities) -- very fast
monte-carlo or full enum for three players or more all-in preflop -- incredibly slower
It is interesting to see here that in the most typical cases you'll get the result very, very fast: most actual all-ins involve two players, not three or more.
So you're either looking up in a "1 vs 1 preflop" lookup table or doing a full C(45,2) or C(44,1) enum (which are, in both cases, amazingly fast; with two players all-in at the turn, 2*2 holecards + 4 community cards are known, leaving 44 cards).
It's really only the "three players or more all-in preflop" case which does take time.
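The enumeration sizes quoted above are easy to check. The small Python helper below (the name runouts is made up for this example) counts how many ways the rest of the board can be dealt when every player's hole cards are face up.

```python
from math import comb


def runouts(players: int, board_cards: int) -> int:
    """Ways to deal the remaining board cards when all hole cards are known."""
    unseen = 52 - 2 * players - board_cards
    return comb(unseen, 5 - board_cards)


print(runouts(2, 3))  # two players all-in at the flop
print(runouts(3, 3))  # three players all-in at the flop
print(runouts(2, 4))  # two players all-in at the turn
print(runouts(2, 0))  # two players all-in preflop
```

The flop and turn counts are in the tens or hundreds, while a single preflop matchup already needs more than a million boards, which is why a precomputed table pays off there.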
The answer is no.
There is no closed form computation that you can do to compute poker equities. Using combinatorics, you can identify and solve many subproblems, which speeds up computation.
For example, if you are considering all five-card hands, there are 52 choose 5 = 2,598,960 different hands. But knowing that suits are equivalent and using combinatorial methods (either analytic or computational), you can reduce the space of all hands to 134,459 classes, each weighted according to the number of different hands in its equivalence class.
There are also various ways of using exhaustive evaluations tailored to your application. If you need to perform some subset of evaluations repeatedly, you can use caches or precomputed lookup tables targeted to your specific needs.

Best way of splitting website into Cities/Countries?

I am running a dating website, or better said, trying to. I want to be as generic as possible in terms of coverage for countries.
Since it's a locally oriented dating website, I have to keep track of cities and so on, so I am running into a few problems:
I need to have information about the cities people could join from (for a country I don't know anything about, I really can't decide what counts as a city or anything).
When people join from different countries, they would like to see people nearby. How would you approach this sort of problem?
These may seem like two easy points, but they are really giving me some trouble these days.
** Try to approach them as database-related issues **
I circled around SE and found this to be the best place to ask, rather than any other SE website.
Thank you very much
The full name of a city in the USA is {country_name, state_name, city_name}. I say that assuming that, on a global scale, what I call state_name is not unique. I can easily imagine some region of a Latin American country being named "Nevada" or "Colorado".
You also have the problem of spelling. An American living in Germany might say she lived in Cologne, North Rhine-Westphalia, Germany. A German living there would call it "Köln".
"Nearby" is an even tougher nut to crack. I used to live in a big city. It wasn't unusual to drive an hour on an Interstate highway to pick up my date. I thought of that as "nearby". Now I live in a small city. Now 30 minutes seems like a long drive, but it's probably only 15 or 20 miles instead of 65 or 70.
If you live in a border town, 15 miles might be in a different country. Maybe two different countries.
I think your best bet is to use a geolocation API or service to get the latitude and longitude of new cities when they're inserted. The calculation of distance given two points is straightforward, but it leads to badly performing queries unless you use a bounding box. (You don't want to calculate the distance to every city in Texas to find people "near" Waco.)
Get the latitude/longitude for all of the cities. Don't worry about exact distances.
Play with Excel and find the maximum change in latitude and/or longitude (Δlat or Δlong) for given distances (10 miles, 25 miles, 50 miles), and then when you search for others, just search the database for within (Firstuser'sLongitude +/- Δlong) and within (Firstuser'sLatitude +/- Δlat).
I don't think people will mind the difference between 10 miles and 10*2^0.5 miles.
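As a rough sketch of the bounding-box idea, the Δlat and Δlong values can also be computed directly instead of being tabulated. The Python below assumes roughly 69 miles per degree of latitude (an approximation) and treats the Earth as a sphere; the function name is made up for this example, and it does not handle the poles or boxes that cross the ±180° meridian.

```python
from math import cos, radians

MILES_PER_DEGREE_LAT = 69.0  # rough average, an assumption of this sketch


def bounding_box(lat, lon, miles):
    """Return (min_lat, max_lat, min_lon, max_lon) around a point."""
    d_lat = miles / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks with cos(latitude) away from the equator.
    d_lon = miles / (MILES_PER_DEGREE_LAT * cos(radians(lat)))
    return (lat - d_lat, lat + d_lat, lon - d_lon, lon + d_lon)
```

The four numbers would go into a simple range condition on the stored latitude and longitude columns, so that indexes on those columns can be used; an exact distance check, if wanted at all, then only has to run on the few rows inside the box.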
