Pumping Lemma and Hierarchy - theory

I have a question involving the Pumping Lemma for Regular Languages and the Pumping Lemma for Context-Free Languages:
Is it possible that there's a language which doesn't meet the criteria of the pumping lemma for context-free languages but does meet the criteria of the pumping lemma for regular languages?
Or is there a hierarchy similar to the Chomsky hierarchy?
I'm just trying to understand that, and the pumping lemma in general.

Is it possible that there's a language which doesn't meet the criteria of the pumping lemma for context-free languages but does meet the criteria of the pumping lemma for regular languages?
Let's consider the classic a^nb^n language. aabb is in it, while abb is not.
We know it is a CFL. (S -> aSb | epsilon)
We can prove that it is not a regular language using the PL for RL (cf. https://stackoverflow.com/a/2309755).
The PL for CFL is used to prove that a language is NOT context-free.
But the language IS context-free (see above!).
Thus we can never use the PL for CFL on this language to prove that it is not CF.
A regular language [...] must be a CFL itself and therefore should be able to meet the PL criteria for CFL or am I wrong?
Yes, any RL is also a CFL (and also a CSL and also a REL).
You are wrong in your conclusion though.
The PL is used to prove that a given language is NOT in the class. So we use the PL for RL to show that a language is not regular, so "at most" context-free. And we use the PL for CFL to show that a language is not even context-free, so "at most" context-sensitive.
Is there a hierarchy similar to the Chomsky hierarchy?
Well, if you can prove a language is not context-free, it cannot be regular either, as RL is a subset of CFL.
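To make the argument concrete, here is a small brute-force sketch in Python (an illustration for one fixed pumping length, not a proof): for w = a^p b^p, no decomposition w = xyz with |xy| <= p and |y| >= 1 survives pumping, which is exactly what the PL for RL requires of a regular language.

```python
def in_lang(w):
    """Membership test for L = { a^n b^n : n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def pumpable(w, p):
    """True if some split w = xyz with |xy| <= p and |y| >= 1 keeps
    x + y*k + z in the language for k = 0, 1, 2."""
    for i in range(p + 1):                 # x = w[:i]
        for j in range(i + 1, p + 1):      # y = w[i:j], non-empty
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_lang(x + y * k + z) for k in (0, 1, 2)):
                return True
    return False

p = 4
w = "a" * p + "b" * p      # the standard witness string
print(pumpable(w, p))      # False: every allowed split fails when pumped
```

Since |xy| <= p forces y to consist only of a's, pumping y up or down unbalances the a's and b's, so every split fails; that is the PL-for-RL argument in miniature.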

Related

library(samsort) in SICStus Prolog

In SICStus Prolog there's library(samsort),
a library for generic sorting.
The library exports the predicates samsort/2, samsort/3, samkeysort/2, and others.
I can see the use of these predicates, but I'm somewhat puzzled by the name prefix sam.
What is "sam"? Is it some abbreviation, initialism, or acronym?
What does "sam" mean? Is there a story/history behind the name "sam"?
This is Smooth Applicative Merge sort.
Originally from the DEC-10 Prolog library, I think.
It was described in a technical report, which I cannot find, unfortunately:
O’Keefe, R.: A smooth applicative merge sort. Tech. rep., Department of Artificial Intelligence, University of Edinburgh (1982)

Maximum minimum and mean in Datalog

I cannot figure out how to calculate a mean, maximum, and minimum using the Datalog declarative logic programming language.
E.g., considering this simple schema:
Flows(Stream, River)
Rivers(River, Length)
If I want
a) the mean length of the rivers,
b) the longest river,
c) and the river with the fewest streams,
what are the right Datalog queries?
I have read the Datalog theory, but cannot figure out how these queries (simple in other languages) could be solved with Datalog, and I haven't found any similar example.
NOTE
The Datalog that I use has basic arithmetic functions like y is z+1, y is z-1, y is z/1 or y is z*1, and you can use X<Y or Y>X statements, and negation, so theoretically it should be possible to do this kind of interrogation in some way, since it has enough expressive power.
Is negation supported or not? If so, we can do max (or min) as follows:
shorterRivers(R1, L1) :- Rivers(R1, L1), Rivers(R2, L2), L1 < L2.   % R1 is shorter than some river
longestRivers(R1, L1) :- Rivers(R1, L1), !shorterRivers(R1, L1).    % no river is longer than R1
"mean" will be harder to do, as it require "SUM" and "COUNT" aggregations.
Standard Datalog supports first-order logic only, which does not include aggregate functions. However, some datalog implementations, such as pyDatalog, supports aggregate functions as an extension.
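For readers coming from SQL-like languages, here is a plain-Python reading of what the two rules above compute, and of the mean that core Datalog cannot express. The river facts are made up for illustration.

```python
# Sample facts standing in for the Rivers(River, Length) relation.
rivers = [("Nile", 6650), ("Amazon", 6400), ("Danube", 2850)]

# shorterRivers(R1, L1) :- Rivers(R1, L1), Rivers(R2, L2), L1 < L2.
shorter = {(r1, l1) for (r1, l1) in rivers
                    for (r2, l2) in rivers if l1 < l2}

# longestRivers(R1, L1) :- Rivers(R1, L1), not shorterRivers(R1, L1).
longest = [(r, l) for (r, l) in rivers if (r, l) not in shorter]
print(longest)              # [('Nile', 6650)]

# The mean needs SUM and COUNT aggregates, which first-order Datalog lacks:
mean = sum(l for _, l in rivers) / len(rivers)
print(mean)                 # 5300.0
```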

What is the use of the 'confidence' and 'lift' concepts in the Apriori algorithm?

I am going to implement a personal recommendation system using the Apriori algorithm.
I know there are three useful concepts: 'support', 'confidence', and 'lift', and I already know what they mean. I also know how to find the frequent item sets using support. But I wonder why the confidence and lift concepts are there if we can find frequent item sets using the support rule alone?
Could you explain why 'confidence' and 'lift' are there when 'support' has already been applied, and how I can proceed with 'confidence' and 'lift' if I have already used support on the data set?
I would be highly obliged if you could answer with SQL queries, since I am still an undergraduate. Thanks a lot.
Support alone yields many redundant rules.
e.g.
A -> B
A, C -> B
A, D -> B
A, E -> B
...
The purpose of lift and similar measures is to remove complex rules that are not much better than the simple rule.
In the above case, the simple rule A -> B may have less confidence than the complex rules, but much more support. The other rules may be just coincidences of this strong pattern, with marginally stronger confidence because of the smaller sample size.
Similarly, if you have:
A -> B confidence: 90%
C -> D confidence: 90%
A, C -> B, D confidence: 80%
then the last rule is actually bad, despite its high confidence!
The first two rules already yield the same outcome, with higher confidence: applying them independently would predict B and D in about 81% of cases (0.9 × 0.9), so at 80% the combined rule performs worse than the simple rules it is built from, not better.
Thus, support and confidence are not enough to consider.
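To make the three measures concrete, here is a small Python sketch; the transaction data is made up for illustration, and the formulas are the standard definitions (confidence = support(LHS ∪ RHS) / support(LHS), lift = confidence / support(RHS)).

```python
# Hypothetical market-basket transactions.
transactions = [
    {"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B", "C"}, {"A", "B"},
]
N = len(transactions)

def support(items):
    """Fraction of transactions containing all the given items."""
    return sum(items <= t for t in transactions) / N

def confidence(lhs, rhs):
    """How often the rule LHS -> RHS holds when LHS is present."""
    return support(lhs | rhs) / support(lhs)

def lift(lhs, rhs):
    """Confidence relative to chance; > 1 means LHS genuinely raises
    the probability of RHS, around 1 means the rule adds nothing."""
    return confidence(lhs, rhs) / support(rhs)

print(support({"A", "B"}))       # 0.6
print(confidence({"A"}, {"B"}))  # 0.75
print(lift({"A"}, {"B"}))        # 0.9375
```

Here lift < 1, so even though the confidence looks decent, A -> B would be a misleading rule on this data: filtering out such rules is exactly the role the answer above describes for lift.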

What is supported in first-order logic that is not supported in description logic?

While studying description logics (DL), it is very common to read that they are a fragment of first-order logic (FOL), but it is hard to find anything explicit on what is excluded from DL that is part of FOL and makes DL (with all its dialects ALC, SHOIN, etc.) decidable.
Or, in other words, could you provide some examples of FOL formulas that are not expressible
in DL (and which are the reason why FOL is only semi-decidable)?
The following facts about description logics are closely related to decidability:
(a form of) tree-model property — this property is important for tableau methods;
embeddability into multimodal systems — which are known to be "robustly decidable";
embeddability into the so-called guarded fragments of FOL — see below;
embeddability into two-variables FOL fragments — which are decidable;
locality — see below.
Some of these facts are syntactical, while some are semantical. Below are two interesting, decidability-related, and more or less syntactical characteristics of description logics:
Locality (from The Description Logic Handbook, 2nd edition, section 3.6):
One of the main reasons why satisfiability and subsumption in many Description Logics are decidable – although highly complex – is that
most of the concept constructors can express only local properties
about an element 〈...〉 Intuitively, this implies that
a constraint regarding x will not “talk about” elements which are
arbitrarily far (w.r.t. role links) from x. This also means that in
ALC, and in many Description Logics, an assertion on an individual
cannot state properties about a whole structure satisfying it.
However, not every Description Logic satisfies locality.
Guarded fragment (from The Description Logic Handbook, 2nd edition, section 4.2.3)
Guarded fragments are obtained from first-order logic by allowing the
use of quantified variables only if these variables are guarded by
appropriate atoms before they are used in the body of a formula.
More precisely, quantifiers are restricted to appear only in the form
     ∃y(P(x,y) ∧ Φ(y))   or   ∀y(P(x,y) ⊃ Φ(y))         (First Guarded Fragment)
     ∃y(P(x,y) ∧ Φ(x,y))   or   ∀y(P(x,y) ⊃ Φ(x,y))     (Guarded Fragment)
for atoms P, vectors of variables x and y and (first) guarded fragment
formulae Φ with free variables in y and x (resp. in y).
From these points of view, analyze the examples from @JoshuaTaylor's comments:
∀x.(C(x) ↔ ∃y.(likes(x,y) ∧ ∃z.(likes(y,z) ∧ likes(z,x))))
∀x.(C(x) ↔ ∃z.(favoriteTeacher(x,z) ∧ firstGradeTeacherOf(x,z)))
The reasons why DL is preferred to FOL for knowledge representation are not only decidability or computational complexity related. Look at the slide called "FOL as Semantic Web Language?" in this lecture.
As shown by Turing and Church, FOL is undecidable, because there is no algorithm for deciding if a FOL formula is valid. Many description logics are decidable fragments of first-order logic, however, some description logics have more features than FOL, and many spatial, temporal, and fuzzy description logics are undecidable as well.

Logic programming with integer or even floating point domains

I am reading a lot about logic programming - ASP (Answer Set Programming) is one example of this. They (logic programs) are usually of the form:
[Program 1]
Rule1: a <- a1, a2, ..., not a_m, ..., not a_n;
Rule2: ...
This set of rules is called the logic program, and the so-called model is the result of the computation - some kind of assignment of True/False values to each of a1, a2, ...
There is a lot of research going on - e.g. how such programs (rules) can be integrated with (semantic web) ontologies to build knowledge bases that contain both rules and ontologies (some kind of constraints/behaviour and data); there is a lot of research about ASP itself - like parallel extensions, extensions for probabilistic logic, for temporal logic, and so on.
My question is: is there some research, and maybe some proof-of-concept projects, where this analysis is extended from Boolean variables to variables with integer or maybe even float domains? So far I have not found any research that addresses programs like the following:
[Program 2]
Rule1 a1:=5 <- a2=5, a3=7, a4<8, ...
Rule2 ...
...
[the final assignment of values to a1, a2, etc., is the solution of this program]
Currently, as I understand it, if one would like to perform some kind of analysis on Program-2 (e.g. to find whether the program is correct in some sense - whether it satisfies some properties, whether it terminates, what domains are allowed without violating some properties, and so on), then he or she must restate Program-2 in terms of Program-1 and then proceed in a way which seems to be completely unexplored - to my knowledge (and I don't believe it is truly unexplored; I simply don't know the sources or trends). There is constraint logic programming, which allows statements with inequalities in Program-1, but it is too focused on Boolean variables as well. Program-2 is of a kind that can be fairly common in business rules systems, which was the cause of my interest in logic programming.
So, my question has some history. My practical experience has led me to appreciate business rules systems/engines, especially the JBoss project Drools, and it was my intention to do some research on the theory underlying so-called production rule systems (I was and am planning to do my thesis about them, if I can spot what can be done here). But I can say that there is little to go on: after going over the literature (e.g. http://www.computer.org/csdl/trans/tk/2010/11/index.html was an excellent IEEE TKDE special issue with some articles about them, one of which was written by the Drools lead), one can see that there are technical improvements to the decades-old Rete algorithm, but there is no theory of Drools or other production rule systems that could help with formal analysis of them. So the other question is: is there a theory of production rule systems (for rule engines like Drools, Jess, CLIPS, and so on), is there a practical need for such a theory, and what practical issues of using Drools and other systems could be addressed by such a theory?
P.S. I know all these are questions that should be directed to a thesis advisor, but my current position is that there is no person (to my knowledge) in the department where I am enrolled who could answer them, so I am reading journals and conference proceedings (there are nice conference series in Lecture Notes in Computer Science - RuleML and RR)...
Thanks for any hint in advance!
In a sense, the Boolean systems already do what you suggest.
To ensure A=5 is part of your solution, consider the following rules (I forget my ASP syntax, so bear with me):
integer(1..100).              % integers 1 to 100 exist
1 { a(X) : integer(X) } 1.    % exactly one a(X) is true, where X is an integer
a(5).                         % a(5) is true
and I think your clause would require:
integer(1..100).              % integers 1 to 100 exist
1 { a1(X) : integer(X) } 1.   % a1 takes exactly one value
1 { a2(X) : integer(X) } 1.   % a2 likewise
1 { a3(X) : integer(X) } 1.   % a3 likewise
1 { a4(X) : integer(X) } 1.   % a4 likewise
a1(5) :- a2(5), a3(7), a4(8). % a2=5, a3=7, a4=8 ==> a1=5
I hope I've understood the question correctly.
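As a sanity check on what such an integer-domain rule means, here is a brute-force Python sketch over a small domain. It reads the rule a1 := 5 <- a2 = 5, a3 = 7, a4 < 8 as a classical implication, which is a simplification (real stable-model semantics would need choice rules like those above).

```python
from itertools import product

# Small stand-in domain for "integers 1 to 100".
DOMAIN = range(1, 11)

def body_holds(a2, a3, a4):
    """Body of the rule: a2 = 5, a3 = 7, a4 < 8."""
    return a2 == 5 and a3 == 7 and a4 < 8

# A "model" here is any assignment that does not violate the rule:
# whenever the body holds, a1 must be 5.
models = [(a1, a2, a3, a4)
          for a1, a2, a3, a4 in product(DOMAIN, repeat=4)
          if not body_holds(a2, a3, a4) or a1 == 5]

# Every body-satisfying assignment indeed forces a1 = 5.
assert all(a1 == 5 for a1, a2, a3, a4 in models if body_holds(a2, a3, a4))
print(len(models))   # 9937 of the 10000 assignments satisfy the rule
```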
Recent versions of Clojure core.logic (since 0.8) include exactly this kind of support, based on cKanren.
See an example here: https://gist.github.com/4229449
