Suppose I have to parse a list made up of a single terminal token x.
I might be wrong, but I think I have two ways:
One is declaring a right-recursive nonterminal this way: A -> xA
The other is declaring a left-recursive nonterminal that way: A -> Ax
That should leave me with these two DFAs of item sets (if I'm not wrong once again):
That shows me that only the first way produces recursion.
At this point I'm sure that my method for constructing the DFA from the grammar is wrong.
Could you explain what I did wrong?
If your motivation is strong enough, could you give and explain a method for LALR table construction with both grammars (including how to find First() and Follow())?
I can read C code easily if you want to answer with some pseudocode.
I come from this site to try to understand LALR table construction.
Thanks in advance.
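For checking the two DFAs, here is a minimal sketch of the LR(0) item-set construction, which gives the state skeleton that an LALR table is built on. It assumes both grammars also have a base production A -> x and an augmented start production S' -> A, neither of which is stated above. The name buildLr0 and the "production:dot" item encoding are hypothetical choices for illustration only.

```javascript
// Hypothetical helper: builds the LR(0) automaton for a grammar given as
// an array of [lhs, rhsSymbols]. Production 0 must be the augmented start.
// An item is encoded as the string "productionIndex:dotPosition".
function buildLr0(grammar) {
  const nonterminals = new Set(grammar.map(([lhs]) => lhs));
  const parse = (item) => item.split(":").map(Number);
  const key = (items) => [...items].sort().join(" ");

  // Closure: whenever the dot stands before a nonterminal, add that
  // nonterminal's productions with the dot at position 0.
  const closure = (items) => {
    const result = new Set(items);
    const work = [...items];
    while (work.length > 0) {
      const [p, dot] = parse(work.pop());
      const next = grammar[p][1][dot];
      if (!nonterminals.has(next)) continue;
      grammar.forEach(([lhs], i) => {
        const item = `${i}:0`;
        if (lhs === next && !result.has(item)) {
          result.add(item);
          work.push(item);
        }
      });
    }
    return result;
  };

  // Goto: move the dot over `symbol` in every item that allows it.
  const goTo = (items, symbol) => {
    const moved = [];
    for (const item of items) {
      const [p, dot] = parse(item);
      if (grammar[p][1][dot] === symbol) moved.push(`${p}:${dot + 1}`);
    }
    return closure(moved);
  };

  const states = [closure(["0:0"])];
  const index = new Map([[key(states[0]), 0]]);
  const edges = []; // [fromState, symbol, toState]
  for (let s = 0; s < states.length; s++) {
    const symbols = new Set();
    for (const item of states[s]) {
      const [p, dot] = parse(item);
      if (dot < grammar[p][1].length) symbols.add(grammar[p][1][dot]);
    }
    for (const symbol of symbols) {
      const target = goTo(states[s], symbol);
      const k = key(target);
      if (!index.has(k)) {
        index.set(k, states.length);
        states.push(target);
      }
      edges.push([s, symbol, index.get(k)]);
    }
  }
  return { states, edges };
}

// Assumed grammars: the base production A -> x is not in the question.
const rightRecursive = [["S'", ["A"]], ["A", ["x", "A"]], ["A", ["x"]]];
const leftRecursive = [["S'", ["A"]], ["A", ["A", "x"]], ["A", ["x"]]];

console.log(buildLr0(rightRecursive).edges);
console.log(buildLr0(leftRecursive).edges);
```

Under these assumptions the right-recursive grammar yields a state with a self-loop on x, while the left-recursive one yields no cycle at all. With left recursion the repetition happens through reduce and goto on the parse stack rather than through a loop in the DFA, so a DFA without a cycle is not by itself a sign that the construction went wrong.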
I am conducting a matching project in Informatica 10.2.1 wherein I need to identify matching strings within product descriptions. Ratcliffe-Obershelp is the matching strategy I need to implement.
I've heard Ratcliffe-Obershelp yields better results than Jaro-Winkler, but I am not sure how to code this into a transformation in Informatica since it is not built in.
No code to show as I don't even know where to start.
I'd expect this to be a transformation/group of transformations that would reproduce the matching score that Ratcliffe-Obershelp creates on a per-line basis.
If I understand correctly, the matching logic performs operations in a loop iterating over the input strings. It is not possible to implement such a "loop over string" in an Expression Transformation using built-in functions. I see two options:
create a DECODE function with multiple conditions for each possible length. This will be ugly, and it is only possible if we assume that we start at the beginning of each string. Implementing full substring comparison will be... so ugly I can't imagine :)
use a Java Transformation. As much as I hate putting Java into mappings, there are some cases where it's justified. This looks like one of the few. Here's some JS reference
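As a starting point, here is a minimal JavaScript sketch of the Ratcliffe-Obershelp score, which would still need porting to Java for a Java Transformation. It uses the usual definition: find the longest common substring, recurse on the pieces to its left and to its right, and return 2 * (matching characters) / (total length of both strings). The function names are made up for illustration, and returning 1 for two empty strings is an assumed convention.

```javascript
// Longest common substring of a and b, via the classic dynamic-programming
// table of common-suffix lengths (only the previous row is kept).
function longestCommonSubstring(a, b) {
  let best = { startA: 0, startB: 0, length: 0 };
  let prev = new Array(b.length + 1).fill(0);
  for (let i = 1; i <= a.length; i++) {
    const curr = new Array(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      if (a[i - 1] === b[j - 1]) {
        curr[j] = prev[j - 1] + 1;
        if (curr[j] > best.length) {
          best = { startA: i - curr[j], startB: j - curr[j], length: curr[j] };
        }
      }
    }
    prev = curr;
  }
  return best;
}

// Number of matching characters: the longest common substring plus the
// matches found recursively to its left and to its right.
function matchingChars(a, b) {
  const { startA, startB, length } = longestCommonSubstring(a, b);
  if (length === 0) return 0;
  return (
    length +
    matchingChars(a.slice(0, startA), b.slice(0, startB)) +
    matchingChars(a.slice(startA + length), b.slice(startB + length))
  );
}

// Similarity in [0, 1]; two empty strings are treated as identical
// (an assumed convention).
function ratcliffObershelp(a, b) {
  if (a.length + b.length === 0) return 1;
  return (2 * matchingChars(a, b)) / (a.length + b.length);
}

// "WIKIM" and "IA" match, so the score is 2 * 7 / 18.
console.log(ratcliffObershelp("WIKIMEDIA", "WIKIMANIA"));
```

The comparison is case-sensitive and quadratic in the string lengths at each recursion level, which is probably acceptable for product descriptions but worth measuring on real data.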
Yes, this is homework. I am not asking for any easy answers, just help moving in the right direction. Here is the assignment: "Create a function that receives two numbers: a and b. The function calculates and returns the multiplication of all the numbers between a and b. Create three versions of this function."
I created the function using a for loop and a while loop, but I am at a loss as to how to use recursion, the final part of the assignment.
Kudos for admitting this is a homework question. As such, while I won't give you the answer, I will give you a few pointers towards it.
When writing a recursive function, there are two key things to consider:
What stops the recursion, and
What happens until the recursion stops
In your case, where you have to calculate the product of a list of numbers, this works out as:
What should the function do when there is only 1 item in the list? (ie: when a and b are the same)
How can I multiply one element by the product of the rest of the list?
For extra credit, look up tail recursion and understand why it can help keep your memory usage down.
Does that give you enough of a start?
It's a simple instance of breaking a problem down recursively (the idea that dynamic programming also builds on): you start with one problem and attempt to resolve it by breaking it into problems that are easier to solve and combining the results.
You can then usually attack these problems by working backwards: what's the most trivial case, that you could answer immediately? What would you do if the problem were a notch harder than that?
As you've explicitly been told to find a recursive solution, you can assume that you're looking for a method that can either directly return a result or else must call itself with modified parameters, and do something with that result to get its own.
Failing that, given that the question is slightly artificial, consider looking up how you could literally just implement a for loop using a recursive structure, then directly adapt your existing for loop. No great thought about the nature of breaking problems down, just looking at how to express your existing solution in a different way.
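function recursiveMultiplication(num1, num2) {
  // Guard: an empty range (num1 > num2) would otherwise recurse forever.
  // Returning 1, the empty product, is one possible choice here.
  if (num1 > num2) {
    return 1;
  }
  // Base case: only one number left in the range.
  if (num2 === num1) {
    return num2;
  }
  // Recursive case: num2 times the product of num1..num2-1.
  return num2 * recursiveMultiplication(num1, num2 - 1);
}
console.log(recursiveMultiplication(5, 8)); // 5 * 6 * 7 * 8 = 1680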
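The tail recursion mentioned earlier, and the idea of expressing the existing for loop recursively, could look like the sketch below. The name productTail and the acc parameter are made up for illustration. The accumulator plays the role of the running total in the loop version, so nothing is left to do after the recursive call returns.

```javascript
// Tail-recursive variant: the running product is carried in `acc`,
// so the recursive call is the last thing the function does.
function productTail(a, b, acc = 1) {
  // Stop when the range is exhausted; acc already holds the result.
  if (a > b) {
    return acc;
  }
  // One loop iteration: fold b into the accumulator, shrink the range.
  return productTail(a, b - 1, acc * b);
}

console.log(productTail(5, 8)); // 5 * 6 * 7 * 8 = 1680
```

Note that most JavaScript engines do not perform tail-call optimization, so in practice this form still uses one stack frame per step. The memory benefit only applies in languages or engines that do optimize tail calls.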
Given that I have an input string, for example: aab
And I am given a target string, for example: bababa
And then I am given a set of transformation rules. For example:
ab -> bba
b -> ba
How could I write, in C, an algorithm that finds the minimum number of transformations that need to be applied to the input string to get the target string?
In this example, for example, the number would be 3. Because we would do:
1 - Apply rule 1 (abba)
2 - Apply rule 1 again (bbaba)
3 - Apply rule 2 (bababa)
It could happen that given an input and a target, there is no solution and that should be noticed too.
I am pretty much lost on strategies for doing this. Creating an automaton comes to mind, but I am not sure how I would apply it in this situation. I think it is an interesting problem and I have been researching online, but all I can find is how to apply transformations given rules, not how to ensure the number of steps is minimal.
EDIT: As one of the answers suggested, we could do a graph starting from the initial string and create nodes that are the result of applying transformations to the previous node. However, this brings some problems, from my point of view:
Imagine that I have a transformation that looks like this a --> ab. And my initial string is 'a'. And my output string is 'c'. So, I keep doing transformations (growing the graph) like this:
a -> ab
ab -> abb
abb -> abbb
...
How would I know when I need to stop building the graph?
Say I have the following string aaaa, and I have a transformation rule like aa->b. How would I create the new nodes? I mean, how would I find the substrings in that input string and remember them?
I don't think there is an efficient solution for this. I think you have to do a breadth-first search. By doing that, you will know that the first solution you find is a shortest solution.
EDIT:
Image: modify string breadth first
Every layer is made from the previous one by applying all possible rules to all possible substrings. For example, the b->ba rule can be applied to abba once for each b. It is important to apply only a single rule at a time and then remember the resulting string (e.g. ababa and abbaa) in a list. You have to have each layer completely in a list in your program before you start the next layer (= breadth first).
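The layering described above can be sketched as follows. It is written in JavaScript to match the other code on this page, although the question asks for C, and the structure carries over directly. The names successors and minTransformations, the [from, to] rule format and the maxDepth cutoff are all assumptions made for illustration. The cutoff exists only because, for arbitrary rules, the search may never end.

```javascript
// All strings reachable from s by applying exactly one rule at exactly
// one position. Rules are [from, to] pairs; empty `from` is skipped.
function successors(s, rules) {
  const result = new Set();
  for (const [from, to] of rules) {
    if (from.length === 0) continue;
    let pos = s.indexOf(from);
    while (pos !== -1) {
      result.add(s.slice(0, pos) + to + s.slice(pos + from.length));
      pos = s.indexOf(from, pos + 1); // also finds overlapping matches
    }
  }
  return result;
}

// Breadth-first search, one complete layer at a time. Returns the minimum
// number of rule applications, or -1 if the target was not found within
// maxDepth layers (which does not prove that it is unreachable).
function minTransformations(input, target, rules, maxDepth = 20) {
  if (input === target) return 0;
  const seen = new Set([input]); // strings already placed in some layer
  let layer = [input];
  for (let depth = 1; depth <= maxDepth && layer.length > 0; depth++) {
    const nextLayer = [];
    for (const s of layer) {
      for (const t of successors(s, rules)) {
        if (t === target) return depth;
        if (!seen.has(t)) {
          seen.add(t);
          nextLayer.push(t);
        }
      }
    }
    layer = nextLayer;
  }
  return -1;
}

const rules = [["ab", "bba"], ["b", "ba"]];
console.log(minTransformations("aab", "bababa", rules)); // 3
```

For the aaaa with aa->b case from the question, successors("aaaa", [["aa", "b"]]) yields baa, aba and aab, one new node per position where the left-hand side matches.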
EDIT 2:
You write that you now have a target c. For this you obviously need a rule of the form XX->c. So say you have the rule aaa->c. In layer 2 you will have a string aaa, which came from applying the a->aa rule. You will then apply a->aa again and get aaaa. That is fine: since you go breadth first, you will THEN apply the aaa->c rule to aaa, so layer 3 consists of aaaa, c and others. You do not continue modifying aaaa because that would go to layer 4; you already found the target c in layer 3, so you can stop.
EDIT 3:
You now ask how you can decide, for an unspecified set of rules, when to stop layering. In general this is impossible; it is the Halting problem: https://en.wikipedia.org/wiki/Halting_problem
BUT for specific rules you can tell whether you can ever reach the output.
Example 1: the target contains an atom that no rule can produce (your 'c' example), so it can never be reached.
Example 2: all your rules either increase the string's length or keep it as it is (no rule decreases the length), so any string longer than the target can be discarded and the search is finite.
Example 3: you can drop certain rules if you found by algorithm that they are cyclic
Other examples exist
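Examples 1 and 2 can be checked mechanically before or during the search. The sketch below assumes rules are given as [from, to] string pairs; the function names are hypothetical.

```javascript
// Example 1: true if the target contains a character that occurs neither
// in the input nor on the right-hand side of any rule. In that case the
// target can never be produced.
function hasUnreachableSymbol(input, target, rules) {
  const producible = new Set(input); // characters of the input string
  for (const [, to] of rules) {
    for (const ch of to) producible.add(ch);
  }
  return [...target].some((ch) => !producible.has(ch));
}

// Example 2: true if no rule makes the string shorter. Then every string
// longer than the target can be dropped from the search, and because only
// finitely many strings up to that length exist, the search must end.
function neverShrinks(rules) {
  return rules.every(([from, to]) => to.length >= from.length);
}

console.log(hasUnreachableSymbol("a", "c", [["a", "ab"]])); // true
console.log(neverShrinks([["ab", "bba"], ["b", "ba"]])); // true
console.log(neverShrinks([["aa", "b"]])); // false
```

Both checks are sufficient conditions only: when neither applies, the search may still terminate, but these two tests cannot tell you so.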
I'm wondering whether there is an algorithm or a library that helps me identify the parts of an English sentence that have no meaning, e.g. very serious grammar errors. If so, could you explain how it works? I would really like to implement it or use it in my own projects.
Here's a random example:
In the sentence: "I closed so etc page hello the door."
As a human, we can quickly identify that [so etc page hello] does not make any sense. Is it possible for a machine to point out that the string does not make any sense and also contains grammar errors?
If there's such a solution, how precise can that be? Is it possible, for example, given a clip of an English sentence, the algorithm returns a measure, indicating how meaningful, or correct that clip is? Thank you very much!
PS: I've looked at CMU's link grammar as well as the NLTK library. But I'm still not sure how to use, for example, the link grammar parser to do what I want: if the parser doesn't accept the sentence, I don't know how to tweak it to tell me which part is not right, and I'm not sure whether NLTK supports that.
Another thought I had towards solving the problem is to look at the frequencies of word combinations, since I'm currently interested in correcting very serious errors only. I would define a "serious error" as a case where the words in a clip of a sentence are rarely used together, i.e., the frequency of the combo is much lower than those of the other combos in the sentence.
For instance, in the above example, the four words [so etc page hello] really seldom occur together. One intuition behind my idea is that when I type such a combo into Google, no related results come up. So is there any library that provides such frequency information like Google does? Such frequencies may give a good hint about the correctness of the word combo.
I think that what you are looking for is a language model. A language model assigns a probability to each sentence of k words appearing in your language. The simplest kind of language model is the n-gram model: given the first i words of your sentence, the probability of observing the (i+1)th word depends only on the n-1 previous words.
For example, for a bigram model (n=2), the probability of the sentence w1 w2 ... wk is equal to
P(w1 ... wk) = P(w1) P(w2 | w1) ... P(wk | w(k-1)).
To compute the probabilities P(wi | w(i-1)), you just have to count the number of occurrences of the bigram w(i-1) wi and of the word w(i-1) in a large corpus.
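The counting described above can be sketched as follows. This is only an illustration under several assumptions: the tiny corpus is made up, tokenization is a plain whitespace split, and the markers <s> and </s> stand in for sentence start and end, so that P(w1) becomes P(w1 | <s>). All function names are hypothetical.

```javascript
// Lowercase and split on whitespace (punctuation is not handled).
function tokenize(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Count every bigram and how often each word starts a bigram.
function trainBigramModel(sentences) {
  const wordCounts = new Map();
  const bigramCounts = new Map();
  const bump = (map, k) => map.set(k, (map.get(k) || 0) + 1);
  for (const sentence of sentences) {
    const words = ["<s>", ...tokenize(sentence), "</s>"];
    for (let i = 0; i + 1 < words.length; i++) {
      bump(wordCounts, words[i]);
      bump(bigramCounts, words[i] + " " + words[i + 1]);
    }
  }
  return { wordCounts, bigramCounts };
}

// Maximum-likelihood estimate: count(prev word) / count(prev).
// Unseen words or bigrams get probability 0 (no smoothing).
function bigramProbability(model, prev, word) {
  const prevCount = model.wordCounts.get(prev) || 0;
  if (prevCount === 0) return 0;
  return (model.bigramCounts.get(prev + " " + word) || 0) / prevCount;
}

// Product of the bigram probabilities along the sentence.
function sentenceProbability(model, sentence) {
  const words = ["<s>", ...tokenize(sentence), "</s>"];
  let p = 1;
  for (let i = 0; i + 1 < words.length; i++) {
    p *= bigramProbability(model, words[i], words[i + 1]);
  }
  return p;
}

// Made-up toy corpus, for illustration only.
const model = trainBigramModel([
  "I closed the door",
  "I opened the door",
  "I closed the window",
]);
console.log(sentenceProbability(model, "I closed the door"));
console.log(sentenceProbability(model, "I closed so etc page hello the door"));
```

On this toy corpus the garbled sentence scores 0 because "closed so" was never seen. With a realistic corpus many valid bigrams are unseen as well, which is why practical models smooth the counts instead of using raw ratios. Looking at which individual bigrams score lowest would also give a rough hint about where in the sentence the problem lies.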
Here is a good tutorial paper on the subject: A Bit of Progress in Language Modeling, by Joshua Goodman.
Yes, such things exist.
You can read about it on Wikipedia.
You can also read about some of the precision issues here.
As far as determining which part is not right after determining the sentence has a grammar issue, that is largely impossible without knowing the author's intended meaning. Take, for example, "Over their, dead bodies" and "Over there dead bodies". Both are incorrect, and could be fixed either by adding/removing the comma or swapping their/there. However, these result in very different meanings (yes, the second one would not be a complete sentence, but it would be acceptable/understandable in context).
Spell checking works because there are a limited number of words against which you can check a word to determine if it is valid (spelled correctly). However, there are infinite sentences that can be constructed, with infinite meanings, so there is no way to correct a poorly written sentence without knowing what the meaning behind it is.
I think what you are looking for is a well-established library that can process natural language and extract the meanings.
Unfortunately, there's no such library. Natural language processing, as you probably can imagine, is not an easy task. It is still a very active research field. There are many algorithms and methods in understanding natural language, but to my knowledge, most of them only work well for specific applications or words of specific types.
And those libraries, such as the CMU one, still seem to be quite rudimentary. They can't do what you want to do (like identifying errors in an English sentence). You have to develop an algorithm to do that using the tools that they provide (such as a sentence parser).
If you want to learn about it, check out ai-class.com. They have some sections that talk about processing language and words.
I've got the following diagram given:
Diagram here
The first gateway/connector is an OR-gateway/connector (it has a circle in it). The gateway/connector with an 'x' in it is an XOR-gateway/connector.
An OR-gateway specifies that one or more of the available paths will be taken.
An XOR-gateway represents a decision to take exactly one path in the flow.
I need to transform this diagram into Prolog in order to get all possible paths from node 1 to node 8, but I have problems coding the OR-gateway and finding all possible paths.
How can I easily transform this diagram into Prolog, and how can I find all possible paths between two nodes while respecting the gateways?
Thank you for answers in advance.
As you should know, a Prolog program is basically a set of rules. In your graph, each directed edge gives an explicit rule for the node it starts from. Once the graph is encoded as a set of rules, a query for what satisfies, say, (1, X, 8) would give you every possible path, even infinitely many if the graph contains a cycle.
Encoding the rules should be easy (basic Prolog). Maybe I'm not understanding the special functions behind the OR and XOR. Please explain more if this isn't as trivial as it seems.
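To make the edge-per-rule idea concrete without the diagram, here is a small sketch in JavaScript rather than Prolog, to match the other code on this page. The edge list is entirely hypothetical because the original diagram is not available, and the OR/XOR gateway semantics are not modeled: every edge is treated as a plain directed edge. Skipping nodes already on the current path is what keeps the enumeration finite when the graph has cycles.

```javascript
// Enumerate all simple paths from start to goal in a directed graph
// given as [from, to] pairs (the analogue of one Prolog fact per edge).
function allPaths(edges, start, goal) {
  const adjacency = new Map();
  for (const [from, to] of edges) {
    if (!adjacency.has(from)) adjacency.set(from, []);
    adjacency.get(from).push(to);
  }
  const paths = [];
  const walk = (node, path) => {
    if (node === goal) {
      paths.push([...path]); // copy, since `path` is mutated below
      return;
    }
    for (const next of adjacency.get(node) || []) {
      if (path.includes(next)) continue; // avoid revisiting: no cycles
      path.push(next);
      walk(next, path);
      path.pop();
    }
  };
  walk(start, [start]);
  return paths;
}

// Hypothetical edges, NOT taken from the diagram in the question.
const edges = [[1, 2], [2, 3], [2, 4], [3, 5], [4, 5], [5, 8]];
console.log(allPaths(edges, 1, 8));
```

An XOR-gateway fits this model directly, since each path takes exactly one outgoing edge. An OR-gateway, where several branches may be active at once, would need the result to be sets of branches rather than single paths, which this sketch does not attempt.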