Confusing Language for a Deterministic Finite State Machine - theory

I am supposed to show a DFSM to accept the following language for my theory of computation class, but that isn't what I am having trouble with. I'm not even sure what the language means. Can someone explain what this means in English? If I understand what it means, I'm sure I can create the DFSM. Thanks for any help. Here is the language:
{w ∈ {0, 1}* : w corresponds to the binary encoding, without leading 0's, of natural
numbers that are evenly divisible by 4}.

In base 10: 4, 8, 12, 16,...
In the requested encoding: 100, 1000, 1100, 10000, ...
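For what it's worth, once the language is read this way, the machine can track the value read so far modulo 4. Here is a rough Python sketch of such a DFSM (my own illustration, assuming the bare string "0" is excluded since it would count as having a leading 0):
def accepts(w):
    # States: "start", "dead", or the value read so far mod 4 (0..3).
    state = "start"
    for c in w:
        if c not in "01" or state == "dead":
            return False            # bad symbol, or a leading 0 that can never be repaired
        if state == "start":
            state = "dead" if c == "0" else 1
        else:
            state = (2 * state + int(c)) % 4   # shift in the next bit
    return state == 0               # ends on remainder 0, so divisible by 4

for w in ["100", "1000", "1100", "10000", "10", "0", "0100"]:
    print(w, accepts(w))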

What is the output of this array pseudocode and why?

I am a new student in a programming fundamentals course. I have absolutely no previous experience with programming. I am trying to understand a question that was posed on a recent quiz about arrays. This is it: What will be the output of the pseudocode below?
nums = [30, 10, 20, 50, 40]
val = 0
For i = 0 to 5
    val = nums[i] + val
Display val
Thanks for any thoughts on this. I don't understand it at all. I originally thought 40 was the answer but sadly, that was wrong lol. Can someone tell me what the answer is and explain why that's so? Our textbook doesn't have any examples like this.
Well, as Wayne already said, the for loop runs one time too many, which would result in an exception. Assuming that is a mistake and the loop only runs 5 times, your result would be 30 + 10 + 20 + 50 + 40, as it just adds up all the values of the array and saves the sum in val. So it would probably display 150, unless I am completely wrong and I misunderstood it.
As posed, the correct answer is "This code generates an error/exception." The reason is that the loop iterates over six array elements (i = 0 through 5), but the array contains only five. Here are just some of the things that various real languages might do in this case:
Generate an incorrect result.
Abort the program with a memory error (accessing memory out of bounds).
Generate an exception due to an invalid array index.
Generate an exception due to adding nil/null to an integer.
But the one thing you can be sure of is that the program is incorrect.
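Translated roughly into Python (my own sketch, not part of the quiz), the fixed loop and its result look like this:
nums = [30, 10, 20, 50, 40]
val = 0
for i in range(5):    # range(6) would mirror "For i = 0 to 5" and raise IndexError
    val = nums[i] + val
print(val)            # 150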

Reducing complex filter rules to a comparable form

I want to determine whether or not an input array of integers "matches" a set of rules.
The Matcher
The Matcher is built from a set of helper methods to describe rules for input data. These rules are essentially logic gates on arrays of integers:
AND(1, 2) // Requires both 1 AND 2 be present in the input array.
OR(3, 4, 5) // Requires that 3 OR 4 OR 5 be present in the input array.
NOR(6, 7) // Requires that neither 6 NOR 7 be present in the input array.
XOR(8, 9) // Requires that either 8 (X)OR 9 be present in the input array, but not both.
Thus, I could say that, given the input array:
[0, 1, 2, 3]
I could build a Matcher like:
AND(OR(0, 1), AND(1, 2), NOR(4))
Which would match the input, because the input satisfies:
OR(0, 1) // 0 or 1 is present
AND(1, 2) // Both 1 and 2 are present
NOR(4) // 4 is not present
And each of those cumulatively satisfies the overarching AND rule.
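For concreteness, here is one way the semantics described above could be modelled in Python (my own sketch; the AND/OR/NOR/XOR names mirror the helpers in the question, and XOR is generalized to "exactly one"):
def _holds(rule, s):
    # A rule is either a bare integer (membership test) or a nested matcher.
    return rule(s) if callable(rule) else rule in s

def AND(*rules): return lambda s: all(_holds(r, s) for r in rules)
def OR(*rules):  return lambda s: any(_holds(r, s) for r in rules)
def NOR(*rules): return lambda s: not any(_holds(r, s) for r in rules)
def XOR(*rules): return lambda s: sum(_holds(r, s) for r in rules) == 1

matcher = AND(OR(0, 1), AND(1, 2), NOR(4))
print(matcher({0, 1, 2, 3}))   # True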
The Problem
I need to reduce matchers to the simplest and most basic form that still describes the rules. For example, given the above matcher, a sample reduction could be:
rules = {
and: [1, 2],
xor: [], // No XORs
nor: [4]
}
Each rule has three arrays of sub-rules, comprised of integers or rules.
Notice that the ORs are empty: since 1 is required anyway, OR(0, 1) must be satisfied and is therefore redundant.
Since Matchers need to be comparable (I need to be able to determine an equivalence between the underlying rules), it becomes a bit more complicated when I get to:
input = [1, 2, 5, 9, 11, 12, 13, 14, 17]
XOR(OR(AND(1, 2), NOR(3, 4), XOR(3, 11), AND(11, 14)), AND(1, 5, 17))
Now, a large amount of that is redundant and/or contradictory, so what I was thinking I could do was first place it into a tree-like structure, and then walk it recursively and prune the unnecessary entries. Any ideas for a better way to do this?
I'm specifically looking for something deterministic (any set of input rules that mean the same thing yield the same final reduced form). If there is a better way to express this problem, I'm interested, and if rules are contradictory it's fine for the reducer to break and throw an exception. This is intended for occasional use in the program, so performance is not much of an issue.
What you are actually dealing with here is propositional logic. Consider each integer a proposition that is true or false depending on whether it is present in the input array.
Your constraints (XOR, AND, etc.) then form a logical formula that is either satisfiable or not. It is in fact hard (NP-complete) to figure out for any given formula whether it is satisfiable. At first sight this shouldn't concern you, though, because you only have to check whether a given input satisfies the formula.
Unfortunately, what you are actually asking is how to determine whether two propositional formulas are equivalent. It turns out this is equally hard: https://math.stackexchange.com/questions/1050127/how-to-efficiently-determine-if-any-two-propositional-formulas-are-equivalent
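That said, since any pair of matchers mentions only finitely many integers, equivalence is still decidable by brute force over all subsets of the mentioned atoms; exponential, but deterministic, and the question says performance is not an issue. A sketch (my own, with matchers modelled as predicates over sets of integers):
from itertools import combinations

def equivalent(m1, m2, atoms):
    # Two matchers agree iff they agree on every subset of the atoms
    # they mention; integers outside `atoms` cannot affect either one.
    for r in range(len(atoms) + 1):
        for chosen in combinations(atoms, r):
            if m1(set(chosen)) != m2(set(chosen)):
                return False
    return True

print(equivalent(lambda s: 1 in s and 2 in s,
                 lambda s: 2 in s and 1 in s, [1, 2]))   # True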

Effects of randomizing the order of inputs to a neural network

For my Advanced Algorithms and Data Structures class, my professor asked us to pick any topic that interested us. He also told us to research it and to try and implement a solution in it. I chose Neural Networks because it's something that I've wanted to learn for a long time.
I've been able to implement an AND, OR, and XOR using a neural network whose neurons use a step function for the activator. After that I tried to implement a back-propagating neural network that learns to recognize the XOR operator (using a sigmoid function as the activator). I was able to get this to work 90% of the time by using a 3-3-1 network (1 bias at the input and hidden layer, with weights initialized randomly). At other times it seems to get stuck in what I think is a local minimum, but I am not sure (I've asked questions about this before and people have told me that there shouldn't be any local minima).
The 90% of the time it was working, I was consistently presenting my inputs in this order: [0, 0], [0, 1], [1, 0], [1, 1] with the expected outputs set to [0, 1, 1, 0]. When I present the values in the same order consistently, the network eventually learns the pattern. It actually doesn't matter which order I send them in, as long as it is the exact same order for each epoch.
I then implemented randomization of the training set, so that the order of inputs is now sufficiently randomized. I've noticed that my neural network now gets stuck: the error is still decreasing, but at a very small rate (which gets smaller with each epoch). After a while, the error starts oscillating around a value (so it stops decreasing).
I'm a novice at this topic and everything I know so far is self-taught (reading tutorials, papers, etc.). Why does the order of presentation of inputs change the behavior of my network? Is it because the change in error is consistent from one input to the next (because the ordering is consistent), which makes it easy for the network to learn?
What can I do to fix this? I'm going over my backpropagation algorithm to make sure I've implemented it right; currently it is implemented with a learning rate and a momentum. I'm considering looking at other enhancements like an adaptive learning-rate. However, the XOR network is often portrayed as a very simple network and so I'm thinking that I shouldn't need to use a sophisticated backpropagation algorithm.
The order in which you present the observations (input vectors) comprising your training set to the network matters in only one respect: a randomized arrangement of the observations with respect to the response variable is strongly preferred over an ordered arrangement.
For instance, suppose you have 150 observations comprising your training set, and for each the response variable is one of three class labels (class I, II, or III), such that observations 1-50 are in class I, 51-100 in class II, and 101-150 in class III. What you do not want to do is present them to the network in that order. In other words, you do not want the network to see all 50 observations in class I, then all 50 in class II, then all 50 in class III.
What happened during training of your classifier? Well, initially you were presenting the four observations to your network in a fixed but unsorted order, with responses [0, 1, 1, 0].
I wonder what the ordering of the input vectors was in those instances in which your network failed to converge. If it was [0, 0, 1, 1] or [1, 1, 0, 0], this is consistent with the well-documented empirical rule I mentioned above.
On the other hand, I have to wonder whether this rule even applies in your case. The reason is that you have so few training instances that even if the order is [1, 1, 0, 0], training over multiple epochs (which I am sure you are doing) means that this ordering looks more "randomized" than the exemplar I mentioned above (i.e., [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0] is how the network would see the training data over three epochs).
Some suggestions to diagnose the problem:
As I mentioned above, look at the ordering of your input vectors in the non-convergence cases: are they sorted by response variable?
In the non-convergence cases, look at your weight matrices (I assume you have two of them). Look for any values that are very large (e.g., 100x the others, or 100x the value they were initialized with). Large weights can cause overflow.
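As a concrete illustration of the randomized presentation discussed above, a per-epoch shuffle could look like this (a generic Python sketch, not the poster's actual code):
import random

def training_stream(inputs, targets, n_epochs):
    # Re-shuffle the (input, target) pairs before every epoch so the
    # network never sees the observations grouped by response value.
    pairs = list(zip(inputs, targets))
    for _ in range(n_epochs):
        random.shuffle(pairs)
        for x, y in pairs:
            yield x, y   # one observation per backprop step

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [0, 1, 1, 0]
for x, y in training_stream(X, Y, 2):
    print(x, y)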

Artificial neural networks

I want to know whether artificial neural networks can be applied to discrete-valued inputs. I know they can be applied to continuous-valued inputs, but can they be applied to discrete-valued ones? Also, will they perform well on discrete-valued inputs?
Yes, artificial neural networks may be applied to data featuring discrete-valued input variables. In the most commonly used neural network architectures (which are numeric), discrete inputs are typically represented by a series of dummy variables, just as in statistical regression. Also, as with regression, one fewer dummy variable than the number of distinct values is needed. There are other methods, but this is the most straightforward.
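For example, a discrete input with k distinct values could be encoded like this (a minimal sketch; the function name is my own):
def dummy_encode(value, categories):
    # k distinct values -> k - 1 dummy variables; the first category is
    # the reference level and encodes as all zeros.
    return [1 if value == c else 0 for c in categories[1:]]

print(dummy_encode(0, [0, 1, 2]))   # [0, 0]
print(dummy_encode(1, [0, 1, 2]))   # [1, 0]
print(dummy_encode(2, [0, 1, 2]))   # [0, 1]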
Well, good question, let me say!
First of all, let me answer your question directly: yes!
The answer requires considering a few aspects of how the network is used and implemented.
Let me explain why:
The easiest way is to normalize the input as usual (this is the first rule of thumb with NNs), let the neural network compute the task, and once you have your output, invert the normalization to get the output back in the original range, still continuous. To get discrete values back, just take the integer part of your output. It is easy, it works, and it is fine. DONE! A good result then just depends on the topology you design for your network.
As a plus, you could consider using a "step" transfer function instead of "tan-sigmoid" between layers, just to mimic and strengthen a sort of digitization by forcing the output to be exactly 0 or 1. But then you should reconsider the initial normalization as well as use well-tuned thresholds.
NB: this latter trick is not really necessary, but it could give some secondary benefits; maybe test it in a second stage of your development and look at the differences.
PS: Let me also suggest something that may apply to your issue: consider using some fuzzy logic in your learning algorithm ;-)
Cheers!
I'm late on this question, but this may help someone.
Say you have a categorical output variable, for example 3 different categories (0, 1 and 2),
outputs
0
2
1
2
1
0
then becomes
1, 0, 0
0, 0, 1
0, 1, 0
0, 0, 1
0, 1, 0
1, 0, 0
A possible NN output result is
0.2, 0.3, 0.5 (winner is categ 2)
0.05, 0.9, 0.05 (winner is categ 1)
...
Your NN will then have 3 output nodes in this case, so take the max value.
To improve this, use cross-entropy as the error measure and a softmax activation on the output layer, so that the outputs sum to 1.
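A minimal sketch of that output layer (my own illustration of softmax plus taking the argmax, not tied to any particular NN library):
import math

def softmax(zs):
    # Exponentiate and normalize so the outputs are positive and sum to 1.
    m = max(zs)                        # shift for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([0.1, 2.0, 0.3])
print(probs)                           # three values summing to 1
print(probs.index(max(probs)))         # 1: the winning category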
The purpose of a neural network is to approximate complicated functions by interpolating samples. As such, they tend to be a poor fit for discrete data, unless that data can be expressed by thresholding a continuous function. Depending on your problem, there are likely to be much more effective learning methods.

SUBSET-SUM, upper bound on number of solutions

As you probably know, the SUBSET-SUM problem is defined as determining whether a subset of a set of whole numbers sums to a specified whole number. (There is another definition of subset-sum, where a group of integers sums to zero, but let's use this definition for now.)
For example, ((1,2,4,5), 6) is true because (2,4) sums to 6. We say that (2,4) is a "solution".
Furthermore, ((1,5,10), 7) is false because no subset of the arguments sums to 7.
My question is: given a set of argument numbers for SUBSET-SUM, is there a polynomial upper bound on the number of possible solutions? In the first example there were two, (2,4) and (1,5).
We know that since SUBSET-SUM is NP-complete, deciding it in polynomial time is probably impossible. However, my question is not about the decision time; I'm asking strictly about the size of the list of solutions.
Obviously the size of the power set of the argument numbers is an upper bound on the size of the solution list, but this grows exponentially. My intuition was that there should be a polynomial bound, but I cannot prove it.
NB: I know this sounds like a homework question, but please trust me, it isn't. I am trying to teach myself certain aspects of CS theory, and this is where my thoughts have taken me.
No; take the numbers
(1, 2, 1+2, 4, 8, 4+8, 16, 32, 16+32, ..., 2^(2n), 2^(2n+1), 2^(2n) + 2^(2n+1))
and ask about forming the sum 1 + 2 + 4 + ... + 2^(2n) + 2^(2n+1). (For example: with n = 3, take the set (1, 2, 3, 4, 8, 12, 16, 32, 48) and ask about the subsets summing to 63.)
You can form 1+2 either using 1 and 2 or using 1+2.
You can form 4+8 either using 4 and 8 or using 4+8.
...
You can form 2^(2n) + 2^(2n+1) either using 2^(2n) and 2^(2n+1) or using 2^(2n) + 2^(2n+1).
The choices are independent (one binary choice per triple), so there are at least 2^(m/3) solutions, where m is the size of your set. I bet this bound can be strengthened considerably, but it already shows there is no polynomial bound.
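A quick brute-force check of the n = 3 instance (my own sketch):
from itertools import combinations

nums = [1, 2, 3, 4, 8, 12, 16, 32, 48]
count = sum(1 for r in range(len(nums) + 1)
              for c in combinations(nums, r) if sum(c) == 63)
print(count)   # 8 = 2^3, matching the lower bound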
Sperner's Theorem provides a nice (albeit non-polynomial) upper bound, at least in the case when the numbers in the set are strictly greater than zero (as seems to be the case in your problem).
The family of all subsets with a given sum forms a Sperner family, which is a collection of subsets in which no subset is itself contained in any other subset of the family. This is where the assumption that the elements are strictly greater than zero is used. Sperner's theorem states that the maximum size of such a Sperner family is the binomial coefficient C(n, floor(n/2)).
If you drop the assumption that the n numbers are distinct, it is easy to see that this upper bound can't be improved (just take all numbers = 1 and let the target sum be floor(n/2)). I don't know if it can be improved under the assumption that the numbers are distinct.
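For the nine-element instance above, the bound evaluates as follows (a quick check, using math.comb from Python 3.8+):
from math import comb
print(comb(9, 9 // 2))   # 126, comfortably above the 8 solutions found earlier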
