Panalphabetic strings decidable by DFA?

Is the language of panalphabetic strings decidable by DFA? If so, how can I prove it?
A string {a,…,z}* is said to be panalphabetic if it contains at least one occurrence of each letter.

Yes, it is decidable by a DFA, albeit one that needs 2^26 (that is, 67108864) states.
The easiest way to prove this is probably by the Myhill-Nerode theorem. To use that, you'll need to divide all strings into equivalence classes based on what can be added to the string to make it part of the language. (see the wikipedia article)
To do that, define a function f over strings in {a,...,z}* that is the set of all letters in the string. Obviously, for any two strings x and y, f(xy) is f(x) ⋃ f(y). (That is, the union of f(x) and f(y))
The language of panalphabetic strings is then all strings s such that f(s) is the set of all letters. That is, whether a string is panalphabetic can be determined just by what the value of f applied to that string is.
Now consider two strings x and y such that f(x) = f(y). Then, for any third string z, f(xz) = f(x) ⋃ f(z) = f(y) ⋃ f(z) = f(yz). Therefore, xz is panalphabetic if and only if yz is panalphabetic. Therefore, x and y are equivalent.
Therefore, there can only be as many different equivalence classes as there are possible values of f. Since f(s) is a subset of {a, b, ..., z}, there are only 2^26 possible values of f(s). This is finite, so the language of panalphabetic strings is recognizable by a DFA.
(To show that 2^26 is the smallest number of DFA states, you also need to show that if f(x) != f(y), then x and y are not equivalent, which therefore means that there are exactly as many equivalence classes as there are possible values for f(s). That's fairly straightforward, but I'll let you complete that bit.)
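As a rough illustration of the argument above, the DFA can be simulated without ever building its transition table: the current state is just f(prefix), stored here as a 26-bit mask. This is only a sketch; the names `step`, `is_panalphabetic` and `FULL` are made up for the example, and the input is assumed to contain lowercase a-z only.

```python
import string

ALPHABET = string.ascii_lowercase
# Accepting state: the set of all 26 letters, i.e. all 26 bits set.
FULL = (1 << len(ALPHABET)) - 1

def step(state, ch):
    """Transition function: add the letter ch to the set encoded by state."""
    return state | (1 << (ord(ch) - ord("a")))

def is_panalphabetic(s):
    """Run the DFA on s (assumed to consist of the letters a-z only)."""
    state = 0  # start state: the empty set of letters
    for ch in s:
        state = step(state, ch)
    return state == FULL
```

Each of the 2^26 possible mask values corresponds to one state, and `step` mirrors the identity f(xz) = f(x) ∪ f(z) for a single-letter z.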

Related

How to interpret results for a moderating model?

I have a moderation model, but the results look a bit weird.
A is DV, which is a binary variable.
B is M, also a binary variable.
C is IV, which is a continuous variable.
If I run the regression of A on B only, the coefficient on B is positive and significant (e.g., 0.1418). When I add C and the interaction term of B and C to the model, the coefficient on B is still positive and significant, but smaller (0.1376). The interaction term between B and C is also significantly positive (0.0222).
Can I explain the results in this way?
The higher C is, the more likely A is to be present. However, this tendency is weaker when M is present.

Are there floating point infinitesimals?

Since there are finitely many floating point numbers and one can compare each possible pair of such numbers (I assume), there must always exist a number 'b' which is
smaller than some given number 'a' (not +/- infinity) and
there exists no number 'c' smaller than 'a' and greater than 'b';
i.e. the 'next' smaller floating-point-represented number. I wonder if:
there is a function smaller(float a) returning such number b (or greater(float a) for that matter) in the C programming language
if not, then if there is a way to obtain these 'next' numbers for certain types of numbers 'a', for example if 'a' is an integer/zero.
Trying
float smaller(float a) return a - 0.00...001f;
seems to me like a hack that probably doesn't work for all possible inputs, but I might be wrong, so that's why I'm turning to you guys. Any help is appreciated.
Indeed there is. You're after the "nextafter" family of functions.
These can be used to move from one floating point number to the next, much in the same way as you can use ++ and -- for integral types.
See https://en.cppreference.com/w/c/numeric/math/nextafter
(This is C documentation).
The C99/POSIX functions nextafter/nexttoward can do this. You provide a start value x and a destination value y, and they return the next value from the start in the direction of the destination.
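For a quick feel of how these functions behave, here is a small sketch using Python's `math.nextafter` (available since Python 3.9), which follows the same start/destination convention as the C function. The helper names `smaller` and `greater` are borrowed from the question and are not part of any library. Note that Python floats are C doubles, so the step sizes shown are those of `double`, not `float`.

```python
import math

def smaller(a):
    """Next representable float below a (returns -inf unchanged)."""
    return math.nextafter(a, -math.inf)

def greater(a):
    """Next representable float above a (returns +inf unchanged)."""
    return math.nextafter(a, math.inf)

if __name__ == "__main__":
    print(smaller(1.0), greater(1.0))
    print(greater(0.0))  # smallest positive subnormal double
```

In C the corresponding calls would be `nextafter(a, -INFINITY)` for `double` and `nextafterf` for `float`, both declared in `<math.h>`.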
Also, if your language does not have the nextafter family of functions, but does let you treat values stored in memory as integers (by pointer casting or other, dirtier tricks), then, for any floating-point type (double, float, half, ...) that conforms to IEEE 754, if you want to find the next larger number than value, you can do
FLOATING value = ...;
if (value >= 0) {
integer_increment(value);
} else {
integer_decrement(value);
}
and vice versa for the next smaller number, where integer_increment increments the value of value as if value was of an integral type.
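The integer trick above can be sketched in Python by round-tripping a double through `struct`. This is an illustration under assumptions, not a replacement for `nextafter`: the name `next_larger` is hypothetical, the code assumes IEEE 754 binary64, and it adds explicit handling for NaN, +infinity and negative zero, which the pseudocode above does not cover.

```python
import math
import struct

def next_larger(value):
    """Next representable double above value, via its integer bit pattern."""
    if math.isnan(value) or value == math.inf:
        return value  # nothing larger exists
    value += 0.0  # turns -0.0 into +0.0 so the sign test below is reliable
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    if value >= 0:
        bits += 1  # larger magnitude, positive sign
    else:
        bits -= 1  # smaller magnitude, negative sign
    (result,) = struct.unpack("<d", struct.pack("<Q", bits))
    return result
```

The trick works because, for a fixed sign, IEEE 754 bit patterns are ordered the same way as the magnitudes they represent.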

Finding first duplicated element in linear time [duplicate]

There is an array of size n and the elements contained in the array are between 1 and n-1 such that each element occurs once and just one element occurs more than once. We need to find this element.
Though this is a very frequently asked question, I still haven't found a proper answer. Most suggestions are that I should add up all the elements in the array and then subtract from it the sum of all the indices, but this won't work if the number of elements is very large. It will overflow. There have also been suggestions regarding the use of XOR, dup = dup ^ arr[i] ^ i, which are not clear to me.
I have come up with this algorithm which is an enhancement of the addition algorithm and will reduce the chances of overflow to a great extent!
sum = 0
for i = 0 to n-1
begin
    diff = A[i] - i;
    sum = sum + diff;
end
After the loop, sum contains the duplicate element, but using this method I am unable to find out the index of the duplicate element. For that I need to traverse the array once more, which is not desirable. Can anyone come up with a better solution that does not involve the addition method or the XOR method and works in O(n)?
There are many ways that you can think about this problem, depending on the constraints of your problem description.
If you know for a fact that exactly one element is duplicated, then there are many ways to solve this problem. One particularly clever solution is to use the bitwise XOR operator. XOR has the following interesting properties:
(1) XOR is associative: (x ^ y) ^ z = x ^ (y ^ z)
(2) XOR is commutative: x ^ y = y ^ x
(3) XOR is its own inverse: x ^ y = 0 iff x = y
(4) XOR has zero as an identity: x ^ 0 = x
Properties (1) and (2) here mean that when taking the XOR of a group of values, it doesn't matter what order you apply the XORs to the elements. You can reorder the elements or group them as you see fit. Property (3) means that if you XOR the same value with itself, you get back zero, and property (4) means that if you XOR anything with 0 you get back your original number. Taking all these properties together, you get an interesting result: if you take the XOR of a group of numbers, the result is the XOR of all numbers in the group that appear an odd number of times. The reason for this is that when you XOR together numbers that appear an even number of times, you can break the XOR of those numbers up into a set of pairs. Each pair XORs to 0 by (3), and the combined XOR of all these zeros gives back zero by (4). Consequently, all the numbers of even multiplicity cancel out.
To use this to solve the original problem, do the following. First, XOR together all the numbers in the list. This gives the XOR of all numbers that appear an odd number of times, which ends up being all the numbers from 1 to (n-1) except the duplicate. Now, XOR this value with the XOR of all the numbers from 1 to (n-1). This then makes all numbers in the range 1 to (n-1) that were not previously canceled out cancel out, leaving behind just the duplicated value. Moreover, this runs in O(n) time and only uses O(1) space, since the XOR of all the values fits into a single integer.
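The two XOR passes described above might look like this in Python. It is a minimal sketch that assumes the input really has length n, holds only values from 1 to n-1, and has exactly one value appearing exactly twice; the function name is made up for the example.

```python
def find_duplicate_xor(arr):
    """Return the duplicated value in arr (values 1..n-1, one repeated once)."""
    n = len(arr)
    acc = 0
    # XOR of all array elements: the duplicate cancels itself out here.
    for x in arr:
        acc ^= x
    # XOR with 1..n-1: every non-duplicate cancels, the duplicate remains.
    for k in range(1, n):
        acc ^= k
    return acc
```

For instance, `find_duplicate_xor([3, 1, 2, 3])` evaluates to 3. The accumulator never grows beyond the bit width of the largest element, so overflow is not a concern.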
In your original post you considered an alternative approach that works by using the fact that the sum of the integers from 1 to n-1 is n(n-1)/2. You were concerned, however, that this would lead to integer overflow and cause a problem. On most machines you are right that this would cause an overflow, but (on most machines) this is not a problem because arithmetic is done using fixed-precision integers, commonly 32-bit integers. When an integer overflow occurs, the resulting number is not meaningless. Rather, it's just the value that you would get if you computed the actual result, then dropped off everything but the lowest 32 bits. Mathematically speaking, this is known as modular arithmetic, and the operations in the computer are done modulo 2^32. More generally, though, let's say that integers are stored modulo k for some fixed k.
Fortunately, many of the arithmetical laws you know and love from normal arithmetic still hold in modular arithmetic. We just need to be more precise with our terminology. We say that x is congruent to y modulo k (denoted x ≡_k y) if x and y leave the same remainder when divided by k. This is important when working on a physical machine, because when an integer overflow occurs on most hardware, the resulting value is congruent to the true value modulo k, where k depends on the word size. In particular, the following laws hold true in modular arithmetic:
If x ≡_k y and w ≡_k z, then x + w ≡_k y + z.
If x ≡_k y and w ≡_k z, then xw ≡_k yz.
This means that if you want to compute the duplicate value by finding the total sum of the elements of the array and subtracting out the expected total, everything will work out fine even if there is an integer overflow because standard arithmetic will still produce the same values (modulo k) in the hardware. That said, you could also use the XOR-based approach, which doesn't need to consider overflow at all. :-)
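Python integers never overflow, so the sketch below imitates 32-bit wraparound by masking every intermediate result; the mask, the function name and the choice of k = 2^32 are assumptions made for illustration. It shows that the difference of the two wrapped sums still yields the duplicate, provided the duplicate itself is smaller than k.

```python
MASK = (1 << 32) - 1  # keep only the lowest 32 bits, i.e. work modulo 2^32

def find_duplicate_wrapping_sum(arr):
    """Duplicate in arr (values 1..n-1, one repeated once), using sums mod 2^32."""
    n = len(arr)
    total = 0
    for x in arr:
        total = (total + x) & MASK  # wraps like an unsigned 32-bit add
    expected = 0
    for k in range(1, n):
        expected = (expected + k) & MASK
    # Subtraction also wraps; the result is the duplicate modulo 2^32.
    return (total - expected) & MASK
```

With an array of 100000 elements the true sums exceed 2^32, yet the wrapped difference still comes out as the duplicated value.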
If you are not guaranteed that exactly one element is duplicated, but you can modify the array of elements, then there is a beautiful algorithm for finding the duplicated value. This earlier SO question describes how to accomplish this. Intuitively, the idea is that you can try to sort the sequence using a bucket sort, where the array of elements itself is recycled to hold the space for the buckets as well.
If you are not guaranteed that exactly one element is duplicated, and you cannot modify the array of elements, then the problem is much harder. This is a classic (and hard!) interview problem that reportedly took Don Knuth 24 hours to solve. The trick is to reduce the problem to an instance of cycle-finding by treating the array as a function from the positions 1-n to the values 1-(n-1) and then looking for two inputs to that function that map to the same output. The resulting algorithm, called Floyd's cycle-finding algorithm, is nevertheless extremely beautiful and simple. Interestingly, it's the same algorithm you would use to detect a cycle in a linked list in linear time and constant space. I'd recommend looking it up, since it periodically comes up in software interviews.
For a complete description of the algorithm along with an analysis, correctness proof, and Python implementation, check out this implementation that solves the problem.
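For readers who want something concrete right away, here is one common way to write the cycle-finding idea in Python. It is a sketch rather than the linked implementation: it assumes a 0-indexed array of length n whose values all lie in 1..n-1, and the function name is hypothetical. Because no value equals 0, index 0 is never part of a cycle, which makes it a safe starting point.

```python
def find_duplicate_floyd(arr):
    """Return a repeated value in arr without modifying it, in O(1) extra space."""
    # Phase 1: follow i -> arr[i] at two speeds until the pointers meet
    # somewhere inside the cycle.
    slow = arr[0]
    fast = arr[arr[0]]
    while slow != fast:
        slow = arr[slow]
        fast = arr[arr[fast]]
    # Phase 2: restart one pointer at index 0; moving both one step at a
    # time, they meet at the cycle entry, which is a duplicated value.
    slow = 0
    while slow != fast:
        slow = arr[slow]
        fast = arr[fast]
    return slow
```

Unlike the XOR and sum approaches, this also works when a value is repeated more than twice, as long as every value stays within 1..n-1.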
Hope this helps!
Adding the elements is perfectly fine; you just have to take the remainder (%) of the running total when calculating both the sum of the elements and the expected sum. For the modulus you can use something like 2n. You also have to fix up the value after the subtraction.

Integrating in maple with integer parameter

I'm attempting to integrate
> ans1 := ([int(e^inx/(2*pi), x = -Pi .. Pi, AllSolutions)], assuming [n::integer]);
I was able to get several other similar integrals to evaluate properly. However, for some reason when I evaluate this integral I simply get back e^{inx}. Moreover, if I add '*' between i,n and x I get a different answer.
Is there any reason for this? Am I missing something?
As stated, 'inx' is a single variable in your expression, so the answer you're getting is expected: your function does not contain an 'x' term. In order to have three separate terms, i, n, and x, you will need to add a * between each of them: 'i*n*x'. If you are entering this in Maple's 2-D math notation, then spaces are interpreted as implicit multiplications and you can leave out the *s.
In addition, you might need to consider changing a few other parts of your syntax to conform to the Maple language (unless of course these are intentional):
The exponential function 'e' is entered as 'exp'
'pi' is the symbolic lowercase Greek letter; 'Pi' is the mathematical constant.
By default, Maple uses 'I' for imaginary numbers. You can change your default to use 'i', but otherwise this is just a symbol 'i'.
Applying these changes to your code, try something like:
int( exp(I*n*x)/(2*Pi), x = -Pi .. Pi, ...)
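As a sanity check on whatever Maple returns, the integral can be worked out by hand. The derivation below assumes n is an integer, as in the question; the exact form Maple prints may differ depending on version and options.

```latex
\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{inx}\,dx
  = \begin{cases}
      1, & n = 0,\\[4pt]
      \dfrac{e^{in\pi} - e^{-in\pi}}{2\pi i n}
        = \dfrac{\sin(n\pi)}{n\pi} = 0, & n \in \mathbb{Z}\setminus\{0\}.
    \end{cases}
```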
