So I just came back from a job interview, and one of the questions I faced was:
"Given an array of characters and three characters for example :
Array : [a,b,c,z,s,w,y,z,o]
Char 1: 'z'
Char 2 : 'R'
Char 3 : 'R'
Your goal is to replace each 'z' in the array with two 'R' characters in O(N) time.
So your input will be Array : [a,b,c,z,s,w,y,z,o]
and your output array will be : [a,b,c,R,R,s,w,y,R,R,o]
Assume that the array contains no 'R' beforehand.
You are not allowed to use other arrays or other variables.
The algorithm should be in-place.
Your final array must be an array of characters."
My solution ran in O(N^2) time, but there is a solution that runs in O(N).
The interview is over, but I am still thinking about this problem. Can anyone help me solve it?
First scan the input to count how many occurrences of char 1 exist. This has a linear time complexity.
From that you know that the length of the final array will be the input length + the number of occurrences.
Then extend the array to its new length, leaving the new slots empty (or whatever value). The exact nature of the operation depends on how the array data structure is implemented. This can surely be done with at worst a linear time complexity.
Use two indexes, i and j, where i references the last character of the input array and j references the very last index in the array (potentially to an empty slot).
Start copying from i to j, decrementing both indices by one after each copy. When the character at i is char 1, do not copy it: write char 3 and then char 2 at j instead (decrementing j after each write), and decrement i once. This again has linear time complexity.
The algorithm will end with both i and j equal to -1.
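The steps above can be sketched in Python as follows. This is only an illustration under some assumptions: the function name `expand_in_place` and its parameter names are made up here, a Python list stands in for the character array, and the two index variables plus a counter are used just as in the description (so the "no other variables" rule is read loosely).

```python
def expand_in_place(chars, target, first, second):
    """Replace each `target` in `chars` with `first`, `second`, in place, in O(n)."""
    # Pass 1: count the occurrences to learn the final length.
    count = sum(1 for c in chars if c == target)

    i = len(chars) - 1            # last character of the original input
    chars.extend([None] * count)  # grow the array; the new slots are placeholders
    j = len(chars) - 1            # very last slot of the grown array

    # Pass 2: copy backwards so no unread character is ever overwritten.
    while i >= 0:
        if chars[i] == target:
            chars[j] = second
            chars[j - 1] = first
            j -= 2
        else:
            chars[j] = chars[i]
            j -= 1
        i -= 1
    return chars


print(expand_in_place(list("abczswyzo"), "z", "R", "R"))
```

Working from the back is the design choice that matters: the write index j never falls below the read index i, so every character is read before its slot is reused.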
Do two iterations.
First, count the number of char1s ('z' in your example).
Now you know how long your array should be at the end: array.size() + num_char1s
Then, go from last to first with input and output iterators. If the element is char1, write the two new chars at the output iterator; otherwise, just copy the element.
Pseudo code:
num_char1s = 0
for x in array:
    if x == char1:
        num_char1s++
// Assuming array has sufficient memory already allocated.
out_iterator = num_char1s + size - 1
in_iterator = size - 1
while (in_iterator >= 0):
    if (array[in_iterator] == char1):
        array[out_iterator--] = char3
        array[out_iterator--] = char2
    else:
        array[out_iterator--] = array[in_iterator]
    in_iterator--
In your question, two things are very important:
can't use a new variable
can't use a new array
So we must use the given array.
First, double the size of the given array. Why? Because the new array can be at most given_array_size*2 long (when every character equals char 1).
Now shift the contents of the given array n positions to the right, where n = given_array_size.
Now iterate over the array from the shifted start position: i = n to 2*n-1.
Take j = 0 as the write position. If we find char 1, we
set array[j++]=char 2 and array[j++]=char 3.
If the character is not char 1, we simply copy it: array[j++]=array[i]
At the end, positions 0 to j-1 hold the answer.
Complexity: O(n)
No new array is needed; only the index variables i and j are used.
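A minimal Python sketch of this double-and-shift idea, assuming a list stands in for the array; the name `expand_by_shifting` and its parameters are hypothetical, chosen only for this example.

```python
def expand_by_shifting(chars, target, first, second):
    """Double the array, shift the input right by n, then rewrite from the front."""
    n = len(chars)
    chars.extend([None] * n)          # double the size
    for k in range(n - 1, -1, -1):    # shift right by n, back to front
        chars[k + n] = chars[k]

    j = 0                             # write position
    for i in range(n, 2 * n):         # read position in the shifted copy
        if chars[i] == target:
            chars[j] = first
            chars[j + 1] = second
            j += 2
        else:
            chars[j] = chars[i]
            j += 1

    del chars[j:]                     # positions 0..j-1 hold the answer
    return chars


print(expand_by_shifting(list("abczswyzo"), "z", "R", "R"))
```

The write position can never overtake the read position: after k characters have been read, at most 2k slots have been written, while the next read is at n + k.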
There is a known problem "Longest increasing subsequence", which is: Given an array of integers, find out the longest increasing sequence in that array. I now face a similar but apparently more complicated problem: Given an array of integers and a given number N, find N sequences in that array so that each of them is increasing, they do not intersect by indexes and their combined sum of lengths is maximal.
So far I have tried "greedy" algorithms along the lines of:
Use the longest increasing subsequence algorithm, throw that sequence away from the array, repeat N times, provide found sequences as result. This works if N=1 by design, works in several odd cases but returns incorrect results for shuffled arrays such as an array constructed of N increasing subsequences.
Construct a number of sequences, adding each element to the now-longest possible subsequence. Obviously flawed, as it finds "substrings" more often than prolonged sequences.
Construct a number of sequences, adding each element to the sequence that has the largest last element. This works better: at least if an array is known to consist of N increasing subsequences, this algorithm correctly returns the full array as the result. But it does not work properly in general, as it does not take N into account.
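For reference, the third attempt might look roughly like the sketch below. It is an assumption-laden reading of the description, not a solution: "increasing" is taken as strictly increasing, each element goes to the sequence with the largest last element that is still smaller than it, and a new sequence is opened when none fits. The name `greedy_split` is made up for this example.

```python
def greedy_split(values):
    """Greedily partition `values` into strictly increasing subsequences."""
    sequences = []
    for v in values:
        best = None
        for seq in sequences:
            # Candidate: v can extend seq, and seq ends higher than the current best.
            if seq[-1] < v and (best is None or seq[-1] > best[-1]):
                best = seq
        if best is None:
            sequences.append([v])   # nothing can take v: open a new sequence
        else:
            best.append(v)
    return sequences


print(greedy_split([1, 5, 2, 6, 3, 7]))
```

Note that the number of sequences it produces is whatever the data dictates, which is exactly the flaw described: N never enters the computation.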
Any other ideas?
If you want to play with sample data of decent size, here's an array:
103,202,234,260,301,324,356,379,405,412,421,284,137,439,315,150,322,454,185,335,481,208,495,223,358,258,522,267,365,526,
536,374,399,566,580,424,302,602,335,365,618,441,380,455,397,483,510,410,419,622,529,534,633,442,544,568,653,668,474,502,
689,583,607,694,699,530,618,648,654,555,705,723,563,738,672,595,746,697,766,720,624,740,794,798,818,845,859,653,752,758,
783,674,793,805,876,831,892,918,929,689,865,950,874,966,997,716,738,899,759,1023,1032,917,1053,938,944,1080,771,797,
960,1089,980,815,839,850,1110,1011,1115,861,878,1143,901,1025,931,1175,1192,1197,1050,1229,959,988,1058,1008,1038,1088,
1116,1126,1135,1063,1256,1269,1082,1275,1088,1305,1122,1154,1157,1326,1184,1350,1184,1205,1236,1268,1293,1324,1373,1347,
1365,1217,1400,1240,1261,1414,1381,1406,1413,1443,1282,1451,1456,1442,1476,1485,1475,1488,1499,1510,1508,1316,1325,1338,
1540,1536,1353,1556,1558,1588,1363,1587,1617,1382,1625,1402,1609,1415,1633,1642,1655,1671,1689,1697,1439,1712,1458,1732,
1481,1693,1510,1747,1715,1762,1730,1791,1820,1522,1539,1748,1759,1566,1577,1584,1611,1646,1834,1790,1653,1820,1659,1833,
1693,1842,1704,1717,1846,1868,1729,1744,1773,1882,1796,1915,1937,1814,1861,1846,1941,1871,1905,1893,1931,1945,1917,
1960,1979,1941,1960,1980,1933,1962,2014,2046,1975,1988,2008,1988,2040,1995,2062,2000,2009,2025,2083,2058,2067,2083,2103,
2038,2114,2121,2134,2063,2166,2115,2124,2178,2202,2135,2090,2104
This is an array constructed of 3 randomized increasing subsequences with overlapping ranges, each having a length of 100, so processing this array with a proper algorithm with N=3 should return full array, with N=1 the answer should be 123, and for N=2, no less than 222. (True value yet undetermined)
You are given an unsorted array of n integers, and you would like to find if there are any duplicates in the array (i.e. any integer appearing more than once).
Describe an algorithm (implemented with two nested loops) to do this.
The question that I am stuck at is:
How can you limit the input data to achieve a better Big O complexity? Describe an algorithm for handling this limited data to find if there are any duplicates. What is the Big O complexity?
Your help will be greatly appreciated. This is not related to my coursework or any assignment. It's from a previous year's exam paper; I am doing some self-study but am stuck on this question. The only possible solution that I could come up with is:
If we limit the data and use nested loops to find whether there are duplicates, the complexity would be O(n), simply because the time the operations take is proportional to the data size.
If my answer makes no sense, please ignore it and, if you can, suggest possible solutions or working for this question.
If someone could help me solve this, I would be grateful, as I have attempted countless possible solutions, none of which seems to be correct.
Edited part, again. Another possible solution (if effective!):
We could implement a loop that sorts the array from lowest integer to highest, so that duplicates end up right next to each other, making them easier and faster to identify.
The big O complexity would still be O(n^2).
Since the array is linear, it would simply use the first loop and iterate n-1 times, taking the index into the array (in the first iteration it could be, for instance, 1) and storing it in a variable named 'current'.
The outer loop increments current by 1 on each iteration. Within that loop, we write another loop that compares the current number to the next number; if they are equal, we print using a printf statement, otherwise we return to the outer loop, which advances current to the next value in the array and updates the next variable to hold the number after the value in current.
You can do this in linear time (O(n)) for any input if you use hash tables (which have constant expected look-up time).
However, this is not what you are being asked about.
By limiting the possible values in the array, you can achieve linear performance.
E.g., if your integers have range 1..L, you can allocate a bit array of length L, initialize it to 0, and iterate over your input array, checking and flipping the appropriate bit for each input.
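A small Python sketch of the bit-array idea, assuming every input value lies in 1..L. The function name `has_duplicate` and the parameter `limit` (standing for L) are hypothetical; a `bytearray` is used as the packed bit array.

```python
def has_duplicate(values, limit):
    """Return True if any value in `values` (each in 1..limit) repeats."""
    bits = bytearray((limit + 7) // 8)   # one bit per possible value, all zero
    for v in values:
        index = v - 1                    # map 1..limit onto bit positions 0..limit-1
        byte, mask = index >> 3, 1 << (index & 7)
        if bits[byte] & mask:            # bit already set: value seen before
            return True
        bits[byte] |= mask               # mark the value as seen
    return False


print(has_duplicate([3, 1, 4, 1, 5], 5))
```

This runs in O(n + L) time (the L term is for zeroing the bit array) and uses L bits of extra space.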
A variant of bucket sort will do. This gives a complexity of O(n), where n is the number of input elements.
But there is one restriction: the max value. You need to know the maximum value your integer array can contain. Let's call it m.
The idea is to create a bool array of size m + 1 (all initialized to false). Then iterate over your array. As you visit an element, set bucket[element] to true. If it is already true, you've encountered a duplicate.
Java code:
// Alternatively, you can iterate over the array to find maxVal, which again is O(n).
// Assumes every element is in the range 0..maxVal.
public boolean findDup(int[] arr, int maxVal)
{
    // Java initializes all the values to false by default.
    // Size maxVal + 1 so that the index maxVal itself is valid.
    boolean[] bucket = new boolean[maxVal + 1];
    for (int elem : arr)
    {
        if (bucket[elem])
        {
            return true; // a duplicate found
        }
        bucket[elem] = true;
    }
    return false;
}
But the constraint here is the space. You need O(maxVal) space.
Nested loops get you O(N*M) or O(N*log(M)); for O(N) you cannot use nested loops!
I would use a histogram instead:
DWORD in[N]={ ... };    // input data ... values are from < 0 , M )
DWORD his[M]={ ... };   // histogram of in[]
int i,j;
// compute histogram O(N)
for (i=0;i<M;i++) his[i]=0;        // this can be done also by memset ...
for (i=0;i<N;i++) his[in[i]]++;    // if the range of values is not from 0 then shift it ...
// remove duplicates O(N)
for (i=0,j=0;i<N;i++)
{
    his[in[i]]--;                  // count down duplicates
    in[j]=in[i];                   // copy item
    if (his[in[i]]<=0) j++;        // if not duplicate then do not delete it
}
// now j holds the new in[] array size
[Notes]
If the value range is too big with sparse areas, you need to convert his[]
to a dynamic list with two values per item:
one is the value from in[] and the second is its occurrence count.
But then you need a nested loop -> O(N*M)
or a binary search -> O(N*log(M))
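The histogram approach above translates to Python roughly as follows. This is a sketch assuming all values lie in the range 0..m-1; the name `remove_duplicates` and the parameter `m` are chosen here for illustration. Like the original loop, it keeps the last occurrence of each value.

```python
def remove_duplicates(data, m):
    """Remove duplicates from `data` in place; values must be in 0..m-1."""
    his = [0] * m                # histogram of data
    for v in data:
        his[v] += 1

    j = 0                        # write position / new length
    for i in range(len(data)):
        v = data[i]
        his[v] -= 1              # count down the remaining occurrences of v
        data[j] = v              # tentatively copy the item
        if his[v] == 0:          # last occurrence: keep it
            j += 1
    del data[j:]                 # j is the new size
    return data


print(remove_duplicates([1, 2, 1, 3, 2], 4))
```

Whether any duplicate existed can be read off afterwards by comparing the new length with the original one.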
This is an interview question.
Given an array of integers, find the single integer value in the array which occurs with even frequency. All integers will be positive. All other numbers occur with odd frequency. The max number in the array can be INT_MAX.
For example, [2, 8, 6, 2] should return 2.
The original array can be modified if that allows a better solution, such as O(1) space with O(n) time.
I know how to solve it with a hash table (traverse and count frequencies). That is O(n) time and space.
Is it possible to solve it in O(1) space or with better time?
Given this is an interview question, the answer is: O(1) space is achievable "for very big values of 1":
Prepare a matcharray 1..INT_MAX of all 0
When traversing the array, use the integer as an index into the matcharray, adding 1
When done, traverse the match array to find the one entry with a positive even value
The space for this is large, but independent of the size of the input array, so O(1) space. For really big data sets (say small value range, but enormous array length), this might even be a practically valid solution.
If you are allowed to sort the original array, I believe that you can do this in O(n lg U) time and O(lg U) space, where U is the maximum element of the array. The idea is as follows - using in-place MSD radix sort, sort the array in O(n lg U) time and O(lg U) space. Then, iterate across the array. Since all equal values are consecutive, you can then count how many times each value appears. Once you find the value that appears an even number of times, you can output the answer. This second scan requires O(n) time and O(1) space.
If we assume that U is a fixed constant, this gives an O(n)-time, O(1)-space algorithm. If you don't assume this, then the memory usage is still better than the O(n) algorithm provided that lg U = O(n), which should be true on most machines. Moreover, the space usage is only logarithmically as large as the largest element, meaning that the practical space usage is quite good. For example, on a 64-bit machine, we'd need only space sufficient to hold 64 recursive calls. This is much better than allocating a gigantic array up-front. Moreover, it means that the algorithm is a weakly-polynomial time algorithm as a function of U.
That said, this does rearrange the original array, and thus does destructively modify the input. In a sense, it's cheating because it uses the array itself for the O(n) storage space.
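The sort-then-scan idea can be sketched as below. To keep the example short, Python's built-in `list.sort()` is swapped in for the in-place MSD radix sort, so the sketch does not have the O(lg U) space bound discussed above; only the linear scan over runs of equal values is shown faithfully. The function name is hypothetical.

```python
def find_even_frequency_sorted(values):
    """Sort `values` in place, then return the value whose run length is even."""
    values.sort()                 # stand-in for an in-place MSD radix sort
    n = len(values)
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1                # advance j to the end of the run of equal values
        if (j - i) % 2 == 0:      # run length is the frequency of values[i]
            return values[i]
        i = j
    return None                   # no value occurs an even number of times


print(find_even_frequency_sorted([2, 8, 6, 2]))
```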
Hope this helps!
Scan through the list maintaining two sets, the 'Even' set and the 'Odd' set. If an element hasn't been seen before (i.e. if it's in neither set), place it in the 'Odd' set. If an element is in one set, move it to the other set. At the end, there should be only one item in the 'Even' set. This probably won't be fast, but the memory usage should be reasonable for large lists.
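A short Python sketch of the two-set scan, using the built-in `set` type; the function name `find_even_frequency` is an assumption made for this example, and it raises an error if the input does not contain exactly one even-frequency value.

```python
def find_even_frequency(values):
    """Return the single value that occurs an even number of times."""
    odd, even = set(), set()
    for v in values:
        if v in odd:              # seen an odd number of times so far
            odd.remove(v)
            even.add(v)
        elif v in even:           # seen an even number of times so far
            even.remove(v)
            odd.add(v)
        else:                     # first sighting counts as odd
            odd.add(v)
    if len(even) != 1:
        raise ValueError("expected exactly one value with even frequency")
    return next(iter(even))


print(find_even_frequency([2, 8, 6, 2]))
```

The two sets together hold each distinct value once, so the extra space is proportional to the number of distinct values rather than to the value range.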
-Make a hash table containing ints. Call it is_odd or something. Since you might have to look through an array of size INT_MAX, just make it an array of size INT_MAX. Initialize to 0.
-Traverse through the whole array. You have to do this. There's no way to beat O(n).
for each number:
    if it's not in the hash table, mark its spot in the table as 1.
    if it is in the hash table then:
        if its value is '1', make it '2'
        if its value is '2', make it '1'.
Now you have to traverse through the hash table. Pull out the sole entry with "2" as the value.
Time:
You traverse the array once and the hash table once, so O(n).
Space:
Just an array of size INT_MAX. Or if you know the range of your array you can restrict your memory use to that.
edit: I just saw that you already had this method. Sorry about that!
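I guess we read the task improperly. It asks us to "find the single integer value in the array which occurs with even frequency". The XOR trick below solves the mirrored variant: assuming exactly ONE value occurs an odd number of times and every other value occurs an even number of times, XOR-ing all elements leaves that value: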
public static void main(String[] args) {
    int[] array = { 2, 1, 2, 4, 4 };
    int count = 0;
    for (int i : array) {
        count ^= i;
    }
    System.out.println(count); // Prints 1
}