Inserting an element into an array in J

What is the best practice for inserting an element into an array at an arbitrary position in J?
I guess this is sort of a double question: my main issue is figuring out how to provide three arguments to the verb I want to create. The gist of the code that I want to write is
insert =. dyad : '(n {. y) , x , (n }. y)'
for a position n. The best solution to this that I can think of is taking a two-element array of boxes as the right argument and the position as the left, but that seems a bit clunky:
insert =. dyad : 0
NB. the array to be inserted is the first argument
i =. > {. y
NB. the original array is the second argument
a =. > {: y
(x {. a) , i , (x }. a)
)
EDIT: Furthermore, would it be possible to take an array of indices to insert the item at and an array of items to be inserted at those indices -- i.e. inserting multiple items at a time? It seems to me like this is something J would be good at, but I'm not sure how it would be done.

Boxing the arguments is an often used technique. You can use multiple assignment for cleaner code:
f =: 3 : 0
'arg1 arg2' =: y
)
f (i.5);(i.9) NB. arg1 is i.5, arg2 is i.9
To insert array a at position n in L, you can more compactly write:
n ({., a, }.) L
Another way to insert an element into an array is to fill with #!.. Some examples:
1 1 1j2 1 (#!.999) 1 2 3 4
1 2 3 999 999 4
1j1 1 1j1 1 (#!.999) 1 2 3 4
1 999 2 3 999 4
1 1 0j1 1 (#!.999) 1 2 3 4
1 2 999 4
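To make the role of the complex left argument easier to see, here is a rough Python emulation of the three examples above. It is only a sketch, not J: the helper name expand_with_fill is made up for illustration, and it assumes the left argument has one entry per item, where a complex entry a+bj means "copy the item a times, then emit b fill values".

```python
def expand_with_fill(spec, items, fill):
    """Emulate J's  spec (#!.fill) items  for a flat list (illustrative only).

    Each entry of spec is read as a complex number a+bj:
    the matching item is copied a times, followed by b copies of fill.
    """
    out = []
    for s, item in zip(spec, items):
        s = complex(s)                      # plain ints become a+0j
        out.extend([item] * int(s.real))    # copies of the item itself
        out.extend([fill] * int(s.imag))    # fill values inserted after it
    return out


print(expand_with_fill([1, 1, 1 + 2j, 1], [1, 2, 3, 4], 999))
print(expand_with_fill([1 + 1j, 1, 1 + 1j, 1], [1, 2, 3, 4], 999))
print(expand_with_fill([1, 1, 0 + 1j, 1], [1, 2, 3, 4], 999))
```

An entry with real part 0 drops the item and keeps only the fill, which is why the last call replaces 3 instead of inserting after it.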
Depending on your needs, there are many other tricks you can use, like shifting by n with n |. and then undoing the shift with dual &.:
a,&. (n |. ]) L
(reply to the comment that got too long)
From both a readability and a performance standpoint, the two methods are about the same. I would slightly favor the first as more readable but would probably use the second.
You can use the timespacex verb to check the performance, e.g.:
NB. define the different methods
f1 =: 4 :'x ({., a, }.) y'
f2 =: 4 :' a,&. (x |. ]) y'
NB. set some parameters
a =: 1000 $ 9
L =: 1e6 $ 5
n =: 333456
NB. check if the two methods give identical results
(n f1 L) -: (n f2 L)
1
NB. iterate 100 times to get performance averages
100 timespacex'n f1 L'
0.00775349 2.09733e7
100 timespacex'n f2 L'
0.00796431 1.67886e7

Related

For loop with Maxima

for i:1 thru 3 step 1 do;
posix:arithsum(li*cos(ri(t))),1,i-1)+(li*cos(ri(t))/2);
posiy:arithsum(li*sin(ri(t))),1,i-1)+(li*sin(ri(t))/2);
What I want to do is get 6 position functions (3 x and 3 y). It should give me values like the following:
pos1x:l1*cos(r1(t))/2;
pos2x:l1*cos(r1(t))+l2*cos(r2(t))/2;
pos3x:l1*cos(r1(t))+l2*cos(r2(t))+l3*cos(r3(t))/2;
So, why is my code not working?
A couple of things here. (1) A for loop takes just one expression as its loop body; typically multiple expressions are combined into one as (e1, e2, e3) or block(e1, e2, e3). Note that for ... do; isn't correct syntax, since it doesn't have a loop body -- the semicolon terminates the for expression. Note also that expressions in the body are separated by commas, not semicolons. (2) You can use subscript notation to index items; Maxima won't automatically construct symbol names such as pos1x. Instead, use subscript notation: posx[1], posy[i], etc.
Given that, here's a solution.
(%i1) load (functs);
(%o1) /Applications/Maxima.app/Contents/Resources/opt/share/maxima/5.41.0/shar\
e/simplification/functs.mac
(%i2) for i:1 thru 3 step 1 do
(posx[i]:arithsum(l[i]*cos(r[i](t)),1,i-1)+(l[i]*cos(r[i](t))/2),
posy[i]:arithsum(l[i]*sin(r[i](t)),1,i-1)+(l[i]*sin(r[i](t))/2));
(%o2) done
(%i3) [posx[1], posx[2], posx[3]];
(%o3) [l[1]*cos(r[1](t))/2, 3*l[2]*cos(r[2](t))/2, 2*(l[3]*cos(r[3](t)) + 1/2) + l[3]*cos(r[3](t))/2]
(%i4) [posy[1], posy[2], posy[3]];
(%o4) [l[1]*sin(r[1](t))/2, 3*l[2]*sin(r[2](t))/2, 2*(l[3]*sin(r[3](t)) + 1/2) + l[3]*sin(r[3](t))/2]
I am guessing that l[i] and r[i] should be subscripted too. I changed the parentheses in order to fix a syntax problem; if you intended something else, of course you can go ahead and change it again.
Note that in this formulation posx and posy are so-called undeclared arrays. Undeclared arrays are suitable for representing subscripted symbolic variables. You can get the list of elements via listarray.
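For comparison, the target expressions listed in the question (pos1x, pos2x, pos3x) are running sums: each position is the sum of all earlier l*cos(r) terms plus half of the current one. The following numeric Python sketch shows that pattern. It assumes the angles are plain numbers evaluated at some fixed t rather than symbolic functions r_i(t), and the name link_midpoints is made up for illustration.

```python
import math


def link_midpoints(lengths, angles):
    """Return (xs, ys): full earlier links plus half of the current link.

    lengths[i] plays the role of l_i and angles[i] the role of r_i(t)
    at one fixed time t (an assumption made for this numeric sketch).
    """
    xs, ys = [], []
    base_x = base_y = 0.0          # sum of all complete earlier links
    for length, angle in zip(lengths, angles):
        dx = length * math.cos(angle)
        dy = length * math.sin(angle)
        xs.append(base_x + dx / 2)  # earlier links + half of this one
        ys.append(base_y + dy / 2)
        base_x += dx
        base_y += dy
    return xs, ys


xs, ys = link_midpoints([2.0, 2.0, 2.0], [0.0, 0.0, 0.0])
print(xs, ys)
```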

Find a duplicate in array of integers

This was an interview question.
I was given an array of n+1 integers from the range [1,n]. The property of the array is that it has k (k>=1) duplicates, and each duplicate can appear more than twice. The task was to find an element of the array that occurs more than once in the best possible time and space complexity.
After a significant struggle, I proudly came up with an O(n log n) solution that takes O(1) space. My idea was to divide the range [1,n] into two halves and determine which of the two halves contains more elements from the input array (using the pigeonhole principle). The algorithm continues recursively until it reaches an interval [X,X] where X occurs twice, and that X is a duplicate.
The interviewer was satisfied, but then he told me that there exists an O(n) solution with constant space. He generously offered a few hints (something related to permutations?), but I had no idea how to come up with such a solution. Assuming that he wasn't lying, can anyone offer guidelines? I have searched SO and found a few (easier) variations of this problem, but not this specific one. Thank you.
EDIT: In order to make things even more complicated, interviewer mentioned that the input array should not be modified.
1. Take the very last element (x).
2. Save the element at position x (y).
3. If x == y you found a duplicate.
4. Overwrite position x with x.
5. Assign x = y and continue with step 2.
You are basically sorting the array, which is possible because you know where each element has to be inserted. This takes O(1) extra space and O(n) time. You just have to be careful with the indices; for simplicity I assumed the first index is 1 here (not 0) so we don't have to do +1 or -1.
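A minimal Python sketch of these steps follows. It assumes the input really has n+1 entries drawn from 1..n, it modifies the list in place as the description does, and the function name is only illustrative. Positions are kept 1-based to match the text, so the code subtracts 1 when indexing.

```python
def find_duplicate_in_place(values):
    """Find a duplicate by moving each value to its own position.

    Assumes len(values) == n + 1 and every value lies in 1..n.
    WARNING: overwrites entries of `values`.
    """
    x = values[-1]                 # step 1: take the very last element
    while True:
        y = values[x - 1]          # step 2: save the element at position x
        if x == y:                 # step 3: position x already holds x
            return x
        values[x - 1] = x          # step 4: put x where it belongs
        x = y                      # step 5: continue with the saved element


print(find_duplicate_in_place([2, 3, 4, 1, 5, 4, 6, 7, 8]))
```

Each pass through the loop either returns or fixes one more position, so the loop runs at most n + 1 times.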
Edit: without modifying the input array
This algorithm is based on the idea that finding the entry point of the permutation cycle also gives us a duplicate (again assuming a 1-based array for simplicity):
Example:
2 3 4 1 5 4 6 7 8
Entry: 8 7 6
Permutation cycle: 4 1 2 3
As we can see the duplicate (4) is the first number of the cycle.
Finding the permutation cycle
1. x = last element
2. x = element at position x
3. repeat step 2 n times (in total); this guarantees that we have entered the cycle
Measuring the cycle length
1. a = last x from above, b = last x from above, counter c = 0
2. a = element at position a, b = element at position b, b = element at position b, c++ (so b makes 2 steps forward in the cycle and a makes 1)
3. if a == b the cycle length is c; otherwise continue with step 2
Finding the entry point to the cycle
1. x = last element
2. x = element at position x
3. repeat step 2 c times (in total)
4. y = last element
5. if x == y then x is a solution (x made one full cycle and y is just about to enter the cycle)
6. x = element at position x, y = element at position y
7. repeat steps 5 and 6 until a solution is found
The 3 major steps are all O(n) and sequential therefore the overall complexity is also O(n) and the space complexity is O(1).
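The three phases can be sketched in Python as follows. This is one possible reading of the steps, assuming n+1 entries with values in 1..n; the function name and the small helper at() are made up for illustration, and the input list is only read, never written.

```python
def find_duplicate(values):
    """Return a duplicated value without modifying `values`.

    Assumes len(values) == n + 1 and every value lies in 1..n.
    Positions are 1-based, as in the description.
    """
    n = len(values) - 1

    def at(pos):
        return values[pos - 1]     # element at 1-based position pos

    # Phase 1: n steps from the last element put x inside the cycle.
    x = at(n + 1)
    for _ in range(n):
        x = at(x)

    # Phase 2: a moves one step, b moves two; they meet after c steps,
    # where c is the cycle length.
    a = b = x
    c = 0
    while True:
        a = at(a)
        b = at(at(b))
        c += 1
        if a == b:
            break

    # Phase 3: give x a head start of c steps, then advance both;
    # they meet at the entry point of the cycle, which is a duplicate.
    x = at(n + 1)
    for _ in range(c):
        x = at(x)
    y = at(n + 1)
    while x != y:
        x = at(x)
        y = at(y)
    return x


print(find_duplicate([2, 3, 4, 1, 5, 4, 6, 7, 8]))
```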
Example from above:
x takes the following values: 8 7 6 4 1 2 3 4 1 2
a takes the following values: 2 3 4 1 2
b takes the following values: 2 4 2 4 2
therefore c = 4 (yes there are 5 numbers but c is only increased when making steps, not initially)
x takes the following values: 8 7 6 4 | 1 2 3 4
y takes the following values: | 8 7 6 4
x == y == 4 in the end and this is a solution!
Example 2 as requested in the comments: 3 1 4 6 1 2 5
Entering cycle: 5 1 3 4 6 2 1 3
Measuring cycle length:
a: 3 4 6 2 1 3
b: 3 6 1 4 2 3
c = 5
Finding the entry point:
x: 5 1 3 4 6 | 2 1
y: | 5 1
x == y == 1 is a solution
Here is a possible implementation:
function checkDuplicate(arr) {
  console.log(arr.join(", "));
  const len = arr.length;
  let pos = 0;
  let done = 0;
  let cur = arr[0];
  while (done < len) {
    if (pos === cur) {
      cur = arr[++pos];
    } else {
      pos = cur;
      if (arr[pos] === cur) {
        console.log(`> duplicate is ${cur}`);
        return cur;
      }
      cur = arr[pos];
    }
    done++;
  }
  console.log("> no duplicate");
  return -1;
}
for (const t of [
  [0, 1, 2, 3],
  [0, 1, 2, 1],
  [1, 0, 2, 3],
  [1, 1, 0, 2, 4],
]) checkDuplicate(t);
It is basically the solution proposed by @maraca (typed too slowly!). It has constant space requirements (for the local variables), but apart from that only uses the original array for its storage. It should be O(n) in the worst case, because as soon as a duplicate is found, the process terminates.
If you are allowed to non-destructively modify the input vector, then it is pretty easy. Suppose we can "flag" an element in the input by negating it (which is obviously reversible). In that case, we can proceed as follows:
Note: The following assumes that the vector is indexed starting at 1. Since it is probably indexed starting at 0 (as in most languages), you can implement "Flag item at index i" with "Negate the item at index i-1".
1. Set i to 0 and do the following loop:
   1. Increment i until item i is unflagged.
   2. Set j to i and do the following loop:
      1. Set j to vector[j].
      2. If the item at j is flagged, j is a duplicate. Terminate both loops.
      3. Flag the item at j.
      4. If j != i, continue the inner loop.
2. Traverse the vector setting each element to its absolute value (i.e. unflag everything to restore the vector).
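Here is a hedged Python sketch of the flagging procedure. It assumes n+1 positive entries in 1..n, reads "vector[j]" as the absolute value of the stored entry (since flagged entries are negative), and uses an illustrative function name. The list is temporarily modified and restored before returning.

```python
def find_duplicate_by_flagging(vec):
    """Find a duplicate by temporarily negating visited entries.

    Assumes len(vec) == n + 1 and every value lies in 1..n.
    Positions are 1-based in the comments; vec is restored at the end.
    """
    duplicate = None
    i = 0
    while duplicate is None:
        i += 1
        while vec[i - 1] < 0:          # skip positions already flagged
            i += 1
        j = i
        while True:
            j = abs(vec[j - 1])        # follow the value to its position
            if vec[j - 1] < 0:         # reached twice, so j is a duplicate
                duplicate = j
                break
            vec[j - 1] = -vec[j - 1]   # flag position j
            if j == i:                 # closed a duplicate-free cycle
                break
    for k in range(len(vec)):          # unflag everything
        vec[k] = abs(vec[k])
    return duplicate


print(find_duplicate_by_flagging([2, 3, 4, 1, 5, 4, 6, 7, 8]))
```

The last position (n+1) can never be flagged because no value points to it, so the outer loop always finds an unflagged start before running off the end.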
It depends on what tools you (or your app) can use. Many frameworks and libraries exist for this. For example, with the C++ standard library you can use std::map<>, as maraca mentioned.
Or, if you have time, you can write your own binary tree implementation, but keep in mind that inserting elements works differently than with an ordinary array. That way you can optimise the duplicate search for your particular case.
binary tree expl. ref:
https://www.wikiwand.com/en/Binary_tree

J and L-systems

I'm going to create a program that can generate strings from L-system grammars.
Aristid Lindenmayer's original L-System for modelling the growth of algae is:
variables : A B
constants : none
axiom : A
rules : (A → AB), (B → A)
which produces:
iteration | resulting model
0 | A
1 | AB
2 | ABA
3 | ABAAB
4 | ABAABABA
5 | ABAABABAABAAB
that is naively implemented by myself in J like this:
algae =: 1&algae : (([: ; (('AB'"0)`('A'"0) @. ('AB' i. ]))&.>"0)^:[) "1 0 1
(i.6) ([;algae)"1 0 1 'A'
┌─┬─────────────┐
│0│A │
├─┼─────────────┤
│1│AB │
├─┼─────────────┤
│2│ABA │
├─┼─────────────┤
│3│ABAAB │
├─┼─────────────┤
│4│ABAABABA │
├─┼─────────────┤
│5│ABAABABAABAAB│
└─┴─────────────┘
Step-by-step illustration:
('AB' i. ]) 'ABAAB' NB. determine indices of productions for each variable
0 1 0 0 1
'AB'"0`('A'"0)@.('AB' i. ])"0 'ABAAB' NB. apply corresponding productions
AB
A
AB
AB
A
'AB'"0`('A'"0)@.('AB' i. ])&.>"0 'ABAAB' NB. the same &.> to avoid filling
┌──┬─┬──┬──┬─┐
│AB│A│AB│AB│A│
└──┴─┴──┴──┴─┘
NB. finally ; and use ^: to iterate
By analogy, here is a result of the 4th iteration of L-system that generates Thue–Morse sequence
4 (([: ; (0 1"0)`(1 0"0)@.(0 1 i. ])&.>"0)^:[) 0
0 1 1 0 1 0 0 1 1 0 0 1 0 1 1 0
That is the best that I can do so far. I believe that boxing-unboxing method is insufficient here. This is the first time I've missed linked-lists in J - it's much harder to code grammars without them.
What I'm really thinking about is:
a) constructing a list of gerunds of those functions that build final string (in my examples those functions are constants like 'AB'"0 but in case of tree modeling functions are turtle graphics commands) and evoking (`:6) it,
or something that I am able to code:
b) constructing a string of legal J sentence that build final string and doing (".) it.
But I'm not sure if these programs are efficient.
Can you show me a better approach please?
Any hints as well as comments about a) and b) are highly appreciated!
The following will pad the rectangular array with spaces:
L=: rplc&('A';'AB';'B';'A')
L^:(<6) 'A'
A
AB
ABA
ABAAB
ABAABABA
ABAABABAABAAB
Or if you don't want padding:
L&.>^:(<6) <'A'
┌─┬──┬───┬─────┬────────┬─────────────┐
│A│AB│ABA│ABAAB│ABAABABA│ABAABABAABAAB│
└─┴──┴───┴─────┴────────┴─────────────┘
Obviously you'll want to inspect rplc / stringreplace to see what is happening under the covers.
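For readers less familiar with J, the same simultaneous-rewriting idea can be sketched in Python. This is an illustration only, not a description of how rplc is implemented; the function name lsystem and the dictionary-of-rules representation are assumptions. Symbols without a rule are treated as constants and copied unchanged.

```python
def lsystem(axiom, rules, iterations):
    """Return the list of generations [axiom, step 1, ..., step iterations].

    Every symbol is rewritten simultaneously via `rules`;
    symbols that have no rule are kept as they are.
    """
    generations = [axiom]
    for _ in range(iterations):
        current = generations[-1]
        generations.append("".join(rules.get(symbol, symbol) for symbol in current))
    return generations


for step, word in enumerate(lsystem("A", {"A": "AB", "B": "A"}, 5)):
    print(step, word)
```

Because each generation is built from the previous one in a single pass, no padding or unboxing step is needed here; the strings are simply of different lengths.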
You can use complex values in the left argument of # to expand an array without boxing.
For this particular L-system, I'd probably skip the gerunds and use a temporary substitution:
to =: 2 : 'n (I.m=y) } y' NB. replace m with n in y
ins =: 2 : '(1 j. m=y) #!.n y' NB. insert n after each m in y
L =: [: 'c'to'A' [: 'A'ins'B' [: 'B'to'c' ]
Then:
L^:(<6) 'A'
A
AB
ABA
ABAAB
ABAABABA
ABAABABAABAAB
Here's a more general approach that simplifies the code by using numbers and a gerund composed of constant functions:
'x'-.~"1 'xAB'{~([:,(0:`(1:,2:)`1:)@.]"0)^:(<6) 1
A
AB
ABA
ABAAB
ABAABABA
ABAABABAABAAB
The AB are filled in at the end for display. There's no boxing here because I use 0 as a null value. These get scattered around quite a bit but the -.~"1 removes them. It does pad all the resulting strings with nulls on the right. If you don't want that, you can use <@-.~"1 to box the results instead:
'x'<@-.~"1 'xAB'{~([:,(0:`(1:,2:)`1:)@.]"0)^:(<6) 1
┌─┬──┬───┬─────┬────────┬─────────────┐
│A│AB│ABA│ABAAB│ABAABABA│ABAABABAABAAB│
└─┴──┴───┴─────┴────────┴─────────────┘

Indexing Permutations Having Duplicates

Given an array of length n, I need to print out the array's lexicographic index (indexed from zero). The lexicographic index is essentially the location that the given array would have if placed in a super-array containing all possible permutations of the original array.
This doesn't turn out to be all that difficult (Unique Element Permutations), but my problem is now making the same algorithm, but for an array containing duplicates of the same element.
Here's an example chart showing some of the possible permutations of a small array, and their respective expected return values:
[0 1 1 2 2]->0
[0 1 2 1 2]->1
[0 1 2 2 1]->2
[0 2 1 1 2]->3
[0 2 1 2 1]->4
[0 2 2 1 1]->5
[1 0 1 2 2]->6
[1 0 2 1 2]->7
[1 0 2 2 1]->8
[1 1 0 2 2]->9
[1 1 2 0 2]->10
[1 1 2 2 0]->11
..
[2 2 1 0 1]->28
[2 2 1 1 0]->29
Most importantly, I want to do this WITHOUT generating other permutations to solve the problem (for example, I don't want to generate all permutations less than the given permutation).
I'm looking for pseudocode - no specific language needed as long as I can understand the concept. Even the principle for calculation without pseudocode would be fine.
I've seen some implementations that do something similar but for a binary string (containing only two distinct types of elements), and they used binomial coefficients to get the job done. Hopefully that helps.
As an aside, though the answers to the question linked in Daishisan's comment fulfil the multiset case, the algorithm in your link for binary numbers (for which I was searching when I came upon your answer) works for indexing because it's bijective, but doesn't index the binary number within the sorted infinite sequence of those with the same bit count as you may expect. With the following dependencies,
from functools import reduce
fact=(lambda n: reduce(int.__mul__,range(1,n+1)) if n else 1)
choose=(lambda n,*k: fact(n)//(reduce(int.__mul__,map(fact,k))*fact(n-sum(k))) if all(map(lambda k: 0<=k,k+(n-sum(k),))) else 0)
decompose=(lambda n,l=None: (n>>i&1 for i in range(n.bit_length() if l==None else l)))
It is equivalent to
lambda i,n: reduce(lambda m,i: (lambda s,m,i,b: (s,m-1) if b else (s+choose(n+~i,m),m))(*m,*i),enumerate(decompose(i,n)),(0,i.bit_count()-1))[0]
However, I played with it and found a reduced version that does fulfil this purpose (and thus doesn't need a length specified).
lambda i: reduce(lambda m,i: (lambda s,m,i,b: (s+choose(i,m),m) if b else (s,m+1))(*m,*i),enumerate(decompose(i)),(0,-1))[0]
This is equivalent to A079071 in the OEIS.
Edit: More efficient version without fact and choose (instead mutating choose's output in-place with the other variables)
lambda i: reduce(lambda m,i: (lambda s,m,c,i,b: ((s+c,m,c*i//(i-m+1)) if b else (s,m+1,c*i//m)) if m else (s,m+1-b,c))(*m,*i),enumerate(decompose(i),1),(0,0,1))[0]
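For the multiset case that the question itself asks about, one counting approach is to walk the array from left to right and, at each position, add the number of arrangements that would start with a smaller remaining element. The Python sketch below illustrates that principle under the assumption that elements are comparable with <; the name multiset_rank is made up, and this is not taken from the linked answers.

```python
from collections import Counter
from math import factorial


def multiset_rank(perm):
    """Zero-based lexicographic index of perm among all distinct
    permutations of its own elements (duplicates allowed)."""
    counts = Counter(perm)
    remaining = len(perm)
    rank = 0
    for item in perm:
        # Number of distinct arrangements of the elements still unplaced.
        total = factorial(remaining)
        for c in counts.values():
            total //= factorial(c)
        # Arrangements that start with a smaller element s come first;
        # there are total * counts[s] / remaining of them for each s.
        for s, c in counts.items():
            if s < item and c > 0:
                rank += total * c // remaining
        counts[item] -= 1          # place the current element
        remaining -= 1
    return rank


print(multiset_rank([1, 1, 2, 0, 2]))
```

No other permutation is ever generated; the work per position depends only on the number of distinct elements.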

2sum with duplicate values

The classic 2sum question is simple and well-known:
You have an unsorted array, and you are given a value S. Find all pairs of elements in the array that add up to value S.
And it's always been said that this can be solved with a hash table in O(N) time and space, or in O(N log N) time and O(1) space by first sorting the array and then moving inward from the left and right ends.
Well, these two solutions are obviously correct, BUT I guess not for the following array:
{1,1,1,1,1,1,1,1}
Is it possible to print ALL pairs which add up to 2 in this array in O(N) or O(NlogN) time complexity ?
No, printing out all pairs (including duplicates) takes O(N^2). The reason is that the output size is O(N^2), so the running time cannot be less than that (since it takes some constant amount of time to print each element in the output, simply printing the output would take C*N^2 = O(N^2) time).
If all the elements are the same, e.g. {1,1,1,1,1}, every possible pair would be in the output:
1. 1 1
2. 1 1
3. 1 1
4. 1 1
5. 1 1
6. 1 1
7. 1 1
8. 1 1
9. 1 1
10. 1 1
This is (N-1) + (N-2) + ... + 2 + 1 (by taking each element with all elements to the right), which is
N(N-1)/2 = O(N^2), which is more than O(N) or O(N log N).
However, you should be able to simply count the pairs in expected O(N) by:
1. Creating a hash-map, map, that maps each element to the number of times it appears.
2. Looping through the hash-map and summing, for each element x up to S/2 (if we went up to S we would count the pair x and S-x twice; let map[x] == 0 if x doesn't exist in the map):
   - map[x]*map[S-x] if x != S-x (which is the number of ways to pick x and S-x)
   - map[x]*(map[x]-1)/2 if x == S-x (from N(N-1)/2 above).
Of course you can also print the distinct pairs in O(N) by creating a hash-map similar to the above and looping through it, and only outputting x and S-x if map[S-x] exists.
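The counting steps can be sketched in Python like this. The function name count_pairs is illustrative, and comparing x < S-x is used in place of "x up to S/2" so that each unordered pair is counted once without dividing S.

```python
from collections import Counter


def count_pairs(values, target):
    """Count unordered pairs of positions whose values sum to target."""
    counts = Counter(values)
    total = 0
    for x, cx in counts.items():
        y = target - x
        if x < y:
            total += cx * counts.get(y, 0)   # pick one x and one y
        elif x == y:
            total += cx * (cx - 1) // 2      # pick two of the same value
    return total


print(count_pairs([1, 1, 1, 1, 1], 2))
print(count_pairs([2, 4, 3, 2, 9, 3, 3], 6))
```

Building the counter and the single pass over its keys are both expected O(N), even though the number of pairs being counted can be quadratic.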
Displaying or storing the results can only be done in O(N^2). The worst case you highlighted clearly has on the order of N^2 pairs, and writing them to the screen or storing them in a result array would clearly require at least that much time. In short, you are right!
No
You can pre-compute them in O(n log n) using sorting, but to print them you may need more than O(n log n). In the worst case it can be O(N^2).
Let's modify the algorithm to find all duplicate pairs.
As an example:
a[ ]={ 2 , 4 , 3 , 2 , 9 , 3 , 3 } and sum =6
After sorting:
a[ ] = { 2 , 2 , 3 , 3 , 3 , 4 , 9 }
Suppose you found the pair {2,4}; now you have to find the counts of 2 and 4 and multiply them to get the number of duplicate pairs. Here 2 occurs 2 times and 4 occurs 1 time. Hence {2,4} will appear 2*1 = 2 times in the output. Now consider the special case when both numbers are the same: with c occurrences there are c(c-1)/2 pairs of distinct positions. Here {3,3} sums to 6 and 3 occurs 3 times in the array. Hence {3,3} will appear 3*2/2 = 3 times in the output.
In your array {1,1,1,1,1} only the pair {1,1} sums to 2 and the count of 1 is 5. Hence there are going to be 5*4/2 = 10 pairs of {1,1} in the output.

Resources