How to reverse multiple lists using the Wolfram-Language - wolfram-language

I'm trying to find the best way to solve the question: "Use Range, Reverse and Join to create {3, 2, 1, 4, 3, 2, 1, 5, 4, 3, 2, 1}"
So basically the given lists are {1, 2, 3, 4, 5}, {1, 2, 3, 4}, {1, 2, 3}.
I could easily solve this question but wanted to know if there is a better way (more efficient) than what i've come up with.:
My Solutions:
In[136]:= Join[ Reverse[Range[3]], Reverse[Range[4]], Reverse[Range[5]] ]
In[141]:= Reverse[Join[ Range[5], Range[4], Range[3] ]]
given lists: {1, 2, 3, 4, 5}, {1, 2, 3, 4}, {1, 2, 3}, where you have to use the functions Range, Reverse and Join to create the expected output:
{3, 2, 1, 4, 3, 2, 1, 5, 4, 3, 2, 1}
My solution will not be efficient if there were to be 100 lists instead of three.
Thanks in advance for the help

RUNNING EACH ELEMENT OF A LIST THROUGH A FUNCTION:
listA = {}
Function[x, listA = Join[listA, x]] /# {Range[5], Range[4], Range[3]}
listB = Reverse[listA]
Clear[listA]
output:
result -> listB: {3, 2, 1, 4, 3, 2, 1, 5, 4, 3, 2, 1}

Range[#] & /* Reverse /# {3, 4, 5} // Flatten
{3, 2, 1, 4, 3, 2, 1, 5, 4, 3, 2, 1}
Update
Someone voted to delete my answer without providing a reason. Perhaps because it did not use Join. To address that
Range[#] & /* Reverse /# {3, 4, 5} // Apply[Join]

Related

Removing array rows based on certain matches between elements

Consider the small sample of a 6-column integer array:
import numpy as np
J = np.array([[1, 3, 1, 3, 2, 5],
[2, 6, 3, 4, 2, 6],
[1, 7, 2, 5, 2, 5],
[4, 2, 8, 3, 8, 2],
[0, 3, 0, 3, 0, 3],
[2, 2, 3, 3, 2, 3],
[4, 3, 4, 3, 3, 4])
I want to remove, from J:
a) all rows where the first and second PAIRS of elements are exact matches
(this remove rows like [1,3, 1,3, 2,5])
b) all rows where the second and third PAIRS of elements are exact matches
(this remove rows like [1,7, 2,5, 2,5])
Matches between any other pairs are OK.
I have a solution, below, but it is handled in two steps. If there is a more direct, cleaner, or more readily extendable approach, I'd be very interested.
K = J[~(np.logical_and(J[:,0] == J[:,2], J[:,1] == J[:,3]))]
L = K[~(np.logical_and(K[:,2] == J[:,4], K[:,3] == K[:,5]))]
K removes the 1st, 5th, and 7th rows from J, leaving
K = [[2, 6, 3, 4, 2, 6],
[1, 7, 2, 5, 2, 5],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]])
L removes the 2nd row from K, giving the final outcome.
L = [[2, 6, 3, 4, 2, 6],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]])
I'm hoping for an efficient solution because, learning from this problem, I need to extend these ideas to 8-column arrays where
I eliminate rows having exact matches between the 1st and 2nd PAIRS, the 2nd and 3rd PAIRS, and the 3rd and 4th PAIRS.
Since we are checking for adjacent pairs for equality, a differencing on 3D reshaped data seems would be one way to do it for a cleaner vectorized one -
# a is input array
In [117]: b = a.reshape(a.shape[0],-1,2)
In [118]: a[~(np.diff(b,axis=1)==0).all(2).any(1)]
Out[118]:
array([[2, 6, 3, 4, 2, 6],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]])
If you are going for performance, skip the differencing and go for sliced equality -
In [142]: a[~(b[:,:-1] == b[:,1:]).all(2).any(1)]
Out[142]:
array([[2, 6, 3, 4, 2, 6],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]])
Generic no. of cols
Extends just as well on generic no. of cols -
In [156]: a
Out[156]:
array([[1, 3, 1, 3, 2, 5, 1, 3, 1, 3, 2, 5],
[2, 6, 3, 4, 2, 6, 2, 6, 3, 4, 2, 6],
[1, 7, 2, 5, 2, 5, 1, 7, 2, 5, 2, 5],
[4, 2, 8, 3, 8, 2, 4, 2, 8, 3, 8, 2],
[0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3],
[2, 2, 3, 3, 2, 3, 2, 2, 3, 3, 2, 3],
[4, 3, 4, 3, 3, 4, 4, 3, 4, 3, 3, 4]])
In [158]: b = a.reshape(a.shape[0],-1,2)
In [159]: a[~(b[:,:-1] == b[:,1:]).all(2).any(1)]
Out[159]:
array([[4, 2, 8, 3, 8, 2, 4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3, 2, 2, 3, 3, 2, 3]])
Of course, we are assuming the number of cols allows pairing.
What you have is quite reasonable. Here's what I would write:
def eliminate_pairs(x: np.ndarray) -> np.ndarray:
first_second = (x[:, 0] == x[:, 2]) & (x[:, 1] == x[:, 3])
second_third = (x[:, 1] == x[:, 3]) & (x[:, 2] == x[:, 4])
return x[~(first_second | second_third)]
You could also apply DeMorgan's theorem and eliminate an extra not operation, but that's less important than clarity.
Let's try a loop:
mask = False
for i in range(0,3,2):
mask = (J[:,i:i+2]==J[:,i+2:i+4]).all(1) | mask
J[~mask]
Output:
array([[2, 6, 3, 4, 2, 6],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]])

Looping through a collection and deleting things on the way

I want to go through a collection and find the first pair of matching elements, but my current approach is having trouble with the indexing going out of bounds all the time.
Here's a simplified MWE example:
function processstuff(stuff)
for pointer1 in 1:length(stuff)
for pointer2 in pointer1:length(stuff)
println("$(stuff)")
pointer1 == pointer2 && continue
if stuff[pointer1] == stuff[pointer2]
# items match, remove them
deleteat!(stuff, pointer1)
deleteat!(stuff, pointer2)
end
end
end
end
processstuff(collect(rand(1:5, 20)))
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[1, 4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 2, 1, 2, 4, 3, 2, 1, 1]
[4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 1, 2, 4, 3, 2, 1, 1]
[4, 3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 1, 2, 4, 3, 2, 1, 1]
[3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 1, 2, 4, 2, 1, 1]
[3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 1, 2, 4, 2, 1, 1]
[3, 3, 2, 4, 5, 2, 2, 2, 3, 1, 1, 2, 4, 2, 1, 1]
ERROR: LoadError: BoundsError: attempt to access 16-element Array{Int64,1} at index [17]
(Obviously this example is just comparing two numbers, the real comparison isn't.)
The idea of updating the collection of stuff by removing both elements that have been processed looks like it works, because I think Julia updates the iteration thing each time through. But only for a while...?
You can use the following approach (assuming you want to remove pairs):
function processstuff!(stuff)
pointer1 = 1
while pointer1 < length(stuff)
for pointer2 in pointer1+1:length(stuff)
if stuff[pointer1] == stuff[pointer2]
deleteat!(stuff, (pointer1, pointer2))
pointer1 -= 1 # correct pointer location as we later add 1 to it
break
end
end
pointer1 += 1
end
end
In your code there were several problems:
you called deleteat! twice, which could invalidate indexing
your inner loop tried to delete pointer1 several times
in outer loop I use while to dynamically track changing size of stuff

Rearrange array [1, 2, 3, 4, 5, 6] to [1, 3, 5, 2, 4, 6]

I'm looking for the royal road to rearrange an array of variable length like
[1, 2, 3, 4, 5, 6]
into something like this:
[1, 3, 5, 2, 4, 6]
The length of the array is always dividable by 3. So I could also have an array like this:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
which should be turned into:
[1, 4, 7, 2, 5, 8, 3, 6, 9]
A real example would look like this:
['X.1', 'X.2', 'Y.1', 'Y.2', 'Z.1', 'Z.2']
which I would like to turn into:
['X.1', 'Y.1', 'Z.1', 'X.2', 'Y.2', 'Z.2']
An array of size 3 or an empty array should remain unmodified.
How would I do that?
If a is the name of your NumPy array, you can write:
a.reshape(3, -1).ravel('f')
(This assumes that your array is divisible by 3, as you have stated.)
This method works by first viewing each chunk of len(a) / 3 elements as rows of a 2D array and then unravels that 2D array column-wise.
For example:
>>> a = np.array([1, 2, 3, 4, 5, 6])
>>> a.reshape(3, -1).ravel('f')
array([1, 3, 5, 2, 4, 6])
>>> b = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b.reshape(3, -1).ravel('f')
array([1, 4, 7, 2, 5, 8, 3, 6, 9])
no numpy solution:
>>> r = range(1,10)
>>> sum([r[i::len(r)/3] for i in range(len(r)/3)],[])
[1, 4, 7, 2, 5, 8, 3, 6, 9]
i used sum for concatenting lists as it is the most self contained and readable example. But as mentioned in the comments, it is certainly not the most efficient one. For efficiency, you can use list (in)comprehension:
>>> r = range(1,10)
>>> [x for y in [r[i::len(r)/3] for i in range(len(r)/3)] for x in y]
[1, 4, 7, 2, 5, 8, 3, 6, 9]
or any of the other methods mentioned here.
Using reshape and transpose (or T) will do :
import numpy as np
t = np.arange(1, 10)
t2 = np.reshape(t, [t.shape[0]/3, 3]).T.reshape(len(t))

How get sets of integer which appear in at least K or more than K sub-array(s) from array. Each set must have exactly L elements ?

I want to write program with input an array which contains N sub-array(s). Each sub-array is an array of integers which has M element(s). N and M can be very large (e.g. 1.000.000). It assume that the arrays can be stored in physical memory. The program would output: All sets of integers which appear in at least K or more than K sub-array(s). Each set must have exactly L elements.
Ex:
Input:
N = 5
{1, 2, 3, 4, 5}
{1, 3, 2}
{1, 4, 3, 2, 9}
{2, 3, 4, 7, 9}
{3, 4, 5, 9, 10}
Output:
1. In case K = 3, L = 2
{1, 2}
{1, 3}
{2, 3}
{2, 4}
{3, 4}
{3, 9}
{4, 9}
2. In case K = 3, L = 3
{1, 2, 3}
{2, 3, 4}
How can I do ?
thanks

LingPipe LDA matrix representation

I am trying to extract possible topics from list of tweets and LingPipe LDA seems easy to understand and well documented with code sample.
My challenge is to produce the matrix representation using tweets data. For example,
static String[] WORDS = new String[] {
"river", "stream", "bank", "money", "loan"
};
static final int[][] DOC_WORDS = new int[][] {
{ 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 0, 0, 0 },
{ 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 0, 0 },
{ 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 0 },
{ 0, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4 }
}
The zero at the end of the above matrix is supposed to represent that none of the word in WORDS array is found in the content. However in this representation, it is presumed to be the zero index or the word 'river' is found.
As tweet is short, I am not sure how I can represent the matrix so that it can show the 'absence' of the word too.
Any advice or suggestion of other method is mush appreciated.

Resources