ValueError: Expected 2D array, got 1D array instead: array=[4. 4. 3. ... 3. 3. 3.] - pivot-table

I tried to compute cosine_similarity between columns of a pivot table that I select with .loc, and I got this error:
ValueError: Expected 2D array, got 1D array instead: array=[4. 4. 3. ... 3. 3. 3.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
I tried to follow the suggestion, but it doesn't work; it says "Cannot compare multidimensional index". Could you help me and explain what I should do?
This is my code:
#delete related_test
for i in range(len(collab_rated_test_fix)):
    ratings_test_df = ratings_test_df.drop(ratings_test_df.loc[(ratings_test_df.user_id == user_id_test) &\
                                           (ratings_test_df.book_id == collab_rated_test_fix[i])]\
                                           .index)
#pivot table
user_book_rating = pd.pivot_table(ratings_test_df, values="rating", index="book_id",columns="user_id")
user_book_rating = user_book_rating.fillna(user_book_rating.mean())
user_id_test = 9923
#cosine similarity
rel_user = pd.DataFrame(columns=['userId_1', 'userId_2','similarity'])
for rows, mean in user_book_rating.iteritems():
    sim = cosine_similarity(user_book_rating.loc[ : , user_id_test], user_book_rating.loc[ : , rows]) #error
    rel_user = rel_user.append({'UserId_1': user_id_test, 'UserId_2': rows, 'similarity': sim}, ignore_index=True)
rel_user['userId_1'] = rel_user['userId_1'].apply(int)
rel_user['userId_2'] = rel_user['userId_2'].apply(int)
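One possible way around the 1D/2D error, sketched on toy data: cosine_similarity expects inputs of shape (n_samples, n_features), so a single user column has to be reshaped to one row, and all users can be compared in one call by transposing the pivot table. The small DataFrame below is a hypothetical stand-in for user_book_rating, not the real data, and the result column names follow the ones declared in the question.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-in for user_book_rating: rows are book_id, columns are user_id.
user_book_rating = pd.DataFrame(
    {9923: [4.0, 3.0, 5.0], 17: [4.0, 3.0, 5.0], 42: [1.0, 5.0, 2.0]},
    index=pd.Index([101, 102, 103], name="book_id"),
)
user_id_test = 9923

# One user is one sample, so reshape the 1D column to shape (1, n_books).
target = user_book_rating[user_id_test].to_numpy().reshape(1, -1)

# Transpose so that every user becomes a row: shape (n_users, n_books).
others = user_book_rating.T.to_numpy()

# Result has shape (1, n_users); take the single row.
sims = cosine_similarity(target, others)[0]

rel_user = pd.DataFrame({
    "userId_1": user_id_test,
    "userId_2": user_book_rating.columns.to_numpy(),
    "similarity": sims,
})
print(rel_user)
```

Computing all similarities in one call also avoids the per-column loop and the row-by-row append, and it keeps the column names consistent (the loop in the question writes 'UserId_1' while the frame declares 'userId_1').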

Related

Assign values to an array of indexes in Python

I have an array of size 300x5. The column with index 3 contains indices, and the column with index 4 contains the corresponding values.
I have created a new array, and I am trying to assign the values from column 4 to the positions given by column 3 in this new array. I tried this, but it throws an error:
new_arr[old_arr[:,3]] = old_arr[:,4]
Here is a small example of what I want to do:
new_arr = np.ones((200,1))
new_arr[[2,3,4]] = [22,44,11]
It throws this error:
ValueError: shape mismatch: value array of shape (3,) could not be broadcast to indexing result of shape (3,1)
With new_arr[old_arr[:,3]] you are indexing new_arr with the values stored in old_arr[:,3], which is why you get an IndexError.
Does this help?
new_arr = np.zeros((300, 5))
new_arr[:,3] = old_arr[:,4]
For the edited question, you need to reshape the values so they match the (3,1) indexing result:
new_arr = np.ones((200,1))
new_arr[[2,3,4]] = np.array([22,44,11]).reshape(3,1)
# OR
# new_arr[2:5] = np.array([22,44,11]).reshape(3,1)
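As an alternative to reshaping, a minimal sketch that indexes the single column explicitly, so both sides of the assignment are 1D. The small old_arr below is a made-up stand-in for the 300x5 array, and the cast to int assumes column 3 holds whole-number indices stored as floats.

```python
import numpy as np

# Hypothetical stand-in for old_arr: column 3 holds target indices,
# column 4 holds the values to store at those indices.
old_arr = np.zeros((3, 5))
old_arr[:, 3] = [2, 3, 4]
old_arr[:, 4] = [22, 44, 11]

new_arr = np.ones((200, 1))

# Index arrays must be integers, so cast the float column first.
idx = old_arr[:, 3].astype(int)

# Selecting column 0 makes the indexing result 1D with shape (3,),
# which matches the shape of the values being assigned.
new_arr[idx, 0] = old_arr[:, 4]

print(new_arr[:6, 0])
```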

ValueError: setting an array element with a sequence for incorporating word2vec model in my pandas dataframe

I am getting "ValueError: setting an array element with a sequence." when I try to run my random forest classifier on heterogeneous data. The text data is fed to a word2vec model, and for each text row I extract a one-dimensional numpy array by taking the mean of the word2vec vectors of its words.
Here is a sample of the data I am working with:
col-A col-B ..... col-z
100 230 ...... [0.016612869501113892, -0.04279713928699493, .....]
where col-z is a numpy array with a fixed size of 300 in each row.
The following code calculates the mean of the word2vec vectors and creates the numpy arrays:
final_data = []
for i, row in df.iterrows():
    text_vectorized = []
    text = row['col-z']
    for word in text:
        try:
            text_vectorized.append(list(w2v_model[word]))
        except Exception as e:
            pass
    try:
        text_vectorized = np.asarray(text_vectorized, dtype='object')
        text_vectorized_mean = list(np.mean(text_vectorized, axis=0))
    except Exception as e:
        text_vectorized_mean = list(np.zeros(100))
        pass
    try:
        len(text_vectorized_mean)
    except:
        text_vectorized_mean = list(np.zeros(100))
    temp_row = np.asarray(text_vectorized_mean, dtype='object')
    final_data.append(temp_row)
text_array = np.asarray(final_data, dtype='object')
After this, I convert text_array to a pandas DataFrame and concatenate it with my original DataFrame, which holds the other numeric columns. But as soon as I try to feed this data into a classifier, it gives me the above error at this line:
--> array = np.array(array, dtype=dtype, order=order, copy=copy)
Why am I getting this error?
You are trying to create an array from a mixed list containing both numeric values and another list. Try flattening the array first using .ravel().
For example (final_data is a plain list, so convert it to an array before calling .ravel()):
text_array = np.asarray(final_data, dtype='object').ravel()
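Another angle, offered as a sketch rather than a confirmed diagnosis: this error typically appears when the rows do not all have the same length, or when a single cell still holds a whole vector. Judging from the code alone, the fallback np.zeros(100) does not match the stated vector size of 300, which could produce rows of different lengths. The example below uses made-up 4-dimensional vectors and a hypothetical helper, mean_vector, to show every row getting the same length and the result being stacked into a plain 2D float array.

```python
import numpy as np

VECTOR_SIZE = 4  # assumption for the toy data; the question states 300


def mean_vector(word_vectors, size=VECTOR_SIZE):
    """Average a list of word vectors; fall back to zeros of the SAME size."""
    if len(word_vectors) == 0:
        return np.zeros(size)
    return np.mean(np.asarray(word_vectors, dtype=float), axis=0)


# Made-up word vectors for two text rows; the second row has no known words.
rows = [
    [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]],
    [],
]

# One row per text, one numeric column per vector component.
features = np.vstack([mean_vector(vectors) for vectors in rows])

print(features.shape)
print(features.dtype)
```

A 2D float array like this can be turned into one DataFrame column per component before concatenating it with the other numeric columns, so that no cell contains a sequence.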

Modifying a numpy array efficiently

I have a numpy array A of size 10 with values ranging from 0 to 4. I want to create a new 2-D array B from it, with its ith column being a vector corresponding to the ith element of A.
For example, the value 1 as the first element of A would correspond to B having the column vector [0,1,0,0,0] as its first column. A having 4 as its third element would correspond to B having [0,0,0,0,1] as its third column.
I have the following code:
import numpy as np
A = np.random.randint(0,5,10)
B = np.ones((5,10))
iden = np.identity(5, dtype=np.float64)
for i in range(0,10):
    a = A[i]
    B[:,i:i+1] = iden[:,a:a+1]
print(A)
print(B)
The code does what it's supposed to do, but I am sure there are more efficient ways of doing this. Can anyone please suggest some?
This can be solved by initializing an array of zeros, then integer-indexing into it with the indices from A and assigning 1s, like so:
M,N = 5,10
A = np.random.randint(0,M,N)
B = np.zeros((M,N))
B[A,np.arange(len(A))] = 1
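A short check, under the assumption that the goal is the one-hot layout described in the question, that the indexing approach gives the same result as the original loop. It also shows a second vectorized variant that selects columns from an identity matrix; the fixed seed is only there to make the run repeatable.

```python
import numpy as np

M, N = 5, 10
rng = np.random.default_rng(0)  # fixed seed, only for repeatability
A = rng.integers(0, M, N)

# Loop version from the question.
B_loop = np.ones((M, N))
iden = np.identity(M)
for i in range(N):
    B_loop[:, i] = iden[:, A[i]]

# Integer-array indexing: row A[i] of column i is set to 1.
B_index = np.zeros((M, N))
B_index[A, np.arange(N)] = 1

# Alternative: pick column A[i] of the identity matrix for each i.
B_eye = np.eye(M)[:, A]

print(np.array_equal(B_loop, B_index), np.array_equal(B_index, B_eye))
```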

iPython numpy - How to change value of an array slice with a map

I've got a 3-dimensional array [rows][cols][3] with values between 0 and X.
I need to manipulate one specific part of the array, so I've taken a slice of the part I want to manipulate:
arr_slice = array[:,:,0]
Now I can perform manipulations like arr_slice *= 3, and that changes the original array, as I intended.
However, I need to change values according to a map: an array of size X, called mapping, that maps the values of the slice (0-X) to new values.
I know mapping[arr_slice] computes what I want, but using it like this:
arr_slice = mapping[arr_slice]
will of course change only arr_slice and not the original array.
So, how can I perform this task so that the original array changes?
The array is actually an image, and I'm trying to manipulate its Y values in YIQ format:
im_eq = np.copy(im_orig)
if (rgb):
    im_eq = rgb2yiq(im_eq)
    im = im_eq[:,:,0]
else:
    im = im_eq
mapping = get_cumutative_histogram(im)
im = mapping[im.astype(int)] # the problematic line
You need to assign to the elements of the slice instead of rebinding the name:
im[:] = mapping[im.astype(int)]
For example:
from pylab import *
a = rand(10)
sl = a[4:9]
print(sl) # ->: array([ 0.97278179, 0.7894741 , 0.38051133, 0.42684762, 0.82670638])
sl[:] = 1
print(a) #-> array([ 0.21125781, 0.4235981 , 0.81950229, 0.93937973, 1. , 1. , 1. , 1. , 1. , 0.39047808])
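The same idea applied to a 3-dimensional array, as a minimal sketch: the tiny integer image and the mapping table below are made up for illustration. The slice is a view, so assigning through channel[:] writes into the original array, while a plain channel = ... would only rebind the name.

```python
import numpy as np

# Made-up 2x2 "image" with 3 channels; channel 0 holds values 0..2.
image = np.array([[[0, 9, 9], [1, 9, 9]],
                  [[2, 9, 9], [1, 9, 9]]])

# Made-up lookup table: old value i is replaced by mapping[i].
mapping = np.array([10, 20, 30])

channel = image[:, :, 0]          # a view into image, not a copy
channel[:] = mapping[channel]     # in-place write through the view

print(image[:, :, 0])
```

If the channel holds floats, as in the question, the lookup index needs an integer cast first, for example mapping[channel.astype(int)].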

Accessing n-dimensional array in R using a function of vector of indexes

My program in R creates an n-dimensional array:
PVALUES = array(0, dim=dimensions)
where dimensions = c(x,y,z, ... )
The dimensions depend on a particular input, so I want to write general-purpose code that will:
1. Store a particular element in the array
2. Read a particular element from the array
From reading this site I learned how to do #2, reading an element from the array:
ll=list(x,y,z, ...)
element_xyz = do.call(`[`, c(list(PVALUES), ll))
Please help me solve #1, that is, storing an element in the n-dimensional array.
Let me rephrase my question.
Suppose I have a 4-dimensional array. I can store a value in this array and read a value from it:
PVALUES[1,1,1,1] = 43 #set a value
data = PVALUES[1,1,1,1] #use a value
How can I perform the same operations using a function of a vector of indexes:
indexes = c(1,1,1,1)
set(PVALUES, indexes) = 43
data = get(PVALUES, indexes) ?
Thank you
Thanks for the helpful response.
I will use the following solution:
dimensions = c(x,y,z,...,n)
PVALUES = array(0, dim=dimensions) #Create an n-dimensional array
Setting a value in PVALUES[x,y,z,...,n]:
y=c(x,y,z,...,n)
PVALUES[t(y)]=26
Reading a value from PVALUES[x,y,z,...,n]:
y=c(x,y,z,...,n)
data=PVALUES[t(y)]
Arrays can be indexed with a matrix that has as many columns as the array has dimensions; each row of the matrix addresses one element:
# Assignment with "[<-"
newvals <- matrix( c( x,y,z,vals), ncol=4)
PVALUES[ newvals[ ,-4] ] <- vals
# Reading values with "["
PVALUES[ newvals[ ,-4] ]
