How to remove single quote in python list - arrays

I would like to remove the single quotes from a list
Creating loop:
results = []
for k in range(1,number_of_observation+2):
    results += ['X'+str(k)]
results
Output :
['X1','X2','X3','X4','X5','X6','X7','X8','X9','X10','X11','X12','X13','X14','X15','X16','X17','X18']
Each name in the list actually refers to a numpy array like this:
X1 = array([ 5.29869582e+03, 4.78138124e+03, 4.66993519e+03, 4.63760715e+03,
4.24625776e+03, 6.82121026e-13, 3.67310328e+03, 3.62922983e+03,
4.67551867e+03, -2.01596513e+03, 5.17998388e+03, 0.00000000e+00,
5.44605355e+03, 4.51697631e+03, 4.62300856e+03, 4.44902873e+03])
X2 = array([ 5.15984732e+03, 3.69964719e+03, 4.88607026e+03, 5.06762424e+03,
4.54623661e+03, 9.09494702e-13, 4.04998815e+03, 3.91555776e+03,
5.07698709e+03, -1.11066480e+03, 4.49209767e+03, 4.54747351e-13,
4.97724688e+03, 4.24955479e+03, 4.72048717e+03, 5.58904656e+03])
I want to create a dataframe from them:
Final = pd.DataFrame(data = [results], columns= column_name)
Final
Desired output:
But it gave me output like this:

Use the eval function on each element of the results list. Instead of passing just [results], use a list comprehension as in the image.

You can store each element as a key-value pair, with the name (for example X1) as the key and the list of floating-point values as the value.
From that dictionary you can then iterate over the keys and values separately.
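A minimal sketch of the dictionary approach, assuming the arrays can be stored under their names as they are created. The array contents are stand-in data and number_of_observation is set to a small hypothetical value. The names stay strings (the keys) while the DataFrame is built from the arrays (the values), so there are no quotes to remove.

```python
import numpy as np
import pandas as pd

number_of_observation = 2  # hypothetical small value for illustration

# Map each name ("X1", "X2", ...) to its array instead of keeping only the names.
arrays = {}
for k in range(1, number_of_observation + 2):
    arrays["X" + str(k)] = np.arange(4, dtype=float) * k  # stand-in data

# One row per array; the dictionary keys become the row labels.
final = pd.DataFrame(list(arrays.values()), index=list(arrays.keys()))
print(final)
```

Keeping the arrays in a dictionary also avoids eval, which would look the names up as variables at run time.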

Related

Multiply arrays in a list

I have a list of arrays with the same shape, like this:
my_list = [arr_1, arr_2, arr_3, ...]
arr_1.shape
(1988, 1221)
...
Is there a way to multiply every array in my list and get a final array with the same shape?
I've tried this way but it doesn't work:
for i in my_list:
    arr_final = np.multiply(my_list[i])
The final array should have the same shape as every array in the initial list.
arr_final.shape
(1988, 1221)
You can stack them and take the product along the first axis:
mylist = [np.array([1,2]), np.array([2,3]), np.array([1,4])]
np.stack(mylist).prod(0)
Output:
array([ 2, 24])
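For comparison, a short sketch of the stacking approach next to two equivalents. Small arrays stand in for the (1988, 1221) ones; the variable names are illustrative only.

```python
from functools import reduce

import numpy as np

my_list = [np.array([1, 2]), np.array([2, 3]), np.array([1, 4])]

# Stack into shape (3, 2), then multiply along the new first axis.
stacked = np.stack(my_list).prod(axis=0)

# The ufunc's own reduce does the same over axis 0 of the list.
ufunc_reduced = np.multiply.reduce(my_list)

# Pairwise multiplication never builds the stacked intermediate array.
pairwise = reduce(np.multiply, my_list)

print(stacked)  # [ 2 24]
```

The pairwise version may be preferable when the list holds many large arrays, since it only keeps one running product in memory.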

ValueError: setting an array element with a sequence for incorporating word2vec model in my pandas dataframe

I am getting a "ValueError: setting an array element with a sequence." error when I try to run my random forest classifier on heterogeneous data. The text data is fed to a word2vec model, and I extract a one-dimensional numpy array by taking the mean of the word2vec vectors for each word in the text row.
Here is a sample of the data I am working with:
col-A col-B ..... col-z
100 230 ...... [0.016612869501113892, -0.04279713928699493, .....]
where col-z is the numpy array with fixed size of 300 in each row.
The following code calculates the mean of the word2vec vectors and creates the numpy arrays:
final_data = []
for i, row in df.iterrows():
    text_vectorized = []
    text = row['col-z']
    for word in text:
        try:
            text_vectorized.append(list(w2v_model[word]))
        except Exception as e:
            pass
    try:
        text_vectorized = np.asarray(text_vectorized, dtype='object')
        text_vectorized_mean = list(np.mean(text_vectorized, axis=0))
    except Exception as e:
        text_vectorized_mean = list(np.zeros(100))
        pass
    try:
        len(text_vectorized_mean)
    except:
        text_vectorized_mean = list(np.zeros(100))
    temp_row = np.asarray(text_vectorized_mean, dtype='object')
    final_data.append(temp_row)
text_array = np.asarray(final_data, dtype='object')
After this, I convert text_array to pandas dataframe and concatenate it with my original dataframe with other numeric columns. But as soon as I try to feed this data into a classifier, it gives me the above error at this line:
--> array = np.array(array, dtype=dtype, order=order, copy=copy)
Why am I getting this error?
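You are trying to create an array from a mixed list containing both numeric values and another list. Try to flatten the array first using .ravel(). Note that final_data is a plain list, which has no .ravel() method, so convert it to an array before flattening.
For example,
text_array = np.asarray(final_data, dtype='object').ravel()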
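One common cause of this error is an object column whose cells are themselves arrays, which cannot be converted into a single numeric array. A hedged sketch of one way around it, assuming every vector has the same length: expand the vectors into their own float columns before building the feature matrix. The frame contents and the w2v_ column names are made up for illustration, and 3-dimensional vectors stand in for the 300-dimensional ones.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: two numeric columns plus one column holding a vector per row.
df = pd.DataFrame({
    "col-A": [100, 120],
    "col-B": [230, 250],
    "col-z": [np.array([0.1, 0.2, 0.3]), np.array([0.4, 0.5, 0.6])],
})

# Stack the per-row vectors into a regular 2-D float array of shape (rows, dims).
vectors = np.vstack(df["col-z"].tolist())

# Give each vector component its own column, then join with the numeric columns.
vec_df = pd.DataFrame(
    vectors,
    index=df.index,
    columns=[f"w2v_{i}" for i in range(vectors.shape[1])],
)
features = pd.concat([df.drop(columns="col-z"), vec_df], axis=1)

# A purely numeric matrix like this is what an estimator expects as input.
X = features.to_numpy(dtype=float)
print(X.shape)  # (2, 5)
```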

Trying to append content to numpy array

I have a script that searches Twitter for a certain term and then prints out a number of attributes for the returned results.
I'm trying to append the results to a numpy array, but just a blank array is returned. Any ideas why?
public_tweets = api.search("Trump")
tweets_array = np.empty((0,3))
for tweet in public_tweets:
    userid = api.get_user(tweet.user.id)
    username = userid.screen_name
    location = tweet.user.location
    tweetText = tweet.text
    analysis = TextBlob(tweet.text)
    polarity = analysis.sentiment.polarity
    np.append(tweets_array, [[username, location, tweetText]], axis=0)
print(tweets_array)
The behavior I am trying to achieve is something like:
array = []
array.append([item1, item2, item3])
array.append([item4,item5, item6])
array is now [[item1, item2, item3], [item4, item5, item6]].
But in Numpy :)
np.append doesn't modify the array, you need to assign the result back:
tweets_array = np.append(tweets_array, [[username, location, tweetText]], axis=0)
Check help(np.append):
Note that
append does not occur in-place: a new array is allocated and
filled.
In the second example, you are calling the list's append method, which happens in place; this is different from np.append.
Here's the source code for np.append
In [178]: np.source(np.append)
In file: /usr/local/lib/python3.5/dist-packages/numpy/lib/function_base.py
def append(arr, values, axis=None):
    ....docs
    arr = asanyarray(arr)
    if axis is None:
        .... special case, ravels
    return concatenate((arr, values), axis=axis)
In your case arr is an array, starting with shape (0,3). values is a 3 element list. The rest is just a call to concatenate. So the append call is just:
np.concatenate([tweets_array, [[username, location, tweetText]]], axis=0)
But concatenate works with many items:
alist = []
for ....:
    alist.append([[username, location, tweetText]])
arr = np.concatenate(alist, axis=0)
should work just as well; better because list append is quicker. Or remove a level of nesting and let np.array stack them on a new axis, just as it does with np.array([[1,2,3],[4,5,6],[7,8,9]]):
alist = []
for ....:
    alist.append([username, location, tweetText])
arr = np.array(alist) # or np.stack()
np.append has multiple problems. Wrong name. Doesn't act inplace. Hides concatenate. Flattens without much warning. Limits you to 2 inputs at a time. etc.
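A small self-contained sketch of the points above, with made-up rows standing in for the Twitter results. The dtype=object on the empty array is an assumption chosen here so that it can hold strings.

```python
import numpy as np

# Hypothetical rows standing in for (username, location, tweet text).
rows = [["alice", "Paris", "hello"], ["bob", "Oslo", "hi there"]]

# np.append returns a new array; ignoring the return value changes nothing.
tweets_array = np.empty((0, 3), dtype=object)
np.append(tweets_array, [rows[0]], axis=0)
print(tweets_array.shape)  # (0, 3)

# Assigning the result back grows the array by one row.
tweets_array = np.append(tweets_array, [rows[0]], axis=0)
print(tweets_array.shape)  # (1, 3)

# Collecting rows in a list and converting once avoids reallocating per row.
collected = []
for row in rows:
    collected.append(row)
arr = np.array(collected)
print(arr.shape)  # (2, 3)
```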

How to extract different values/elements of matrix or array without repeating?

I have a vector (or it could be an array):
A = [1,2,3,4,5,1,2,3,4,5,1,2,3]
I want to extract the distinct values/elements from this vector, without repeats:
1,2,3,4,5
B= [1,2,3,4,5]
How can I extract them?
I would appreciate any help.
Try this,
A = [1,2,3,4,5,1,2,3,4,5,1,2,3]
y = unique(A)
B = unique(A) returns the same values as in A but with no repetitions. The resulting vector is sorted in ascending order. A can be a cell array of strings.
B = unique(A,'stable') does the same as above, but without sorting.
B = unique(A,'rows') returns the unique rows of A.
[B,i,j] = unique(...) also returns index vectors i and j such that B = A(i) and A = B(j) (or B = A(i,:) and A = B(j,:)).
Reference: http://cens.ioc.ee/local/man/matlab/techdoc/ref/unique.html
Documentation: https://uk.mathworks.com/help/matlab/ref/unique.html
The other answers are correct, but if you do not want the data sorted, you can use unique with the 'stable' option:
A = [1,2,3,4,5,1,2,3,4,5,1,2,3]
B = unique(A,'stable')

iPython numpy - How to change value of an array slice with a map

I've got a 3-dim array [rows][cols][3] with values between 0 and X.
I need to manipulate a specific dimension in the array. So I've taken a slice of the part I want to manipulate
arr_slice = array[:,:,0]
Now I can make some manipulations like arr_slice *= 3, and that will change the original array, as I intended.
However, I need to change values according to a map, which is an array of size X that maps the values of the slice (0-X) to new values. The map is called mapping.
So I know mapping[arr_slice] will do what I want, but using it like this:
arr_slice = mapping[arr_slice]
will of course change only arr_slice and not the original array.
So, how can I perform this task so that it changes the original array?
The array is actually an image, and I'm trying to manipulate its Y values in YIQ format:
im_eq = np.copy(im_orig)
if (rgb):
    im_eq = rgb2yiq(im_eq)
    im = im_eq[:,:,0]
else:
    im = im_eq
mapping = get_cumutative_histogram(im)
im = mapping[im.astype(int)] # the problematic line
You need to address the slice elements:
im[:] = mapping[im.astype(int)]
For example:
from pylab import *
a = rand(10)
sl = a[4:9]
print(sl) # ->: array([ 0.97278179, 0.7894741 , 0.38051133, 0.42684762, 0.82670638])
sl[:] = 1
print(a) #-> array([ 0.21125781, 0.4235981 , 0.81950229, 0.93937973, 1. , 1. , 1. , 1. , 1. , 0.39047808])
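The same idea applied to the lookup-table case from the question, as a minimal sketch. The tiny image and mapping arrays are made up for illustration; the data is already of integer type, so no astype(int) is needed here.

```python
import numpy as np

# Hypothetical 2x2 "image" with 2 channels; channel 0 holds values 0..3.
image = np.array([[[0, 9], [1, 9]],
                  [[2, 9], [3, 9]]])
mapping = np.array([10, 20, 30, 40])  # lookup table: old value -> new value

channel = image[:, :, 0]  # basic slicing returns a view into image

# Fancy indexing builds a new array; binding it to a name leaves image untouched.
rebound = mapping[channel]
print(image[:, :, 0].tolist())  # [[0, 1], [2, 3]]

# Assigning through [:] writes the mapped values into the view, and so into image.
channel[:] = mapping[channel]
print(image[:, :, 0].tolist())  # [[10, 20], [30, 40]]
```

Writing image[:, :, 0] = mapping[image[:, :, 0]] has the same effect without the intermediate name.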
