So I have a .csv file which has 3 value types: Time, Torque and Angle.
The values are saved in separate columns like this:
Row 1: Time1, Torque1, Angle1, ..., TimeX, TorqueX, AngleX.
The rest of the rows are the values (745 values per instance and variable).
So first I transposed the DataFrame and separated it into the different variables.
time = df_trans.iloc[::3, :]
torque = df_trans.iloc[1::3, :]
angle = df_trans.iloc[2::3, :]
So now I have DataFrames with the times, torques and angles of the X instances (5 instances in this example). Each instance has its own row, and the columns hold the values of that instance.
For example print(angle):
0 ... 745
01.01.1990 01:33:51 Angle 0 ... 5225,68408203125
01.01.1990 01:35:09 Angle 0 ... 5186,560546875
01.01.1990 01:35:58 Angle 0 ... 3794,25634765625
01.01.1990 01:37:11 Angle 0 ... 3230,36791992188
01.01.1990 01:37:57 Angle 0 ... 3794,13012695313
[5 rows x 746 columns]
Now I want to plot the angles against the torque of the Instance for all instances in one plot, but I can't seem to get it working.
The code:
import pandas as pd
df = pd.read_csv("Kurven.csv", sep=';', header=[0])
df_trans = df.transpose()
time = df_trans.iloc[::3, :]
time
torque = df_trans.iloc[1::3, :]
torque
angle = df_trans.iloc[2::3, :]
angle
Then I tried plotting:
import matplotlib.pyplot as plt
plt.plot(angle,torque)
I got this error:
TypeError: unhashable type: 'numpy.ndarray'
Then I tried:
import matplotlib.pyplot as plt
for i, j in angle,torque:
    plt.plot(i,j)
I got this error:
ValueError: too many values to unpack (expected 2)
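One possible way to get all instances into a single plot, offered as a sketch rather than a tested fix for this exact file: the printout shows decimal commas (5225,68...), so the values are probably being read as strings, and passing decimal=',' to read_csv is assumed to turn them into floats. The rows of angle and torque can then be paired with zip and plotted one instance at a time. The small inline CSV below is a made-up stand-in for Kurven.csv, and its column names are hypothetical.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for Kurven.csv: two instances, three samples each.
csv_text = """Time1;Torque1;Angle1;Time2;Torque2;Angle2
0,0;0,10;0,0;0,0;0,20;0,0
0,1;0,55;12,5;0,1;0,70;10,0
0,2;1,25;25,0;0,2;1,40;20,5
"""

# decimal=',' makes pandas parse "12,5" as the float 12.5 instead of a string.
df = pd.read_csv(io.StringIO(csv_text), sep=";", decimal=",")
df_trans = df.transpose()

torque = df_trans.iloc[1::3, :]
angle = df_trans.iloc[2::3, :]

fig, ax = plt.subplots()
# zip pairs the i-th angle row with the i-th torque row (one pair per instance).
for (name, a), (_, t) in zip(angle.iterrows(), torque.iterrows()):
    ax.plot(a.to_numpy(dtype=float), t.to_numpy(dtype=float), label=name)
ax.set_xlabel("Angle")
ax.set_ylabel("Torque")
ax.legend()
```

Calling plt.show() afterwards displays the figure. The loop in the question fails because `for i, j in angle, torque` iterates over the tuple (angle, torque) and tries to unpack each whole DataFrame into two names, whereas zip walks both frames row by row.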
This is my first post here, so I'm not quite sure what the proper form for asking a question is. I tried to include pictures of the results, but since this is my first post, the website tells me that I need 10 positive posts for some credibility, so I think my charts don't appear. Also, I'm French and not perfectly bilingual. Please be indulgent; I'm open to all comments and suggestions. I really need this for my master's project. Thank you very much!
I have two arrays which each contain thousands of values. One (x_1_3) holds all the temperature values, and the other (y_0_100) contains only 0's and 100's, one for each temperature in x_1_3, sorted.
x_1_3 = array([[ 2.02],
[ 2.01],
[ 3.08],
...,
[ 0.16],
[ 0.17],
[-2.12]])
y_0_100 = array([ 0., 0., 0., ..., 100., 100., 100.])
The 0's in y_0_100 represent solid precipitation and the 100's represent liquid precipitation. I just want to plot a logistic regression line across my values.
(I also tried to put the values in a DataFrame, but it didn't work.)
dfsnow_rain
AirTemp liquid%
0 2.02 0.0
1 2.01 0.0
2 3.08 0.0
3 3.05 0.0
4 4.89 0.0
... ... ...
7526 0.78 100.0
7527 0.40 100.0
7528 0.16 100.0
7529 0.17 100.0
7530 -2.12 100.0
7531 rows × 2 columns
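import numpy as np
import matplotlib.pyplot as plt
from scipy.special import expit
from sklearn import linear_model
X = x_1_3
y = y_0_100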
# Fit the classifier
clf = linear_model.LogisticRegression(C=1e5)
clf.fit(X, y)
# and plot the result
plt.figure(1, figsize=(10, 5))
plt.clf()
plt.scatter(X.ravel(), y, color='black', zorder=20)
X_test = np.linspace(-15, 15, 300)
loss = expit(X_test * clf.coef_ + clf.intercept_).ravel()
plt.plot(X_test, loss, color='red', linewidth=3)
ols = linear_model.LinearRegression()
ols.fit(X, y)
plt.plot(X_test, ols.coef_ * X_test + ols.intercept_, linewidth=1)
#plt.axhline(1, color='.5')
plt.ylabel('y')
plt.xlabel('X')
plt.xticks(range(-10, 10))
plt.yticks([0, 100, 10])
plt.ylim(0, 100)
plt.xlim(-10, 10)
plt.legend(('Logistic Regression Model', 'Linear Regression Model'),
           loc="lower right", fontsize='small')
plt.tight_layout()
plt.show()
[Image: chart results]
When I zoom in, I realise that my logistic regression line is not flat; it is a curve confined to a very small range (see picture below).
[Image: chart when zoomed in]
I would like something more like this:
[Image: the logistic regression chart I would like]
What am I doing wrong here? I just want to plot a regression line across my values from y = 0 to y = 100.
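A likely cause, offered as a sketch and not checked against the real data: expit returns values between 0 and 1, while the labels and the y axis run from 0 to 100, so the red curve is squashed into the bottom 1% of the plot. Multiplying the expit output by 100 puts the curve on the same scale as the labels. The synthetic temperatures below are an assumption standing in for x_1_3 and y_0_100.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.special import expit
from sklearn import linear_model

# Synthetic stand-in for x_1_3 / y_0_100 (assumed data, not the real measurements).
rng = np.random.default_rng(0)
temps = rng.uniform(-10, 10, 500)
# Warmer temperatures are more likely to be liquid (100), colder ones solid (0).
is_liquid = rng.random(500) < expit(1.5 * (temps - 1.0))
X = temps.reshape(-1, 1)
y = np.where(is_liquid, 100.0, 0.0)

clf = linear_model.LogisticRegression(C=1e5)
clf.fit(X, y)

X_test = np.linspace(-15, 15, 300)
# expit gives the probability of the "100" class in [0, 1];
# scaling by 100 puts it on the same axis as the 0/100 labels.
liquid_pct = 100 * expit(X_test * clf.coef_ + clf.intercept_).ravel()

fig, ax = plt.subplots(figsize=(10, 5))
ax.scatter(X.ravel(), y, color="black", s=5, zorder=20)
ax.plot(X_test, liquid_pct, color="red", linewidth=3)
ax.set_xlim(-10, 10)
ax.set_ylim(0, 100)
ax.set_yticks(range(0, 101, 10))  # a tick every 10 units from 0 to 100
```

Note also that plt.yticks([0, 100, 10]) in the question places ticks at exactly 0, 100 and 10; range(0, 101, 10) is probably what was intended.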
Suppose I have a huge set of data in a .txt file with the following structure: the first and second columns represent a discrete two-dimensional domain, and the third column holds the values calculated at each point of the discrete X and Y axes. An example is given below:
x y z
-1 -1 100
-1 0 50
-1 1 100
0 -1 50
0 0 0
0 1 50
1 -1 100
1 0 50
1 1 100
It seems stupid, but I've been struggling to turn this data into vectors and matrices like X = [-1, 0, 1], Y = [-1, 0, 1] and Z = [[100, 50, 100], [50, 0, 50], [100, 50, 100]]. I went through many techniques and methods using numpy, but couldn't manage to make it work!
As a bonus: would turning this data into vectors and matrices, as I described, be a good way to plot it as a 3D scatter or 3D contour plot using matplotlib?
To plot a scatter plot, you will not need to do anything with your data. Just plot the three columns as they are.
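As a minimal sketch of that, assuming a matplotlib version recent enough (3.2 or later) that the "3d" projection is registered without an extra import, the three columns can be passed straight to a 3D scatter. The inline array stands in for the loaded .txt file.

```python
import matplotlib.pyplot as plt
import numpy as np

# The nine example points from the question (columns: x, y, z).
data = np.array([
    [-1, -1, 100], [-1, 0, 50], [-1, 1, 100],
    [0, -1, 50],   [0, 0, 0],   [0, 1, 50],
    [1, -1, 100],  [1, 0, 50],  [1, 1, 100],
], dtype=float)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
# No gridding needed: each row is one (x, y, z) point.
ax.scatter(data[:, 0], data[:, 1], data[:, 2])
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
```

Calling plt.show() afterwards displays the figure.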
To get the values you want in your question, you can take the unique elements of the x and y columns and reshape the z column according to those dimensions.
u="""x y z
-1 -1 100
-1 0 50
-1 1 100
0 -1 50
0 0 0
0 1 50
1 -1 100
1 0 50
1 1 100"""
import io
import numpy as np
data = np.loadtxt(io.StringIO(u), skiprows=1)
x = np.unique(data[:,0])
y = np.unique(data[:,1])
Z = data[:,2].reshape(len(x), len(y))
print(Z)
prints
[[100. 50. 100.]
[ 50. 0. 50.]
[100. 50. 100.]]
Differing y coordinates are now along the second axis of the array, which is rather unusual for plotting with matplotlib.
Hence, to get the values gridded for plotting with contour, you will need to do the same reshaping to all three columns and transpose (.T) them.
import matplotlib.pyplot as plt
X = data[:,0].reshape(len(x), len(y)).T
Y = data[:,1].reshape(len(x), len(y)).T
Z = data[:,2].reshape(len(x), len(y)).T
plt.contour(X,Y,Z)
plt.show()
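The reshape above relies on the rows being sorted by x and then by y, and on the grid being complete. If that is not guaranteed, one alternative (a sketch, not part of the original answer) is to let np.unique report which grid index each row belongs to and fill Z by index, so row order no longer matters.

```python
import numpy as np

# Same nine points as above, deliberately shuffled (columns: x, y, z).
data = np.array([
    [0, 0, 0],
    [1, -1, 100],
    [-1, 1, 100],
    [0, -1, 50],
    [1, 1, 100],
    [-1, -1, 100],
    [0, 1, 50],
    [-1, 0, 50],
    [1, 0, 50],
], dtype=float)

# return_inverse gives, for every row, its position in the sorted unique values.
x, ix = np.unique(data[:, 0], return_inverse=True)
y, iy = np.unique(data[:, 1], return_inverse=True)

# Start from NaN so any missing grid point stays visible as a gap.
Z = np.full((len(x), len(y)), np.nan)
Z[ix, iy] = data[:, 2]
print(Z)
```

Here Z[i, j] is the value at (x[i], y[j]), so for contour plotting the transposed array would be passed, for example plt.contour(x, y, Z.T).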
I need to create a three-dimensional array in R that contains the data of a raster with a resolution of 538x907 pixels. I have this raster for each hour of one month, so for January there are 744 raster files. I have to change some values in R and then want to combine them back into an array that can be processed by the package ncdf4. Therefore I need a three-dimensional array that looks like prectdata[1:538, 1:907, 1:744] (the first and second dimensions for the x and y dimensions of the raster, the third for the time dimension). How do I concatenate the 744 raster matrices into a three-dimensional array for the ncdf4 package?
The raster package has a function called as.array which should do just what you want:
library(raster)
# single raster
r <- raster(matrix(runif(538*907),nrow =538))
# stack them
rstack <- do.call(stack,lapply(1:744,function(x) r))
# structure
> rstack
class : RasterStack
dimensions : 538, 907, 487966, 744 (nrow, ncol, ncell, nlayers)
resolution : 0.001102536, 0.001858736 (x, y)
extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
coord. ref. : NA
# convert to array
arr <- as.array(rstack)
# check dimensions
> dim(arr)
[1] 538 907 744
I have a pivot table array with factors and X and Y coordinates such as the one below, and I have a lookup table with 64 colours that have RGB values. I have assigned a colour to each factor combination using a dictionary of tuples, but I am having a hard time figuring out how to compare the keys of my dictionary (which are the different combinations of factors) to my array, so that each row with a given factor combination is assigned the colour given in the dictionary.
This is an example of the Pivot Table:
A B C D Xpoint Ypoint
0 1 0 0 20 20
0 1 1 0 30 30
0 1 0 0 40 40
1 0 1 0 50 50
1 0 1 0 60 60
EDIT: This is an example of the LUT:
R G B
0 0 0
1 0 103
0 21 68
95 173 58
and this is an example of the dictionary that was made:
{
(0, 1, 0, 0): (1, 0, 103),
(0, 1, 1, 0): (12, 76, 161),
(1, 0, 1, 0): (0, 0, 0)
}
This is the code that I have used:
import numpy as np
from PIL import Image, ImageDraw
## load in LUT of 64 colours ##
LUT = np.loadtxt('LUT64.csv', skiprows=1, delimiter=',')
print(LUT)
## load in XY coordinates ##
PivotTable = np.loadtxt('PivotTable_2017-07-13_001.txt', skiprows=1, delimiter='\t')
print(PivotTable)
## Bring in image ##
IM = Image.open("mothTest.tif")
#bring in number of factors
numFactors = 4
#assign colour vectors to factor combos
iterColours = iter(LUT)
colour_dict = dict() # size will tell you how many colours will be used
for entry in PivotTable:
    key = tuple(entry[0:numBiomarkers])
    if key not in colour_dict:
        colour_dict[key] = next(iterColours)
print(colour_dict)
Is there a way to compare the tuples in this dictionary to the rows in the pivot table array, or maybe there is a better way of doing this? Any help would be greatly appreciated!
If your goal is, as I suppose in my comment above, to trace the colours back to the tuples, then you have already done everything. But I do not see what role the tif file plays... Please note that I corrected the reference to the non-existent numBiomarkers variable.
import numpy as np
from PIL import Image, ImageDraw
## load in LUT of 64 colours ##
LUT = np.loadtxt('LUT64.csv', skiprows=1, delimiter=',')
print(LUT)
## load in XY coordinates ##
PivotTable = np.loadtxt('PivotTable_2017-07-13_001.txt', skiprows=1, delimiter=',')
print(PivotTable)
## Bring in image ##
IM = Image.open("Lenna.tif")
#bring in number of factors
numFactors = 4
#assign colour vectors to factor combos
iterColours = iter(LUT)
colour_dict = dict() # size will tell you how many colours will be used
for entry in PivotTable:
    key = tuple(entry[0:numFactors])
    if key not in colour_dict:
        colour_dict[key] = next(iterColours)
print(colour_dict)
print('====')
# look up the colour of every row via its factor combination
for entry in PivotTable:
    key = tuple(entry[0:numFactors])
    print(str(entry) + ' ' + str(colour_dict[key]))
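For reference, the same two-pass lookup can be sketched without the input files. The inline arrays below are stand-ins copied from the example tables in the question, and the assumption that Xpoint and Ypoint sit in columns 4 and 5 comes from that example layout. The names pivot_table, lut and points are illustrative only.

```python
import numpy as np

# Stand-ins for the two files, taken from the examples in the question.
pivot_table = np.array([
    [0, 1, 0, 0, 20, 20],
    [0, 1, 1, 0, 30, 30],
    [0, 1, 0, 0, 40, 40],
    [1, 0, 1, 0, 50, 50],
    [1, 0, 1, 0, 60, 60],
])
lut = np.array([
    [0, 0, 0],
    [1, 0, 103],
    [0, 21, 68],
    [95, 173, 58],
])
num_factors = 4

# First pass: hand out LUT colours in order of first appearance.
colours = iter(lut)
colour_dict = {}
for row in pivot_table:
    key = tuple(int(v) for v in row[:num_factors])
    if key not in colour_dict:
        colour_dict[key] = tuple(int(c) for c in next(colours))

# Second pass: look up each row's colour and keep it next to its coordinates.
points = [
    (int(row[4]), int(row[5]),
     colour_dict[tuple(int(v) for v in row[:num_factors])])
    for row in pivot_table
]
for x, y, rgb in points:
    print(x, y, rgb)
```

Converting keys and colours to plain int tuples keeps the dictionary hashable and gives RGB values in a form that drawing functions generally accept.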
Can you please add a short example of LUT64.csv and of PivotTable_2017-07-13_001.txt? For the latter, you should maybe also use a delimiter other than \t to make your examples portable.
Regards
I have a CSV file with X values in the first column, and Y values in the first row, with Z values in the middle like so:
** 39 40 41 42
0.004 2.1802 2.1937 2.2144 2.2379
0.25 1.2409 1.2622 1.2859 1.3073
0.5 1.0538 1.02572 1.04857 1.07059
0.75 0.9479 0.96999 0.98699 1.00675
I can import this into Maple as a matrix, but the Maple Statistics Fit command requires X to be in the first column, Y in the second column, and Z in the third column, like so:
0.004 39 2.1802
0.004 40 2.1937
0.004 41 2.2144
Is there a way to create the second matrix as Maple wants, or is there a call command for Statistics[Fit] that will allow me to insert the first matrix?
Supposing A is your imported matrix, I create a matrix of three columns X, Y, and Z named XYZ thus:
n:= LinearAlgebra:-RowDimension(A):
m:= LinearAlgebra:-ColumnDimension(A):
XYZ:= Matrix([seq(seq([A[i,1],A[1,j],A[i,j]], j= 2..m), i= 2..n)]):