#1-d interpolation
import numpy as np
from scipy import interpolate
import pylab as py
import pandas as pd
#Read the Dataset from Excel File
def func(x1):
return x1*np.exp(-5.0*x1**2)
dataset=pd.read_excel('Messwerte_FIBRE1.xlsx')
dataset=dataset.drop([0])
index=[1]
index2=[9]
x=dataset.iloc[:, index]
y=dataset.iloc[:, index2]
x1=np.array(x,dtype=float)
y1=np.array(y,dtype=float)
fvals=func(x1)
xnew=np.linespace(430,490,800)
for kind in ['multiquadric','inverse multiquadric','gaussian',
'linear','cubic','quintic','thin-plate']:
newfunc=interpolate.Rbf(x1,fvals,function=kind)
fnew=newfunc(xnew)
I am getting an error:
IndentationError: expected an indented block
Can anyone help me fix this? I am trying to read variables from an Excel file and use RBF interpolation for prediction.
My Excel file is Messwerte_FIBRE1.xlsx.
Right here, the for loop expects the code that follows it to be indented inside its block:
xnew=np.linespace(430,490,800)
for kind in ['multiquadric','inverse multiquadric','gaussian',
             'linear','cubic','quintic','thin-plate']:
    #SOMETHING WOULD GO HERE
    newfunc=interpolate.Rbf(x1,fvals,function=kind)
    fnew=newfunc(xnew)
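For reference, here is a minimal corrected sketch of that fragment (it assumes x1 and fvals are built exactly as in the question, and also fixes np.linespace, which should be np.linspace):
xnew = np.linspace(430, 490, 800)
for kind in ['multiquadric', 'inverse multiquadric', 'gaussian',
             'linear', 'cubic', 'quintic', 'thin-plate']:
    # the loop body must be indented one level
    newfunc = interpolate.Rbf(x1, fvals, function=kind)  # fit an RBF of the given kind
    fnew = newfunc(xnew)                                 # evaluate it on the new grid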
I have a problem that may be obvious to Pythonistas, but I just can't find the answer by googling.
Shortly:
I want to use x from the for x in my_data_header loop as part of my variable name. For example, instead of hardcoding my_data.selected_column, I want to use my_data.x to loop through all columns.
Longer:
I want to make boxplots from scientific data imported from a spreadsheet. One column holds the treatment designations by which I subset the dataset; the others are measurements I want to draw boxplots from. I need to loop through the measurement columns and export the boxplots, so the x of the for loop has to be used for:
* selecting the column (within each treatment), titling the boxplot, naming the exported .png file, ...
I could perform the steps separately, but couldn't compose the loop.
What is the recommended approach for looping through spreadsheet columns in a complex task where you have to refer to column titles? (I will add more information if needed.)
I am trying to switch from RStudio/Markdown/Knit to Python.
Thank you in advance!
It was a pd.DataFrame issue: the read method I had used before imported the data in a non-pandas-compatible form. I paste my working code below in case someone finds it useful; if the SO community finds it irrelevant, just delete it altogether.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
data = pd.read_excel('/media/Data/my_file.xlsx', 0)
h = data.columns  # read the header line
d = pd.DataFrame(data)
print(type(d))  # check that it is <class 'pandas.core.frame.DataFrame'>
for x in h:
    yy = d[x]  # reference to the current column
    bp = sns.boxplot(x='Column_with_treatments', y=yy, data=data)  # make the graph
    fig = bp.get_figure()  # put the created graph into the object fig
    nn = yy.name + '_name_you_want.png'  # create the file name string
    print(nn)  # prints the column name of the current graph to the log
    fig.savefig(nn)  # save the graph image
    # plt.show()  # would show each image, but you need to close it to continue
    plt.clf()  # clear the current graph from memory
I am working on an Information Retrieval project, using Google Colab. I am at the phase where I have computed some features ("input_features") and the labels ("labels") in a for loop, which took me about 4 hours to finish.
So at the end I stored the results in arrays:
input_features = np.array(input_features)
labels = np.array(labels)
So my question would be:
Is it possible to save those results in order to use them for future purposes when using Google Colab?
I have found 2 options that could maybe be applied, but I don't know where these files are created.
1) Save them as CSV files. My code would be:
from numpy import savetxt
# save to csv file
savetxt('input_features.csv', input_features, delimiter=',')
savetxt('labels.csv', labels, delimiter=',')
And in order to load them:
from numpy import loadtxt
# load array
input_features = loadtxt('input_features.csv', delimiter=',')
labels = loadtxt('labels.csv', delimiter=',')
# print the array
print(input_features)
print(labels)
But I still don't get anything back when I print.
2) Save the arrays using pickle, where I followed the instructions from here:
https://colab.research.google.com/drive/1EAFQxQ68FfsThpVcNU7m8vqt4UZL0Le1#scrollTo=gZ7OTLo3pw8M
from google.colab import files
import pickle
def features_pickeled(input_features, results):
    input_features = input_features + '.txt'
    pickle.dump(results, open(input_features, 'wb'))
    files.download(input_features)
def labels_pickeled(labels, results):
    labels = labels + '.txt'
    pickle.dump(results, open(labels, 'wb'))
    files.download(labels)
And to load them back:
def load_from_local():
    loaded_features = {}
    uploaded = files.upload()
    for input_features in uploaded.keys():
        unpickeled_features = uploaded[input_features]
        loaded[input_features] = pickle.load(BytesIO(data))
    return loaded_features
def load_from_local():
    loaded_labels = {}
    uploaded = files.upload()
    for labels in uploaded.keys():
        unpickeled_labels = uploaded[labels]
        loaded[labels] = pickle.load(BytesIO(data))
    return loaded_labes
#How do I print the pickled files to see if I have them ready for use???
When using Python I would do something like this for pickle:
#Create pickle file
with open("name.pickle", "wb") as pickle_file:
    pickle.dump(name, pickle_file)
#Load the pickle file
with open("name.pickle", "rb") as name_pickled:
    name_b = pickle.load(name_pickled)
But the thing is that I don't see any files being created in my Google Drive.
Is my code correct, or am I missing some part of it?
This is a long description, but hopefully it explains in detail what I want to do and what I have done for this issue.
Thank you in advance for your help.
Google Colaboratory notebook instances are never guaranteed to have access to the same resources when you disconnect and reconnect because they are run on virtual machines. Therefore, you can't "save" your data in Colab. Here are a few solutions:
Colab saves your code. If the for loop operation you referenced doesn't take an extreme amount of time to run, just leave the code and run it every time you connect your notebook.
Check out np.save. This function allows you to save an array to a binary file. Then, you could re-upload your binary file when you reconnect your notebook. Better yet, you could store the binary file on Google Drive, mount your drive to your notebook, and reference it like that.
# Mount drive to authenticate yourself to Google Drive
from google.colab import drive
drive.mount('/content/gdrive')
#---
# Import necessary libraries
import numpy as np
from numpy import savetxt
import pandas as pd
#---
# Create array
arr = np.array([1, 2, 3, 4, 5])
# save to csv file
savetxt('arr.csv', arr, delimiter=',') # You will see the result if you click the File icon (left panel)
And then you can load it again by:
# You can copy the path when you find your file in the file icon
arr = pd.read_csv('/content/arr.csv', sep=',', header=None) # You can also save your result as a txt file
arr
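Since the answer mentions np.save, here is a minimal sketch of that route as well (the file names and the gdrive path are assumptions, based on the mount shown above): save the arrays as binary .npy files on the mounted drive and load them back in a later session.
# Save the arrays as binary .npy files on the mounted Google Drive
np.save('/content/gdrive/My Drive/input_features.npy', input_features)
np.save('/content/gdrive/My Drive/labels.npy', labels)
# In a later session, mount the drive again and load them back
input_features = np.load('/content/gdrive/My Drive/input_features.npy')
labels = np.load('/content/gdrive/My Drive/labels.npy')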
I have written code which gives the maximum value from each array. I want to store all these values in a single list, but I am getting several different lists with different values. How do I create a single list of these values?
import numpy as np
import re
import gdal
import ogr
from osgeo import gdal_array
from gdalconst import *
import os
import osr
import shutil
import h5py
import subprocess
import operator
#np.set_printoptions(threshold=np.inf)
indir = '/home/vigna/Documents/LST_29_03'
directory=os.fsencode(indir)
e=1
for file in os.listdir(directory):
    file1=os.fsdecode(file)
    if '_Gadchiroli' in file1:
        b=str(e)+'_Gadchiroli.tif'
        #print(file1)
        gadchiroli=gdal.Open(b,GA_ReadOnly)
        gadchiroli_arry=gadchiroli.ReadAsArray()
        a=np.amax(gadchiroli_arry)
        L=[]
        L.append(a)
        print(L)
        e+=1
It gives the result shown in the picture; I need all these values in a single list.
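The separate lists appear because L=[] re-creates an empty list on every pass through the loop. A minimal sketch of the fix (assuming the same setup as above) is to create the list once, before the loop, and only append inside it:
L = []  # create the list once, outside the loop
e = 1
for file in os.listdir(directory):
    file1 = os.fsdecode(file)
    if '_Gadchiroli' in file1:
        b = str(e) + '_Gadchiroli.tif'
        gadchiroli = gdal.Open(b, GA_ReadOnly)
        gadchiroli_arry = gadchiroli.ReadAsArray()
        L.append(np.amax(gadchiroli_arry))  # collect every maximum in the same list
        e += 1
print(L)  # one list holding the maximum of each array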
For example, consider this dataset:
(1)
https://archive.ics.uci.edu/ml/machine-learning-databases/annealing/anneal.data
Or
(2)
http://data.worldbank.org/topic
How does one load such external datasets into scikit-learn to do anything with them?
The only kind of dataset loading I have seen in scikit-learn is through a command like:
from sklearn.datasets import load_digits
digits = load_digits()
You need to learn a little pandas, which is a data frame implementation in Python. Then you can do:
import pandas
my_data_frame = pandas.read_csv("/path/to/my/data")
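For the first dataset linked in the question, for example, a short sketch (it assumes the file is comma-separated with no header row, hence header=None):
import pandas
# read the UCI annealing data directly from the URL
anneal = pandas.read_csv(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/annealing/anneal.data",
    header=None)
print(anneal.shape)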
To create model matrices from your data frame, I recommend the patsy library, which implements a model specification language similar to R formulas:
import patsy
y, X = patsy.dmatrices("my_response ~ my_model_formula", my_data_frame)
The resulting design matrices can then be passed as y and X into the various sklearn models.
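For instance, a minimal sketch (LinearRegression is just an illustrative choice of model):
from sklearn.linear_model import LinearRegression
import numpy as np
model = LinearRegression()
model.fit(X, np.ravel(y))  # patsy returns y as a column matrix, so flatten it
predictions = model.predict(X)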
Simply run the following command, replacing 'EXTERNALDATASETNAME' with the name of a dataset that scikit-learn provides a fetcher for:
import sklearn.datasets
data = sklearn.datasets.fetch_EXTERNALDATASETNAME()
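For datasets without a dedicated fetcher, a sketch using the OpenML fetcher (available in recent scikit-learn versions) can pull many public datasets by name; 'credit-g' here is just an illustrative OpenML dataset:
from sklearn.datasets import fetch_openml
data = fetch_openml('credit-g', version=1, as_frame=True)
X, y = data.data, data.target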
I need to save and load an array, but I get this error:
cv.Save('i.xml',i)
TypeError: Cannot identify type of 'structPtr'
This is the code:
import cv
i = [[1,2],[3,4],[5,6],[7,8]]
cv.Save('i.xml',i)
That's because cv.Save needs to receive the object to be stored in the file as an OpenCV object. For example, the following is a minimal workable example that saves a numpy array in a file using cv.Save:
import cv2
import numpy as np
i = np.eye(3)
cv2.cv.Save('i.xml', cv2.cv.fromarray(i))
Note that, when reading the file back, the array should likewise be converted from OpenCV back to numpy.
Regards.
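A sketch of the reverse direction, assuming the same legacy cv API as above:
loaded = cv2.cv.Load('i.xml')   # comes back as an OpenCV matrix
i_back = np.asarray(loaded)     # convert it back to a numpy array
print(i_back)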