Extract Data From NetCDF4 File Using List - arrays

I am using a list of integers corresponding to an x,y index of a gridded NetCDF array to extract specific values, the initial code was derived from here. My NetCDF file has a single dimension at a single timestep, which is named TMAX2M. My code written to execute this is as follows (please note that I have not shown the call of netCDF4 at the top of the script):
# grid point lists
lat = [914]
lon = [2141]
# Open netCDF File
fh = Dataset('/pathtofile/temperaturedataset.nc', mode='r')
# Variable Extraction
point_list = zip(lat,lon)
dataset_list = []
for i, j in point_list:
dataset_list.append(fh.variables['TMAX2M'][i,j])
print(dataset_list)
The code executes, and the result is as follows:
masked_array(data=73,mask=False,fill_value=999999,dtype=int16]
The data value here is correct, however I would like the output to only contain the integer contained in "data". The goal is to pass a number of x,y points as seen in the example linked above and join them into a single list.
Any suggestions on what to add to the code to make this achievable would be great.

The solution to calling the particular value from the x,y list on single step within the dataset can be done as follows:
dataset_list = []
for i, j in point_list:
dataset_list.append(fh.variables['TMAX2M'][:][i,j])
The previous linked example contained [0,16] for the indexed variables, [:] can be used in this case.

I suggest converting to NumPy array like this:
for i, j in point_list:
dataset_list.append(np.array(fh.variables['TMAX2M'][i,j]))

Related

Structuring a for loop to output classifier predictions in python

I have an existing .py file that prints a classifier.predict for a SVC model. I would like to loop through each row in the X feature set to return a prediction.
I am currently trying to define the element from which to iterate over so as to allow for definition of the test statistic feature set X.
The test statistic feature set X is written in code as:
X_1 = xspace.iloc[testval-1:testval, 0:5]
testval is the element name used in the for loop in the above line:
for testval in X.T.iterrows():
print(testval)
I am having trouble returning a basic set of index values for X (X is the pandas dataframe)
I have tested the following with no success.
for index in X.T.iterrows():
print(index)
for index in X.T.iteritems():
print(index)
I am looking for the set of index values, with base 1 if possible, like 1,2,3,4,5,6,7,8,9,10...n
seemingly simple stuff...i haven't located an existing question via stackoverflow or google.
ALSO, the individual dataframes I used as the basis for X were refined with the line:
df1.set_index('Date', inplace = True)
Because dates were used as the basis for the concatenation of the individual dataframes the loops as written above are returning date values rather than
location values as I would prefer hence:
X_1 = xspace.iloc[testval-1:testval, 0:5]
where iloc, location is noted
please ask for additional code if you'd like to see more
the loops i've done thus far are returning date values, I would like to return index values of the location of the rows to accommodate the line:
X_1 = xspace.iloc[testval-1:testval, 0:5]
The loop structure below seems to be working for my application.
i = 1
j = list(range(1, len(X),1)
for i in j:

Move N elements in array from back to front

I have a text file contains 2 columns, I need to select one column of them as an array
which contains 200000 and cut N elements from this array and move them from back to front.
I used the following code:
import numpy as np
import glob
files = glob.glob("input/*.txt")
for file in files:
data_file = np.loadtxt(file)
2nd_columns = data_file [:,1]
2nd_columns_array = np.array(2nd_columns)
cut = 62859 # number of elements to cut
remain_points = 2nd_columns_array[:cut]
cut_points = 2nd_columns_array[cut:]
new_array = cut_points + remain_points
It doesn't work and gave me the following error:
ValueError: operands could not be broadcast together with shapes (137141,) (62859,)
any help, please??
It doesn't work because you are trying to add values stored in both arrays and they have different shapes.
One of the ways is to use numpy.hstack:
new_array = np.hstack((2nd_columns_array[cut:], 2nd_columns_array[:cut]))
Side notes:
with your code you will reorder only 2nd column of the last file since reordering is outside of the for loop
you don't need to store cut_poinsts nor remain_points in separate variables. You can operate directly on the 2nd_columns_array
you shouldn't name variables starting from a number
A simple method for this process is numpy.roll.
new_array = np.roll(2nd_column, cut)

How to save dynamic variable from workspace in a separate file in matlab?

I'm working on a problem where I have an array A of 100 elements.
All these 100 elements are changing with time.
So in my workspace, I only get the final values of all these elements after the entire time cycle has run.
I'm trying to save the values with time in a separate file (.txt or .mat) so that I can access that file in order to check how the variable varies with time.
I'm trying the following command:
save('file.mat','A','-append');
But this command overwrites the existing values in my file.
Kindly suggest me a way to save these values without overwriting them and also guide me how to access them in MATLAB.
You can also change the output filename to be unique for each iteration:
for iter=1:n
A = rand(10);
save(sprintf('file%d.mat',iter), 'A');
end
That way each iteration creates one file.
The reason that saving to a file (even using the -append) flag didn't work is because the variable A already exists in the file and will be over-written each time through the loop. You would need to create a new file or new variable name every time through the loop in order for this to not happen.
Saving the results in a file is probably not the best way to store the time-varying values of A. You would be better off using a cell array to store all intermediate values of A.
A_over_time = cell();
for k = 1:n
%// Get A somehow
A_over_time{k} = A;
end
Depending on what A is, you could also store the values of A in a numeric array or matrix.
%// Using an array
A_over_time = zeros(N, 1);
for k = 1:N
A_over_time(k) = A;
end
%// Using a matrix
A_over_time = zeros(N, numel(A));
for k = 1:N
A_over_time(k,:) = A;
end

vector to string matlab

I am trying to convert a vector which is the following :
A =
[
02376R102 ;
21871B206 ;
81765M106 ;
G3156P103]
into string.
It should give the following result :
['02376R102' ;
'21871B206' ;
'81765M106' ;
'G3156P103']
I can not use functions such as num2str because my vector is composed of both letters and numbers...
The ultimate goal is that I want to use the function mkdir to create directories with names in the vector A.
for i=1:end
mkdir('mypath', A(i))
end
but the mkdir functions need to have strings in A...
Thank you a lot for your help
edit :
Sorry for my mispecification, the array I am working with are CUSIP (firms code created by CRSP database) which I have uploaded with excel. The exact array when I copy paste the array is :
'02376R102'
'21871B206'
'81765M106'
'G3156P103'
Which looks like strings... But when I try the function for directories
i=1:end
mkdir('mypath', A(i))
end
matlab says that argument must contain a string.
The full code is the following :
CUSIP_list = unique(CUSIP)
for i = 1:length(CUSIP_list)
mkdir('C:\Users\Marc-Aurèle\Desktop\MASTER THESIS\DATAS',CUSIP_list(i))
end
You are looking for cells instead of strings, first I wonder how you loaded your data in, because data formats with numbers and string should be loaded in as cells, cell arrays or char arrays:
A = {
'02376R102' ;
'21871B206' ;
'81765M106' ;
'G3156P103'}
This would generate 4 x 1 cells
Alternatively you can use vertcat to concatenate them into 1 cell array
A = vertcat( '02376R102','21871B206')
which would give you 2 x 9 chars
I believe you might want to use the second method, after which you can run your function to create directories.

How to convert MATLAB output into an array?

I've made an M-file that outputs data into my MATLAB command window in the form below (I've deleted the extra spaces). Is there an easy way to convert this output into an array, without having to type all the data into the array editor as I'm currently doing? (Or even run it straight from the M-file into an array?)
T = 0.3000
price = 24.8020
T = 0.4000
price = 28.3453
T = 0.5000
price = 31.3934
T = 0.6000
price = 34.0880
Organize your data into arrays.
For example:
T=0.3:0.1:0.6;
Price=yourfunction(T);
Then if you want a price vs T graph,
plot(T,Price)
If you've got a large amount of data, try to avoid for loops, as they're slower than vectorized code.
At some point in your M-file you are printing each line of data to the command window, presumably using DISP or FPRINTF. You can replace that line with the following:
data = [data; T price];
Where T and price are the variables holding your data. Every time you call the above line (say, in a loop) it will append your data as a new row to the variable data. At some point at the beginning of your M-file, you would therefore have to add the following initialization:
data = []; %# An empty array
Appending values to an array like this can sometimes be inefficient, so if you already know ahead of time how many rows of data you will collect you can instead preallocate data with a given size. For example, if you know you will have 4 pairs of values for T and price, you can initialize data in the following way:
data = zeros(4,2); %# A 4-by-2 array of zeroes
Then, when you add data to the array you would instead do the following:
data(i,:) = [T price]; %# Fill row i with data
Another issue to consider is whether your M-file is a script or a function. A function M-file has a line like function output = file_name(input) at the top, whereas a script M-file does not. Running a script M-file is equivalent to typing the entire contents of the file directly into the MATLAB command window, so all of the variables created in the M-file are available in the workspace.
If you are using a function M-file, all variables created are local to the function, and any you want to use in the workspace will have to be passed as outputs from the function. For example, the top line of your function M-file could look like:
function data = your_file
where your_file is the name of the M-file and data is a variable being returned. When you call this function from the workspace you would then have to capture the output in a variable:
outputData = your_file();
Now you have the contents of the variable data from your_file stored as a new variable outputData in the workspace.
Why do you print out data instead of collecting it in an array?
M = [];
for ...
M(end+1, :) = [T, price];
end;
or, more idiomatically,
M = 0.3:0.1:0.6; % or whatever your T values should be
M = [M' (M'.^2)] % replace the .^2 by your price function

Resources