How to write a random array (with no spatial reference) to geotiff format?

The following MATLAB script generates random locations within a 300x400 array and codes those locations with values from 1-12. How can I convert this non-spatial array to a geotiff? I hope to use the geotiff output to perform some trial analyses. Any projected coordinate system (e.g. UTM) would do for this analysis.
I have tried using geotiffwrite() without success using the following implementation:
out = geotiffwrite('C:\path\to\file\test.tif', m)
Which yields the following error:
>> test
Error using geotiffwrite
Too many output arguments.
EDIT:
The main problem I am encountering is a lack of inputs into the geotiffwrite() function. I am unsure how to deal with this problem. For example, I have no A or R variable because the array has no spatial reference. As long as each pixel is georeferenced somewhere, I do not care what the spatial reference is. The purpose of this is to create a sample dataset that I can experiment with using MATLAB spatial functions.
% Generate a totally black image to start with.
m = zeros(300, 400, 'uint8');
% Generate 1000 random locations.
numRandom = 1000;
linearIndices = randi(numel(m), 1, numRandom);
% Set those locations to be "white".
m(linearIndices) = randi(12, [numel(linearIndices) 1]);
% Display it. Random locations will appear white.
image(m);
colormap(gray);

I believe your question has a very simple answer: skip the output variable when you call geotiffwrite. That is, use:
geotiffwrite('C:\path\to\file\test.tif', m)
Instead of
out = geotiffwrite('C:\path\to\file\test.tif', m)
Here is a working example that uses geotiffwrite, taken from the documentation. As you can see, there is no output variable:
basename = 'boston_ovr';
imagefile = [basename '.jpg'];
RGB = imread(imagefile);
worldfile = getworldfilename(imagefile);
R = worldfileread(worldfile, 'geographic', size(RGB));
filename = [basename '.tif'];
geotiffwrite(filename, RGB, R)
figure
usamap(RGB, R)
geoshow(filename)
Update:
According to the documentation, you need at least three input arguments. The supported syntaxes are:
geotiffwrite(filename,A,R)
geotiffwrite(filename,X,cmap,R)
geotiffwrite(...,Name,Value)
From documentation:
geotiffwrite(filename,A,R) writes a georeferenced image or data grid,
A, spatially referenced by R, into an output file, filename.
See the geotiffwrite documentation for details on how to use the function.
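As a rough illustration of what a spatial referencing object such as R encodes, here is a minimal sketch in Python (not MATLAB, and not the geotiffwrite API). It assumes an arbitrary upper-left corner and a 30 m cell size in some projected system such as UTM. The constants and the helper names pixel_center and extent are hypothetical and chosen only for illustration.

```python
# Hypothetical referencing for a 300 x 400 grid in a projected CRS (units: metres).
N_ROWS, N_COLS = 300, 400
CELL = 30.0                          # assumed cell size
X_MIN, Y_MAX = 500000.0, 4500000.0   # assumed upper-left corner of the grid


def pixel_center(row, col):
    """Return map (x, y) of the centre of a zero-based (row, col); row 0 is the top."""
    if not (0 <= row < N_ROWS and 0 <= col < N_COLS):
        raise IndexError("pixel outside the grid")
    x = X_MIN + (col + 0.5) * CELL
    y = Y_MAX - (row + 0.5) * CELL   # y decreases as the row index grows
    return x, y


def extent():
    """Return (x_min, x_max, y_min, y_max) of the whole grid."""
    return (X_MIN, X_MIN + N_COLS * CELL, Y_MAX - N_ROWS * CELL, Y_MAX)


top_left = pixel_center(0, 0)
bottom_right = pixel_center(N_ROWS - 1, N_COLS - 1)
grid_extent = extent()
print(top_left, bottom_right, grid_extent)
```

Any corner and cell size would do for a trial dataset, since the only requirement stated in the question is that each pixel is georeferenced somewhere.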

Related

Structuring a for loop to output classifier predictions in python

I have an existing .py file that prints a classifier.predict for a SVC model. I would like to loop through each row in the X feature set to return a prediction.
I am trying to define the loop variable to iterate over so that I can use it to define the test statistic feature set X.
The test statistic feature set X is written in code as:
X_1 = xspace.iloc[testval-1:testval, 0:5]
testval is the loop variable in the for loop below:
for testval in X.T.iterrows():
    print(testval)
I am having trouble getting a basic set of index values for X (X is the pandas DataFrame).
I have tested the following with no success:
for index in X.T.iterrows():
    print(index)
for index in X.T.iteritems():
    print(index)
I am looking for the set of index values, 1-based if possible, like 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ..., n.
This seems simple, but I haven't found an existing question via Stack Overflow or Google.
Also, the individual dataframes I used as the basis for X were refined with the line:
df1.set_index('Date', inplace = True)
Because dates were used as the basis for concatenating the individual dataframes, the loops as written above return date values rather than the row positions I need for:
X_1 = xspace.iloc[testval-1:testval, 0:5]
where iloc selects by integer position.
Please ask for additional code if you'd like to see more.
The loop structure below seems to be working for my application.
j = list(range(1, len(X) + 1))  # 1-based row positions 1..n
for i in j:
    X_1 = xspace.iloc[i-1:i, 0:5]  # row i, selected by position
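A minimal sketch of the same idea using enumerate, which yields a 1-based position alongside each row even when the index holds dates. The small DataFrame, its column names, and the positions and rows lists are made up for illustration; they stand in for the X and xspace from the question.

```python
import pandas as pd

# Hypothetical stand-in for X: a date-indexed feature table.
X = pd.DataFrame(
    {"f1": [0.1, 0.2, 0.3], "f2": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
)
X.index.name = "Date"

positions = []
rows = []
# enumerate(..., start=1) gives 1, 2, 3, ... regardless of the index labels.
for testval, (date, row) in enumerate(X.iterrows(), start=1):
    X_1 = X.iloc[testval - 1:testval, 0:2]  # one-row DataFrame, selected by position
    positions.append(testval)
    rows.append(X_1)

print(positions)
```

Each X_1 keeps its date label in the index, so the prediction for a row can still be matched back to its date.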

Pytables Value Error (rank of the appended object and "..."EArray differ)

I am trying to use PyTables to store my image dataset. I am using an EArray to append each image as it is read. The dimensions of my EArray and image are the same (except for the first, along which appending is done). I am using the following code:
atom = Atom.from_dtype(np.dtype(np.uint32,(278,278,1)))
i=0
for <read each image from folder using os into img>:
    im = cv2.imread(img.path,0)
    im = np.expand_dims(im,2) #this is because keras needs 3d images and grayscale images are 2d
    if not i:
        X = data.create_earray(dataGroup,"X",atom,(0,)+im.shape,chunkshape=(20,20,20,1))
    X.append(np.expand_dims(im,0)) #as appending require same dim.
    i=1
But when I run the code, it gives me a ValueError saying that my object's rank is 1 and X's rank is 4. How is that possible when I am setting the shape of X using im? I even tried printing the shape of im, and it gives (278,278,1). So, what is the problem? I am using PyTables for the first time, so I don't know it in depth.
Adding a second answer with a more complicated write method plus an EArray.read example. Frankly, I prefer my simpler method (above) to create the EArray with obj= defined and let PyTables handle the data structures. However, if you prefer to manage this yourself, see example 2 (below). Key items to note:
The Atom definition has 4 dimensions, with axis 0 set to zero (this defines the direction that will be extended).
im = np.expand_dims(im,0) is not done until AFTER im.shape is referenced in the definition of the EArray shape at creation.
[UPDATED CODE BELOW]
import tables as tb, numpy as np
data = tb.open_file("image_data1.h5", mode='w')
dataGroup = data.create_group(data.root, 'MyData')
MyAtom = tb.Atom.from_dtype(np.dtype(np.uint32,(0,278,278,1)))
im = np.arange(278*278).reshape((278,278))
im = np.expand_dims(im,2)
X = data.create_earray(dataGroup,"X", MyAtom, (0,)+im.shape)
im = np.expand_dims(im,0)
X.append( im )
print ('flavor =', X.flavor )
print ('dim=', X.ndim, ', rows = ', X.nrows)
im = np.arange(278*278,278*278+278*278).reshape((278,278))
im = np.expand_dims(im,2)
im = np.expand_dims(im,0)
X.append( im )
print ('dim=', X.ndim, ', rows = ', X.nrows)
data.close()
Here are the lines you need to read the data from EArray X (with a couple of print statements to verify values in the corners). This should work so long as the EArray flavor is Numpy (as it is in my example). You can also use the out= parameter to specify a NumPy array to receive the output data. There are other methods to access EArray data, including .iterrows() to iterate, and .__getitem__() to slice with fancy indexing. Read the Pytables documentation if you want to do any of these.
Y_1 = X.read( 0 )
print (Y_1[0,0,0])
print (Y_1[-1,-1,-1])
Y_2 = X.read( 1 )
print (Y_2[0,0,0])
print (Y_2[-1,-1,-1])
First, note that you don't have to create the EArray before you load the first image dataset. Pytables is smart enough to determine the atom and shape definition from the first object.
It was hard for me to exercise your code without a complete example and your data. So, I created a very simple example that uses np.arange() to create a couple of (278,278) image arrays, then extends them in the 2 and 0 directions. Hopefully this mimics the data you are trying to load to the EArray. The 2 Pytables functions (file.create_earray and earray.append) create 2 rows of data, 1 for each "image". After running, open image_data1.h5 with HDFView and inspect the data.
Maybe this will help you understand how to load your images to HDF5 Earrays:
import tables as tb, numpy as np
data = tb.open_file("image_data1.h5", mode='w')
dataGroup = data.create_group(data.root, 'MyData')
im = np.arange(278*278).reshape((278,278))
im = np.expand_dims(im,2)
im = np.expand_dims(im,0)
X = data.create_earray( dataGroup,"X",obj=im )
print ('dim=', X.ndim, ', rows = ', X.nrows)
im = np.arange(278*278, 278*278+278*278).reshape((278,278))
im = np.expand_dims(im,2)
im = np.expand_dims(im,0)
X.append( im )
print ('dim=', X.ndim, ', rows = ', X.nrows)
data.close()

Extract Data From NetCDF4 File Using List

I am using a list of integers corresponding to an x,y index of a gridded NetCDF array to extract specific values; the initial code was derived from here. My NetCDF file has a single variable at a single timestep, named TMAX2M. My code is as follows (note that the netCDF4 import at the top of the script is not shown):
# grid point lists
lat = [914]
lon = [2141]
# Open netCDF File
fh = Dataset('/pathtofile/temperaturedataset.nc', mode='r')
# Variable Extraction
point_list = zip(lat,lon)
dataset_list = []
for i, j in point_list:
    dataset_list.append(fh.variables['TMAX2M'][i,j])
print(dataset_list)
The code executes, and the result is as follows:
[masked_array(data=73, mask=False, fill_value=999999, dtype=int16)]
The data value here is correct, however I would like the output to only contain the integer contained in "data". The goal is to pass a number of x,y points as seen in the example linked above and join them into a single list.
Any suggestions on what to add to the code to make this achievable would be great.
To get the value at each x,y point for a single timestep in the dataset, you can do the following:
dataset_list = []
for i, j in point_list:
    dataset_list.append(fh.variables['TMAX2M'][:][i,j])
The previously linked example used [0,16] to index the variable; [:] can be used in this case.
I suggest converting to a NumPy array like this (assuming import numpy as np):
for i, j in point_list:
    dataset_list.append(np.array(fh.variables['TMAX2M'][i,j]))
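If the goal is a list of plain Python integers, one option is to unwrap each element with .item(). The sketch below uses a small numpy masked array as a stand-in for fh.variables['TMAX2M'][:], so it runs without a NetCDF file. The grid values, the points list, and the helper name value_at are assumptions made for illustration, and returning None for masked cells is just one possible convention.

```python
import numpy as np

# Hypothetical stand-in for the TMAX2M grid: int16 data with one masked cell.
data = np.arange(12, dtype=np.int16).reshape(3, 4)
mask = np.zeros((3, 4), dtype=bool)
mask[1, 1] = True
grid = np.ma.masked_array(data, mask=mask)


def value_at(grid, i, j):
    """Return the cell as a plain Python scalar, or None if it is masked."""
    v = grid[i, j]
    if np.ma.is_masked(v):
        return None
    return v.item()


points = [(0, 1), (1, 1), (2, 3)]   # (row, col) pairs, made up for this example
values = [value_at(grid, i, j) for i, j in points]
print(values)
```

Note that in Python 3 zip(lat, lon) is a one-shot iterator, so wrap it in list(...) if point_list needs to be looped over more than once.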

How can I store a list of ggplots to use in multiplot without overwriting previous plots?

I want to plot some heatmaps of covariance/correlation matrices in a multiplot using an object created from another function (the cd parameter below). The covariance matrices are stored in an array of 3 dimensions, so that cd$covmat[,,i] calls the ith covariance matrix.
Originally I had an issue with the same plot being replicated, which I discovered was an environment issue. I've tried resolving this several ways, with the code below being the most recent, but I can't figure out why it's not reading it properly.
Is there a particular reason for this? I've tried including and excluding the environment parameter (which I hopefully shouldn't need), and I've tried directly using cd$covmat[,,i] in the aes() call.
drawCovs<-function(cd,ncols){
require(ggplot2)
coords=expand.grid(x=1:cd$q,y=1:cd$q)
climits = c(-1,1)*max(cd$covmat)
cd$levels=c(cd$levels,"Total")
covtext=ifelse(!(cd$use.cor),'Covariance','Correlation')
plots=list()
cmat=list()
for (i in 1:(nlevels+1)){
cmat[[i]]<-cd$covmat[,,i]
.e<-environment
plots[[i]]<-ggplot(environment=.e)+geom_tile(aes(x=coords$x,y=coords$y,
fill=as.numeric(cmat[[i]]),color='white'))+
scale_fill_gradient(covtext,low='darkblue',high='red',limits=climits)+ylab('')
+xlab('')+guides(color='none')+scale_x_discrete(labels=cd$varnames,
limits=1:cd$q, expand=c(0,0))+scale_y_discrete(labels=cd$varnames,
limits=1:cd$q, expand=c(0,0))+theme(axis.text.x = element_text(angle = 90,
hjust = 1))+labs(title=paste0(covtext,"s of data, ",cd$levels[i]))
}
multiplot(plotlist=plots,cols=ncols)
}
If you end up trying to fix things with direct calls to environments, you are probably overcomplicating your code. Here's a simple snippet that may serve as a core for your function:
drawCovs <- function(cd, ncols) {
  require(ggplot2)
  require(reshape2)
  plots <- list()
  cmat <- list()
  for (i in seq_along(cd$covmat)) {
    cmat[[i]] <- cd$covmat[[i]]
    # Melt only the i-th matrix so each plot shows its own data.
    plots[[i]] <- ggplot(melt(cmat[[i]]), aes(x=Var1, y=Var2, fill=value)) +
      geom_tile(color='white')
  }
  multiplot(plotlist=plots, cols=ncols)
}
cd <- list()
cd$covmat <- list(matrix(runif(25), 5), matrix(runif(25), 5))
drawCovs(cd, 1)

Creating multi-dimensional NetCDF in matlab

I am attempting to create a four dimensional NetCDF structure of integers using matlab. This is my code so far...
mode = netcdf.getConstant('NETCDF4');
mode = bitor(mode,netcdf.getConstant('CLASSIC_MODEL'));
ncid = netcdf.create('USTEC_01_01_2010.nc',mode);
latDimId = netcdf.defDim(ncid,'latitude',51);
longDimId = netcdf.defDim(ncid,'longitude',101);
satDimId = netcdf.defDim(ncid,'satellite',33);
timeDimId = netcdf.defDim(ncid,'time',96);
varid = netcdf.defVar(ncid,'TECgrid','int',[latDimId longDimId satDimId timeDimId]);
My question is...How do I go about using putVar to insert values at specific four dimensional positions? FYI, this is my first time using NetCDF. Thanks in advance! -Dom
Which version do you have?
If you have a later version, look at these functions: nccreate and ncwrite.
Or:
netcdf.endDef(ncid);
% Write one specific value to the last position.
% See help netcdf.putVar. start is zero based.
% start argument's order corresponds to dimension definition above.
netcdf.putVar(ncid,varid,[50 100 32 95], 10);
netcdf.close(ncid);
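Because the start argument is zero-based, a 1-based position has to be shifted by one in every dimension before it is passed to putVar. The following Python sketch (not MATLAB) shows that bookkeeping for the four dimensions defined in the question. The helper name to_zero_based_start is hypothetical, and the dimension sizes are copied from the defDim calls.

```python
# Dimension sizes in definition order: latitude, longitude, satellite, time.
DIM_SIZES = (51, 101, 33, 96)


def to_zero_based_start(position):
    """Convert a 1-based (lat, lon, sat, time) position to a zero-based start vector."""
    if len(position) != len(DIM_SIZES):
        raise ValueError("position must have one entry per dimension")
    start = []
    for p, size in zip(position, DIM_SIZES):
        if not 1 <= p <= size:
            raise IndexError("position %d is outside 1..%d" % (p, size))
        start.append(p - 1)
    return start


last = to_zero_based_start((51, 101, 33, 96))
first = to_zero_based_start((1, 1, 1, 1))
print(first, last)
```

The last position maps to the start vector used in the putVar call above.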
