How to save a tensor - tensorflow.js

I have a dataset of 1000 items. I normalize the data before I train the model against it.
I would now like to use the model to make predictions. From what I understand, I need to normalize the inputs I feed to the model in the same way, which requires the mean and std calculated at training time.
While I am able to print them to the console, how does one "save" them to be used later? I am trying to understand how to save the mean and std used to normalize the training data, so that I can use them again when making predictions.

I determined that we could first get the array representation of the tensor through:
// tensor here is the tensor variable that contains the tensor
const tensorAsArray = tensor.arraySync()
and then, we save it to a file like any other string
fs.writeFileSync(myFilePath, JSON.stringify(tensorAsArray), 'utf-8')
To read it back and use it as a tensor, we would do the opposite:
const tensorAsArray = JSON.parse(fs.readFileSync(myFilePath, 'utf-8'))
const tensor = tf.tensor(tensorAsArray)
This allowed me to save the mean and std for use later.
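The same round-trip works in any language; here is a minimal Python sketch of the idea (the training values and file name are made up for illustration):

```python
import json
import os
import statistics
import tempfile

# Hypothetical training data; compute the normalization statistics once.
train = [1.0, 2.0, 3.0, 4.0]
stats = {"mean": statistics.mean(train), "std": statistics.pstdev(train)}

# Save the statistics like any other JSON string...
path = os.path.join(tempfile.mkdtemp(), "stats.json")
with open(path, "w") as f:
    json.dump(stats, f)

# ...and later, at prediction time, read them back to normalize new inputs.
with open(path) as f:
    loaded = json.load(f)
normalized = (2.5 - loaded["mean"]) / loaded["std"]  # 0.0 for the mean itself
```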

Related

How to create an Array (IDataHolder) in Thinkscript? (Thinkorswim)

I am trying to create an irregular volume scanner on Thinkorswim using Thinkscript. I want to create an array of volume's in past periods so that I can compare them to the current period's volume (using fold or recursion). However, while the Thinkorswim documentation details what is called an IDataHolder datatype, which is an array of data, I cannot find how one can actually create one, as opposed to just referencing the historical data held by Thinkorswim. Here is the documentation: https://tlc.thinkorswim.com/center/reference/thinkScript/Data-Types/IDataHolder
I have tried coding something as simple as this to initialize an array:
def array = [];
This throws an error. I have tried different types of brackets, changing any possible syntax issues, etc.
Is this possible in the Thinkscript language? If not, are there any workarounds? If not even that, is there a third party programming interface that I could use to pull data from Thinkorswim and get a scanner that way? Thanks for any help.
IDataHolder represents data such as close, open, volume, etc., that is held across multiple bars or ticks. You can reference one of these pre-defined data series, or you can create your own using variables: def openPlus5cents = open + 0.05, say, would be an IDataHolder-type value.
There's no way to create an array in the usual programming sense, as you've found, so you'll have to be a little creative. Perhaps you could do the comparison within the fold itself: volume[1] > volume, or the like? Maybe post another question with an example of the comparison you're trying to do.

Working with CSV image data to perform CNN in Julia - format problem

I am trying to make a convolutional neural network for the MNIST sign language dataset. It is provided in CSV format where each row is one picture and there are 784 columns, each referring to a single pixel (the pictures are 28x28).
My problem is that in order to perform the algorithm I need to transpose my data to a different format, the same as is the format of a built-in ML dataset fashion MNIST, which is:
Array{Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8,8}},2},1}
I would like to end up with the following format, where my data is joined with the encoded labels:
Array{Tuple{Array{Float32,4},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}},1}
I was trying to use the reshape function to convert it to a 4-dimensional array, but all I get is:
7172×28×28×1 Array{Float64,4}
My labels are in the following (correct) format:
25×7172 Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}
I understand that somehow the proper data format is an array of arrays while my data is a single array with 4 dimensions, but I can't figure out how to change that.
I am new to Julia and some of the code I am using has been written by someone else.

How to pass in 2 different length arrays as one training unit in Tensorflow?

I am currently learning Tensorflow, and as part of this I'm building a neural network. As input, it'll take a 42-length array and a 7-length array, and as output it'll produce either a 7-length array or a single digit; it doesn't matter. I want to have 69 hidden layers.
Somehow, I need to train my Tensorflow model on a bunch of groups of 42-and-7-length arrays, but I'm not sure how to group them.
I currently have a large array like this:
myLargeArray = numpy.ndarray([[[42-length-array],[7-length-array]],[[42-length-array],[7-length-array]],[[42-length-array],[7-length-array]],[[42-length-array],[7-length-array]],[[42-length-array],[7-length-array]](and so on)]
How can I pass in each one of the grouped arrays into my Tensorflow model? I can't quite understand how because all of the input data in the Tensorflow tutorials is processed from CSVs.
Looks like you want to feed different arrays to different tensors. In this case, there's no need to combine the two into a single myLargeArray.
Here's an example MNIST classifier that also recognizes a digit. It uses x of length 784 and y of length 10, but the idea is the same.
# None is for batch processing
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10]) # one-hot encoded
[...]
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
The arrays batch_xs and batch_ys have different shapes, and both are fed to the session. The output of the model is a probability distribution over the digits, i.e., also of length 10.
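Relating this back to the question's myLargeArray: instead of one nested array, you can keep two parallel batches, one per placeholder. A minimal NumPy sketch with made-up data (the lengths 42 and 7 are taken from the question):

```python
import numpy as np

# Made-up stand-ins for the question's pairs of a 42-length and a
# 7-length array; five training examples here.
pairs = [(np.zeros(42), np.ones(7)) for _ in range(5)]

# Split the pairs into the two batches the two placeholders expect.
batch_xs = np.stack([x for x, _ in pairs])  # shape (5, 42)
batch_ys = np.stack([y for _, y in pairs])  # shape (5, 7)
```

Each batch can then be passed to its own tensor via feed_dict, exactly as in the MNIST example above.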

Why does pyserial read as b'number'

I want to save those values in a file.txt. When the program saves them, it saves b'number'. I'd like to plot those values, but I can't with b'number'; I just want the number saved.
You need to transform your input data into numbers, because in your case pyserial is sending binary data, not ready-made numeric values. Byte order also matters: you should specify whether it is big-endian, little-endian, native, etc.
If you are using Python 3.x, your job can be done this way:
values = [int.from_bytes(binary_number, 'little') for binary_number in binary_data]
And plot your values.
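For a concrete picture, here is that conversion run on made-up byte strings (this assumes the device really does send raw little-endian integers; if it sends ASCII text like b'123', you would use int(b) instead):

```python
# Made-up readings, as pyserial would return them: bytes objects.
binary_data = [b'\x2a', b'\x00\x01', b'\xff']

# Interpret each bytes object as a little-endian unsigned integer.
values = [int.from_bytes(b, 'little') for b in binary_data]
print(values)  # [42, 256, 255]
```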
Hope this helps.

Saving parts of Matlab cell array

I am using Matlab for some data collection, and I want to save the data after each trial (just in case something goes wrong). The data is organized as a cell array of cell arrays, basically in the format
data{target}{trial} = zeros(1000,19)
But the actual data gets up to >150 MB by the end of the collection, so saving everything after each trial becomes prohibitively slow.
So now I am looking at opting for the matfile approach (http://www.mathworks.de/de/help/matlab/ref/matfile.html), which would allow me to only save parts of the data. The problem: this doesn't support cells of cell arrays, which means I couldn't change/update the data for a single trial; I would have to re-save the entire target's data (100 trials).
So, my question:
Is there another different method I can use to save parts of the cell array to speed up saving?
(OR)
Is there a better way to format my data that would work with this saving process?
A not very elegant but possibly effective solution is to use trial as part of the variable name. That is, use not a cell array of cell arrays (data{target}{trial}), but just different cell arrays such as data_1{target}, data_2{target}, where 1, 2 are the values of the trial counter.
You could do that with eval: for example
trial = 1; % change this value in a for loop
eval([ 'data_' num2str(trial) '{target} = zeros(1000,19);']); % fill data_1{target}
You can then save the data for each trial in a different file. For example, this
eval([ 'save temp_save_file_' num2str(trial) ' data_' num2str(trial)])
saves data_1 in file temp_save_file_1, etc.
Update:
Actually it does appear to be possible to index into cell arrays, just not into cells nested inside cell arrays. Hence, if you store your data slightly differently, it seems you can use matfile to update only part of it. See this example:
x = cell(3,4);
save x;
matObj = matfile('x.mat','writable',true);
matObj.x(3,4) = {eye(10)};
Note that this gives me a version warning, but it seems to work.
Hope this does the trick. However, still look into the next part of my answer as it may help you even more.
For calculations it is usually not required to save to disk after every iteration. An easy way to get a speedup (at the cost of a little more risk) is to save only after every n trials.
Like this for example:
maxTrial = 99;
saveEvery = 10;
for trial = 1:maxTrial
    myFun; % Do your calculations here
    if trial == maxTrial || mod(trial, saveEvery) == 0
        save % Put your save command here
    end
end
If your data is always at (or within) a certain size, you can also choose to store your data in a matrix rather than a cell array, then you can use indexing to save only part of the file.
In response to @Luis, I will post another way to deal with the situation.
It is indeed an option to save data in named variables or files, but saving a named variable in a named file seems like too much.
If you only change the name of the file, you can save everything without using eval:
assuming you are dealing with trial 't':
filename = ['temp_save_file_' num2str(t)];
If you really want, you can use sprintf to write the number as 001, for example.
Now you can simply use this:
save(filename, 'myData')
To read the data back, construct the filename again and do something like this:
totalData = {}; %Initialize your total data
And then read them as you wrote them (inside a loop):
load(filename)
totalData{t} = myData;
