I have an array which stores the information of a 20x20 black and white image.
int array[][] = new int[20][20];
If the pixel at a given point is black, for example at (0,5), I store the value 1 in the array.
array[0][5] = 1;
I am trying to create a dataset so I can feed it into a neural network. I was wondering if there is a way to reduce the size of input values (20x20 = 400) by compressing the information.
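One possible way to shrink the input, sketched here under the assumption that losing some fine detail is acceptable, is to average non-overlapping 2x2 blocks, which turns the 20x20 grid into a 10x10 grid (100 inputs instead of 400). The function name `downsample` and the block size are illustrative choices, not something from the question.

```python
def downsample(image, block=2):
    """Average non-overlapping block x block tiles of a 2D 0/1 image."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows, block):
        pooled_row = []
        for c in range(0, cols, block):
            # Collect the pixels of one tile (clipped at the image border).
            tile = [image[rr][cc]
                    for rr in range(r, min(r + block, rows))
                    for cc in range(c, min(c + block, cols))]
            pooled_row.append(sum(tile) / len(tile))
        pooled.append(pooled_row)
    return pooled

# Same setup as in the question: a 20x20 image with one black pixel at (0,5).
image = [[0] * 20 for _ in range(20)]
image[0][5] = 1

small = downsample(image)                      # 10x10 grid of values in [0, 1]
features = [v for row in small for v in row]   # flat input vector of length 100
```

Each value is now the fraction of black pixels in its block rather than a strict 0 or 1, so the network still sees roughly where the ink is.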
I am trying to figure out how to load a PNG image and create a matrix containing the RGB values of each pixel. I am currently using the following method to load the file and get the various RGB values:
def to_pixels
  File.open(@file, 'r') do |file|
    byte_block = file.read
    byte_block.each_byte do |byte|
      @pixels << byte
    end
  end
end
From my understanding, each pixel contains 3 bytes representing the R, G, and B values. I initially tried taking the output array @pixels and grouping it into sub-groups of 3 elements, assuming that the pixel order and the RGB value of each pixel were preserved in my output. E.g.:
@pixels = @pixels.each_slice(3).to_a
The length of the array that I created was nearly the same as the total number of pixels in my original image, so I was encouraged. However, when I used ChunkyPNG to take my RGB pixel array and print it back to an image, the result looked like random color noise. Could some of the bytes being read into @pixels represent metadata? Or could the bytes be ordered not as the R, G, then B values of individual pixels, but as, for example, all the R bytes, then all the G bytes, then all the B bytes?
I would like to figure out how to transform the byte array into an array of arrays grouping RGB values of the image in some logical order (start at top left and work across to the right, or start in top left and work down, etc)
The chunky_png gem can do this. https://github.com/wvanbergen/chunky_png
Something like:
img = ChunkyPNG::Image.from_file(@file)
img.pixels.each do |pixel|
  puts ChunkyPNG::Color.to_hex(pixel) # would spit out a hex string like "#b8e1f6ff"
end
There are a number of other methods if you want different formats: to_grayscale, to_grayscale_alpha_bytes, to_grayscale_bytes, to_hex, to_hsb, to_hsl, to_hsv, to_s, to_truecolor_alpha_bytes, to_truecolor_bytes.
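As for why slicing the raw file bytes into triples gives noise: a PNG file is not a flat list of RGB bytes. It starts with an 8-byte signature followed by chunks (IHDR, IDAT, IEND and others), and the pixel data inside IDAT is zlib-compressed, with a filter byte at the start of every scanline. The following is a minimal Python sketch of that layout, using only the standard library. The helper names `make_png`, `read_chunks` and `decode_rgb` are made up for illustration, and the decoder only handles 8-bit RGB, non-interlaced images whose scanlines all use filter type 0, so it is not a general PNG reader.

```python
import struct
import zlib

def make_png(width, height, rgb_rows):
    """Build a minimal 8-bit RGB PNG (filter type 0 on every scanline)."""
    def chunk(kind, data):
        body = kind + data
        return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))
    # Each scanline is one filter byte followed by the R, G, B bytes of its pixels.
    raw = b"".join(b"\x00" + bytes(v for px in row for v in px) for row in rgb_rows)
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw)) + chunk(b"IEND", b""))

def read_chunks(png_bytes):
    """Yield (type, data) for each chunk after the 8-byte signature."""
    pos = 8
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        kind = png_bytes[pos + 4:pos + 8]
        yield kind, png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC

def decode_rgb(png_bytes):
    """Return rows of (R, G, B) tuples; assumes 8-bit RGB and filter type 0."""
    chunks = list(read_chunks(png_bytes))
    width, height = struct.unpack(">II", chunks[0][1][:8])
    raw = zlib.decompress(b"".join(d for k, d in chunks if k == b"IDAT"))
    stride = 1 + 3 * width
    rows = []
    for y in range(height):
        line = raw[y * stride:(y + 1) * stride]
        assert line[0] == 0, "only filter type 0 is handled in this sketch"
        rows.append([tuple(line[1 + 3 * x:4 + 3 * x]) for x in range(width)])
    return rows

pixels = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
png = make_png(2, 2, pixels)
decoded = decode_rgb(png)
```

The file is much longer than the 12 bytes of pixel data it carries, and the pixel bytes only become visible after decompression, which is why a library such as chunky_png is the practical route.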
The problem I need to face right now is that I have a certain amount of bytearrays with different sizes. I want to put them into one single bytearray that is way larger than all the other bytearrays together so that voids occur inside of this large bytearray where no bytes have been put in.
The bytearrays are distributed randomly over the large bytearray but no bytearray may collide with another bytearray that has already been put there.
Is there an efficient way to do this random distribution without collisions?
With the following function:
from random import randint, shuffle
import itertools
import operator

def random_distribution(image, files):
    shuffle(files)
    available_size = len(image) - sum(map(len, files))
    gap_positions = sorted([randint(0, available_size) for i in range(len(files))])
    gap_deltas = itertools.starmap(operator.sub, zip(gap_positions, [0] + gap_positions))
    position = 0
    for file, gap_delta in zip(files, gap_deltas):
        position += gap_delta
        image[position : position + len(file)] = file
        position += len(file)
    return image
We randomly distribute 3 files into an image of 20 bytes, 10 times (the files are bytes objects so that this also runs on Python 3):
>>> files = [b'abc', b'1234', b'ABCDE']
>>> for i in range(10):
...     print(random_distribution(bytearray(20), files).replace(b'\x00', b'.').decode())
...
.abc.1234...ABCDE...
..ABCDE.abc....1234.
abcABCDE......1234..
..1234...ABCDE...abc
..1234.....abc.ABCDE
...abc.1234..ABCDE..
abc...ABCDE....1234.
.ABCDE..abc1234.....
....ABCDE..1234.abc.
1234..ABCDE.abc.....
The idea is to first calculate the free space by subtracting the total size of the files from the size of the image, and then randomly slice up the free space by picking random positions within the free space (which I call "gap positions"). The deltas between the gap positions will become the gaps between the files, so when we copy the files into the image, we skip the corresponding gap delta before we place a file behind the end of the last file.
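To make the gap arithmetic concrete, here is a small deterministic sketch that takes the gap positions as an argument instead of drawing them at random and returns only the start offset of each file. The function name `placements` is made up for this illustration.

```python
def placements(file_lengths, gap_positions):
    """Return the start offset of each file, given sorted gap positions."""
    # Delta between consecutive gap positions = free bytes skipped before each file.
    deltas = [b - a for a, b in zip([0] + gap_positions[:-1], gap_positions)]
    offsets = []
    position = 0
    for length, delta in zip(file_lengths, deltas):
        position += delta          # skip the gap
        offsets.append(position)   # the file starts here
        position += length         # move past the file
    return offsets

# Files of length 3, 4 and 5 in a 20-byte image leave 8 free bytes.
# Gap positions 1, 2, 5 inside those 8 bytes give gaps of 1, 1 and 3.
offsets = placements([3, 4, 5], [1, 2, 5])
```

With these inputs the files start at offsets 1, 5 and 12, which is the layout `.abc.1234...ABCDE...`. Because every file starts at or after the end of the previous one, and the gaps sum to at most the free space, no two files can overlap and the last file always fits.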
I have a series of 2D CT images and I have been able to read them into MATLAB using "imread". The issue, however, is that I need the images read in as a single 3D matrix rather than a stack of several 2D matrices. I have been made aware that it is possible to use the number of 2D layers as the 3rd dimension, but I have no idea how to do this, as I am still a learner.
The code I have for reading in the 2D stack is as follows:
a = dir('*.tif');
for i = 1: numel(a)
    b = imread(a(i).name); %read in the image
    b_threshold = graythresh(b); %apply threshold
    b_binary = im2bw(b, b_threshold); %binarize image
    [m, n] = size(b); %compute the size of the matrix
    phi(i) = ((m*n) - sum((b_binary(:))))/(m*n); %compute the fraction of pore pixels in the image
    phi(:,i) = phi(i); %store each of the above result
end
I have added just a single image, although several of these are needed. Nevertheless, one can easily duplicate the image to create a stack of 2D images. For the code to work, however, it is important to name the files in numerical order.
Any help, suggestions, or ideas are welcome. Thanks!
You can simply assign along the third dimension, using i as your index:
stack_of_images(:,:,i) = b_binary
Well, the first piece of advice is to avoid using i and j as variable names in MATLAB, because they are predefined as the imaginary unit (have a look here and here).
After that, it depends on which dimension you want to store the 2D images along.
If you want to store the images along the first dimension, just use this code:
a = dir('*.tif');
for ii = 1: numel(a)
    b = imread(a(ii).name); %read in the image
    b_threshold = graythresh(b); %apply threshold
    b_binary = im2bw(b, b_threshold); %binarize image
    [m, n] = size(b); %compute the size of the matrix
    phi(ii) = ((m*n) - sum((b_binary(:))))/(m*n); %compute the fraction of pore pixels in the image
    phi(:,ii) = phi(ii); %store each of the above result
    matrix_3D_images(ii,:,:)=b_binary; %adding a new layer
end
If you want to store the images along another dimension, that is easy to do: just change the position of the index ii:
matrix_3D_images(:,ii,:)=b_binary; or
matrix_3D_images(:,:,ii)=b_binary;
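For readers who think more easily in Python, the same stacking idea can be sketched with plain nested lists. This is only an illustration under stated assumptions: the slices are already in memory as 2D lists of gray values, a fixed threshold is used in place of the Otsu threshold that graythresh computes, and the names `pore_fraction` and `stack_slices` are made up.

```python
def pore_fraction(binary):
    """Fraction of zero-valued (pore) pixels in a 2D 0/1 image."""
    total = sum(len(row) for row in binary)
    return (total - sum(map(sum, binary))) / total

def stack_slices(slices, threshold):
    """Binarize each 2D slice and collect the results in one 3D nested list."""
    volume, phi = [], []
    for gray in slices:
        # Assumption: a fixed threshold, not Otsu's method.
        binary = [[1 if v > threshold else 0 for v in row] for row in gray]
        volume.append(binary)              # the slice index becomes the first dimension
        phi.append(pore_fraction(binary))  # one pore fraction per slice
    return volume, phi

slices = [[[0, 200], [50, 255]],
          [[10, 20], [30, 40]]]
volume, phi = stack_slices(slices, threshold=128)
```

Here `volume[k][r][c]` addresses row r, column c of slice k, which corresponds to storing the slices along the first dimension as in the MATLAB code above.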
I need to update image 1 with rgb values from image 2 for specific coordinates.
I have two 2d matrices (im1Cart_toupdate 2x114056 and im2EstCart_tocopyfrom also 2x114056). These contain the ordered x-y pairs for which I want to copy rgb values from image 2 to image 1.
i.e. there are 114,056 pixels where I want to copy colours across.
im1 (440x1370x3) and im2 (240x320x3) are the image arrays. Note im2 is going to be stretched, so some pixels from im2 will appear more than once in im2EstCart_tocopyfrom.
I need an efficient way of doing this, as even with the above image sizes my current implementation is very slow. I had thought that there may be some approach using sub2ind - but am not sure how to do this with 3d arrays.
Here's my current code. It's the for loop that's killing me!
%Create a matrix of all pixel coordinates in im1 (homogenised form)
[im1gridx im1gridy]=meshgrid(1:im1width,1:im1height);
im1Cart = [im1gridx(:) im1gridy(:)]';
im1Hom = [im1Cart; ones(1,numel(im1gridy))];
%transform pixel positions with homography (HEst is a matrix built
%elsewhere) to find where they are in the coordinates of image 2
im2EstHom = HEst*im1Hom;
im2EstCart = im2EstHom(1:2,:)./repmat(im2EstHom(3,:),2,1);
im2EstCart = round(im2EstCart);
%check if the the transformed position is within the boundary of image 2
validCoords = im2EstCart(1,:)>0 & im2EstCart(2,:)>0 & im2EstCart(1,:)<=im2width & im2EstCart(2,:)<=im2height;
im1Cart_toupdate=im1Cart(:,validCoords);
im2EstCart_tocopyfrom=im2EstCart(:,validCoords);
%copy colour from image 2 to image 1 - currently pixel by pixel
%but CAN THIS BE VECTORISED?
for i=1:size(im1Cart_toupdate,2)
    im1y=im1Cart_toupdate(1,i);
    im1x=im1Cart_toupdate(2,i);
    im2y=im2EstCart_tocopyfrom(1,i);
    im2x=im2EstCart_tocopyfrom(2,i);
    im1(im1y,im1x,:) = im2(im2y,im2x,:);
    drawnow
end
Many thanks for any advice!
Approach #1
This would be one vectorized approach using linear indexing with bsxfun -
[m2,n2,r2] = size(im2);
RHS_idx1 = (im2EstCart_tocopyfrom(2,:)-1)*m2 + im2EstCart_tocopyfrom(1,:)
RHS_allidx = bsxfun(@plus,RHS_idx1(:),(0:r2-1)*m2*n2)
[m1,n1,r1] = size(im1);
LHS_idx1 = (im1Cart_toupdate(2,:)-1)*m1 + im1Cart_toupdate(1,:)
LHS_allidx = bsxfun(@plus,LHS_idx1(:),(0:r1-1)*m1*n1)
im1(LHS_allidx) = im2(RHS_allidx)
Approach #2
Here's another approach that reshapes the input 3D array to a 2D array after merging the first two dimensions, then using linear indexing for extracting and setting values and finally reshaping back to its original 3D size, like so -
[m2,n2,r2] = size(im2)
RHS_idx1 = (im2EstCart_tocopyfrom(2,:)-1)*m2 + im2EstCart_tocopyfrom(1,:)
im2r = reshape(im2,[],r2)
[m1,n1,r1] = size(im1)
LHS_idx1 = (im1Cart_toupdate(2,:)-1)*m1 + im1Cart_toupdate(1,:)
im1r = reshape(im1,[],r1)
im1r(LHS_idx1,:) = im2r(RHS_idx1,:)
im1 = reshape(im1r,size(im1));
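Both approaches rest on MATLAB's column-major layout: in an m-by-n-by-p array, the element at 1-based position (r, c, k) has linear index r + (c-1)*m + (k-1)*m*n. The Python sketch below spells out that arithmetic on flat lists. It uses explicit loops, so it illustrates the indexing rather than the vectorization, and the names `linear_index` and `copy_pixels` are made up for this example.

```python
def linear_index(r, c, k, m, n):
    """1-based column-major linear index of element (r, c, k) in an m x n x p array."""
    return r + (c - 1) * m + (k - 1) * m * n

def copy_pixels(dst, dst_shape, src, src_shape, dst_rc, src_rc):
    """Copy all channels of the listed src pixels into the listed dst pixels.

    dst and src are flat lists holding the arrays in column-major order;
    dst_rc and src_rc are lists of 1-based (row, column) pairs.
    """
    m1, n1, p = dst_shape
    m2, n2, _ = src_shape
    for (r1, c1), (r2, c2) in zip(dst_rc, src_rc):
        for k in range(1, p + 1):
            dst[linear_index(r1, c1, k, m1, n1) - 1] = \
                src[linear_index(r2, c2, k, m2, n2) - 1]
    return dst

src = list(range(12))   # a 2x2x3 array stored column-major
dst = [0] * 18          # a 3x2x3 array of zeros
copy_pixels(dst, (3, 2, 3), src, (2, 2, 3), dst_rc=[(3, 2)], src_rc=[(1, 2)])
```

The per-channel offset (k-1)*m*n is what the `(0:r2-1)*m2*n2` term adds in Approach #1, and it is what the reshape to a two-column-index form handles implicitly in Approach #2.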
I have a 4D matrix of size 300x200x3x20 where 300x200 is the size of one video frame, 3 is the number of channels (Red-Green-Blue channels) and 20 is the number of frames.
I want to extract all the color vectors from this matrix and store them in a 2D array of size 3x1,200,000 (300 x 200 x 20 = 1,200,000), where each row represents a component of the RGB color space and each column contains the RGB values of one pixel in the original matrix.
Besides, I want to carry out pixel-wise operations on this data, such as extracting visual features, but I cannot find a way to efficiently access vectors along the third dimension.
How could I do these efficiently, possibly without using loops?
Try this code -
IN = your_4D_data;
OUT = reshape(permute(IN,[3 1 2 4]),3,numel(IN)/3);
help reshape says:
B = reshape(A,m,n,p,...) or B = reshape(A,[m n p ...]) returns an n-dimensional array with the same elements as A but reshaped to have the size m-by-n-by-p-by-.... The product of the specified dimensions, m*n*p*..., must be the same as numel(A).
Is this what you are looking for?
Also, you can address pixels like this: Matrix(i,j,:,k), which gives you the 3 color channels of pixel (i,j) in frame k.
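To see which pixel ends up in which column after permute and reshape, the ordering can be traced in plain Python on a flat, column-major copy of the data. This is a loop-based illustration of the resulting layout, not a replacement for the vectorized MATLAB call, and the function name `color_vectors` is made up.

```python
def color_vectors(data, h, w, f):
    """Rearrange a flat column-major h x w x 3 x f array into 3 rows of h*w*f values.

    Column j of the result holds the R, G, B values of one pixel, with
    j = row + col*h + frame*h*w (0-based), i.e. rows vary fastest, then
    columns, then frames.
    """
    rows = [[], [], []]
    for frame in range(f):
        for col in range(w):
            for row in range(h):
                for ch in range(3):
                    # 0-based column-major offset of element (row, col, ch, frame).
                    idx = row + col * h + ch * h * w + frame * h * w * 3
                    rows[ch].append(data[idx])
    return rows

# Tiny example: 2x1 frames, 3 channels, 2 frames (12 values in total).
out = color_vectors(list(range(12)), h=2, w=1, f=2)
```

In this example the first row collects the red values of all four pixels, the second row the green values and the third the blue values, in the same pixel order.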