The function imwrite() in imageio (Python) seems to be rescaling image data before saving. My image data has values in the range [30, 255], but when I save it the data is stretched so the final image spans [0, 255], creating "holes" in the histogram and increasing the overall contrast.
Is there any parameter to fix this and make imwrite() not modify the data?
Thanks
So far I am setting a pixel to 0 to prevent this from happening:
prediction[0, 0, 0] = 0
(prediction is a [1024, 768, 3] array containing a colour photograph)
imageio.imwrite('prediction.png', prediction)
Fixed! I was using uint32 values instead of uint8, so imwrite() apparently performs some scaling correction because it expects uint8 data. The problem is solved using:
prediction = np.round(prediction*255).astype('uint8')
Instead of converting to a 32-bit integer, which is what I did at the beginning:
prediction = np.round(prediction*255).astype(int)
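A minimal sketch of the fix, using random data as a stand-in for the real prediction (the float-in-[0, 1] assumption matches the conversion above):

import numpy as np
import imageio

# Hypothetical stand-in for the real prediction: floats in [30/255, 1],
# so the uint8 data spans [30, 255] just like the original problem.
prediction = np.random.uniform(30 / 255, 1.0, size=(1024, 768, 3))

img_u8 = np.round(prediction * 255).astype('uint8')
imageio.imwrite('prediction.png', img_u8)

# Reading the file back reproduces the exact values: no rescaling happened.
restored = imageio.imread('prediction.png')
assert np.array_equal(restored, img_u8)
assert restored.min() >= 30  # the low end is preserved, no stretching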
Related
I am a beginner in deep learning and I am working with Keras built on top of TensorFlow. I am trying to use RGB images at 540 x 360 resolution to predict bounding boxes.
My labels are binary (black/white) 2-dimensional NumPy arrays of dimensions (540, 360), where all pixels are 0 except for the box edges, which are 1.
Like this:
[[0 0 0 0 0 0 ... 0]
[0 1 1 1 1 0 ... 0]
[0 1 0 0 1 0 ... 0]
[0 1 0 0 1 0 ... 0]
[0 1 1 1 1 0 ... 0]
[0 0 0 0 0 0 ... 0]]
There can be more than one bounding box in every picture. A typical image could look like this:
So, my input has the dimension (None, 540, 360, 3) and the output has dimensions (None, 540, 360), but if I add an extra inner dimension I can change the shape to (None, 540, 360, 1).
How would I define a CNN model that fits these criteria? How can I design a CNN with these inputs and outputs?
You have to differentiate between object detection and object segmentation. While both can be used for similar problems, the underlying CNN architectures look very different.
Object detection models use a CNN classification/regression architecture, where the output refers to the coordinates of the bounding boxes. It's common practice to use 4 values per bounding box: vertical center, horizontal center, width and height. Search for Faster R-CNN, SSD or YOLO to find popular object detection models for Keras. In your case you would need to define a function that converts the current labels to the 4 coordinates I mentioned.
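As a rough sketch of such a conversion for a mask containing a single axis-aligned box (the helper name and the single-box assumption are mine, not part of any library):

import numpy as np

def edge_mask_to_box(mask):
    # Assumes the 1-valued pixels outline exactly one axis-aligned rectangle.
    rows, cols = np.nonzero(mask)
    y_min, y_max = rows.min(), rows.max()
    x_min, x_max = cols.min(), cols.max()
    height = y_max - y_min + 1
    width = x_max - x_min + 1
    # Return (vertical center, horizontal center, width, height).
    return (y_min + height / 2.0, x_min + width / 2.0, width, height)

For masks with several boxes you would first separate the outlines into connected components and apply the same computation per component.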
Object segmentation models commonly use an architecture referred to as an encoder-decoder network, where the original image is scaled down and compressed in the first half and then brought back to its original resolution to predict a full image. Search for SegNet, U-Net or Tiramisu to find popular object segmentation models for Keras. My own implementation of U-Net can be found here. In your case you would need to define a custom function that fills all the 0s inside your bounding boxes with 1s. Understand that this solution will not predict bounding boxes as such, but segmentation maps showing regions of interest.
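A possible sketch of that custom fill function, assuming SciPy is available (SciPy is my assumption; any flood-fill approach would do):

import numpy as np
from scipy.ndimage import binary_fill_holes

def edges_to_region_mask(edge_mask):
    # Fill the interior of each closed box outline so the label becomes
    # a region mask suitable as a segmentation target.
    return binary_fill_holes(edge_mask.astype(bool)).astype(np.uint8)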
What is right for you depends on what precisely you want to achieve. For getting actual bounding boxes you want to perform object detection. However, if you're interested in highlighting regions of interest that go beyond rectangular windows, segmentation may be a better fit. In theory, you can use your rectangle labels for segmentation, where the network will learn to create better masks than the inaccurate ground-truth segmentation, provided you have enough data.
This is a simple example of how to write intermediate layers to achieve the output. You can use this as starter code.
from keras.models import Model
from keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                          MaxPooling2D, UpSampling2D, concatenate)
from keras.optimizers import RMSprop

def model_360x540(input_shape=(360, 540, 3), num_classes=1):
    inputs = Input(shape=input_shape)
    # 360x540x3
    downblock0 = Conv2D(32, (3, 3), padding='same')(inputs)
    # 360x540x32
    downblock0 = BatchNormalization()(downblock0)
    downblock0 = Activation('relu')(downblock0)
    downblock0_pool = MaxPooling2D((2, 2), strides=(2, 2))(downblock0)
    # 180x270x32
    centerblock0 = Conv2D(1024, (3, 3), padding='same')(downblock0_pool)
    # 180x270x1024
    centerblock0 = BatchNormalization()(centerblock0)
    centerblock0 = Activation('relu')(centerblock0)
    upblock0 = UpSampling2D((2, 2))(centerblock0)
    # 360x540x1024
    upblock0 = concatenate([downblock0, upblock0], axis=3)
    upblock0 = Activation('relu')(upblock0)
    upblock0 = Conv2D(32, (3, 3), padding='same')(upblock0)
    # 360x540x32
    upblock0 = BatchNormalization()(upblock0)
    upblock0 = Activation('relu')(upblock0)
    classify = Conv2D(num_classes, (1, 1), activation='sigmoid')(upblock0)
    # 360x540x1
    model = Model(inputs=inputs, outputs=classify)
    # bce_dice_loss and dice_coeff are custom loss/metric functions, assumed to be defined elsewhere.
    model.compile(optimizer=RMSprop(lr=0.001), loss=bce_dice_loss, metrics=[dice_coeff])
    return model
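The snippet above references bce_dice_loss and dice_coeff without defining them; one common definition (my own sketch, not the answerer's code) looks like this:

from keras import backend as K
from keras.losses import binary_crossentropy

def dice_coeff(y_true, y_pred, smooth=1.0):
    # Soft Dice coefficient computed over the flattened masks.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def bce_dice_loss(y_true, y_pred):
    # Binary cross-entropy plus (1 - Dice) as a combined segmentation loss.
    return binary_crossentropy(y_true, y_pred) + (1.0 - dice_coeff(y_true, y_pred))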
The downblock represents the block of layers which performs downsampling (MaxPooling2D).
The centerblock has no sampling layer.
The upblock represents the block of layers which performs upsampling (UpSampling2D).
So here you can see how (360, 540, 3) is transformed into (360, 540, 1).
Basically, you can add such blocks of layers to create your model.
Also check out Holistically-Nested Edge Detection, which may help you further with the edge-detection task.
Hope this helps!
I have not worked with Keras, but I will provide a solution approach in a more generalized way which can be used with any framework.
Here is the full procedure.
Data preparation: I know your labels are box edges, which will also work, but I recommend that instead of edges you prepare the dataset by marking the complete box, as shown in the sample (I have marked two boxes). Your dataset then has three classes (box, box edge and background). Create two lists, one of images and one of labels.
Get a pre-trained model (RESNET-51 recommended), solver and train prototxt from here. Remove the fc1000 layer and add de-convolution/up-sampling layers to match your input size. Use padding in the first layer to make it square, and crop in the deconvolution layer to match the input/output dimensions.
Transfer the weights from the previously trained (original) network and train your network.
Test your dataset and create bounding boxes using detected blobs.
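As an illustration of that last step (my own sketch, not part of the original answer; it assumes a binary prediction mask and uses SciPy's connected-component labelling):

import numpy as np
from scipy import ndimage

def blobs_to_boxes(pred_mask, threshold=0.5):
    # Label connected blobs in the predicted mask and return one
    # (y_min, x_min, y_max, x_max) bounding box per blob.
    labelled, num_blobs = ndimage.label(pred_mask > threshold)
    boxes = []
    for ys, xs in ndimage.find_objects(labelled):
        boxes.append((ys.start, xs.start, ys.stop, xs.stop))
    return boxes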
Using SceneKit to build some large custom geometries, and in order to keep array sizes as small as possible, I fed SCNGeometrySource with vectors whose (x, y, z) components are short (Int16) integers, as follows:
let vertexSource = SCNGeometrySource(data: dataContent!,
semantic: SCNGeometrySourceSemanticVertex, vectorCount: 4960,
floatComponents: false, componentsPerVector: 3, bytesPerComponent: 2,
dataOffset: 0, dataStride: 6)
However, this data was not acceptable, generating the error:
SceneKit: error, C3DMeshSourceGetValueAtIndexAsVector3 - Type not supported
Question: What forms of vector components does SceneKit support?
After writing this question I'm going to use trial and error to see what happens with triples and quads of Float32, Int32, etc. I know Float64 does work, but I don't need better than ±2^15 precision, so why waste memory? Perhaps someone can provide a definitive answer; Apple's documentation appears not to.
I am trying to write flexible geometry instancing code able to handle meshes with multiple materials. For a mesh with one material everything is fine: I manage to render as many instances as I want with a single draw call.
Things get a bit more complicated with multiple materials. My mesh comes from an .x file. It has one vertex buffer and one index buffer, but several materials. The indices to render for each subset (material) are stored in an attribute array.
Here is the code I use:
d3ddev->SetVertexDeclaration( m_vertexDeclaration );
d3ddev->SetIndices( m_indexBuffer );
d3ddev->SetStreamSourceFreq(0, (D3DSTREAMSOURCE_INDEXEDDATA | m_numInstancesToDraw ));
d3ddev->SetStreamSource(0, m_vertexBuffer, 0, D3DXGetDeclVertexSize( m_geometryElements, 0 ) );
d3ddev->SetStreamSourceFreq(1, (D3DSTREAMSOURCE_INSTANCEDATA | 1ul));
d3ddev->SetStreamSource(1, m_instanceBuffer, 0, D3DXGetDeclVertexSize( m_instanceElements, 1 ) );
m_effect->Begin(NULL, NULL); // begin using the effect
m_effect->BeginPass(0); // begin the pass
for( DWORD i = 0; i < m_numMaterials; ++i ) // loop through each subset
{
    d3ddev->SetMaterial( &m_materials[i] ); // set the material for the subset
    if( m_textures[i] != NULL )
    {
        d3ddev->SetTexture( 0, m_textures[i] );
    }

    d3ddev->DrawIndexedPrimitive(
        D3DPT_TRIANGLELIST,            // Type
        0,                             // BaseVertexIndex
        m_attributes[i].VertexStart,   // MinIndex
        m_attributes[i].VertexCount,   // NumVertices
        m_attributes[i].FaceStart * 3, // StartIndex
        m_attributes[i].FaceCount      // PrimitiveCount
    );
}
m_effect->EndPass();
m_effect->End();
d3ddev->SetStreamSourceFreq(0,1);
d3ddev->SetStreamSourceFreq(1,1);
This code works for the first material only. When I say the first, I mean the one at index 0, because if I start my loop with the second material it is not rendered. However, by debugging the vertex buffer in PIX, I can see all my materials being processed properly, so something happens after the vertex shader.
Another weird issue, all my materials will be rendered if I set my stream source containing the instance data to be a vertex size of zero.
So instead of this:
d3ddev->SetStreamSource(1, m_instanceBuffer, 0, D3DXGetDeclVertexSize( m_instanceElements, 1 ) );
I replace it by:
d3ddev->SetStreamSource(1, m_instanceBuffer, 0, 0 );
But of course, with this code, all my instances are rendered at the same position since I reuse the same instance data over and over again.
One last point: everything works fine if I create my device with D3DCREATE_SOFTWARE_VERTEXPROCESSING. Only hardware vertex processing has the issue, but unfortunately DirectX does not report any problem in debug mode.
See the Shader Model 3 docs
If you are implementing shaders in hardware, you may not use vs_3_0 or ps_3_0 with any other shader versions, and you may not use either shader type with the fixed function pipeline. These changes make it possible to simplify drivers and the runtime. The only exception is that software-only vs_3_0 shaders may be used with any pixel shader version.
I had the same problem, and in my case it was the memory pool of the instancing mesh. I originally had this mesh in system memory, but the instanced mesh in the default pool (D3DPOOL_DEFAULT). When I changed the instancing mesh to sit in default memory, everything worked as desired.
Hope it helps.
I'm posting this thread because I have some difficulty dealing with pictures in Java. I would like to be able to convert a picture into a byte[] array, and then to be able to do the reverse operation, so I can change the RGB of each pixel and then make a new picture. I want to use this solution because setRGB() and getRGB() of BufferedImage may be too slow for huge pictures (correct me if I'm wrong).
I read some posts here to obtain a byte[] array (such as here), so that each pixel is represented by 3 or 4 cells of the array containing the red, green and blue values (with an additional alpha value when there are 4 cells), which is quite useful and easy for me to use. Here's the code I use to obtain this array (stored in a PixelArray class I've created):
public PixelArray(BufferedImage image)
{
    width = image.getWidth();
    height = image.getHeight();
    DataBuffer toArray = image.getRaster().getDataBuffer();
    array = ((DataBufferByte) toArray).getData();
    hasAlphaChannel = image.getAlphaRaster() != null;
}
My big trouble is that I haven't found any efficient method to convert this byte[] array back into a new image if I want to transform the picture (for example, removing the blue/green values and keeping only the red one). I tried these solutions:
1) Making a DataBuffer object, then a SampleModel, to finally create a WritableRaster and then a BufferedImage (with additional ColorModel and Hashtable objects). It didn't work because I apparently don't have all the information I need (I have no idea what the Hashtable in the BufferedImage() constructor is for).
2) Using a ByteArrayInputStream. This didn't work because the byte[] array expected by ByteArrayInputStream has nothing to do with mine: it represents each byte of the file, and not each component of each pixel (with 3-4 bytes per pixel)...
Could someone help me?
Try this:
private BufferedImage createImageFromBytes(byte[] imageData) {
    ByteArrayInputStream bais = new ByteArrayInputStream(imageData);
    try {
        return ImageIO.read(bais);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
I have tried the approaches mentioned here, but for some reason neither of them worked. Using ByteArrayInputStream and ImageIO.read(...) returns null, whereas byte[] array = ((DataBufferByte) image.getRaster().getDataBuffer()).getData(); returns a copy of the image data, not a direct reference to it (see also here).
However, the following worked for me. Let's suppose that the dimensions and the type of the image data are known, and let byte[] srcbuf be the buffer of the data to be converted into a BufferedImage. Then:
Create a blank image, for example
img = new BufferedImage(width, height, BufferedImage.TYPE_3BYTE_BGR);
Convert the data array to a Raster and use setData to fill the image, i.e.
img.setData(Raster.createRaster(img.getSampleModel(), new DataBufferByte(srcbuf, srcbuf.length), new Point() ) );
BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_3BYTE_BGR);
byte[] array = ((DataBufferByte) image.getRaster().getDataBuffer()).getData();
System.arraycopy(pixelArray, 0, array, 0, array.length);
This method does tend to get out of sync when you try to use the Graphics object of the resulting image. If you need to draw on top of your image, construct a second image (which can be persistent, i.e. not constructed every time but re-used) and drawImage the first one onto it.
Several people upvoted the comment that the accepted answer is wrong.
If the accepted answer isn't working, it may be because ImageIO doesn't have support for the type of image you're trying to read, for example TIFF images.
To make it work, you need to add an extra jar to handle the image type.
You can add jai-imageio-core-1.3.1.jar to your classpath with:
<!-- https://mvnrepository.com/artifact/com.github.jai-imageio/jai-imageio-core -->
<dependency>
<groupId>com.github.jai-imageio</groupId>
<artifactId>jai-imageio-core</artifactId>
<version>1.3.1</version>
</dependency>
To add support for:
wbmp
bmp
pcx
pnm
raw
tiff
gif (write)
You can check the list of supported formats with:
for(String format : ImageIO.getReaderFormatNames())
System.out.println(format);
Note that you only have to drop the jar (jai-imageio-core-1.3.1.jar for example) into your classpath to make it work.
Other projects that add additional support for image types include:
https://github.com/haraldk/TwelveMonkeys
https://github.com/geosolutions-it/imageio-ext
The approach of using ImageIO.read directly is not right in some cases. In my case, the raw byte[] doesn't contain any information about the width, height or format of the image. Using only ImageIO.read, it is impossible for the program to construct a valid image.
It is necessary to pass the basic information of the image to BufferedImage object:
BufferedImage outBufImg = new BufferedImage(width, height, BufferedImage.TYPE_3BYTE_BGR);
Then set the data for the BufferedImage object by using setRGB or setData. (When using setRGB, it seems we must convert the byte[] to an int[] first. As a result, it may cause performance issues if the source image data is big. setData may be a better idea for big byte[] source data.)
I have plotted a contour map but I need to make some improvements. This is the structure of the data used:
str(lon_sst)
# num [1:360(1d)] -179.5 -178.5 -177.5 -176.5 -175.5 ...
str(lat_sst)
# num [1:180(1d)] -89.5 -88.5 -87.5 -86.5 -85.5 -84.5 -83.5 -82.5 -81.5 -80.5 ...
dim(cor_Houlgrave_SF_SST_JJA_try)
# [1] 360 180
require(maps)
maps::map(database="world", fill=TRUE, col="light blue")
maps::map.axes()
contour(x=lon_sst, y=lat_sst, z=cor_Houlgrave_SF_SST_JJA_try[c(181:360, 1:180),],
zlim=c(-1,1), add=TRUE)
par(ask=TRUE)
filled.contour(x = lon_sst, y=lat_sst,
z=cor_Houlgrave_SF_SST_JJA_try[c(181:360, 1:180),],
zlim=c(-1,1), color.palette=heat.colors)
Because most of the correlations are close to 0, it is very hard to see the big ones.
1. Can I make it easier to see, or can I change the resolution so I can zoom in? At the moment the contours are too tightly spaced, so I can't see what the contour levels are.
2. Where can I see the increment? I set my range as (-1, 1), but I don't know how to set the interval manually.
3. Can someone tell me how to plot a specific region of the map, like longitude from 100 to 160 and latitude from -50 to -80? I have tried replacing lon_sst and lat_sst, but I get a dimension error. Thanks.
To answer 1 and 3, which appear to be the same request, try:
maps::map(database="world", fill=TRUE, col="light blue",
ylim=c(-80, -50), xlim=c(100,160) )
To address 2: you have a much smaller range than [-1, 1]. The labels on those contour lines are numbers like 0.06, -0.02 and 0.02. The contour function accepts either an 'nlevels' or a 'levels' argument. Once you have blown up a section, you can use that to adjust the z-resolution of the contours.
contourplot in the lattice package can also produce these types of contour plots, and makes it easy to combine contour lines and fill colours. This may or may not suit your needs, but by filling contour intervals you can do away with the text labels, which can get a little crowded if you want high-resolution contours.
I don't have your sea surface temperature data, so the following figure uses dummy data, but you should get something similar. See ?contourplot and ?panel.levelplot for possible arguments.
For your desired small scale plot, overlaying the world map plot is probably inappropriate, especially considering that the area of interest is in the ocean.
library(lattice)
contourplot(cor_Houlgrave_SF_SST_JJA_try, region=TRUE, at=seq(-1, 1, 0.25),
labels=FALSE, row.values=lon_sst, column.values=lat_sst,
xlim=c(100, 160), ylim=c(-80, -50), xlab='longitude', ylab='latitude')
Here, the at argument controls the values at which contour lines will be calculated and plotted (and hence the number of breaks in the colour ramp). In my example, contour lines are drawn at -0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75 and 1 (with -1 being the background). Changing to at=seq(-1, 1, 0.5), for example, would produce contour lines at -0.5, 0, 0.5, and 1.