I need help with TIFF files. I need to parse the tags and return image details (such as imageWidth, imageHeight, orientation etc.) without using any special library. With this and this I parsed the header and found the first IFD, but I couldn't work out how to find the tags, and the offset of the next IFD, inside a single IFD.
For example, the documentation says:
Offset - 2+x*12
Datatype - Tag structure
Value - Tag data
but there is no information about what x is. It also says the next IFD offset is at 2+(number of tags in IFD)*12, but the result I get is out of range.
In conclusion, I want to know how to find the address of each tag, read its data, and find the next IFD.
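One way to read the layout quoted above, as a minimal sketch: the offsets are relative to the start of the IFD, so entry x (with x running from 0 to count-1) sits at ifd_offset + 2 + x*12, and the 4-byte offset of the next IFD sits at ifd_offset + 2 + count*12. Forgetting to add ifd_offset could be one reason the result lands out of range. The function name tiff_find_tag and the idea of working on a file already loaded into memory are assumptions made for illustration; the tag numbers 256 (ImageWidth), 257 (ImageLength) and 274 (Orientation) come from the TIFF 6.0 specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Read 16/32-bit values honouring the byte order announced in the
   TIFF header ("II" = little-endian, "MM" = big-endian). */
static uint16_t rd16(const uint8_t *p, int le)
{
    return le ? (uint16_t)(p[0] | (p[1] << 8))
              : (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p, int le)
{
    return le ? ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                 (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24)
              : ((uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
                 (uint32_t)p[2] << 8 | (uint32_t)p[3]);
}

/* Hypothetical helper: search the IFD at ifd_off for tag `wanted`.
   Returns 1 if found, 0 if absent, -1 if the data is malformed.
   *next_ifd receives the offset of the next IFD (0 means "no more").
   For a single SHORT the value is the first 2 bytes of the value field;
   otherwise the raw 4-byte field is returned (for larger data that
   field is an offset to the data, not the data itself). */
int tiff_find_tag(const uint8_t *buf, size_t len, uint32_t ifd_off,
                  uint16_t wanted, uint32_t *value, uint32_t *next_ifd)
{
    int le, found = 0;
    unsigned n, x;
    size_t end;

    if (len < 8) return -1;
    if (buf[0] == 'I' && buf[1] == 'I') le = 1;
    else if (buf[0] == 'M' && buf[1] == 'M') le = 0;
    else return -1;

    if ((size_t)ifd_off + 2 > len) return -1;
    n = rd16(buf + ifd_off, le);             /* number of entries */
    end = (size_t)ifd_off + 2 + (size_t)n * 12;
    if (end + 4 > len) return -1;
    *next_ifd = rd32(buf + end, le);         /* right after the last entry */

    for (x = 0; x < n; x++) {
        const uint8_t *e = buf + ifd_off + 2 + (size_t)x * 12;
        uint16_t tag   = rd16(e, le);        /* bytes 0-1: tag number */
        uint16_t type  = rd16(e + 2, le);    /* bytes 2-3: 3=SHORT, 4=LONG */
        uint32_t count = rd32(e + 4, le);    /* bytes 4-7: value count */
        if (tag != wanted) continue;
        *value = (type == 3 && count == 1) ? rd16(e + 8, le)
                                           : rd32(e + 8, le);
        found = 1;
        break;
    }
    return found;
}
```

The first IFD offset is the 32-bit value at byte 4 of the header; to walk all IFDs, call the helper again with the returned next_ifd until it is 0.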
I have been searching all over the internet for a way to extract a meaningful page structure from an uploaded document (headlines/titles and paragraphs). The document could be of any format but I'm currently testing with PDF.
Example of what I'm trying to do:
1. Upload PDF file client-side
2. Save it to S3
3. Request AWS Textract to detect or analyze text in that S3 object
4. Classify the output into: Headlines and Paragraphs
My application works fine up to step 3. AWS Textract outputs the result as blocks; block types can be page, line or word, and each block has a Geometry object which includes bounding box details and a Polygon object as well (more info here: AnalyzeCommandOutput(JS_SDK) and AnalyzeCommandOutput(General)).
However, I still need to process the output and classify it into headlines and paragraphs (e.g. 1 block of type line could be a headline and the following 3 blocks of type line are a single paragraph), so the output of step 4 would be:
{
"Headlines": ["Headline1", "Headline2", "Headline3"],
"Paragraphs": [{"Paragraph": "Paragraph1", "Headline": "Headline1"}, {"Paragraph": "Paragraph2", "Headline": "Headline1"}]
}
The unsuccessful methods I tried:
Calculate the size of each line's bounding box relative to the page size and compare it with the average bounding box size: if it's greater, it's a headline; if it's smaller or equal, it's a paragraph (not practical)
Use other PDF parsers, but most of them just output unformatted text
Use the "Query" option of the analyze document input, but it would require defining each line in the PDF as a key-value pair to output something meaningful (as per here). So the PDF content would be something like:
Headline1: Headline
Paragraph1: Paragraph
Paragraph2: Paragraph
Headline2: Headline
Paragraph1: Paragraph
I'm not asking for a coding solution. Maybe I'm overcomplicating things and there is a simpler way to do it. Maybe someone has tried something similar and can point me in the right direction or towards an approach.
Let's say I have an image called Test.jpg.
I just figured out how to bring an image into the project by the following line:
FILE *infile = fopen("Stonehenge.jpg", "rb");
Now that I have the file, do I need to convert this file into a bmp image in order to apply a filter to it?
I have never worked with images before, let alone OpenCL, so there is a lot that is going over my head.
I need further clarification on this part for my own understanding
Does this bmp image also need to be stored in an array in order to have a filter applied to it? I have seen a sliding window technique be used a couple of times in other examples. Is the bmp image pretty much split up into RGB values (0-255)? If someone can provide a link on this item that should help me understand this a lot better.
I know this may seem like a basic question to most but I do not have a mentor on this subject in my workplace.
Now that I have the file, do I need to convert this file into a bmp image in order to apply a filter to it?
Not exactly. BMP is a very specific image serialization format, and quite a complicated one (implementing a BMP file parser that deals with all the corner cases correctly is rather difficult).
However what you have there so far is not even file content data. What you have there is a C stdio FILE handle and that's it. So far you did not even check if the file could be opened. That's not really useful.
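As a minimal sketch of the missing check, the following opens a file, verifies that fopen() succeeded, and reads the raw bytes into memory. The helper name read_file is made up for illustration. Note that the buffer it returns still holds compressed JPEG data, not pixels; a decoder has to run on it before any filter can be applied.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a malloc'd buffer.
   Returns NULL on any failure; on success *size_out holds the byte count
   and the caller must free() the returned pointer. */
unsigned char *read_file(const char *path, size_t *size_out)
{
    unsigned char *data = NULL;
    size_t size = 0, cap = 0;
    FILE *f = fopen(path, "rb");

    if (f == NULL) {          /* the check the original snippet lacks */
        perror(path);
        return NULL;
    }
    for (;;) {
        size_t n;
        if (size == cap) {    /* grow the buffer geometrically */
            size_t new_cap = cap ? cap * 2 : 4096;
            unsigned char *tmp = realloc(data, new_cap);
            if (tmp == NULL) {
                free(data);
                fclose(f);
                return NULL;
            }
            data = tmp;
            cap = new_cap;
        }
        n = fread(data + size, 1, cap - size, f);
        size += n;
        if (n == 0)           /* end of file or read error */
            break;
    }
    if (ferror(f)) {
        free(data);
        fclose(f);
        return NULL;
    }
    fclose(f);
    *size_out = size;
    return data;
}
```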
JPEG is a lossy compressed image format. What you need to be able to "work" with it is a pixel value array. Either an array of component tuples, or a number of arrays, one for each component (depending on your application either format may perform better).
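The two layouts mentioned above can be sketched as follows: "interleaved" stores one tuple per pixel (R,G,B,R,G,B,...), while "planar" stores one full array per component (all R, then all G, then all B). The function names are illustrative, not taken from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of component c of pixel (x, y) in an interleaved buffer. */
static size_t interleaved_index(size_t x, size_t y, size_t width,
                                size_t channels, size_t c)
{
    return (y * width + x) * channels + c;
}

/* Index of component c of pixel (x, y) in a planar buffer. */
static size_t planar_index(size_t x, size_t y, size_t width,
                           size_t height, size_t c)
{
    return c * width * height + y * width + x;
}

/* Convert interleaved pixel data to planar; dst must hold
   width * height * channels bytes and must not overlap src. */
void interleaved_to_planar(const uint8_t *src, uint8_t *dst,
                           size_t width, size_t height, size_t channels)
{
    size_t x, y, c;
    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            for (c = 0; c < channels; c++)
                dst[planar_index(x, y, width, height, c)] =
                    src[interleaved_index(x, y, width, channels, c)];
}
```

Which layout performs better depends on the access pattern: a filter that touches all components of a pixel together tends to favour interleaved data, while one that processes a single component at a time tends to favour planar data.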
Now, implementing image format decoders becomes tedious. It's not exactly difficult, but it is also not something you can write down in a single evening. Of course the devil is in the details, and writing an implementation that is high quality, covers all corner cases and is fast is a major effort. That's why, for every image (and video and audio) format out there, you can usually find only a small number of encoder and decoder implementations. The de-facto standard codec libraries for JPEG are libjpeg and libjpeg-turbo. If your aim is to read just JPEG files, then these libraries would be the go-to implementations. However, you may also want to support PNG files, and then maybe EXR and so on, and then things become tedious again. So there are meta-libraries which wrap all those format-specific libraries and offer them through a universal API.
In the OpenGL wiki there's a dedicated page on the current state of image loader libraries: https://www.opengl.org/wiki/Image_Libraries
Does this bmp image also need to be stored in an array in order to have a filter applied to it?
That actually depends on the kind of filter you want to apply. A simple threshold filter for example does not take a pixel's surroundings into account. If you were to perform scanline signal processing (e.g. when processing old analogue television signals) you may require only a single row of pixels at a time.
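To make the contrast concrete, here is a small sketch on 8-bit grayscale data (an assumption; the same idea applies per component). The threshold looks at each pixel in isolation, while the 3-tap box blur needs the left and right neighbours and therefore one whole row at a time. Both function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Threshold: every pixel is processed independently, in place. */
void threshold_gray8(uint8_t *pixels, size_t count, uint8_t level)
{
    size_t i;
    for (i = 0; i < count; i++)
        pixels[i] = (pixels[i] >= level) ? 255 : 0;
}

/* 3-tap horizontal box blur of one row; edge pixels are clamped,
   i.e. the missing neighbour is replaced by the edge pixel itself.
   in and out must not overlap. */
void row_box3(const uint8_t *in, uint8_t *out, size_t width)
{
    size_t i;
    for (i = 0; i < width; i++) {
        size_t l = (i > 0) ? i - 1 : 0;
        size_t r = (i + 1 < width) ? i + 1 : width - 1;
        out[i] = (uint8_t)((in[l] + in[i] + in[r]) / 3);
    }
}
```

Because the threshold has no dependency between pixels, it is the kind of filter that maps naturally onto one work-item per pixel if it is later ported to an OpenCL kernel.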
The universal solution is of course to keep the whole image in memory, but some pictures are so huge that no average computer's RAM can hold them. There are image processing libraries like VIPS that implement processing graphs which operate on small subregions of an image at a time and can be executed independently.
Is the bmp image pretty much split up into RGB values (0-255)? If someone can provide a link on this item that should help me understand this a lot better.
In case you mean "pixel array" instead of BMP (remember, BMP is a specific data structure), then no. Pixel component values may be of any scalar type and value range. And there are in fact colour spaces in which there are value regions which are mathematically necessary but do not denote actually sensible colours.
When it comes down to pixel data, an image is just an n-dimensional array of scalar component tuples where each component's value lies in a given range of values. It doesn't get more specific than that. Only when you introduce colour spaces (RGB, CMYK, YUV, CIE-Lab, CIE-XYZ, etc.) do you give those values a specific colour meaning. And the choice of data type is more or less arbitrary. You can use 8 bits per component RGB (0..255), 10 bits (0..1023) or floating point (0.0 .. 1.0); the choice is yours.
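As a small illustration of how arbitrary the data type is, the sketch below converts between an n-bit integer component and a float in the range 0.0 .. 1.0. The function names are made up, and the code assumes 1 <= bits <= 16.

```c
#include <stdint.h>

/* Map an n-bit integer component (0 .. 2^bits - 1) to 0.0 .. 1.0.
   Assumes 1 <= bits <= 16. */
float component_to_unit(uint32_t value, unsigned bits)
{
    uint32_t max = (UINT32_C(1) << bits) - 1u;
    return (float)value / (float)max;
}

/* Map 0.0 .. 1.0 back to an n-bit integer component, rounding to
   nearest and clamping out-of-range input. Assumes 1 <= bits <= 16. */
uint32_t unit_to_component(float v, unsigned bits)
{
    uint32_t max = (UINT32_C(1) << bits) - 1u;
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (uint32_t)(v * (float)max + 0.5f);
}
```

With 8 bits the largest value is 255, with 10 bits it is 1023; both map to 1.0 in the floating point representation.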
I was assigned to edit part of an ANSI C application, but my knowledge of pure C is just the basics. The current situation is that I have map1_data1.h, map1_data2.h, map2_data1.h, map2_data2.h, and the variables in those files are always tied to the map name, e.g. map1_structure in map1_data1.h and so on.
In the app there is an #include for each file, and then in the code something like
if (game->map == 1) {
    mapStructure = map1_structure;
} else {
    mapStructure = map2_structure;
}
I have to extend this to be able to load the map dynamically, so something like
void loadMap(int mapId){
mapStructure = map*mapId*_structure // just short for what i want to achieve
}
My first idea was to remove the map name from the variable names in map1_data.h and just have a variable called structure in there. That requires only one header file to be loaded at a time, and that's where I'm stuck. I haven't found any clues on how to do so on Google.
I would like to have it as flexible as possible, so something like #include "map*mapId*_data1.h", but it would be OK to have one switch in one place in the whole app to decide which map is loaded.
One more thing: the app keeps running for more than one game, so it will load various maps in one run.
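For maps that are compiled into the program, the "one switch in one place" can be a lookup table of pointers indexed by the map id. This is only a sketch under assumptions: the MapStructure type and its fields are placeholders for whatever the real headers define, and loadMap() returns a status code here instead of void so that an invalid id can be reported.

```c
#include <stddef.h>

/* Placeholder for the real map type; the fields are invented. */
typedef struct {
    int width;
    int height;
} MapStructure;

/* In the real program these definitions would come from
   map1_data1.h, map2_data1.h and so on. */
static const MapStructure map1_structure = { 10, 20 };
static const MapStructure map2_structure = { 30, 40 };

/* The single place that knows about every built-in map. */
static const MapStructure *const all_maps[] = {
    &map1_structure,
    &map2_structure
};

static const MapStructure *mapStructure = NULL;

/* Select the map with the given 1-based id.
   Returns 0 on success, -1 if the id is out of range. */
int loadMap(int mapId)
{
    size_t count = sizeof all_maps / sizeof all_maps[0];
    if (mapId < 1 || (size_t)mapId > count)
        return -1;
    mapStructure = all_maps[mapId - 1];
    return 0;
}
```

An #include cannot depend on a runtime value because it is resolved by the preprocessor before compilation, so all headers stay included and only the pointer changes between games.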
Judging from the comments, you have a single type, call it Map, which is a structure type containing a collection of different data types, including 3D arrays and points and so on. You need to have some maps built into the program; later on, you will need to load new maps at runtime.
You have two main options for the runtime loading the maps:
Map in shared object (shared library, dynamically loaded library, aka DLL).
Map in data file.
Of these two, you will choose the data file over the shared object because it is, ultimately, simpler and more flexible.
Shared Object
With option 1, only someone who can compile a shared library can create the new maps. You'd have a 'library' consisting of one or more data objects, which can be looked up by name. On most Unix-like systems, you'd end up using dlopen() to load the library, and then dlsym() to find the symbol name in that library (specifying the name via a string). If it is present in the library, dlsym() will return you a pointer.
In outline:
#include <dlfcn.h>

typedef void *SO_Handle;
const char *path_to_library = "/usr/local/lib/your_game/libmap32.so";
const char *symbol_name = "map32_structure";
SO_Handle lib = dlopen(path_to_library, RTLD_NOW);
if (lib == NULL)
    ...bail out (dlerror() describes what went wrong)...
map_structure = dlsym(lib, symbol_name);
if (map_structure == NULL)
    ...bail out...
You have to have some way of generating the library name based on where the software is installed and where extensions are downloaded. You also have to have some way of knowing the name of the symbol to look for. The simplest system is to use a single fixed name (map_structure), but you are not constrained to do that.
After this, you have your general map_structure ready for use. You can invent endless variations on the theme.
Data file
This is the more likely way you'll do it. You arrange to serialize the map structure into a disk file that can be read by your program. This will contain a convenient representation of the data. You should consider the TLV (type-length-value) encoding scheme: the type tells you what sort of data follows, the length tells you how many items there are, and the value is the data itself. You can do this with binary data or with text data. It is easier to debug text data because you can look at it and see what's going on. The chances are that the difference in performance between binary and text is small enough (swamped by the I/O time) that using text is the correct way to go.
With a text description of the map, you'd have information to identify the file as being a map file for your game (perhaps with a map format version number). Then you'd have sections describing each of the main elements in the Map structure. You'd allocate the Map (malloc() et al), and then load the data from the file into the structure.
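A minimal sketch of such a text loader follows. Everything about the format is invented for illustration: a "GAMEMAP 1" line identifying the file and its version, a "size W H" section, and a "tiles" section with W*H integers. The Map type here is a stand-in for the real structure, which would have more sections handled the same way.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real Map type. */
typedef struct {
    int width;
    int height;
    int *tiles;   /* width * height entries, row by row */
} Map;

void map_free(Map *m)
{
    if (m != NULL) {
        free(m->tiles);
        free(m);
    }
}

/* Load a map from an already opened text stream.
   Returns NULL if the stream is not a valid version 1 map file. */
Map *map_load(FILE *in)
{
    char key[16];
    int version, w, h;
    size_t i, n;
    Map *m;

    /* Identification line with the format version. */
    if (fscanf(in, "%15s %d", key, &version) != 2 ||
        strcmp(key, "GAMEMAP") != 0 || version != 1)
        return NULL;

    /* "size" section; the upper bound is an arbitrary sanity limit. */
    if (fscanf(in, "%15s %d %d", key, &w, &h) != 3 ||
        strcmp(key, "size") != 0 ||
        w <= 0 || h <= 0 || w > 4096 || h > 4096)
        return NULL;

    m = calloc(1, sizeof *m);
    if (m == NULL)
        return NULL;
    m->width = w;
    m->height = h;
    n = (size_t)w * (size_t)h;
    m->tiles = malloc(n * sizeof *m->tiles);

    /* "tiles" section: exactly width * height integers. */
    if (m->tiles == NULL ||
        fscanf(in, "%15s", key) != 1 || strcmp(key, "tiles") != 0) {
        map_free(m);
        return NULL;
    }
    for (i = 0; i < n; i++) {
        if (fscanf(in, "%d", &m->tiles[i]) != 1) {
            map_free(m);
            return NULL;
        }
    }
    return m;
}
```

A caller would fopen() the map file, pass the stream to map_load(), and call map_free() on the previous map before loading the next one, which fits a program that loads several maps in one run.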
The aim is, for a given GIF file containing several images, to extract the pixels of those images, edit them and put them back into the GIF file.
I'm trying to do it using giflib.
The language used is C.
I have successfully read the GIF file and have access to the pixels of the images using the following code:
GifFileType *gifFile = DGifOpenFileName(filename);
DGifSlurp(gifFile);
But as it is said in the Documentation:
About the DGifSlurp function:
When you have modified the image to taste, write it out with
EGifSpew().
However using that function results in:
GIF-LIB error: Given file was not opened for write.
In the following code:
GifFileType *gifFile = DGifOpenFileName(filename);
DGifSlurp(gifFile);
EGifSpew(gifFile);
Do you know how to save the edited gif image?
Your doc is outdated. You should take a look here: http://giflib.sourceforge.net/gif_lib.html#idp26995312
You can write to a GIF file through a function hook. Initialize with
GifFileType *EGifOpen(void *userPtr, OutputFunc writeFunc, int *ErrorCode)
and see the library header file for the type of OutputFunc.
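Moreover, EGifSpew() has to be given a GifFileType that was opened for writing (for example one returned by EGifOpen(), EGifOpenFileName() or EGifOpenFileHandle(), the last of which takes a file descriptor), not the one returned by DGifOpenFileName(), which is opened for reading only.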
Hope it helped.
I would like to know how I can cut a JPG file using coordinates that I want to retrieve using ARToolKit and OpenCV, see:
Blob Detection
I want to retrieve the coordinates of the white sheet and then use those coordinates to cut a JPG file I took before.
I found this, but how can it help?
How to slice/cut an image into pieces
If you already have the coordinates, you might want to deskew the image first:
http://nuigroup.com/?ACT=28&fid=27&aid=1892_H6eNAaign4Mrnn30Au8d
This post uses cv::warpPerspective() to achieve that effect.
The references above use the C++ interface of OpenCV, but I'm sure you are capable of converting between the two.
Second, cutting a particular area out of an image is known as extracting a Region Of Interest (ROI). The general procedure is: create a CvRect to define your ROI, then call cvSetImageROI() followed by cvSaveImage() to save it to disk.
This post shares C code to achieve this task.