I was wondering if it is possible to save out a vector of cv::KeyPoint using the CvFileStorage class or the cv::FileStorage class. Also, is it the same process to read it back in?
Thanks.
I am not sure exactly what you expect:
The code I provide is simply an example to show how FileStorage works in the OpenCV C++ bindings. It writes all the KeyPoints to the file separately, each under a name derived from its position in the vector it was stored in.
It also assumes that when you read them back you know how many you want to read; if not, the code gets a little more complex. You can find a way around that (for instance, read from the FileStorage and test what it gives you: if it gives you nothing, there are no more points to read). That's just an idea; you will have to work out the details, but maybe this piece of code will be enough for you.
I should point out that I use an ostringstream to put the integer into the string, which determines the key each point is written under in the *.yml file. Note that node names must start with a letter, so the index is prefixed with one.
//TO WRITE
vector<KeyPoint> myKpVec;
FileStorage fs(filename, FileStorage::WRITE);
for (size_t i = 0; i < myKpVec.size(); ++i) {
    ostringstream oss;
    oss << "kp" << i;              // node names must start with a letter
    fs << oss.str() << myKpVec[i];
}
fs.release();
//TO READ
vector<KeyPoint> myKpVec;
FileStorage fs(filename, FileStorage::READ);
KeyPoint aKeypoint;
for (size_t i = 0; i < nbKeypoints; ++i) {   // nbKeypoints: the number of points you expect (see note above)
    ostringstream oss;
    oss << "kp" << i;
    fs[oss.str()] >> aKeypoint;
    myKpVec.push_back(aKeypoint);
}
fs.release();
Julien,
const char* key = "keypoints";
FileStorage f;                 // opened with FileStorage::WRITE or FileStorage::READ as appropriate
vector<KeyPoint> keypoints;
//writing
write(f, key, keypoints);
//reading
read(f[key], keypoints);
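For completeness, here is a minimal, self-contained sketch of that approach (assuming OpenCV 2.4 or later, where the cv::write/cv::read overloads for std::vector<KeyPoint> are available; the file name keypoints.yml is just a placeholder):

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    std::vector<cv::KeyPoint> kps;
    kps.push_back(cv::KeyPoint(10.f, 20.f, 3.f));
    kps.push_back(cv::KeyPoint(42.f, 7.f, 5.f));

    {   // write the whole vector under one node
        cv::FileStorage fs("keypoints.yml", cv::FileStorage::WRITE);
        cv::write(fs, "keypoints", kps);
    }   // the FileStorage destructor releases the file

    std::vector<cv::KeyPoint> loaded;
    {   // read it back
        cv::FileStorage fs("keypoints.yml", cv::FileStorage::READ);
        cv::read(fs["keypoints"], loaded);
    }
    return loaded.size() == kps.size() ? 0 : 1;
}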
#include <opencv2/opencv.hpp>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace cv;
using namespace std;

int main() {
    string filename = "data.xml";
    FileStorage fs(filename, FileStorage::WRITE);
    vector<Mat> vecMat;
    Mat A(3, 3, CV_32F, Scalar(5));
    Mat B(3, 3, CV_32F, Scalar(6));
    Mat C(3, 3, CV_32F, Scalar(7));
    vecMat.push_back(A);
    vecMat.push_back(B);
    vecMat.push_back(C);
    for (size_t i = 0; i < vecMat.size(); i++) {
        stringstream ss;
        ss << i;
        string str = "x" + ss.str();   // keys must start with a letter
        fs << str << vecMat[i];
    }
    fs.release();
    vector<Mat> matVecRead;
    FileStorage fr(filename, FileStorage::READ);
    Mat aMat;
    int countlabel = 0;
    while (1) {
        stringstream ss;
        ss << countlabel;
        string str = "x" + ss.str();
        cout << str << endl;
        if (fr[str].isNone()) {        // no node with this key: we have read them all
            break;
        }
        fr[str] >> aMat;
        matVecRead.push_back(aMat.clone());
        countlabel++;
    }
    fr.release();
    for (unsigned j = 0; j < matVecRead.size(); j++) {
        cout << matVecRead[j] << endl;
    }
}
Put a letter, e.g. 'a', in front of the number, as the OpenCV XML format specifies that the XML key must start with a letter.
This is code to save a vector<Mat>, written for Visual Studio 2010; I think it will work for vector<KeyPoint> as well.
How do you manage the end of a message in a protocol? I use msgpack-c, and the only solution I found is to send the header before the payload (separately).
Send the header to the client:
// header
{
"message_type": "hello",
"payload_size": 10
}
The client receives the header, unpacks it, allocates a buffer of "payload_size", and receives data from the stream; once the buffer is full, the message is finished.
I want to send the header and body together:
{
"header": { "message_type":"hello", "payload_size": 10},
"payload": {...} // can come in multiple frame
}
My problem is that I don't know whether it is possible to partially unpack the header to learn the size before receiving the full message (which is split if it is larger than 4096 kB due to a libevent restriction).
How would you do that? I am open to all solutions.
C++
Using unpack() function
You can use the offset parameter of the unpack() function.
See https://github.com/msgpack/msgpack-c/wiki/v2_0_cpp_unpacker#client-controls-a-buffer
Here is a code example:
#include <iostream>
#include <msgpack.hpp>
int main() {
msgpack::sbuffer buf;
msgpack::pack(buf, std::make_tuple("first message", 123, 56.78));
msgpack::pack(buf, std::make_tuple("second message", 42));
std::size_t off = 0; // cursor of buf
{
// unpack() function starts parse from off (0)
auto oh = msgpack::unpack(buf.data(), buf.size(), off);
// off is updated to 25. 25 is MessagePack formatted byte size
// of ["first message",123,56.78]
// (I use JSON notation but actual format is MessagePack)
std::cout << "off:" << off << std::endl;
std::cout << *oh << std::endl;
}
{
// unpack() function starts parse from off (25)
auto oh = msgpack::unpack(buf.data(), buf.size(), off);
// off is updated to 42.
// 42 - 25 = 17. 17 is MessagePack formatted byte size
// of ["second message",42]
// (I use JSON notation but actual format is MessagePack)
std::cout << "off:" << off << std::endl;
std::cout << *oh << std::endl;
}
}
Output is
off:25
["first message",123,56.78]
off:42
["second message",42]
msgpack-c's unpack() manages the position in the buffer internally, so you don't need to pass payload_size.
In addition, you can mix non-MessagePack data into the buffer:
+--------------------+-----------------------------+--------------------+
| MessagePackBytes1 | Any format user knows size | MessagePackBytes2 |
+--------------------+-----------------------------+--------------------+
Let's say the user knows the data layout: MessagePackBytes1 (unknown size), then data in some other format whose size is known, then MessagePackBytes2 (unknown size).
#include <iostream>
#include <msgpack.hpp>
int main() {
msgpack::sbuffer buf;
msgpack::pack(buf, std::make_tuple("first message", 123, 56.78));
std::string non_mp = "non mp format data";
buf.write(non_mp.data(), non_mp.size());
msgpack::pack(buf, std::make_tuple("second message", 42));
std::size_t off = 0; // cursor of buf
{
auto oh = msgpack::unpack(buf.data(), buf.size(), off);
std::cout << "off:" << off << std::endl;
std::cout << *oh << std::endl;
}
{
std::string extracted{buf.data() + off, non_mp.size()};
std::cout << extracted << std::endl;
off += non_mp.size();
}
{
auto oh = msgpack::unpack(buf.data(), buf.size(), off);
std::cout << "off:" << off << std::endl;
std::cout << *oh << std::endl;
}
}
Output is
off:25
["first message",123,56.78]
non mp format data
off:60
["second message",42]
Using unpacker
It is a little more advanced, but it might fit streaming use cases.
https://github.com/msgpack/msgpack-c/wiki/v2_0_cpp_unpacker#msgpack-controls-a-buffer
Here is an example that unpacks MessagePack data from messages received in continuous and scattered chunks:
https://github.com/msgpack/msgpack-c/blob/700167995927f0348fb08ae2579440c1bc135480/example/boost/asio_send_recv.cpp#L41-L64
C
The C version is basically similar to the C++ one.
Using unpack() function
The C version has a similar unpack function.
Here is the prototype:
msgpack_unpack_return
msgpack_unpack_next(msgpack_unpacked* result,
const char* data, size_t len, size_t* off);
You can pass off as the offset, similar to the C++ version. C doesn't have references, so you need to pass the address of off as &off.
See https://github.com/msgpack/msgpack-c/wiki/v2_0_c_overview#using-unpack-function
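For illustration, here is a small sketch of that usage (written against the msgpack-c C API as documented in the wiki above; it is compiled as C++ here only to match the other examples, and the packed values are arbitrary). It packs two objects into one buffer, then lets msgpack_unpack_next advance off past each one:

#include <cstdio>
#include <msgpack.h>

int main() {
    // pack two small arrays into one contiguous buffer
    msgpack_sbuffer sbuf;
    msgpack_sbuffer_init(&sbuf);
    msgpack_packer pk;
    msgpack_packer_init(&pk, &sbuf, msgpack_sbuffer_write);

    msgpack_pack_array(&pk, 2);              // ["first", 123]
    msgpack_pack_str(&pk, 5);
    msgpack_pack_str_body(&pk, "first", 5);
    msgpack_pack_int(&pk, 123);

    msgpack_pack_array(&pk, 1);              // [42]
    msgpack_pack_int(&pk, 42);

    // unpack them one by one; off is advanced past each object
    msgpack_unpacked result;
    msgpack_unpacked_init(&result);
    size_t off = 0;
    while (msgpack_unpack_next(&result, sbuf.data, sbuf.size, &off) == MSGPACK_UNPACK_SUCCESS) {
        printf("off:%zu ", off);
        msgpack_object_print(stdout, result.data);
        printf("\n");
    }
    msgpack_unpacked_destroy(&result);
    msgpack_sbuffer_destroy(&sbuf);
}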
If you want to know the size of an individual variable-length field such as a string, you can access the size member of the unpacked object.
For example:
typedef struct {
uint32_t size;
struct msgpack_object* ptr;
} msgpack_object_array;
typedef struct {
uint32_t size;
const char* ptr;
} msgpack_object_str;
typedef struct {
uint32_t size;
const char* ptr;
} msgpack_object_bin;
typedef struct {
int8_t type;
uint32_t size;
const char* ptr;
} msgpack_object_ext;
See https://github.com/msgpack/msgpack-c/wiki/v2_0_c_overview#object
Using unpacker
See https://github.com/msgpack/msgpack-c/wiki/v2_0_c_overview#using-unpacker
Okay, as the title says, I am trying to figure out a way to write a function that returns the character that dominates in a string. I might be able to figure it out, but it seems something is wrong with my logic and I failed at this. If someone can come up with this without problems I will be extremely glad, thank you.
I say "in a string" to simplify things. I am actually doing this on buffered data containing a BMP image, trying to output the base colour (the dominant pixel).
What I have for now is this unfinished function I started:
RGB
bitfox_get_primecolor_direct
(char *FILE_NAME)
{
    dword size = bmp_dgets(FILE_NAME, byte);
    FILE* fp = fopen(convert(FILE_NAME), "rb");
    BYTE *PIX_ARRAY = malloc(size-54+1), *PIX_CUR = calloc(sizeof(RGB), sizeof(BYTE));
    dword readed, i, l;
    RGB color, prime_color;
    fseek(fp, 54, SEEK_SET); readed = fread(PIX_ARRAY, 1, size-54, fp);
    for(i = 0; i < size-54; i += 3)
    {
        color = bitfox_pixel_init(PIX_ARRAY[i], PIX_ARRAY[i+1], PIX_ARRAY[i+2]);
        memmove(PIX_CUR, &color, sizeof(RGB));
        for(l = 0; l < size-54; l += 3)
        {
            if (PIX_CUR[2] == PIX_ARRAY[l] && PIX_CUR[1] == PIX_ARRAY[l+1] &&
                PIX_CUR[0] == PIX_ARRAY[l+2])
            {
                /* unfinished: count the matches and track the most frequent colour */
            }
Note that RGB is a struct containing 3 bytes (R, G and B).
I know that's not much, but that's all I have for now.
Is there any way I can finish this?
If you want this done fast, throw a stack of RAM at it (if available, of course). You can use a large direct-lookup table, using the RGB trio to manufacture a 24-bit index into a contiguous array of counters. In partial-pseudo, partial code, something like this:
// create a zero-filled 2^24 array of unsigned counters.
uint32_t *counts = calloc(256*256*256, sizeof(*counts));
uint32_t max_count = 0, max_idx = 0;
// enumerate your buffer of RGB values, three bytes at a time:
unsigned char rgb[3];
while (getNextRGB(src, rgb)) // returns false when no more data.
{
    uint32_t idx = (((uint32_t)rgb[0]) << 16) | (((uint32_t)rgb[1]) << 8) | (uint32_t)rgb[2];
    if (++counts[idx] > max_count)
    {
        max_count = counts[idx];   // highest count seen so far
        max_idx = idx;             // and the colour it belongs to
    }
}
R = (max_idx >> 16) & 0xFF;
G = (max_idx >> 8) & 0xFF;
B = max_idx & 0xFF;
// free when you have no more images to process. for each new
// image you can memset the buffer to zero and reset the max
// for a fresh start.
free(counts);
That's it. If you can afford to throw a big hulk of memory at this (it would be 64 MB in this case, at 4 bytes per entry for 16.7M entries), then performing this becomes O(N). If you have a succession of images to process you can simply memset() the array back to zeros, clear max_count and max_idx, and repeat for each additional file. Finally, don't forget to free your memory when finished.
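To tie this back to your BMP buffer, here is a hedged sketch of the same counting loop driven directly off the pixel data you already read past the 54-byte header. It assumes a 24-bit BMP whose rows happen to need no padding (i.e. width*3 is a multiple of 4) and the usual B, G, R byte order; a robust version would parse the header instead of hard-coding these:

#include <cstdint>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    if (argc < 2) { std::fprintf(stderr, "usage: %s image.bmp\n", argv[0]); return 1; }

    // Read the whole file, then skip the 54-byte BMP header as in the question.
    std::FILE* fp = std::fopen(argv[1], "rb");
    if (!fp) { std::perror("fopen"); return 1; }
    std::fseek(fp, 0, SEEK_END);
    long size = std::ftell(fp);
    std::fseek(fp, 54, SEEK_SET);
    std::vector<unsigned char> pix(size > 54 ? size - 54 : 0);
    size_t got = std::fread(pix.data(), 1, pix.size(), fp);
    std::fclose(fp);

    // One counter per possible 24-bit colour (64 MB of counters).
    std::vector<uint32_t> counts(256u * 256u * 256u, 0);
    uint32_t max_count = 0, max_idx = 0;
    for (size_t i = 0; i + 2 < got; i += 3) {
        // BMP stores pixels as B, G, R
        uint32_t idx = (uint32_t(pix[i + 2]) << 16) | (uint32_t(pix[i + 1]) << 8) | uint32_t(pix[i]);
        if (++counts[idx] > max_count) { max_count = counts[idx]; max_idx = idx; }
    }
    std::printf("dominant colour: R=%u G=%u B=%u (count %u)\n",
                (unsigned)((max_idx >> 16) & 0xFF), (unsigned)((max_idx >> 8) & 0xFF),
                (unsigned)(max_idx & 0xFF), (unsigned)max_count);
}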
Best of luck.
I'm struggling to modify a program that takes two files as input (each representing a vector) and calculates the dot product between them. It's supposed to be done in parallel, but I was told that the number of points in each file might not be evenly divisible by the number of available processes, and that each process might read from incorrect positions within the files. What I mean is that, with four processes, the first 250 points might be read and calculated correctly, but the second process might read over those same 250 points and produce an incorrect result. This is what I've done so far; any modifications I've made are noted.
#include "fstream"
#include "stdlib.h"
#include "stdio.h"
#include "iostream"
#include "mpi.h"
int main(int argc, char *argv[]){
MPI_Init(&argc, &argv);
//parse command line arguments
if( argc < 3 || argc > 3){
std::cout << "*** syntax: " << argv[0] << " vecFile1.txt vecFile2.txt" << std::endl;
return(0);
}
//get input file names
std::string vecFile1(argv[1]);
std::string vecFile2(argv[2]);
//open file streams
std::ifstream vecStream1(vecFile1.c_str());
std::ifstream vecStream2(vecFile2.c_str());
//check that streams opened properly
if(!vecStream1.is_open() || !vecStream2.is_open()){
std::cout << "*** Could not open Files ***" << std::endl;
return(0);
}
//if files are open read their lengths and make sure they are compatible
long vecLength1 = 0; vecStream1 >> vecLength1;
long vecLength2 = 0; vecStream2 >> vecLength2;
if( vecLength1 != vecLength2){
std::cout << "*** Vectors are no the same length ***" << std::endl;
return(0);
}
int numProc; //New variable for the number of processes
MPI_Comm_size(MPI_COMM_WORLD, &numProc); //Added line to obtain the number of processes
int subDomainSize = (vecLength1+numProc-1)/numProc; //Not sure if this is the correct calculation; meant to account for divisibility with remainders
//read in the vector components and perform dot product
double dotSum = 0.;
for(long i = 0; i < subDomainSize; i++){ //Original parameter used was vecLength1; subDomainSize used instead for each process
double ind1 = 0.; vecStream1 >> ind1;
double ind2 = 0.; vecStream2 >> ind2;
dotSum += ind1*ind2;
}
std::cout << "VECTOR DOT PRODUCT: " << dotSum << std::endl;
MPI_Finalize();
}
Aside from those changes, I don't know where to go from here. What can I do to make this program properly calculate the dot product of two vectors in parallel, using two text files as input? Each contains 100000 points, so it's impractical to modify the files manually.
I won't write the code here, as this seems to be an assignment problem, but I will try to give you some tips to head in the right direction.
Each process has an assigned rank that can be found with the MPI_Comm_rank API. For parallel processing you can divide the vectors in the files among the processes, so that the process with rank r handles the elements from r*subDomainSize to (r+1)*subDomainSize - 1.
You need to make sure each process reads the vector elements from the correct position in the file. Use the stream's seek API (seekg) to jump to the right offset if your records have a fixed width; otherwise read and discard the values that belong to lower ranks. Only then start calling the read (>>) operator of your file stream.
For calculating subDomainSize I am not sure whether the formula you used works. There are several approaches; the simplest is to use vecLength1/numProc as subDomainSize, have every process handle subDomainSize elements, and let the last process (rank == numProc - 1) also take the remaining elements.
After the for loop, use a reduction operation to collect the individual sums from the processes and add them up for the final result. See MPI_Reduce.
MPI_Reduce is itself a collective call that every process must reach, so an explicit MPI_Barrier after the loop is not required for correctness, though you can add one while debugging. A minimal sketch along these lines follows.
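For illustration only, here is a compact sketch along those lines. It is not the assignment solution, just one way to wire the pieces together; the block split and the read-and-discard skip are assumptions about your file layout (first line holds the length, values follow as plain text), and argument checking is omitted for brevity:

#include <fstream>
#include <iostream>
#include "mpi.h"

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);
    int rank = 0, numProc = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numProc);

    std::ifstream vecStream1(argv[1]), vecStream2(argv[2]);
    long vecLength1 = 0, vecLength2 = 0;
    vecStream1 >> vecLength1;
    vecStream2 >> vecLength2;

    // Block decomposition: every rank gets vecLength1/numProc elements,
    // and the last rank also takes the remainder.
    long base = vecLength1 / numProc;
    long start = rank * base;
    long count = (rank == numProc - 1) ? vecLength1 - start : base;

    // Plain text input: skip the values owned by lower ranks by reading them.
    double skip;
    for (long i = 0; i < start; ++i) { vecStream1 >> skip; vecStream2 >> skip; }

    double localSum = 0.0;
    for (long i = 0; i < count; ++i) {
        double a = 0.0, b = 0.0;
        vecStream1 >> a;
        vecStream2 >> b;
        localSum += a * b;
    }

    double dotSum = 0.0;
    MPI_Reduce(&localSum, &dotSum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        std::cout << "VECTOR DOT PRODUCT: " << dotSum << std::endl;

    MPI_Finalize();
}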
Is it possible to determine the file type from the magic number of a file?
If I've understood correctly, the magic number can have different sizes; maybe a reference dictionary or something like a library could help me?
Is it possible to determine the file type from the magic number of a file?
Yes, you can, because each file format has a different magic number.
For example, FF D8 for .jpg files.
See here Magic Numbers in Files
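To illustrate the idea (a toy sketch only, with a deliberately tiny signature table; the struct name and the 16-byte read are arbitrary choices), you can read the first few bytes of the file and compare them against known signatures:

#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

struct Signature { std::string name; std::vector<unsigned char> magic; };

int main(int argc, char** argv) {
    if (argc < 2) { std::fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    // A few well-known signatures; a real tool would use a full table (or libmagic).
    const std::vector<Signature> sigs = {
        { "JPEG", { 0xFF, 0xD8, 0xFF } },
        { "PNG",  { 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A } },
        { "PDF",  { '%', 'P', 'D', 'F' } },
        { "ZIP",  { 'P', 'K', 0x03, 0x04 } },
    };

    std::FILE* fp = std::fopen(argv[1], "rb");
    if (!fp) { std::perror("fopen"); return 1; }
    unsigned char head[16] = {0};
    size_t got = std::fread(head, 1, sizeof(head), fp);
    std::fclose(fp);

    for (const Signature& s : sigs) {
        if (got >= s.magic.size() && std::memcmp(head, s.magic.data(), s.magic.size()) == 0) {
            std::printf("%s: %s\n", argv[1], s.name.c_str());
            return 0;
        }
    }
    std::printf("%s: unknown\n", argv[1]);
    return 0;
}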
The file command on Linux does precisely that. Study its internals to see how it identifies files using their magic numbers (signature bytes). The complete source code is available at darwinsys.com/file.
The following two lists are the most comprehensive ones for file types and their magic numbers:
- File Signature Table
- Linux Magic Numbers
jMimeMagic is a Java library for this purpose.
Use libmagic (apt-get install libmagic-dev on Ubuntu systems).
The example below uses the default magic database to query the file passed on the command line. (It is essentially an implementation of the file command; see man libmagic for more details and functions.)
#include <iostream>
#include <magic.h>
#include <cassert>
int main(int argc, char **argv) {
if (argc == 1) {
std::cerr << "Usage " << argv[0] << " [filename]" << std::endl;
return -1;
}
const char * fname = argv[1];
magic_t cookie = magic_open(0);
assert (cookie !=nullptr);
int rc = magic_load(cookie, nullptr);
assert(rc == 0);
auto f= magic_file(cookie, fname);
if (f ==nullptr) {
std::cerr << magic_error(cookie) << std::endl;
} else {
std::cout << fname << ' ' << f << std::endl;
}
}
I'm working on my gEDA fork and want to get rid of the existing simple tile-based system1 in favour of a real spatial index2.
An algorithm that efficiently finds points is not enough: I need to find objects with non-zero extent. Think in terms of objects having bounding rectangles, that pretty much captures the level of detail I need in the index. Given a search rectangle, I need to be able to efficiently find all objects whose bounding rectangles are inside, or that intersect, the search rectangle.
The index can't be read-only: gschem is a schematic capture program, and the whole point of it is to move things around the schematic diagram. So things are going to be a'changing. So while I can afford insertion to be a bit more expensive than searching, it can't be too much more expensive, and deleting must also be both possible and reasonably cheap. But the most important requirement is the asymptotic behaviour: searching should be O(log n) if it can't be O(1). Insertion / deletion should preferably be O(log n), but O(n) would be okay. I definitely don't want anything > O(n) (per action; obviously O(n log n) is expected for an all-objects operation).
What are my options? I don't feel clever enough to evaluate the various options. Ideally there'd be some C library that will do all the clever stuff for me, but I'll mechanically implement an algorithm I may or may not fully understand if I have to. gEDA uses glib by the way, if that helps to make a recommendation.
Footnotes:
1 Standard gEDA divides a schematic diagram into a fixed number (currently 100) of "tiles" which serve to speed up searches for objects in a bounding rectangle. This is obviously good enough to make most schematics fast enough to search, but the way it's done causes other problems: far too many functions require a pointer to a de-facto global object. The tile geometry is also fixed: it would be possible to defeat this tiling system completely simply by panning (and possibly zooming) to an area covered by only one tile.
2 A legitimate answer would be to keep elements of the tiling system, but to fix its weaknesses: teaching it to span the entire space, and to sub-divide when necessary. But I'd like others to add their two cents before I autocratically decide that this is the best way.
A nice data structure for a mix of points and lines would be an R-tree or one of its derivatives (e.g. an R*-Tree or a Hilbert R-Tree). Given that you want this index to be dynamic and serializable, I think using SQLite's R*-Tree module would be a reasonable approach.
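If you go the SQLite route, a rough sketch of the idea looks like this (assuming an SQLite build with the R*-Tree module enabled, i.e. SQLITE_ENABLE_RTREE, which most shipped builds have; the table and column names are arbitrary). The module stores one bounding box per row and answers intersection queries directly:

#include <cstdio>
#include <sqlite3.h>

static int print_id(void*, int argc, char** argv, char**) {
    if (argc > 0) std::printf("hit: object id %s\n", argv[0]);
    return 0;
}

int main() {
    sqlite3* db = nullptr;
    if (sqlite3_open(":memory:", &db) != SQLITE_OK) return 1;
    char* err = nullptr;

    // One row per object: id plus its axis-aligned bounding box.
    sqlite3_exec(db,
        "CREATE VIRTUAL TABLE objects USING rtree(id, minX, maxX, minY, maxY);"
        "INSERT INTO objects VALUES (1, 0, 10, 0, 10);"    // a small component
        "INSERT INTO objects VALUES (2, 5, 25, 5, 8);",     // a wire crossing it
        nullptr, nullptr, &err);
    if (err) { std::fprintf(stderr, "%s\n", err); sqlite3_free(err); }

    // Find every object whose bounding box intersects the search
    // rectangle (8, 2)-(30, 9).
    sqlite3_exec(db,
        "SELECT id FROM objects "
        "WHERE maxX >= 8 AND minX <= 30 AND maxY >= 2 AND minY <= 9;",
        print_id, nullptr, &err);
    if (err) { std::fprintf(stderr, "%s\n", err); sqlite3_free(err); }

    sqlite3_close(db);
}

Moving an object is then just an UPDATE of its row and deleting it is a DELETE, so the index stays fully dynamic.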
If you can tolerate C++, libspatialindex has a mature and flexible R-tree implementation which supports dynamic inserts/deletes and serialization.
Your needs sound very similar to what is used in collision detection algorithms for games and physics simulations. There are several open source C++ libraries that handle this in 2-D (Box2D) or 3-D (Bullet physics). Although your question is for C, you may find their documentation and implementations useful.
Usually this is split into a two phases:
A fast broad phase that approximates objects by their axis-aligned bounding box (AABB), and determines pairs of AABBs that touch or overlap.
A slower narrow phase that calculates the points of geometric overlap for pairs of objects whose AABBs touch or overlap.
Physics engines also use spatial coherence to further reduce the pairs of objects that are compared, but this optimization probably won't help your application.
The broad phase is usually implemented with an O(N log N) algorithm like sweep and prune. You may be able to accelerate this by using it in conjunction with the current tile approach (one of Nvidia's GPU Gems describes this hybrid approach). The narrow phase is quite costly for each pair, and may be overkill for your needs. The GJK algorithm is often used for convex objects in this step, although faster algorithms exist for more specialized cases (e.g. box/circle and box/sphere collisions).
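To make the broad phase concrete, here is a small illustrative sketch of sweep and prune over axis-aligned bounding boxes (not tuned code, and the AABB struct is just one assumption about how you might represent extents): sort by minimum x, then for each box only examine later boxes until their x-intervals stop overlapping, and check y overlap only for those.

#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

struct AABB { int id; double minX, maxX, minY, maxY; };

// Return all pairs of boxes whose rectangles touch or overlap.
std::vector<std::pair<int, int>> sweepAndPrune(std::vector<AABB> boxes) {
    std::sort(boxes.begin(), boxes.end(),
              [](const AABB& a, const AABB& b) { return a.minX < b.minX; });
    std::vector<std::pair<int, int>> pairs;
    for (size_t i = 0; i < boxes.size(); ++i) {
        for (size_t j = i + 1; j < boxes.size(); ++j) {
            if (boxes[j].minX > boxes[i].maxX) break;   // no later box can overlap in x
            if (boxes[i].minY <= boxes[j].maxY && boxes[j].minY <= boxes[i].maxY)
                pairs.emplace_back(boxes[i].id, boxes[j].id);
        }
    }
    return pairs;
}

int main() {
    std::vector<AABB> boxes = {
        {1, 0, 10, 0, 10}, {2, 5, 25, 5, 8}, {3, 40, 50, 0, 5}
    };
    for (auto& p : sweepAndPrune(boxes))
        std::printf("overlap: %d and %d\n", p.first, p.second);
}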
This sounds like an application well-suited to a quadtree (assuming you are interested only in 2D). The quadtree is hierarchical (good for searching) and its spatial resolution is dynamic (allowing higher resolution in areas that need it).
I've always rolled my own quadtrees, but here is a library that appears reasonable: http://www.codeproject.com/Articles/30535/A-Simple-QuadTree-Implementation-in-C
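If you do roll your own, the core of a quadtree that stores boxed objects is fairly small. The sketch below is only an illustration (the depth limit and world bounds are arbitrary, and it supports insert and query but not deletion): each object lives in the smallest node that fully contains its bounding box, and a query only descends into children whose regions intersect the search rectangle.

#include <cstdio>
#include <memory>
#include <vector>

struct Box { double minX, minY, maxX, maxY; int id; };

static bool contains(const Box& outer, const Box& inner) {
    return outer.minX <= inner.minX && inner.maxX <= outer.maxX &&
           outer.minY <= inner.minY && inner.maxY <= outer.maxY;
}
static bool intersects(const Box& a, const Box& b) {
    return a.minX <= b.maxX && b.minX <= a.maxX &&
           a.minY <= b.maxY && b.minY <= a.maxY;
}

class QuadTree {
    Box bounds;                       // region covered by this node
    std::vector<Box> items;           // objects that straddle a split line (or max depth)
    std::unique_ptr<QuadTree> child[4];
    int depth;

public:
    QuadTree(const Box& b, int d = 0) : bounds(b), depth(d) {}

    void insert(const Box& obj) {
        if (depth < 8) {              // arbitrary depth limit
            double mx = (bounds.minX + bounds.maxX) / 2;
            double my = (bounds.minY + bounds.maxY) / 2;
            Box quads[4] = {
                {bounds.minX, bounds.minY, mx, my, -1}, {mx, bounds.minY, bounds.maxX, my, -1},
                {bounds.minX, my, mx, bounds.maxY, -1}, {mx, my, bounds.maxX, bounds.maxY, -1}
            };
            for (int i = 0; i < 4; ++i) {
                if (contains(quads[i], obj)) {
                    if (!child[i]) child[i].reset(new QuadTree(quads[i], depth + 1));
                    child[i]->insert(obj);
                    return;
                }
            }
        }
        items.push_back(obj);
    }

    void query(const Box& rect, std::vector<int>& out) const {
        for (const Box& obj : items)
            if (intersects(obj, rect)) out.push_back(obj.id);
        for (int i = 0; i < 4; ++i)
            if (child[i] && intersects(child[i]->bounds, rect))
                child[i]->query(rect, out);
    }
};

int main() {
    QuadTree tree(Box{0, 0, 1000, 1000, -1});
    tree.insert(Box{10, 10, 20, 20, 1});
    tree.insert(Box{490, 490, 510, 510, 2});   // straddles the centre, stays near the root
    tree.insert(Box{800, 800, 900, 900, 3});

    std::vector<int> hits;
    tree.query(Box{0, 0, 100, 100, -1}, hits);
    for (int id : hits) std::printf("found object %d\n", id);
}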
It is easy to do; it's hard to do fast. This sounds like a problem I worked on, where there was a vast list of (min, max) values and, given a value, it had to return how many (min, max) pairs overlapped that value. You just have it in two dimensions, so you do it with two trees, one for each direction, and then intersect the results. This is really fast.
#include <iostream>
#include <fstream>
#include <map>
using namespace std;
typedef unsigned int UInt;
class payLoad {
public:
UInt starts;
UInt finishes;
bool isStart;
bool isFinish;
payLoad ()
{
starts = 0;
finishes = 0;
isStart = false;
isFinish = false;
}
};
typedef map<UInt,payLoad> ExtentMap;
//==============================================================================
class Extents
{
ExtentMap myExtentMap;
public:
void ReadAndInsertExtents ( const char* fileName )
{
UInt start, finish;
ExtentMap::iterator EMStart;
ExtentMap::iterator EMFinish;
ifstream efile ( fileName);
cout << fileName << " filename" << endl;
while (efile >> start >> finish) {
//cout << start << " start " << finish << " finish" << endl;
EMStart = myExtentMap.find(start);
if (EMStart==myExtentMap.end()) {
payLoad pay;
pay.isStart = true;
myExtentMap[start] = pay;
EMStart = myExtentMap.find(start);
}
EMFinish = myExtentMap.find(finish);
if (EMFinish==myExtentMap.end()) {
payLoad pay;
pay.isFinish = true;
myExtentMap[finish] = pay;
EMFinish = myExtentMap.find(finish);
}
EMStart->second.starts++;
EMFinish->second.finishes++;
EMStart->second.isStart = true;
EMFinish->second.isFinish = true;
// for (EMStart=myExtentMap.begin(); EMStart!=myExtentMap.end(); EMStart++)
// cout << "| key " << EMStart->first << " count " << EMStart->second.value << " S " << EMStart->second.isStart << " F " << EMStart->second.isFinish << endl;
}
efile.close();
UInt count = 0;
for (EMStart=myExtentMap.begin(); EMStart!=myExtentMap.end(); EMStart++)
{
count += EMStart->second.starts - EMStart->second.finishes;
EMStart->second.starts = count + EMStart->second.finishes;
}
// for (EMStart=myExtentMap.begin(); EMStart!=myExtentMap.end(); EMStart++)
// cout << "||| key " << EMStart->first << " count " << EMStart->second.starts << " S " << EMStart->second.isStart << " F " << EMStart->second.isFinish << endl;
}
void ReadAndCountNumbers ( const char* fileName )
{
UInt number, count;
ExtentMap::iterator EMStart;
ExtentMap::iterator EMTemp;
if (myExtentMap.empty()) return;
ifstream nfile ( fileName);
cout << fileName << " filename" << endl;
while (nfile >> number)
{
count = 0;
//cout << number << " number ";
EMStart = myExtentMap.find(number);
EMTemp = myExtentMap.end();
if (EMStart==myExtentMap.end()) { // if we don't find the number then create one so we can find the nearest number.
payLoad pay;
myExtentMap[ number ] = pay;
EMStart = EMTemp = myExtentMap.find(number);
if ((EMStart!=myExtentMap.begin()) && (!EMStart->second.isStart))
{
EMStart--;
}
}
if (EMStart->first < number) {
while (!EMStart->second.isFinish) {
//cout << "stepped through looking for end - key" << EMStart->first << endl;
EMStart++;
}
if (EMStart->first >= number) {
count = EMStart->second.starts;
//cout << "found " << count << endl;
}
}
else if (EMStart->first==number) {
count = EMStart->second.starts;
}
cout << count << endl;
//cout << "| count " << count << " key " << EMStart->first << " S " << EMStart->second.isStart << " F " << EMStart->second.isFinish<< " V " << EMStart->second.value << endl;
if (EMTemp != myExtentMap.end())
{
myExtentMap.erase(EMTemp->first);
}
}
nfile.close();
}
};
//==============================================================================
int main (int argc, char* argv[]) {
Extents exts;
exts.ReadAndInsertExtents ( "..//..//extents.txt" );
exts.ReadAndCountNumbers ( "..//../numbers.txt" );
return 0;
}
The extents test file was 1.5 MB of:
0 200000
1 199999
2 199998
3 199997
4 199996
5 199995
....
99995 100005
99996 100004
99997 100003
99998 100002
99999 100001
The numbers file was like:
102731
104279
109316
104859
102165
105762
101464
100755
101068
108442
107777
101193
104299
107080
100958
.....
Even reading the two files from disk (extents 1.5 MB, numbers 780 KB), with the really large number of values and lookups, this runs in a fraction of a second. If it were all in memory it would be lightning quick.