Simple CLucene example query/search returns no hits - clucene

I have what I think should be a very simple CLucene experiment, but it returns no hits.
I have two separate programs, CreateIndex and Query.
As far as I can tell, CreateIndex builds a viable index file, but Query returns zero hits. The OS is CentOS 6.4 and the CLucene version is 2.3.3.4.
Here is CreateIndex.cpp:
lucene::analysis::SimpleAnalyzer* analyzer;
int main(int argc, char** argv)
{
analyzer = new lucene::analysis::SimpleAnalyzer();
Directory* indexDir = FSDirectory::getDirectory("../Index");
IndexWriter* w = new IndexWriter(indexDir, analyzer, true, true);
int config = Field::STORE_YES && Field::INDEX_TOKENIZED;
Field* field;
Document* doc;
doc = new Document();
field = new Field(L"president", L"Nixon", config);
doc->clear();
doc->add(*field);
w->addDocument(doc);
field = new Field(L"president", L"Obama", config);
doc->clear();
doc->add(*field);
w->addDocument(doc);
field = new Field(L"president", L"Clinton", config);
doc->clear();
doc->add(*field);
w->addDocument(doc);
w->close();
indexDir->close();
}
And here is Query.cpp:
int main(int argc, char** argv)
{
IndexReader* reader = IndexReader::open("../Index");
lucene::analysis::SimpleAnalyzer* analyzer =
new lucene::analysis::SimpleAnalyzer();
IndexReader* newreader = reader->reopen();
if ( newreader != reader )
{
_CLLDELETE(reader);
reader = newreader;
}
IndexSearcher searcher(reader);
Query* query = QueryParser::parse(L"Nixon*",
L"president", analyzer);
Hits* hits = searcher.search(query);
cout << "Total hits: " << hits->length() << endl;
}


How to add subMsg to msg repeated using Nanopb?

I'm simply trying to add one message to another message (up to 60 times).
My .proto file looks as follows:
syntax = "proto3";
message FeatureFile {
string fileName= 2;
string Id= 3;
repeated Feature features = 1;
}
message Feature {
int32 version = 1;
int32 epochTime = 2;
int32 noOfObs= 3;
int32 frequency = 4;
}
I have tried to make a callback function to add repeated data, but cannot make it work.
bool encode_string(pb_ostream_t* stream, const pb_field_t* field, void* const* arg)
{
const char* str = (const char*)(*arg);
if (!pb_encode_tag_for_field(stream, field))
return false;
return pb_encode_string(stream, (uint8_t*)str, strlen(str));
}
bool encode_repeatedMsg(pb_ostream_t* stream, const pb_field_t* field, void* const* arg)
{
const char* obj = (const char*)(*arg);
int i;
for (i = 0; i < 60; i++)
{
if (!pb_encode_tag_for_field(stream, field))
return false;
if (!pb_encode_submessage(stream, Feature_fields, *arg))
return false;
}
return true;
}
int main()
{
FeatureFile featurefile = FeatureFile_init_zero;
Feature feature = Feature_init_zero;
featurefile.fileName.arg = "092536.csv";
featurefile.fileName.funcs.encode = &encode_string;
featurefile.Id.arg = "";
featurefile.Id.funcs.encode = &encode_string;
feature.version = 1;
feature.epochTime = 12566232;
feature.noOfObs = 260;
feature.frequency = 200;
featurefile.features.funcs.encode = &encode_repeatedMsg;
I thought I could call the repeated encoding as the last line of code shows, but it doesn't allow me to.
The callback itself is supposed to add 60 copies of the same message (feature) to the featurefile.
Can anyone help me here?
I have never used the callbacks in nanopb myself. I have been using the .options file to statically allocate the desired array size. In your case this might be a bit much, as you require 60 messages, but this is how you do it:
You create a file with the same name as your .proto file but with the extension .options, and place it in the same folder as your .proto file. In that file you name the repeated field and assign it a size:
# XXXX.options
FeatureFile.features max_count:16
More information on the nanopb options can be found in the nanopb documentation.

Can we use arrays for multiple publish topics in ros?

I was working on a ROS-based UAV simulation, and it struck me when I had to initialize separate publishers for each UAV: is it possible to make an array of such publishers and then reference them by their index number? I know I should just try it, but I guessed asking here would be a faster option :)
Yes, this is possible by collecting multiple ros::Publishers in a container. Here is a small example using an array:
#include <array>
#include <sstream>
#include <ros/ros.h>
#include <std_msgs/String.h>
int main(int argc, char *argv[])
{
ros::init(argc, argv, "test_node");
ros::NodeHandle nh;
ros::WallTimer timer;
//Create publishers
std::array<ros::Publisher, 3> publishers;
for (size_t i = 0; i < publishers.size(); i++)
{
std::stringstream topic_name;
topic_name << "topic" << i;
publishers[i] = nh.advertise<std_msgs::String>(topic_name.str(), 0);
}
//Publish
ros::Rate r(1);
std_msgs::String msg;
while (nh.ok())
{
std::stringstream message;
message << "Hello World " << ros::Time::now();
msg.data = message.str();
for (size_t i = 0; i < publishers.size(); i++)
{
publishers[i].publish(msg);
}
ros::spinOnce();
r.sleep();
}
return 0;
}
The node advertises the three topics
/topic0
/topic1
/topic2
and publishes a simple string like Hello World 1562571209.130936883 on each of them at a rate of 1 Hz.

Fuzzy regex match using TRE

I'm trying to use the TRE library in my C program to perform a fuzzy regex search. I've managed to piece together this code from reading the docs:
regex_t rx;
regcomp(&rx, "(January|February)", REG_EXTENDED);
int result = regexec(&rx, "January", 0, 0, 0);
However, this matches only an exact regex (i.e. no spelling errors are allowed). I don't see any parameter that lets me set the fuzziness in those functions:
int regcomp(regex_t *preg, const char *regex, int cflags);
int regexec(const regex_t *preg, const char *string, size_t nmatch,
regmatch_t pmatch[], int eflags);
How can I set the level of fuzziness (i.e. maximum Levenshtein distance), and how do I get the Levenshtein distance of the match?
Edit: I forgot to mention I'm using the Windows binaries from GnuWin32, which are available only for version 0.7.5. Binaries for 0.8.0 are available only for Linux.
Thanks to @Wiktor Stribiżew, I found out which function I need to use, and I've successfully compiled a working example:
#include <stdio.h>
#include "regex.h"
int main() {
regex_t rx;
regcomp(&rx, "(January|February)", REG_EXTENDED);
regaparams_t params = { 0 };
params.cost_ins = 1;
params.cost_del = 1;
params.cost_subst = 1;
params.max_cost = 2;
params.max_del = 2;
params.max_ins = 2;
params.max_subst = 2;
params.max_err = 2;
regamatch_t match;
match.nmatch = 0;
match.pmatch = 0;
if (!regaexec(&rx, "Janvary", &match, params, 0)) {
printf("Levenshtein distance: %d\n", match.cost);
} else {
printf("Failed to match\n");
}
return 0;
}

Reading and parsing text file exception-C#

I am parsing big text files. It works fine for some time, but after a few minutes it throws an exception (An unhandled exception of type 'System.UnauthorizedAccessException' occurred in System.Core.dll
Additional information: Access to the path is denied.)
I get the exception on the line below.
accessor = MemoryMapped.CreateViewAccessor(offset, length, MemoryMappedFileAccess.Read);
Below is my function:
public static void CityStateZipAndZip4(string FilePath,long offset,long length,string spName)
{
try
{
long indexBreak = offset;
string fileName = Path.GetFileName(FilePath);
if (fileName.Contains(".txt"))
fileName = fileName.Replace(".txt", "");
System.IO.FileStream file = new System.IO.FileStream(@FilePath, FileMode.Open,FileAccess.Read, FileShare.Read );
Int64 b = file.Length;
MemoryMappedFile MemoryMapped = MemoryMappedFile.CreateFromFile(file, fileName, b, MemoryMappedFileAccess.Read, null, HandleInheritability.Inheritable, false);
using (MemoryMapped)
{
//long offset = 182; // 256 megabytes
//long length = 364; // 512 megabytes
MemoryMappedViewAccessor accessor = MemoryMapped.CreateViewAccessor(offset, length, MemoryMappedFileAccess.Read);
byte byteValue;
int index = 0;
int count = 0;
StringBuilder message = new StringBuilder();
do
{
if (indexBreak == index)
{
count = count + 1;
accessor.Dispose();
string NewRecord = message.ToString();
offset = offset + indexBreak;
length = length + indexBreak;
if (NewRecord.IndexOf("'") != -1)
{ NewRecord = NewRecord.Replace("'", "''"); }
// string Sql = "insert into " + DBTableName + " (ID, DataString) values( " + count + ",'" + NewRecord + "')";
string Code = "";
if (spName == AppConfig.sp_CityStateZip)
{
Code = NewRecord.Trim().Substring(0, 1);
}
InsertUpdateAndDeleteDB(spName, NewRecord.Trim (), Code);
accessor = MemoryMapped.CreateViewAccessor(offset, length, MemoryMappedFileAccess.Read);
message = new StringBuilder();
index = 0;
//break;
}
byteValue = accessor.ReadByte(index);
if (byteValue != 0)
{
char asciiChar = (char)byteValue;
message.Append(asciiChar);
}
index++;
} while (byteValue != 0);
}
MemoryMapped.Dispose();
}
catch (FileNotFoundException)
{
Console.WriteLine("Memory-mapped file does not exist. Run Process A first.");
}
}
Somewhere deep in resource processing code we have something like this:
try {
// Try loading some strings here.
} catch {
// Oops, could not load strings, try another way.
}
The exception is thrown and already handled, so it would never show up in your application. The only way to see it is to attach a debugger and observe this message.
As you can see from the code, it has nothing to do with your problem. The real problem here is that the debugger shows you something you should not see.
Run the solution without the debugger attached and it works fine.
This exception means that your program does not get Read access to the file from Windows.
Have you made sure that this file is not locked when your program tries to read it?
For example, it could be a file that your own program is currently using.
If not, try to run your program as an Administrator and see if it makes a difference.

SQLite in C and supporting REGEXP

I'm using sqlite3 in C and I'd like to add support for the REGEXP operator. By default, a user defined function regexp() is not present and calling REGEXP will usually result in an error (according to the SQLite pages).
How do I add a regexp function to support REGEXP? Presumably I will do this via the sqlite3_create_function call, but I don't know what the application-defined regexp() will look like.
Can I use a function from regex.h with sqlite3_create_function and how? Any function I pass to SQLite has to take three arguments of type sqlite3_context*, int, sqlite3_value**. However, the SQLite documents don't seem to explain the meaning of these parameters.
Is there sample code for a C regexp() function?
I've not been able to find much on this using Google or the SQLite pages.
You can also try this:
#include <regex.h>
...
void sqlite_regexp(sqlite3_context* context, int argc, sqlite3_value** values) {
int ret;
regex_t regex;
char* reg = (char*)sqlite3_value_text(values[0]);
char* text = (char*)sqlite3_value_text(values[1]);
if ( argc != 2 || reg == 0 || text == 0) {
sqlite3_result_error(context, "SQL function regexp() called with invalid arguments.\n", -1);
return;
}
ret = regcomp(&regex, reg, REG_EXTENDED | REG_NOSUB);
if ( ret != 0 ) {
sqlite3_result_error(context, "error compiling regular expression", -1);
return;
}
ret = regexec(&regex, text , 0, NULL, 0);
regfree(&regex);
sqlite3_result_int(context, (ret != REG_NOMATCH));
}
...
sqlite3_create_function(*db, "regexp", 2, SQLITE_ANY,0, &sqlite_regexp,0,0);
It would look something like this:
/* Uses the GNU regex API (re_compile_pattern, re_match) and strdupa/strndupa,
so it needs glibc with _GNU_SOURCE defined before <regex.h> and <string.h>. */
static void user_regexp(sqlite3_context *context, int argc, sqlite3_value **argv)
{
struct re_pattern_buffer buffer;
const char *out;
char *pattern;
char *input_string;
char *result;
struct re_registers regs;
if ((sqlite3_value_type(argv[0]) != SQLITE_TEXT)
|| (sqlite3_value_type(argv[1]) != SQLITE_TEXT))
{
sqlite3_result_error(context, "Improper argument types", -1);
return;
}
re_set_syntax(RE_SYNTAX_POSIX_EGREP);
memset(&buffer, 0, sizeof (buffer));
if (!(pattern = strdupa((const char *)sqlite3_value_text(argv[0])))
|| !(input_string = strdupa((const char *)sqlite3_value_text(argv[1]))))
{
sqlite3_result_error_nomem(context);
return;
}
/* re_compile_pattern returns NULL on success and an error message otherwise. */
if ((out = re_compile_pattern(pattern, strlen(pattern), &buffer)))
{
sqlite3_result_error(context, "Could not compile pattern!", -1);
return;
}
if (re_match(&buffer, input_string, strlen(input_string), 0, &regs) < 0)
sqlite3_result_int64(context, 0);
else
{
result = strndupa(input_string + regs.start[0], regs.end[0] - regs.start[0]);
/* -1 lets SQLite compute the length of the NUL-terminated result. */
sqlite3_result_text(context, result, -1, SQLITE_TRANSIENT);
}
/* Release the compiled pattern. */
regfree(&buffer);
}
Okay, this is a bit late, but I'm posting it for anyone who uses a C++ wrapper for the C SQLite API, such as SQLiteCpp, which I am using. This answer assumes that you use SQLiteCpp.
Install the Regex for Windows binaries. These give you just enough files, i.e. the regex.h include file and regex2.dll. Remember to add the path to regex.h to your project and to keep a copy of the DLL in the folder containing the client executables.
Before building SQLiteCpp, we need to make some changes to add regex capabilities to SELECT queries. To do this, open the Database.cpp file from the SQLiteCpp project and:
Include the regex.h header from Regex for Windows.
After all the includes, add the piece of code below (of course, you can customize it to fit your needs).
extern "C" {
void sqlite_regexp(sqlite3_context* context, int argc, sqlite3_value** values) {
int ret;
regex_t regex;
char regtext[100];
char* reg = (char*)sqlite3_value_text(values[0]);
char* text = (char*)sqlite3_value_text(values[1]);
/* printf("Text : %s\n", text);
printf("Reg : %s\n", reg); */
if (argc != 2 || reg == 0 || text == 0) {
sqlite3_result_error(context, "SQL function regexp() called with invalid arguments.\n", -1);
return;
}
// Wrap the pattern only after the NULL check. snprintf cannot overrun regtext; longer patterns are truncated.
snprintf(regtext, sizeof(regtext), ".*%s.*", reg);
//printf("Regtext : %s", regtext);
ret = regcomp(&regex, regtext, REG_EXTENDED | REG_NOSUB | REG_ICASE);
if (ret != 0) {
sqlite3_result_error(context, "error compiling regular expression", -1);
return;
}
ret = regexec(&regex, text, 0, NULL, 0);
/* if (ret == 0) {
printf("Found a match. Press any key to continue");
getc(stdin);
}*/
regfree(&regex);
sqlite3_result_int(context, (ret != REG_NOMATCH));
}
}
Now it is time to change the constructors defined in the file. Change them as shown below.
// Open the provided database UTF-8 filename with SQLite::OPEN_xxx provided flags.
Database::Database(const char* apFilename,
const int aFlags /*= SQLite::OPEN_READONLY*/,
const int aBusyTimeoutMs/* = 0 */,
const char* apVfs/*= NULL*/) :
mpSQLite(NULL),
mFilename(apFilename)
{
const int ret = sqlite3_open_v2(apFilename, &mpSQLite, aFlags, apVfs);
//std::cout << "Reached here";
//sqlite3_create_function_v2(mpSQLite, "REGEXP", 2, SQLITE_ANY,&sqlite_regexp, NULL, NULL, NULL,NULL);
sqlite3_create_function(mpSQLite, "regexp", 2, SQLITE_ANY, 0, &sqlite_regexp, 0, 0);
if (SQLITE_OK != ret)
{
const SQLite::Exception exception(mpSQLite, ret); // must create before closing
sqlite3_close(mpSQLite); // close is required even in case of error on opening
throw exception;
}
else {
}
if (aBusyTimeoutMs > 0)
{
setBusyTimeout(aBusyTimeoutMs);
}
}
// Open the provided database UTF-8 filename with SQLite::OPEN_xxx provided flags.
Database::Database(const std::string& aFilename,
const int aFlags /* = SQLite::OPEN_READONLY*/,
const int aBusyTimeoutMs/* = 0*/,
const std::string& aVfs/* = "" */) :
mpSQLite(NULL),
mFilename(aFilename)
{
const int ret = sqlite3_open_v2(aFilename.c_str(), &mpSQLite, aFlags, aVfs.empty() ? NULL : aVfs.c_str());
sqlite3_create_function(mpSQLite, "regexp", 2, SQLITE_ANY, 0, &sqlite_regexp, 0, 0);
if (SQLITE_OK != ret)
{
const SQLite::Exception exception(mpSQLite, ret); // must create before closing
sqlite3_close(mpSQLite); // close is required even in case of error on opening
throw exception;
}
if (aBusyTimeoutMs > 0)
{
setBusyTimeout(aBusyTimeoutMs);
}
}
By now, you have some serious regex capabilities in your SQLite build. Just build the project.
Write a client program to test the functionality. It can be something like the one below (borrowed without shame from the SQLiteCpp example).
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <SQLiteCpp/SQLiteCpp.h>
#include <SQLiteCpp/VariadicBind.h>
// Notice no sqlite3.h huh?
// Well, this is a C++ wrapper for the SQLITE CAPI afterall.
#ifdef SQLITECPP_ENABLE_ASSERT_HANDLER
namespace SQLite
{
/// definition of the assertion handler enabled when SQLITECPP_ENABLE_ASSERT_HANDLER is defined in the project (CMakeList.txt)
void assertion_failed(const char* apFile, const long apLine, const char* apFunc, const char* apExpr, const char* apMsg)
{
// Print a message to the standard error output stream, and abort the program.
std::cerr << apFile << ":" << apLine << ":" << " error: assertion failed (" << apExpr << ") in " << apFunc << "() with message \"" << apMsg << "\"\n";
std::abort();
}
}
#endif
/// Get example path
static inline std::string getExamplePath()
{
std::string filePath(__FILE__);
return filePath.substr(0, filePath.length() - std::string("Client.cpp").length());
}
/// Example Database
static const std::string filename_example_db3 = getExamplePath() + "/example.db3";
/// Image
static const std::string filename_logo_png = getExamplePath() + "/logo.png";
/// Object Oriented Basic example
class Example
{
public:
//Constructor
Example() :
mDb(filename_example_db3),
// User change the db and tables accordingly
mQuery(mDb, "SELECT id,name FROM lookup WHERE name REGEXP :keyword")
// Open a database file in readonly mode
{
}
virtual ~Example()
{
}
/// List the rows whose "name" column matches the provided keyword
void namehaskeyword(const std::string searchfor)
{
std::cout << "Matching results for " << searchfor << "\n";
// Bind the keyword string to the first parameter of the SQL query
mQuery.bind(1,searchfor);
// Loop to execute the query step by step, to get one row of results at a time
while (mQuery.executeStep())
{
std::cout<<mQuery.getColumn(0) << "\t" << mQuery.getColumn(1) << "\n";
}
// Reset the query to be able to use it again later
mQuery.reset();
}
private:
SQLite::Database mDb; ///< Database connection
SQLite::Statement mQuery; ///< Database prepared SQL query
};
int main()
{
// Using SQLITE_VERSION would require #include <sqlite3.h> which we want to avoid: use SQLite::VERSION if possible.
// std::cout << "SQlite3 version " << SQLITE_VERSION << std::endl;
std::cout << "SQlite3 version " << SQLite::VERSION << " (" << SQLite::getLibVersion() << ")" << std::endl;
std::cout << "SQliteC++ version " << SQLITECPP_VERSION << std::endl;
try
{
// Doing a regex query.
Example example;
char wannaquit = 'n';
std::string keyword;
// Deliberate unlimited loop. You implement something sensible here.
while (wannaquit != 'y') {
// Demonstrates the way to use the same query with different parameter values
std::cout << "Enter the keyword to search for : ";
std::getline(std::cin, keyword);
example.namehaskeyword(keyword);
}
}
catch (std::exception& e)
{
std::cout << "SQLite exception : " << e.what() << std::endl;
return EXIT_FAILURE; // unexpected error : exit the example program
}
return EXIT_SUCCESS;
}
Note: This assumes that the database is in the same folder as your .cpp file.
