I need to test whether certain documents match a query before actually indexing them. How would you do this? One possibility I'm thinking of is running a plain Lucene index in memory (RAM disk?) and following an index -> test query -> delete loop for every new document before sending it to the actual Solr server.
Can anyone think of a better solution to this problem?
Thanks a lot.
Update:
Looks like this could be a good starting point: http://www.lucenetutorial.com/lucene-in-5-minutes.html
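For the in-memory idea specifically, Lucene ships a class built for exactly this use case: MemoryIndex (in the lucene-memory module) holds a single transient document and scores a query against it, with no index/delete loop needed. A minimal sketch, assuming Lucene is on the classpath; the field name and query text are illustrative:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.memory.MemoryIndex;
import org.apache.lucene.queryparser.classic.QueryParser;

public class MatchBeforeIndexing {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();

        // load the candidate document's fields into a throwaway in-memory index
        MemoryIndex index = new MemoryIndex();
        index.addField("text", "some document body to test", analyzer);

        // score the pre-check query against that single document
        QueryParser parser = new QueryParser("text", analyzer);
        float score = index.search(parser.parse("body AND test"));
        if (score > 0.0f) {
            System.out.println("document matches, safe to send to Solr");
        }
    }
}
```

This avoids touching the real index at all; each candidate document gets its own short-lived MemoryIndex.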
Since Solr supports transactions/commits, you can actually index the documents and, before you commit, issue a delete query that removes all non-matching documents.
/**
 * @author Omnaest
 */
public class SolrSimpleIndexingTest
{
  protected SolrServer solrServer = newSolrServerInstance();

  @Test
  public void testSolr() throws IOException, SolrServerException
  {
    {
      SolrInputDocument solrInputDocument = new SolrInputDocument();
      solrInputDocument.addField( "id", "0" );
      solrInputDocument.addField( "text", "test1" );
      this.solrServer.add( solrInputDocument );
    }
    {
      SolrInputDocument solrInputDocument = new SolrInputDocument();
      solrInputDocument.addField( "id", "1" );
      solrInputDocument.addField( "text", "test2" );
      this.solrServer.add( solrInputDocument );
    }

    this.solrServer.deleteByQuery( "text:([* TO *] -test2)" );
    this.solrServer.commit();

    /*
     * Now the index only contains the document with id=1
     */
    QueryResponse queryResponse = this.solrServer.query( new SolrQuery().setQuery( "*:*" ) );
    SolrDocumentList solrDocumentList = queryResponse.getResults();
    assertEquals( 1, solrDocumentList.size() );
    assertEquals( "1", solrDocumentList.get( 0 ).getFieldValue( "id" ) );
  }

  /**
   * @return a new {@link CommonsHttpSolrServer} pointing at the local Solr instance
   */
  private static CommonsHttpSolrServer newSolrServerInstance()
  {
    try
    {
      return new CommonsHttpSolrServer( "http://localhost:8983/solr" );
    }
    catch ( MalformedURLException e )
    {
      e.printStackTrace();
      fail();
    }
    return null;
  }
}
I am using this code to query Solr, and I can see facets being returned from Solr, but for some reason they are not coming through to my code.
public class HomeController : Controller
{
    private readonly ISolrReadOnlyOperations<Product> _solr;

    public HomeController(ISolrReadOnlyOperations<Product> solr)
    {
        _solr = solr;
    }

    public ActionResult Index()
    {
        var queryOptions = new QueryOptions()
        {
            Rows = 5,
            Facet = new FacetParameters
            {
                Queries = new[] { new SolrFacetFieldQuery("brand") }
            }
        };
        SolrQueryByField query = new SolrQueryByField("category", "phones-tablets/mobile-phones");
        SolrQueryResults<Product> results = _solr.Query(query, queryOptions);
        return View();
    }
}
The above code ends up generating this url http://localhost:8983/solr/new_core/select?q=category%3a(phones%5c-tablets%5c%2fmobile%5c-phones)&rows=5&facet=true&facet.field=brand&version=2.2&wt=xml
When I paste the URL into a browser, I can see the facets section as expected. But results.FacetQueries.Count is zero. Am I missing something?
FacetQueries is used for returning the results of explicit facet queries, but you're performing regular field faceting. That result can be accessed through results.FacetFields. From the documentation:
var r = solr.Query(...);
foreach (var facet in r.FacetFields["category"]) {
    Console.WriteLine("{0}: {1}", facet.Key, facet.Value);
}
I am trying to build a query parser that ranks the passages containing the query terms.
I understand that I need to use SpanNearQuery, but I can't find a way to access Spans even after going through the documentation. The method I tried returns null.
I have read https://lucidworks.com/blog/2009/07/18/the-spanquery/ which explains the query well, including how to access spans, but it is written for Solr 4.0, and unfortunately Solr 6.3 no longer has AtomicReader.
How can I get the actual spans?
public void process(ResponseBuilder rb) throws IOException {
    SolrParams params = rb.req.getParams();
    log.warn("in Process");
    if (!params.getBool(COMPONENT_NAME, false)) {
        return;
    }

    Query origQuery = rb.getQuery();
    // TODO: longer term, we don't have to be a span query, we could re-analyze the document
    if (origQuery != null) {
        if (!(origQuery instanceof SpanNearQuery)) {
            throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
                "Illegal query type. The incoming query must be a Lucene SpanNearQuery and it was a " + origQuery.getClass().getName());
        }

        SpanNearQuery sQuery = (SpanNearQuery) origQuery;
        SolrIndexSearcher searcher = rb.req.getSearcher();
        IndexReader reader = searcher.getIndexReader();
        log.warn("before leaf reader context");
        List<LeafReaderContext> ctxs = reader.leaves();
        log.warn("after leaf reader context");
        LeafReaderContext ctx = ctxs.get(0);
        SpanWeight spanWeight = sQuery.createWeight(searcher, true);
        Spans spans = spanWeight.getSpans(ctx, SpanWeight.Postings.POSITIONS);

        // The old Solr 4.0 approach no longer compiles in 6.3, since
        // AtomicReader/SlowCompositeReaderWrapper were removed:
        // AtomicReader wrapper = SlowCompositeReaderWrapper.wrap(reader);
        // Map<Term, TermContext> termContexts = new HashMap<Term, TermContext>();
        // Spans spans = sQuery.getSpans(wrapper.getContext(), new Bits.MatchAllBits(reader.numDocs()), termContexts);

        // Build up the query term weight map and the bi-gram
        Map<String, Float> termWeights = new HashMap<String, Float>();
        Map<String, Float> bigramWeights = new HashMap<String, Float>();
        createWeights(params.get(CommonParams.Q), sQuery, termWeights, bigramWeights, reader);
        float adjWeight = params.getFloat(ADJACENT_WEIGHT, DEFAULT_ADJACENT_WEIGHT);
        float secondAdjWeight = params.getFloat(SECOND_ADJ_WEIGHT, DEFAULT_SECOND_ADJACENT_WEIGHT);
        float bigramWeight = params.getFloat(BIGRAM_WEIGHT, DEFAULT_BIGRAM_WEIGHT);

        // get the passages
        int primaryWindowSize = params.getInt(OWLParams.PRIMARY_WINDOW_SIZE, DEFAULT_PRIMARY_WINDOW_SIZE);
        int adjacentWindowSize = params.getInt(OWLParams.ADJACENT_WINDOW_SIZE, DEFAULT_ADJACENT_WINDOW_SIZE);
        int secondaryWindowSize = params.getInt(OWLParams.SECONDARY_WINDOW_SIZE, DEFAULT_SECONDARY_WINDOW_SIZE);
        WindowBuildingTVM tvm = new WindowBuildingTVM(primaryWindowSize, adjacentWindowSize, secondaryWindowSize);
        PassagePriorityQueue rankedPassages = new PassagePriorityQueue();

        // intersect w/ doclist
        DocList docList = rb.getResults().docList;
        log.warn("Before Spans");
        while (spans.nextDoc() != Spans.NO_MORE_DOCS) {
            // build up the window
            log.warn("Iterating through spans");
            if (docList.exists(spans.docID())) {
                tvm.spanStart = spans.startPosition();
                tvm.spanEnd = spans.endPosition();
                Terms terms = reader.getTermVector(spans.docID(), sQuery.getField());
                tvm.map(terms, spans);

                // The entries map contains the window, do some ranking of it
                if (!tvm.passage.terms.isEmpty()) {
                    log.debug("Candidate: Doc: {} Start: {} End: {} ",
                        new Object[] { spans.docID(), spans.startPosition(), spans.endPosition() });
                }
                tvm.passage.lDocId = spans.docID();
                tvm.passage.field = sQuery.getField();

                // score this window
                try {
                    addPassage(tvm.passage, rankedPassages, termWeights, bigramWeights, adjWeight, secondAdjWeight, bigramWeight);
                } catch (CloneNotSupportedException e) {
                    throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Internal error cloning Passage", e);
                }

                // clear out the entries for the next round
                tvm.passage.clear();
            }
        }
    }
}
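For what it's worth, in Lucene 6.x the per-leaf SpanWeight route is the supported way to get spans, but getSpans must be called once per leaf (the code above only looks at ctxs.get(0)), and positions are consumed with nextStartPosition(). A sketch of the loop shape, assuming the same sQuery and searcher as above:

```java
SpanWeight weight = sQuery.createWeight(searcher, false);
for (LeafReaderContext leaf : searcher.getIndexReader().leaves()) {
    Spans spans = weight.getSpans(leaf, SpanWeight.Postings.POSITIONS);
    if (spans == null) {
        continue; // the query matches nothing in this segment
    }
    while (spans.nextDoc() != Spans.NO_MORE_DOCS) {
        int globalDoc = leaf.docBase + spans.docID(); // convert segment-local to global doc id
        while (spans.nextStartPosition() != Spans.NO_MORE_POSITIONS) {
            System.out.println("doc " + globalDoc
                + " span [" + spans.startPosition() + ", " + spans.endPosition() + ")");
        }
    }
}
```

Note that getSpans can return null for a segment with no matches, and that docID() is segment-local until docBase is added.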
How can I access an SQLite database on the web server in Codename One? I can only use the Database API to access a database on the device; accessing one on the web server seems to be quite a different thing. I'd appreciate a code snippet on this. Thanks.
Use the code below; it's not tested, and you may have to adjust it to suit your needs. Leave a comment if there's an issue:
ConnectionRequest req = new ConnectionRequest() {
    @Override
    protected void handleException(Exception ex) {
        // handle error
    }
};
req.setUrl(YourURL);
req.setPost(true);
req.setHttpMethod("POST"); // change to GET if necessary
req.setDuplicateSupported(true);
req.addArgument("argumentToSendThroughPostOrGet1", "value1");
req.addArgument("argumentToSendThroughPostOrGet2", "value2");
NetworkManager.getInstance().addToQueueAndWait(req);

if (req.getResponseCode() == 200) {
    Map<String, Object> out = new HashMap<>();
    Display.getInstance().invokeAndBlock(() -> {
        JSONParser p = new JSONParser();
        try (InputStreamReader r = new InputStreamReader(new ByteArrayInputStream(req.getResponseData()))) {
            out.putAll(p.parseJSON(r));
        } catch (IOException ex) {
            // handle error
        }
    });

    if (!out.isEmpty()) {
        List<Map<String, Object>> responses = (List<Map<String, Object>>) out.get("response");
        for (Object response : responses) {
            Map res = (Map) response;
            System.out.println(res.get("key"));
        }
    } else {
        // handle error
    }
} else {
    // handle error
}
TEST JSON RESPONSE:
{
    "response": [
        {
            "key": "I was returned"
        }
    ]
}
EDIT:
To pass data from TextField:
req.addArgument("argumentToSendThroughPostOrGet1", myTextField.getText());
Based on your comment, you can read those arguments in PHP as simply as below:
$var1 = $_POST["argumentToSendThroughPostOrGet1"];
// $var1 = $_GET["argumentToSendThroughPostOrGet1"]; // if the GET method is used in Codename One
// Or use $_REQUEST, which supports both methods but is not advisable for production
...
And you can use those variables in your PHP code normally.
Example of usage with a MySQL query:
class Connection {
    private $mysqli;

    function connect() {
        $this->mysqli = mysqli_init();
        $this->mysqli->real_connect("localhost", "username", "password", "databaseName") or die('Could not connect to database!');
        $this->mysqli->query("SET NAMES 'UTF8'");
        return $this->mysqli;
    }

    function close() {
        mysqli_close($this->mysqli);
    }
}

$connection = new Connection();
$mysqli = $connection->connect();
// Note: escape $var1 (e.g. with $mysqli->real_escape_string()) before embedding it in the query
$mysqli->query("SELECT * FROM MyTable WHERE ColumnName LIKE '%$var1%' ORDER BY PrimaryKeyId ASC LIMIT 100");
Please note: I know about the question SolrNet - The given key was not present in the dictionary, and I have initialized the solr object just as Mauricio suggests.
I am using Solr 4.6.0 and SolrNet build #173, with .NET Framework 4.0 and VS2012 for development. For some unknown reason I am receiving the error 'The given key was not present in the dictionary'. I have a document with that id in Solr; I've checked via the browser, and it's a document like any other. Why does the error pop up? My code (I've marked the place where the error happens with a comment):
// establishes connection with solr
private void ConnectToSolr()
{
    try
    {
        if (_solr != null) return;
        Startup.Init<Register>(SolrAddress);
        _solr = ServiceLocator.Current.GetInstance<ISolrOperations<Register>>();
    }
    catch (Exception ex)
    {
        throw new Exception(ex.Message);
    }
}

// Returns snippets from solr as BindingSource
public BindingSource GetSnippets(string searchTerm, DateTime? startDate = null, DateTime? endDate = null)
{
    ConnectToSolr();
    string dateQuery = startDate == null
        ? ""
        : endDate == null
            ? "savedate:\"" + convertDateToSolrFormat(startDate) + "\"" // only start date
            : "savedate:[" + convertDateToSolrFormat(startDate) + " TO " +
              convertDateToSolrFormat(endDate) + "]"; // range between start and end date
    string textQuery = string.IsNullOrEmpty(searchTerm) ? "text:*" : "text:*" + searchTerm + "*";
    List<Register> list = new List<Register>();
    SolrQueryResults<Register> results;
    string currentId = "";
    try
    {
        results = _solr.Query(textQuery,
            new QueryOptions
            {
                Highlight = new HighlightingParameters
                {
                    Fields = new[] { "*" },
                },
                ExtraParams = new Dictionary<string, string>
                {
                    {"fq", dateQuery},
                    {"sort", "savedate desc"}
                }
            });
        for (int i = 0; i < results.Highlights.Count; i++)
        {
            currentId = results[i].Id;
            var h = results.Highlights[currentId];
            if (h.Snippets.Count > 0)
            {
                list.Add(new Register // here the error "the given key was not present in the dictionary" pops up
                {
                    Id = currentId,
                    ContentLiteral = h.Snippets["content"].ToArray()[0].Trim(new[] { ' ', '\n' }),
                    SaveDateLiteral = results[i].SaveDate.ToShortDateString()
                });
            }
        }
        BindingList<Register> bindingList = new BindingList<Register>(list);
        BindingSource bindingSource = new BindingSource();
        bindingSource.DataSource = bindingList;
        return bindingSource;
    }
    catch (Exception e)
    {
        MessageBox.Show(string.Format("{0}\nId:{1}", e.Message, currentId), "Solr error");
        return null;
    }
}
I've found out what's causing the problem: saving empty documents into Solr. If I make an empty query (with text:*) through SolrNet (usually I do this when I want to see all saved documents) and an empty document is among the saved docs, then 'The given key was not present in the dictionary' pops up. If all of the documents have text in them, the error doesn't occur.
If your document contains fields with types other than string, and you index a null value into a double or integer field, you will get the same error.
A Solr query returns the null field as:
<null name="fieldname"/>
when it should be
<double name="fieldname">0.0</double>
or
<double name="fieldname"/>
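One way to avoid indexing nulls into numeric fields is to declare a default value on the field in schema.xml, so documents that omit the field still get a valid value. A sketch, with illustrative field and type names:

```xml
<!-- documents that omit "fieldname" are indexed with 0.0 instead of null -->
<field name="fieldname" type="double" indexed="true" stored="true" default="0.0"/>
```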
I'd like to manipulate the result of a Solr server search. I don't think it's possible with a server-side filter, because the information is only available on the client side at runtime.
I tried the following:
private void filterKernSortiment( SolrDocumentList docsList )
{
    List<SolrDocument> filteredItems = new ArrayList<SolrDocument>();
    Iterator<SolrDocument> iter = docsList.iterator();
    while ( iter.hasNext() )
    {
        SolrDocument doc = iter.next();
        String artnr = doc.getFieldValue( "artnr" ).toString();
        String lfnt = doc.getFieldValue( "lfnt" ).toString();
        if ( !user.isForUserInStock( artnr, lfnt ) )
        {
            filteredItems.add( doc );
        }
    }
    log.debug( "filteredItems=" + filteredItems.size() );

    Iterator<SolrDocument> iterFilter = filteredItems.iterator();
    while ( iterFilter.hasNext() )
    {
        SolrDocument doc = iterFilter.next();
        docsList.remove( doc );
    }
}
The SolrDocumentList is filtered correctly, but the getFacetField function still returns the facets of the unfiltered SolrDocumentList.
Do I have to manipulate the FacetField lists too, or do you know a better solution for the problem?