Lucene use taxonomy and DocValues facets together - solr

There are many examples of facets based on the taxonomy index and of facets based on DocValues, but I need to use both together: a hierarchy of categories (taxonomy) and range queries (NumericDocValuesField).
For example, with DrillSideways:
DrillSideways ds = new DrillSideways (searcher, config, taxoReader);
DrillSideways.DrillSidewaysResult result = ds.search (q, topScoreDocCollector);
The second parameter of ds.search() is a TopScoreDocCollector.
The FacetsCollector is created inside ds.search(), and it is not possible to pass my own collector in. Passing MultiCollector.wrap(FacetsCollector, TopScoreDocCollector) as the second parameter of ds.search() does not seem correct(?). However, I need a FacetsCollector to build the facets that are not available in the taxonomy index:
Facets facetsTime = new LongRangeFacetCounts (..., FacetsCollector, ...);
facetsTime.getTopChildren (...);
Also, the list result.facets contains a null value for the DocValues facet.
Do you have a working example of how to use taxonomy and DocValues facets together in DrillSideways?

DrillSideways assumes that you use either TaxonomyFacets or SortedSetDocValuesFacets exclusively. If this is not the case, you can subclass DrillSideways and override the buildFacetsResult method to build the final Facets however you like. You get the FacetsCollector for the DrillDownQuery and two arrays holding the sideways FacetsCollectors and their dims, one entry for every dim you have added to the DrillSideways.
Here is an example:
public class MyDrillSideways extends DrillSideways {
public MyDrillSideways(IndexSearcher searcher, FacetsConfig config, TaxonomyReader taxoReader) {
super(searcher, config, taxoReader);
}
@Override
protected Facets buildFacetsResult(FacetsCollector drillDowns, FacetsCollector[] drillSideways, String[] drillSidewaysDims) throws IOException {
String longRangeFacetDim = "mySpecialLongRangeDim";
Facets drillDownFacets = new FastTaxonomyFacetCounts(taxoReader, config, drillDowns);
boolean foundLongRangeInDrillSideways = false;
Map<String, Facets> drillSidewaysFacets = new HashMap<>();
if (drillSideways != null) {
for (int i = 0; i < drillSideways.length; i++) {
String sidewaysDim = drillSidewaysDims[i];
FacetsCollector drillSideway = drillSideways[i];
Facets sidewaysFacets;
if (sidewaysDim.equals(longRangeFacetDim)) {
foundLongRangeInDrillSideways = true;
sidewaysFacets = new LongRangeFacetCounts(...,drillSideway,...);
} else {
sidewaysFacets = new FastTaxonomyFacetCounts(taxoReader, config, drillSideway);
}
drillSidewaysFacets.put(sidewaysDim, sidewaysFacets);
}
}
// The range dim was not drilled down on, so count it over the drill-down hits.
if (!foundLongRangeInDrillSideways) {
Facets facetsTime = new LongRangeFacetCounts(..., drillDowns, ...);
drillSidewaysFacets.put(longRangeFacetDim, facetsTime);
}
return new MultiFacets(drillSidewaysFacets, drillDownFacets);
}
}
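To make the roles of the collectors concrete, here is a minimal, library-free Java sketch of the counting that drill-sideways performs. It does not use Lucene at all; the Doc class, the price field and all method names are hypothetical and exist only for illustration. The idea it shows: the counts for a dimension are computed over the documents that match every selected filter except the one on that dimension itself, which is why each sideways dim arrives with its own collector.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DrillSidewaysSketch {

    /** Toy document (hypothetical): one category label and one numeric value. */
    static final class Doc {
        final String category;
        final long price;

        Doc(String category, long price) {
            this.category = category;
            this.price = price;
        }
    }

    /** Counts documents per category, as a taxonomy facet would. */
    static Map<String, Integer> countByCategory(List<Doc> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Doc d : docs) {
            counts.merge(d.category, 1, Integer::sum);
        }
        return counts;
    }

    /** Counts documents per named inclusive range [min, max], as a range facet would. */
    static Map<String, Integer> countByRange(List<Doc> docs, Map<String, long[]> ranges) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, long[]> r : ranges.entrySet()) {
            int n = 0;
            for (Doc d : docs) {
                if (d.price >= r.getValue()[0] && d.price <= r.getValue()[1]) {
                    n++;
                }
            }
            counts.put(r.getKey(), n);
        }
        return counts;
    }

    /** Sideways counts for the category dim: apply every filter except the category one. */
    static Map<String, Integer> sidewaysCategory(List<Doc> docs, long min, long max) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (d.price >= min && d.price <= max) {
                hits.add(d);
            }
        }
        return countByCategory(hits);
    }

    /** Sideways counts for the range dim: apply every filter except the range one. */
    static Map<String, Integer> sidewaysRange(List<Doc> docs, String category,
                                              Map<String, long[]> ranges) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (d.category.equals(category)) {
                hits.add(d);
            }
        }
        return countByRange(hits, ranges);
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("books", 5), new Doc("books", 50),
                new Doc("music", 7), new Doc("music", 70));
        Map<String, long[]> ranges = new LinkedHashMap<>();
        ranges.put("cheap", new long[] {0, 10});
        ranges.put("pricey", new long[] {11, 100});

        // The user drilled down on category=books AND price in [0, 10].
        System.out.println(sidewaysCategory(docs, 0, 10));         // {books=1, music=1}
        System.out.println(sidewaysRange(docs, "books", ranges));  // {cheap=1, pricey=1}
    }
}
```

In the MyDrillSideways class above, sidewaysCategory corresponds roughly to building FastTaxonomyFacetCounts from a sideways collector, and sidewaysRange to building LongRangeFacetCounts from the sideways collector of the range dim.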

MongoDB Compass returns all rows when I run a filter

I am trying to run a filter in MongoDB Compass and it returns all rows instead of the row that I am looking for. I can run the filter on example databases that are similar to my database without any problem.
https://i.stack.imgur.com/IBivJ.png
Here is the code that I am using to add records and select from them.
public class NoSQLDataAccess
{
// Create an instance of data factory
public NoSQLDataFactory noSQLDataFactory;
public List<dynamic> DocumentDetails { get; set; }
private IMongoCollection<dynamic> collection;
private BsonDocument bsonDocument = new BsonDocument();
public NoSQLDataAccess() { }
public void TestNoSQL()
{
MongoClient client;
IMongoDatabase database;
string connectionString = ConfigurationManager.AppSettings["NoSQLConnectionString"];
client = new MongoClient(connectionString);
database = client.GetDatabase("TestDatabase");
collection = database.GetCollection<dynamic>("TestCollection");
// Insert
List<Layer> layers = new List<Layer>();
layers.Add(new Layer { LayerId = 117368, Description = "BOOTHLAYER" });
layers.Add(new Layer { LayerId = 117369, Description = "DRAWINGLAYER" });
layers.Add(new Layer { LayerId = 117370, Description = "LAYER3" });
List<Element> elements = new List<Element>();
elements.Add(new Element { ElementId = 9250122, Type = "polyline" });
elements.Add(new Element { ElementId = 9250123, Type = "polyline" });
List<dynamic> documentDetails = new List<dynamic>();
documentDetails.Add(new DrawingDTO { Layers = layers, Elements = elements });
collection.InsertMany(documentDetails);
List<FilterDetails> filterDetails = new List<FilterDetails>();
filterDetails.Add(new FilterDetails { Type = "layers.id", Value = "117368" });
foreach (FilterDetails detail in filterDetails)
{
bsonDocument.Add(new BsonElement(detail.Type, detail.Value));
}
List<dynamic> results = collection.Find(bsonDocument.ToBsonDocument()).ToList();
}
}
I have been able to get the result I need with MongoDB shell but I have not been able to replicate the results in C#.
Here is the solution in MongoDB shell:
db.TestCollection.find({"layers.id": 117368}, {_id:0, layers: {$elemMatch: {id: 117368}}}).pretty();
I have found a post similar to my question, "Retrieve only the queried element in an object array in MongoDB collection", whose solution works for its author. The C# code that I attached is how I will access the data after I get it working properly. I use MongoDB Compass to test finds/inserts/updates/deletes.

NHibernate Convert query to async query

I'm looking at async-ifying some of our existing code. Unfortunately my experience with NHibernate is lacking. Most of the NHibernate stuff has been easy, considering NHibernate 5 has a lot of support for async. I am, however, stuck.
Originally, we do something like this using our Dependency Injection:
private readonly IRepository repository;
public MovieRepository(IRepository repository)
{
this.repository = repository;
}
public Movie Get(int id)
{
return (from movie in repository.Query<Movie>()
select new Movie
{
ID = movie.ID,
Title = movie.Title,
Genre = new Genre
{
ID = movie.Genre.ID,
Name = movie.Genre.Name,
},
MaleLead = movie.MaleLead,
FemaleLead = movie.FemaleLead,
}).FirstOrDefault();
}
//Repository Query method in Repository.cs
public IQueryable<TEntity> Query<TEntity>() where TEntity : OurEntity
{
session = session.OpenSession();
return from entity in session.Query<TEntity>() select entity;
}
This works great for our current uses. We write things this way to maintain control over our queries, especially related to more complex objects, ensuring we get back exactly what we need.
I've tried a few things, like making the Query method return a Task<List<TEntity>> and using the ToListAsync() method; however, because I am then returning an already materialized list, I cannot query on it.
I'm sure I've missed something. If anyone can help me out, I would appreciate it.
You need to use FirstOrDefaultAsync in this case.
public async Task<Movie> Get(int id)
{
return await (from movie in repository.Query<Movie>()
select new Movie
{
ID = movie.ID,
Title = movie.Title,
Genre = new Genre
{
ID = movie.Genre.ID,
Name = movie.Genre.Name,
},
MaleLead = movie.MaleLead,
FemaleLead = movie.FemaleLead,
}).FirstOrDefaultAsync();
}
Add this using statement to your file
using NHibernate.Linq;
Then you can change your method to
public async Task<Movie> Get(int id)
{
return await (from movie in repository.Query<Movie>()
select new Movie
{
ID = movie.ID,
Title = movie.Title,
Genre = new Genre
{
ID = movie.Genre.ID,
Name = movie.Genre.Name,
},
MaleLead = movie.MaleLead,
FemaleLead = movie.FemaleLead,
}).FirstOrDefaultAsync();
}
NB: This is only available from NHibernate 5
Addendum:
The code you have in Repository.cs can be simplified to something like this:
//Repository Query method in Repository.cs
public IQueryable<TEntity> Query<TEntity>() where TEntity : OurEntity
{
//session = session.OpenSession(); //this is obviously wrong, but it's beside the point
var session = sessionFactory.OpenSession();
return session.Query<TEntity>(); //the fix
}

SolrNet facets are being returned from Solr but not through the SolrNet client

I am using this code to query Solr and I can see facets are being returned from Solr but for some reason they are not passed through.
public class HomeController : Controller
{
private readonly ISolrReadOnlyOperations<Product> _solr;
public HomeController(ISolrReadOnlyOperations<Product> solr)
{
_solr = solr;
}
public ActionResult Index()
{
var queryOptions = new QueryOptions()
{
Rows = 5,
Facet = new FacetParameters
{
Queries = new[] { new SolrFacetFieldQuery("brand") }
}
};
SolrQueryByField query = new SolrQueryByField("category", "phones-tablets/mobile-phones");
SolrQueryResults<Product> results = _solr.Query(query, queryOptions);
return View();
}
}
The above code ends up generating this url http://localhost:8983/solr/new_core/select?q=category%3a(phones%5c-tablets%5c%2fmobile%5c-phones)&rows=5&facet=true&facet.field=brand&version=2.2&wt=xml
When I paste the URL I can see the facets section as expected. But results.FacetQueries.Count is zero. Am I missing something?
FacetQueries is used for returning the result of explicit facet queries. You're performing regular faceting. That result can be accessed through results.FacetFields. From the documentation:
var r = solr.Query(...);
foreach (var facet in r.FacetFields["category"]) {
Console.WriteLine("{0}: {1}", facet.Key, facet.Value);
}
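The difference between the two kinds of faceting is visible in the raw request parameters. The following is only a small Java sketch that builds the two parameter strings by hand; the class and method names are hypothetical, and brand:acme is a made-up query value. Solr reports facet.field results in the facet_fields section of the response and facet.query results in the facet_queries section, which is what FacetFields and FacetQueries expose respectively.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class FacetParamsSketch {

    /** Joins parameters into a URL query string, percent-encoding keys and values. */
    static String toQueryString(Map<String, String> params) {
        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            joined.add(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }

    /** Field faceting: one count per distinct value of the field (read via FacetFields). */
    static String fieldFacetParams(String field) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("facet", "true");
        p.put("facet.field", field);
        return toQueryString(p);
    }

    /** Query faceting: a single count for an arbitrary query (read via FacetQueries). */
    static String queryFacetParams(String query) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("facet", "true");
        p.put("facet.query", query);
        return toQueryString(p);
    }

    public static void main(String[] args) {
        System.out.println(fieldFacetParams("brand"));       // facet=true&facet.field=brand
        System.out.println(queryFacetParams("brand:acme"));  // facet=true&facet.query=brand%3Aacme
    }
}
```

The URL generated in the question contains facet.field=brand, so it falls into the first case.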

How to access Spans with a SpanNearQuery in solr 6.3

I am trying to build a query parser by ranking the passages containing the terms.
I understand that I need to use SpanNearQuery, but I can't find a way to access Spans even after going through the documentation. The method I got returns null.
I have read https://lucidworks.com/blog/2009/07/18/the-spanquery/, which explains the query well and shows how to access spans, but it was written for Solr 4.0, and unfortunately Solr 6.3 no longer has AtomicReader.
How can I get the actual spans?
public void process(ResponseBuilder rb) throws IOException {
SolrParams params = rb.req.getParams();
log.warn("in Process");
if (!params.getBool(COMPONENT_NAME, false)) {
return;
}
Query origQuery = rb.getQuery();
// TODO: longer term, we don't have to be a span query, we could re-analyze the document
if (origQuery != null) {
if (origQuery instanceof SpanNearQuery == false) {
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
"Illegal query type. The incoming query must be a Lucene SpanNearQuery and it was a " + origQuery.getClass().getName());
}
SpanNearQuery sQuery = (SpanNearQuery) origQuery;
SolrIndexSearcher searcher = rb.req.getSearcher();
IndexReader reader = searcher.getIndexReader();
log.warn("before leaf reader context");
List<LeafReaderContext> ctxs = (List<LeafReaderContext>) reader.leaves();
log.warn("after leaf reader context");
LeafReaderContext ctx = ctxs.get(0);
SpanWeight spanWeight = sQuery.createWeight(searcher, true);
Spans spans = spanWeight.getSpans(ctx, SpanWeight.Postings.POSITIONS);
AtomicReader wrapper = SlowCompositeReaderWrapper.wrap(reader);
Map<Term, TermContext> termContexts = new HashMap<Term, TermContext>();
Spans spans = fleeceQ.getSpans(wrapper.getContext(), new Bits.MatchAllBits(reader.numDocs()), termContexts);
// SpanWeight.Postings[] postings= SpanWeight.Postings.values();
// Spans spans = sQuery.getSpans();
// Assumes the query is a SpanQuery
// Build up the query term weight map and the bi-gram
Map<String, Float> termWeights = new HashMap<String, Float>();
Map<String, Float> bigramWeights = new HashMap<String, Float>();
createWeights(params.get(CommonParams.Q), sQuery, termWeights, bigramWeights, reader);
float adjWeight = params.getFloat(ADJACENT_WEIGHT, DEFAULT_ADJACENT_WEIGHT);
float secondAdjWeight = params.getFloat(SECOND_ADJ_WEIGHT, DEFAULT_SECOND_ADJACENT_WEIGHT);
float bigramWeight = params.getFloat(BIGRAM_WEIGHT, DEFAULT_BIGRAM_WEIGHT);
// get the passages
int primaryWindowSize = params.getInt(OWLParams.PRIMARY_WINDOW_SIZE, DEFAULT_PRIMARY_WINDOW_SIZE);
int adjacentWindowSize = params.getInt(OWLParams.ADJACENT_WINDOW_SIZE, DEFAULT_ADJACENT_WINDOW_SIZE);
int secondaryWindowSize = params.getInt(OWLParams.SECONDARY_WINDOW_SIZE, DEFAULT_SECONDARY_WINDOW_SIZE);
WindowBuildingTVM tvm = new WindowBuildingTVM(primaryWindowSize, adjacentWindowSize, secondaryWindowSize);
PassagePriorityQueue rankedPassages = new PassagePriorityQueue();
// intersect w/ doclist
DocList docList = rb.getResults().docList;
log.warn("Before Spans");
while (spans.nextDoc() != Spans.NO_MORE_DOCS) {
// build up the window
log.warn("Iterating through spans");
if (docList.exists(spans.docID())) {
tvm.spanStart = spans.startPosition();
tvm.spanEnd = spans.endPosition();
// tvm.terms
Terms terms = reader.getTermVector(spans.docID(), sQuery.getField());
tvm.map(terms, spans);
// The entries map contains the window, do some ranking of it
if (tvm.passage.terms.isEmpty() == false) {
log.debug("Candidate: Doc: {} Start: {} End: {} ", new Object[] { spans.docID(), spans.startPosition(), spans.endPosition() });
}
tvm.passage.lDocId = spans.docID();
tvm.passage.field = sQuery.getField();
// score this window
try {
addPassage(tvm.passage, rankedPassages, termWeights, bigramWeights, adjWeight, secondAdjWeight, bigramWeight);
} catch (CloneNotSupportedException e) {
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Internal error cloning Passage", e);
}
// clear out the entries for the next round
tvm.passage.clear();
}
}
}
}

The given key was not present in the dictionary solrnet

Please note: I know about the question "SolrNet - The given key was not present in the dictionary", and I have initialized the solr object just as Mauricio suggests there.
I am using Solr 4.6.0 and SolrNet build #173, .NET Framework 4.0 and VS2012 for development. For some unknown reason I am receiving the error 'The given key was not present in the dictionary'. I have a document with that id in Solr; I've checked via the browser. It's a document like any other document. Why is the error popping up? My code (I've added a comment at the place where the error happens):
//establishes connection with solr
private void ConnectToSolr()
{
try
{
if (_solr != null) return;
Startup.Init<Register>(SolrAddress);
_solr = ServiceLocator.Current.GetInstance<ISolrOperations<Register>>();
}
catch (Exception ex)
{
throw new Exception(ex.Message);
}
}
//Returns snippets from solr as BindingSource
public BindingSource GetSnippets(string searchTerm, DateTime? startDate = null, DateTime? endDate = null)
{
ConnectToSolr();
string dateQuery = startDate == null
? ""
: endDate == null
? "savedate:\"" + convertDateToSolrFormat(startDate) + "\"" //only start date
: "savedate:[" + convertDateToSolrFormat(startDate) + " TO " +
convertDateToSolrFormat(endDate) + "]";//range between start and end date
string textQuery = string.IsNullOrEmpty(searchTerm) ? "text:*" : "text:*" + searchTerm + "*";
List<Register> list = new List<Register>();
SolrQueryResults<Register> results;
string currentId = "";
try
{
results = _solr.Query(textQuery,
new QueryOptions
{
Highlight = new HighlightingParameters
{
Fields = new[] { "*" },
},
ExtraParams = new Dictionary<string, string>
{
{"fq", dateQuery},
{"sort", "savedate desc"}
}
});
for (int i = 0; i < results.Highlights.Count; i++)
{
currentId = results[i].Id;
var h = results.Highlights[currentId];
if (h.Snippets.Count > 0)
{
list.Add(new Register // here the error "The given key was not present in the dictionary" pops up
{
Id = currentId,
ContentLiteral = h.Snippets["content"].ToArray()[0].Trim(new[]{' ', '\n'}),
SaveDateLiteral = results[i].SaveDate.ToShortDateString()
});
}
}
BindingList<Register> bindingList = new BindingList<Register>(list);
BindingSource bindingSource = new BindingSource();
bindingSource.DataSource = bindingList;
return bindingSource;
}
catch(Exception e)
{
MessageBox.Show(string.Format("{0}\nId:{1}", e.Message, currentId), "Solr error");
return null;
}
}
I've found out what's causing the problem: saving empty documents into Solr. If I make an empty query (with text:*) through SolrNet (I usually do this when I want to see all saved documents) and an empty document is one of the saved docs, then the error 'The given key was not present in the dictionary' pops up. If all of the documents have text in them, this error doesn't pop up.
If your document contains fields with types other than string and you index a null value into a double or integer field, you will get the same error.
The Solr query returns the null field as:
<null name="fieldname"/>
when it should be
<double name="fieldname">0.0</double>
or
<double name="fieldname"/>
