How to export solr result into a text file? - solr

I needs to export doc_id, all fields, socr, rank of one search result to evaluate the results. How can I do this in solr?

Solr provides you with a CSV Response writer, which will help you to export the results of solr in an csv file.
http://localhost:8983/solr/select?q=ipod&fl=id,cat,name,popularity,price,score&wt=csv
All the fields queried would be returned by Solr in proper format.

This has nothing to do with SOLR. When you make a SOLR query over http, then SOLR does the search and returns the results to you in your desired format. The default is XML but lots of people specify wt=json to get results in json format. If you want this result in a text file, then make your search client put it there.
In the browser, File -> Save As.
But most people who want this use curl as the client and use the -o option like this:
curl -o result1.xml 'http://solr.local:8080/solr/stuff/select?indent=on&version=2.2&q=fish&fq=&start=0&rows=10&fl=*%2Cscore&qt=&wt=&explainOther=&hl.fl='
Note the single quotes around the URL due to the use of & characters.

There is not a built in export function in Solr. The easiest way would be to query your Solr instance and evaluate the XML result. Check out Querying Data in the Solr Tutorial for details on how to query a result from Solr. In order to convert the result into a text file, I would recommend using one of the Solr Clients found on the Integrating Solr page in the Solr Wiki and then choose your programming language of choice to create the text file.

Related

SOLR Solarium can we use filter-queries with dismax-queries?

i just built a search form backed by solr, we are using the solarium library to construct our requests.
we built a "huge" collection of filterqueries like that one:
$query = $client->createQuery($client::QUERY_SELECT);
$query->setStart(0)->setRows(1000);
$query->addFilterQuery($query->createFilterQuery("foo")->setQuery("bar:true"));
$query->addFilterQuery($query->createFilterQuery("fo")->setQuery("ba:false"));
....
but we realized that the search just hits all the single fields we specify in the filterqueries, but we have to actually query multiple fields. while reading the docs i realized we could have been wrong, right? the correct approach would be to use disMax queries (in combination with facets?)? im wondering, can we use DisMax in combination with filterqueries to "expand" our search to multiple fields (with boosts) ? or do we have to actually rework everything?
im kinda missing the big picture to decide what the best/working solution would be
help is much appreciated
edit:
solr:
solr-spec 7.6.0
solarium:
solarium/solarium 6.0.1 PHP Solr client
You can give a query parser when giving the fq argument:
fq={!dismax qf="firstfield secondfield^5"}this is my query
The syntax is known as Local Parameters. Since dismax (or edismax which you should normally use now) doesn't have a identifier in front of it, it is implicitly parsed as the type.
If a local parameter value appears without a name, it is given the implicit name of "type". This allows short-form representation for the type of query parser to use when parsing a query string.
You'll have to make sure that Solarium doesn't escape the value you give to setQuery, but seeing as you're already giving a field:value combination, it doesn't seem to get escaped. Double check the Solr log to see exactly what query is being sent to Solr (or ask Solarium to give you the exact query string being sent if possible).

Synonyms in Solr Query Results

Whenever I query a string in solr, i want to get the synonyms of field value if there exists any as a part of query result, is it possible to do that in solr
There is no direct way to fetch the synonyms used in the search results. You can get close by looking at how Solr parsed your query via the debugQuery=true parameter and looking at the parsedQuery value in the response but it would not be straightforward. For example, if you search for "tv" on a text field that uses synonyms you will get something like this:
$ curl "localhost:8983/solr/your-core/select?q=tv&debugQuery=true"
{
...
"parsedquery":"SynonymQuery(Synonym(_text_:television _text_:televisions _text_:tv _text_:tvs))",
...
Another approach would be to load in your application the synonyms.txt file that Solr uses and do the mapping yourself. Again, not straightforward,

Spring Data Solr: Queries with "AND", "NOT" and "OR" not escaped or handled

We are using spring-data-solr, mainly using exact match/equals filter queries.
We have found that the values NOT, OR, and AND can be supplied, which are passed directly onto solr (without any pre-processing). This causes solr to error. For example, building a Criteria object like
Criteria.where("fuelType").is("AND")
Results in the following solr query
fq=fuelType:AND
We have found that if we call Solr directly with
fq=fuelType:"AND"
This would be fine, however, I can see that quotes are only added when there is whitespace in the value.
Is there something I am missing?
I still want to use the Standard Solr query parser if possible
The pull request for this has been merged,
https://jira.spring.io/browse/DATASOLR-437
https://github.com/spring-projects/spring-data-solr/pull/74

Solr Exact Query

I am using SolrNet to try and perform an exact query search
I have a document with the URL stored in Solr as : file://C:/Users/me/docs/X%20Item3
I want to match all documents that contain "X Item", so will be looking for a "X Item"
I have
new SolrQueryByField("url", "*\"X Item\"*");
But this does not return the document. I also do not want to have to convert space characters to %20 but I may have to if Solr will not do it for when it parses the query.
Help appreciated
Solr does not support wildcard searches by default at the beginning of a term. You can work around this by adding a ReversedWildcardFilter to the field indexing definition.
Depending on the kind of searches performed, you could also split on / to index each part of the path separately, or just the file name.
You shouldn't have to convert spaces to %20, as that should be performed by the client library (I'm not familiar with how SolrNet does it, but it really should abstract that away from you) when making an HTTP request.

formatting of files before indexing into solr server

I'm using the Solr server to provide search capability for a tool. I wanted to know if there is a facility provided by solr that will allow me to format some files before they are indexed ? more specifically i have a plain text file with a lot of data ! i want to convert them to an xml format before i index the xml file . eg
some data! some more data : more values
i want to convert this sample line to something like
<field 1>sample data </field 1>
<field 2> some more data </field 2>
<field 3> more values </field 3>
does solr provide a facility for this type of transformation before iindexing a file using solr cell. does it provie any classes or interfaces that i can implement in my java application ??
thanks in advance!
Are you pushing data into Solr or can you pull it from the source by Solr?
If you are pushing into Solr, then you have to use Update Request Processor. However, I am not aware of any that will split data into multiple fields. You may need to write one yourself.
If you are pulling from the source using DataImportHandler, it has a built-in support for splitting content into multiple fields using RegexTransformer.
Both Request Processor and DIH support JavaScript (and possibly other Java script languages) transformers, so you can also write your own script to split the data in whatever way you want.
Some of this is starting with version 4 of Solr though. That's a requirement to keep in mind.
You'll need a custom Index Handler or a SolrRequestHandler

Resources