Solr full text search for dynamically added data? - solr

I'm trying to index data without defining a schema.xml. Is there any way to apply full text search without adding a schema.xml or updating the managed schema?

The default operation mode of Solr is to use the Schemaless mode. In this mode Solr will guess what the field type is based on what pattern the data matches the first time a field is included. If it is numeric the first time, Solr will guess that it's going to be a numeric field every time.
If the field contains text it'll be indexed as a text field with processing applied as defined in the default schema.
As long as you're using the default configuration you can submit documents with just the field name and the associated text, then search against the field name as necessary.
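As a minimal sketch of that workflow, the following builds the two HTTP requests involved: a JSON update that adds a document with a previously unseen field, and a search against that field. The base URL, the collection name `mycollection` and the field name `description` are assumptions for illustration, and nothing is sent until `send_update` is called.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed base URL and collection name; adjust for your installation.
SOLR_URL = "http://localhost:8983/solr/mycollection"


def build_update_request(docs):
    """Return (url, body) for a JSON update that commits immediately."""
    url = SOLR_URL + "/update?" + urlencode({"commit": "true"})
    body = json.dumps(docs).encode("utf-8")
    return url, body


def build_search_url(field, text):
    """Return a /select URL that searches for `text` in `field`."""
    return SOLR_URL + "/select?" + urlencode({"q": f"{field}:{text}"})


def send_update(url, body):
    """POST the update to Solr (needs a running Solr instance)."""
    request = Request(url, data=body, headers={"Content-Type": "application/json"})
    with urlopen(request) as response:
        return response.read()


# "description" is a hypothetical field that no schema has defined yet.
docs = [{"id": "1", "description": "full text search without a schema"}]
update_url, update_body = build_update_request(docs)
search_url = build_search_url("description", "search")
```

With a running instance, `send_update(update_url, update_body)` indexes the document, and fetching `search_url` should then return it, provided the field was guessed as a text type.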

Related

Copy Solr Field Values via Script

I would like to copy the data from one field to another field for all documents in Solr.
A title field that is already populated needs to be copied into another field I just created. I'd like to do them all at once if possible via Putty or the Solr Admin console.
Thank you for any help.
If the data is already ingested, the existing documents have to be updated after you add the second field. Instead of re-sending all the fields, you can set only the new field in each document using Solr atomic updates. https://solr.apache.org/guide/8_6/updating-parts-of-documents.html#atomic-updates
solr.add({'id':1, 'newField': {'set': 'sample value'}})
For future insertions, if you want the second field to be auto filled, you can use Solr copy field with the source set to the first field. https://solr.apache.org/guide/8_6/copying-fields.html
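The two approaches above can be sketched as JSON payloads. The atomic update body would be posted to the collection's `/update` endpoint and the copy field command to its `/schema` endpoint (Schema API, which requires a managed schema). The field names `title` and `title_copy` are assumptions for illustration.

```python
import json


def atomic_set(doc_id, field, value):
    """Build an atomic update that sets a single field on one document."""
    return {"id": doc_id, field: {"set": value}}


def add_copy_field_command(source, dest):
    """Build a Schema API command that copies `source` into `dest` at index time."""
    return {"add-copy-field": {"source": source, "dest": dest}}


# Hypothetical field names: "title" already exists, "title_copy" is the new field.
updates = [atomic_set("1", "title_copy", "sample value")]
update_body = json.dumps(updates)
schema_body = json.dumps(add_copy_field_command("title", "title_copy"))
```

A copy field only takes effect for documents indexed after it is defined, which is why the existing documents still need the atomic update.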

How can I view actually stored transformed Solr text field values?

When Solr returns a document, the field values match those that were passed to the Solr indexer.
However, especially for TextFields, Solr typically indexes a modified value to which (depending on the definition in the schema.xml) various filters are applied, typically:
conversion to lower case
replacing of synonyms
removal of stopwords
application of stemming
One can see the result of the conversion for specific texts by using Solr Admin > Some core > Analysis. There is a tool called Luke and the LukeRequestHandler, but it seems I can only view the values passed to Solr and not the transformed variant. One can also take a look at the index data on disk, but it seems to be stored in a binary format.
However, none of these seem to enable me to see the actual value as stored.
The reason for asking is that I've created a text field based on a certain filter chain which according to Solr Admin > Analysis transforms the text correctly. However when searching for a specific word in the transformed text it won't find it.
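The comparison done on the Analysis screen can also be requested over HTTP, which makes it easier to check the index-time and query-time token streams side by side. This is a sketch that assumes the field analysis handler is reachable at `/analysis/field` on the core; the core URL, the field name `body_text` and the sample texts are hypothetical.

```python
from urllib.parse import urlencode


def build_field_analysis_url(base_url, field_name, index_text, query_text):
    """Return a URL that runs both analyzers of `field_name` on the given texts."""
    params = {
        "analysis.fieldname": field_name,   # field whose filter chain is applied
        "analysis.fieldvalue": index_text,  # text run through the index analyzer
        "analysis.query": query_text,       # text run through the query analyzer
        "analysis.showmatch": "true",       # mark index tokens that match query tokens
        "wt": "json",
    }
    return base_url + "/analysis/field?" + urlencode(params)


analysis_url = build_field_analysis_url(
    "http://localhost:8983/solr/mycore", "body_text", "Running shoes", "run"
)
```

If the query tokens in the response never match any index token, the two analyzer chains produce different terms, which would explain a search that finds nothing.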

Add new field to SOLR with default value. Populate existing documents

I have added a new field to a SOLR 3.6.1 schema.xml with a default value. Is it possible to populate / index existing documents in the SOLR repository with this default value without having to re-load all the data? I have been looking at re-indexing and re-optimizing but haven't been able to get this to work.
Any changes in schema.xml related to addition or change in fields would need re-indexing of the data.
So you have to reload your data.
If you know the documents, you can do a partial update of all those documents with just that field.
Check Solr: Add new fields with Default Value for Existing Documents
If we only need to search and display the new field, we can do the following steps.
Add the new field definition in schema.xml:
Update the search query: when searching for the default value of newField, also match documents that have no value:
-(-newField:defaultValue AND newField:[* TO *])
Use a DocTransformer to add the default value when old documents have no value in that field.
Some functions, such as sort and stats, may not work.
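The query from the steps above reads as "exclude documents that have a value other than the default", which leaves documents holding the default value plus old documents with no value at all. A small helper that builds it might look like this; the field name and value in the example are hypothetical, and the sketch does no escaping of special query characters.

```python
def default_or_missing_query(field, default_value):
    """Match documents whose `field` equals `default_value` or has no value.

    The inner clause selects documents that have some value other than the
    default; the leading minus excludes exactly those documents.
    """
    return f"-(-{field}:{default_value} AND {field}:[* TO *])"


# Hypothetical field "status" with default value "active".
query = default_or_missing_query("status", "active")
```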

Solr doesn't index document's content

I have a small problem with Solr.
I've indexed about 1400 documents from an XML file with the post.jar command. In the XML file I placed some information such as the ID, TITLE and URL of each document.
When I search for a document, it finds nothing, but if I specify an attribute, e.g. TITLE: IEEE, it finds the documents.
So I changed the default search field in schema.xml from text to title. Now it finds documents without specifying the attribute.
Why doesn't it find the content? Did I mess up the indexing by changing the XML file?
Do a q=*:*. This fetches 10 (implicit default value for rows) documents with all fields and their values. Is all your data indexed properly?
Then do a q=fieldx:val with some known field and value. Do they show up in the results? Can you do more than string matches? If not, you need to choose data types (and storage/indexing options) in schema. Example: string allows only equality and prefix matches and text allows full text search.
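To illustrate the difference between the two data types, here is a sketch of field definitions expressed as Schema API `add-field` commands. It assumes a managed schema and the `string` and `text_general` field types from Solr's default configset; with a hand-edited schema.xml the same attributes would go into `<field>` elements instead. The field names `url` and `content` are hypothetical.

```python
import json


def add_field_command(name, field_type, stored=True, indexed=True):
    """Build a Schema API command that defines one field."""
    return {
        "add-field": {
            "name": name,
            "type": field_type,
            "stored": stored,
            "indexed": indexed,
        }
    }


commands = [
    # Exact-match field: the whole value is indexed as a single term.
    add_field_command("url", "string"),
    # Full text field: the value is tokenized and filtered before indexing.
    add_field_command("content", "text_general"),
]
bodies = [json.dumps(command) for command in commands]
```

Each body would be posted to the collection's `/schema` endpoint, and documents indexed before the change need to be re-indexed for the new type to apply.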

Solr Ngram Synonyms Dismax

I have ngram-indexed 2 fields (columns in the database), and the third one is my full text field. My default text field is the full text field, and while querying I use the dismax handler and specify both the ngrammed fields and the full text field, each with a certain boost value.
My problem: if I don't use dismax and just search the full text field (i.e. the default field specified in the schema), synonyms work correctly, i.e. ca returns all results that contain california. If I use dismax, ca is also searched in the ngrammed fields, which returns partial matches of the word ca, and the synonym part is not applied at all.
I want to use synonyms in every case, so how should I go about it?
Make sure the SynonymFilterFactory filter is configured correctly in the query analyzer of your ngram fields.
If it still doesn't work, the analysis interface in the Solr admin shows the details of the tokenizing and filtering steps, so you can check whether the synonym step works as expected.
