Solr Schemaless Mode creating fields as indexed - solr

When I upload an XML file to Solr in schemaless mode, every field is created with type text_general only.
I'm trying to add indexed="true" to all of these text_general fields in schemaless mode.
There are 500+ fields in the schema.xml file, so it is difficult to change them manually:
<field name="parap018f49616aad47a2a754d1fd87fdb4de" type="text_general"/>
<field name="parap06bad96ebf194d8d965db37a5124357a" type="text_general"/>
This is what I'm trying to change them to (desired output):
<field name="parap018f49616aad47a2a754d1fd87fdb4de" type="text_general" indexed="true" stored="false" multiValued="true"/>
<field name="parap06bad96ebf194d8d965db37a5124357a" type="text_general" indexed="true" stored="false" multiValued="true"/>

Copy your 500+ fields into a text editor, use find-and-replace-all to add the new attributes after type="text_general", then put the result back into your schema and upload it.
This is more of a copy-paste question than a Solr question.
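If a text editor is not handy, the same find-and-replace can be scripted. The following is a minimal sketch, assuming the generated fields look exactly like the ones in the question (only name and type attributes, written as self-closing tags); the function name add_field_attrs and the constant names are made up for this illustration.

```python
import re

# Attributes to append; the values come from the desired output in the question.
EXTRA_ATTRS = 'indexed="true" stored="false" multiValued="true"'

# Matches a self-closing <field .../> element of type text_general that does
# not already carry an indexed attribute (so the script is safe to re-run).
FIELD_RE = re.compile(
    r'<field\s+(?![^>]*\bindexed=)([^>]*\btype="text_general"[^>]*?)\s*/>'
)


def add_field_attrs(schema_text: str) -> str:
    """Return schema_text with EXTRA_ATTRS appended to each matching field."""
    return FIELD_RE.sub(
        lambda m: f"<field {m.group(1)} {EXTRA_ATTRS}/>", schema_text
    )


sample = '<field name="parap018f49616aad47a2a754d1fd87fdb4de" type="text_general"/>'
print(add_field_attrs(sample))
```

To apply it to a real schema, read the file contents into a string, pass them through add_field_attrs, and write the result back after making a backup. Fields that already have other attributes after type would need a more careful pattern.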

Related

SOLR 4.10.2 - Add field with default value

I have a SOLR Core 4.10.2
I want to add a field VehicleType with an index IDX_VehicleType and a default value of "car" for all rows in the core, so that the field is populated with "car".
I have been able to add the field and index successfully by using the following in schema.xml and reloading the core from Core Admin:
<field name="IDX_VehicleType" type="text_Exact" indexed="true" stored="false" multiValued="true"/>
<field name="VehicleType" type="string" indexed="true" stored="true"/>
<copyField source="VehicleType" dest="IDX_VehicleType"/>
When I try to give the field a default using the default attribute - and reload, and optimize the core from Core Admin, the field is not populated.
<field name="IDX_VehicleType" type="text_Exact" indexed="true" stored="false" multiValued="true" default="car"/>
<field name="VehicleType" type="string" indexed="true" stored="true" default="car"/>
<copyField source="VehicleType" dest="IDX_VehicleType"/>
I see the field in the schema browser, but it's not showing up in the results.
This change falls into the category of changes that require reindexing in Solr before the value shows up in the field.
To repeat: to fill in the default value for documents already in the index, you need to reindex all of those documents.
For any future adds/updates to the index, Solr sets the default field value automatically according to the new schema.
More on this:
https://solr.apache.org/guide/8_0/reindexing.html#changing-field-and-field-type-properties
https://solr.apache.org/guide/8_0/defining-fields.html#field-properties
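The behaviour described above can be pictured with a small model: the default is only filled in when a document passes through indexing, and only if the document has no value for the field. This is an illustrative sketch in plain Python, not Solr code; apply_default and source_docs are hypothetical names, and the records are made up.

```python
def apply_default(doc: dict, field: str, default: str) -> dict:
    """Model of index-time defaulting: fill `field` only if the doc lacks it."""
    filled = dict(doc)  # copy, so the source record is left untouched
    filled.setdefault(field, default)
    return filled


# Hypothetical source records that are sent to Solr again during a full reindex.
source_docs = [{"id": "1"}, {"id": "2", "VehicleType": "truck"}]
reindexed = [apply_default(d, "VehicleType", "car") for d in source_docs]
print(reindexed)
```

Documents that are never sent through indexing again keep whatever they had before the schema change, which is why reloading or optimizing the core alone does not populate the field.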

How can I Ignore some fields in a SOLR query

I have Solr 5.3.1 and need to query all fields except a few. (I need to control which fields are searched, not which fields are retrieved; retrieval is done with [/?q=query&fl=field1,field2,field3].)
I tried some solutions, but they did not work:
1. How to exclude fields in a SOLR query [this solution did not work]
2. [the solution below works but takes more time]
query = field1:"+txtSearch+" OR field1:"+ txtSearch+" OR field1:"+txtSearch
3. I set indexed="false" in data-config.xml, which should only exclude that field from searching. But when I search all fields with http://localhost:8983/solr/test?q=query, the query searches every field regardless of whether indexed is "false" or "true".
I have looked at all of these links:
Retrieving specific fields in a Solr query?
How to exclude fields in a SOLR query
https://www.drupal.org/node/1933996
Use copyField.
Here is how you can use it:
Make all the fields stored="true" and indexed="false".
Also create a new field, say cffield, with multiValued="true", stored="false" and indexed="true".
Example Schema :
<field name="field1" type="string" indexed="false" stored="true"/>
<field name="field2" type="string" indexed="false" stored="true"/>
<field name="field3" type="string" indexed="false" stored="true"/>
....
<field name="cffield" type="string" indexed="true" stored="false" multiValued="true"/>
Now, for every field you want to search, add a copyField tag that copies from that source field to the dest field cffield.
Example Schema :
<copyField source="field1" dest="cffield"/>
<copyField source="field2" dest="cffield"/>
....
Now you can search using
query = cffield:txtSearch
This returns results from every field that is a copyField source with cffield as its dest.
indexed="false" needs to be set in schema.xml.
Once you modify schema.xml, you need to re-index the data (and restart the server as well).
Solr will then not search the fields that are not indexed.
If you want to search in a specific field, use the field name together with the search value for that field, like:
q=field1:"value1"
q=field1:value1 OR field2:value2
q=field1:value1 AND field2:value2
q=value1&fq=field2:value2&fq=field3:value3
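When these queries are sent from code, the quotes, colons, and spaces have to be URL-encoded. A minimal sketch using only the Python standard library follows; the base URL assumes the core named test from the question plus the /select handler, and select_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base URL: core "test" from the question plus the /select handler.
BASE_URL = "http://localhost:8983/solr/test/select"


def select_url(q, fq=()):
    """Build a select URL with one q parameter and any number of fq filters."""
    params = [("q", q)] + [("fq", f) for f in fq]
    return BASE_URL + "?" + urlencode(params)


print(select_url('field1:"value1"'))
print(select_url("field1:value1 OR field2:value2"))
print(select_url("value1", fq=["field2:value2", "field3:value3"]))
# Query against the catch-all copyField destination from the earlier answer.
print(select_url("cffield:txtSearch"))
```

Passing the parameters as a list of pairs lets fq appear more than once, which a plain dict would not allow.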

Unable to add new dynamic fields in Solr 6.0

I added a few new dynamic fields in solr-6.0.0/server/solr/configsets/data_driven_schema_configs/conf/managed-schema as follows:
<dynamicField name="*_sst" type="string" indexed="false" stored="true" />
<dynamicField name="*_sin" type="string" indexed="true" stored="false" />
Then I start solr and add a collection as:
bin/solr start -cloud
bin/solr create -c my_coll -shards 2 -replicationFactor 1
I see the dynamic-fields being picked up when I navigate to http://localhost:8983/solr/#/my_coll/files?file=managed-schema
<dynamicField name="*_sst" type="string" indexed="false" stored="true"/>
<dynamicField name="*_sin" type="string" indexed="true" stored="false"/>
However, when I send documents to this collection and query it, I am able to query by *_sst fields (which were meant to be stored-only) and I see *_sin fields in the result (which were meant to be indexed-only).
On seeing the http://localhost:8983/solr/#/my_coll/schema?field=FooPrefix.name2_sst, it does show that my _sst field is mapped correctly, but I am still able to search on it?
Does anyone know what is not correct here?
I think this is because the managed-schema file already has many dynamicFields defined,
and your field must be matching one of them.
I think your field is using
<dynamicField name="*_t" type="text_general" indexed="true" stored="true"/>
or it could be using
<dynamicField name="*_ss" type="strings" indexed="true" stored="true"/>
or some other dynamic field.
I would suggest deleting all the other dynamicFields that are not required and keeping only the ones you added.
Once you are done with this, restart the server, re-index the data, and check again.
The answer is in your screenshot. It is all about docValues. Your attributes are accumulated between the field and type definition. And the definition for the type string now includes docValues="true".
Which means the exact value search would still work against docValues even with indexed=false. And, as of schema version 1.6 (in Solr 5.5 and 6.0), docValues can be returned even when stored=false.
If you don't like that, remove docValues=true from the string type or create another similar type without that flag. Or explicitly override it in your field definitions.
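The way attributes accumulate between the type and the field can be modelled as a simple merge in which the field's own attributes win. This is only an illustrative sketch in plain Python, not how Solr is implemented; the attribute values given for the string type are an assumption about the Solr 6.0 data-driven configset, and effective_properties is a hypothetical name.

```python
def effective_properties(type_attrs: dict, field_attrs: dict) -> dict:
    """Merge type-level and field-level attributes; the field overrides the type."""
    merged = dict(type_attrs)
    merged.update(field_attrs)
    return merged


# Assumed definition of the "string" type in the Solr 6.0 configset.
string_type = {"class": "solr.StrField", "sortMissingLast": "true", "docValues": "true"}

# The dynamic field from the question: docValues is inherited from the type.
sst_field = {"name": "*_sst", "type": "string", "indexed": "false", "stored": "true"}
print(effective_properties(string_type, sst_field))

# Explicit override in the field definition, as suggested in the answer.
sst_no_dv = dict(sst_field, docValues="false")
print(effective_properties(string_type, sst_no_dv))
```

In the first case the merged result still carries docValues="true", which matches the behaviour the question observed; in the second, the field-level setting replaces it.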

Default search on all fields

I can only get Solr to work when I include a field in the query, for example:
http://localhost:8983/solr/collection1/select?q=lastname:johnson
The above query returns approximately 18 results.
Shouldn't it be possible to use Solr (/Lucene) without specifying the field? Such as:
http://localhost:8983/solr/collection1/select?q=johnson
I also tried adding the fields list:
http://localhost:8983/solr/collection1/select?q=johnson&fl=cus_id%2Cinitials%2Clastname%2Cpostcode%2Ccity
But all these queries return zero results.
These are the fields from my schema.xml:
<field name="cus_id" type="string" indexed="true" stored="true"/>
<field name="initials" type="text_general" indexed="true" stored="true" />
<field name="lastname" type="text_general" indexed="true" stored="true"/>
<field name="postcode" type="string" indexed="true" stored="true" />
<field name="city" type="text_general" indexed="true" stored="true"/>
I don't know what else to try. Any suggestions?
In Solr, if no field is specified, the search runs against the default field (df).
So when you search q=johnson and debug the query, you will find it searching the default field, which is usually the field text.
You can copyField all the fields into the single field text and make it the default field (if it is not the default already), so that all your search queries run against that field.
Also, fl lists the fields that are returned as part of the result; it is not related to the fields on which the search is performed.
With dismax, you can use the qf param to specify multiple fields with variable boosts.
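Both options can be expressed as request parameters. The sketch below builds the two URLs with the Python standard library; the base URL and field names come from the question, while the boost values and the helper names default_field_url and dismax_url are made up for illustration.

```python
from urllib.parse import urlencode

# Base URL taken from the question.
BASE_URL = "http://localhost:8983/solr/collection1/select"


def default_field_url(q, df):
    """Search an unqualified term against an explicitly chosen default field."""
    return BASE_URL + "?" + urlencode({"q": q, "df": df})


def dismax_url(q, boosts):
    """Search several fields at once with dismax; boosts maps field -> weight."""
    qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
    return BASE_URL + "?" + urlencode({"q": q, "defType": "dismax", "qf": qf})


print(default_field_url("johnson", "lastname"))
# Illustrative boosts only: weight lastname above the other text fields.
print(dismax_url("johnson", {"lastname": 2, "city": 1, "initials": 1}))
```

The df variant still searches a single field, so it only helps if that field contains the copied content; the qf variant searches the listed fields directly.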

Configure fields for considering duplicates

Consider a Solr index with the following fields:
<fields>
<field name="id" type="uuid" indexed="true" stored="true" default="0"/>
<field name="user" stored="true" type="string" multiValued="false" indexed="true"/>
<field name="text" stored="true" type="textmulti" multiValued="false" indexed="true"/>
<field name="media" stored="true" type="string" multiValued="false" indexed="true"/>
</fields>
I would consider a newly indexed Document to be a dupe (and therefore to be rejected) if there exists a current document that has identical user and text fields, no matter what the id or media fields' content are. A matching user or a matching text alone is not enough to be considered a dupe; both user and text must match.
I have read through Document Duplication Detection and XML Messages for Updating a Solr Index on the Solr wiki but I still do not see how to configure this. Any ideas? I am using the wonderful solr-php-client to connect to Solr via PHP.
Thanks.
You probably have some reason not to do so, but you could use the concatenation of user and text as the id. Then you would not need Duplicate Detection, as Solr does it for you if you don't overwrite.
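Because the id field in the question has type uuid, a plain concatenation would not be a valid value. One possible variation, which is a swapped-in technique and not part of the answer above, is to derive a deterministic name-based UUID from user and text. A minimal sketch with the Python standard library follows; the namespace string and the function name dedupe_id are hypothetical, and the same idea could be ported to PHP for solr-php-client.

```python
import uuid

# Hypothetical namespace for this index; any fixed UUID works as long as it
# never changes between indexing runs.
DEDUPE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/solr-dedupe")


def dedupe_id(user: str, text: str) -> str:
    """Return the same UUID string for every document with the same user and text."""
    # The length prefix keeps ("ab", "c") distinct from ("a", "bc").
    key = f"{len(user)}:{user}{text}"
    return str(uuid.uuid5(DEDUPE_NAMESPACE, key))


doc = {"user": "alice", "text": "hello world", "media": "photo.jpg"}
doc["id"] = dedupe_id(doc["user"], doc["text"])
print(doc["id"])
```

The media field does not take part in the key, so two documents that differ only in media end up with the same id.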
