In my schema.xml
<dynamicField name="attributes_*" type="integer" indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="itemAttributes_*" type="integer" indexed="true" stored="true" omitNorms="true"/>
After I insert a record with dynamic fields, where are these fields created on disk?
The schema is "only" used for validation / querying / etc. by Solr, meaning that the content is compared (and field types applied) to the schema when a field is being queried (to get the field type and analysis chain) or when it's being inserted. The schema is a Solr concept, while Lucene is the thing that makes Solr work behind the scenes.
Since the actual storage of data is not connected to the schema, and a Lucene document is a collection of field names and associated values, the field name doesn't have to exist in the schema to be stored in a Lucene document - just for Solr to accept it for storage into its Lucene index.
The fields are created in the index in the same way as any field explicitly named in the schema.
Related
I am indexing RDBMS data into Solr from my Java application. For each row of a table I create a Java bean and add it to the Solr server. (Each bean is simply one Solr document: I use the table's column name as the Solr field name and the corresponding value as the field's value.) We support indexing data from any number of tables, where each table has different column names and data types. To handle this, we use a dynamic field in schema.xml, as below:
<dynamicField name="*" type="string" indexed="true" stored="true" multiValued="true"/>
The problem with this configuration is that every field is typed as string, but I want numeric types for numeric RDBMS columns and string for VARCHAR columns. Please suggest how I can achieve this. I can't add a suffix or prefix to the field name when creating the Solr document, because I want to index and retrieve documents using field names identical to the table's column names.
Any suggestions are appreciated.
I am working on an eCommerce web application developed with .NET MVC. I use Solr to index product details, so I have defined product-related fields in my Solr schema file.
Now I also want to index search terms in Solr. How can I set up my schema file to store and index search terms, given that the schema is product-specific?
Can anyone please suggest?
You can create a separate core for this and define a new schema.xml for it. Alternatively, if you want to keep using the existing schema.xml, you can use dynamic fields, so that you do not have to change the schema whenever you need to add another field.
You can use Dynamic fields.
Dynamic fields allow Solr to index fields that you did not explicitly define in your schema.
This is useful if you discover you have forgotten to define one or more fields. Dynamic fields can make your application less brittle by providing some flexibility in the documents you can add to Solr.
A dynamic field is just like a regular field except it has a name with a wildcard in it. When you are indexing documents, a field that does not match any explicitly defined fields can be matched with a dynamic field.
For example, suppose your schema includes a dynamic field with a name of *_i.
If you attempt to index a document with a cost_i field, but no explicit cost_i field is defined in the schema, then the cost_i field will have the field type and analysis defined for *_i.
Like regular fields, dynamic fields have a name, a field type, and options.
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
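The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Solr's implementation: the `EXPLICIT_FIELDS` and `DYNAMIC_FIELDS` tables and the `resolve_field_type` helper are hypothetical names, and the sketch does not model how Solr chooses between several dynamic patterns that match the same name.

```python
from fnmatch import fnmatchcase

# Hypothetical, simplified view of a schema: explicit fields plus dynamic patterns.
EXPLICIT_FIELDS = {"id": "string"}
DYNAMIC_FIELDS = [("*_i", "int")]


def resolve_field_type(name):
    """Return the field type for a field name, or None if nothing matches."""
    # Explicitly defined fields are checked first.
    if name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[name]
    # Otherwise use the first dynamic pattern whose wildcard matches the name.
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return field_type
    # No explicit field and no dynamic pattern matched.
    return None


print(resolve_field_type("cost_i"))  # int
print(resolve_field_type("id"))      # string
print(resolve_field_type("title"))   # None
```

In this sketch a name that matches nothing returns None, which corresponds to the case where the schema gives no definition for the field.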
My uniqueKey is defined as:
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<uniqueKey>id</uniqueKey>
I loaded several docs into Solr, each with its own "id" field. What I need now is to update the "id" value. Is that possible?
When I try to do that I get this error:
Document contains multiple values for uniqueKey field
I am using Apache Solr 4.3.0
It's not directly possible. Before I get into how you can do it indirectly, I need to explain a couple of things.
The value in the uniqueKey field is how Solr handles document updating/replacing. When you send a document in for indexing, if a document with the same uniqueKey value already exists in the index, Solr will delete its own copy before indexing the new one.
The atomic update functionality is slightly different. It lets an update add, change, or remove any field in the document except the uniqueKey field - because that's the way that Solr can identify the document.
What you need to do is basically index a new document with all the data from the old document, and delete the old document. If all the fields in the document are available to the indexing process, then you can just index the new document, either before or after deleting the old one. Otherwise, you can query the existing doc out of Solr, make a new one and index it, and then delete the old one.
In order to use the existing Solr document to index a new one, all fields must be stored, unless they are copyField destinations, in which case they must NOT be stored. Atomic updates (discussed above) have the same requirement. If one or more of these fields is not stored, then the search result will not contain that field and the data will be lost.
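As a rough sketch of the "query the old doc, index a new one, delete the old one" approach, the Python below builds the update commands for a document that has already been fetched. Several things here are assumptions rather than anything stated above: the helper name `build_rekey_commands` is hypothetical, the command layout assumes Solr's JSON update format (`add` and delete-by-`id`), and the `_version_` field is dropped on the assumption that it should not be carried over to the new document. Check all of these against your Solr version before relying on them.

```python
import json


def build_rekey_commands(old_doc, new_id, key_field="id"):
    """Build update commands that index old_doc under new_id, then delete the old copy."""
    old_id = old_doc[key_field]
    if new_id == old_id:
        # Adding and then deleting the same key would remove the document entirely.
        raise ValueError("new id must differ from the old id")
    # Copy every field returned by the query; drop _version_ (assumed internal field).
    new_doc = {k: v for k, v in old_doc.items() if k != "_version_"}
    new_doc[key_field] = new_id
    # "delete": {"id": ...} is assumed to mean delete-by-uniqueKey in the JSON update format.
    return {"add": {"doc": new_doc}, "delete": {"id": old_id}}


# Hypothetical document as it might come back from a query (all fields stored).
old = {"id": "A1", "title": "Widget", "_version_": 1234567890}
commands = build_rekey_commands(old, "B7")
print(json.dumps(commands, indent=2))
```

The resulting dictionary would be sent to the update handler, followed by a commit. Any field that was not stored will simply be missing from `old`, and therefore from the new document.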
I have many different column names and types that I want to import. Do I need to change my schema.xml to have entries for each of these specific field types, or is there a way for the importhandler to generate the schema.xml from the underlying SQL data?
You need to define the fields you need to import in the schema.xml.
The DIH does not autogenerate the fields, and it is better to define the fields explicitly if there are only a few of them.
Solr also allows you to define dynamic fields, where the fields need not be explicitly defined but just need to match a wildcard pattern.
<dynamicField name="*_i" type="integer" indexed="true" stored="true"/>
You can also define a catch-all field with Solr; however, its behaviour cannot be controlled per field, because the same analysis would be applied to all the fields.
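Since the fields have to be written into schema.xml by hand, one option is to generate the `<field>` lines from the column metadata yourself. The sketch below is a hypothetical helper, not something DIH provides: the `SQL_TO_SOLR_TYPE` mapping, the column list, and the type names on the right-hand side are all assumptions and must match the `<fieldType>` entries in your own schema.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical mapping from SQL column types to Solr fieldType names.
# The names on the right must exist as <fieldType> entries in your schema.xml.
SQL_TO_SOLR_TYPE = {"INTEGER": "int", "VARCHAR": "string", "DOUBLE": "double"}


def field_definition(column_name, sql_type):
    """Return a <field> element for one column, falling back to 'string'."""
    solr_type = SQL_TO_SOLR_TYPE.get(sql_type.upper(), "string")
    # quoteattr adds the surrounding quotes and escapes special characters.
    return "<field name=%s type=%s indexed=\"true\" stored=\"true\"/>" % (
        quoteattr(column_name),
        quoteattr(solr_type),
    )


# Hypothetical column metadata, e.g. read from the database catalog.
columns = [("price", "INTEGER"), ("title", "VARCHAR")]
schema_lines = [field_definition(name, sql_type) for name, sql_type in columns]
print("\n".join(schema_lines))
```

The printed lines can then be pasted into schema.xml and adjusted (for example `multiValued` or `required`) where a column needs different options.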
I have a document structure in Solr that looks something like this (irrelevant fields excluded):
<field name="review_id" type="int" indexed="true" stored="true"/>
<field name="product_id" type="int" indexed="true" stored="true"/>
<field name="product_category" type="string" indexed="true" stored="true" multiValued="true"/>
Here, product_id is one-to-many with respect to review_id (one product has many reviews).
I can get a faceted count of reviews in each category by doing:
/select?q=*:*&rows=0&facet=true&facet.field=product_category
I want to be able to facet on product_category, but get the number of distinct product_id values instead of the number of review_id values. Is this possible to do in Solr?
There is no one-to-many in a Solr index. It's not a relational database. The index is either about reviews or about products, and that depends on what you'll be searching for. To quote the Solr wiki about schema design:
Solr provides one table. Storing a set database tables in an index generally requires denormalizing some of the tables. Attempts to avoid denormalizing usually fail.
So the first step is fixing the schema design. Only after that (and always keeping the fact above in mind) can you design facets and other stuff.
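To make the difference concrete, here is a small in-memory Python sketch (not Solr code) using made-up documents shaped like the schema in the question. It counts categories once over review documents and once over a denormalized, one-document-per-product collection; the sample data and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical review documents, shaped like the schema in the question.
reviews = [
    {"review_id": 1, "product_id": 10, "product_category": ["audio"]},
    {"review_id": 2, "product_id": 10, "product_category": ["audio"]},
    {"review_id": 3, "product_id": 11, "product_category": ["audio", "video"]},
]

# Counting categories over a review-centric collection counts review documents.
review_counts = Counter(c for r in reviews for c in r["product_category"])

# A product-centric collection holds one document per product_id.
products = {}
for r in reviews:
    products.setdefault(
        r["product_id"],
        {"product_id": r["product_id"], "product_category": r["product_category"]},
    )

# The same count over product documents yields distinct products per category.
product_counts = Counter(c for p in products.values() for c in p["product_category"])

print(review_counts["audio"], product_counts["audio"])  # 3 2
```

The counting logic is identical in both cases; only the unit that each document represents changes, which is why the schema has to be decided before the facets.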