I understand that with Apache Solr 8.11 we can manage schemas and create fields and field types through the Schema API. The question I have is:
How do you create enum field types through the API? We usually need to create an enumsConfig.xml file, but I can't seem to find any documentation on how to upload such a file.
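For context, here is a sketch of what the Schema API call might look like once an enumsConfig.xml file is already present in the configset; add-field-type is a documented Schema API command, but the collection, type name, and enum name here are hypothetical:

curl -X POST -H 'Content-type:application/json' \
  "http://localhost:8983/solr/mycollection/schema" -d '{
    "add-field-type": {
      "name": "severityType",
      "class": "solr.EnumFieldType",
      "enumsConfig": "enumsConfig.xml",
      "enumName": "severity"
    }
  }'

The open question is how to get enumsConfig.xml itself into the configset in the first place.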
Related: this document describes how to include custom attributes in Pub/Sub messages:
https://cloud.google.com/pubsub/docs/samples/pubsub-publish-custom-attributes
Is this possible using the newer Spring Cloud Stream functional APIs?
streamBridge.send("myEvent-out-0", event)
I am currently publishing as per the above. The second parameter is just of type Object, so there is no way to differentiate custom attributes from regular ones.
Thanks
You can publish a Spring Message and specify your attributes as headers:
streamBridge.send("myEvent-out-0", MessageBuilder
        .withPayload(event)
        .setHeader("fooKey", "fooValue").setHeader("barKey", "barValue") // headers become Pub/Sub attributes
        .build());
We have a SolrCloud cluster with 4 nodes. Collections are created with 4 shards and 2 replicas.
I was using the REST endpoints below (pointing to a single instance for all operations) to create features and models:
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/feature-store
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/model-store
When I execute the REST endpoints to fetch the existing features and models:
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/feature-store
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/model-store
I sometimes see the features/models I created, and other times the response says they don't exist.
At this point, if I restart my cluster, the GET calls always return the created features and models.
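For concreteness, the create calls are PUTs and the fetch calls are GETs against the same paths, roughly like this curl sketch with the same placeholders:

# Upload (create) features
curl -XPUT "http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/feature-store" \
  -H 'Content-type:application/json' --data-binary "@/path/myFeatures.json"

# Fetch the existing features
curl "http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/feature-store"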
A couple of questions:
Like config sets, is there a way to upload features and models without using the REST endpoints?
Is a restart required after uploading features and models?
Should the feature/model upload be executed against all collections in the cluster? (Assume I have more than one collection with the same data, created for different purposes; please don't ask why, I have them.)
Are the features/models available to collections created in the future with the same config set? I ask because the uploaded feature/model is visible inside the config set as _schema_model-store.json and _schema_feature-store.json.
Please advise. Thanks!
Did you find any answers?
I was stuck with the feature-store not being available on all shards. Your suggestion of restarting Solr helped. Is that the permanent solution?
To answer your Question #3:
You need to upload the features/models for each collection, since the collection name is part of the upload URL; notice the "techproducts" in the feature upload example from the Solr docs:
curl -XPUT 'http://localhost:8983/solr/techproducts/schema/feature-store' --data-binary "@/path/myFeatures.json" -H 'Content-type:application/json'
Just reload the collection to make the feature and model JSON files available on all shards of the collection. A restart of Solr is not required.
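For reference, a reload can be issued through the Collections API; a minimal sketch, assuming a collection named techproducts:

curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=techproducts"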
I was wondering if it's possible to get the full schema, or just the fields the schema defines, in JSON format. Obviously I could scrape the page the schema is shown on:
/solr/#/collection1/schema
do a transformation, and create my own JSON, but it would be nicer if Solr had a method built in :)
Thanks in advance
You cannot get the schema.xml file directly in JSON format, but you can get the raw file from Solr instead of having to scrape the Solr admin page that shows it. You can use this URL, where collection1 is the name of your core:
http://localhost:8080/solr/collection1/admin/file?file=schema.xml&contentType=text/xml;charset=utf-8
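That said, on recent Solr versions the read-only Schema API also exposes the live schema as JSON; a sketch, assuming the same host and core:

# Full schema as JSON
curl "http://localhost:8080/solr/collection1/schema?wt=json"

# Just the fields
curl "http://localhost:8080/solr/collection1/schema/fields?wt=json"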
I'm trying to figure out how to index/search PDF, doc, and maybe txt files that were uploaded via a webform. I've found a module (Search API attachments) that will index files, but it appears that it only indexes files attached to nodes. :(
Our client wants to be able to search the contents of resumés that are submitted from a webform.
If your client is expecting hundreds of nodes, it might be worthwhile to set up Apache Solr. Then you can use Tika to index all kinds of files: http://tika.apache.org/
If that's not an option, you can write a custom module that uses the Webform API to save the attached file as a node... and then use the Search API attachments module.
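To illustrate the Solr side: Solr ships with the ExtractingRequestHandler (Solr Cell), which uses Tika internally to pull text out of rich documents like PDF and Word files. A minimal sketch, assuming a core named collection1 and a hypothetical resume.pdf:

# Extract and index the file contents; literal.id assigns the document id
curl "http://localhost:8983/solr/collection1/update/extract?literal.id=resume1&commit=true" \
  -F "file=@resume.pdf"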
I'm trying to add new data to Solandra according to the Solr schema, but I can't find any examples of this. My ultimate goal is to integrate Solandra with django-solr.
My understanding of inserting and updating in Solr, based on the original Solr and django-solr, is that you send the new data over HTTP to the appropriate path, for example:
http://localhost:8983/solandra/wikipedia/update/json
However, when I access that URL, the browser keeps telling me HTTP ERROR: 404.
Can you help me understand the steps to add new data and delete data in the Solandra environment?
I also had a look at the reuters-demo, but the data-insertion procedure happens inside reutersimporter.jar, and I can't see its source either. So please help me understand how the system works in terms of inserting and deleting data.
Thank you.
Since you are using the JSON update handler, this UpdateJSON page on the Solr Wiki has some good examples of inserting data using the JSON handler via curl. Also, the Indexing Data section of the Solr Tutorial shows how you can insert data using the post.jar file that is included with the Solr source.
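For illustration, the standard Solr JSON update calls look roughly like this; whether Solandra exposes the same /solandra/wikipedia/update/json path depends on its solrconfig.xml, and the field names here are hypothetical:

# Add (or update) a document; commit=true makes it searchable immediately
curl 'http://localhost:8983/solandra/wikipedia/update/json?commit=true' \
  -H 'Content-type:application/json' \
  -d '[{"id": "1", "title": "Example article"}]'

# Delete a document by id
curl 'http://localhost:8983/solandra/wikipedia/update/json?commit=true' \
  -H 'Content-type:application/json' \
  -d '{"delete": {"id": "1"}}'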
Are you creating the Solr schema.xml and solrconfig.xml and posting them to Solandra? If you add the JSON handler, this should work. The reuters-demo uses SolrJ; django-solr should work as well.