I am new to Apache Solr and am following this tutorial: https://examples.javacodegeeks.com/enterprise-java/apache-solr/apache-solr-tutorial-beginners/
I am able to index my books.csv on my local machine, but on my virtual machine I get the following error: Unable to access jarfile post.jar
I am using Solr 6.3.0 and Java 1.8.
Please help!
Your java command cannot find post.jar in the bin directory.
By default, post.jar is located in the {solr_home}\example\exampledocs directory of the Solr installation.
Try giving the following path for post.jar:
-jar ../example/exampledocs/post.jar
The complete command, as per your directory structure:
solr-6.0.0\bin>java -Dtype=text/csv -Durl=http://localhost:8983/solr/jcg/update -jar ../example/exampledocs/post.jar books.csv
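The "Unable to access jarfile" error usually just means the path after -jar does not point at an existing file. A minimal sketch of checking a few likely locations before running the command, assuming a POSIX shell (on Windows, something like Git Bash or WSL). The function name find_post_jar and the list of candidate locations are hypothetical; only example/exampledocs comes from the answer above.

```shell
# Hypothetical helper: print the first post.jar found under a Solr install directory.
# Only example/exampledocs is the documented default; the other candidates are guesses.
find_post_jar() {
    solr_dir=$1
    for candidate in \
        "$solr_dir/example/exampledocs/post.jar" \
        "$solr_dir/bin/post.jar" \
        "$solr_dir/post.jar"
    do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "post.jar not found under $solr_dir" >&2
    return 1
}
```

The printed path can then be passed to java, for example: jar=$(find_post_jar ..) && java -Dtype=text/csv -Durl=http://localhost:8983/solr/jcg/update -jar "$jar" books.csv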
Sometimes there is an issue reading the path of the post.jar file.
The tutorial gives the command as: java -jar /exampledocs/post.jar /films/films.xml
Instead, give the full path (quoted, because it contains a space):
java -jar "C:\Program Files\solr-8.6.1\example\films\post.jar" films.xml
This worked for me.
Try adding the path of post.jar to your solrconfig.xml file so that the core or collection knows where it is.
I've been indexing a directory of folders and files containing HTML pages, docs, ppts, pdfs, etc. I noticed that files of type LOG are being indexed, and I don't want them indexed because their contents aren't needed.
To index to Solr I've been using this command (I am a Windows user, so I use the simple post tool):
java -Dc=collection -Dport=4983 -Drecursive -Dauto -jar example/exampledocs/post.jar c:/folder
Instead, I tried the following command to exclude LOG files:
java -Dc=collection -Dport=4983 -Drecursive -Dfiletypes=xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt -jar example/exampledocs/post.jar c:/folder
Solr refuses to index and throws errors (HTTP 400). -Dfiletypes should be an actual option I can use, but Solr doesn't seem to like it. I even tried putting [] around the list of file types, and it still doesn't work. Is my syntax wrong?
If I add -Dauto, it works:
java -Dc=collection -Dport=4983 -Drecursive -Dauto -Dfiletypes=xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt -jar example/exampledocs/post.jar c:/folder
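To see in advance which files such a comma-separated list would let through, the matching can be sketched in a few lines of POSIX shell. This is only an illustration of extension allow-listing, not the post tool's actual implementation. The function name matches_filetypes is hypothetical, and lowercasing the extension is a convenience of this sketch that the tool may or may not share.

```shell
# Hypothetical helper: succeed if the file's extension appears in a comma-separated list.
matches_filetypes() {
    name=${1##*/}          # strip any directory part
    types=$2
    ext=${name##*.}        # text after the last dot
    # A name without a dot has no extension to match.
    if [ "$ext" = "$name" ]; then
        return 1
    fi
    ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
    case ",$types," in
        *",$ext,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the list from the command above, a name such as c:/folder/server.log is rejected because log is not in the list, while c:/folder/report.pdf is accepted.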
To post an XML document, I used the following command:
java -Durl=http://localhost:8983/solr/Hanu_Core/update -jar .\post.jar .\money.xml
Judging by your title, I assume you are asking for the commands to post a PDF using SimplePostTool.
On the command line, you can run the following to see all available properties and options:
$ java -jar example/exampledocs/post.jar -h
You can also follow an example like this one:
java -Durl=http://localhost:8983/solr/pdfs/update/extract -Dcommit=yes -Dtype=application/pdf -jar exampledocs/post.jar ~/solr-4.10.3/solr-app/solr_home/pdfs/pdfs_res/Apache_Solr.pdf
I hope this solves your problem.
As @Anis suggested, you can use -Dtype=application/pdf. You can also use -Dauto.
Example:
java -Dauto -Dc=collection_name -jar post.jar pdf_file.pdf
With -Dauto we can index every document format that Tika supports, e.g. txt, doc, docx, pdf, xml, html.
For more details, run the help command:
java -jar post.jar -h
Is there a way to add username/password parameters to the following solr update command?
...........jdk7x64\bin\java -jar -Durl=http://localhost:8983/solr/collection1/update post.jar test_data.xml
Or is there any other way to post files to a password-protected Solr instance?
Can you try the command below? I hope it helps.
jdk7x64\bin\java -jar -Durl=http://username:password@localhost:8983/solr/collection1/update post.jar test_data.xml
This adds username:password to the URL just before the hostname, separated from it by @.
According to Solr Issue, it is available from Solr 4.8.
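Credentials embedded in a URL this way conventionally map to HTTP Basic authentication, where the client sends the Base64 encoding of username:password in an Authorization header. A minimal sketch of building that header by hand, assuming a POSIX shell with the base64 utility available; the function name basic_auth_header is hypothetical.

```shell
# Hypothetical helper: print the HTTP Basic Authorization header for a user and password.
basic_auth_header() {
    user=$1
    pass=$2
    # Base64-encode "user:pass"; tr removes line wrapping that some base64 builds add.
    token=$(printf '%s:%s' "$user" "$pass" | base64 | tr -d '\n')
    printf 'Authorization: Basic %s\n' "$token"
}

basic_auth_header username password
# prints: Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```

If the post tool is not an option, the same header could be passed to another HTTP client, for example curl -H "$(basic_auth_header username password)" -H 'Content-Type: text/xml' --data-binary @test_data.xml http://localhost:8983/solr/collection1/update. Note that Basic authentication only encodes the password, so it should go over HTTPS when the network is not trusted.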
I am trying to run Solr in schemaless mode on a Windows machine, as described here. But when I run the command
java -Dsolr.solr.home=example-schemaless\solr -jar start.jar
I get the error:
Could not find or load main class .solr.home=example-schemaless.solr
Check the path to your start.jar file. Use the correct path to start.jar and set the Solr home correctly; that may fix your issue.
I'm having a little trouble understanding how Solr fits in with Jetty, and why I can't seem to get the start.jar in the distribution package to work.
I can run all of the example configurations via java -jar start.jar. However, when I try to run something like the following --
java -Dsolr.solr.home=/Users/jwwest/solr -jar $(brew --prefix solr)/libexec/example/start.jar
-- the following error occurs:
java.io.FileNotFoundException: No XML configuration files specified in start.config or command line.
at org.eclipse.jetty.start.Main.start(Main.java:506)
at org.eclipse.jetty.start.Main.main(Main.java:95)
I opened up the start.jar file, and there is a start.config file located inside of the jar which I'm assuming should handle this configuration for me. I'm not understanding why it will work when run from inside of the distribution examples directory, but not outside of it.
You also need to define the jetty.home property. Try:
java -Dsolr.solr.home=/Users/jwwest/solr -jar $(brew --prefix solr)/libexec/example/start.jar -Djetty.home=$(brew --prefix solr)/libexec/example
You can see the effective command line start.jar generates by using the --dry-run command line flag.
java -jar start.jar --dry-run
That will output everything with full path names so you can run it from outside the directory.
Source: http://www.eclipse.org/jetty/documentation/9.0.0.M3/advanced-jetty-start.html
The start.jar is a jetty specific mechanism that works to build out all the classpath requirements for starting up Jetty. It is generally only used in the scope of the jetty distribution. Pulling the start.jar out of the configuration and placing it somewhere else renders the default configuration of the start.config rather moot.
My understanding of Solr is that it bundles itself with a distribution of jetty, placing what it needs to run into the distribution and repackages it as its own. They may have a custom start.config file that further adds its own locations for classpath resources and the like, or not.
The exception you are seeing stems from the start.config file expecting an etc/ directory containing jetty.xml formatted xml files which are used to configure the jetty process.
Jetty often being used in an embedded format has little to do with this issue; it is simply a common use case because jetty is incredibly easy to embed into an application. Embedded instances of jetty rarely (if ever) leverage a start.jar; instead it is up to the embedding application to manage its own classpath.
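The layout that start.jar expects can be checked before launching. A minimal sketch, assuming a POSIX shell and assuming the configuration file is named etc/jetty.xml (the answer above only says the etc/ directory holds jetty.xml formatted files); the function name check_jetty_home is hypothetical.

```shell
# Hypothetical helper: report whether a directory looks like a usable jetty.home.
# Assumes start.jar sits at the top level and the config lives at etc/jetty.xml.
check_jetty_home() {
    dir=$1
    rc=0
    if [ ! -f "$dir/start.jar" ]; then
        echo "missing: $dir/start.jar" >&2
        rc=1
    fi
    if [ ! -f "$dir/etc/jetty.xml" ]; then
        echo "missing: $dir/etc/jetty.xml" >&2
        rc=1
    fi
    return $rc
}
```

Running check_jetty_home "$(brew --prefix solr)/libexec/example" before the java command would show whether the directory passed as jetty.home actually contains the files Jetty is looking for.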
First, change to the folder where start.jar is located, then execute the same command.
Jetty is often used as an embedded container. If you want to use Jetty, a good start would be to copy the example directory and rename it to whatever you want. The solr directory is the one for basic configuration.
Otherwise, it is recommended to use Tomcat and the solr.war file.