Error deploying configuration descriptor Solr - solr

I followed the steps below to integrate Solr with Tomcat on a Windows machine. Can you please clarify what I am doing wrong?
1) Downloaded Solr 5.2.1 and unzipped it to C:\downloads\solr-5.2.1\solr-5.2.1.
2) Downloaded the zipped version of Tomcat 7 and unzipped it to C:\downloads\apache-tomcat-7.0.62\apache-tomcat-7.0.62.
3) Copied the JAR files from the C:\downloads\solr-5.2.1\solr-5.2.1\dist\solrj-lib directory to the C:\downloads\apache-tomcat-7.0.62\apache-tomcat-7.0.62\lib directory.
4) Created a solr.xml in the C:\downloads\apache-tomcat-7.0.62\apache-tomcat-7.0.62\conf\Catalina\localhost folder:
<?xml version='1.0' encoding='UTF-8'?>
<context docBase="C:/downloads/apache-tomcat-7.0.62/apache-tomcat-7.0.62/webapps/solr.war" debug="0" crossContext="true" >
<environment name="solr" type="java.lang.String" value="/apache-tomcat-7.0.62/webapps/" override="true"></environment>
</context>
5) Copied the solr.war file from C:\downloads\solr-5.2.1\solr-5.2.1\server\webapps to the C:\downloads\apache-tomcat-7.0.62\apache-tomcat-7.0.62\webapps folder.
6) Started Tomcat using the startup.bat command in the bin folder.
7) Edited web.xml to add:
<env-entry>
<env-entry-name>solr/home</env-entry-name>
<env-entry-value>C:/downloads/solr-5.2.1/solr-5.2.1</env-entry-value>
<env-entry-type>java.lang.String</env-entry-type>
</env-entry>
8) Restarted Tomcat and opened http://localhost:8080/solr, which returns a 404 Not Found error. The error in the console is:
SEVERE: Error deploying configuration descriptor C:\downloads\apache-tomcat-7.0.62\apache-tomcat-7.0.62\conf\Catalina\localhost\solr.xml
java.lang.NullPointerException
at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:645)

The Solr wiki states that running 5.x versions on Tomcat is no longer supported:
Internally, Solr is still implemented via Servlet APIs and is powered by Jetty -- but this is simply an implementation detail. Deployment as a "webapp" to other Servlet Containers (or other instances of Jetty) is not supported, and may not work in future 5.x versions of Solr when additional changes are likely to be made to Solr internally to leverage custom networking stack features.
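In practice this means running Solr 5.x through the launcher script bundled in the distribution instead of deploying solr.war. A minimal sketch, assuming the archive is unpacked at the directory passed as the first argument (on Windows the bundled counterpart is bin\solr.cmd); the helper name start_solr is hypothetical and not part of Solr:

```shell
#!/bin/sh
# Sketch: start Solr 5.x with its bundled launcher rather than via Tomcat.
# start_solr is a hypothetical helper name used only for illustration.
start_solr() {
    solr_dir=$1          # directory where the Solr archive was unpacked
    port=${2:-8983}      # 8983 is Solr's default port

    if [ ! -x "$solr_dir/bin/solr" ]; then
        echo "launcher not found: $solr_dir/bin/solr" >&2
        return 1
    fi
    # "start -p <port>" launches the bundled Jetty on the given port.
    "$solr_dir/bin/solr" start -p "$port"
}
```

Calling start_solr with the unpacked solr-5.2.1 directory should bring the admin UI up at http://localhost:8983/solr, with no Tomcat context descriptor or web.xml edits involved.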

Related

Ubuntu upgrade (14.04.3) and solr upgrade (5.3) breaks solr3/jetty setup

While upgrading our production server to Ubuntu 14, the Solr/Jetty setup broke.
To fix it, I tried:
- upgrading Java to version 8
- upgrading Solr to 5.3
My earlier config used .war files. I realized that Solr no longer supports .war deployment, so I cannot use my earlier config.
These were the steps I used earlier:
Copy the solr.war file from the downloaded and untarred Solr to the webapps folder, and make jetty the owner:
sudo cp ~/x/apache-solr-3.6.1/dist/apache-solr-3.6.1.war /usr/share/jetty/webapps/solr.war
sudo chown jetty:jetty /usr/share/jetty/webapps/solr.war
Create a context file in JETTY_HOME/contexts (sudo vim /usr/share/jetty/contexts/solr.xml):
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE Configure PUBLIC "-//Mort Bay Consulting//DTD Configure//EN" "http://jetty.mortbay.org/configure.dtd">
<!-- Set the solr.solr.home system property -->
<Configure class="org.mortbay.jetty.webapp.WebAppContext">
<Call name="setProperty" class="java.lang.System">
<Arg type="String">solr.solr.home</Arg>
<Arg type="String">/usr/share/solr</Arg>
</Call>
</Configure>
Tell Solr how many cores we have and what they are in SOLR_HOME (sudo vim /usr/share/solr/solr.xml):
<solr persistent="true" sharedLib="lib">
<cores adminPath="/admin/cores">
<core name="demo" instanceDir="demo" />
<core name="dev" instanceDir="dev" />
<core name="qa" instanceDir="qa" />
<core name="www" instanceDir="www" />
</cores>
</solr>
etc.
I don't want to use the Solr instance that comes with rsolr (4) if I can help it, but I can't find the corresponding info for Solr 5.3.
I'm running Rails 3.2.21 and the solr_sunspot gem 2.2.0.
Thanks in advance!
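For reference, Solr 5.x no longer reads the legacy <cores> section of solr.xml; cores are discovered by finding a core.properties file in each core directory under the Solr home. A minimal sketch of converting the four-core layout above, assuming each core directory already holds its own conf folder; the helper name write_core_properties is hypothetical:

```shell
#!/bin/sh
# Sketch: mark each core directory for Solr 5.x core discovery.
# write_core_properties is a hypothetical helper, not a Solr command.
write_core_properties() {
    solr_home=$1
    shift
    for core in "$@"; do
        mkdir -p "$solr_home/$core"
        # A core.properties file marks the directory as a core;
        # name=<core> sets the core name explicitly.
        printf 'name=%s\n' "$core" > "$solr_home/$core/core.properties"
    done
}
```

With the paths from the question this would be called as write_core_properties /usr/share/solr demo dev qa www, after which the bundled launcher can be pointed at that home with bin/solr start -s /usr/share/solr. A solr.xml in the newer format (one ships under server/solr in the distribution) is still expected in the Solr home.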

Module and Version in GAE/J Cron

How can I specify both module and version in GAE/J cron?
I read this page.
The target string is prepended to your app's hostname. It is usually the name of a module. The cron job will be routed to the default version of the named module. Note that if the default version of the module changes, the job will run in the new default version. If there is no module with the name assigned to target, the name is assumed to be an app version, and App Engine will attempt to route the job to that version. See About appengine-web.xml
My understanding is that either module name or version can be specified in <target>, but I want to specify both module name and version.
How can I do that?
To achieve your goal, you need two files: appengine-web.xml and cron.xml. As you already said, the target tag of cron.xml lets you set the module or version name. To do what you need, set the app name and module version in appengine-web.xml, then put the module name in the target tag of cron.xml.
An App Engine Java app must have a file named appengine-web.xml in its WAR, in the directory WEB-INF. This is an XML file whose root element is <appengine-web-app>. A minimal file that specifies the application ID, a version identifier, and no static files or resource files looks like this:
<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
<application>_your_app_id_</application>
<version>1</version>
<threadsafe>true</threadsafe>
</appengine-web-app>
A cron.xml file in the WEB-INF directory of your application (alongside appengine-web.xml) controls cron for your Java application. The following is an example cron.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
<cron>
<url>/weeklyreport</url>
<description>Mail out a weekly report</description>
<schedule>every monday 08:30</schedule>
<timezone>America/New_York</timezone>
<target>version-2</target>
</cron>
</cronentries>
Hope it helps
You can achieve this by specifying the target in the format "version-dot-module".
This worked for me:
<cron>
<url>/cron/test-cron</url>
<description>test-cron</description>
<schedule>every 1 minutes synchronized</schedule>
<target>v-5-1-dot-my-module-name</target>
</cron>
v-5-1 - version of my module
my-module-name - name of my module
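The target value in that answer is simply the version and the module name joined by the literal string -dot-. A small sketch for generating it in a deploy script; cron_target is a hypothetical helper, and the values come from the example above:

```shell
#!/bin/sh
# Sketch: build a cron <target> value of the form "version-dot-module".
# cron_target is a hypothetical helper name used only for illustration.
cron_target() {
    version=$1   # version of the module, e.g. v-5-1
    module=$2    # module name, e.g. my-module-name
    printf '%s-dot-%s\n' "$version" "$module"
}
```

Calling cron_target v-5-1 my-module-name yields the string used inside the <target> element above.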

Tomcat 6.0.35 - how to bypass new CSRF protection on the manager application?

I have a command line script (actually a git post-checkout hook) that reloads my Solr application by doing a cURL to:
http://localhost:8080/manager/html/reload?path=/solr
Since I upgraded to Ubuntu 13.04, it now fails, where it used to work before the upgrade.
The cause is that my newer version of Tomcat (6.0.35) has new CSRF protection and now returns 403 Access Denied.
How can I solve the issue and bypass the CSRF protection?
More info:
My /etc/tomcat6/tomcat-users.xml file:
<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
<role rolename="manager"/>
<user username="tomcat" password="secret" roles="manager"/>
</tomcat-users>
The documentation for Configuring Manager Application Access in Tomcat mentions some new manager roles; however, the error I get specifically mentions that the single "manager" role still exists for the moment (and I tried the other roles anyway, without success).
(As I was writing the question, I found the answer.) Instead of cURLing to the HTML application, I needed to cURL to the "plain text interface".
i.e. instead of
http://localhost:8080/manager/html/reload?path=/solr
Use:
http://localhost:8080/manager/reload?path=/solr
It turns out:
The HTML interface is protected against CSRF but the text and JMX interfaces are not.
This fits with the new role called "manager-script". To ensure my app will work in the future I changed my /etc/tomcat6/tomcat-users.xml file:
<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
<role rolename="manager-gui"/>
<role rolename="manager-script"/>
<user username="tomcat" password="secret" roles="manager-gui,manager-script"/>
</tomcat-users>
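A hook script along these lines could then call the plain text interface. This is a sketch only: the host, port, and credentials are the ones shown in this post, and the function names manager_reload_url and reload_app are hypothetical:

```shell
#!/bin/sh
# Sketch: reload a webapp through the Tomcat 6 manager's plain text interface.
# Both function names are hypothetical helpers, not part of Tomcat.
manager_reload_url() {
    host=$1
    port=$2
    app_path=$3
    # Text interface: /manager/reload (no /html/ segment in the path).
    printf 'http://%s:%s/manager/reload?path=%s\n' "$host" "$port" "$app_path"
}

reload_app() {
    user=$1
    password=$2
    app_path=$3
    # --fail makes curl exit non-zero on HTTP errors such as 403.
    curl --silent --show-error --fail --user "$user:$password" \
        "$(manager_reload_url localhost 8080 "$app_path")"
}
```

From the post-checkout hook this would be invoked as reload_app tomcat secret /solr. On Tomcat 7 and later the text interface lives under /manager/text/ instead, so the URL would need adjusting there.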

I'm following the Nutch tutorial, and getting a "No URLs to fetch" error

I'm following the Apache Nutch tutorial.
As indicated in the tutorial, I've set the last line of my regex-urlfilter.txt to:
+^http://([a-z0-9]*\.)*nutch.apache.org/
My nutch-site.xml file contains only the lines
<property>
<name>http.agent.name</name>
<value>My Nutch Spider</value>
</property>
And my seed.txt file is:
http://nutch.apache.org/
However, when I crawl with
bin/nutch crawl urls -dir crawl -depth 3 -topN 5
I get a "No URLs to fetch" error. Anyone know why?
The configuration looks fine to me. You made these changes in the runtime/local folder, right?
seed.txt should be in the NUTCH_HOME/runtime/local/urls folder, and
regex-urlfilter.txt and nutch-site.xml should be in the NUTCH_HOME/runtime/local/conf folder,
where NUTCH_HOME is the installation directory.
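A quick way to confirm the files are where the answer expects them is a small check script. This is a sketch that assumes the directory layout named above; check_nutch_files is a hypothetical helper, not a Nutch command:

```shell
#!/bin/sh
# Sketch: verify that the tutorial files live under NUTCH_HOME/runtime/local.
# check_nutch_files is a hypothetical helper name used only for illustration.
check_nutch_files() {
    nutch_home=$1
    local_dir="$nutch_home/runtime/local"
    status=0
    for f in urls/seed.txt conf/regex-urlfilter.txt conf/nutch-site.xml; do
        if [ -f "$local_dir/$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
            status=1
        fi
    done
    # Non-zero when at least one expected file is absent.
    return $status
}
```

Running check_nutch_files with the installation directory prints one line per file and returns a non-zero status if any of the three is missing from runtime/local.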

Getting "missing core name in path" when trying to access Solr admin installed on Glassfish

I've installed Solr 3.1 on Glassfish, and that part went smoothly: when I visit <host>:<port>/solr, I get the "Welcome to Solr!" page, along with the "Solr Admin" link.
The problems start when I try to open the admin panel: I get "HTTP Status 404 - missing core name in path". I have no clue why that is happening. Previously, I tested the default Solr example (single core) on localhost, but using the Jetty shipped with the Solr release in the form of start.jar.
I've set the system property solr.solr.home to point to the folder where solr.xml and the conf folder are located, and here's the content of that solr.xml:
<solr persistent="false">
<cores adminPath="/admin/cores" defaultCoreName="collection1">
<core name="collection1" instanceDir="." />
</cores>
</solr>
As you can see, it's just a simple single-core setup.
Any idea?
Thanks in advance
<solr persistent="false">
<cores adminPath="/admin/cores" defaultCoreName="collection1">
<core name="collection1" instanceDir="collection1" />
</cores>
</solr>
and a directory structure of:
collection1 (containing the conf and data directories)
solr.xml
is the proper way to do it.
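The layout described in this answer can be sketched as a short script. It only creates the skeleton; an existing conf folder would still have to be moved into collection1/conf by hand. The helper name make_single_core_home is hypothetical:

```shell
#!/bin/sh
# Sketch: create a Solr 3.x home with a single core in its own directory.
# make_single_core_home is a hypothetical helper name.
make_single_core_home() {
    solr_home=$1
    # The core gets its own instance directory holding conf and data.
    mkdir -p "$solr_home/collection1/conf" "$solr_home/collection1/data"
    # solr.xml sits next to the core directory and points at it.
    cat > "$solr_home/solr.xml" <<'EOF'
<solr persistent="false">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1" />
  </cores>
</solr>
EOF
}
```

The directory passed to make_single_core_home is the one that the solr.solr.home system property should point to.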
