Data Migration within Apache Solr

Good afternoon!
I would like to know, please, how I can do data migration in Apache Solr. Is that possible?
For example:
I am currently working with a collection holding a large amount of data, and I want to do the following:
Move my data from "COLLECTION1" to "COLLECTION2". Is that possible?
Thank you so much, guys!
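One common approach, assuming every field in COLLECTION1 is stored (only stored values can be read back out of the index), is to query the source collection and re-add each document to the target with a client such as SolrJ. Below is a minimal sketch; the URLs are placeholders and the uniqueKey is assumed to be "id". For very large indexes it is usually better to reindex from the original data source instead.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.params.CursorMarkParams;

public class CopyCollection {
    public static void main(String[] args) throws Exception {
        // SolrJ 6.x+ builder-style clients; adjust URLs to your own setup
        try (SolrClient source = new HttpSolrClient.Builder("http://localhost:8983/solr/COLLECTION1").build();
             SolrClient target = new HttpSolrClient.Builder("http://localhost:8983/solr/COLLECTION2").build()) {

            SolrQuery query = new SolrQuery("*:*");
            query.setRows(500);
            query.setSort("id", SolrQuery.ORDER.asc);   // cursor paging requires a sort on the uniqueKey

            String cursor = CursorMarkParams.CURSOR_MARK_START;
            boolean done = false;
            while (!done) {
                query.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
                QueryResponse rsp = source.query(query);

                for (SolrDocument doc : rsp.getResults()) {
                    SolrInputDocument copy = new SolrInputDocument();
                    for (String field : doc.getFieldNames()) {
                        if ("_version_".equals(field)) continue;  // skip Solr's internal version field
                        copy.addField(field, doc.getFieldValue(field));
                    }
                    target.add(copy);
                }

                String next = rsp.getNextCursorMark();
                done = cursor.equals(next);   // no more pages once the cursor stops changing
                cursor = next;
            }
            target.commit();
        }
    }
}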

Related

How to make a search engine with Nutch and Cassandra?

I am trying to implement a website search engine in Java as an applet. I have used Nutch as the web crawler and Cassandra as my database (I have to use a NoSQL database because my teacher requires it). My question is: what should I do next to complete my search engine?
I have googled a lot, but most sites are about Nutch and Solr and build search engines by integrating those two, since Solr is itself somewhat of a database. I don't know what I should do. Do I have to use Solr as well to complete my search engine? Is it wise to use two databases (Solr and Cassandra), or should I do something else?
Please remember that I have to use Cassandra.
And please first explain to me if I have understood things the wrong way before giving me a minus mark. :D
I will be really thankful for your help; I have gotten somewhat confused.
By the way, does Solr count as a NoSQL database? Excuse me, I am new to all of this.
Check out Solr's Data Import Handler and see if you feel it would work. It allows you to query your database and store the results in Solr, which Solr can then manipulate. Nutch also has very good integration with Solr, should you choose to use it.

How WCS's Solr can index Magnolia's content

I'm trying to find a way to index Magnolia's content with a Solr instance from another CMS.
(It's not really a CMS, but it's similar. I'm talking about WebSphere Commerce 7 FEP5.)
I have a Solr engine, plugged into a DB2 database, and already configured to index the content of that database.
Now I have to plug in a Magnolia CMS (with its own database). One of the requirements is to be able to display search results (provided by my Solr), and these search results must include Magnolia's content.
Does anybody have any idea how to do that?
Using a second Solr instance (the one inside Magnolia) is not really acceptable.
Any ideas will be appreciated.
Regards,
Dekx.
It sounds like you should be able to do what you want using the WCS unstructured content crawler.
Start here:
http://pic.dhe.ibm.com/infocenter/wchelp/v7r0m0/topic/com.ibm.commerce.developer.doc/concepts/csdmanagesearchsitecontent.htm

How to index data from a database using Apache Solr with GlassFish server on Linux?

I want to create a search box in my web app using Apache Lucene and Apache Solr. I am using a Postgres database and have to do it with Java.
As I am new to these concepts (Solr, Lucene), I am struggling with this. I have already installed and configured Apache Solr with GlassFish. Now I don't know how to start: whether I have to create a Java project in Eclipse, or whether I have to use the Solr admin GUI.
Can anyone help me with this?
Thanks in advance.
In order to make data searchable, you have to first index your data. You can use one of the following ways to index data:
By using Solr clients such as SolrJ
If you store your data in a relational DB, you can use the DataImportHandler
By posting XML or JSON messages. Check here for documentation.
When new data is added, you can index it using Solr clients (SolrJ). You can also search your data using SolrJ or any other client library.
You can find other client libraries here.
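For the SolrJ route, here is a minimal sketch of reading rows from Postgres over JDBC and pushing them into Solr; the JDBC URL, credentials, table and column names, and the core name are all hypothetical placeholders, and the field names must match your schema.xml.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class IndexFromPostgres {
    public static void main(String[] args) throws Exception {
        // Requires the Postgres JDBC driver and solr-solrj on the classpath
        try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();
             Connection db = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/mydb", "user", "password");
             Statement stmt = db.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, title, description FROM products")) {

            while (rs.next()) {
                SolrInputDocument doc = new SolrInputDocument();
                // Field names must match fields declared in your schema.xml
                doc.addField("id", rs.getString("id"));
                doc.addField("title", rs.getString("title"));
                doc.addField("description", rs.getString("description"));
                solr.add(doc);
            }
            solr.commit();   // make the newly added documents searchable
        }
    }
}

Searching then works through the same SolrClient via its query() method.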
You can start with the Solr DIH (DataImportHandler) to index the data from Postgres into Solr.
For a more detailed understanding, you can refer to:
how-to-import-data-from-sql-databases-part-1
how-to-import-data-from-sql-databases-part-2
how-to-import-data-from-sql-databases-part-3

Solr's schema and how it works

Hey, so I started researching Solr and have a couple of questions about how Solr works. I know the schema defines what is stored and indexed in the Solr application. But I'm confused as to how Solr knows that the "content" is the content of the site, or that the URL is the URL.
My main goal is to extract phone numbers from websites and have Solr nicely spit out 1234567890.
You need to define the fields in Solr's schema.xml, declaring each field and its field type. You can then query Solr on any of those fields.
Refer to this: http://wiki.apache.org/solr/SchemaXml
Solr will not automatically index content from a website; you need to tell it how to index your content. Solr only knows the content you tell it about. Extracting phone numbers sounds pretty simple, so writing an update script or finding one online should not be an issue. Good luck!
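As a rough illustration of that extraction step (the regular expression and the normalized output are assumptions; adapt them to the phone formats you actually encounter), the digits could be pulled out and cleaned up like this before being written into a dedicated Solr field:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhoneExtractor {
    // Matches loosely formatted 10-digit numbers, e.g. (123) 456-7890 or 123.456.7890
    private static final Pattern PHONE =
            Pattern.compile("\\(?\\d{3}\\)?[\\s.-]?\\d{3}[\\s.-]?\\d{4}");

    /** Returns the numbers found in the page text, stripped down to bare digits like 1234567890. */
    public static List<String> extract(String pageText) {
        List<String> result = new ArrayList<>();
        Matcher m = PHONE.matcher(pageText);
        while (m.find()) {
            result.add(m.group().replaceAll("\\D", ""));  // keep digits only
        }
        return result;
    }

    public static void main(String[] args) {
        // Each extracted value would then go into a dedicated field
        // (e.g. a multiValued "phone" string field declared in schema.xml).
        System.out.println(extract("Call us at (123) 456-7890 or 123.456.7899."));
    }
}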

Index my own data in Solr

I am new to Solr and have a couple of questions for more experienced people:
I am able to get the example running; however, what exactly is start.jar?
I know that by running "java -jar start.jar" I can start Solr. But do I run this command after I index my own data, rather than the given sample data? If not, what should I do to run my own Solr instance with my own indexed data?
I need to index my own sample data, not related to the given Solr example at all. How exactly should I do it? Should I copy the example directory and then modify the fields in schema.xml? Should I then run post.sh accordingly to index the data, like I did to set up the example Solr?
Thanks a lot for your help!
Steps:
Decide what the document structure you store in Solr will be (somewhat like creating the schema of a relational DB for one table).
Remove the example core and create your own core with that schema.
Once the schema works with no errors (check the server logs of the host running the Solr app), you can start feeding your data into Solr. You POST it via HTTP in a specific structure, which is documented in the Solr wiki; various frameworks have classes to handle that.
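For the HTTP POST step, here is a minimal sketch using the JDK 11+ built-in HTTP client and Solr's JSON update format; the core name and field names are placeholders and must match your schema.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PostToSolr {
    public static void main(String[] args) throws Exception {
        // One document in Solr's JSON update format; commit=true makes it searchable immediately
        String json = "[{\"id\": \"doc1\", \"title\": \"My first document\"}]";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8983/solr/mycore/update?commit=true"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}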
Marked as Wiki as this is too broad an answer for someone who did not bother to RTFM...
Custom indexing is not a difficult task; I worked on it just a few days ago. First you need to write your document in XML, CSV, or JSON (the formats supported by Solr), containing fields according to your schema.xml, then run the following command in example/exampledocs.
For a document mydoc.xml:
./post.sh mydoc.xml
If the status value in the output is 0, indexing was successful and you can search your document in Solr.
Reference: http://www.solrtutorial.com/solr-in-5-minutes.html
Though the question is old, I am writing this for new visitors with the same issue. The question can't be answered in a few words. You must understand what Solr is, what the Solr Admin UI is, and why we need Solr instead of a relational database. Then you can understand how to import sample data. I have recently published two articles, i.e. Solr Introduction and Importing Sample Data; these might be helpful for you.
http://www.devtrainings.com/2017/03/apache-solr-introduction-and-server.html
http://www.devtrainings.com/2017/03/apache-solr-index-data-and-run-search.html
