I have a MySQL database for my application. I implemented Solr search and used the DataImportHandler (DIH) to index data from the database into Solr. My question is: if the database gets updated, is there any way for my Solr index to be automatically updated with the new data added to the database? In other words, I don't want to run the indexing process manually every time the database tables change. If yes, please tell me how I can achieve this.
I don't think Solr has a built-in way to index data whenever an update happens in the DB.
But there are possibilities, for example with the help of triggers - it is possible to run an external application from a trigger.
Write a cron job that runs a PHP script which reads from the DB and indexes the data in Solr (a minimal cron sketch follows below the link). Then write a trigger for CRUD operations, so whenever something happens in the DB, the trigger calls that script and the indexing takes place.
Please see:
Invoking a PHP script from a MySQL trigger
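For the cron half, a minimal sketch of a crontab entry (the script path /var/www/scripts/index_to_solr.php is hypothetical - point it at whatever PHP script does your DB read and Solr indexing):

# run the indexing script every 10 minutes and log its output
*/10 * * * * /usr/bin/php /var/www/scripts/index_to_solr.php >> /var/log/solr_index.log 2>&1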
Automatic Scheduling:
Please see this post, How can I Schedule data imports in Solr, for more information on scheduling. The second answer explains how to import using cron.
Since you used a DataImportHandler to initially load your data into Solr, you could create a Delta Import Handler that is executed using curl from a cron job to periodically add database changes to the index (a sketch follows below). Also, if you need more real-time updates, as @Rakesh suggested, you could use a trigger in your database and have that kick off the curl call to the Delta DIH.
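A hedged sketch of that cron job (assumes Solr runs on localhost:8983, the core is named collection1, and your DIH config defines a deltaQuery):

# run a DIH delta-import every 5 minutes, committing the changes
*/5 * * * * curl -s "http://localhost:8983/solr/collection1/dataimport?command=delta-import&commit=true"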
You can import the data using your browser and the Task Scheduler.
Do the following steps on a Windows server:
Go to Administrative Tools => Task Scheduler
Click "Create Task"
A Create Task screen will open with the tabs General, Triggers, Actions, Conditions, and Settings.
In the General tab, enter the task name "SolrDataImport" and in the description enter "Import MySQL data".
Now go to the Triggers tab, click New, and under Settings check Daily. Under Advanced settings, check "Repeat task every ..." and set whatever interval you want. Click OK.
Now go to the Actions tab and click New. Under Settings, set Program/Script to "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" (the installation path of the Chrome browser). In Add arguments, enter http://localhost:8983/solr/#/collection1/dataimport//dataimport?command=full-import&clean=true and click OK.
With all of the above in place, the data import will run automatically. To stop the import process, follow the same steps but put "taskkill" in Program/Script instead of the Chrome path, and under the Actions tab enter "/f /im chrome.exe" in the arguments.
Set the trigger timing according to your requirements. If you'd rather not drive a browser at all, a command-line alternative is sketched below.
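For reference, a hedged sketch of creating the same schedule from the command line instead of the GUI (assumes curl is available on the server and the core is named collection1; the task name and start time are arbitrary):

schtasks /Create /TN "SolrDataImport" /SC DAILY /ST 02:00 /TR "curl \"http://localhost:8983/solr/collection1/dataimport?command=full-import&clean=true\""

Hitting the dataimport handler directly also avoids relying on the admin UI page to fire the request, and leaves no browser process to kill afterwards.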
What you're looking for is a "delta-import", and a lot of the other posts have that covered. I created a Windows WPF application and service to issue commands to Solr on a recurring schedule, since cron jobs and the Task Scheduler are a bit difficult to maintain if you have a lot of cores/environments.
https://github.com/systemidx/SolrScheduler
You basically just drop a JSON file into a specified folder and it will use a REST client to issue the commands to Solr.
I have code that uses Entity Framework to process data (it retrieves data from multiple tables, then performs operations on it before saving to a SQL database). The code was supposed to run when a button is clicked in an MVC web application that I created. But now the client wants the data processing to run automatically every day at a set time (like an SSIS package). How do I go about this?
But now the client wants the data processing to run automatically every day at a set time (like an SSIS package). How do I go about this?
In addition to adding a job scheduler to your MVC application as @Pac0 suggests, here are a couple of other options:
Leave the code in the MVC project and create an API endpoint that you can invoke on some sort of schedule. Give the client a PowerShell script that calls the API (a sketch follows after these options) and let them take it from there.
Or
Refactor the code into a .DLL or copy/paste it into a console application that can be run on a schedule using the Windows Scheduler, SQL Agent or some other external scheduler.
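For the first option, a minimal sketch of the PowerShell caller (the endpoint /api/jobs/run-daily-processing and the host name are hypothetical - use whatever route you expose):

# kick off the nightly data processing via the MVC API endpoint
Invoke-RestMethod -Uri "https://yourapp.example.com/api/jobs/run-daily-processing" -Method Post

The client can then schedule this one-liner with the Windows Task Scheduler.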
You could use a tool/library that does this for you. I can recommend Hangfire; it works fine (there are others, but I have not tried them).
The example on their homepage is pretty clear:
RecurringJob.AddOrUpdate(
    () => Console.WriteLine("Recurring!"),
    Cron.Daily);
The above code needs to be executed once when your application has started up, and you're good to go. Just replace the lambda with a call to your method.
Adapt the time parameter to whatever you wish, or even better, make it configurable, because we know customers like to change their minds.
Hangfire needs to create its own database tables, which will usually stay pretty small for this kind of thing. You can also monitor whether the jobs ran well or not, and check some useful stats on the Hangfire dashboard.
I have multiple databases in my local environment which I do not need. Can I run a curl script or a REST API command to delete a database, its servers, and all of its forests, so that I can then use Gradle to just deploy them again?
I have tried manually deleting the server first, then the database, and then the forests. This is a lengthy process.
I want a single command to do the whole job for me, instead of manually deleting the components one by one through the admin interface.
Wagner Michael has a fair point in his comment. If you already used (ml-)Gradle to create your servers and databases, why not use its mlUndeploy -Pconfirm=true task to get rid of them? You could potentially even use a fake project with stub configs to get rid of a fairly random set of databases and servers, though that still takes some manual work.
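For reference, the undeploy call looks like this when run from the project that deployed the resources (the -Pconfirm flag guards against accidental teardown):

# tear down the app servers, databases, and forests this project deployed
gradle mlUndeploy -Pconfirm=true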
By far the quickest way to reset your entire MarkLogic instance is to stop it and wipe its data directory. This SO question gives instructions on how to do that, as part of a solution for recovering when you have lost your admin password:
https://stackoverflow.com/a/27803923/918496
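In shell terms, the reset boils down to something like the following (a hedged sketch: /var/opt/MarkLogic is the typical data directory on Linux, so adjust for your install; this erases all databases, forests, and configuration, so only use it when you really want a factory-fresh instance):

# stop MarkLogic, wipe its data directory, and start it fresh
sudo service MarkLogic stop
sudo rm -rf /var/opt/MarkLogic/*
sudo service MarkLogic start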
HTH!
I want to test a portion of my website to see if it is running, by executing a SQL Server Agent job. My site logs every time someone loads the login page. What I would like to do is launch:
https://www.example.com/Main/main_dir.wp1
and after a few seconds run:
SELECT * FROM dbo.TR_Weblog where DATEDIFF(MINUTE, date_time, getdate()) < 1
If there are no entries, the site is down.
How do I launch a URL from inside a SQL Agent job?
IMO, this isn't an appropriate use of SQL Agent; it's not a general purpose task scheduler.
If you're going to use Agent though...
I would advise against doing it the way @TheGameiswar suggests, as it will leave orphaned iexplore.exe processes on your SQL Server box, and there are situations where it won't even start properly, causing the job step to stall out.
Instead, make your first step one of type PowerShell, and run the following command from it:
Invoke-RestMethod -Uri YOURURLHERE
However, this will not parse/execute any JavaScript on the page, nor load any images. It'll just pull the raw HTML returned by the page when loaded.
But even this is a bit of a Rube Goldberg method of monitoring your website's availability when there are purpose-built applications/tools and services to do exactly that.
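If you do stay inside Agent, a hedged sketch of a PowerShell job step that fails outright when the page does not respond (the URL is the one from the question; Agent's job-failure notification can then do the alerting):

# request the login page; a timeout or HTTP error throws, failing the job step
try {
    Invoke-RestMethod -Uri "https://www.example.com/Main/main_dir.wp1" -TimeoutSec 30 | Out-Null
} catch {
    throw "Site appears to be down: $_"
}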
You can just select the step type Operating system (CmdExec) and then use the command below:
START http://bing.com/
Further, you don't have any control after the launch, so I think the best way is to do a periodic check of the IIS logs using Log Parser and see the status there.
I am trying to understand how long ClearCase operations take after performing an add-to-source-control operation.
If I am working through a CCRC snapshot view and I add a file to source control, how long will it take for the change set to be updated with the new file, and how long after the operation completes will the new file be available under a dynamic view pointing to the stream that the file was checked in to?
Is there any way to speed up that process by invoking a manual update of the dynamic view or something?
Regards,
Andrew
how long will it take for the change set to be updated with the new file
As soon as you check out a file, selecting an activity, the change set of said activity is updated immediately.
A dynamic view would reflect that file only after you check it in (through your web snapshot view in CCRC), and that update would also be near-instantaneous.
To speed up the process, you can refresh the dynamic view, or do a cleartool ls in the directory you want to see updated.
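A hedged sketch of that refresh, run from inside the dynamic view (the view and VOB paths are hypothetical):

# re-apply the current config spec, forcing the view to re-evaluate what it selects
cleartool setcs -current
# list the directory so the view server picks up the newly added element
cleartool ls /view/myview/vobs/myvob/src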
In each case, when you do a checkout or a checkin through CCRC, you are posting an HTTP request to the CCRC server, which in turn completes the operation with the ClearCase VOB/view server.
So once the checkout/checkin is completed, any other ClearCase view (CCRC or not) is ready to reflect the changes.
The only part which takes time is the communication between the CCRC client and the CCRC server. Since that server is usually on the same LAN as the ClearCase server, the ClearCase command itself executes fairly quickly.
"Fairly quickly" turned out to be too slow for the OP's need: a post-op trigger on checkin.
That trigger uses a ClearCase dynamic view on the server side, and had to introduce a sleep on the element checkin (on mkelem) in order for the second call of that trigger (on the parent directory being checked in) to properly detect the newly created file.
Theoretically, it should be instant: as soon as the add finishes, the dynamic view should see the new file. In reality, it might take longer due to the nature of ClearCase and its view processes.
Every view has a process running on the view server (local or remote), and this process needs to query the VOB server to get the changes.
In our ClearCase environment, we see many lags that are probably the combination of a loaded server and network traffic.
Bottom line - should be quick (seconds), but not instant. If it takes longer, you should try and see what might be slowing the processes down.
As the subject suggests, I'm interested in triggering Jenkins on changes involving a pre-configured database table. For example, whenever the number of records changes, I want Jenkins to perform some particular action. Is there an out-of-the-box plugin available for this scenario?
Thank you!
Regards,
Alex
Either you have a command-line client for your database, or you can write a script (Perl, Ruby, Groovy, Java, whatever) to get this functionality. This script can be executed by Jenkins. Since there is no information about which database we are talking about, I can't give you a more detailed hint.
What database are you using?
Most of them have some kind of triggers that can be fired after a table insert, update, or delete.
A logical alternative to database triggers is polling: write a script that polls the database and stores the result you are watching. If it changes, the script can modify a file, which will trigger a Jenkins build via the FSTrigger plugin (a sketch follows below).
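A hedged sketch of such a polling script (the database name, table, and file paths are hypothetical, and MySQL credentials are assumed to come from ~/.my.cnf; schedule it with cron and point FSTrigger at the marker file):

#!/bin/sh
# poll the row count; rewrite the marker file only when it changes,
# so the Jenkins FSTrigger plugin fires a build on a real change
COUNT=$(mysql -N -e "SELECT COUNT(*) FROM mydb.watched_table")
LAST=$(cat /var/lib/jenkins/watch/row_count 2>/dev/null)
if [ "$COUNT" != "$LAST" ]; then
    echo "$COUNT" > /var/lib/jenkins/watch/row_count
fi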
Probably the easiest way is to use ScriptTrigger, which can use an embedded Groovy or shell/Windows batch script to poll the database with a query that verifies the state of the given data.