Apatar for feeding data into Solr

I need to fetch data from a normalized MSSQL database and feed it into a Solr index.
I was wondering whether Apatar can be used to perform this job. I've gone through its documentation, but couldn't find the information I'm looking for. It states that it can fetch data from SQL Server and post it over HTTP, but I'm still not sure whether it can post the fetched data as XML over HTTP.
Any advice would be highly valued. Thank you.

I am not familiar with Apatar, but seeing as it is a Java application, it may be a bit challenging to run in a Windows environment. However, for various scenarios where I needed to fetch data from an MSSQL database and feed it to Solr, I have written custom C# code leveraging the SolrNet client. This tends to be pretty straightforward, simple code, and in cases where we need to load data at specified intervals, we use scheduled tasks that call a console application. I would recommend checking out the Create/Update section of the SolrNet site for examples of loading and updating data with the .NET client.
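Whichever tool you use, the operation you're asking about boils down to an HTTP POST of an XML `<add>` document to Solr's update handler. Here is a minimal sketch of that request; the core name, field names, and the TypeScript/fetch client are illustrative assumptions, not tied to Apatar or SolrNet:

```typescript
// Minimal sketch: posting an XML <add> document to a Solr update handler.
// The core name ("products") and field names are illustrative assumptions.
async function postToSolr(): Promise<void> {
  const xml = `
    <add>
      <doc>
        <field name="id">1</field>
        <field name="name">Example product</field>
      </doc>
    </add>`;

  // Solr accepts XML updates at /update; commit=true makes the
  // document searchable immediately.
  const res = await fetch("http://localhost:8983/solr/products/update?commit=true", {
    method: "POST",
    headers: { "Content-Type": "text/xml" },
    body: xml,
  });
  if (!res.ok) throw new Error(`Solr update failed: ${res.status}`);
}
```

So as long as a tool can issue an HTTP POST with an XML body, it can feed a Solr index; the question is only whether Apatar exposes that combination.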

Related

Restricting data in PouchDB

I have an offline-ready application that I am currently building in Electron.
The core requirements are that all data is restricted (you have to be a user to read or write) and that within that data, some data is further restricted to a particular user (account information, messages, etc.).
I do not want to replicate any data offline that a user should not have access to, because all the data can be seen using the devtools regardless of restriction. Essentially, I only want to sync data to PouchDB's offline store if that user has access to it, along with all the data that every user has access to.
Now I have read the following posts/guides but I am still a little confused.
https://pouchdb.com/2015/04/05/filtered-replication.html
https://www.joshmorony.com/creating-a-multiple-user-app-with-pouchdb-couchdb/
Restricting Access to local PouchDB
From my understanding, filtered replication is a bad choice performance-wise, even though it could do what I want.
Setting up a proxy would work, but it then essentially becomes a REST API and the data synchronization falls apart.
The final option, which I think is what I want, is to have a database for every user containing their private information, plus additional databases holding the information that is available to every user.
The only real question I have with this approach is how to handle data that is private but shared between two users (messages, etc.).
I am more after an overarching view of how the data should be stored than code examples; I'm really struggling with the conceptual architecture of the application.
There are many solutions to your problem. One looks very promising: IBM Cloudant has started work on Cloudant Envoy, a proxy that simulates the CouchDB interface rather than a simple REST API. You can read more about it on the Envoy page at ibm.com. A custom replicator for PouchDB is also available on GitHub.
There's also a blog post about this on Medium.com.
The idea is the same as the much older Couchbase Sync Gateway. Although Couchbase has common roots with CouchDB, I have not tracked whether they still support replication with CouchDB.
The easiest way to start would be to create a single database per user on the server, and a common database that you just pull the shared data from. Let me know if you need more info on this solution.
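To make that layout concrete, here is a minimal sketch using the PouchDB client; the server URL, the userdb- naming scheme, and the database names are illustrative assumptions. Each client two-way syncs its private database and pull-replicates the shared one:

```typescript
import PouchDB from "pouchdb";

// Illustrative names: the server URL, "userdb-" prefix, and database
// names are assumptions, not a fixed convention.
const userId = "alice";                      // the authenticated user
const remote = "https://couch.example.com";  // hypothetical CouchDB server

// Private data: two-way, continuous sync with the user's own database.
const privateLocal = new PouchDB("private");
const privateRemote = new PouchDB(`${remote}/userdb-${userId}`);
privateLocal.sync(privateRemote, { live: true, retry: true });

// Shared data: pull-only replication, since regular users should not
// write to the common database.
const sharedLocal = new PouchDB("shared");
const sharedRemote = new PouchDB(`${remote}/shared`);
sharedLocal.replicate.from(sharedRemote, { live: true, retry: true });
```

For data that is private but shared between exactly two users (messages), the same pattern extends naturally: one database per pair of participants, which both sides sync.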

Database with HTTP API out of the box

I am looking for a database with HTTP REST API out of the box. I want to skip the middle tier between client and database.
One option I found is an HTTP plugin for MySQL that works with JSON:
http://blog.ulf-wendel.de/2014/mysql-5-7-http-plugin-mysql/
Can someone suggest other similar solutions? I want to save development time and effort for some queries.
You really should have a middle layer to sanitize input and prevent unwanted calls deleting or changing your data, IMO.
Since you claim to just be testing, though, the technologies I know off the top of my head that provide REST out of the box are mostly NoSQL. You mention MySQL with that JSON thing, but I imagine that just goes through a JDBC/ODBC layer.
So what I know is:
Solr/Elasticsearch - while not strictly databases, they are useful for quickly searchable semi-structured data (see the sketch after this list)
Couchbase - a distributed document and key value store for JSON documents
Neo4j - Graph database
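
As an illustration of skipping the middle tier (the input-sanitization caveat above still applies), Elasticsearch answers queries directly over HTTP. A minimal sketch, assuming a local node and a hypothetical articles index:

```typescript
// Minimal sketch: querying Elasticsearch's REST API directly.
// The index name ("articles") and field ("title") are assumptions.
async function search(term: string): Promise<unknown> {
  const res = await fetch("http://localhost:9200/articles/_search", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: { match: { title: term } } }),
  });
  return res.json(); // hits.hits contains the matching documents
}
```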

Use REST API with Neo4j?

Over the last couple of months I've been building up a Neo4j database. I'm finding Neo4j & Cypher really easy to use and definitely appropriate for the kind of data that I'm working with.
I'm hoping there's someone out there who can offer a few pointers on how to get started with the REST API. I don't have any experience coding in Java and I'm finding the Neo4j documentation a little tricky to follow. From what I understand, it should be possible to send a REST request via a straightforward HTTP URL (such as http://localhost:7474/db/data/relationship/types), which would retrieve some data as JSON.
My end goal is some form of very high-level dashboard to summarise the current status of my database, showing the results of a few high-level Cypher queries like this one:
match (n) return distinct(n.team), count(n)
Any advice you can offer would be greatly appreciated.
You would be better off using the HTTP transactional endpoint, where you can send Cypher statements like the one in your question.
The default endpoint is http://yourserverurl:7474/db/data/transaction/commit
The Neo4j documentation for using it from Java:
http://neo4j.com/docs/stable/server-java-rest-client-example.html#_sending_cypher
Using the transactional endpoint has the benefit that you can send multiple statements in one transaction, which will be committed or rolled back as a unit.
The REST API is like any other HTTP API; the only guidelines to follow are the request body contents and Cypher query parameters, which are well explained in the Neo4j documentation: http://neo4j.com/docs/stable/rest-api.html
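For example, the dashboard query from your question can be sent to the transactional endpoint as a JSON payload from any HTTP client, no Java required. A minimal sketch (the server URL is an assumption, and auth is assumed disabled; add a Basic Authorization header if it isn't):

```typescript
// Minimal sketch: running the question's Cypher query against the
// transactional endpoint. Server URL and disabled auth are assumptions.
async function teamCounts(): Promise<unknown> {
  const res = await fetch("http://localhost:7474/db/data/transaction/commit", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json",
      // Authorization: "Basic " + btoa("neo4j:password"), // if auth is enabled
    },
    body: JSON.stringify({
      statements: [
        { statement: "MATCH (n) RETURN DISTINCT n.team, count(n)" },
      ],
    }),
  });
  return res.json(); // shaped as { results: [...], errors: [...] }
}
```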

Which means of accessing the SFDC API will return data the quickest?

We are using the DevArt connector, which pretends to be an ADO.NET connector to SFDC. It is super slow (13 minutes for some queries). What approach will return data the quickest?
And by any chance, is there an OData API to SFDC that is fast?
There are a few APIs you can use:
The SOAP API
CRUD operations and query (SOQL) support, plus some metadata support. There are Enterprise and Partner variations. It can be added as a Web Service reference in Visual Studio.
The REST API
"Typically, the REST API operates on smaller numbers of records. You
can GET a single record using its URL and you can also run a query and
bring back a set of records that match that query." Salesforce APIs – What They Are & When to Use Them
The Bulk API
REST-initiated batch processes that output XML or CSV data.
The Metadata API
Probably not applicable unless you are doing configuration or deployment style tasks
The Apex API
Again, not applicable unless you are working with Apex classes and running test cases.
The Streaming API
Allows you to register a query and get updates pushed to you when the query result changes.
They all have their advantages and disadvantages. There is a good summary in the Bulk API introduction.
At a guess I'd assume the DevArt connector is based on the SOAP API. The SOAP API can be fast, but it isn't an ideal way to bring back a very large number of records as the results are paged and the SOAP responses can be large. Other factors can also slow it down unnecessarily, such as querying fields that are never used.
The ADO.NET connector must be doing some interpretation of queries into SOQL. There may be joins that are inefficient when translated into SOQL.
I suspect the best solution will depend on which records and fields you are trying to query and how many results you are expecting to work with.
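As a point of comparison for the connector, a SOQL query over the REST API is a single authenticated GET. A minimal sketch; the instance URL, API version, and how you obtain the access token are all assumptions:

```typescript
// Minimal sketch: running a SOQL query through the Salesforce REST API.
// Instance URL, API version, and token acquisition are assumptions.
async function querySalesforce(accessToken: string): Promise<unknown> {
  const soql = encodeURIComponent("SELECT Id, Name FROM Account LIMIT 10");
  const res = await fetch(
    `https://yourInstance.salesforce.com/services/data/v20.0/query?q=${soql}`,
    { headers: { Authorization: `Bearer ${accessToken}` } }
  );
  return res.json(); // { totalSize, done, records, nextRecordsUrl? }
}
```

Timing a query like this directly against the same query through DevArt would tell you whether the bottleneck is the API or the connector's translation layer.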

Large Data Service (Astoria) payloads: How to improve performance?

I have a Silverlight client accessing data through ADO.NET Data Services. One of my queries has a number of expand clauses and returns quite a number of entries. The XML response is enormous, and I'm looking for ways to make this more efficient.
I have tried:
Paging (not an option for this behaviour)
HTTP compression (ruled out: some client PCs are running IE6)
Doing the expands as separate queries and joining the entities later (this improved things a little)
Is it possible to use JSON as a transport format with the Silverlight client? I haven't found anything about this on the web...
You can see a demonstration of using JSON in Silverlight at the link below:
http://timheuer.com/blog/archive/2008/05/06/use-json-data-in-silverlight.aspx
I am not sure how much performance gain is achieved by using JSON, but I definitely remember that ADO.NET Data Services supports JSON.
Well, I got a chance to talk to Tim Heuer about this, who awesomely went and asked Pablo Castro for me. Thanks, Tim!
JSON can't be used by the Silverlight client, but Silverlight 3 will use binary XML by default to talk to web services. Rawr.
One other thing I worked out for myself was that using expand can sometimes return a lot more data than performing multiple requests. If you batch a few queries together and then hand-stitch the objects together, you can save quite a bit of XML.
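A minimal sketch of that batch-and-stitch idea, written as plain HTTP calls; the service URL, entity sets, and key names are illustrative, and JSON responses are assumed for brevity (a real Data Services feed returns Atom XML or an OData JSON envelope by default):

```typescript
// Minimal sketch of the "multiple requests, hand-stitched" approach.
// Entity sets, key names, and the plain-JSON response shape are assumptions.
interface Order { OrderID: number; CustomerID: string; }
interface Customer { CustomerID: string; Name: string; }

async function loadOrdersWithCustomers(base: string) {
  // Two smaller requests instead of one $expand-ed response.
  const [orders, customers] = await Promise.all([
    fetch(`${base}/Orders`).then(r => r.json() as Promise<Order[]>),
    fetch(`${base}/Customers`).then(r => r.json() as Promise<Customer[]>),
  ]);
  // Hand-stitch: index customers by key, then attach one to each order.
  const byId = new Map(
    customers.map(c => [c.CustomerID, c] as [string, Customer])
  );
  return orders.map(o => ({ ...o, Customer: byId.get(o.CustomerID) }));
}
```

The saving comes from not repeating the expanded entity's XML for every row that references it; each customer is transferred once and joined client-side.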
