Guava version conflict in spark-connector

I ran into a problem while using NebulaGraph, described below:
I want to write offline Spark data into NebulaGraph via the Nebula Spark Connector.
But I encountered two problems:
First, the NebulaGraph version I use only supports Spark v2.4 and Scala v2.11. I solved this by downgrading Spark and Scala.
Second, the Spark connector writes data through the Java client, and the client has a hard dependency on Guava 14: nebula-java/pom.xml at v3.3.0 · vesoft-inc/nebula-java · GitHub
My Spark installation also has a hard dependency on Guava, specifically guava-27.0-jre.
If I use guava-27.0, I get java.lang.NoSuchMethodError: com.google.common.net.HostAndPort.getHostText().
If I use guava-14.0, an error is thrown when Spark reads Hive: Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument
How should I solve this?

Maybe you can refer to this solution: Guava 14.0 and Guava 27.0 have different HostAndPort methods for acquiring the host. You can change the Guava version in the connector (or in Exchange), adjust the HostAndPort.getHostText() call accordingly, and then package it locally.
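For reference, the API difference looks roughly like this (a minimal Scala sketch; getHostText() was removed in Guava 22, and its replacement getHost() only exists from Guava 20 onward):

import com.google.common.net.HostAndPort

val addr = HostAndPort.fromString("127.0.0.1:9669")
val host14 = addr.getHostText // Guava 14 API used by nebula-java v3.3.0; removed in Guava 22+
val host27 = addr.getHost     // Guava 20+ replacement, present in guava-27.0-jre

So whichever Guava version wins on the classpath, one of the two call sites breaks: the client's getHostText() under Guava 27, or the newer Preconditions/HostAndPort usage from Hive/Hadoop under Guava 14. Rebuilding the client against a single Guava version removes that split.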

Related

Flink with Log4j2

We are running a Flink job on v1.13.2 and set up/configured logging using log4j (classic, pre-Log4j2). We want to upgrade to and use Log4j2 instead and could not find any way to do that. Wondering if there are any teams who went down this path to upgrade Log4j. Thanks.
Log4j2 has been the default logger since Flink 1.11. In order to be using log4j v1, there must be some configuration in place that needs to be removed / updated. See the documentation for details.
Although Flink's log configuration file is named log4j.properties, it actually uses Log4j2, as David said.
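For illustration only (not Flink's exact shipped defaults), a log4j.properties file written in Log4j2 syntax looks like this:

rootLogger.level = INFO
rootLogger.appenderRef.console.ref = ConsoleAppender
appender.console.name = ConsoleAppender
appender.console.type = CONSOLE
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

If the file instead contains Log4j 1-style keys such as log4j.rootLogger=INFO, console, that is the leftover configuration to remove or migrate.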

What version data store does Gremlin create when creating a Neo4j database from the console?

I'm running Gremlin v3.2.5, and I keep getting errors when I try to connect to a Neo4j database from the Gremlin console, or using the neo4j-gremlin API:
Failed to start Neo4j with an older data store version. To enable automatic upgrade, please set configuration parameter "allow_store_upgrade=true"
I create the Neo4j database using neo4j-java-driver 1.4.3 and Neo4j 3.2.3, like so in Scala:
val graphDb = new GraphDatabaseFactory()
  .newEmbeddedDatabaseBuilder(new File(dbPath))
  .setConfig(GraphDatabaseSettings.allow_store_upgrade, "true")
  .newGraphDatabase()
or in the Gremlin console
gremlin> conf = new BaseConfiguration()
gremlin> conf.setProperty(Neo4jGraph.CONFIG_CONF + "dbms.allow_format_migration", "true")
gremlin> g = Neo4jGraph.open(conf)
So I would like to know what data store version Gremlin uses, because it doesn't seem to matter how I make the DB, I get errors like the one above. I believe my version of Neo4j creates a data store v0.A.8, and the only thing I haven't tried, which may work, is downgrading my version of Neo4j. Thanks in advance for any ideas/feedback!
*edit: gave wrong version number of neo4j-java-driver, added neo4j version
tl;dr: Apache TinkerPop 3.2.5 is tested to work with Neo4j 2.3.3.
It's worth noting that there is no direct or default dependency on Neo4j for Apache TinkerPop, given the GPL licensing of Neo4j which conflicts with the Apache license. So there is a bit of indirection involved in determining the version to deal with. Technically, TinkerPop leaves it to the user to choose the version of Neo4j to use by selecting a version of neo4j-tinkerpop-api-impl:
https://github.com/neo4j-contrib/neo4j-tinkerpop-api-impl
that is compatible with the version of neo4j-tinkerpop-api
https://github.com/neo4j-contrib/neo4j-tinkerpop-api
that is used with the version of TinkerPop that you are using. In the case of 3.2.5, that would be:
https://github.com/apache/tinkerpop/blob/3.2.5/neo4j-gremlin/pom.xml#L41
While you are technically free to choose a version of neo4j-tinkerpop-api-impl, it's worth noting that TinkerPop 3.2.5 is only tested against 0.3-2.3.3, which is tied to Neo4j 2.3.3:
https://github.com/neo4j-contrib/neo4j-tinkerpop-api-impl/blob/0.3-2.3.3/pom.xml#L23
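As a rough build sketch of pinning that tested combination (sbt shown here; the coordinates are the ones published on Maven Central, adjust for your build tool):

// build.sbt
libraryDependencies ++= Seq(
  "org.apache.tinkerpop" % "neo4j-gremlin" % "3.2.5",
  "org.neo4j" % "neo4j-tinkerpop-api-impl" % "0.3-2.3.3"
)

With that pairing, the embedded database Gremlin opens is Neo4j 2.3.3, so a store created with Neo4j 3.2.3 is newer than anything it can read, regardless of the allow_store_upgrade / allow_format_migration settings.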

Is there any way to index Kafka output in Apache Solr?

I'm new to Apache Solr and I want to index data from Kafka into Solr. Can anyone give a simple example of doing this?
The easiest way to get started on this would probably be to use Kafka Connect.
Connect is part of the Apache Kafka package, so it should already be installed on your Kafka node(s). Please refer to the quickstart for a brief introduction to running Connect.
For writing data to Solr there are two connectors that you could try:
https://github.com/jcustenborder/kafka-connect-solr
https://github.com/MSurendra/kafka-connect-solr
While I don't have any experience with either of them, I'd probably try Jeremy's first, based on the latest commit and the fact that he works for Confluent.
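A rough outline of the standalone Connect route, assuming one of the connectors above has been built and put on the worker's classpath/plugin path (everything below other than the standard name/connector.class/topics keys is a placeholder, so check the chosen connector's README for its actual settings):

# start a standalone Connect worker with a Solr sink configuration
bin/connect-standalone.sh config/connect-standalone.properties solr-sink.properties

# solr-sink.properties (sketch)
name=solr-sink
connector.class=<Solr sink connector class from the project you picked>
topics=my-topic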

Bluemix Monitoring and Analytics: Resource Monitoring - JsonSender request error

I am having problems with the Bluemix Monitoring and Analytics service.
I have 2 applications with bindings to a single Monitoring and Analytics service. Every ~1 minute I get the following log line in both apps:
ERR [Resource Monitoring][ERROR]: JsonSender request error: Error: unsupported certificate purpose
When I remove the bindings, the log message does not appear. I also grepped my code for anything related to "JsonSender" or "Resource Monitoring" and did not find anything.
I am doing some major refactoring work on our server, which might have broken things. However, our code does not use the Monitoring service directly (we don't have a package that connects to the monitoring server or something like that) - so I will be very surprised if the problem is due to the refactoring changes. I did not check the logs before doing the changes.
Any ideas will help.
Bluemix has 3 production environments: ng, eu-gb, and au-syd. I tested ng and eu-gb, both with 2 applications bound to the same M&A service, and also tested with multiple instances. They all work fine.
Meanwhile, I received a similar problem report claiming they are using Node.js 4.2.6.
So there is some more information we need in order to identify the problem:
1. Which version of Node.js are you using (the Bluemix default or another one)?
2. Which production environment are you using? (ng, eu-gb, au-syd)
3. Are there any environment variables you are using in your application?
(either created in code, or set via USER-DEFINED variables)
4. One more thing: could you please try deleting the M&A service and creating it again, in case we are hitting a previous fault of M&A?
cf ds <your M&A service name>
cf cs MonitoringAndAnalytics <plan> <your M&A service name>
Node.js versions 4.4.* all appear to work.
Node.js uses OpenSSL, and apparently it did/does not like how one of the M&A server certificates was constructed.
Unfortunately, Node.js does not expose the OpenSSL verify-purpose API.
Please consider upgrading to 4.4 while we consider how to change the server's certificates in the least disruptive manner, as there are other application types that do not have an issue with them (e.g. Liberty and Ruby).
Setting Node.js version 4.2.4 in package.json worked for me; however, this is a workaround that bypasses the issue. The actual fix is being handled by the core team. Thanks.
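For anyone looking for the mechanics of that workaround: the Bluemix Node.js buildpack picks the runtime from the engines field of package.json, so pinning the recommended 4.4 line looks roughly like this (only the version string should need changing):

{
  "engines": {
    "node": "4.4.x"
  }
}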

RPM technique for handling cumulative updates?

RPM seems to be pretty good at checking dependencies and handling individual file updates, but what is the best practice for handling cumulative updates to, say, a relational database across multiple versions?
For instance, say you have product Foo with versions 1.2.1, 1.2.2, 1.2.3, and 1.3.0. In each of these, there were database schema changes that required SQL upgrade scripts. Running each upgrade script in sequence is required to get up to the current version of the schema.
Say a customer has 1.2.2 installed and wants to upgrade to 1.3.0. How can one structure the RPM package so that you have the appropriate scripts available and execute the correct upgrade scripts against the database? In this instance, you'd want to execute the upgrade scripts for 1.2.3 and 1.3.0, but not the ones for 1.2.1 or 1.2.2, since those have presumably already been executed.
One alternative is to require upgrading to each intermediate version in sequence, forcing the user in this example to upgrade to 1.2.3 before 1.3.0. This seems less than optimal. Also, this would presumably need to be "forced" through an external process, since I don't see anything in the RPM SPEC file that would indicate this.
Are there any known techniques for handling this? A bit of Googling didn't expose any.
EDIT: By "known", I mean "tried and proven" not theoretical.
Use the right tool for the job. RPM probably isn't the right tool. Something like Liquibase would be better suited to this task.
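To make that suggestion concrete, here is a hedged sketch of a Liquibase changelog (ids, authors, and file names are illustrative). Liquibase records every applied changeSet in a DATABASECHANGELOG table, so upgrading a 1.2.2 database straight to 1.3.0 runs only the 1.2.3 and 1.3.0 entries:

<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.1.xsd">

  <changeSet id="1.2.3-schema" author="foo">
    <sqlFile path="upgrade-1.2.3.sql" relativeToChangelogFile="true"/>
  </changeSet>

  <changeSet id="1.3.0-schema" author="foo">
    <sqlFile path="upgrade-1.3.0.sql" relativeToChangelogFile="true"/>
  </changeSet>

</databaseChangeLog>

The RPM could then ship the changelog and SQL files and invoke liquibase update from a %post scriptlet, leaving the "which scripts have already run" bookkeeping to Liquibase rather than to the package manager.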
