Running pubsub kafka connector standalone mode issues - google-cloud-pubsub

So, I have been trying to get a PubSub Kafka connector running for about a month now, with various problems. I have reviewed many questions here about Kafka Connect and the PubSub connector, which have helped me get this far, but I am stuck again. When I run this command:
.\etc\kafka\ .\etc\kafka\ .\etc\kafka\
I get a long list of errors linked here:
The errors "could not scan file [file name]..." start right after it tries to start the REST server. I am unsure if I need to set the and rest.port because currently, for the standaloneConfig values, it reads = null
Edit: After reviewing the log file for a while, I found the following messages:
Kafka consumer created
Created connector CPSConnector
Initializing task CPSConnector-0 with config {,, tasks.max=1, topics=, cps.project=kohls-sis-sandbox, name=CPSConnector, cps.topic=test-pubsub}
Task CPSConnector-0 threw an uncaught and unrecoverable exception
org.apache.kafka.connect.errors.ConnectException: Sink tasks require a list of topics.
at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(
at org.apache.kafka.connect.runtime.WorkerTask.doRun(
at java.util.concurrent.Executors$
at java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.util.concurrent.ThreadPoolExecutor$
Edit: So, I fixed the above issue by adding topics=test in my configSink. The current error message is below. Does this indicate that you can only run either a sink connector or source connector?
Failed to create job for .\etc\kafka\
Stopping after connector error
java.util.concurrent.ExecutionException: org.apache.kafka.connect.errors.AlreadyExistsException: Connector CPSConnector already exists
at org.apache.kafka.connect.util.ConvertingFutureCallback.result(
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(
at org.apache.kafka.connect.cli.ConnectStandalone.main(
Caused by: org.apache.kafka.connect.errors.AlreadyExistsException: Connector CPSConnector already exists
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(
at org.apache.kafka.connect.cli.ConnectStandalone.main(
In my WorkerConfig.properties, I have bootstrap.servers=localhost:2181. My property files are here.
I am not sure how to fix this, since I have set up my properties files and made sure the cps-kafka-connector.jar is on the classpath. I also set plugin.path=\share\java\kafka\kafka-connect-pubsub.
If anyone can point me in the right direction to fix this issue, that would be great. I followed the directions here:

Each Connector instance, whether it's a source or a sink, needs to have a unique name when you submit its configuration properties to a Kafka Connect cluster, or standalone worker.
In the above example, just name your Source differently than your Sink.
For instance:
$ head -n 1
$ head -n 1
or, might as well:
$ head -n 1
$ head -n 1
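The uniqueness rule can be checked before starting the worker. Below is a minimal sketch in Python, assuming each connector's properties file is available as text and uses simple `key=value` lines. The helper names and the sample file contents are hypothetical; only the `name` key and the `CPSConnector` value come from the question.

```python
from collections import Counter


def connector_name(properties_text):
    """Return the value of the `name` key from simple key=value properties text."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "name":
            return value.strip()
    return None


def duplicate_names(configs):
    """Return the connector names that appear in more than one config."""
    counts = Counter(connector_name(text) for text in configs)
    return sorted(n for n, c in counts.items() if n is not None and c > 1)


# Hypothetical file contents, for illustration only.
sink = "name=CPSConnector\ntopics=test\n"
source = "name=CPSConnector\ncps.topic=test-pubsub\n"
print(duplicate_names([sink, source]))
```

If the list is non-empty, the standalone worker would be expected to reject the second file with the AlreadyExistsException shown in the question.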


Failed to create collection 'techproducts' due to: Underlying core creation failed while creating collection: techproducts

I just started learning Solr with the official documentation, and during the first exercise, "Index Techproducts Example Data", I failed with the following error: "Failed to create collection 'techproducts' due to: Underlying core creation failed while creating collection: techproducts".
I tried changing the Java version from 13 to 8, but it didn't help.
Here is link to the documentation:
Stacktrace from solr Admin console
Collection: techproducts operation: create failed:org.apache.solr.common.SolrException: Underlying core creation failed while creating collection: techproducts
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(
at java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.util.concurrent.ThreadPoolExecutor$
I ran into a similar situation while following Solr's official tutorial:
➜ solr-8.7.0 ERROR: Failed to create collection 'techproducts' due to: Underlying core creation failed while creating collection: techproducts
The problem was solved by turning off my VPN. I guess the VPN routing somehow interfered with Solr's localhost setting.
I had the same Underlying core creation failed... error too. Using Java 11, Windows 10.
The log file was ${solr-home}\example\cloud\node1\logs\solr.log. Inside it had:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at Error CREATEing SolrCore 'techproducts_shard1_replica_n1': Unable to create core [techproducts_shard1_replica_n1] Caused by: no segments* file found in LockValidatingDirectoryWrapper(NRTCachingDirectory(MMapDirectory#{solr_home}\example\cloud\node2\solr\techproducts_shard1_replica_n1\data\index; maxCacheMB=48.0 maxMergeSizeMB=4.0)): files: [write.lock] at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod( ~[?:?]
at (etc. etc.)
But this was the second time I launched Solr. The first time, it timed out trying to contact one of the nodes and the tutorial script aborted, but the nodes were still running. I killed them using the Windows Task Manager rather than solr stop. So I suspect I left an unstable mess behind, and the second time the tutorial ran it crashed into this mess.
I erased everything and started over from unzipping, and this third time there were no timeouts and the tutorial completed without error.
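Before erasing everything, it may help to confirm that leftover index directories are the cause. The log above reports an index directory containing only write.lock and no segments* file. Here is a minimal sketch in Python that looks for directories in that state; the function name is hypothetical, and the root directory to scan is whatever your Solr home is.

```python
from pathlib import Path


def stale_index_dirs(root):
    """Yield directories under root that hold write.lock but no segments* file."""
    for lock in Path(root).rglob("write.lock"):
        index_dir = lock.parent
        # A usable Lucene index has at least one segments* file next to the lock.
        if not any(index_dir.glob("segments*")):
            yield index_dir
```

For example, `list(stale_index_dirs(solr_home))` with `solr_home` pointing at the example\cloud directory would list the replica index directories left behind by an aborted run. This only reports them; deleting them or starting over from a fresh unzip is a separate decision.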
File: /opt/solr/server/etc/jetty.xml
(1) In the Set element with name="requestHeaderSize", set the Property with name="solr.jetty.request.header.size" to default="81920"
(2) In the Set element with name="responseHeaderSize", set the Property with name="solr.jetty.response.header.size" to default="81920"
(3) Restart Solr
Hm, tried this, still getting the exact same error.
After Change:
<Set name="requestHeaderSize"><Property name="solr.jetty.request.header.size" default="81920" /></Set>
<Set name="responseHeaderSize"><Property name="solr.jetty.response.header.size" default="81920" /></Set>
I stopped everything and retried; then Windows Firewall prompted me to authorize 'SAP Machine' for Java 11. I accepted it and retried, and then it worked. It seems to be Windows Firewall related.

flink job submission org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job

I am getting the following Flink job submission error:
#centos1 flink-1.10.0]$ ./bin/flink run -m ./examples/batch/WordCount.jar --input file:///storage/flink-1.10.0/test.txt --output file:///storage/flink-1.10.0/wordcount_out
Job has been submitted with JobID 33d489aee848401e08c425b053c854f9
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: [ org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (33d489aee848401e08c425b053c854f9)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (33d489aee848401e08c425b053c854f9)
Caused by: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (33d489aee848401e08c425b053c854f9)
at org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGatewayFuture(
at org.apache.flink.runtime.dispatcher.Dispatcher.requestJobStatus(
... 27 more
Logs from the TaskManager nodes say the file was not found. Is this the correct way of pointing to files in a Flink cluster setup?
2020-03-19 13:15:29,843 ERROR org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN DataSource (at main( ( -> FlatMap (FlatMap at main( -> Combine (SUM(1), at main( (1/2) Error opening the Input Split file:/storage/flink-1.10.0/test.txt [0,19]: /storage/flink-1.10.0/test.txt (No such file or directory)
How do I troubleshoot the above error, and what should I check? There are very few clues in the Flink logs.
This happens because you are submitting a job to a distributed cluster, and the location you specified is probably only accessible from the JobManager or from the machine where you submitted the job. However, the actual program and job execution takes place on the TaskManagers. A better approach is to specify a location that is accessible from all nodes, for example HDFS or NFS.
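As a rough illustration of the distinction, here is a minimal sketch in Python that flags input and output URIs resolving to a node-local filesystem. The set of "local" schemes and the hdfs:// path are assumptions made for this example, not part of Flink's API.

```python
from urllib.parse import urlparse

# Schemes treated as node-local in this sketch (an assumption, not a Flink API).
LOCAL_SCHEMES = {"", "file"}


def is_node_local(uri):
    """True if the URI resolves against the local filesystem of the node reading it."""
    return urlparse(uri).scheme in LOCAL_SCHEMES


# The first URI is from the question; the hdfs:// one is hypothetical.
for uri in ("file:///storage/flink-1.10.0/test.txt", "hdfs:///data/test.txt"):
    note = "must exist on every TaskManager" if is_node_local(uri) else "shared storage"
    print(uri, "->", note)
```

A file:// path can still work if the same path exists on every TaskManager, for instance through an NFS mount at the same location on all nodes.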

[AWS Glue]: org.apache.thrift.TApplicationException: Internal error processing createInterpreter

I'm trying to use zeppelin-0.8.0 to connect to an AWS Glue development endpoint, and when executing a cell the error below occurs.
There is no helpful message to understand what the problem could be. Any leads appreciated.
172318_1906434757 is finished, status: ERROR, exception: java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing createInterpreter, result: %text org.apache.thrift.TApplicationException: Internal error processing createInterpreter
at org.apache.thrift.TServiceClient.receiveBase(
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_createInterpreter(
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.createInterpreter(
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(
at org.apache.zeppelin.notebook.Paragraph.jobRun(
at org.apache.zeppelin.scheduler.RemoteScheduler$
at java.util.concurrent.Executors$
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(
at java.util.concurrent.ScheduledThreadPoolExecutor$
at java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.util.concurrent.ThreadPoolExecutor$
UPDATE: As the answer below suggests, it looks like 0.8.0 doesn't work with Glue yet. I had problems running 0.7.x as well, with the package throwing a bunch of MethodNotFoundException errors when running with Java 8 (switching to Java 7 with update-alternatives did not help either). But when running inside a JDK 7 Docker container it worked with no problems and was able to connect to my dev endpoint. I would highly appreciate it if anyone could clarify the root cause.
Could you please provide more information, such as the location of the Zeppelin instance? Is it running on your desktop/laptop, or is it running as an AWS notebook server? Also, did you try connecting with Zeppelin version 0.7.3, as mentioned here in this AWS forum link :
As per the above link, dated Jul 2018, I think AWS Glue doesn't yet support Zeppelin version 0.8.
I am assuming all other configuration and environment settings are done as needed. I can help more if you can provide additional info.
Anyway, please refer here and to setting up zeppelin on windows for help with setting up a local development environment and Zeppelin notebook.
Once you have set up the Zeppelin notebook, establish an SSH connection (using the AWS Glue DevEndpoint URL) so you have access to the data catalog, crawlers, etc., and also the S3 bucket where your data resides. Then you can create your Python scripts in the Zeppelin notebook and run them from Zeppelin.
You can use the dev instance provided by Glue, but you may incur additional costs for it (EC2 instance charges).
Environment settings (updated in response to comments):
Change the drive name/folders accordingly. Let me know if any help is needed.

Collecting Metrics with Graphite Plugin leads to "A metric named [..] already exists" error

When I configure flink-conf.yaml to collect metrics with the Graphite plugin,
most of the time only incomplete metrics are sent. In the TaskManager output, multiple errors occur like:
2018-08-15 00:58:59,016 WARN org.apache.flink.runtime.metrics.MetricRegistryImpl - Error while registering metric.
java.lang.IllegalArgumentException: A metric named mycomputer.taskmanager.8ceab4c3dfbf9fc5fa2af0447f1373a1.State machine job.Source: Custom Source.0.numRecordsOut already exists
at com.codahale.metrics.MetricRegistry.register(
at org.apache.flink.dropwizard.ScheduledDropwizardReporter.notifyOfAddedMetric(
at org.apache.flink.runtime.metrics.MetricRegistryImpl.register(
at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.addMetric(
at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.counter(
at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.counter(
at org.apache.flink.runtime.metrics.groups.OperatorIOMetricGroup.<init>(
at org.apache.flink.runtime.metrics.groups.OperatorMetricGroup.<init>(
at org.apache.flink.runtime.metrics.groups.TaskMetricGroup.addOperator(
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.setup(
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.setup(
at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(
I've tried this on a completely freshly prepared flink-1.6.0 release with following config and the precompiled "State machine job" in the examples folder:
metrics.reporters: grph
metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter
metrics.reporter.grph.host: localhost
metrics.reporter.grph.port: 2003
metrics.reporter.grph.interval: 1 SECONDS
metrics.reporter.grph.protocol: TCP
I use the official Graphite Docker image, running with the default configuration.
Does anybody have an idea how I can fix this issue?
Thanks and best regards
To rule out a specific local setting being responsible for this behaviour, I repeated the process on a clean EC2 instance. Exactly the same error occurs there.
How to reproduce:
start EC2 t2.xlarge
installed java
download flink at
added the flink-metrics-graphite-1.6.0.jar to lib
configured the flink-conf.yaml as mentioned in my previous post
./bin/flink run examples/streaming/StateMachineExample.jar
I have not set up Graphite in this case, because the error obviously occurs before any metrics would be sent.
After the job has been started you can view the error in the flink dashboard under Task Manager -> Logs
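To see which metric names are being rejected and how often, the TaskManager log can be summarised. This is a minimal sketch in Python, assuming the log lines have the same wording as the warning quoted above; the function name is hypothetical and the sample line is copied from the question.

```python
import re
from collections import Counter

# Matches the exception message shown in the TaskManager log above.
PATTERN = re.compile(r"A metric named (.+) already exists")


def duplicate_metric_names(log_lines):
    """Count how often each metric name was rejected as already registered."""
    counts = Counter()
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample line taken from the log in the question.
sample = [
    "java.lang.IllegalArgumentException: A metric named "
    "mycomputer.taskmanager.8ceab4c3dfbf9fc5fa2af0447f1373a1.State machine job."
    "Source: Custom Source.0.numRecordsOut already exists",
]
print(duplicate_metric_names(sample).most_common(5))
```

This does not fix anything by itself; it only shows which names collide, which may help narrow down whether the collisions come from one operator or from many.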

Sonarqube 5.6 database copy fails (Exception sending context initialized event to listener...)

I need to move a Sonarqube 5.6 installation from one server to another.
The new server will also run with a new database so my plan is to copy
the old data to the new database and then start the new Sonarqube instance
against the new database containing the copied data.
Both new and old are Sonarqube 5.6 with Oracle. The old database is
Oracle 11g and the new will be Oracle 12c, but I am using the Oracle
Express 11g and another local Sonarqube 5.6 installation in order to test the procedure.
I proceed as follows:
(1) Export old database with SQL Developer as DDL (insert format)
(2) Make some small changes to resulting sql:
Tablespace name is hard-coded and different in target database so adapt
clause "SEGMENT CREATION DEFERRED" not supported in target database, so I simply deleted it
(3) Import sql to new target database
(4) Start new Sonarqube instance connecting to new database
After this, unfortunately, the Sonarqube server shuts down and in the logs I see the error/exception:
Exception sending context initialized event to listener instance of class org.sonar.server.platform.PlatformServletContextListener
(full text below).
Further tests:
If I start the new Sonarqube instance against the new database with no imported data
fresh tables are created and all is well. After doing that I can also export the new database,
drop and recreate the new sonarqube database user, and re-import the data from the new environment,
also works fine.
That is to say the new installation in stand-alone mode works fine, the export/import also works
fine (at least with minimal data and exported from the same environment / database).
The problem therefore seems to be caused by something in the data I am importing from the old
Sonarqube installation.
I have also tried after the import rebuilding all indexes (no change), and deleting all rows
from all tables (sonarqube then tries to create new tables and runs into an error because
table projects already exists).
One thing that does occur to me is that the old installation has many plugins. I have tried
to get the new installation to the same state but it is not totally identical, there are a few
version differences and the old installation had some licenced plugins (Swift and Objective C)
that I do not have for my local test installation. There are also a few error messages in the log
to that effect, but these don't seem to be the critical problem.
2017.01.21 00:07:53 ERROR web[cpp] No license for cpp
2017.01.21 00:07:53 ERROR web[objc] No license for objc
I have also tried deleting the logs, data, temp directories in Sonarqube before starting the
new server against the new database.
I have of course searched for this error message but it seems to mostly occur when migrating
from one Sonar version to another which is not the case here.
Does anyone have any thoughts?
Should this procedure theoretically work or have I missed something?
Many thanks for any ideas!
2017.01.21 00:08:29 INFO web[o.s.s.n.NotificationService] Notification service stopped
2017.01.21 00:08:29 ERROR web[o.a.c.c.C.[.[.[/]] Exception sending context initialized event to listener instance of class org.sonar.server.platform.PlatformServletContextListener
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
at ~[guava-17.0.jar:na]
at ~[sonar-server-5.6.jar:na]
at ~[sonar-server-5.6.jar:na]
at ~[sonar-server-5.6.jar:na]
at org.sonar.server.platform.platformlevel.PlatformLevelStartup$1.doPrivileged( ~[sonar-server-5.6.jar:na]
at org.sonar.server.user.DoPrivileged.execute( ~[sonar-server-5.6.jar:na]
at org.sonar.server.platform.platformlevel.PlatformLevelStartup.start( ~[sonar-server-5.6.jar:na]
at org.sonar.server.platform.Platform.executeStartupTasks( ~[sonar-server-5.6.jar:na]
at org.sonar.server.platform.Platform.doStart( ~[sonar-server-5.6.jar:na]
at org.sonar.server.platform.Platform.doStart( ~[sonar-server-5.6.jar:na]
at org.sonar.server.platform.PlatformServletContextListener.contextInitialized( ~[sonar-server-5.6.jar:na]
at org.apache.catalina.core.StandardContext.listenerStart( [tomcat-embed-core-8.0.30.jar:8.0.30]
at org.apache.catalina.core.StandardContext.startInternal( [tomcat-embed-core-8.0.30.jar:8.0.30]
at org.apache.catalina.util.LifecycleBase.start( [tomcat-embed-core-8.0.30.jar:8.0.30]
at org.apache.catalina.core.ContainerBase$ [tomcat-embed-core-8.0.30.jar:8.0.30]
at org.apache.catalina.core.ContainerBase$ [tomcat-embed-core-8.0.30.jar:8.0.30]
at [na:1.8.0_65]
at java.util.concurrent.ThreadPoolExecutor.runWorker( [na:1.8.0_65]
at java.util.concurrent.ThreadPoolExecutor$ [na:1.8.0_65]
at [na:1.8.0_65]
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
at [na:1.8.0_65]
at java.util.concurrent.FutureTask.get( [na:1.8.0_65]
at ~[guava-17.0.jar:na]
at ~[sonar-server-5.6.jar:na]
... 18 common frames omitted
Caused by: java.lang.NullPointerException: null
at ~[na:1.8.0_65]
at ~[commons-io-2.4.jar:2.4]
at ~[commons-io-2.4.jar:2.4]
at org.sonar.db.source.FileSourceDto.decodeTestData( ~[sonar-db-5.6.jar:na]
at ~[sonar-server-5.6.jar:na]
at ~[sonar-server-5.6.jar:na]
at ~[sonar-db-5.6.jar:na]
at org.sonar.server.test.index.TestIndexer.doIndex( ~[sonar-server-5.6.jar:na]
at org.sonar.server.test.index.TestIndexer.doIndex( ~[sonar-server-5.6.jar:na]
at org.sonar.server.test.index.TestIndexer.doIndex( ~[sonar-server-5.6.jar:na]
at$2.index( ~[sonar-server-5.6.jar:na]
at$ ~[sonar-server-5.6.jar:na]
at java.util.concurrent.Executors$ ~[na:1.8.0_65]
... 4 common frames omitted
2017.01.21 00:08:29 ERROR web[o.a.c.c.StandardContext] One or more listeners failed to start. Full details will be found in the appropriate container log file
2017.01.21 00:08:29 ERROR web[o.a.c.c.StandardContext] Context [] startup failed due to previous errors
2017.01.21 00:08:29 WARN web[o.a.c.l.WebappClassLoaderBase] The web application [ROOT] appears to have started a thread named [Thread-4] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread: Method)
2017.01.21 00:08:29 WARN web[o.a.c.l.WebappClassLoaderBase] The web application [ROOT] appears to have started a thread named [Progress[BulkIndexer[tests]]] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
java.lang.Object.wait(Native Method)
2017.01.21 00:08:29 INFO web[o.a.c.h.Http11NioProtocol] Starting ProtocolHandler ["http-nio-"]
2017.01.21 00:08:29 INFO web[o.s.s.a.TomcatAccessLog] Web server is started
2017.01.21 00:08:29 INFO web[o.s.s.a.EmbeddedTomcat] HTTP connector enabled on port 9000
2017.01.21 00:08:29 WARN web[o.s.p.ProcessEntryPoint] Fail to start web
java.lang.IllegalStateException: Webapp did not start
at ~[sonar-server-5.6.jar:na]
at [sonar-server-5.6.jar:na]
at org.sonar.process.ProcessEntryPoint.launch( ~[sonar-process-5.6.jar:na]
at [sonar-server-5.6.jar:na]
2017.01.21 00:08:29 INFO web[o.a.c.h.Http11NioProtocol] Pausing ProtocolHandler ["http-nio-"]
2017.01.21 00:08:30 INFO web[o.a.c.h.Http11NioProtocol] Stopping ProtocolHandler ["http-nio-"]
2017.01.21 00:08:30 INFO web[o.a.c.h.Http11NioProtocol] Destroying ProtocolHandler ["http-nio-"]
2017.01.21 00:08:30 INFO web[o.s.s.a.TomcatAccessLog] Web server is stopped
2017.01.21 00:08:30 INFO app[o.s.p.m.Monitor] Process[es] is stopping
2017.01.21 00:08:31 INFO es[o.s.p.StopWatcher] Stopping process
2017.01.21 00:08:31 INFO es[o.elasticsearch.node] [sonar-1484953654097] stopping ...
The server fails to start if the target database is not an exact copy of the source database. You should double-check that all tables and sequences have exactly the same content, values of primary keys included. One strategy is to start a fresh install on the target database so that SonarQube creates the schema. Then a data backup can be restored.
OK, it is working now, so here is a quick update that may help others. It seems it is necessary to let the new Sonar instance initialise the new database and then to do a "hard" copy, by which I mean using, in SQL Developer, the options copy objects, replace existing target objects, and truncate target data before copy.
I couldn't quite figure this out, because the initial startup must do something that makes the error I was getting go away, so something from it must remain in the database even after the hard copy. A soft copy, without replacing objects, allowed Sonar to start but with problems, e.g. key violations when creating users or groups. These could be fixed by rebuilding indexes and/or dropping and reactivating constraints; they were the result of differing initial values of the sequences used to set the user-id. But the hard copy circumvented all these problems, so that is the route I would recommend. I also deleted the directories data, temp and logs from SONAR_HOME; I'm not 100% sure if this is necessary.
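The check for identical table content and sequence values can be scripted once the numbers are collected from both databases. Here is a minimal sketch in Python of the comparison step only. The object names and values are hypothetical; in practice they would come from queries such as a row count per table and the current value of each sequence on the source and target.

```python
def diff_snapshots(source, target):
    """Return {name: (source_value, target_value)} for every mismatch."""
    names = set(source) | set(target)
    return {
        name: (source.get(name), target.get(name))
        for name in sorted(names)
        if source.get(name) != target.get(name)
    }


# Hypothetical snapshots: row counts for two tables and one sequence value.
source = {"USERS": 42, "PROJECTS": 310, "USERS_SEQ": 43}
target = {"USERS": 42, "PROJECTS": 310, "USERS_SEQ": 1}
print(diff_snapshots(source, target))
```

A sequence that is behind the highest existing key on the target, as in this made-up example, is the kind of mismatch that would lead to key violations when new users or groups are created.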
