Error restarting Zeppelin interpreters and saving their parameters - apache-zeppelin

I have Zeppelin 0.7.2 installed and connected to Spark 2.1.1 standalone cluster.
It has been running fine for quite a while until I changed the Spark workers' settings, to double the workers' cores and executor memory. I also tried to change the parameters SPARK_SUBMIT_OPTIONS and ZEPPELIN_JAVA_OPTS on zeppelin-env.sh, to make it request for more "Memory per node" on the Spark workers but it always requests only 1GB per node so I removed them.
I had an issue while developing a paragraph so I tried set zeppelin.spark.printREPLOutput to true on the web interface. But when I tried to save that setting, I only got a small transparent red box at right side of my browser window. So it fails to save that setting. I also got that small red box when I tried to restart the Spark interpreter. The same actually happens when I tried to change the parameters of all other interpreters or restart them.
There is nothing on the log files. So I am quite puzzled on this issue. Do any of you has ever experienced this issue? If so, what kind of solutions that you applied to fix it? If not, do you have any suggestions on how to debug this issue?

I am sorry for the noise. The issue was actually due to my silly mistake.
I actually have Zeppelin behind nginx. I recently played around with a new CMS. I didn't separate the configuration of the CMS and the proxy to Zeppelin. So any access to location containing /api/, like restarting Zeppelin interpreters or saving the interpreters' settings, got blocked. Separating the site configuration of the CMS and the proxy to Zeppelin on nginx solves the problem.

Related

Appengine runs a stale version of the code -- and stack traces don't match source code

I have a python27 appengine application. My application generates a 500 error early in the code initialization, and I can inspect the stack trace in the StackDriver debugger in the GCP console.
I've since patched the code, and I've re-deployed under the same service name and version name (i.e. gcloud app deploy --version=SAME). Unfortunately, the old error still comes up, and line numbers in the stack traces reflect the files in the buggy deployment. If I use the code viewer to debug the error, I am however brought to the updated patched code in the online viewer -- and there is a mismatch. It behave as if the app instance is holding on to a previous snapshot of the code.
I'm fuzzy on the freshness and eventual consistency guarantees of GAE. Do I have to wait to get everything to serve the latest deployed version? Can I force it to use the newer code right away?
Things I've tried:
I initially assumed the problem had to do with versioning, i.e. maybe requests being load-balanced between instances with the same version, but each with slightly different code. I'm a bit fuzzy on the actual rules that govern which GAE instance gets chosen for a new request (esp whether GAE tries to reuse previous instances based on a source IP). I'm also fuzzy on whether or not active instances get destroyed right away when different code is redeployed under the same version name.
To take that possibility out of the equation, I tried pushing to a new version name, and then deleting all previous versions (using gcloud app versions list to get the list). But it doesn't help -- I still get stack traces from the old code, despite the source being up to date in the GCP console debugger. Waiting a couple hours doesn't do anything either.
I've tried two things:
disabling and re-enabling the application in GAE->Settings
I'd also noticed that there were some .pyc files uploaded in the snapshot, so I removed those and re-deployed.
I discovered that (1) is a very effective way to stop all running appengine instances. When you deploy a new version of a project, a traffic split is created (i.e. 0% for the old version and 100% for the new), but in my experience old instances might still be running if they've been used recently (despite them being configured to receive 0% of traffic). Toggling kills them all immediately. I unfortunately found that my stale code was still being used after re-enabling.
(2) did the trick. It wasn't obvious that .pyc were being uploaded. I discovered it by looking at GCP->StackDriver->Debug and I saw .pyc files in the tree snapshot.
I had recently updated my .gitignore to ignore locally installed pip runtime dependencies for the project (output of pip install -t lib requirements.txt). I don't want those in git, but they do need to ship as part of my appengine project. I had removed the #!.gitignore special include line from .gcloudignore. However, I forgot to re-add *.pyc into my .gcloudignore.
Another way to see the complete set of files included in an app deployment is to increase the verbosity to info on the gcloud app deploy command -- you see a giant json manifest with checksums. I don't typically leave that on because it's hard to visually inspect, but I would have spotted the .pyc in there.

Displaying errors from Zeppelin on Amazon EMR

I'm running Apache Zeppelin on AWS EMR for the first time and I'm unable to see the errors after running code. The only thing I see is the status changes from Running to ERROR.
I have also checked the Hadoop Resource Manager and the logs do not contain any errors.
I would assume that Zeppelin should display the errors in the results window. Is this not the correct assumption?
The error should be displayed.
BTW What EMR version are you using?
I just tested Zeppelin Tutorial in EMR-5.2.0 and it's working well.
Zeppelin behaves like this when Spark and Hadoop are not running. Maybe it's simple as that.
We had the same issue with emr-5.0.3. This is a known bug. Switched to emr-5.2.0 and it is working fine
You can edit interpreter settings from the menu on the right top corner. Edit spark properties section to add zeppelin.spark.printREPLOutput as name and set its value to true.

Bluemix Monitoring and Analytics: Resource Monitoring - JsonSender request error

I am having problems with the Bluemix Monitoring and Analytics service.
I have 2 applications with bindings to a single Monitoring and Analytics service. Every ~1 minute I get the following log line in both apps:
ERR [Resource Monitoring][ERROR]: JsonSender request error: Error: unsupported certificate purpose
When I remove the bindings, the log message does not appear. I also greped my code for anything related to "JsonSender" or "Resource Monitoring" and did not find anything.
I am doing some major refactoring work on our server, which might have broken things. However, our code does not use the Monitoring service directly (we don't have a package that connects to the monitoring server or something like that) - so I will be very surprised if the problem is due to the refactoring changes. I did not check the logs before doing the changes.
Any ideas will help.
Bluemix have 3 production environments: ng, eu-gb, au-syd, and I tested with ng, and eu-gb, both using 2 applications with same M&A service, and tested with multiple instances. They are all work fine.
Meanwhile, I received a similar problem that claim they are using Node.js 4.2.6.
So there are some more information we need to know to identify the problem:
1. Which version of Node.js are you using (Bluemix Default or any other one)
2. Which production environment are you using? (ng, eu-gb, au-syd)
3. Is there any environment variables are you using in your application?
(either the creating in code one, or the one using USER-DEFINED Variables)
4. One more thing, could you please try to delete the M&A service, and create it again, in case we are trapped in a previous fault of M&A.
cf ds <your M&A service name>
cf cs MonitoringAndAnalytics <plan> <your M&A service name>
NodeJS versions 4.4.* all appear to work
NodeJS uses openssl and apparently did/does not like how one of the M&A server certificates were constructed.
Unfortunately NodeJS does not expose the openssl verify purpose API.
Please consider upgrading to 4.4 while we consider how to change the server's certificates in the least disruptive manner as there are other application types that do not have an issue with them (e.g. Liberty and Ruby)
setting node js version 4.2.4 in package.json worked for me, however this is an alternative by-passing solution. Actual fix is being handled by core team. Thanks.

unable to run "update.php"

I had a Drupal website in drupal 6.x When i wanted to upgrade it to 7.x, the custom theme i was using was absolutely outdated, so i imported my Zen subtheme from another website. For doing that, i just copied the folders "zen" and "mytheme" from the other installation to this one. Then i selected "mytheme" as the current theme and changed some little things to adapt it.
The fact is now i'm experiencing the following problems:
I can't auto-update nothing. It downloads the files but then it stacks (and leaves the site in Maintenance Mode).
I can't run update.php. I can go on until i reach the "Run updates" step, which shows a blank screen. This is the main issue, i think, as it evidences -in my oppinion- a problem of communication between the site and the database.
In the Webform module i can't select "File" as a Field Type. This one is not that important, but it may be relevant for someone who knows more than me.
Edit:
Recently i migrated from a Shared to a VPS server. When i migrated, the subdomain in which i was experiencing the issue threw an incompatibility related to the lenght of the password of the database. I changed the password and it started to work again. This may help somebody for helping me. Thanks!

Embedded Solr on Amazon AWS

Currently, I have developed a web application. In my web application, I used embedded solr server to make indexing. After that I deployed onto the Tomcat 6 on window xp. Everything is ok. Next, I have tried my web application to deploy on Amazon AWS. My platform is linux + mysql. When I deployed, I got the exception related with embedded solr.
[ WARN] 19:50:55 SolrCore - [] Solr index directory 'solrhome/./data/index' doesn't exist. Creating new index...
[ERROR] 19:50:55 CoreContainer - java.lang.RuntimeException: java.io.IOException: Cannot create directory: /usr/share/tomcat6/solrhome/./data/index
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:403)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:552)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:480)
So how to fix my problem. I am novie to linux.
My guess is that the user you are running Solr under does not have permission to access that directory.
Also, which version of Solr are you using? Looks like 3+. The latest version is 4, so it may make sense to try using that from the start. Probably a bit more troubleshooting to start, but a much better pay off that starting with legacy configuration.
I got solution. That is because of permission affair on Amazon Linux with ec2-user. So , I changed permission by following.
sudo chmod -R ugo+rw /usr/share/tomcat6
http://wiki.apache.org/solr/SolrOnAmazonEC2strong text
t should allow access to ports 22 and 8983 for the IP you're working from, with routing prefix /32 (e.g., 4.2.2.1/32). This will limit access to your current machine. If you want wider access to the instance available to collaborate with others, you can specify that, but make sure you only allow as much access as needed. A Solr instance should not be exposed to general Internet traffic. If you need help figuring out what your IP is, you can always use whatismyip.com. Please note that production security on AWS is a wide ranging topic and is beyond the scope of this tutorial.

Resources