Apache Zeppelin: very slow html output - apache-zeppelin

My goal is to take some data from python and/or scala interpreter in Zeppelin, and to finally display the data inline by some JavaScript library such as Plotly, D3, Vis, etc.
The perfect seamless integration would be to simply output the JavaScript incl. the stringified data via print("%html <script>" + content + "</script>").
Indeed, this works very well with all kind of libraries as long as the content is not too large, e.g., print("%html <script>alert(JSON.stringify({name: 'Peter', age: 24}))</script>")
However, if the content size grows, then the html output takes very, very long, e.g.:
%python
print("%html start")
s = "X" * 100000 # data of length 100k
print("<script>js='" + s + "'; alert(js.length)</script>") # takes > 1 minute!
Note that if I write the same output to a file and load it, there is no such delay. Thus, it is not caused by slow browser rendering, but probably by Zeppelin's way how %html output is processed?
Does anybody know how to fix or circumvent this problem?

Okay, I finally found the answer: this is a known bug.
https://issues.apache.org/jira/browse/ZEPPELIN-1360
A workaround is to use the %pyspark interpreter for python development, rather than the pure %python interpreter.

Related

Is there a Python alternative to IDL's array_indices() function?

I am working on a port of some IDL code to Python (3.7). I have a translation working which uses whatever direct Python alternatives are available, and supplementing what I can with idlwrap. In an effort to eliminate legacy IDL functions from the code, I am looking for an alternative to ARRAY_INDICES(). Right now, I have simply translated the entire function directly and import it on its own. I've spent a good deal of time trying to understand exactly what it does, and even after translating it verbatim, it is still unclear to me, which makes coming up with a simple Python solution challenging.
The good news is I only need it to work with one specific set of arrays whose shapes wont change. An example of the code that will be run follows:
temp = np.sum(arr, axis=0)
goodval = idlwrap.where(temp > -10)
ngood = goodval.size
arr2 = np.zeros_like(arr)
for i in range(0, ngood - 1):
indices = array_indices(arr2, goodval[i])
#use indices for computation

Pretty printing with Sagemath command line

How do I "pretty-print" something in the Sagemath command line (as in sympy)? The only solution I found was to use get_test_shell() function, according to this entry in the docs. But basically I have to use
from sage.repl.interpreter import get_test_shell
shell = get_test_shell()
shell.run_cell('%display ascii_art')
shell.run_cell('integral(x^2/pi^x, x)')
I find this to be too cumbersome (4 complicated lines just to display things nicely!) and what you want to display even has to be given as a string. So I'm looking for an easier alternative. I'm sure there's another way; I just can't find it.
I think I might be using the wrong terms because according to the manual itself:
Pretty printing means
rendering things so that MathJax or some other latex-aware front end
can render real math.
So if I use pretty_print_default(True) it's going to always print latex, which isn't what I want for now.
One thing I tried that I thought would work was to do
from sympy import pretty_print
Which should use Sympy's pretty-print (which is what I actually want), but it doesn't make any difference.
You ran across the problem that we want to be able to doctest things that are interactive by their nature. Try just this at the command line:
sage: %display ascii_art
sage: integral(x^2/pi^x, x)
/ 2 2 \ -x*log(pi)
-\x *log (pi) + 2*x*log(pi) + 2/*e
---------------------------------------------
3
log (pi)
This is one of the "magic" percent directives in IPython. I agree it could be documented better.

Generate MEL to rebuild a model in Maya

I made a 3D tree in Maya and need MEL code to run in a for loop and generate as many trees as I want. Is there any way to convert a model made in Maya into MEL code that would rebuild the tree?
I can't just duplicate it because the script needs to generate the tree from scratch. Unfortunately, I cleared my history because it was messy, so I am looking for a way to generate the MEL code given just the geometry.
It is slightly unfortunate that you deleted the history. If you had not then you could have just built a script out if the history tree. But all is not lost, still i will explain how you can do this in case you do remake the history at some point.
The complexity of your history is never really a issue, at least when modeling. I would advice to not delete any history until your sure its of no use. Having a 1000 node history is not in any way slowing you down since most of it is inactive. Its only a problem once you enter animation even then its not necessarily a problem at all. Since the history deletion is a bit destructive its a good idea to save the scene when you are deleting big areas of history. So rule of thumb do not delete history and use the freeze transformation operation unless its really necessary (and it may be you never actually do this).
Another side remark: Copying stuff is still valid, unless like in this case there is a artificial limitation. The only time when this limitation is valid is if this is homework (in which case you may want to demonstrate that your willing to follow instruction). Doing a copy does not detract in any way the end result. Copying is mel (and all maya ascii files are mel of sorts).
Method 1: (when history is cleared)
You can reduce the problem to a normal rigging operation. What you do is you rig the tree that you built with bones, clusters and possibly blend shapes. Then just randomize the channels of your rig on each copy operation. The neat thing here is that the script can be easily extended with different base trees, or just about anything, just by making new separate rigs.
The bonus here is that you could now also easily animate the tree being blown by the wind etc.
This might feel like something outside the scope of the assignment. However a good MEL programmer does not really separate tasks. Using a node to do something is just as valid, if not more valid, as writing everything in code. Something like this demonstrates really thorough understanding of maya use and maya programming. On the other hand not all users are as enlightened (most are not).
Everything in Maya is or at least should be a rig. (the auxiliary to this is that your mel should strive to build a rig or you should use API to make nodes)
Method 2: (when history is cleared)
Chunk your result into pieces. Then randomize the chunk positions with L-system like rule based generator. It may seem counter intuitive to use a L-system with hand modeled pieces when the entire thing could be generated with a L-system.
But clunking allows for a very simple l-system to be built. The end result also retains the artistic integrity of the original in some ways that might be much more pleasing.
A slightly different version of this and method 3 is to make 2-3 overlapping set of branches in the model and then randomly delete branches for variation.
Method 3: (when history is cleared)
Cheat, well i would actually argue that in CG there's no such thing as cheating. Just copy the same tree about, alter its rotation ans scale. And randomize new shaders to your model (swap different leaf textures different color etc). When combined with even slight amount of rigginng approach can totally hide the fact that there is just one tree. Copies dont have to be all that exact they might be shaped with a lattice or something. This is actually pretty efficient, most people wouldn't notice.
Method 4: (when history is not cleared)
When you model something manually maya records this quite efficiently. The history can be turned into a script with little or no effort. The best kept secret. If you save this kind of scene as maya ascii then the maya ascii file is (almost) just mel and you can repeat this by adding variables to the stream to instrument it to mel. Beware tough that not all tools build meaningful history so point tweaks should be done on clusters instead on manual point tweaking that ends up in the shapes tweak array.
It is also possible to build the mel automatically with a mel script that checks for connections and values in nodes to reproduce the thing in code. The good thing is that you can readily introduce the instrumentation. Here's a really old version of code as giveaway (you need to handle the shaders and inputComponents for example):
/* jooConvertNodeNetworkToMelLadder.mel 0.0
Authors: Janne 'Joojaa' Ojala
Testing: Has known bugs, code deprecated, no bug reporting
Licence: Creative Commons Attribution-ShareAlike 3.0 Unported
About:
A deprecated early version of code generation it has some bugs and
is nowhere near perfect but you can have this code for the heck of
it. Main thing missing is that it does not check for selection
sets for nodes so you need to make those manually.
Install Instructions:
copy jooConvertNodeNetworkToMelLadder.mel to your maya script
directory
Usage:
Select node and run jooConvertNodeNetworkToMelLadder from mel
commandline.
Deprecated at 31.12.2006
*/
proc string generateAttrib(string $node, string $createNode,
string $attrib,string $varName){
string $return="";
if (getAttr($node + "." + $attrib) !=
getAttr($createNode + "." + $attrib))
$return = (" setAttr (" +
$varName + "+\"." + $attrib+"\") "+
getAttr($node + "." + $attrib) + ";\n");
return $return;
}
proc string doOneNode(string $nodeOrig,string $connections[]){
string $hist[]=`listConnections -c 1 -p 1 $nodeOrig`;
string $return="";
$type=`nodeType $nodeOrig`;
$varName=("$"+$nodeOrig);
$createNode=`createNode $type`;
$return=(" "+$varName+"=`createNode -n "+$nodeOrig+" "+$type+"`;\n");
if ($type != "mesh") {
for ($attrib in `listAttr -settable -multi -w -scalar $createNode`){
$return += generateAttrib($nodeOrig, $createNode,
$attrib, $varName);
}
}
delete $createNode;
for ($i=0;$i< size($hist);$i=$i+2){
if (size(`connectionInfo -dfs $hist[$i]`)){
string $node,$port,$type,$nodeOrig,$portOrig;
$node=`match "[^.]*" $hist[$i+1]`;
$port=`match "[.].*$" $hist[$i+1]`;
$nodeOrig=`match "[^.]*" $hist[$i]`;
$portOrig=`match "[.].*$" $hist[$i]`;
$connections[size($connections)]=
(" connectAttr -f ($" +
$nodeOrig + "+\"" + $portOrig + "\") ($" +
$node+"+\""+$port+"\");");
}
}
return $return;
}
global proc jooConvertNodeNetworkToMelLadder()
{
print "\n// jooConvertNodeNetworkToMelLadder result:\n{\n";
string $a[]={};
for ($node in `listHistory`)
print(`doOneNode $node $a`);
print $a;
print "}\n";
}
But this is all I have time for, hope this helps.

Automatically Generate C Code From Header

I want to generate empty implementations of procedures defined in a header file. Ideally they should return NULL for pointers, 0 for integers, etc, and, in an ideal world, also print to stderr which function was called.
The motivation for this is the need to implement a wrapper that adapts a subset of a complex, existing API (the header file) to another library. Only a small number of the procedures in the API need to be delegated, but it's not clear which ones. So I hope to use an iterative approach, where I run against this auto-generated wrapper, see what is called, implement that with delegation, and repeat.
I've see Automatically generate C++ file from header? but the answers appear to be C++ specific.
So, for people that need the question spelled out in simple terms, how can I automate the generation of such an implementation given the header file? I would prefer an existing tool - my current best guess at a simple solution is using pycparser.
update Thanks guys. Both good answers. Also posted my current hack.
so, i'm going to mark the ea suggestion as the "answer" because i think it's probably the best idea in general. although i think that the cmock suggestion would work very well in tdd approach where the library development was driven by test failures, and i may end up trying that. but for now, i need a quicker + dirtier approach that works in an interactive way (the library in question is a dynamically loaded plugin for another, interactive, program, and i am trying to reverse engineer the sequence of api calls...)
so what i ended up doing was writing a python script that calls pycparse. i'll include it here in case it helps others, but it is not at all general (assumes all functions return int, for example, and has a hack to avoid func defs inside typedefs).
from pycparser import parse_file
from pycparser.c_ast import NodeVisitor
class AncestorVisitor(NodeVisitor):
def __init__(self):
self.current = None
self.ancestors = []
def visit(self, node):
if self.current:
self.ancestors.append(self.current)
self.current = node
try:
return super(AncestorVisitor, self).visit(node)
finally:
if self.ancestors:
self.ancestors.pop(-1)
class FunctionVisitor(AncestorVisitor):
def visit_FuncDecl(self, node):
if len(self.ancestors) < 3: # avoid typedefs
print node.type.type.names[0], node.type.declname, '(',
first = True
for param in node.args.params:
if first: first = False
else: print ',',
print param.type.type.names[0], param.type.declname,
print ')'
print '{fprintf(stderr, "%s\\n"); return 0;}' % node.type.declname
print '#include "myheader.h"'
print '#include <stdio.h>'
ast = parse_file('myheader.h', use_cpp=True)
FunctionVisitor().visit(ast)
UML modeling tools are capable of generating default implementation in the language of choice. Generally there is also a support for importing source code (including C headers). You can try to import your headers and generate source code from them. I personally have experience with Enterprise Architect and it supports both of these operations.
Caveat: this is an unresearched answer as I haven't had any experience with it myself.
I think you might have some luck with a mocking framework designed for unit testing. An example of such a framework is: cmock
The project page suggests it will generate code from a header. You could then take the code and tweak it.

Easy way to plot and display arrays?

First post here. Using C in Visual Studio 2008. Can work with VS 2005 if necessary.
How do I display numerical data in arrays as in a spreadsheet?
How do I plot numerical data in arrays?
These seem to be simple questions. But I cannot find solutions. So far, I would print the data to a file, import into Excel and view/plot. However, with this code there are too many arrays--so the print/import/plot is tiring.
Some constraints.
I do not want to write 20+ lines of code to do the above. MATFOR or Array Visualizer let you do the plotting with a one line function call.
They cannot display the data in a convenient format. I would like to display the data and the plot in one or two windows so that they are visible simultaneously.
This is a win32 console application---all the code is portable.
Will be using these during debugging.
Free or paid.
While I am looking for something specific, the requirements are substantially the same for any one doing numerical work with arrays and matrices--displaying data and plot simultaneously.
I am hoping that a such a tool has been written and is available.
I am also open to a solution that outputs the array data to an Excel sheet (can keep Excel open) and if it can also plot that can be great but I can live without plotting.
PS: I need this only when debugging the code.
I use ArrayDebugView which is a plug-in you install in Visual studio and draws graphs out of arrays while you are debugging your application. It works as a visual way of variable watch in debug mode. You don't need to write a line of code.
I can't think of any library that would enable what you want in a console app in less than 20 lines of code. My suggestion would be instead to script the plotting-step using MATLAB og GNU Octave to do the actual plotting.
In order to display numerical data in array, you should add the pointer to the first data element you want to observe, into the watch --- if you want to observe the array from the beginning, it would just be the array name, which is the pointer to the first element. In order to view more then one element, you add a "," after the pointer, followed by the number of element you want to observe.
For example, in order to observe the elements of float farray[100];, you should add to the watch farray,100.
In order to plot, you can copy-paste from the watch to your plotting software (i.e. excel), but it is not very convenient as you cannot copy the data column alone, but the columns to the left and right as well, so it involves extra manual editing.
I use GNUPlot (http://www.gnuplot.info/) to display my performance/speedup measurements.
I print my numbers to stdout and wrote a bash script that combines these numbers and calls gnuplot for rendering.
I made a simple plotting program for that purpose. There is only a textbox where I paste the data and a chart where it's drawn.
The data needs to be in either form:
with an automatic X (increment by 1 for each value): seriesName value
for both X and Y specified: seriesName xvalue yvalue
Most of the time I used to plot data from tracepoints.
I copy/paste the whole output window of VS, the plotting program ignores anything that doesn't follow these 2 forms (so I don't have to cleanup the string and put it in excel and all).
It does line, point, colum, area charts and save image, copy to clipboard.
MiniPlot
There are several ways to do this but this will require writing some code. Visualizing data is generally easy and straight forward but visualizing data exactly the way you want them to look will require some additional code and work.
There are several options to visualize data:
A combination of BASH and GNUPLOT
Use MATLAB or OCTAVE for all your calculations and visualization
Use PYTHON and SciPy and matlibplot libraries.
Gnuplot is a great tool to plot data but it is cumbersome to use. It looks fabulous if you invest time to get the plots right and combines excellent with LaTeX and has a good fit implementation for arbitrary functions. Visit http://gnuplot-tricks.blogspot.ch/ great site to learn all about gnuplot.
Numerical programs such as MATLAB and it's open source equivalent OCTAVE are great because they are fast implementation languages for numerical programs and have extensive additional libraries especially MATLAB. For high load numerical computing it is really slow and the plot library is only good for basic plotting needs.
Using PYTHON and its scientific programing libraries (SciPy and matlibplot) are a great combination. This allows excellent plot which are not as cryptic as gnuplot to plrogram and it is more flexible than MATLAB in plotting. Additionally it gives you a environment for numerical programing like MATLAB.

Resources