What are the options for getting data into JS/Angular from an Impala query within a Zeppelin note? - apache-zeppelin

I'm currently getting data from an Impala query into JavaScript/Angular within a Zeppelin note by using a %angular paragraph that makes an AJAX call to the Zeppelin Notebook REST API (the "run a paragraph synchronously" call, documented at https://zeppelin.apache.org/docs/0.8.2/usage/rest_api/notebook.html#run-a-paragraph-synchronously). This runs another paragraph in the same note, which is set to %impala and contains the query. Going via the API means I get the data back into JavaScript as the REST response to the API call. (That call seems to be the only one in the whole API that will run a paragraph and return data from it.)
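For reference, the call in question is just an HTTP POST against the note's run-paragraph-synchronously endpoint. Below is a minimal sketch of that same call, written in Java purely for illustration (in the note itself it's a browser-side AJAX call); the host, port and IDs are placeholders:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class RunParagraphSync {
        public static void main(String[] args) throws Exception {
            // Placeholder note and paragraph IDs - substitute your own.
            String noteId = "2A94M5J1Z";
            String paragraphId = "20150210-015259_1403135953";
            String url = "http://zeppelin-host:8080/api/notebook/run/"
                    + noteId + "/" + paragraphId;

            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(url))
                    .POST(HttpRequest.BodyPublishers.noBody())
                    .build();

            // The response body is JSON containing the paragraph's results.
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }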
However, it's been proving a bit unreliable in our big corporate setting for various reasons (network- and policy-related, etc.). My question is: are there ANY other ways, within Zeppelin, to get data from a query (on a huge database) into JavaScript? The data is often a few tens of KB but could potentially be multi-MB.
For example, I know I can run pyspark .sql("My query here") type queries in a %pyspark paragraph, but how do I then get the data over to the JS running in the %angular paragraph? Is the only option the same API call? I know of variable binding via the z (ZeppelinContext) object, but I'm not sure it can cope with larger data sizes.
Any suggestions very welcome! Thanks

Related

React: return WAV audio blob to backend, Streamlit custom component

In the past, I built a custom component (React-based) for the Streamlit framework which lets the user record audio inside a web browser of choice. Please have a look at the current version of streamlit-audio-recorder here.
As a mediocre web developer, I have not succeeded in converting the audio data stored in the browser's cache (an audio-blob object) so that I can return it to the Streamlit backend.
What I have tried so far & my thought process:
There exist various scripts that enable saving audio to a local disk. However, none of these solutions works in an online-deployed scenario (the program would save to the server's disk instead of the user's). This is why I came to the conclusion that this issue requires a solution that uses the audio data stored in the user's browser cache after recording.
The data stored in this cache in the audio-blob format cannot be passed directly back to Python as a return variable and needs to be converted to an "environment agnostic datatype" (I tried base64-encoded binary). This conversion became prohibitively slow as the length of the audio data grew. I therefore considered splitting the audio blob into slices, which could then be converted, aggregated and returned to Python. However, I was not able to implement this splitting and concatenation of WAV audio blobs, due to the data structure/metadata inside the WAV file and the lack of libraries for slicing audio blobs.
Does somebody know of a more elegant and performant solution? It would make it possible to finalize the audio-recorder component and provide immense value to the Streamlit community, which currently lacks comparable functionality.
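Not a full answer, but one observation on the slicing problem: a canonical PCM WAV is just a fixed 44-byte RIFF header followed by raw sample data, so slices can in principle be concatenated by keeping one header, appending the raw sample bytes, and patching the two size fields. A rough sketch of that idea, in Java purely for illustration (it assumes canonical 44-byte headers and identical sample formats in every slice, which browser recorders do not guarantee):

    import java.io.ByteArrayOutputStream;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.util.List;

    // Illustrative helper: concatenates canonical PCM WAV files (44-byte headers)
    // by keeping the first header, appending raw sample data, and patching sizes.
    public final class WavConcat {
        private static final int HEADER_SIZE = 44;

        public static byte[] concat(List<byte[]> wavs) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            out.write(wavs.get(0), 0, HEADER_SIZE);              // keep first header
            for (byte[] wav : wavs) {
                out.write(wav, HEADER_SIZE, wav.length - HEADER_SIZE); // raw PCM only
            }
            byte[] result = out.toByteArray();
            ByteBuffer buf = ByteBuffer.wrap(result).order(ByteOrder.LITTLE_ENDIAN);
            buf.putInt(4, result.length - 8);                    // RIFF chunk size
            buf.putInt(40, result.length - HEADER_SIZE);         // data chunk size
            return result;
        }
    }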

Will Gatling actually perform the operation or will it only check the URLs' response time?

I have a Gatling test for an application that answers a survey; upon answering the survey, the application identifies possible answers that may pose a risk and creates what we call riskareas. These riskareas are normally created in the background as soon as the survey answering is finished. I have a Gatling test with ten users who answer the survey and log out, and I used the recorder to record the test. Now, after these ten users are finished, I do not see any riskareas being created in the application. Am I missing something? Should the survey really be answered by the Gatling user (like it is in Selenium), or is it just the URLs that the Gatling test will touch?
I am new to Gatling, please help.
Gatling should be indistinguishable from a user in a web browser (or Selenium) as far as the server is concerned, so the end result should be exactly the same as if you'd gone through the process yourself. However, writing a Gatling script is a little more work than writing a Selenium script.
For performance reasons, Gatling operates at a lower level than Selenium. Gatling works with the actual data that is sent to and received from the server (i.e., the actual GETs and POSTs), rather than with user-level interactions (such as clicking links and filling in forms).
The recorder will generally produce a relatively "dumb" script. It records the exact data that was sent to the server and makes no attempt to account for things that may change from run to run. For example, the web application you are testing might have hidden form fields that contain session information, or the link addresses might contain a unique identifier or a session id.
This means that your script may not be doing what you think it's doing.
To debug the script, the first thing to do is to add checks on each of the requests, to validate that you are getting the response you expect (for example, check that when you submit page 1 of the survey, you are taken to page 2 - check for something that you'd only expect to find on page 2, like a specific question).
Once you know which requests are failing, look at what data was sent with the request, and try to figure out where it came from. You will probably find that there are session ids, view state, or similar, that must be extracted from the previous page.
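To illustrate, here is roughly what those two steps (a content check, and extracting a hidden field so it can be replayed) look like. This sketch uses Gatling's newer Java DSL rather than the Scala DSL, and the URLs, field names and expected text are placeholders:

    import static io.gatling.javaapi.core.CoreDsl.*;
    import static io.gatling.javaapi.http.HttpDsl.*;

    import io.gatling.javaapi.core.*;
    import io.gatling.javaapi.http.*;

    public class SurveySimulation extends Simulation {

        HttpProtocolBuilder httpProtocol = http.baseUrl("https://survey.example.com");

        ScenarioBuilder scn = scenario("Answer survey")
            // Load page 1 and capture the hidden session/view-state field.
            .exec(http("get page 1")
                .get("/survey/page1")
                .check(css("input[name='viewState']", "value").saveAs("viewState")))
            // Submit page 1, replaying the captured value, and verify that
            // the response really is page 2.
            .exec(http("submit page 1")
                .post("/survey/page1")
                .formParam("viewState", "#{viewState}")
                .formParam("question1", "yes")
                .check(substring("Question 2")));

        {
            setUp(scn.injectOpen(atOnceUsers(10))).protocols(httpProtocol);
        }
    }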
It will help to enable request and response logging, as per the documentation.
To simplify testing of web apps, we wrote some helper functions to allow tests to be written in a more Selenium-like way. Once you understand what your application is doing, you may find that it simplifies scripting for you too. However, understanding why your current script doesn't work the way you expect should be your first step.

In Solr, what is the use of BinaryResponseParser?

I have to use BinaryResponseParser in my application, but I don't know what it is used for. Searching the web, I found one piece of information: "A BinaryResponseParser that sends callback events rather then build a large response".
What are the callback events in the response? Can anyone clearly explain what a callback event is and how it is used in Solr?
If I do not use BinaryResponseParser in my application, what will be the effect?
BinaryResponseParser is a parser for Solr responses serialized in a binary format, as opposed to JSON or XML. The class has one method that can be of use to you (v. 3.6.x):
public NamedList<Object> processResponse(InputStream body, String encoding)
The advantage of using binary serialization is that your response size will be much smaller. This can be critical for the performance of live IR systems like Lucene/Solr - just imagine an auto-suggest service, which has to provide a list of suggestions for every keystroke the user types. The response must be delivered to the user below 100ms. If you don't use binary encoding, your response will be larger and consequently take a bit longer to transfer over HTTP.
I suggest you take a look at SolrJ - a Java client for Solr that will possibly solve most of your problems.
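As a rough sketch of what that looks like in practice (using SolrJ 3.6-era class names; later versions renamed HttpSolrServer to HttpSolrClient) - note that SolrJ already defaults to the binary javabin format, so setting the parser explicitly is shown only for illustration:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.BinaryResponseParser;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class SolrBinaryExample {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");

            // Parse responses with the binary (javabin) parser; this is
            // already SolrJ's default, shown explicitly for illustration.
            server.setParser(new BinaryResponseParser());

            QueryResponse response = server.query(new SolrQuery("ipod"));
            System.out.println(response.getResults().getNumFound() + " documents found");
        }
    }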

Large server-side data: JSON vs XML vs JSTL EL [closed]

I'm working on a web application where many queries have to be performed against a MySQL database. I use Java + Tomcat as my development environment.
I have tried different ways to solve this:
At the beginning I used JSTL with the sql tag. Every JSP document created this way seems 'dirty', with many SQL queries embedded via the JSTL sql tag, and you have to change every document when the DB changes.
The second solution I tried was using lists and maps in servlets to access the data from JSTL EL. It looks fine, but it seems a bit strange (at least to me) to have to access each document with a custom URL (first you manipulate the data in several classes until you set the request attributes and call the JSP document's RequestDispatcher), and it can sometimes fail (missing queries needed to fill some menus or tables) if the user jumps from here to there in the navigation sequence.
The final solution I'm trying now is sending and receiving JSON data to/from the server side. I like it (everything happens in the same URL context, the HTML document stays clean, the web is more dynamic...), but JSON data has to be processed with JavaScript on the client side, which could hurt performance for large data chunks, and it may have size limitations due to the string format. For example, JSON works fine for data splitting using the DataTables JS library, where I only fetch a maximum (I set this parameter) of 200 rows at a time from the DB. But it slows the web down when I don't perform this data splitting and instead show, for example, a multiple combobox or a table with all the rows of a MySQL table.
Some years ago I developed a desktop app (C#) with embedded Flash for navigation and XML for data exchange between the DB and the app. But I think XML is better for exporting data among different apps; I don't need more files with partial data if I already have a DB.
So, guys, what do you think is the best solution? I would like to check different points of view.
Dump JSP tag libraries and switch to JSON.
I often see development teams using JSP tag libraries when they shouldn't be. I wrote this post to explain why it's best to view JSP tag libraries as relics of the past.
Most non-trivial web applications store data in a database on the server side. These applications need mechanisms that allow the clients (web browsers) and servers (e.g. Java application servers) to exchange data. Typically, either a) data needs to be displayed for the user (the client sends the lookup criteria to the server and the server responds with the relevant data), or b) the user changes data in the browser and the client needs to submit the data modification to the server for processing and/or permanent storage.
Until recently, most Java web applications used JSP tag libraries as a server-side mechanism to extract data out of Java objects (JavaBeans) and render it into the pages sent to the browser, as part of the JSP/servlet paradigm offered by Java. (Note: JSPs are HTML files that get converted into Java servlets so that they can contain Java code for manipulating server-side Java objects.) In each case, the server responds with a new page (with embedded data), also known as a full page refresh.
In 2005, AJAX came along and changed the full-page-refresh paradigm described above. AJAX allows for partial page refreshes and data exchanges between the browser and the app server without having to do full page refreshes. Since then, AJAX has continuously gained momentum, with support built into popular frameworks like Spring (for Java) in v3.0/2010 and jQuery (for JavaScript) in v1.5/2011.
The data exchange format that works best with AJAX is JSON, since JSP tag libraries cannot be invoked unless a full page refresh is involved. There are several options for mapping between the server-side Java model objects (JavaBeans) and JSON, which can easily be consumed by JavaScript running in the browser. (Note: since JSON is the literal representation of a JavaScript object, the conversion from JSON to JavaScript object is trivial.) The option I recommend and have been using is Spring MVC's @RequestBody and @ResponseBody annotations as part of the controller method definitions (which leverage the Jackson library) to automatically map JavaBeans to JSON and back (see figures 3 and 4). (The alternative is to use a proprietary framework like Direct Web Remoting, or DWR, which I do not recommend for obvious reasons.)
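To make that concrete, here is a minimal sketch of the Spring MVC approach; the Customer bean and URL are hypothetical, and Jackson must be on the classpath for the automatic JSON mapping:

    import org.springframework.stereotype.Controller;
    import org.springframework.web.bind.annotation.*;

    @Controller
    public class CustomerController {

        // Jackson deserializes the incoming JSON body into a Customer bean ...
        @RequestMapping(value = "/customers", method = RequestMethod.POST)
        @ResponseBody // ... and serializes the returned bean back into JSON.
        public Customer create(@RequestBody Customer customer) {
            // Persist the customer here (service/DAO call), then echo it back.
            return customer;
        }
    }

    // A plain JavaBean (in its own file) that Jackson maps to/from JSON.
    public class Customer {
        private String name;
        private String email;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getEmail() { return email; }
        public void setEmail(String email) { this.email = email; }
    }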
As a result, I recommend to most teams I consult with that it's best to abandon JSP tag libraries entirely in favor of a pure AJAX/JSON based approach.
Here's a summary of the reasoning behind my recommendation to use AJAX/JSON exclusively (even for full page refreshes).
Unless you have a very simple application, you will likely need to support partial page refreshes using AJAX (rather than doing a full page refresh each time some data needs to change on the page). To do so, you need to map between Java objects (JavaBeans) and JavaScript objects (JSON) in order to exchange data between the browser/client and the application server. Therefore, it probably doesn't make much sense to support two channels for data exchange (JSP tag libraries for full page refreshes and AJAX/JSON for partial page refreshes). And if you have to pick one, it has to be AJAX/JSON, since JSP tag libraries don't work for partial page refreshes. Hence my recommendation to go head first with AJAX/JSON and abandon JSP tag libraries. But if you need more incentive, please read on.
I have worked with teams that have analyzed the size of the data being shuttled back and forth across the network and found that JSON consumes a lot less network bandwidth than the JavaBeans/JSP tag library approach or even XML payloads. Their analysis seems to make sense to me since JSON is a bare bones pure text format without the syntactical overhead involved with XML or the rich object overhead involved with JavaBeans.
Relative to the acrobatics required for manipulating JavaBeans using JSP tag libraries (see figure 1), the JavaBeans-to-JSON mapping is completely seamless with Spring MVC and requires no coding whatsoever (see figure 2). Whether or not you're using JSP tag libraries, chances are that you need to populate JavaScript objects with the data in order for it to be consumed by jQuery widgets. In other words, the JavaScript object(s) are required regardless of whether you use JSP tag libraries or not. Abandoning JSP tag libraries allows you to skip step 2 (see figures) and go straight to JSON and the corresponding JavaScript object(s), without having to muddle through the manipulation of JavaBean objects using JSP tag libraries.

Writing data into a database using a fully RESTful web service

How would one create a REST web service to write a row into a database table? Use the following scenario:
The table is called Customer; the data to be inserted into the row would be the name, address, telephone number, and email.
I think it's impossible to describe the whole thing end to end in Java or C#, and I would never expect that, but here are the questions popping into my head as I prepare for coding:
How would the URI look (e.g. if you use this URL: http://www.example.com/)?
What info would go into the HTTP envelope?
Would I use POST when writing to the database in this way?
Do I use a resource to store the posted data from the client? Is this even necessary if the data is being written to a database anyway?
When the data to be written into the DB is received by the server, how do I physically insert it into the database? Do I call some method on the server to actually write the data (in Java)? This doesn't seem to fit with a truly REST architecture, which shuns RPC calls.
Should I even bother writing to a DB? Should I be storing my data as a resource?
As you can see I need a few issues clearing in my head. Any help much appreciated.
First of all, I'm neither a Java nor a C# expert and I don't know exactly what means these languages offer to support REST design, but in general:
http://www.example.com/customers - customers is a collection of resources and you want to add a new resource to this collection
It depends on various things - you should probably set the content-type header (according to the data format in which you are sending the representation) and set some authentication headers if you need it.
Yes, you always use POST to create a new entry in a collection of resources.
I don't fully understand this question, to be honest. What do you mean by "immediately writing data into the database"?
REST is primarily just a style of communication between a server and a client. It doesn't say anything about how you should handle the data received by using it. The usual way modern web approaches (MVC-style frameworks) solve it is by routing every REST action to a method of some class (usually a controller instance), where you handle the received parameters (e.g. save them to the database) and generate a response to be sent back.
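As an illustrative sketch of that routing, here is what a POST handler for the /customers collection could look like using JAX-RS (Spring MVC or similar frameworks work the same way); the Customer bean and CustomerDao are hypothetical:

    import java.net.URI;
    import javax.ws.rs.Consumes;
    import javax.ws.rs.POST;
    import javax.ws.rs.Path;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    @Path("/customers")
    public class CustomerResource {

        @POST
        @Consumes(MediaType.APPLICATION_JSON)
        public Response create(Customer customer) {
            // The framework has already deserialized the JSON body into the bean;
            // this is where your persistence code runs (CustomerDao is hypothetical).
            long id = CustomerDao.insert(customer);

            // 201 Created plus a Location header pointing at the new resource.
            return Response.created(URI.create("/customers/" + id)).build();
        }
    }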
For a very brief and very clear introduction to REST have a look at this short video.
RESTful Web Services, published by O'Reilly and Associates, seems to fit the bill for what you're looking for.
As far as doing it in Java, Sun has a page on it.
