Large server-side data: JSON vs XML vs JSTL EL

I'm working on a web application that has to run many queries against a MySQL database. I use Java + Tomcat as the development environment.
I have tried different ways to solve this:
At the beginning I used JSTL with the SQL tag. Every JSP document created this way looks 'dirty', full of SQL queries embedded in JSTL tags, and you have to change every document when the DB changes.
The second solution I tried was building lists and maps in servlets and accessing them from JSTL EL. It works, but it seems a bit strange (at least to me) to have to reach each document through a custom URL: first you manipulate the data in several classes, then you set the request attributes and forward to the JSP through a RequestDispatcher. It can also fail (queries needed to fill some menus or tables go missing) if the user jumps around in the navigation sequence.
The final solution I'm trying now is sending and receiving JSON data to and from the server side. I like it (everything happens under the same URL, the HTML document stays clean, the site feels more dynamic...), but the JSON has to be processed with JavaScript on the client side, which can hurt performance for large data chunks and may hit size limits because of the string format. For example, JSON works fine for data splitting with the DataTables JS library, where I fetch at most 200 rows at a time (I set that parameter) from the DB. But the page slows down when I skip that splitting and show, for example, a multi-select combobox or a table with every row of a MySQL table.
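Roughly, the server side of that splitting looks like the sketch below (a simplified stand-in rather than my actual code; the servlet path, table and column names, and the JNDI resource are made up). The servlet reads the paging parameters that DataTables sends, runs a LIMIT query, and writes the slice back as JSON with Jackson:

import com.fasterxml.jackson.databind.ObjectMapper;
import javax.annotation.Resource;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.*;
import javax.sql.DataSource;
import java.io.IOException;
import java.sql.*;
import java.util.*;

// Sketch of server-side paging for DataTables: return only one slice of rows as JSON.
@WebServlet("/products.json")
public class ProductPageServlet extends HttpServlet {

    @Resource(name = "jdbc/mydb")                  // assumed Tomcat JNDI DataSource
    private DataSource dataSource;

    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        int start  = Integer.parseInt(req.getParameter("start"));    // row offset sent by DataTables
        int length = Integer.parseInt(req.getParameter("length"));   // page size, e.g. 200

        List<Map<String, Object>> rows = new ArrayList<>();
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT id, name, price FROM product ORDER BY id LIMIT ? OFFSET ?")) {
            ps.setInt(1, length);
            ps.setInt(2, start);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    Map<String, Object> row = new LinkedHashMap<>();
                    row.put("id", rs.getLong("id"));
                    row.put("name", rs.getString("name"));
                    row.put("price", rs.getBigDecimal("price"));
                    rows.add(row);
                }
            }
        } catch (SQLException e) {
            throw new IOException(e);
        }

        resp.setContentType("application/json;charset=UTF-8");
        mapper.writeValue(resp.getWriter(), Collections.singletonMap("data", rows));
    }
}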
Some years ago I developed a desktop app (C#) with embedded Flash for navigation and XML for data exchange between the DB and the app. But I think XML is better suited to exporting data between different applications; I don't need more files with partial data if I already have a DB.
So, what do you think is the best solution? I would like to hear different points of view.

Dump JSP tag libraries and switch to JSON.
I often see development teams using JSP tag libraries when they shouldn't be. I wrote this post to explain why it's best to view JSP tag libraries as relics of the past.
Most non-trivial web applications store data in a database on the server side. These applications need mechanisms that allow the clients (web browsers) and servers (e.g. Java application servers) to exchange data. Typically, either a) data needs to be displayed for the user (so the client sends the lookup criteria to the server and the server responds with the relevant data) or b) the user changes data in the browser and the client needs to submit the data modification to the server for processing and/or permanent storage.
Until recently, most Java web applications have used JSP tag libraries as the view-layer mechanism for extracting data out of Java objects (JavaBeans) on the server and rendering it into the HTML sent to the browser, as part of the JSP/servlet paradigm offered by Java. (Note: JSPs are HTML files that get converted into Java servlets so that they can contain Java code for manipulating server-side Java objects.) In each case, the server responds with a new page (with embedded data), also known as a full page refresh.
Around 2005, AJAX came along and changed the full-page-refresh paradigm described above. AJAX allows for partial page refreshes and data exchanges between the browser and the app server without having to do full page refreshes. Since then, AJAX has continuously gained momentum, with support built into popular frameworks like Spring (for Java) in v3.0/2010 and jQuery (for JavaScript) in v1.5/2011.
The data exchange format that works best with AJAX is JSON, since JSP tag libraries cannot be invoked unless a full page refresh is involved. There are several options for mapping between the server-side Java model objects (JavaBeans) and JSON, which can easily be consumed by JavaScript running in the browser. (Note: since JSON is the literal representation of a JavaScript object, the conversion from JSON to a JavaScript object is trivial.) The option I recommend and have been using is Spring MVC's @RequestBody and @ResponseBody annotations on the controller method definitions, which leverage the Jackson library to automatically map JavaBeans to JSON and back. (The alternative is to use a proprietary framework like Direct Web Remoting, or DWR, which I do not recommend for obvious reasons.)
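To make that concrete, a minimal controller along those lines might look like the sketch below (the Customer bean, the service behind it, and the /customers URL are placeholders, not code from a real project); with Jackson on the classpath, Spring maps the beans to JSON and back automatically:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.*;
import java.util.List;

// Sketch only: Customer and CustomerService are assumed application classes.
@Controller
public class CustomerController {

    private final CustomerService customerService;   // wraps the actual database access

    @Autowired
    public CustomerController(CustomerService customerService) {
        this.customerService = customerService;
    }

    // GET /customers -> Jackson serializes the returned beans into a JSON array
    @RequestMapping(value = "/customers", method = RequestMethod.GET)
    @ResponseBody
    public List<Customer> list() {
        return customerService.findAll();
    }

    // POST /customers -> Jackson turns the JSON request body into a Customer bean
    @RequestMapping(value = "/customers", method = RequestMethod.POST)
    @ResponseBody
    public Customer create(@RequestBody Customer customer) {
        return customerService.save(customer);
    }
}

On the browser side, jQuery's $.getJSON('/customers', ...) can then consume the response directly as JavaScript objects.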
As a result, I recommend to most teams I consult with that they abandon JSP tag libraries entirely in favor of a pure AJAX/JSON-based approach.
Here's a summary of the reasoning behind my recommendation to use AJAX/JSON exclusively (even for full page refreshes).
Unless you have a very simple application, you will likely need to support partial page refreshes using AJAX (rather than do a full page refresh each time some data needs to change on the page). To do so, you need to map between Java objects (JavaBeans) and JavaScript objects (JSON) in order to exchange data between the browser/client and the application server. Therefore, it probably doesn't make much sense to support two channels for data exchange (JSP tag libraries for full page refreshes and AJAX/JSON for partial page refreshes). And if you have to pick one, it has to be AJAX/JSON, since JSP tag libraries don't work for partial page refreshes. Hence my recommendation to go head first with AJAX/JSON and abandon JSP tag libraries. But if you need more incentive, please read on.
I have worked with teams that have analyzed the size of the data being shuttled back and forth across the network and found that JSON consumes a lot less network bandwidth than the JavaBeans/JSP tag library approach or even XML payloads. Their analysis seems to make sense to me since JSON is a bare bones pure text format without the syntactical overhead involved with XML or the rich object overhead involved with JavaBeans.
Relative to the acrobatics required to manipulate JavaBeans with JSP tag libraries, the JavaBeans-to-JSON mapping is completely seamless with Spring MVC and requires no coding whatsoever. Whether or not you're using JSP tag libraries, chances are you need to populate JavaScript objects with the data anyway so that it can be consumed by jQuery widgets. In other words, the JavaScript object(s) are required regardless of whether you use JSP tag libraries or not. Abandoning JSP tag libraries lets you skip the intermediate step and go straight to JSON and the corresponding JavaScript object(s), without having to muddle through manipulating JavaBean objects with JSP tag libraries.

Related

Consume data from local JSON file into website?

I work for a small publishing company with an internal website that displays a static HTML table of our published products.
We need to list and sort published products (about 1-2 items are published per day) dynamically, fed from an Excel spreadsheet. The spreadsheet is what we currently use to maintain the data, and it sits on a shared network drive available to the whole company.
I am familiar with AngularJS, ReactJS, and VueJS2 for front-end development and was wondering if I could use one of those tools to consume an Excel file, parse it to JSON, and then display it dynamically on the client side.
Is something like this possible?
When a user finishes editing the Excel sheet and saves it to the shared network drive, is there a script that would automatically save the data as JSON? I assume we would then simply have our JavaScript framework reference and consume the saved JSON to populate its published products list.
Note: We are unable to use a relational database at this time (ie MySQL).
Part 1 - generating JSON from Excel...
Front-end technologies are not the way to go here. You need to run a service that watches the folder for changes (e.g. in Node.js or Python). Saving as CSV instead of XLS might make things easier, since you may not need extra libraries to make sense of the XLS file.
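As a rough illustration of what such a watcher does (shown here in Java purely as a sketch, since this is the only stack discussed elsewhere in this thread; the Node.js or Python equivalent works the same way), it watches the folder and rewrites a JSON file whenever the exported CSV changes. The paths, file names, and column layout are assumptions:

import com.fasterxml.jackson.databind.ObjectMapper;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.*;

// Rough sketch: re-generate products.json whenever products.csv changes.
public class CsvToJsonWatcher {

    public static void main(String[] args) throws Exception {
        Path dir  = Paths.get("/shared-drive/published");   // assumed shared folder
        Path csv  = dir.resolve("products.csv");
        Path json = dir.resolve("products.json");

        WatchService watcher = FileSystems.getDefault().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY,
                              StandardWatchEventKinds.ENTRY_CREATE);

        while (true) {
            WatchKey key = watcher.take();                   // blocks until something changes
            for (WatchEvent<?> event : key.pollEvents()) {
                if (csv.getFileName().equals(event.context())) {
                    convert(csv, json);
                }
            }
            key.reset();
        }
    }

    // Naive CSV parsing: assumes a header row and no quoted commas.
    private static void convert(Path csv, Path json) throws Exception {
        List<String> lines = Files.readAllLines(csv, StandardCharsets.UTF_8);
        String[] headers = lines.get(0).split(",");
        List<Map<String, String>> rows = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] cells = line.split(",");
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < headers.length && i < cells.length; i++) {
                row.put(headers[i].trim(), cells[i].trim());
            }
            rows.add(row);
        }
        new ObjectMapper().writeValue(json.toFile(), rows);
    }
}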
Part 2 - displaying JSON data...
Your browser, by default, cannot load a local JSON file from the file system, so you may need to run a server (again, Node.js and Python make this relatively easy) to host the JSON file.
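Again only as a sketch of the idea (a small Node.js or Python static server does the same job), the JDK's built-in HTTP server can host the generated file so the browser can fetch it; the port and paths below are assumptions:

import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Paths;

// Minimal static server for the generated products.json.
public class JsonFileServer {

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/products.json", exchange -> {
            byte[] body = Files.readAllBytes(Paths.get("/shared-drive/published/products.json"));
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();   // the front-end can now fetch http://localhost:8080/products.json
    }
}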
There are many ways of presenting data these days, but without knowing your particulars, and based on the information you did share, it looks like you've got a steep learning curve to get something like this going.

react-engine vs other template engines

I am considering using PayPal's react-engine (https://github.com/paypal/react-engine), but I have some doubts:
What are the benefits over other template engines like Handlebars?
Why serve .jsx files, and not precompiled/transformed .js files? (The latter should be faster, because the server doesn't have to deal with the transformation.)
I have been researching this, but I keep getting confused.
Thanks
The main difference between react-engine and template engines only shows up once the user can interact with the page in the browser. Nevertheless, it also matters how each approach gets access to the data.
Assume we want to serve a simple web page: just scrolling and plain text. With a template engine like Handlebars.js, when the browser request hits the server, the server figures out how to respond. The template engine takes the site template file (head, meta, title, etc.) plus whatever data has been fetched from a local, accessible source and renders it into an HTML string. This HTML is then sent back to the browser and rendered.
react-engine uses the same server-side rendering mechanism. However, instead of a template-engine syntax, it uses JSX (or, if we want, plain JavaScript). JSX is therefore broader than template engines. A great article by Hajime Yamasaki Vukelic looks at the separation of concerns between JSX and HTML templates from a different angle:
With template engines, you feed the library a string (usually but not necessarily HTML), which is then converted into a piece of JavaScript code which generates virtual DOM structures when executed. At design time, templates are just strings, so we don't have direct access to the surrounding code. For instance, we can't import helper functions, or call methods. These things are only possible through abstractions called directives (and possibly other names depending on where you are coming from). Directives are the glue between the HTML and the JavaScript.
So far so good; there is no relevant difference between the two solutions. Links to the next or previous simple web page are just plain <a href="/webpage">Next</a> elements.
The moment we decide to implement some interactivity, react-engine becomes the winner. Because react-engine rendering does not require JavaScript to run on the client side, it also keeps the page crawlable and SEO-friendly.
Template engines also offer SEO support, but with more limitations: we cannot run all of the JavaScript on the server to render the HTML. Even libraries like jQuery require live access to the browser window object, which cannot easily be mocked on the server side. So template engines become less productive.
In conclusion, template engines can do much the same as react-engine rendering, maybe not as easily or as fast, but both tools are up to the job. You can also read another great thread on this topic.

Database vs static html files for storage solution (Site Builder)

Project Description:
As a learning exercise for ASP.NET MVC 4, I'm creating a site builder / multi-tenancy site. It's nothing too fancy, just WYSIWYG editing of templates with custom routing to direct users to the correct template based on the subdomain. So usr1.mysite.com is directed to the template edited by usr1. My main concern at the moment is my method of storing the edited templates.
Storage Dilemma:
At first I was simply going to make the templates into views and store the changes made by the user in the database. When usr1's template was displayed the system would pull up the view and populate it with usr1's data.
Instead I've implemented a system that takes the user's modified template and saves the whole thing as static HTML files in the file system. Only the path to usr1's site (and some other details) is saved in the database. When usr1.mysite.com is requested, a "content" controller retrieves the correct HTML file.
Question:
Is there any reason to choose the database/view method over the static html file method?
Also I'm not concerned with having dynamic content in the end user pages. This is one reason I even tried the file method.
Decision (EDIT):
I'm implementing the file method. After more research (verifying my previous research), I doubt the file system will have any trouble with even a few hundred sites. I will structure it so that user data directories are grouped into group directories based on a naming convention I've yet to dream up, probably something like 000usr1, 000usr2 inside a 000 group directory, with a goal of fewer than 100 files/folders in any given directory and fewer than 4 levels deep. That should give me the capability of holding 10000 sites. I have no plans of having any activity near that level with this software, but I do want to get it up and running, torture it for a while, and see what it's capable of handling. If anyone expresses any interest I'll post back some results.

Writing data into a database using a fully REST web service

How would one create a REST web service to write a row into a database table? Use the following scenario:
The table is called Customer - the data to be inserted into the row would be the name, address, telephone number, and email.
I think it's impossible to describe the whole thing end to end in Java or C#, and I would never expect that, but here are the questions popping into my head as I prepare for coding:
What would the URI look like (e.g. if you use this URL: http://www.example.com/)?
What info would go into the HTTP envelope?
Would I use POST when writing to the database in this way?
Do I use a resource to store the posted data from the client? Is this even necessary if the data is being written to a database anyway?
When the data to be written into the DB is received by the server, how do I physically insert it into the database? Do I call some method on the server to actually write the data (in Java)? That doesn't seem to fit with a truly REST architecture, which shuns RPC calls.
Should I even bother writing to a DB - should I be storing my data as a resource?
As you can see I need a few issues clearing in my head. Any help much appreciated.
First of all, I'm neither a Java nor a C# expert, and I don't know exactly what facilities these languages have to support REST design, but in general:
http://www.example.com/customers - customers is a collection of resources and you want to add a new resource to this collection
It depends on various things - you should probably set the Content-Type header (according to the data format in which you are sending the representation) and set some authentication headers if you need them.
Yes, you always use POST to create a new entry in a collection of resources.
I don't fully understand this question, to be honest. What do you mean by "immediately writing data into the database"?
REST is primarily just a style of communication between a server and a client. It doesn't say anything about how you should handle the data received through it. The usual way modern web frameworks (MVC-style) solve it is by routing every REST action to a method of some class (usually a controller instance), where you handle the received parameters (e.g. save them to the database) and generate a response to send back.
For a very brief and very clear introduction to REST have a look at this short video.
RESTful Web Services, published by O'Reilly and Associates, seems to fit the bill you're looking for.
As far as doing it in Java, Sun has a page on it.
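To make the POST-then-insert flow concrete, here is a rough sketch using JAX-RS and plain JDBC (the resource path, the Customer bean, the JNDI pool name, and the table layout are all illustrative assumptions, and a JSON provider such as Jackson is assumed to be registered):

import javax.annotation.Resource;
import javax.sql.DataSource;
import javax.ws.rs.*;
import javax.ws.rs.core.*;
import java.net.URI;
import java.sql.*;

// Sketch of a JAX-RS resource that accepts POST /customers and inserts a row.
// Customer is an assumed JavaBean with name/address/telephone/email properties.
@Path("/customers")
public class CustomerResource {

    @Resource(name = "jdbc/appDb")          // container-managed connection pool (assumed)
    private DataSource dataSource;

    @POST
    @Consumes(MediaType.APPLICATION_JSON)   // client sets Content-Type: application/json
    public Response create(Customer c) throws SQLException {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                 "INSERT INTO customer (name, address, telephone, email) VALUES (?, ?, ?, ?)",
                 Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, c.getName());
            ps.setString(2, c.getAddress());
            ps.setString(3, c.getTelephone());
            ps.setString(4, c.getEmail());
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                keys.next();
                long id = keys.getLong(1);
                // 201 Created plus the URI of the new resource is the conventional response
                return Response.created(URI.create("/customers/" + id)).build();
            }
        }
    }
}

The client would then send POST /customers with a Content-Type: application/json header and the customer fields in the body, and get back 201 Created with the URI of the new resource in the Location header.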

How do screen scrapers work?

I hear about people writing these programs all the time, and I know what they do, but how do they actually do it? I'm looking for general concepts.
Technically, screen scraping is any program that grabs the display data of another program and ingests it for its own use.
Quite often, screen scraping refers to a web client that parses the HTML pages of a targeted website to extract formatted data. This is done when a website does not offer an RSS feed or a REST API for accessing the data in a programmatic way.
One example of a library used for this purpose is Hpricot for Ruby, which is one of the better-architected HTML parsers used for screen scraping.
Lots of accurate answers here.
What nobody's said is don't do it!
Screen scraping is what you do when nobody's provided you with a reasonable machine-readable interface. It's hard to write, and brittle.
As an example, consider an RSS aggregator, then consider code that gets the same information by working through a normal human-oriented blog interface. Which one breaks when the blogger decides to change their layout?
Of course, sometimes you have no choice :(
In general a screen scraper is a program that captures output from a server program by mimicking the actions of a person sitting at a workstation using a browser or terminal-access program. At certain key points the program interprets the output and then takes an action or extracts certain amounts of information from it.
Originally this was done with character/terminal output from mainframes, for extracting data from or updating systems that were archaic or not directly accessible to the end user. In modern terms it usually means parsing the output of an HTTP request to extract data or take some other action. With the advent of web services this sort of thing should have died away, but not all apps provide a nice API to interact with.
A screen scraper downloads the HTML page, and pulls out the data of interest either by searching for known tokens or parsing it as XML or some such.
In the early days of PCs, screen scrapers would emulate a terminal (e.g. IBM 3270) and pretend to be a user in order to interactively extract and update information on the mainframe. In more recent times, the concept is applied to any application that provides an interface via web pages.
With the emergence of SOA, screen scraping is a convenient way to service-enable applications that aren't already. In those cases, scraping web pages is the more common approach taken.
Here's a tiny bit of screen scraping implemented in JavaScript, using jQuery (not a common choice, mind you, since scraping is usually a client-server activity):
// Show My SO Reputation Score
var repval = $('span.reputation-score:first');            // the reputation element on the page
alert('StackOverflow User "'
    + repval.prev().attr('href').split('/').pop()         // user name from the adjacent profile link
    + '" has (' + repval.html() + ') Reputation Points.');
If you run Firebug, copy the above code, paste it into the Console, and see it in action right here on this question page.
If SO changes the DOM structure / element class names / URI path conventions, all bets are off and it may not work any longer - that's the usual risk in screen scraping endeavors where there is no contract/understanding between parties (the scraper and the scrapee [yes I just invented a word]).
Typically you have an HTML page that contains some data you want. What you do is write a program that fetches that web page and attempts to extract the data. This can be done with XML parsers, but for simple applications I prefer to use regular expressions to match a specific spot in the HTML and extract the necessary data. It can be tricky to create a good regular expression, though, because the surrounding HTML appears multiple times in the document. You always want to match something unique, as close as you can to the data you need.
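A tiny sketch of that approach (the URL and the <title> pattern are only examples; a real scraper would anchor its pattern on the markup around the data it needs):

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Fetch a page and pull one value out of the HTML with a regular expression.
public class TitleScraper {

    public static void main(String[] args) throws IOException {
        String html;
        try (InputStream in = new URL("https://example.com/").openStream();
             Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name()).useDelimiter("\\A")) {
            html = scanner.hasNext() ? scanner.next() : "";
        }

        // Anchor the match on something unique near the data you want.
        Matcher m = Pattern.compile("<title>(.*?)</title>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE)
                           .matcher(html);
        System.out.println(m.find() ? m.group(1).trim() : "no title found");
    }
}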
Screen scraping is what you do when nobody's provided you with a reasonable machine-readable interface. It's hard to write, and brittle.
Not quite true. I don't think I'm exaggerating when I say that most developers do not have enough experience to write decent APIs. I've worked with screen-scraping companies, and the APIs are often so problematic (ranging from cryptic errors to bad results), and so often lack the full functionality the website provides, that it can be better to screen scrape (web scrape, if you will). The extranet/website portals are used by more customers/brokers than API clients and are thus better supported. In big companies, changes to extranet portals etc. are infrequent, usually because they were originally outsourced and are now just maintained. I'm referring more to screen scraping where the output is tailored, e.g. a flight on a particular route and time, an insurance quote, a shipping quote, etc.
In terms of doing it, it can be as simple as using a web client to pull the page contents into a string and then a series of regular expressions to extract the information you want.
string pageContents = new WebClient().DownloadString("http://www.stackoverflow.com");
int numberOfPosts = // regex match
Obviously in a large scale environment you'd be writing more robust code than the above.
A screen scraper downloads the HTML page, and pulls out the data of interest either by searching for known tokens or parsing it as XML or some such.
That is a cleaner approach than regex... in theory. In practice it's not quite as easy, given that most documents will need to be normalized to XHTML before you can XPath through them; in the end we found that fine-tuned regular expressions were more practical.
