I work for a small publishing company with an internal website that displays a static HTML table of our published products.
We need to be able to list and sort published products (about 1-2 items are published per day) dynamically, fed from an Excel spreadsheet, which is what we currently use to maintain the data. The spreadsheet lives on a shared network drive available to the whole company.
I am familiar with AngularJS, ReactJS, and VueJS 2 for front-end development and was wondering if I could use one of those tools to consume an Excel file, parse it to JSON, and then display it dynamically on the client side.
Is something like this possible?
When a user finishes editing the Excel sheet and saves it to the shared network drive, is there a script that would automatically save the data as JSON? I assume we would then simply have our JavaScript framework reference and consume the saved JSON to populate its published products list.
Note: We are unable to use a relational database (e.g., MySQL) at this time.
Part 1 - generating JSON from Excel...
Front-end technologies are not the way to go here. You need to run a service that watches the folder for changes (Node.js and Python both work well for this). Saving as CSV instead of .xlsx might make things easier, since you may not need an extra library to make sense of the Excel file.
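For example, here's a minimal Node.js sketch (it assumes the SheetJS xlsx package and illustrative file paths; fs.watch is also somewhat platform-dependent, so treat this as a starting point, not a finished tool):

// Watch the shared folder and re-export the spreadsheet as JSON on change.
const fs = require('fs');
const XLSX = require('xlsx'); // SheetJS

const XLS_PATH = '//shared-drive/products.xlsx'; // illustrative path
const JSON_PATH = 'products.json';

function convert() {
  const workbook = XLSX.readFile(XLS_PATH);
  const sheet = workbook.Sheets[workbook.SheetNames[0]];
  const rows = XLSX.utils.sheet_to_json(sheet); // one object per row
  fs.writeFileSync(JSON_PATH, JSON.stringify(rows, null, 2));
}

fs.watch('//shared-drive', (event, filename) => {
  if (filename === 'products.xlsx') convert();
});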
Part 2 - displaying JSON data...
Your browser, by default, cannot load a local JSON file from the file system, so you will need to run a server (again, Node.js and Python make this relatively easy) to host your JSON file.
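A bare-bones Node.js server for that could look roughly like this (the port and file name are illustrative):

// Serve the generated JSON over HTTP so the browser can fetch it.
const http = require('http');
const fs = require('fs');

http.createServer((req, res) => {
  if (req.url === '/products.json') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(fs.readFileSync('products.json'));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3000);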
There are many ways of presenting data these days, but without knowing more of your particulars, and based on the information you did share, it looks like you've got a steep learning curve to get something like this going.
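That said, since you already know React, the consuming side might look roughly like this (the title field is invented; adjust it to whatever your exported JSON actually contains):

// Fetch the generated JSON and render it as a simple product list.
function ProductList() {
  const [products, setProducts] = React.useState([]);
  React.useEffect(() => {
    fetch('/products.json')
      .then((res) => res.json())
      .then(setProducts);
  }, []);
  return (
    <ul>
      {products.map((p) => (
        <li key={p.title}>{p.title}</li>
      ))}
    </ul>
  );
}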
I was considering using PayPal's react-engine (https://github.com/paypal/react-engine), but I have some doubts:
What are the benefits over other template engines like Handlebars?
Why serve .jsx files, and not (precompiled/transformed) .js files? (The latter should be faster, because the server doesn't have to deal with the transformation.)
I have been researching this but keep getting confused.
Thanks
The main difference between react-engine and template engines only shows up once the page has to let the user interact with it in the browser. It also matters, though, how machines (such as search crawlers) get access to the data.
Assume we want to serve a simple web page: just scrollable, plain text information. With a template engine like Handlebars.js, when the browser's request hits the server, the server figures out how to respond and what to do. The template engine pulls the data it needs from a local, accessible source, loads everything defined in the site template file (head, meta, title, etc.), and renders the result to an HTML string. That HTML is then sent back to the browser and rendered.
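For instance, here is a minimal sketch of that flow with the handlebars npm package (the template and data are illustrative):

// Compile a template once, then render it with data into an HTML string.
const Handlebars = require('handlebars');

const template = Handlebars.compile('<h1>{{title}}</h1><p>{{body}}</p>');

// The data would normally come from a local, accessible source.
const html = template({ title: 'Hello', body: 'Just scrolling text.' });
// html is now a complete HTML string, ready to send back to the browser.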
react-engine, on the server side, uses the same rendering mechanism. However, instead of a template language, it uses JSX (or plain JavaScript, if we prefer). JSX is therefore broader than template engines. A great article by Hajime Yamasaki Vukelic examines the separation of concerns between JSX and HTML templates from a different angle:
"With template engines, you feed the library a string (usually but not necessarily HTML), which is then converted into a piece of JavaScript code which generates virtual DOM structures when executed. At design time, templates are just strings, so we don't have direct access to the surrounding code. For instance, we can't import helper functions, or call methods. These things are only possible through abstractions called directives (and possibly other names depending on where you are coming from). Directives are the glue between the HTML and the JavaScript."
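To make that concrete, here is a hypothetical JSX sketch (the helper and component names are invented for illustration, not taken from react-engine):

// In JSX, helpers are ordinary JavaScript imports, usable directly in markup.
import { formatPrice } from './helpers'; // hypothetical helper module

function ProductRow({ product }) {
  return <td>{formatPrice(product.price)}</td>; // no directive glue needed
}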
So far so good; there is no relevant difference between the two solutions. Links to the next or previous simple web page are just simple <a href="/webpage">Next</a> elements.
The moment we decide to implement some interaction, react-engine becomes the winner. Because react-engine renders on the server, the page does not require JavaScript to run on the client side, which also enables SEO: search engines see the fully rendered HTML.
Template engines can offer this SEO support too, but with more limitations. We cannot run arbitrary JavaScript on the server to render HTML; even libraries like jQuery require live access to the browser's window object, which cannot easily be mocked on the server side. So template engines become less productive here.
In conclusion, template engines can do much of what react-engine rendering does. Maybe not as easily or as fast, but both tools are up to the job. You can also read another great thread on this topic.
Project Description:
As a learning exercise for ASP.NET MVC 4, I'm creating a site builder / multi-tenancy site. It's nothing too fancy, just WYSIWYG editing of templates, with custom routing to direct users to the correct template based on subdomain. So usr1.mysite.com is directed to the template edited by usr1. My main concern at the moment is my method of storing the edited templates.
Storage Dilemma:
At first I was simply going to make the templates into views and store the changes made by the user in the database. When usr1's template was displayed the system would pull up the view and populate it with usr1's data.
Instead, I've implemented a system that takes the user's modified template and saves the whole thing as static HTML files in the file system. Only the path to usr1's site (and some other details) is saved in the database. When usr1.mysite.com is called, I just have a "content" controller retrieve the correct HTML file.
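Conceptually the controller does very little; sketched in Node/Express here just to keep the example short (my real code is ASP.NET MVC, and the path actually comes from the database rather than being derived like this):

// Resolve the subdomain to the stored static file and serve it.
const express = require('express');
const path = require('path');
const app = express();

app.get('/', (req, res) => {
  const user = req.hostname.split('.')[0]; // "usr1" from usr1.mysite.com
  res.sendFile(path.resolve('sites', user, 'index.html'));
});

app.listen(3000);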
Question:
Is there any reason to choose the database/view method over the static html file method?
Also, I'm not concerned with having dynamic content in the end-user pages; that's one reason I even tried the file method.
Decision (EDIT):
I'm implementing the file method. After more research (verifying my previous research), I doubt the file system will have trouble with even a few hundred sites. I will structure it so that user data directories are grouped into group directories based on a naming convention I've yet to dream up, probably something like 000usr1 and 000usr2 inside a 000 group directory, with a goal of fewer than 100 files/folders in any given directory and no more than 4 levels deep. That should give me the capacity to hold 10,000 sites. I have no plans for activity anywhere near that level with this software, but I do want to get it up and running, torture it for a while, and see what it's capable of handling. If anyone expresses any interest, I'll post back some results.
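For illustration, the grouping could work something like this (sketched in JavaScript for brevity; the hashing scheme and names are placeholders, not my final convention):

// Map a user name to a grouped site path so no directory gets too crowded.
const crypto = require('crypto');

function sitePath(userName) {
  const hash = crypto.createHash('md5').update(userName).digest('hex');
  const group = String(parseInt(hash.slice(0, 4), 16) % 100).padStart(3, '0');
  return 'sites/' + group + '/' + group + userName + '/index.html';
}

console.log(sitePath('usr1')); // e.g. sites/042/042usr1/index.html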
How would one create a REST web service to write a row into a database table? Use the following scenario:
The table is called Customer - the data to be inserted into the row would be the name, address, telephone number, and email.
I think it's impossible to describe the whole thing end to end in Java or C#, and I would never expect that, but here are the questions popping into my head as I prepare for coding:
How would the URI look (e.g., if you use this URL: http://www.example.com/)?
What info would go into the HTTP envelope?
Would I use POST when writing to the database in this way?
Do I use a resource to store the posted data from the client? Is this even necessary if the data is being written to a database anyway?
When the data to be written into the DB is received by the server, how do I physically insert it into the database? Do I call some method on the server to actually write the data (in Java)? That doesn't seem to fit with a truly REST architecture, which shuns RPC calls.
Should I even bother writing to a DB - should I be storing my data as a resource?
As you can see I need a few issues clearing in my head. Any help much appreciated.
First of all, I'm neither a Java nor a C# expert, and I don't know exactly what support these languages have for REST design, but in general:
http://www.example.com/customers - customers is a collection of resources, and you want to add a new resource to that collection.
It depends on various things - you should probably set the Content-Type header (according to the format in which you are sending the representation) and any authentication headers you need.
Yes, you always use POST to create a new entry in a collection of resources.
I don't fully understand this question, to be honest. What do you mean by "immediately writing data into the database"?
REST is primarily just a style of communication between a server and a client. It doesn't say anything about how you should handle the data received through it. The usual way modern web frameworks (MVC-style) solve this is by routing every REST action to a method on some class (usually a controller instance), where you handle the received parameters (e.g., save them to the database) and generate a response to send back.
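To tie that together, here is a minimal sketch in Node/Express (you mentioned Java and C#, but the shape is the same in any MVC-style framework; insertCustomer is a hypothetical stand-in for your data layer):

// POST /customers creates a new resource in the customers collection.
const express = require('express');
const app = express();
app.use(express.json());

// Hypothetical data-layer stub - replace with real database code.
async function insertCustomer(customer) {
  return 1; // pretend the new row's id is 1
}

app.post('/customers', async (req, res) => {
  const { name, address, telephone, email } = req.body;
  const id = await insertCustomer({ name, address, telephone, email });
  res.status(201).location('/customers/' + id).end();
});

app.listen(8080);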
For a very brief and very clear introduction to REST have a look at this short video.
RESTful Web Services, published by O'Reilly and Associates, seems to fit the bill.
As far as doing it in Java, Sun has a page on it.
I hear about people writing these programs all the time and I know what they do, but how do they actually do it? I'm looking for general concepts.
Technically, screen scraping is any program that grabs the display data of another program and ingests it for its own use.
Quite often, screen scraping refers to a web client that parses the HTML pages of a targeted website to extract formatted data. This is done when a website does not offer an RSS feed or a REST API for accessing its data programmatically.
One example of a library used for this purpose is Hpricot for Ruby, which is one of the better-architected HTML parsers used for screen scraping.
Lots of accurate answers here.
What nobody's said is don't do it!
Screen scraping is what you do when nobody's provided you with a reasonable machine-readable interface. It's hard to write, and brittle.
As an example, consider an RSS aggregator, then consider code that gets the same information by working through a normal human-oriented blog interface. Which one breaks when the blogger decides to change their layout?
Of course, sometimes you have no choice :(
In general, a screen scraper is a program that captures output from a server program by mimicking the actions of a person sitting in front of a workstation using a browser or terminal-access program. At certain key points, the program interprets the output and then takes an action or extracts certain amounts of information from it.
Originally this was done with character/terminal output from mainframes, for extracting data from or updating systems that were archaic or not directly accessible to the end user. In modern terms it usually means parsing the output of an HTTP request to extract data or to take some other action. With the advent of web services this sort of thing should have died away, but not all apps provide a nice API to interact with.
A screen scraper downloads the HTML page and pulls out the data of interest, either by searching for known tokens or by parsing it as XML or some such.
In the early days of PCs, screen scrapers would emulate a terminal (e.g. an IBM 3270) and pretend to be a user in order to interactively extract and update information on the mainframe. In more recent times, the concept is applied to any application that provides an interface via web pages.
With the emergence of SOA, screen scraping is a convenient way to service-enable applications that aren't already. In those cases, scraping the web pages is the more common approach taken.
Here's a tiny bit of screen scraping implemented in JavaScript, using jQuery (not a common choice, mind you, since scraping is usually a client-server activity):
// Show my SO reputation score. The element just before the score is the
// user profile link; the last segment of its href is the username.
var repval = $('span.reputation-score:first');
alert('StackOverflow User "' + repval.prev().attr('href').split('/').pop() +
  '" has (' + repval.html() + ') Reputation Points.');
If you run Firebug, copy the above code and paste it into the Console and see it in action right here on this Question page.
If SO changes the DOM structure / element class names / URI path conventions, all bets are off and it may not work any longer - that's the usual risk in screen scraping endeavors where there is no contract/understanding between parties (the scraper and the scrapee [yes I just invented a word]).
Typically you have an HTML page that contains some data you want. What you do is write a program that fetches that web page and attempts to extract the data. This can be done with XML parsers, but for simple applications I prefer to use regular expressions to match a specific spot in the HTML and extract the necessary data. Creating a good regular expression can be tricky, though, because the surrounding HTML often appears multiple times in the document. You always want to match something unique, as close as you can to the data you need.
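As a rough sketch of that fetch-and-regex approach (Node.js 18+ for the built-in fetch; the pattern is illustrative, and on a real page you'd anchor it to unique surrounding markup):

// Fetch a page and pull one value out with a regular expression.
(async () => {
  const response = await fetch('https://example.com/');
  const html = await response.text();
  const match = html.match(/<h1>([^<]*)<\/h1>/);
  console.log(match ? match[1] : 'no match');
})();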
"Screen scraping is what you do when nobody's provided you with a reasonable machine-readable interface. It's hard to write, and brittle."
Not quite true. I don't think I'm exaggerating when I say that most developers do not have enough experience to write decent APIs. I've worked with screen scraping companies, and often the APIs are so problematic (ranging from cryptic errors to bad results), and so often lack the full functionality the website provides, that it can be better to screen scrape (web scrape, if you will). The extranet/website portals are used by more customers/brokers than API clients and are thus better supported. In big companies, changes to extranet portals and the like are infrequent, usually because they were originally outsourced and are now just maintained. I'm referring more to screen scraping where the output is tailored, e.g. a flight on a particular route and time, an insurance quote, a shipping quote, etc.
In terms of doing it, it can be as simple as using a web client to pull the page contents into a string and then running a series of regular expressions over it to extract the information you want.
using System.Net;
using System.Text.RegularExpressions;

string pageContents = new WebClient().DownloadString("https://stackoverflow.com/");
// "question-summary" is an illustrative pattern, not a stable marker on SO
int numberOfPosts = Regex.Matches(pageContents, "question-summary").Count;
Obviously, in a large-scale environment you'd be writing more robust code than the above.
"A screen scraper downloads the HTML page, and pulls out the data of interest either by searching for known tokens or parsing it as XML or some such."
That is a cleaner approach than regexes... in theory. In practice, it's not quite as easy, given that most documents need to be normalized to XHTML before you can XPath through them. In the end, we found that fine-tuned regular expressions were more practical.