Disadvantages of the Force.com platform [closed] - salesforce

Closed 10 years ago.
We're currently looking at using the Force.com platform as our development platform and the sales guys and the force.com website are full of reasons why it's the best platform in the world. What I'm looking for, though, is some real disadvantages to using such a platform.

Here are 10 to get you started.
1. Apex is a proprietary language. Other than the force.com Eclipse plugin, there's little to no tooling available such as refactoring, code analysis, etc.
2. Apex was modeled on Java 5, which is considered to be lagging behind other languages, and without tooling (see #1), can be quite cumbersome.
3. Deployment is still fairly manual with lots of gotchas and manual steps. This situation is slowly improving over time, but you'll be disappointed if you're used to having automated deployments.
4. Apex lacks packages/namespaces. All of your classes, interfaces, etc. live in one folder on the server. This makes code much less organized and class/interface names necessarily long to avoid name clashes and to provide context. This is one of my biggest complaints, and I would not freely choose to build on force.com for this reason alone.
5. The "force.com IDE", aka force.com eclipse plugin, is incredibly slow. Saving any file, whether it be a class file, text file, etc., usually takes at least 5 seconds and sometimes up to 30 seconds depending on how many objects, data types, class files, etc. are in your org. Saving is also a blocking action, requiring not only compilation, but a full sync of your local project with the server. Orders of magnitude slower than Java or .NET.
6. The online developer community does not seem very healthy. I've noticed lots of forum posts go unanswered or unsolved. I think this may have something to do with the forum software salesforce.com uses, which seems to suck pretty hard.
7. The data access DSL in Apex leaves a lot to be desired. It's not even remotely competitive with the likes of (N)Hibernate, JPA, etc.
8. Developing an app on Apex/VisualForce is an exercise in governor limits engineering. Easily half of programmer time is spent trying to optimize to avoid the numerous governor limits and other gotchas like visualforce view state limits. It could be argued that if you write efficient code to begin with you won't have this problem, which is true to an extent. However there are many times that you have valid reasons to make more than x queries in a session, or loop through more than x records, etc.
9. The save->compile->run cycle is extremely slow, esp. when it involves zipping and uploading the entire static resource bundle just to do something like test a minor CSS or javascript change.
10. In general, the pain of a young, fledgling platform without the benefits of it being open source. You have no way to validate and/or fix bugs in the platform. They say to post it to their IdeaExchange. Yeah, good luck with that.
Disclaimers/Disclosures: There are lots of benefits to a hosted platform such as force.com. Force.com does regularly enhance the platform. There are plenty of things about it I like. I make money building on force.com

I see you've gotten some answers, but I would like to reiterate how much time is wasted getting around the various governor limits on the platform. As much as I like the platform on certain levels, I would very strongly, highly, emphatically recommend against it as a general application development platform. It's great as a super configurable and extensible CRM application if that's what you want. While their marketing is exceptional at pushing the idea of Force.com as a general development platform, it's not even remotely close yet.
The efficiency of having a stable platform and avoiding big performance and stability problems is easily wasted in trying to code around the limits that people refer to. There are so many limits to the platform, it becomes completely maddening. These are not high-end limits you'll only hit once you have a lot of users; you'll hit them almost right away.
While there are usually techniques to get around them, it's very hard to figure out strategies for avoiding them while you're also trying to develop the business logic of your actual application.
To give you a simple sense of how developer un-friendly the environment is, take the "lack of debugging environment" referred to above. It's worse than that. You can only see up to 20 of the most recent requests to the server in the debug logs. So, as you're developing inside the application you have to create a "New" debug request, select your name, hit "Save", switch back to your app, refresh the page, click back to your debug tab, try to find the request that will house your debug log, hit "find" to search for the text you're looking for. It's like ten clicks to look at a debug output. While it may seem trivial, it's just an example of how little care and consideration has been given to the developer's experience.
Everything about the development platform is a grafted-on afterthought. It's remarkable for what it is, but a total PITA for the most part. If you don't know exactly what you are doing (as in you're certified and have a very intimate understanding of Apex), it will easily take you upwards of 10-20x the amount of time that it would in another environment to do something that seems like it would be ridiculously simple, if you can even succeed at all.
The governor limits are indeed that bad. You have a combination of various limits (database queries, rows returned, "script statements", future calls, callouts, etc.) and you have to know exactly what you are doing to avoid these. For example, if you have a calculated rollup "formula" field on an object and you have a trigger on a child object, it will execute the parent object triggers and count those against your limits. Things like that aren't obvious until you've gone through the painful process of trying and failing.
You'll try one thing to avoid one limit, and hit another in a never ending game of "whack a limit". In the process you'll have to drastically re-architect your entire app and approach, as well as rewrite all of your test code. You must have 75% test code coverage to deploy into production, which is actually a very good thing, but combined with all of the other limits, it's very burdensome. You'll actually hit governor limits writing your test code that wouldn't come up in normal user scenarios, but that will prevent you from achieving the coverage.
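To make the "whack a limit" point concrete, here is a minimal sketch of why a per-request query limit forces a rewrite. It is plain Python, not Apex, and everything in it (the `FakeDatabase` class, the limit of 100 queries, the record shapes) is made up for illustration rather than taken from the platform. The first function issues one query per child record; the second collects the ids and issues a single query for the whole batch.

```python
class QueryLimitExceeded(Exception):
    """Raised when the simulated per-request query budget is used up."""


class FakeDatabase:
    """Toy in-memory store that counts queries against a fixed budget.

    Hypothetical stand-in for a platform with a per-request query limit.
    """

    def __init__(self, parents_by_id, max_queries):
        self._parents = parents_by_id
        self._max_queries = max_queries
        self.queries_used = 0

    def _charge(self):
        # Every call to the database costs exactly one query.
        self.queries_used += 1
        if self.queries_used > self._max_queries:
            raise QueryLimitExceeded(
                f"more than {self._max_queries} queries in one request"
            )

    def get_one(self, parent_id):
        self._charge()
        return self._parents[parent_id]

    def get_many(self, parent_ids):
        self._charge()
        return {pid: self._parents[pid] for pid in parent_ids}


def names_one_by_one(db, child_parent_ids):
    # One query per child record: the cost grows with the batch size.
    return [db.get_one(pid)["name"] for pid in child_parent_ids]


def names_bulk(db, child_parent_ids):
    # Collect the ids first, then issue a single query for the whole batch.
    parents = db.get_many(set(child_parent_ids))
    return [parents[pid]["name"] for pid in child_parent_ids]
```

With a budget of 100 queries and a batch of 200 records, `names_one_by_one` raises `QueryLimitExceeded` while `names_bulk` uses a single query. The catch described above is that the second shape is rarely the one you write first, and restructuring toward it can push you into a different limit.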
That is not to mention a whole host of other issues. Packaging isn't what you expect. You can't package up your app and deliver it to users without significant user intervention and configuration on the part of the administrator of the org. The AppExchange is a total joke, and they've even started charging 5K just to get your app listed. Importing with the data loader sucks, especially if you have any triggers. You can't export all of your data in one step that includes your relationships in such a way that it can easily be re-imported into another org in a single step (for example a dev org). You can only refresh a sandbox once a month from production, no exceptions, and you can't include your data in a refresh by default unless you have called your account executive to get that feature unlocked. You can't mass delete data in custom objects. You can't change your package names. Certain things can take numerous days to complete after you have requested them, such as a data backup before you want to deploy an app, with no progress report along the way and not much sense of when exactly the export occurred. Given that there are synchronicity issues of data if there are relationships between the data, there are serious data integrity issues in that there is no such thing as a "transaction" that can export numerous objects in a single step. There are probably some commercial tools to facilitate some of this, but these are not within reach to normal developers who may not have a huge budget.
Everything else the other people said here is true. It can take anywhere from five seconds to a minute sometimes to save a file.
I don't mean to be so negative because the platform is very cool in some ways and they're trying to do things in a multi-tenant environment that no one else is doing. It's a very innovative environment and powerful on some levels (I actually like VisualForce a lot), but give it another year or two. They're partnering with VMware, maybe that will lead to giving developers a bit more of a playpen rather than a jail cell to work in.

Here are a few things I can give you after spending a fair bit of time developing on the platform in the last fortnight or so:
There's no RESTful API. They have a soap based API that you can call, but there is no way of making true restful calls
There's no simple way to take their SObjects and convert them to JSON objects.
The visual force pages are ok until you want to customize them and then it's a whole world of pain.
Visual force pages need to be bound to SObjects otherwise there's no way to get the standard input fields like the datepicker or select list to work.
The eclipse plugin is ok if you want to work by yourself, but if you want to work in a large team with the eclipse plugin forget it. It doesn't handle synchronizing to and from the server, it crashes and it isn't really helpful at all.
THERE IS NO DEBUGGER! If you want to debug, it's literally debugged by system.debug statements. This is probably the biggest problem I've found
Their "MVC" model isn't really MVC. It's a lot closer to ASP.NET Webforms. Your views are tightly coupled to not only the models but the controllers as well.
Storing a large number of documents is not feasible. We need to store over 100 GB of documents and we were quoted some ridiculous figure. We've decided to implement our document storage on Amazon's S3 infrastructure.
Even though the language is Java-based, it's not Java. You can't import any external packages or libraries. Also, the base libraries that are available are severely limited, so we've found ourselves implementing a bunch of stuff externally and then exposing those bits as services that are called by force.com.
You can call external SOAP or REST based services, but the message body is limited to 100 KB, so it's very restrictive in what you can call.
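For the message body limit in the last point, one workaround is to split a payload into pieces that each fit under the cap. The sketch below is plain Python and only shows the splitting. It assumes the "100 KB" figure means 100 * 1024 bytes, and it assumes the external service can accept and reassemble chunks, which is something you would have to build yourself.

```python
MAX_BODY_BYTES = 100 * 1024  # assumed reading of the "100 KB" figure above


def split_body(payload: bytes, limit: int = MAX_BODY_BYTES) -> list[bytes]:
    """Split payload into consecutive chunks of at most `limit` bytes."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [payload[i:i + limit] for i in range(0, len(payload), limit)]
```

A 250 KB payload becomes three chunks (100 KB, 100 KB, 50 KB), and joining the chunks in order gives back the original bytes. Each chunk then costs one callout, so this trades the body size limit for pressure on the callout count limit.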
In all honesty, whilst there are potential benefits to developing on something like the force.com platform, for me, you couldn't use the force.com platform for true enterprise level apps. At best you could write some basic crud style applications but once you move into anything remotely complicated I'd be avoiding it like the plague.

Wow- there's a lot here that I didn't even know were limitations - after working on the platform for a few years.
But just to add some other things...
The reason you don't have a line-by-line debugger is precisely because it's a multi-tenant platform. At least that's what SFDC says - it seems like in this age of thread-rich programming, that isn't much of an excuse, but that's apparently the reason. If you have to write code, you have "System.debug(String)" as your debugger - I remember having more sophisticated server debugging tools in Java 1.2 about 12 years ago.
Another thing I really hate about the system is version control. The Spring framework is not used for what Spring is usually used for - it's really more of a configuration tool in SFDC rather than version control. SFDC provides ZERO version-control.
You can find yourself stuck for days doing something that should seem so ridiculously easy, like, say, scheduling a SFDC report to export to a CSV file and email to a list of recipients... Well, about the easiest way to do that is to create a custom object with a custom field, with a workflow rule and a Visualforce email template... and then for code you need to write a Visualforce component that streams the report data to the Visualforce email template as an attachment, and you write anonymous APEX code to schedule a field update of the custom object... For SFDC developers, this is almost a daily task... trying to put about five different technologies together to do tasks that seem so simple... And this can cause management headaches and tensions too. Typically, you'd find this out after getting a suggestion to do something that doesn't work in the user-community (like someone already said), and then trying many things that, after you developed them, you'd find just don't work for some odd-ball reason - like "you can't schedule a VisualForce page", or "you can't call getContent from a schedulable context" or some other arcane reason.
There are so many, many maddening little gotchas on the SFDC platform that, once you know WHY they're there, make sense... but they're still very bad limitations that keep you from doing what you need to do. Here are some of mine:
You can't get record owner information "out of the box" on pretty much any kind of record - you have to write a trigger that links the owner on create of the record to the record you're inserting. Why? Short answer because an owner can be either a "person" or a "queue", and the two are drastically different entities... Makes sense, but it can turn a project literally upside down.
Maddening security model. Example: "Manage Public Reports" permission is vastly different from "Create and Customize Reports" and that basically goes for everything on the platform... especially folders of any kind.
As mentioned, support is basically non-existent. If you are an extremely self-sufficient individual, or have a lot of SFDC resources, or have a lot of time and/or a very forgiving manager, or are in charge of a SFDC system that's working fine, you're in pretty good shape. If you are not in any of these positions, you can find yourself in deep trouble.
SFDC is a very seductive business proposition... no equipment footprint, pretty good security, fixed price, no infrastructure, AND you get web-based CRM with batchable and schedulable processing... But as the other posters said, it is really quite a ramp-up in development learning, and if you go with consulting, I think the lowest price I've seen was $200/hour.
Salesforce tends to integrate with other things years after some technologies become commonplace - JSON and jQuery come to mind... and if you have other common infrastructures that you want to do an integration with, like JIRA, expect to pay a lot extra, and they can be quite buggy.
And as one of the other posters mentioned, you are constantly fighting governor limits that can just drive you nuts... an attachment can NOT be > 5MB. Period. And sometimes < 3MB (if base64 encoded). Ten HTTP callouts in a class. Period. There are dozens of published governor limits, and many that are not which you will undoubtedly find and just want to run out of your office screaming.
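The gap between the plain and base64 attachment figures comes mostly from how base64 works: every 3 raw bytes become 4 encoded characters, so the encoded form is about a third larger. Here is a small Python sketch of that arithmetic. It assumes "5MB" means 5 * 1024 * 1024 bytes and that the limit applies to the encoded size, neither of which is stated above.

```python
import base64
import math


def base64_size(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def max_raw_size(encoded_limit: int) -> int:
    """Largest raw size whose padded base64 encoding fits in encoded_limit."""
    return (encoded_limit // 4) * 3


LIMIT = 5 * 1024 * 1024  # assumed meaning of "5MB"
largest_raw = max_raw_size(LIMIT)  # 3932160 bytes, i.e. 3.75 * 1024 * 1024

# Sanity check of the formula against the standard library encoder.
sample_ok = len(base64.b64encode(b"x" * 1000)) == base64_size(1000)
```

Under those assumptions the 4/3 inflation alone brings the usable size down to 3.75 MB. Any further drop toward the "< 3MB" figure quoted above would have to come from other overhead that this sketch does not model.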
I really, REALLY like the platform, but trust me - it can be one really cruel mistress.
But in fairness to SFDC, I'd say this: the biggest problem I find with the platform is not the platform itself, but the gargantuan expectations that almost anyone who sees the platform, but hasn't developed on it has.... and those people tend to be in positions of great authority in business organizations; marketing, sales, management, etc. Huge disconnects occur and heads roll, or are threatened to roll daily - all because there's this great platform out there with weird gotchas and thousands of people struggling daily to get their heads around why things should just work when they just don't and won't.
EDIT:
Just to add to lomaxx's comments about the MVC: in SFDC terminology, this is closely related to what's known as the "viewstate" -- and it can be really buggy, in that what is on the VF page is not what is in the controller class for the page. So, you have to go through weird gyrations to sync what's on the page with what the controller is going to write to SF when you click your "save" button (or make your HTTP callout or whatever)... man, it's annoying.

I think other people have covered the disadvantages in more depth but to me, it doesn't seem to use the MVC paradigm or support much in the way of code reuse at all. To do anything beyond simple applications is an exercise in frustration compared to developing an application using something like ASP.Net MVC.
Furthermore, the tools, the data layer and the frustration of trying to refactor code or rename fields during the development process doesn't help.
I think as a CMS it's pretty cool, but as a platform for non-CMS applications, it doesn't make sense to me.

The security model is also very very restrictive... but this isn't the worst part. You can't currently assert whether a user has the ability to perform a particular action.
You can check to see what their role is, but you can't check if that role has permissions to perform the current action.
Even worse is the response from tech support to "try the action and if there's an exception, catch it"
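For anyone unsure what that advice amounts to, here is a minimal Python sketch of "try it and catch the exception". The `FakeRecordStore` class and its permission check are hypothetical stand-ins, not the platform's API; the point is only the shape of the calling code.

```python
class FakeRecordStore:
    """Hypothetical store that only reveals a missing permission by failing."""

    def __init__(self, allowed_actions):
        self._allowed = set(allowed_actions)
        self.deleted = []

    def delete(self, record_id):
        if "delete" not in self._allowed:
            raise PermissionError("user may not delete records")
        self.deleted.append(record_id)


def try_delete(store, record_id):
    """Attempt the delete and report whether it was permitted."""
    try:
        store.delete(record_id)
    except PermissionError:
        return False
    return True
```

The drawback is visible in the sketch: the only way to learn the answer is to actually perform the delete. You can't use this to decide up front whether to show a "Delete" button, which is exactly the complaint.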

Considering Force.com is a "cloud" platform, its ability to act as a client to an external WSDL-defined service is pretty underwhelming. See http://force201.wordpress.com/2010/05/20/when-generate-from-wsdl-fails-hand-coding-web-service-calls/ for what you might end up having to do.

To all above, I am curious how the release of VMforce, allowing Java programmer to write code for Force.com, changes the disadvantages above?
http://www.zdnet.com/blog/saas/vmforcecom-redefines-the-paas-landscape/1071

I guess they are trying to address these issues. At Dreamforce they mentioned they were trying to drop the governor limits to only 4. I'm not sure what the details are. They have a REST API in early access, and they bought Heroku, which is a Ruby development platform in the cloud. They split out the database with database.com, so you can do your web development elsewhere and make your db calls using database.com.
I guess they are trying to make it as agnostic as possible. But right now these are all announcements and early access, so, as their Safe Harbor statements say, don't purchase based on what they say, only on what they currently have.

Related

JitterBit vs Dell Boomi vs Celigo

We've narrowed our selection for an ipaas down to the above 3.
Initially we're looking to pass data from a cloud based HR system to Netsuite, and from Netsuite to Salesforce, and sometimes JIRA.
I've come from a Mulesoft background, which I think would be too complex for this. On the other hand it seems that Celigo is VERY drag and drop, and there's not much room for modification/customisation.
Of the three, do you have any experience/recommendations? We aren't looking for any code heavy custom APIs, most will just be simple scheduled data transfers but there may be some complexity within the field mapping, and we want to set ourselves up for the future.
I spent a few years removing Celigo from NetSuite and Salesforce. The best way I can describe Celigo is that it is like the old school anti-virus programs which were often worse than the viruses... lol... It digs itself into the end system, making removing it a nightmare.
Boomi does the job, but is very counter-intuitive, and overly complex. You can't do everything from one screen, you can't easily bounce back and forth between tasks/operations/etc. And, sometimes it is very difficult to find where endpoints are used, as they are not always shown in their "where is this used" feature. Boomi has a ton of endpoint connectors pre-built (the most, I believe), but I have not seen an easy way to just create your own. Boomi also has much more functionality than just the integrations, if that is something that may be needed.
Jitterbit, my favorite, is ridiculously simple to use. You can access everything from one main screen, you can connect to anything (as long as it can reach out to the network, or you can reach it via the network - internal or external). Jitterbit has a lot of pre-built endpoint connectors. It is also extremely easy to just create a connection to anything you want. The win with Jitterbit is that it is super easy to use, super easy to learn, it always works, and they have amazing support (if you need it). I have worked with Jitterbit the most (about 6 years), and I have never been unable to complete an integration task in less than a couple of days, max.
I have extensive experience with the Dell Boomi platform but none with JitterBit or Celigo. Dell Boomi offers a very versatile and well supported iPaaS solution. The technical challenges of Boomi are some UI/usability issues (#W3BGUY mentioned the main ones) and the lack of out-of-the-box support for CI/CD and DevOps processes (code management, versioning, deployments etc.)
One more important component to consider here is the pricing of the platform. Boomi charges its clients a yearly price per connection. A connection is defined as a unique combination of URL, username and password. The yearly license costs vary and can range anywhere between $1,000 and $12,000 per license per year. The price depends greatly on your integration landscape and the discounts provided, so I would advise engaging with the vendor early to understand your costs. Would be great to hear from others on pricing for JitterBit and Celigo.
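To get a rough feel for how the connection definition drives cost, here is a small Python sketch. It counts unique (URL, username, password) combinations in a list of endpoints and multiplies by the price range quoted above. The endpoint data is invented, and treating every connection as exactly one license is my assumption, so use it for a back-of-envelope estimate only.

```python
def count_connections(endpoints):
    """Count unique (url, username, password) combinations."""
    return len({(e["url"], e["username"], e["password"]) for e in endpoints})


def yearly_cost_range(connections, low=1_000, high=12_000):
    """Rough yearly cost bounds, assuming one license per connection."""
    return connections * low, connections * high


# Invented example landscape: two processes share the same NetSuite login.
endpoints = [
    {"url": "https://hr.example.com/api", "username": "svc_hr", "password": "a"},
    {"url": "https://netsuite.example.com", "username": "svc_ns", "password": "b"},
    {"url": "https://netsuite.example.com", "username": "svc_ns", "password": "b"},
]
estimate = yearly_cost_range(count_connections(endpoints))
```

Here the three endpoints collapse to two connections, so the estimate is somewhere between $2,000 and $24,000 a year. The width of that range is the reason to get actual numbers from the vendor early.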
Boomi is also more than just an iPaaS platform. They offer other modules of their platform to customers: API Management, Boomi Flow (workflow and automation module), Master Data Hub (master data management). Some of these modules are well developed and some are in their infancy (API Management).
From my limited experience with MuleSoft platform, I share the OP's sentiments about it being too complex for simple integrations. They do provide great CI/CD and DevOps functionality though if that is something that is needed.
There is not a simple answer to a question like this. One needs to look at multiple aspects of the platform and make a decision based on a multitude of factors. I would advise looking at Gartner and Forrester reports for general guidelines and working out the pricing (initial and recurring) with the vendor.
I have only used Jitterbit, so can only comment on that. It works fine. It is pretty intuitive and easy to use, but does have some flexibility with writing your own queries, defining and mapping file formats, and choosing different transfer protocols.
I've only used the free version (which you need to host somewhere and also is not supported) and it was good enough for production tasks. If you have the luxury of time, I'd say download it and try it out. If it works for you, throw it on a server or upgrade to the cloud version.
One note: Jitterbit uses background services. If you run it locally and then decide to migrate your account to a server, you need to stop those services on your local. Otherwise, it will try to run jobs from both locations and that doesn't turn out well.
Consider checking out Choreo as well. It has a novel simultaneous code + low-code approach for integration development, and provides rich AI support for performance monitoring, debugging, and data mapping.
Disclaimer: I'm a member of the project.

Salesforce: Developers view

We are in the process of deciding a route to take for a new CRM system. We've had Salesforce come in and give us their pitch and the developers have had a little play with it, made it do a few things we need etc...
It's hard for us to get a good idea of the pros and cons until we start to develop with it, and if you start, you are tied in to a year contract for X number of users, and it's pretty expensive as it is.
So, my question: who has developed for the Salesforce platform? How did you find the experience? Would you recommend it as a good solution? Should we just continue with our ruby/rails/mongo systems?
Thanks!
The good news is the amount of customization you can do via configuration is amazing. The out-of-box functionality is very strong and you get a pretty nice security model and reporting system included.
Having said that, when you do need to do custom development beyond what the configuration can support, the pain can start:
- APEX is the most frustrating (modern?) language I have ever worked with.
- Deployment/Migration can be slow and painful (some things cannot be migrated, e.g. Approval processes)
- APEX is a rather immature language missing many of the concepts of .net or java
- Debugging is messy (log actually gets truncated at a certain length, no stepping)
Having said all that, SalesForce.com is a very strong CRM - 90% of the custom work you'll want to do will be really smooth and fast, the remainder will be extremely painful.

Good (CMS-based?) platform for simple database apps

I need to implement yet another database website. Let's say roughly 5 tables, 25 columns, and (eventually) thousands to tens of thousands of rows. Easy data entry and maintenance are more important than presentation of the data to non-privileged users. It's a niche site, so performance is not a concern. We'll have no trouble finding somewhere to host it.
So: what's a good platform for this? Intuitively I feel that there ought to be some platform that allows this to be done with no code written - some web version of MS Access. Obviously I'm happy to code business rules, and special logic that distinguishes this from every other database app.
I've looked at Drupal (with Views) and it looks possible, but with quite a bit of effort. Will look at Alfresco next. A CMS-y platform helps because then you can nicely integrate static content, you get nice styling, plugins, etc etc.
Really good data entry (tracking changes, logging, ability to roll back, mass imports...) would be great. If authorised users could do arbitrary SQL queries (yes, I know...) that would be a big bonus. Image management support a small bonus.
Django is what you are looking for. In fact, you could probably set up what you ask without much coding at all, just configuration.
Once complete, authorised users can add 'rows' with a nice but simple GUI, or, of course, you can batch import via database commands.
I'm a Python newbie, and I've already created 2 Django-based sites. I have created more than a dozen Drupal-based sites, and Django is easier and produces significantly faster sites.
Your need somewhat sits between two chairs : bespoke application and CMS-based. I'd advocate for the CMS approach, if and only if you feel the need for content structure customization will grow in the future, slowly removing the need for direct SQL queries.
I am biased since working with eZ Publish for many years now, but it satisfies the requirements you expressed natively :
Really good data entry (tracking changes, logging, ability to roll back, mass imports...)
[...] Image management support a small bonus.
An idea of the content edition feel can be watched here:
http://ez.no/Demos-Videos/eZ-Publish-Administration-Interface-Video-Tutorial
and you can download and test-drive eZ Publish Community Edition there : http://share.ez.no/latest
It is a PHP-based solution, strong professional community (http://share.ez.no), over 1100 add-ons available on http://projects.ez.no. The underlying libs are mostly relying on Apache Zeta Components, high-quality, robust set of PHP5 libraries.
Last note : the content model is abstracted, meaning you'd not have to create a new table everytime a new type of content should be stored : a simple content class definition from the administration interface, and the rest is taken care of, including the edition interface for the new content type. Might remove the need for hardcore SQL queries ?
Hope it helped,
Drupal can do most of what you need (I don't know of a module that will let you enter arbitrary SQL queries), but you will end up with some overhead of tables and modules you don't really need. It's up to you to decide if that's a problem or not. I don't think the overhead would hurt performance in your case.
The advantages of using Drupal would be the large community, the stability of the platform and the flexibility to add more functionality when needed. Also, the large user base ensures that most code has been tested rather well.
I highly recommend Drupal. It is very simple (internally, the codebase is also small and clean), it has dozens of possibilities, and it has tremendous support. Once you start with Drupal you will never go to anything else.
Note that I'm not connected with the Drupal staff; I've just created dozens of Drupal sites, many of them in just a few minutes. My last one took me 2 hrs, see it here http://iPadDevZone.com
UPDATE #1:
It really depends on your DB schema complexity. The best case is that you just use the CCK module (part of core now) and create your node type. Node is the Drupal name for content. All you do is administer your node type's fields (text, image, numbers, dates, custom, etc.) through the web admin. Then, if a user creates content with this node type, he/she can enter all the fields, which are stored in separate db table fields. This is, however, hidden from you if you wish not to know about it - it is just a web GUI. Then you choose how the node is presented, which properties are shown and where.
Watch videos in CCK resources section in the bottom of this page: http://drupal.org/project/cck
If you need to do some programming, then it is also very easy to use so-called PHP code snippets, which are entered as part of your content (node) and executed when the page is displayed.
Drupal has node revisions built in the core. You can see all the versions and roll back if you wish.
You can set the permissions at quite a granular level, so you can control what your users may or may not do.
I would take a look at Symphony. I haven't been using it myself, but it seems like it's really easy to use and to customize!
http://symphony-cms.com/
Seems to me an online database system would be better than a CMS system.
So in addition to what's been posted above:
www.quickbase.com (by Intuit) - think around $150/mo
www.rollbase.com - check on price, full featured
www.rhythmdata.com - easy to set up, but don't think it's got the advanced features you're looking for.
Good luck!
B
I appreciate these answers, but most of them are really platforms that are much better at something else (eg, Drupal really is a CMS, and has some support for custom fields - but it's not at all easy). Since this is a brand new site from scratch, it doesn't really make sense to start with something that does custom database fields as an afterthought, I think.
The closest I've found is Zoho Creator. It really is like "MS Access for Web 2.0" - and even supports importing from Access. The pricing could get expensive though. It feels like it might eventually be quite constraining. I'm still evaluating.
Are there any other products like Zoho Creator?

Looking for an example of when screen scraping might be worthwhile

Screen scraping seems like a useful tool - you can go onto someone else's site and steal their data - how wonderful!
But I'm having a hard time with how useful this could be.
Most application data is pretty specific to that application even on the web. For example, let's say I scrape all of the questions and answers off of StackOverflow or all of the results off of Google (assuming this were possible) - I'm left with data that is not very useful unless I either have a competing question and answer site (in which case the stolen data will be immediately obvious) or a competing search engine (in which case, unless I have an algorithm of my own, my data is going to be stale pretty quickly).
So my question is, under what circumstances could the data from one app be useful to some external app? I'm looking for a practical example to illustrate the point.
It's useful when a site publicly provides data that is (still) not available as an XML service. I had a client who used scraping to pull flight tracking data into one of his company's intranet applications.
The technique is also used for research. I had a client who wanted to compare the contents of several online dictionaries by part of speech, and all of these sites had to be scraped.
It is not a technique for "stealing" data. All ordinary usage restrictions apply. Many sites implement CAPTCHA mechanisms to prevent scraping, and it is inappropriate to work around these.
A good example is StackOverflow - no need to scrape data as they've released it under a CC license. Already the community is crunching statistics and creating interesting graphs.
There's a whole bunch of popular mashup examples on ProgrammableWeb. You can even meet up with fellow mashupers (O_o) at events like BarCamps and Hack Days (take a sleeping bag). Have a look at the wealth of information available from Yahoo APIs (particularly Pipes) and see what developers are doing with it.
Don't steal and republish, build something even better with the data - new ways of understanding, searching or exploring it. Always cite your data sources and thank those who helped you. Use it to learn a new language or understand data or help promote the semantic web. Remember it's for fun not profit!
Hope that helps :)
If the site has data that would benefit from being accessible through an API (and it would be free and legal to do so), but they just haven't implemented one yet, screen scraping is a way of essentially creating that functionality for yourself.
Practical example -- screen scraping would allow you to create some sort of mashup that combines information from the entire SO family of sites, since there's currently no API.
Well, to collect data from a mainframe. That's one reason why some people use screen scraping. Mainframes are still in use in the financial world, and they often run software that was written in the previous century. The people who wrote it might already be retired, and since this software is very critical for these organizations, they really hate it when some new code needs to be added. So, screen scraping offers an easy interface for communicating with the mainframe: collect the information from it and then send it onwards to any process that needs it.
Rewrite the mainframe application, you say? Well, software on mainframes can be very old. I've seen software on mainframes that was over 30 years old, written in COBOL. Often, those applications work just fine and companies don't want to risk rewriting parts because it might break some code that had been working for over 30 years! Don't fix things if they're not broken, please. Of course, additional code could be written but it takes a long time for mainframe code to be used in a production environment. And experienced mainframe developers are hard to find.
I myself had to use screen scraping too in a software project. This was a scheduling application which had to capture the output to the console of every child process it started. It's the simplest form of screen scraping, actually, and many people don't even realize that if you redirect the output of one application to the input of another, that it's still a kind of screen scraping. :)
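To illustrate the "capture the console output of a child process" case, here is a minimal Python sketch. It is not the scheduler described above; the function name `run_and_capture` is made up for the example, and it simply runs a command to completion and hands back whatever the child printed.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a child process and return (exit_code, list_of_output_lines)."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold error output into the same stream
        text=True,                 # decode bytes to str
    )
    return result.returncode, result.stdout.splitlines()


if __name__ == "__main__":
    # Hypothetical child job: a tiny Python one-liner standing in for a real task.
    code, lines = run_and_capture(
        [sys.executable, "-c", "print('job started'); print('job done')"]
    )
    for line in lines:
        print("captured:", line)
```

A real scheduler would probably read the output incrementally rather than waiting for the child to exit, but the idea is the same: the text the program prints is the only interface you have.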
Basically, screen scraping allows you to connect one (web) application with another one. It's often a quick solution, used when other solutions would cost too much time. Everyone hates it, but the amount of time it saves still makes it very efficient.
Let's say you wanted to get scores from a popular sports site that did not make the information available through an XML feed or API.
For one project we found a (cheap) commercial vendor that offered translation services for a specific file format. The vendor didn't offer an API (it was, after all, a cheap vendor) and instead had a web form to upload and download from.
With hundreds of files a day the only way to do this was to use WWW::Mechanize in Perl, screen scrape the way through the login and upload boxes, submit the file, and save the returned file. It's ugly and definitely fragile (if the vendor changes the site in the least it could break the app) but it works. It's been working now for over a year.
One example from my experience.
I needed a list of major cities throughout the world with their latitude and longitude for an iPhone app I was building. The app would use that data along with the geolocation feature on the iPhone to show which major city each user of the app was closest to (so as not to show exact location), and plot them on a 3D globe of the earth.
I couldn't find an appropriate list in XML/Excel/CSV type format anywhere easily, but I did find this wikipedia page with (roughly) the info I needed. So I wrote up a quick script to scrape that page and load the data into a database.
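A script like that can be quite small. Below is a rough Python sketch using only the standard library. The HTML in `SAMPLE_HTML` is invented for illustration (it is not the actual Wikipedia markup, which is messier), and the three-column layout of city, latitude, longitude is an assumption.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_cities(html):
    """Return (name, latitude, longitude) tuples, skipping rows that don't parse."""
    parser = TableRowParser()
    parser.feed(html)
    cities = []
    for row in parser.rows:
        if len(row) != 3:
            continue
        try:
            cities.append((row[0], float(row[1]), float(row[2])))
        except ValueError:
            continue  # header row or malformed numbers
    return cities


# Made-up sample page; a real page would be fetched over HTTP first.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Latitude</th><th>Longitude</th></tr>
  <tr><td><a href="/wiki/Tokyo">Tokyo</a></td><td>35.68</td><td>139.69</td></tr>
  <tr><td>Lima</td><td>-12.05</td><td>-77.04</td></tr>
</table>
"""
```

Calling `extract_cities(SAMPLE_HTML)` gives a list of tuples that can be inserted into a database table as-is. As with any scraper, it breaks as soon as the page layout changes.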
Any time you need a computer to read the data on a website. Screen scraping is useful in exactly the same instances that any website API is useful. Some websites, however, don't have the resources to create an API themselves; screen scraping is the developer's way around that.
For instance, in the earlier days of Stack Overflow, someone built a tool to track changes to your reputation over time, before Stack Overflow itself provided that feature. The only way to do that, since Stack Overflow has no API, was to screen scrape.
The obvious case is when a webservice doesn't offer reverse search. You can implement that reverse search over the same data set, but it requires scraping the entire dataset.
This may be fair use if the reverse search also requires significant pre-processing, e.g. because you need to support partial matching. The data source may not have the technical skills or computing resources to provide the reverse search option.
I use screen scraping daily. I run some eCommerce sites and have screen-scraping scripts running daily to gather product lists automatically from my suppliers' wholesale sites. This allows me to have up-to-date information on all the products available to me from several suppliers, and allows me to flag non-economical margins due to price changes.
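The margin check at the end of that pipeline might look something like the Python sketch below. The function name, the dictionary layout and the 15% threshold are all assumptions made for the example; the scraping step that fills `supplier_prices` is left out.

```python
def flag_low_margins(products, supplier_prices, min_margin=0.15):
    """Return (sku, margin) pairs whose margin has dropped below min_margin.

    products:        dict of sku -> our retail price
    supplier_prices: dict of sku -> latest scraped wholesale cost
    """
    flagged = []
    for sku, retail in products.items():
        cost = supplier_prices.get(sku)
        if cost is None or retail <= 0:
            continue  # supplier no longer lists it, or no valid retail price
        margin = (retail - cost) / retail
        if margin < min_margin:
            flagged.append((sku, round(margin, 3)))
    return flagged
```

Run after each daily scrape, this yields the short list of products whose prices need a second look.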

Does anyone have database, programming language/framework suggestions for a GUI point of sale system?

Our company has a point of sale system with many extras, such as ordering and receiving functionality, sales and order history, etc. Our main issue is that the system was not designed properly from the ground up, so it takes too long to make fixes and handle requests from our customers. Also, the current technology we are using (Progress database, Progress 4GL for the language) incurs quite a bit of licensing expense for our customers due to multi-user license fees for database connections, etc.
After a lot of discussion it is looking like we will probably start over from scratch (while maintaining the current product at least for the time being). We are looking for a couple of things:
Create the system with a nice GUI front end (it is currently CHUI and the application was not built in a way that allows us to redesign the front end... no layering or separation of business logic and gui...shudder).
Create the system with the ability to modularize different functionality so the product doesn't have to include all features. This would keep the cost down for our current customers that want basic functionality and a lower price tag. The bells and whistles would be available for those that would want them.
Use proper design patterns to make the product easy to add to or change at any time (i.e. change the database or change the front end without needing to rewrite the application or most of it). This is a problem today because the Progress 4GL code is directly compiled against the database. Small changes in the database require lots of code recompiling.
Our new system will be Linux based, with a possibility of a client application providing functionality from one or more Windows boxes.
So what I'm looking for is any suggestions on which database and/or framework or programming language(s) someone might recommend for this sort of product. Anyone that has experience in this field might be able to point us in the right direction or even have some ideas of what to avoid. We have considered .NET and SQL Express (we don't need an enterprise level DB), but that would limit us to Windows (as far as I know anyway). I have heard of Mono for writing .NET code in a Linux environment, but I don't know much about it yet. We've also considered a Java and MySQL based implementation.
To summarize we are looking to do the following:
Keep licensing costs down on the technology we will use to develop the product (Oracle, yikes! MySQL, nice.)
Deliver a solution that is easily maintainable and supportable.
A solution that has a component capable of running on "old" hardware through a CHUI front end. (Some of our customers have 40+ terminals, which would cost a ton of cash to convert over to PCs.)
Suggestions would be appreciated.
Thanks
[UPDATE]
I should note that we are currently performing a total cost analysis. This question is intended to give us a couple of "educated" options to look into and include in our analysis. Anyone who could share experiences/suggestions about client/server setups would be appreciated (not just those who have experience with point of sale systems... that would just be a bonus).
[UPDATE]
For anyone who is interested, we ended up going with Microsoft Dynamics NAV, LS Retail (a plugin for the point of sale and various other things) and then did some (and are currently working on) customization work on top of that. This setup gave us the added benefit of having a fully integrated g/l system, which our current system lacked.
Java for language (or Scala if you want to be "bleeding edge", depending on how you plan to support it and what your developers are like it might be better, but also worse)
H2 for database
Swing for GUI
Reason: Free, portable and pretty standard.
Update: Missed the part where the system should be a client-server setup. My assumption was that the database and client should run on the same machine.
I suggest you first research your constraints a bit more - you made a passing reference to a client using a particular type of terminal - this may limit your options, unless the client agrees to upgrade.
You need to do a lot more legwork on this. It's great to get opinions from web forums, but we can't possibly know your environment as well as you do.
My broad strokes advice would be to aim for technology that is widely used. This way, expertise on the platform is cheaper than "niche" technologies, and it will be easier to get help if you hit a brick wall. Of course, following this advice may not be possible if you have non-negotiable technology already in place at customers.
My second suggestion would be to complete a full project plan, with detailed specs and proper cost estimates, before going with the "rewrite from scratch" option. Right now, you're saying that it would be cheaper to rewrite the system than maintain it, and you don't really know how much it would cost to re-write.
I suggest you use browser for the UI.
Organize your application as a web application.
There are tons of options for the back-end. You can use Java + MySQL. A Java backend will save you from the Windows/Linux debate as it will run on both platforms. You won't have any licensing cost for either Java or MySQL. (Edit: Definitely there are a lot of other languages that have run-times for both Linux and Windows, including PHP, Ruby, Python, etc.)
If you go this route, you may also want to consider Google Web Toolkit (GWT) for creating the browser based front-end in a modular fashion.
One word of caution though. Browsers can be pesky when it comes to memory management. In our experience, this was the most significant challenge in doing browser-based POS. You may want to check out Adobe Flex, which runs in the browser but might be more civil in its memory management.
What is CHUI? Character-UI, as in VT terminals? Or even 3270 style?
It sounds like you need a 3-tier system - the database backend, a middle-layer that runs the bulk of the back-end business processes, and a front-end layer for the CHUI / GUI / data-gateway.
All three layers can reside on one machine; or you can distribute the tiers out to various servers. The front-end layer would control the actual terminals, whether they are VT-terminals, or a web-browser, or a custom-written 'client' application.
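As a rough illustration of that separation, here is a small Python sketch. Every class and function name in it is hypothetical, and an in-memory dictionary stands in for the real database; the point is only that the middle layer talks to an interface, so the database or the front end can be swapped without touching the business logic.

```python
from abc import ABC, abstractmethod


class ProductStore(ABC):
    """Data tier interface (hypothetical). Prices are kept in integer cents."""

    @abstractmethod
    def price_of(self, sku):
        ...


class InMemoryProductStore(ProductStore):
    """Stand-in for a real database-backed implementation."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_of(self, sku):
        return self._prices[sku]


class SaleService:
    """Middle tier: business rules only, no SQL and no screen handling."""

    def __init__(self, store):
        self._store = store

    def total_cents(self, items):
        # items is a list of (sku, quantity) pairs
        return sum(self._store.price_of(sku) * qty for sku, qty in items)


def render_text_receipt(service, items):
    """Front-end tier for a character terminal; a GUI would call the same service."""
    total = service.total_cents(items)
    return f"TOTAL {total // 100}.{total % 100:02d}"
```

A CHUI terminal handler and a GUI client would both call `SaleService`, and moving to a different database means writing one new `ProductStore` implementation.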
Make sure you have considered the hardware needs here -- are you going to have barcode scanners, cash drawers, POS debit/credit terminals, et cetera? If you are using a standard browser, it might be hard to reliably integrate those items. (At the very least, you're likely going to have to write special applets to handle them.)
Finally, consider the possibility of a thin-client technology on Windows. It greatly simplifies system management, since you only have to upgrade the software centrally. Thin-client PCs are cheap -- under $200.
Golden Code Development (see www.goldencode.com) has a technology that does automated conversion of Progress 4GL (the schema and code... the entire application) to a Java application with a relational database backend (e.g. PostgreSQL). They currently support a very complete CHUI environment and they do refactor the code. For example, the conversion separates the UI, the data model and the business logic into separate Java classes. The entire result is a drop-in replacement that is compatible with the original (users don't need retraining, processes don't need to be modified, the data is migrated too). This is possible because they provide an application server and a set of runtime classes that provide that compatibility. The result of the automated conversion is not something that needs further editing before you can compile and run it. True terminal support is included so hardware terminals still work (it requires a small JNI library to access NCURSES from Java). All the rest of the code in the runtime is pure Java. No Progress Software Corp technology is used in the resulting system and it runs on Linux.
At least one converted system is already in production, running a 24 by 7 mission critical environment. It is a converted ERP system that their mid-sized pilot customer uses to run their entire business.

Resources