IPFS can't really force nodes to delete an uploaded file; isn't that a problem? - distributed

As this decentralisation wave takes place across the digital world, I was wondering how you can remove content that you've just uploaded to a decentralized network.
As I understand it, more and more people want decentralized services because, as opposed to the client-server architecture, this gives you more ownership of your stuff and everything is more transparent. But what happens if you mess up, or the company you're a client of messes up, and they/you upload some personal info that you clearly don't want others to have access to? Since it's a peer-to-peer network, everybody has access to it and there's no way to force them to delete it.
I think what I am trying to understand is how this decentralized future will play out with private data. Will there be a centralized place for private data while we do other things on IPFS and similar apps? Because if so, what's the purpose, why not continue as things are right now? Maybe I am still not seeing all the use cases...

IPFS does allow you to delete a file; you just need to do so on every node hosting it.
If some nodes aren't under your control, the process is fairly simple: monitor ipfs dht findprovs <A file you want to delete>, find all peers hosting the file, then for each one find their IP with ipfs dht findpeer <Peer ID>, then use a database like whois or BGP data to find the ISP and send them a C&D, a GDPR claim, or whatever.
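As a rough sketch, that lookup can be scripted against a local ipfs daemon (the CID below is a placeholder and the output parsing is simplified):

    # Sketch only: enumerate the peers currently providing a CID, then look up
    # their addresses so you know whom (or whose ISP) to contact.
    # Assumes a running local ipfs daemon and the ipfs CLI on PATH.
    import subprocess

    CID = "QmExampleCidGoesHere"  # placeholder for the file you want taken down

    def ipfs(*args: str) -> str:
        return subprocess.run(["ipfs", *args], capture_output=True, text=True).stdout

    # 1. Every peer advertising the file in the DHT.
    providers = ipfs("dht", "findprovs", CID).split()

    # 2. For each provider, resolve its multiaddresses (which contain the IPs).
    for peer_id in providers:
        addresses = ipfs("dht", "findpeer", peer_id).splitlines()
        print(peer_id, addresses)
        # 3. From the IPs: whois / BGP lookup to find the ISP, then send the
        #    C&D or GDPR request as described above.
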
Apart from the tools you use being IPFS-centred, it's the exact same process as for regular good old web2 over HTTP.
You might think that with multiple nodes it's unlikely that everyone will comply with whatever jurisdiction you use to claim your right to be forgotten.
But that already happens with HTTP: you can host your server in a country that doesn't follow whatever law you use to claim the right to have those files removed, or use Tor and mostly not worry about legal threats.
GDPR or any other law like that is already ineffective at removing stuff from the web; the goal is more to scare big players and help politicians keep their jobs (putting in place an ineffective solution to a problem not many people understand can earn them a good public reception and re-election).

Yes, it can be a problem. Companies that store their customers' data should not store it on a blockchain: in Europe, under the GDPR, they are obliged to delete the data if the customer requests it.
I ran into a similar issue at my company when we were evaluating whether we should use a decentralized network in a project. Linked here is a statement from R3 (which developed Corda, a DLT for business) on this topic. It is from 3 years ago, but it's still relevant in my opinion.
So the solution is to only store the reference to the user (like an ID) on-chain and keep the sensitive stuff off-chain.
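As a minimal sketch of that pattern (the ledger and datastore below are just in-memory stand-ins), only an opaque reference ever reaches the chain, while the personal data stays in a store you can actually delete from:

    # "Reference on-chain, data off-chain": the immutable record holds only a
    # salted hash of an ID; the PII lives in a normal, erasable database.
    import hashlib, os, uuid

    offchain_db = {}   # stand-in for a regular, deletable datastore
    chain = []         # stand-in for the immutable ledger

    def register_customer(name: str, email: str) -> str:
        customer_id = str(uuid.uuid4())
        offchain_db[customer_id] = {"name": name, "email": email}
        # Only a salted hash of the ID goes on-chain, never the PII itself.
        salt = os.urandom(16)
        chain.append({"ref": hashlib.sha256(salt + customer_id.encode()).hexdigest()})
        return customer_id

    def forget_customer(customer_id: str) -> None:
        # GDPR erasure: drop the off-chain record; the on-chain hash now
        # points at nothing and reveals nothing.
        offchain_db.pop(customer_id, None)
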
Another interesting project is Atala Prism, but unfortunately I haven't yet had the time to dive into it.

Related

Distributing an application across the Internet

I searched but couldn't find a proper answer for this... maybe I didn't look deep enough. Anyway, a little insight from you guys will only make things easier, so hear me out.
This is for my final-year research project. I just need the concepts and, if there are any, links where I can read more.
So this application is a distributed one for a hotel which has 3 branches (including the main hotel) in locations A, B, and C.
My colleagues and I have developed the database, the business logic, and 3 separate GUIs for the billing, bar, and kitchen. All are working perfectly, and we used .NET Remoting for this. That is the whole system, with the GUIs connected to the business logic over the LAN.
This system should be deployed at each location (A, B, C), and from the main hotel (A) I should be able to view the details of the other locations (B, C). All 3 systems should be connected through the Internet.
The problem is, how do I do that?
I just want to view the information from the other locations and maybe take printouts; that part is probably not relevant to the question.
The database is not distributed; each location has its own database. If I were to use a web service, how can I do it more cost-effectively? Where do I have to deploy the service?
As a side note, I developed a simple chat system (with remoting) and tried to connect it over the Internet with a friend, but it didn't work. Does anyone know why?
Please be kind enough to provide any other relevant information on this topic, and please ask questions.
Why not just build a web application with a secure login? That way you build one system, deploy one system, maintain one system. All your data would be in one place, making reporting a lot less onerous, the whole thing would be faster, and if you ever need to add a fourth, fifth, or twenty-seventh additional location, you'd need to do very little to make it happen.
I see no reason why you have to go about it as you are.

Silverlight Financial Data

Does anyone know of any documentation on how to access bank data via some sort of web service or other method, for use in a Silverlight financial / banking application? Is there any sort of standard protocol or terminology used for this that I can look up online? I'm having trouble finding any sort of information on how this is typically done.
"Access bank data"... Not exactly something banks allow from the outside world. They kinda want to keep things secure :)
If you work for a bank you may well have access to various web services internally. There are standards for data transfers, but every bank will likely have its own systems.
I'm having trouble finding any sort of information on how this is typically done.
That's probably a good thing. This is typically done by either internal bank developers or consultants. For example, take the Bank of America Windows Phone 7 app (which is a Silverlight app): it connects to BofA's servers, but I would be surprised if the way in which it connects is public information. Because you can use it to check your account, I can only presume that there is a web service hosted somewhere that allows these clients to get this data. I'm pretty confident, however, that the connection is secured, and the details of it are kept hidden for good reason.
In short, banks don't usually expose web services to the outside world for public consumption. Unless you've been hired by a bank to specifically do this, I'm not sure you should be able to.

Steps to publish Software to be purchased via Registration

I'm about to finish developing a Windows application which I want to release as shareware. It was developed in C# and will run on .NET 3.5+ machines.
To use it the user will have to be online.
My intent is to let the user try it for 30 days and then limit its functionality until a registration is purchased.
The installer will be made available via an msi file.
Could anyone give the general steps on how to implement this?
Here are some more specific questions:
Since I am trying to avoid having to invest a lot upfront to establish an e-commerce site, I was thinking of a way to just let the user pay somehow while supplying an email address, to which the unlock key is then sent.
I found some solutions out there like listed here:
Registration services
I am still not sure, if they are the way to go.
One of my main concerns is to prevent the reuse of a given serial, e.g. if two users run the program with the same serial at the same time, this serial should be disabled or some other measure taken.
Another point is that my software could potentially just be copied from one computer to another without using an installer, so protecting only the installer will not be sufficient.
Maybe someone who has already gone through this process can give me some pointers, like the general steps involved (e.g. 1. Get a domain, 2. Get a certain kind of web host ...) and address some of the issues I mentioned above.
I'm thankful for any help people can give me.
I don't have a useful answer for you, but I did have a couple observations I wanted to share that were too large to fit in a comment. Hopefully someone else with more technical expertise can fill in the details.
One of my main concerns is to prevent the reuse of a given serial, e.g. if two users run the program with the same serial at the same time, this serial should be disabled or some other measure taken.
To ensure that two people aren't using the same serial number, your program will have to "phone home." A lot of software does this at installation time, by transmitting the serial number back to you during the installation process. If you want to do it in real time, your application will have to periodically connect to your server and say "this serial number is in use."
This is not terribly user friendly. Any time that the serial number check is performed, the user must be connected to the Internet, and must have their firewall configured to allow it. It also means that you must commit to maintaining the server side of things (domain name, server architecture) unchanged forever. If your server goes down, or you lose the domain, your software will become inoperative.
Of course, if a connection to your service specifically (rather than the Internet in general) is essential to the product's operation, then it becomes a lot easier and more user friendly.
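A minimal sketch of such a "phone home" check, assuming a hypothetical licensing endpoint that you host and a made-up response format:

    # Hypothetical check: at startup (and periodically), the app reports its
    # serial to a licensing endpoint and refuses to run if the server says
    # the serial is already active elsewhere. URL and JSON shape are made up.
    import urllib.request, urllib.parse, json

    LICENSE_URL = "https://licensing.example.com/check"  # placeholder

    def serial_is_valid(serial: str, machine_id: str) -> bool:
        data = urllib.parse.urlencode({"serial": serial, "machine": machine_id}).encode()
        try:
            with urllib.request.urlopen(LICENSE_URL, data=data, timeout=5) as resp:
                return json.load(resp).get("status") == "ok"
        except OSError:
            # Decide your policy when the server is unreachable: grace period,
            # hard failure, or an offline allowance.
            return True
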
Another point is that my software could potentially just be copied from one computer to another without using an installer, so protecting only the installer will not be sufficient.
There are two vectors of attack here. One is hiding a piece of information somewhere on the user's system; this is not terribly robust. The other is to fingerprint the user's hardware configuration and encode that data somewhere. If the user changes their hardware, force the product to reactivate itself (this is what Windows and SecuROM do).
As you implement this, please remember that it is literally impossible to prevent illegal copying of software. As a (presumably) small software developer, you need to balance the difficulty to crack your software against the negative effects your DRM imposes on your users. I personally would be extremely hesitant to use software with the checks that you've described in place. Some people are more forgiving than I am. Some people are less so.
The energy and effort to prevent hacks from breaking your code is very time consuming. You'd be better served by focusing on distribution and sales.
My first entry into shareware was in 1990. Back then the phrase was S=R, which stood for "Shareware equals Registered". A lot has changed since then. The web is full of static, and you have to figure out how to get heard above the static.
Here are some things I've learned:
Don't fall in love with your software. Someone will always think it should work differently. Don't try to convert them to your way of thinking; instead, listen and build a list of enhancements for the next release.
Learn how to sell or pay someone to help you sell your stuff
Digital River owns most of the registration companies out there
Create free loss leaders that direct traffic back to you
Find a niche that has gone unmet and fill it
Prevent copying: base the key on the customer's NIC MAC. Most users will not go to the trouble of modifying their NIC MAC. Your app will have a dialog to create and send the key request, including their MAC.
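A rough sketch of that scheme (the HMAC construction and the names here are illustrative only; a production scheme would use asymmetric signatures so the signing secret never ships inside the app):

    # Node-locked key derived from the NIC MAC, as suggested above.
    import hmac, hashlib, uuid

    VENDOR_SECRET = b"replace-with-your-server-side-secret"  # never ship this in the app

    def machine_mac() -> str:
        # uuid.getnode() returns the primary NIC's MAC as a 48-bit integer.
        return f"{uuid.getnode():012x}"

    def issue_key(customer_email: str, mac: str) -> str:
        # Runs on YOUR server when a purchase comes in.
        msg = f"{customer_email}|{mac}".encode()
        return hmac.new(VENDOR_SECRET, msg, hashlib.sha256).hexdigest()[:25]

    def key_matches(customer_email: str, key: str) -> bool:
        # A symmetric check like this would need the secret on the client too,
        # which is why real schemes sign the key instead; this only shows the
        # node-locking idea.
        return hmac.compare_digest(issue_key(customer_email, machine_mac()), key)
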
The open issue is that many apps get cracked and posted to warez sites. Make this less likely by hiding the key validation code in multiple places in your app. Take care to treat honest users with respect, and be sure your key validation does not annoy them in any way.
Make it clear that the key they are buying is node locked.
And worry about market penetration. Get a larger installed base by providing a base product that has no strings attached.
cheers -- Rick

Identifying hostile web crawlers

I am wondering if there are any techniques to identify a web crawler that collects information for illegal use. Plainly speaking, data theft to create carbon copies of a site.
Ideally, this system would detect a crawling pattern from an unknown source (if not on the list with the Google crawler, etc), and send bogus information to the scraping crawler.
If, as a defender, I detect an unknown crawler that hits the site at regular intervals, the attacker will randomize the intervals.
If, as a defender, I detect the same agent/IP, the attacker will randomize the agent.
And this is where I get lost - if an attacker randomizes the intervals and the agent, how would I not discriminate against proxies and machines hitting the site from the same network?
I am thinking of checking the suspect agent for JavaScript and cookie support. If the bogey can't do either consistently, then it's a bad guy.
What else can I do? Are there any algorithms, or even systems designed for quick on-the-fly analysis of historical data?
My solution would be to set a trap. Put some pages on your site where access is banned by robots.txt. Make a link to them on your page, but hide it with CSS, then IP-ban anybody who visits that page.
This will force the offender to obey robots.txt, which means that you can keep important information or services permanently away from him, which will make his carbon-copy clone useless.
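A rough sketch of that trap, using Flask purely for illustration (the /secret-bait path is a made-up example and would also be disallowed in robots.txt):

    # Honeypot: no human and no well-behaved bot should ever request the bait
    # URL, so any visitor to it gets banned.
    from flask import Flask, request, abort

    app = Flask(__name__)
    banned_ips = set()

    @app.before_request
    def block_banned():
        if request.remote_addr in banned_ips:
            abort(403)

    @app.route("/secret-bait")
    def honeypot():
        # Anyone who ignores robots.txt and follows the hidden link lands here.
        banned_ips.add(request.remote_addr)
        abort(403)

    @app.route("/")
    def index():
        # The hidden link a scraper would follow but a browser user never sees.
        return '<a href="/secret-bait" style="display:none">do not follow</a>Welcome'
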
Don't try to recognize them by IP and timing or intervals--use the data you send to the crawler to trace them.
Create a whitelist of known good crawlers--you'll serve them your content normally. For the rest, serve pages with an extra bit of unique content that only you will know how to look for. Use that signature to later identify who has been copying your content and block them.
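One way such a signature might be generated and embedded, as a sketch (the token scheme here is illustrative, not a recommendation):

    # Embed an invisible, per-visitor token in each page; later, search copied
    # sites for these tokens to identify which visitor scraped you.
    import hmac, hashlib

    WATERMARK_SECRET = b"rotate-this-secret"  # illustrative

    def watermark_for(visitor_ip: str, user_agent: str) -> str:
        msg = f"{visitor_ip}|{user_agent}".encode()
        return hmac.new(WATERMARK_SECRET, msg, hashlib.sha256).hexdigest()[:16]

    def stamp_page(html: str, visitor_ip: str, user_agent: str) -> str:
        token = watermark_for(visitor_ip, user_agent)
        # An HTML comment (or zero-width characters inside the text) survives a
        # naive copy but is invisible to readers.
        return html + f"<!-- {token} -->"
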
And how do you keep someone from hiring a person in a country with low wages to use a browser to access your site and record all of the information? Set up a robots.txt file, invest in a security infrastructure to prevent DoS attacks, obfuscate your code (where it's accessible, like JavaScript), patent your inventions, and copyright your site. Let the legal people worry about someone ripping you off.

How do I create a web application where I do not have access to the data?

Premise: The requirements for an upcoming project include the fact that no one except authorized users may have access to certain data. This is usually fine, but this circumstance is not usual. The requirements state that there be no way for even the programmer or any other IT employee to be able to access this information. (They want me to store it without being able to see it, ever.)
In all of the scenarios I've come up with, I can always find a way to access the data. Let me describe some of them.
Scenario I: Restrict the table on the live database so that only the SQL Admin can access it directly.
Hack I: I roll out a change that sends the data to a different table for later viewing. Also, the SQL Admin can see the data, which breaks the requirement.
Scenario II: Encrypt the data so that it requires a password to decrypt. This password would be known by the users only. It would be required each time a new record is created as well as each time the data from an old record was retrieved. The encryption/decryption would happen in JavaScript so that the password would never be sent to the server, where it could be logged or sniffed.
Hack II: Roll out a change that logs keypresses in JavaScript and posts them back to the server so that I can retrieve the password. Or, roll out a change that simply stores the unencrypted data in a hidden field that can be posted to the server for later viewing.
Scenario III: Do the same as Scenario II, except that the encryption/decryption happens on a website that we do not control. This magic website would allow a user to input a password and the encrypted or plain-text data, then use JavaScript to decrypt or encrypt that data. Then, the user could just copy the encrypted text and put it in the field for new records. They would also have to use this site to see the plain-text for old records.
Hack III: Besides installing a full-fledged key logger on their system, I don't know how to break this one.
So, Scenario III looks promising, but it's cumbersome for the users. Are there any other possibilities that I may be overlooking?
If you can have javascript on the page, then I don't think there's anything you can do. If you can see it in a browser, then that means it's in the DOM, which means you can write a script to get it and send it to you after it has been decrypted.
Aren't these problems usually solved via controls:
All programmers need a certain level of clearance and background checks
They are trained to understand that rolling out code to access the data is a fireable or worse offense
Every change in certain areas needs some kind of signoff
For example -- no JavaScript on page without signoff.
If you are allowed to add any code you want, then there's always a way, IMO.
Ask the client to provide a Non-disclosure Agreement for you to sign, sign it, then look at as much data as you want.
What I'm wondering is, what exactly will you be able to do with encrypted data anyway? Pretty much all apps require you to do some filtering of the data, whether it be moving it to a required place, modifying it, sanitizing it, or displaying it. Otherwise, you're just a glorified pipe, and you don't have to do any work.
The only way I can think of where you wouldn't be looking at the data or doing anything with it would be a simple form to table mapping with CRUD options. If you know what format the data will be coming in as you should be able to roll something out with RoR, a simple skin, put SSL into the mix, and roll it out. Test with dummy data in the same format, and you're set.
In fact, is your client unable to supply dummy data for testing? If they can, then your life is simple as all you do is provide an "installable" and tell them how to edit a config file.
I think you could still create the app in the following way:
Create a dev database and set up a user for it.
Ask them for: the data type, size, and name of each field that needs to be on the screen.
Set up the screens, create columns in the database that accept the data type and size they specify.
Deploy the app to production, hooked up to an empty database. Get someone with permission (not you) to go in and set the password on the database user and set the password for the DB user in the web app.
Authorized users can then do whatever they want and you never saw what any of the data looked like.
Of course, maintaining the app and debugging is gonna be a bitch!
--In answer to comments:
Ok, so after setting up the password for the Username in the database and in the web app's config, write a program that connects to the database, sets a randomized password, then writes that same randomized password to the web config.
Prevent any outgoing packets from the machine except to a set of authorized workstations - so you can't install your spyware.
Then set the Admin password on both servers to the same random password, then delete all other users on the servers, delete the program, and delete the program source code.
Wipe the hard drives of the developer machines with the DOD algorithm, and then toss them into an industrial shredder.
10. If the server ever needs debugging, toss it in the trash, buy a new one, and start back at #1.
But seriously - this is an insolvable problem. The best answer to this really is:
Tell them they can't have an application. Write your stuff on paper. Put it in a folder. Lock it in a vault. Thrust, repeat.
Wouldn't scenario 3 just expose all the data to the magic website? This doesn't sound like a solvable problem (at least I can't think of a solution).
Go with whatever solution is easiest for you to implement. I think the requirements show that the client does not understand software development, so it should be easy to sell any approach you take.
I have to say I really don't like the idea of using JavaScript on the client to decrypt the data. That is a huge hole as any script (hacker, GreaseMonkey, IE7Pro, etc.) can access the DOM and get data out of the page.
Also, it is very hard to get around the problem of keystroke loggers. If you throw those into the mix, then your options are limited. At that point you need a security fob such as RSA SecurID (commonly used with corporate VPNs) to generate one-time PINs. That will probably be expensive, and it is a pain, and I have only seen it used with VPNs, but I assume it could work with websites as well.
As far as the website goes, I'd stick with HTTPS and find a way to encrypt/decrypt through the web server rather than relying on JavaScript. SSL traffic isn't very prone to sniffing (it's very difficult to decrypt), so that allows the encryption and decryption to happen server-side, which (IMHO) is more secure.
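A sketch of what server-side encryption at rest could look like, assuming the third-party cryptography package and a key provisioned by someone other than the developer (names here are illustrative):

    # Server-side encryption of a field before it reaches the database.
    # The key is supplied by ops via an environment variable, so the developer
    # never holds it.
    import os
    from cryptography.fernet import Fernet

    fernet = Fernet(os.environ["FIELD_ENCRYPTION_KEY"])  # set by ops, not the dev

    def store_value(plaintext: str) -> bytes:
        return fernet.encrypt(plaintext.encode())   # what actually lands in the DB

    def read_value(ciphertext: bytes) -> str:
        return fernet.decrypt(ciphertext).decode()
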
Look at banking scenarios and other financial institutions for a starting point, and then go from there. Try not to over-complicate if possible.
You can't guarantee against hacking into the data as long as you have access to the server it lives on. So tell the employer they have to host the data somewhere else and grant access to the client's browser via a secure HTTPS connection.
You can design your web page to dynamically load an XML data stream securely, and format it into a web page using an XSLT script on the client.
See http://www.w3schools.com/xsl/xsl_client.asp for examples
That way you produce the code, but you never have access to the data. Only the user has access to their own data.
As for how the employer is going to host the data without granting any IT people access to it, that's their problem. It's a foolish requirement.
I think that I'll just tell them that they either have to trust a couple of us to have access (and not look at it) or they don't get a project.
Thanks for the answers. Feel free to post more thoughts if you have them.
You can never have 100% security, and extra security comes at a cost of speed/price/convenience etc.
Let's suppose you take scenario 3 - one of your programmers can use social engineering to get the password from one of the users. Goodbye security.
There's no point having a high-security iron door as a gate if people can just walk around it. Just implement a decent level of security.
(They want me to store it without being able to see it, ever.)
Hey, the recording industry wants people to be able to listen to their music, but not copy it. Sounds like they should get together sometime!
Their idea won't work for the same reason DRM doesn't work: the trust chain is inherently compromised. Encryption examples often use Alice, Bob, and Charlie where Alice is trying to communicate with Bob without Charlie listening in. With DRM, the trust chain is compromised because Bob and Charlie are the same person. With your situation, Charlie is the guy writing the software that Alice and Bob use to communicate. There's an implied trust, because if you don't trust Charlie then you can't trust Charlie's software, either.
That's the root of the issue: trust. If they can't trust the programmer, the game is over before it starts.
There are lots of options based on what their goal really is, but I am confused by their paranoia, er, intent:
Is this their (and end-user) data that they wish to keep private or end-user data to be kept private from everyone?
Is it just that your (or any contracted) company is suspect?
Are they afraid of over-the-wire snooping?
Are they afraid of DOM access through JavaScript or browser plugins?
Are they planning a staged deployment? In that case you work on a test/dev server without real data but have no access to the production server with the real data, and DNS logging and/or firewall rules keep your hacks from working undetected.
Ultimately if the data is stored in a DB then the programmer and DB admin can, by working together, get it. Period. A good audit should uncover that, though.
If this is truly a requirement, the only way to guard against this is to hire an outside firm to audit the code prior to releasing the software, and that's going to be very expensive.
