Consensus algorithm checklist - cryptocurrency

I wrote a new consensus algorithm. Is there a self-evaluation checklist I can run through to see if it meets the basic requirements? For example, is it resistant to double-spend attacks? How does it scale?

I reviewed this entire algorithm. Though the idea is great, it feels a bit incomplete. The self-evaluation checklist below is based on the requirements and safety measures adopted by well-established blockchains, e.g. ETH, BTC.
System Criteria:
What is the required storage capacity of the system?
-- RAM usage, bandwidth
What happens when the entire network goes offline?
Algorithm evaluation:
Is this algorithm scalable? Scalable as in: does it remain operable when the number of users grows exponentially?
How long does it take the miners to reach a 2/3 consensus?
Are there safety measures for the users' funds?
How can a user transfer funds safely? (Cryptographic protection: hashing for integrity plus signatures or encryption, so that only authorised entities can act on a transaction; see the short sketch after this checklist.)
Architecture evaluation:
Is it decentralized, transparent and immutable?
User evaluation:
Is there enough incentive for a miner/validator to validate the transactions?
Is there enough incentive for a "new" miner/validator to join the network?
Is it possible for a single entity to dominate the network?
What safety measures are in place to prevent blind/unreliable transfer of data?
Resistance to attacks evaluation:
Is the algorithm resistant to double-spend attacks, eclipse attacks, Sybil attacks (forged identities), and 67% attacks?
Is there a way for honest users to defend against such attacks? If not, how likely is it that an attacker succeeds after attacking the blockchain?
As an attacker, what is the weakness of this algorithm? Once something is confirmed by 2/3 it is unchangeable, so how could you obtain that 2/3 vote?
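On the fund-transfer point above, a small illustration of what "only authorised entities" usually means in practice. This is only a toy Python sketch: it uses a SHA-256 digest for tamper-evidence and an HMAC with a made-up secret as a stand-in for a real digital signature (production chains use public-key signatures such as ECDSA, so anyone can verify without knowing the signer's secret).

```python
import hashlib
import hmac
import json

def tx_digest(tx: dict) -> bytes:
    """Canonical SHA-256 digest of a transaction (tamper-evidence only)."""
    payload = json.dumps(tx, sort_keys=True).encode()
    return hashlib.sha256(payload).digest()

def sign_tx(tx: dict, secret: bytes) -> str:
    # Stand-in "signature": an HMAC over the digest. Real chains use
    # public-key signatures (e.g. ECDSA) instead of a shared secret.
    return hmac.new(secret, tx_digest(tx), hashlib.sha256).hexdigest()

def verify_tx(tx: dict, secret: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign_tx(tx, secret), signature)

secret = b"alice-private-key"            # hypothetical key material
tx = {"from": "alice", "to": "bob", "amount": 5}
sig = sign_tx(tx, secret)
tampered = dict(tx, amount=500)
print(verify_tx(tx, secret, sig))        # True
print(verify_tx(tampered, secret, sig))  # False: any change breaks the signature
```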
These are some conditions that came to my mind while reading the algorithm description and that were left unanswered. A consensus algorithm takes into account the maximum throughput and latency of current systems in order to provide a holistic idea of how to evade attacks and secure users. If the consensus algorithm fails at either of those, it will not fly in the market, because a lacking algorithm makes the network untrustworthy. To make sure it is not, questions like these should be asked, in addition to the blockchain/algorithm-specific questions that would arise in a user's mind when deciding whether to join a network. At the end of the day, everyone likes to keep their money safe, secure and hidden away from the general public, to avoid any and all kinds of attack.

I'll admit I didn't read it too carefully, but I was looking at how the document handles the CAP theorem.
There is a statement in your doc: "since they (validators) are looking at the full blockchain picture". This statement is never true in a distributed system.
Second statement: "Once 2/3 of the validators approve an item". Who makes the decision that 2/3 has been reached? When does the customer know that the transaction is good? It seems the system is not very stable and will come to a halt quite often.
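To make the "2/3 approved" concern concrete, here is a toy Python sketch of the naive local check a validator could run. The validator names and simple set arithmetic are purely illustrative; the point is that every node can only evaluate this against the votes it happens to have seen, so agreeing on when 2/3 was reached is itself a consensus problem.

```python
from fractions import Fraction

def two_thirds_reached(approvals: set, validator_set: set) -> bool:
    """Naive local check: has a 2/3 supermajority of known validators approved?
    Each node evaluates this against its own view of the votes, which is why
    'who decides 2/3 was reached' is itself a consensus problem."""
    if not validator_set:
        return False
    seen = len(approvals & validator_set)
    return Fraction(seen, len(validator_set)) >= Fraction(2, 3)

validators = {"v1", "v2", "v3", "v4", "v5", "v6"}
print(two_thirds_reached({"v1", "v2", "v3", "v4"}, validators))  # True  (4/6 >= 2/3)
print(two_thirds_reached({"v1", "v2", "v3"}, validators))        # False (3/6 <  2/3)
```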
Looking forward to other comments from the community :)

Related

Two-phase commit: availability, scalability and performance issues

I have read a number of articles and got confused.
Opinion 1:
2PC is very efficient, a minimal number of messages are exchanged and latency is low.
Source:
http://highscalability.com/paper-consensus-protocols-two-phase-commit
Opinion 2:
It is very hard to scale distributed transactions to a high level; moreover, they reduce throughput. As 2PC guarantees ACID, it puts a great burden on the system due to its complex coordination algorithm.
Source: http://ivoroshilin.com/2014/03/18/distributed-transactions-and-scalability-issues-in-large-scale-distributed-systems/
Opinion 3:
“some authors have claimed that two-phase commit is too expensive to support, because of the performance or availability problems that it brings. We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions. Running two-phase commit over Paxos mitigates the availability problems.”
Source: http://courses.cs.washington.edu/courses/csep552/13sp/lectures/6/spanner.pdf
Opinion 4:
The 2PC coordinator also represents a Single Point of Failure, which is unacceptable for critical systems (I believe the single point of failure here is the coordinator).
Source: http://www.addsimplicity.com/adding_simplicity_an_engi/2006/12/2pc_or_not_2pc_.html
The first three opinions contradict each other. The fourth one I think is correct. Please clarify what is wrong and what is correct; it would also be great to give the facts behind it.
The 4th statement is correct, but maybe not in the way you are reading it. In 2PC, if the coordinator fails, the system cannot make progress. It is therefore often desirable to use a fault-tolerant protocol like Paxos (see Gray and Lamport, for example), which allows the system to make progress safely when there are failures.
Opinion 3 should be read in the context of the rest of the Spanner paper. The authors are saying that they have developed a system which allows efficient transactions in a distributed database, and that they think it's the right default tradeoff for users of the system. The way Spanner does that is well detailed in the paper, and it is worth reading. Note that Spanner is simply a way (a clever way, granted) of organizing the coordination which is inherently required to implement serializable transactions. See Gilbert and Lynch for one way to look at the limits on coordination.
Opinion 2 is a common belief, and there are indeed tradeoffs between availability and richness of transaction semantics in real-world distributed systems. Current research, however, is making it clear that these tradeoffs are not as dire as they have been portrayed in the past. See this talk by Peter Bailis for one of the research directions. If you want true serializability or linearizability in the strictest sense, you need to obey certain lower bounds of coordination in order to achieve them.
Opinion 1 is technically true, but not very helpful in the way you quoted it. 2PC is optimal in some sense, but it is seldom implemented naively because of the availability tradeoffs. Many ad-hoc attempts to address these tradeoffs lead to incorrect protocols. Others, like Paxos and Raft, successfully address them at the cost of some complexity.
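For readers new to the protocol, here is a deliberately over-simplified, in-memory Python sketch of the two phases. It is not a real implementation (no durable logging, timeouts, or recovery); it only shows where the blocking window sits when the coordinator fails between prepare and commit, which is the availability problem discussed above and the reason 2PC is often layered over Paxos or Raft.

```python
# Minimal in-memory two-phase commit sketch (not fault tolerant).

class Participant:
    def __init__(self, name):
        self.name = name
        self.prepared = False
        self.committed = False

    def prepare(self) -> bool:
        # Acquire locks, write the redo/undo log entry, then vote yes.
        self.prepared = True
        return True

    def commit(self):
        self.committed = True
        self.prepared = False

    def abort(self):
        self.prepared = False

def two_phase_commit(participants):
    # Phase 1: collect votes from every participant.
    votes = [p.prepare() for p in participants]
    # If the coordinator crashes HERE, every prepared participant is stuck
    # holding its locks until the coordinator recovers: the blocking problem.
    # Phase 2: commit only if everyone voted yes, otherwise abort.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"

print(two_phase_commit([Participant("db1"), Participant("db2")]))  # committed
```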

Why do relational databases have scalability issues?

Recently I read some articles online indicating that relational databases have scaling issues and are not a good choice for big data, especially in cloud computing where data volumes are large. But by googling I could not find solid reasons why they do not scale well. Can you please explain the limitations of relational databases when it comes to scalability?
Thanks.
Imagine two different kinds of crossroads.
One has traffic lights or police officers regulating traffic, motion on the crossroad is at limited speed, and there's a watchdog registering precisely what car drove on the crossroad at what time precisely, and what direction it went.
The other has none of that and everyone who arrives at the crossroad at whatever speed he's driving, just dives in and wants to get through as quick as possible.
The former is any traditional database engine. The crossroad is the data itself. The cars are the transactions that want to access the data. The traffic lights or police officer is the DBMS. The watchdog keeps the logs and journals.
The latter is a NOACID type of engine.
Both have a saturation point, at which point arriving cars are forced to start queueing up at the entry points. Both have a maximal throughput. That threshold lies at a lower value for the former type of crossroad, and the reason should be obvious.
The advantage of the former type of crossroad should however also be obvious. Way less opportunity for accidents to happen. On the second type of crossroad, you can expect accidents not to happen only if traffic density is at a much much lower point than the theoretical maximal throughput of the crossroad. And in translation to data management engines, it translates to a guarantee of consistent and coherent results, which only the former type of crossroad (the classical database engine, whether relational or networked or hierarchical) can deliver.
The analogy can be stretched further. Imagine what happens if an accident DOES happen. On the second type of crossroad, the primary concern will probably be to clear the road as quick as possible, so traffic can resume, and when that is done, what info is still available to investigate who caused the accident and how ? Nothing at all. It won't be known. The crossroad is open just waiting for the next accident to happen. On the regulated crossroad, there's the police officer regulating the traffic who saw what happened and can testify. There's the logs saying which car entered at what time precisely, at which entry point precisely, at what speed precisely, a lot of material is available for inspection to determine the root cause of the accident. But of course none of that comes for free.
Colourful enough as an explanation ?
Relational databases provide solid, mature services according to the ACID properties. We get transaction handling, efficient logging to enable recovery, etc. These are core services of relational databases, and the ones they are good at. They are hard to customize, and they might be considered a bottleneck, especially if you don't need them in a given application (e.g. serving website content of low importance; in this case, for example, the widely used MySQL does not provide transaction handling with the default storage engine, and therefore does not satisfy ACID). Lots of "big data" problems don't require these strict constraints, for example web analytics, web search or processing moving object trajectories, as they already include uncertainty by nature.
When reaching the limits of a given computer (memory, CPU, disk: the data is too big, or the data processing is too complex and costly), distributing the service is a good idea. Lots of relational and NoSQL databases offer distributed storage. In this case, however, ACID turns out to be difficult to satisfy: the CAP theorem states something similar, namely that consistency, availability and partition tolerance cannot all be achieved at the same time. If we give up ACID (settling for BASE, for example), scalability might be increased.
See this post, for example, for a categorization of storage methods according to CAP.
Another bottleneck might be the flexible, richly typed relational model itself with its SQL operations: in lots of cases a simpler model with simpler operations would be sufficient and more efficient (like untyped key-value stores). The common row-wise physical storage model can also be limiting: for example, it isn't optimal for data compression.
There are however fast and scalable ACID compliant relational databases, including new ones like VoltDB, as the technology of relational databases is mature, well-researched and widespread. We just have to select an appropriate solution for the given problem.
Take the simplest example: inserting a row with a generated ID. Since IDs must be unique within a table, the database must somehow lock some sort of persistent counter so that no other INSERT uses the same value. So you have two choices: either allow only one instance to write data, or use a distributed lock. Both solutions are a major bottleneck, and this is the simplest example!
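A toy Python sketch of that trade-off. The centralized counter below stands in for the "persistent counter plus lock" case, and the node-id/stride generator is one common coordination-free workaround (others include UUIDs or pre-allocated ID blocks); the names and numbers are illustrative only.

```python
import threading
import itertools

# Single shared counter: every insert from every node must serialize on this
# lock (or a distributed lock in the multi-node case) -- the bottleneck above.
_counter = itertools.count(1)
_counter_lock = threading.Lock()

def next_id_centralized() -> int:
    with _counter_lock:
        return next(_counter)

# One common workaround: give each node its own interleaved ID stream
# (node_id + stride), so IDs stay unique without cross-node coordination.
def make_node_id_generator(node_id: int, num_nodes: int):
    counter = itertools.count()
    def next_id() -> int:
        return node_id + num_nodes * next(counter)
    return next_id

node0_ids = make_node_id_generator(0, num_nodes=2)
node1_ids = make_node_id_generator(1, num_nodes=2)
print([node0_ids() for _ in range(3)])  # [0, 2, 4]
print([node1_ids() for _ in range(3)])  # [1, 3, 5]
```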

Why would you not want to consolidate Mission Critical Databases?

Suppose you wanted to consolidate all of your mission-critical databases into one instance so that you can save some licensing money: what would the potential risks be, and are there any good articles or case studies on this? I realize this is a terrible idea, but somebody wants to do this and is willing to maximize the hardware resources if needed. I am trying to present him with something quantifiable, or some articles that can steer him away from doing this.
There are three big mission-critical databases: Billing, Dynamics CRM, and an in-house application that keeps track of transactions. These are high-volume databases for a small/mid-sized company. I need quantifiable data or a good case study to convince somebody that this is the wrong path to take. Any other advice on how I can convince this person would also be helpful.
The answer depends. At first glance, it may look like a bad idea. On the other hand, if the goal is to consolidate everything on one server and then replicate that server in a remote environment, then you are on the way to a more reliable system. IT might prefer having everything in one place rather than dealing with mission-critical servers spread out over the terrain.
One major issue is the need for a larger machine. So, if any of the systems use software whose license depends on the size of the machine, you might end up spending more money because you need a larger server. I've seen this happen with SAS licensing, for instance.
Perhaps the biggest issue, though, is that the different applications are probably on different development cycles -- whether developed in-house or from outside vendors. So, updating the hardware/operating system/software can become a nightmare. A fix or enhanced functionality in A might require an OS patch, which in turn, has not been tested on B. This maintenance issue is the reason why I would strongly advocate separate servers.
That said, mission critical applications are exactly that, mission critical. The driving factor should not be a few dollars on hardware. The driving factors should be reliability, maintenance, performance, sustainability, and recovery.
The comments made by Oded, Catcall and Gilbert are spot on.
The bank where I learnt the IT trade ran its entire core business on a single MVS (later Z/OS) mainframe, which ran a single DBMS and a single transaction processor (unless you counted TSO as a transaction processor).
The transaction processor went down regularly (say, once a day). It never caused the bank to go broke because it was always up and running again in less than a minute. Mileage may vary, but losing one minute of business time in an entire working day (480 minutes, or < 0.25%) really isn't dangerously disruptive.
The single DBMS went down too, at times (say, twice a month). I can still hear the sysprogs yelling "DBMS is down" over the fence to the helpdesk officers, meaning "expect to be getting user calls". It never caused the bank to go broke because it was always up and running again in a matter of minutes. Mileage may vary, but losing a couple of minutes of business time each month really shouldn't be dangerously disruptive.
The one time I do remember when the bank was really close to bankruptcy was when the development team had made a mess out of a new project in the bank's absolute core business, and the bank was as good as completely out of business (its REAL business) for three or four days in a row. That wasn't 0.25% loss of business time, but almost 100 TIMES more.
Moral of my story ? Two of them. (a) It's all about risk assessment (= probability assessment) and risk-weighted (= probability-weighted) cost estimation. (b) If you ask a question on SO (which implies a kind of recognition/expectation that answerers have more expertise than you on the subject matter), and people like Oded and Catcall provide you with a concise answer, which is accurate and to the point, then don't ask for papers or case studies to back up their answers. If you don't want to accept the experts' expertise, then why bother asking anything in the first place ?

algorithmic trading simulator/benchmark data [closed]

I am interested in playing with algorithmic trading strategies. Does anyone know if there is a simulator or benchmark data I could play with offline (without actually making any investments)?
This question is pretty broad; as it is, there is no mention of which instruments you're interested in. For instance:
Stocks
Bonds
Commodities
Forex
Derivatives (like Futures or Options)
For the moment, I will assume you're interested in stocks... if so, take a look at Ninja Trader, which offers these features for free. You can get free end-of-day stock data from Yahoo Finance, which is sufficient for longer-term trading timelines; keep in mind that the shorter the trading cycle, the more stringent your data resolution needs to be.
If you are willing to put several thousand into a trading account, any number of brokers will be happy to send you live intra-day market feeds; but you don't need money in an account for paper trading (at least with my broker). I think the broker that is most flexible for programmers is Interactive Brokers. You can get sub-second data from them via an API; just understand they won't give you tick-level granularity, because they summarize their feeds. The specific details vary, so it's better to talk to them if you have tight granularity requirements. As for off-line simulation, you can do that with Ninja Trader, Interactive Brokers, and many other online brokers (see What online brokers offer APIs?).
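As a concrete starting point for the free end-of-day route, here is a minimal sketch assuming the third-party yfinance package (pip install yfinance), which pulls Yahoo Finance daily bars; the ticker and date range are just examples.

```python
# Assumes the third-party yfinance package (pip install yfinance).
import yfinance as yf

# Daily (end-of-day) bars for one ticker over one year.
bars = yf.download("SPY", start="2020-01-01", end="2020-12-31", interval="1d")
print(bars.tail())
bars.to_csv("spy_daily.csv")  # keep a local copy for offline backtesting
```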
BONUS MATERIAL
Since you're offering +200, I will share a bit more that you might be able to use... Keep it or toss in the trash for whatever value it brings.
Trading Timeframe
As a general rule, the shorter the timeframe, the more difficult the trades, and the harder it is to consistently make money. If you're unsure where to start with timelines, look at trading cycles of days or weeks at a time; then if you see too many opportunities passing you by, refine your system to a smaller timeline. The other thing to consider is how often you want to touch this code and tweak algorithms. The general rule is, as the trading cycles get shorter, you do more calibration and maintenance on your algorithms. It is not that unusual to find an algorithmic trader who wrote a good Swing-Trading platform that has worked as-is for the last decade. On the other hand, Day Trading Algorithms tend to require more care and feeding based on changes in market conditions.
Trading Style
Closely related to timelines are your trading tactics. Are you:
Trend-Following
Swing-Trading
Day Trading
Scalping
Sub-pennying
Using Statistical Arbitrage
Playing Options
Trade Management / Mindset
Trade management is a rather big topic, and one you'll find addressed in-depth if you lurk around trader boards like Elite Trader. While it may sound out of place to discuss some of this in the same thread about automated trading platforms, I'm sure you'd agree that your assumptions and attitude have insidious ways of leeching into your code. I will share a few things off the top of my head:
Success is primarily about preventing losing trades. Good trades take care of themselves.
Always trade with a stop-loss. Conventional wisdom is, "Your first loss is your smallest loss". If things start going south, figure out a way to consistently get out while keeping most of your previous profits; holding losers is the fast path to becoming a boiled frog.
There is no such thing as "Too High" or "Too low". The market moves in herd-mentality and doesn't care what you think it should be doing.
Closely related to point "3": Trade with the long-term trend. Fighting the trend (affectionately known as "counter-trending") may sound attractive to natural contrarians, but you need to be very good to do it well. Trading is enough work without trying to counter-trend.
Trading within the hour after a Federal Reserve market announcement is very difficult; I think it's just better to be out of the market. The quick profits can look seductive, but this is where the pros love to eat the amateur traders; brutal reversals can develop within a couple of minutes.
Avoid trading on margin unless you have a proven technique that you have backtested with at least a couple years of data.
The first thirty minutes and last hour of regular trading can see rapid changes in volatility.
Regarding profit taking, "Bulls get fed, hogs get slaughtered"
If you find that you are not making profits, consider evaluating your trading frequency; reducing your trades to a minimum is key to success, otherwise, slippage, commissions and fees on junk trades will eat your profits.
Due to computational delays / processing time and partial order fills, limit orders are rather challenging to manage and pretty close to minutia; algorithmic traders have much bigger fish to fry.
Coding
Log every data point and decision you make in your code; three logging levels work for me. Trading is an inexact task, and tiny changes can blow up your previously profitable algorithm. If something blows up, you need a way to compare against what was working.
Prototype in a scripting language; if something is slow, you can always offload it to a compiled language later. I think Python is fantastic for quantitative finance... mature unit testing, C / C++ integration, numpy, pyplot and pandas are winners for me (a minimal pandas backtest sketch follows after this list).
More pandas plugs... (pandas video), also see: Compute a compounded return series in Python
I started off with plain-ole CSV, but I'm in the process of migrating to HDF5 for tick data archives.
Trade-simulation is deceptive: Simulated trades don't have problems filling due to low-liquidity or high-demand for the instrument; depending on market conditions, my real trades can see two or three seconds delay from the time I send the order to the time I get a fill. Simulated trades also don't get data blackouts; be sure to include sudden data loss (and how to recover) in your plans. Lower-cost brokers tend to suffer more blips and blackouts, but if your timeframe is longer, it may be something you can ignore.
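As referenced in the scripting-language point above, here is a minimal vectorized pandas backtest sketch (a simple moving-average crossover, no commissions or slippage). The file name and the Close column assume a daily-bar CSV like the one saved in the earlier yfinance example; adapt them to however you store your data, and treat this as a shape-of-the-code illustration rather than a strategy recommendation.

```python
import pandas as pd

# Daily bars with a Close column (assumed file from the earlier sketch).
bars = pd.read_csv("spy_daily.csv", index_col=0, parse_dates=True)

fast = bars["Close"].rolling(20).mean()
slow = bars["Close"].rolling(100).mean()

# Long 1 unit when the fast average is above the slow one, flat otherwise.
# shift(1) makes the signal use yesterday's information only (no look-ahead).
position = (fast > slow).astype(int).shift(1).fillna(0)

daily_returns = bars["Close"].pct_change().fillna(0)
strategy_returns = position * daily_returns

equity = (1 + strategy_returns).cumprod()
print(f"Buy & hold: {(1 + daily_returns).prod():.2f}x")
print(f"Strategy:   {equity.iloc[-1]:.2f}x")
```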
Legal
The information set forth herein has been obtained or derived from sources believed by author to be reliable. However, the author does not make any representation or warranty, express or implied, as to the information’s accuracy or completeness, nor does the author recommend that the attached information serve as the basis of any investment decision. This data has been provided to you solely for information purposes and does not constitute an offer or solicitation of an offer, or any advice or recommendation, to purchase any securities or other financial instruments, and may not be construed as such. By using any of this information, you expressly agree that all risks associated with the performance and quality of the information is assumed solely by you. The author shall not be liable for any direct, indirect, incidental, special or consequential damages arising out of the use of or inability to use the information, even if the author has been advised of the possibility of such damages. The information is made available by the author "as is" and "with all faults".
Is this the kind of data you're looking for?
Ohio Department of Finance, Financial Data Finder
This page might also be of help:
Yahoo Finance Download, HTTP Interface parameters
And this one:
mathfinance.cn: Free financial resources
Not offline, but still a good link to give to other readers:
Google Finance API
I hope I understood the question correctly.
I use AMIBroker.
It's primarily used for backtesting algorithmic trading strategies. It is very fast and can download data from a variety of free sources.
Try cloud-based backtesting engines like QuantConnect and Quantopian.
Quantopian is a Python-based IDE where you can write strategies and backtest online. You can do simulated trades as well as real trades with Interactive Brokers.
QuantConnect is similar to Quantopian; it uses a .NET-based IDE, and you can run simulated trades and live trading with the discount broker Tradier.
QuantGo lets you rent high frequency data sets rather than buying them. I really like it because it's only a couple hundred per month instead of thousands.
Quandl has some good free data sets if you are only interested in trading at longer time intervals. Their stock data API is pretty slick (link).
One alternative would be PyAlgoTrade (http://gbeced.github.io/pyalgotrade/). It's an open-source Python library to backtest trading strategies.

Is it theoretically possible to emulate a human brain on a computer? [closed]

Our brain consists of billions of neurons which basically work with all the incoming data from our senses, handle our consciousness, emotions and creativity as well as our hormone system, etc.
So I'm completely new to this topic but doesn't each neuron have a fixed function? E.g.: If a signal of strength x enters, if the last signal was x ms ago, redirect it.
From what I've learned in biology about our nervous system, which includes our brain because both consist of simple neurons, it seems to me as if our brain is one big, complicated computer.
Maybe so complicated that things such as intelligence and cognition become possible?
As the most complicated things about a neuron are pretty much the chemical aspects of generating an electric signal, keeping itself alive, and eventually dividing, it should be fairly easy to emulate one on a computer, shouldn't it?
You won't have to worry about keeping your virtual neuron alive, will you?
If you can emulate a single neuron on a computer, which shouldn't be too hard, could you theoretically emulate more than 1000 billion of them, recreating intelligence, cognition and maybe even creativity?
In my question I'm leaving out the following aspects:
Speed of our current (super) computers
Actually writing a program for emulating neurons
I don't know much about this topic, please tell me if I got anything wrong :)
(My secret goal: Make a copy of my brain and store it on some 10 million TB HDD and make someone start it up in the future)
A neuron-like circuit can be built with a handful of transistors. Let's say it takes about a dozen transistors on average. (See http://diwww.epfl.ch/lami/team/vschaik/eap/neurons.html for an example.)
A brain-sized circuit would require 100 billion such neurons (more or less).
That's 1.2 trillion transistors.
A quad-core Itanium has 2 billion transistors.
You'd need a server rack with 600 quad-core processors to be brain-sized. Think $15M US to purchase the servers. You'll need power management and cooling plus real-estate to support this mess.
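The arithmetic behind those numbers, spelled out as a quick Python back-of-the-envelope check (all figures are the rough estimates from this answer, not measurements):

```python
# Back-of-the-envelope arithmetic from the answer above.
transistors_per_neuron = 12      # "about a dozen"
neurons = 100e9                  # ~100 billion neurons
transistors_per_cpu = 2e9        # quad-core Itanium, ~2 billion transistors

total_transistors = neurons * transistors_per_neuron
cpus_needed = total_transistors / transistors_per_cpu
print(f"{total_transistors:.1e} transistors, ~{cpus_needed:.0f} quad-core CPUs")
# -> 1.2e+12 transistors, ~600 quad-core CPUs
```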
One significant issue in simulating the brain is scale. The actual brain only dissipates around 20 watts. Power consumption is 3 square meals per day. A pint of gin. Maintenance is 8 hours of downtime. Real estate is a 42-foot sailboat (22 Net Tons of volume as ships are measured) and a place to drop the hook.
A server cage with 600 quad-core processors requires a lot more energy, cooling and maintenance. It would take two full-time people to keep this "brain-sized" server farm running.
It seems simpler to just teach the two people what you know and skip the hardware investment.
Roger Penrose presents the argument that human consciousness is non-algorithmic, and thus is not capable of being modeled by a conventional Turing-machine-type digital computer. If that is the case, you can forget about building a brain with a computer...
Simulating a neuron is possible and therefore theoretically simulating a brain is possible.
The two things that always stump me, though, are input and output.
We have a very large number of nerve endings that all provide input to the brain. Without them the brain is useless. How can we simulate something as complicated as the human brain without also simulating the entire human body!?!
Output, once the brain has "dealt" with all of the inputs that it gets, what is then the output from it? How could you say that the "copy" of your brain was actually you without again hooking it up to a real human body that could speak and tell you?
All in all, a fascinating subject!!!!
The key problem with simulating neural networks (and the human brain is a neural network) is that they function continuously, while digital computers function in cycles. In a neural network, different neurons function independently in parallel, while on a computer you only simulate discrete system states.
That's why adequately simulating real neural networks is very problematic at the moment and we're very far from it.
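For a sense of what that discrete-state approximation looks like in practice, here is a minimal Python sketch of a leaky integrate-and-fire neuron stepped forward with a fixed Euler time step. The parameter values are arbitrary illustrative numbers, and real biophysical models (e.g. Hodgkin-Huxley) are far richer; the sketch only shows how continuous membrane dynamics get squeezed into discrete cycles.

```python
# Leaky integrate-and-fire neuron, discretized with a fixed Euler time step.
# All parameter values are illustrative only.

def simulate_lif(input_current=1.6, dt_ms=0.1, t_end_ms=100.0,
                 tau_ms=10.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    steps = int(t_end_ms / dt_ms)
    v = v_rest
    spike_times = []
    for step in range(steps):
        # Euler update of dv/dt = (-(v - v_rest) + I) / tau
        dv = (-(v - v_rest) + input_current) / tau_ms
        v += dv * dt_ms
        if v >= v_thresh:
            spike_times.append(step * dt_ms)
            v = v_reset  # fire and reset
    return spike_times

print(simulate_lif()[:5])  # first few spike times, in ms
```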
Yes, the Blue Brain Project is getting close, and I believe Moore's Law has a $1000 computer getting there by 2049.
The main issue is that our brains are based largely on controlling a human body, which means that our language comprehension and production, the basis of our high-level reasoning and semantic object recognition, is strongly tied to its potential and practiced outputs to a larynx, tongue, and face muscles. Further, our reward systems are tied to signals that indicate sustenance and social approval, which are not the goals we generally want a brain-based AI to have.
An exact simulation of the human brain will be useful in studying the effects of drugs and other chemicals, but I think that the next steps will be in isolating pathways that let us do things that are hard for computers (e.g. visual system, fusiform gyrus, face recognition), and developing new or modifying known structures for representing concepts.
Short answer: yes, we will surely be able to reproduce artificial brains someday, but no, it may not be with our current computer models (Turing machines), because we simply don't know enough about the brain yet to say whether we need new kinds of computers (super-Turing or biologically engineered brains) or whether current computers (with more power/storage space) are enough to simulate a whole brain.
Long:
Disclaimer: I am working in computational neuroscience research and I am interested both by the neurobiological side and the computational (artificial intelligence) side.
Most of the answers take as true the OP's postulate that simulating neurons is enough to capture the whole brain state and thus simulate a whole brain.
That's not true.
The brain is more than just neurons.
First, there is the connectivity, the synapses, which is of paramount importance, maybe even more so than the neurons.
Secondly, there are glial cells such as astrocytes and oligodendrocytes that also possess their own connectivity and communication system.
Thirdly, neurons are heterogeneous, which means that there is not just one template model of a neuron that we could scale up to the required amount to simulate a brain; we also have to define multiple types of neurons and place them at the right locations. Plus, the types can be continuous, so in fact you can have neurons that are halfway between 3 different types...
Fourthly, we don't know much about the rules of the brain's information processing and management. Sure, we discovered that the cerebellum works pretty much like an artificial neural network using stochastic gradient descent, and that the dopaminergic system works like TD-learning (a tiny TD(0) sketch follows after this list), but then we have no clue about the rest of the brain; even memory is out of reach (although we guess it's something close to a Hopfield network, there's no precise model yet).
Fifthly, there are so many other examples from current research in neurobiology and computational neuroscience showing the complexity of the brain's structures and network dynamics that this list could go on and on.
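As referenced in the fourth point, here is a tiny tabular TD(0) sketch on a made-up three-state chain, just to show the reward-prediction-error update that dopaminergic activity is often compared to; the states, rewards and learning parameters are purely illustrative.

```python
# Tabular TD(0) on a toy 3-state chain: A -> B -> C, where C is terminal
# and entering it pays a reward of 1. Parameters are illustrative only.
states = ["A", "B", "C"]
values = {s: 0.0 for s in states}
alpha, gamma = 0.1, 0.9

for _ in range(500):
    s = "A"
    while s != "C":
        s_next = "B" if s == "A" else "C"
        reward = 1.0 if s_next == "C" else 0.0
        # TD error (the "reward prediction error"): delta = r + gamma*V(s') - V(s)
        delta = reward + gamma * values[s_next] - values[s]
        values[s] += alpha * delta
        s = s_next

print(values)  # V(B) -> ~1.0, V(A) -> ~0.9
```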
So in the end, your question cannot be answered, because we simply do not yet know enough about the brain to say whether our current computers (Turing machines) are enough to reproduce the complexity of biological brains and give rise to the full spectrum of cognitive functions.
However, biology is getting closer and closer to computer science, as you can see with biologically engineered viruses and cells that are programmed pretty much like you develop a computer program, and gene therapies that basically re-engineer a living system based on its "class" template (the genome). So I dare say that once we know enough about the brain's architecture and dynamics, the in-silico reproduction won't be an issue: if our current computers cannot reproduce the brain because of theoretical constraints, we will devise new computers. And if only biological systems can reproduce the brain, we will be able to program an artificial biological brain (we can already 3D-print functional bladders, skin, veins, hearts, etc.).
So I would dare say (even if it can be controversial; this is my own claim here) that yes, artificial brains will surely be possible someday, but whether it will be a Turing-machine computer, a super-Turing computer or a biologically engineered brain remains to be seen, depending on our progress in understanding the brain's mechanisms.
I don't think they are remotely close enough to understanding the human brain to even begin thinking about replicating it.
Scientists would have you think we are nearly there, but with regards to the brain we're not much further along than Dr. Frankenstein.
What is your goal? Do you want a program that can make intelligent decisions or a program that provides a realistic model of how the human brain actually works? Artificial intelligence can be approached from the perspective of psychology, where the goal is to simulate the brain and thereby get a better understanding of how humans think, or from the perspective of mathematics, optimization theory, decision theory, information theory, and computer science, in which case the goal is to create a program that is capable of making intelligent decisions in a computationally efficient manner. The latter, I would say, is pretty much solved, although advances are definitely still being made. When it comes to a realistic simulation of the brain, I think we were only recently able to simulate a cat's brain semi-realistically; when it comes to humans, it would not be computationally feasible at present.
Researchers far smarter than most reckon so; see Blue Brain from IBM and others.
The Blue Brain Project is the first comprehensive attempt to reverse-engineer the mammalian brain, in order to understand brain function and dysfunction through detailed simulations.
Theoretically the brain can be modeled using a computer (as software and hard/wetware are compatible or mutually expressible). The question isn't a theoretical one as far as computer science goes, but a philosophical one:
Can we model the (chaotic) way in which a brain develops? Is a brain's power its hardware, or the environment that shapes the development and emergent properties of that hardware as it learns?
Even more mental:
If I modeled my own brain with 100% accuracy and then started the simulation, and that brain had my memories (as it has my brain's physical form)... is it me? If not, what do I have that it doesn't?
I think that if we are ever in a position to emulate the brain, we should rather be working on logical systems based on biological principles, with better applications than the brain itself.
We all have a brain, and we all have access to its amazing power already ;)
A word of caution. Current projects on brain simulation work on a model of a human brain. Your idea about storing your mind on a hard-disk is crazy: if you want a replica of your mind you'll need two things. First, another "blank" brain. Second, devise a method to perfectly transfer all the information contained in your brain: down to the quantum states of every atom in it.
Good luck with that :)
EDIT: The dog ate part of my text.

Resources