What logic gates are required for Turing completeness? - turing-complete

My son has been playing Little Big Planet 2 lately, and I noticed that the game editor allows AND gates, OR gates and NOT gates... Is it Turing complete? If so, can anyone recommend a source for learning to turn those primitives into something like a higher level conditional if?

You need NOT and one of AND or OR to be able to do all binary logic; this follows from De Morgan's laws.
However, that alone is not sufficient for Turing completeness.
For that you also need random (or reducibly equivalent) access to (theoretically) infinite memory.
Odds are, you'll be able to build a flip-flop using the available logic gates (a D flip-flop is built from NANDs, so it's straightforward). From those you can build a register, and with enough registers you'll be equipped to build some simple programs.
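If it helps to see the feedback idea in code rather than in the game, here is a minimal C sketch (purely illustrative, not how the game works) of a cross-coupled NAND SR latch: two NAND gates feeding each other's inputs store one bit, which is the seed of the registers mentioned above.

#include <stdio.h>

/* A single NAND gate over 1-bit values. */
static int nand(int a, int b) { return !(a && b); }

/* One settling pass of a cross-coupled NAND SR latch.
 * s_n and r_n are active-low Set/Reset; q and qbar are the stored state. */
static void sr_latch_step(int s_n, int r_n, int *q, int *qbar) {
    for (int i = 0; i < 4; i++) {      /* iterate so the feedback loop settles */
        int new_q    = nand(s_n, *qbar);
        int new_qbar = nand(r_n, new_q);
        *q = new_q;
        *qbar = new_qbar;
    }
}

int main(void) {
    int q = 0, qbar = 1;
    sr_latch_step(0, 1, &q, &qbar);  /* pulse Set: Q becomes 1      */
    printf("after set:   Q=%d\n", q);
    sr_latch_step(1, 1, &q, &qbar);  /* both inactive: Q holds 1    */
    printf("holding:     Q=%d\n", q);
    sr_latch_step(1, 0, &q, &qbar);  /* pulse Reset: Q becomes 0    */
    printf("after reset: Q=%d\n", q);
    return 0;
}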

A NAND gate is all that is required; everything can be built from that, so the three you have are plenty. Here's a course that takes you from logic gates, up through building a computer, all the way to writing an operating system: The Elements of Computing Systems: Building a Modern Computer from First Principles.
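If you want to convince yourself of the NAND-only claim before diving into the book, here is a small C sketch that derives NOT, AND and OR purely from a nand() helper and checks them against C's own operators (the function names are my own, just for illustration):

#include <assert.h>
#include <stdio.h>

static int nand(int a, int b) { return !(a && b); }

/* Every other basic gate expressed only in terms of NAND. */
static int not_(int a)        { return nand(a, a); }
static int and_(int a, int b) { return nand(nand(a, b), nand(a, b)); }
static int or_(int a, int b)  { return nand(nand(a, a), nand(b, b)); }

int main(void) {
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++) {
            assert(not_(a)    == !a);
            assert(and_(a, b) == (a && b));
            assert(or_(a, b)  == (a || b));
        }
    printf("NAND-only NOT/AND/OR match the built-in operators\n");
    return 0;
}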

An idea: you should be able to construct a NAND gate, so you can then build an XOR gate. With XOR and AND you can build a half-adder. Combine half-adders to build a full-adder. That would be a start at least.
NAND and NOR are basic building blocks for other gates so chances are Turing completeness is just around the corner.
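To sketch that out in C (illustrative only, with the gates written as bitwise operators on 0/1 values): a half-adder is an XOR plus an AND, and a full-adder is two half-adders plus an OR for the carries.

#include <stdio.h>

/* Half-adder: adds two bits, producing a sum bit and a carry bit. */
static void half_adder(int a, int b, int *sum, int *carry) {
    *sum   = a ^ b;   /* XOR gives the sum       */
    *carry = a & b;   /* AND gives the carry-out */
}

/* Full-adder: two half-adders plus an OR to combine the carries. */
static void full_adder(int a, int b, int cin, int *sum, int *cout) {
    int s1, c1, c2;
    half_adder(a, b, &s1, &c1);
    half_adder(s1, cin, sum, &c2);
    *cout = c1 | c2;
}

int main(void) {
    int sum, cout;
    full_adder(1, 1, 1, &sum, &cout);       /* 1 + 1 + 1 = 3 = binary 11 */
    printf("sum=%d carry=%d\n", sum, cout); /* prints sum=1 carry=1      */
    return 0;
}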

AND, OR and NOT together are functionally complete, that is, all possible truth tables can be expressed. Which I believe also makes the set Turing complete, since you can construct a general-purpose processor with any functionally complete set of gates.

I know I'm late to the game here, but yes. I play LBP2, and it has AND, OR, NOT, XOR, NAND and NOR gates. You can also add and subtract signals, and there are ways to do binary arithmetic in the game.

The only gates you need are NOT and OR. With those two you can build all other logic gates. For example, NOT(OR(NOT a, NOT b)) is an AND gate, OR(NOT a, NOT b) is NAND, and NOT(OR(a, b)) is NOR. The difficult one to make (and also the most functionally useful) is XOR, which can be made with a tree of NAND gates, which in turn can be made from NOT and OR as shown above.

You can build a logic circuit of any complexity with either NAND or NOR gates alone.
NAND is an AND with a NOT on the output pin.
NOR is an OR with a NOT on the output pin.
Any NAND-based circuit can be rebuilt using NOR's exclusively and vice versa.
So, you can build any logic circuit given only NAND gates. Or you can use just NOR gates. Or NOT and AND gates. Or NOT and OR gates. Or you can use AND, NOT and OR together: by combining all three types of gates you can often reduce the number of transistors needed.
All of this can be proven with Boolean algebra using truth tables: any truth table can be realized by a combination of the gates mentioned above. With two inputs there are 4 possible input combinations, giving 16 possible truth tables, and all 16 can be built from combinations of these gates, so you don't need 16 different gate types. The same holds as you add more inputs and outputs, and even when you create registers and latches to build memory bits, CPU registers, or any other sequential logic circuit.
https://en.wikipedia.org/wiki/NAND_logic
https://en.wikipedia.org/wiki/NOR_logic
https://en.wikipedia.org/wiki/Truth_table

Theoretically speaking, an unbounded supply of NAND (inverted AND) logic gates can be used to build a Turing machine, because NAND and NOR are universal logic gates.
In the real world, one can never build a truly Turing-complete machine, because infinite memory does not exist.
That's why all computers today are, strictly speaking, deterministic finite state machines.
Modern computers are best treated as approximations of Turing machines for the purposes of program analysis.

Related

Does modeling digital circuits in C have any practical benefits as opposed to using the language's standard operations?

So I've started looking into digital circuit design, and I was enlightened to find that almost every operation (that I'm aware of) derives from 3 logical operations: AND, OR, and NOT. As an analogy, these are sort of like the subatomic particles to the atoms that make up everything else: subatomic particles are to logic gates as atoms are to processor instructions. If programming in assembly is like putting atoms together, then programming in C is like putting molecules (and atoms) together. Someone PLEASE tell me if I'm off base here.
With that said, I'm aware that GCC and most other compilers do a pretty good job optimizing from C to machine code. Let's assume we are looking at the x86 instruction set. If I built a 32-bit full adder using only &&, ||, and ~, is the compiler smart enough to use the existing instructions provided by the processor, or will my full adder end up being a more bloated, less efficient version of what's already on the processor?
Disclaimer:
I started looking into digital circuits in an attempt to start learning assembly, and I'm fair in C. I want to model some of these circuits in C to further my understanding of digital circuits, because those are terms I understand. But I don't want to lure myself into the illusion that it will also be efficient code (or have any practical benefit other than learning) when a simple + will do. And yes, I'm aware that the horrible maintainability of the code would far outweigh any benefits this coding "style" may provide.
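For reference, the kind of gate-by-gate adder the question describes might look like the sketch below (using the bitwise &, | and ~ forms of the operators, with shifts only to pick bits out of the words; the names are mine, purely illustrative). Whether a given compiler recognizes it and folds it back into a single add instruction is exactly the open question; in practice, as the answers below note, you generally should not count on it.

#include <stdint.h>
#include <stdio.h>

/* XOR built only from AND, OR and NOT. */
static uint32_t xor_bit(uint32_t a, uint32_t b) {
    return (a & ~b) | (~a & b);
}

/* 32-bit ripple-carry adder modelled gate by gate. */
static uint32_t adder32(uint32_t a, uint32_t b) {
    uint32_t result = 0, carry = 0;
    for (int i = 0; i < 32; i++) {
        uint32_t ai = (a >> i) & 1u;
        uint32_t bi = (b >> i) & 1u;
        uint32_t sum = xor_bit(xor_bit(ai, bi), carry);
        carry = (ai & bi) | (carry & (ai | bi));
        result |= sum << i;
    }
    return result;
}

int main(void) {
    uint32_t x = 123456789u, y = 987654321u;
    printf("%u (gate model) vs %u (native +)\n", adder32(x, y), x + y);
    return 0;
}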
A software simulation of a full adder will never be anywhere near as efficient as a full adder built out of logic gates in the same technology as the silicon which runs the software simulation. The number of gates involved in running the simulation will far outstrip the gate count in the hardware adder, and the propagation delays will be significantly longer, especially if you include the I/O processing needed to make the simulation function as a real adder which accepts and produces electronic signals, which means that the code has to read and write actual CPU or peripheral I/O pins.
The above is true even if a crack team of the world's best x86 assembly language coders get together in a conference to design the ideal piece of code to implement a full adder in machine language; in other words, it doesn't reflect the inability of compilers to optimize C sufficiently well.
A software simulation of a logic circuit on some given computer, however, can be more efficient than a circuit built with a different technology from that computer: specifically, older technology. For instance, a program running on a modern, fast microcontroller chip with integrated I/O can likely express a faster adder on its GPIO pins than something cobbled together out of discrete, through-hole transistors, and it will almost certainly take up less space, and possibly require less current.
I've always been of the mind that building digital circuits is no different than programming in assembly. I've said before that electronics are like physical opcodes.
To your question of C smart-compiling a logic adder to an ADD instruction: no, it will not. There are a few exceptions in embedded development, such as AVR, where avr-gcc with the right optimization flags will turn REGISTER |= 1<<bit or REGISTER &= ~(1<<bit) into bit set and clear instructions instead of doing a verbatim load, logic, store.
Beyond assembly/opcodes I can't really think of an analogy for higher-level languages to electronics.
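For concreteness, the AVR pattern being described looks like the sketch below (hedged: whether you actually get single-instruction SBI/CBI output depends on the device, the register's I/O address, and the optimization flags):

#include <avr/io.h>

/* With optimization on, avr-gcc can compile these read-modify-write
   expressions on low I/O registers into single bit set / bit clear
   instructions (SBI/CBI) rather than a load/logic/store sequence. */
void set_and_clear(void)
{
    PORTB |=  (1 << PB0);   /* set bit 0 of PORTB   */
    PORTB &= ~(1 << PB1);   /* clear bit 1 of PORTB */
}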
Although you may use C to describe a digital circuit (in fact, the Verilog HDL bears some resemblance to C), there's one fundamental thing that you cannot model in plain C: parallelism.
If there's anything that characterizes a digital circuit description, it is its inherent parallelism: functions (modules, actually) and even code blocks "execute" in parallel.
Any ordinary C program describes a sequence of operations over time, and therefore any attempt to describe a digital operation, unless it is a very trivial one, will be compiled as a sequence of steps, not exploiting the parallelism that digital circuits have by nature.
That said, there do exist HDLs (hardware description languages) that are fairly close to C. One of them is Handel-C. Handel-C uses syntax borrowed from C, plus some additions to better handle the inherent parallelism present in digital design.
For example: imagine you have to exchange the values of two variables. The classical solution (besides solutions based on bitwise operations and the like) is:
temp = a;
a = b;
b = temp;
However, when someone is learning computer programming, it's a common mistake to code the above sequence as this:
a = b;
b = a;
Because we think in a variable interchange as a parallel operation: "the value of b is copied to a, meanwhile the value of a is copied to b".
The funny thing about this approach is that it actually works... if we manage to execute these two assignments in parallel. Something that is not possible in plain C, but it is in Handel-C:
par
{
    a = b;
    b = a;
}
The par statement indicates that each code line is to be "executed" in parallel with respect to the others.
In Verilog, the same interchange would be written as this:
a <= b;
b <= a;
<= is the nonblocking assignment in Verilog: the second line is not "executed" after the first one finishes; both start at the same time. This sequence is normally found inside a clocked always block (sort of a loop that is "executed" every time the clock signal in the sensitivity list changes from 0 to 1 (posedge) or from 1 to 0 (negedge)).
always @(posedge clk) begin
    a <= b;
    b <= a;
end
This means: every time the clock goes from 0 to 1, interchange the values between a and b.
Note that I always quote "executed" when I speak about languages for digital design. The code doesn't actually translate into a sequence of operations to be executed by a processor, but the code IS a circuit. Think of it as a 1D rendering of a 2D schematic, with sentences and operators instead of electronic symbols, and assignments, arguments and "function calls" instead of wires.
If you are familiar with digital circuits, you will realize that the loop-lookalike "always" block above actually translates into two flip-flops whose outputs are cross-wired to each other's inputs.
Which is something you couldn't get by just translating the same high-level description into assembly (unless the ISA of the target processor has some sort of XCHG instruction, which is actually not uncommon, and the code keeps the two variables to be interchanged in CPU registers).
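As an aside, you can mimic the nonblocking two-phase update in plain C by computing every next value from the current state and only then committing, which is a reasonable mental model (a sketch, not how any particular simulator works):

#include <stdio.h>

/* State of our tiny "circuit": two registers. */
struct state { int a, b; };

/* One clock tick: compute every next value from the *current* state,
 * then commit them all at once, like Verilog's nonblocking assignments. */
static struct state tick(struct state cur) {
    struct state next;
    next.a = cur.b;   /* a <= b */
    next.b = cur.a;   /* b <= a */
    return next;      /* the commit happens when the caller overwrites cur */
}

int main(void) {
    struct state s = { 1, 2 };
    s = tick(s);
    printf("a=%d b=%d\n", s.a, s.b);  /* prints a=2 b=1: values swapped */
    return 0;
}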

Is a Turing machine a real device or an imaginary concept?

When I was studying Turing machines and PDAs, I thought that the first computing device was the Turing machine.
Hence, I assumed there existed a practical machine called the Turing machine, whose states could be represented by special devices (say, flip-flops) and which could accept magnetic tapes as input.
So I asked how input strings are represented on magnetic tape, but from the answer and the details given in my book, I came to understand that a Turing machine is somewhat hypothetical.
My question is: how would a Turing machine be implemented practically? For example, how would it be used to check spelling errors on our current processors?
Are Turing machines outdated, or are they still being used?
When Turing first devised what are now called Turing machines, he was doing so for purely theoretical reasons (they were used to prove the existence of undecidable problems) and without having actually constructed one in the real world. Fast forward to the present, and with the exception of hobbyists building Turing machines for the fun of doing so, TMs are essentially confined to Theoryland.
Turing machines aren't used in practice for several reasons. For starters, it's impossible to build a true TM, since you'd need infinite resources to construct the infinite tape. (You could imagine building a TM with a limited amount of tape and adding more tape as necessary, though.) Moreover, Turing machines are inherently slower than other models of computation because of the sequential nature of their data access: a Turing machine cannot, for example, jump into the middle of an array without first walking across all the elements it wants to skip. On top of that, Turing machines are extremely difficult to design. Try writing a Turing machine to sort a list of 32-bit integers, for example. (Actually, please don't. It's really hard!)
This then begs the question... why study Turing machines at all? Fortunately, there are a huge number of reasons to do this:
To reason about the limits of what could possibly be computed. Because Turing machines are capable of simulating any computer on planet earth (or, according to the Church-Turing thesis, any physically realizable computing device), if we can show the limits of what Turing machines can compute, we can demonstrate the limits of what could ever hope to be accomplished on an actual computer.
To formalize the definition of an algorithm. Why is binary search an algorithm while the statement "guess the answer" is not? In order to answer this question, we have to have a formal model of what a computer is and what an algorithm means. Having the Turing machine as a model of computation allows us to rigorously define what an algorithm is. No one actually ever wants to translate algorithms into that format, but the ability to do so gives the field of algorithms and computability theory a firm mathematical grounding.
To formalize definitions of deterministic and nondeterministic algorithms. Probably the biggest open question in computer science right now is whether P = NP. This question only makes sense if you have formal definitions of P and NP, and these in turn require definitions of deterministic and nondeterministic computation (though technically they could be defined using second-order logic). Having the Turing machine then allows us to talk about important problems in NP, along with giving us a way to find NP-complete problems. For example, the proof that SAT is NP-complete uses the fact that SAT can be used to encode a Turing machine and its execution on an input.
Hope this helps!
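If it helps to see how small the model is, here is a toy Turing machine simulator in C: a single working state that walks right and inverts bits until it hits a blank. The tape here is finite, which is exactly the compromise every physical "Turing machine" has to make.

#include <stdio.h>

/* A deliberately tiny Turing machine: one working state plus HALT.
 * In state RUN it inverts the symbol under the head, moves right,
 * and halts when it reaches a blank ('_'). */
enum { RUN, HALT };

int main(void) {
    char tape[] = "1011_______";  /* finite stand-in for the infinite tape */
    int head = 0;
    int state = RUN;

    while (state != HALT) {
        char sym = tape[head];
        if (sym == '0')      { tape[head] = '1'; head++; }  /* delta(RUN, 0) */
        else if (sym == '1') { tape[head] = '0'; head++; }  /* delta(RUN, 1) */
        else                 { state = HALT; }              /* blank: stop   */
    }
    printf("final tape: %s\n", tape);  /* prints 0100_______ */
    return 0;
}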
It is a conceptual device that is not realizable (due to the requirement of infinite tape). Some people have built physical realizations of a Turing machine, but they are not true Turing machines due to physical limitations.
Here's a video of one: http://www.youtube.com/watch?v=E3keLeMwfHY
Turing machines are not exactly physical machines; they are conceptual machines. The Turing machine is a hypothetical construct, and it is very difficult to implement in the real world, since even small, simple solutions would require an infinite tape.
It's a theoretical machine; the following paragraph is from Wikipedia:
A Turing machine is a theoretical device that manipulates symbols on a strip of tape according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm, and is particularly useful in explaining the functions of a CPU inside a computer.
This machine, along with other machines like the non-deterministic Turing machine (which doesn't exist in the real world), is very useful for analyzing complexity and for proving that one algorithm is harder than another, that a problem is not solvable, and so on.
The Turing machine (TM) is a mathematical model for computing devices. It is the smallest model that can really compute; in fact, the computer that you are using is, in essence, a very big TM. The TM is not outdated. We have other models of computation, but this one was used to build the current computers, and because of that we owe a lot to Alan Turing, who proposed this model in 1936.

Hardware/Software Implementation

What does it mean to say that a function (e.g. modular multiplication, sine) is implemented in hardware as opposed to software?
Implemented in hardware means the electrical circuit (built from logic gates and so on) can perform the operation directly.
For example, in the ALU the processor is physically able to add one byte to another.
Operations implemented in software are usually very complex combinations of basic functions that are implemented in hardware.
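A classic example of that layering is multiplication done in software out of the add and shift operations the hardware provides directly (a sketch; real libraries and compilers are far more sophisticated):

#include <stdint.h>
#include <stdio.h>

/* Shift-and-add multiplication: a "software" operation composed
 * entirely of additions and shifts that the hardware implements directly. */
static uint32_t soft_mul(uint32_t a, uint32_t b) {
    uint32_t product = 0;
    while (b != 0) {
        if (b & 1u)
            product += a;   /* hardware add   */
        a <<= 1;            /* hardware shift */
        b >>= 1;
    }
    return product;
}

int main(void) {
    printf("%u vs %u\n", soft_mul(1234, 5678), 1234u * 5678u);
    return 0;
}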
Typically, "software" is a list of instructions from a small set of precise formal instructions supported by the hardware in question. The hardware (the cpu) runs in an infinite loop executing your instruction stream stored in "memory".
When we talk about a software implementation of an algorithm, we mean that we achieve the final answer by having the CPU carry out some set of these instructions in the order put together by an outside programmer.
When we talk about a hardware implementation, we mean that the final answer is produced by intermediate steps that don't come from a formal (and comparatively inefficient) stream of software instructions put together by a programmer, but are instead carried out internally, without being exposed to the outside world. Hardware implementations are thus likely to be faster because (a) they can be tailored to the particular algorithm being implemented, with no need to reach well-defined states that the outside world will see, and (b) they don't have to sync up with the outside world.
Note, that I am calling things like sine(x), "algorithms".
To be more specific about efficiency: the software instructions, being part of a formal interface, have predefined start/stop points as they await the next clock cycle. These sync points are needed, to some extent, to allow other software instructions and other hardware to cleanly and unambiguously access these well-defined calculations. In contrast, a hardware implementation is more likely to have a larger share of its internals be asynchronous, meaning they run to completion instead of stopping at many intermediate points to await a clock tick.
For example, most processors have an instruction that carries out an integer addition. The entire process of calculating the final bit positions is likely done asynchronously. The stop/sync point occurs only after the added result is achieved. In turn, a more complex algorithm than "add", and which is done in software that contains many such additions, necessarily is partly carried out asynchronously (eg, in between each addition) but with many sync points (after each addition, jump, test, etc, result is known). If that more complex algorithm were done entirely in hardware, it's possible it would run to completion from beginning to end entirely independent of the timing clock. No outside program instructions would be consulted during the hardware calculation of that algorithm.
It means that the logic behind it is in the hardware (i.e. built from AND/OR/XOR gates, etc.) rather than being a software recreation of that hardware logic.
Hardware implementation typically means that a circuit was created to perform the operation in question. There is no need for a CPU or virtual calculations; you can literally see the algorithm being performed through the lines and architecture of the circuit itself.

Finding prime factors to large numbers using specially-crafted CPUs

My understanding is that many public key cryptographic algorithms these days depend on large prime numbers to make up the keys, and it is the difficulty of factoring the product of two primes that makes the encryption hard to break. It is also my understanding that one of the reasons factoring such large numbers is so difficult is that the sheer size of the numbers means no CPU can operate on them efficiently: our minuscule 32- and 64-bit CPUs are no match for 1024-, 2048- or even 4096-bit numbers. Specialized big-integer math libraries must be used to process those numbers, and those libraries are inherently slow since a CPU can only hold (and process) small chunks (like 32 or 64 bits) at a time.
So...
Why can't you build a highly specialized custom chip with 2048 bit registers, and giant arithmetic circuits, much in the same way that we scaled from 8 to 16 to 32 to 64-bit CPUs, just build one a LOT larger? This chip wouldn't need most of the circuitry on conventional CPUs, after all it wouldn't need to handle things like virtual memory, multithreading or I/O. It wouldn't even need to be a general-purpose processor supporting stored instructions. Just the bare minimum to perform the necessary arithmetical calculations on ginormous numbers.
I don't know a whole lot about IC design, but I do remember learning about how logic gates work, how to build a half adder, full adder, then link together a bunch of adders to do multi-bit arithmetic. Just scale up. A lot.
Now, I'm fairly certain that there is a very good reason (or 17) that the above won't work (since otherwise one of the many people smarter than I am would have already done it) but I am interested in knowing why it won't work.
(Note: This question may need some re-working, as I'm not even sure yet if the question makes sense)
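To make the "small chunks" point concrete, this is roughly how a big-integer library adds two numbers limb by limb on a 32-bit machine, which is the work a hypothetical 2048-bit ALU would do in a single step (an illustrative sketch, not any particular library's code):

#include <stdint.h>
#include <stdio.h>

/* Add two 128-bit numbers stored as four 32-bit limbs (least significant
 * limb first), propagating the carry limb by limb, the way a bignum
 * library must on a 32-bit ALU. */
#define LIMBS 4

static void big_add(const uint32_t a[LIMBS], const uint32_t b[LIMBS],
                    uint32_t out[LIMBS]) {
    uint64_t carry = 0;
    for (int i = 0; i < LIMBS; i++) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        out[i] = (uint32_t)t;      /* low 32 bits            */
        carry  = t >> 32;          /* carry to the next limb */
    }
}

int main(void) {
    uint32_t a[LIMBS] = { 0xFFFFFFFFu, 0xFFFFFFFFu, 0, 0 }; /* 2^64 - 1 */
    uint32_t b[LIMBS] = { 1, 0, 0, 0 };
    uint32_t r[LIMBS];
    big_add(a, b, r);
    printf("%08x %08x %08x %08x\n", r[3], r[2], r[1], r[0]); /* sum, most significant limb first */
    return 0;
}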
What #cube said, and the fact that a giant arithmetic logic unit would take more time for the logic signals to stabilize, and include other complications in digital design. Digital logic design includes something that you take for granted in software, namely that signals through combinational logic take a small but nonzero time to propagate and settle. A 32x32 multiplier needs to be designed carefully. A 1024x1024 multiplier would not only take a huge amount of physical resources in a chip, but it also would be slower than a 32x32 multiplier (though perhaps faster than a 32x32 multiplier computing all the partial products needed to perform a 1024x1024 multiply). Plus it's not only the multiplier that's the bottleneck: you've got memory pathways. You'd have to spend a bunch of time gathering the 1024 bits from a memory circuit that's only 32 bits wide, and storing the resulting 2048 bits back into the memory circuit.
Almost certainly it's better to get a bunch of "conventional" 32-bit or 64-bit systems working in parallel: you get the speedup w/o the hardware design complexity.
edit: if anyone has ACM access (I don't), perhaps take a look at this paper to see what it says.
It's because this speedup would only be O(n), but the complexity of factoring the number is something like O(2^n) with respect to the number of bits. So if you made this überprocessor and factored numbers 1000 times faster, I would only have to make the numbers about 10 bits larger (since 2^10 = 1024 ≈ 1000) and we would be back at the start again.
As indicated above, the primary problem is simply how many possibilities you have to go through to factor a number. That being said, specialized computers do exist to do this sort of thing.
The real progress for this sort of cryptography is improvements in number factoring algorithms. Currently, the fastest known general algorithm is the general number field sieve.
Historically, we seem to be able to factor numbers twice as large each decade. Part of that is faster hardware, and part of it is simply a better understanding of mathematics and how to perform factoring.
I can't comment on the feasibility of an approach exactly like the one you described, but people do similar things very frequently using FPGAs:
Crack DES keys
Crack GSM conversations
Open source graphics card
Shamir & Tromer suggest a similar approach, using a kind of grid computing:
This article discusses a new design for a custom hardware implementation of the sieving step, which reduces [the cost of sieving, relative to TWINKLE,] to about $10M. The new device, called TWIRL, can be seen as an extension of the TWINKLE device. However, unlike TWINKLE it does not have optoelectronic components, and can thus be manufactured using standard VLSI technology on silicon wafers. The underlying idea is to use a single copy of the input to solve many subproblems in parallel. Since input storage dominates cost, if the parallelization overhead is kept low then the resulting speedup is obtained essentially for free. Indeed, the main challenge lies in achieving this parallelism efficiently while allowing compact storage of the input. Addressing this involves myriad considerations, ranging from number theory to VLSI technology.
Why don't you try building an uber-quantum computer and run Shor's algorithm on it?
"... If a quantum computer with a sufficient number of qubits were to be constructed, Shor's algorithm could be used to break public-key cryptography schemes such as the widely used RSA scheme. RSA is based on the assumption that factoring large numbers is computationally infeasible. So far as is known, this assumption is valid for classical (non-quantum) computers; no classical algorithm is known that can factor in polynomial time. However, Shor's algorithm shows that factoring is efficient on a quantum computer, so a sufficiently large quantum computer can break RSA. ..." -Wikipedia

Why can Conway’s Game of Life be classified as a universal machine?

I was recently reading about artificial life and came across the statement, "Conway's Game of Life demonstrates enough complexity to be classified as a universal machine." I only had a rough understanding of what a universal machine is, and Wikipedia only brought me as close to understanding as Wikipedia ever does. I wonder if anyone could shed some light on this very sexy statement?
Conway's Game of Life seems, to me, to be a lovely distraction with some tremendous implications, but I can't make the leap between that and a calculator. Is that even the leap that I should be making?
Paul Rendell implemented a Turing machine in Life. Gliders represent signals, and interactions between them are gates and logic that together can create larger components which implement the Turing machine.
Basically, any automatic machinery that can implement AND, OR, and NOT can be combined together in complex enough ways to be Turing-complete. It's not a useful way to compute, but it meets the criteria.
You can build a Turing machine out of Conway's life - although it would be pretty horrendous.
The key is in gliders (and related patterns) - these move (slowly) along the playing field, so can represent streams of bits (the presence of a glider for a 1 and the absence for a 0). Other patterns can be built to take in two streams of gliders (at right angles) and emit another stream of bits corresponding to the AND/OR/etc of the original two streams.
EDIT: There's more on this on the LogiCell web site.
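If you've never implemented Life itself, here is a minimal C sketch of one generation of Conway's rules, the substrate all of these glider and gate constructions run on (boundary cells simply have fewer neighbours here):

#include <stdio.h>
#include <string.h>

#define W 10
#define H 10

/* Apply Conway's rules once: a live cell survives with 2 or 3 live
 * neighbours; a dead cell becomes live with exactly 3. */
static void step(int cur[H][W], int next[H][W]) {
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            int n = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++) {
                    if (dx == 0 && dy == 0) continue;
                    int yy = y + dy, xx = x + dx;
                    if (yy >= 0 && yy < H && xx >= 0 && xx < W)
                        n += cur[yy][xx];
                }
            next[y][x] = cur[y][x] ? (n == 2 || n == 3) : (n == 3);
        }
    }
}

int main(void) {
    int a[H][W] = {0}, b[H][W];
    /* A glider in the top-left corner. */
    a[0][1] = a[1][2] = a[2][0] = a[2][1] = a[2][2] = 1;
    for (int gen = 0; gen < 4; gen++) {   /* after 4 steps it has moved diagonally */
        step(a, b);
        memcpy(a, b, sizeof a);
    }
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) putchar(a[y][x] ? '#' : '.');
        putchar('\n');
    }
    return 0;
}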
Conway's "Life" can be taken even further: It's not only possible to build a Life pattern that implements a Universal Turing Machine, but also a Von Neumann "Universal Constructor:" http://conwaylife.com/wiki/Universal_constructor
Since a "Universal Constructor" can be programmed to construct any pattern of cells, including a copy of itself, Coway's "Life" is therefore capable of "self-replication," not just Universal Computation.
I highly recommend the book The Recursive Universe by Poundstone. Out of print, but you can probably find a copy, perhaps in a good library. It's almost all about the power of Conway's Life, and the things that can exist in a universe with that set of natural laws, including self-reproducing entities and IIRC, Darwinian evolution.
And Paul Chapman actually built a universal Turing machine with the Game of Life (http://www.igblan.free-online.co.uk/igblan/ca/) by building a "Universal Minsky Register Machine".
The pattern is constructed on a lattice of 30x30 squares. Lightweight Spaceships (LWSSs) are used to communicate between components, which have P60 logic (except for Registers - see below). A LWSS takes 60 generations to cross a lattice square. Every 60 generations, therefore, any inter-component LWSS (pulse) is in the same position relative to the square it's in, allowing for rotation.
