Sub-Turing Complete Class of computational models - theory

Many programming languages and systems are Turing-complete; they can simulate any Turing machine, and therefore can simulate any finite state machine as well.
Consider the following informal model:
Language A defines a finite set of NAND gates, their connections to each other, and which gates receive input and which are outputted.
In this model, any finite state machine can be built. The NANDs can form latches, registers, busses and control structures, and in the end any finite state machine, including full computers and other systems.
The model, however, does not have the ability to simulate an infinite tape, only a tape of finite size. It cannot simulate any Turing machine because it may not have the memory to do so.
Are Language A and all other systems that can simulate any finite state machine considered Turing complete? Is there a separate class for them, or is there an opportunity to define such?

As you have realized, there is a hierarchy - with potentially infinitely many levels - of classes of languages, including regular languages (recognizable by finite automata) and decidable (accepted by a Turing machine).
All real computers - including theoretical models which can be used to construct them, like yours involving NAND gates - are not Turing equivalent because they cannot in theory access an infinite tape. In practice, time, space and matter are insufficient in physical reality to allow for real Turing-equivalent computation. All physical computation can be carried out by a finite automata. There are regular languages, in practice, too complex to ever accept by constructing a real finite state machine or general-purpose computer.
Modeling languages as being of a type higher than regular is for convenience - it is a lie in the same way that modeling matter as continuous (e.g., when computing moment of inertia) is a lie. Matter is really made of discrete molecules, which in turn are comprised of smaller discrete particles.


Do software systems realy have more states than hardware ones?

In No Silver Bullet, Fred Brooks claims that "Software systems have orders of magnitudes more states than computers do", which makes them harder to design and test (and chips are already pretty hard to test!).
This is counter-intuitive to me: any running software system can be mapped to a computer in a certain state, and it seems like a computer could be in a state that doesn't represent a running software system. Thus, a computer should have many more potential states than a software system.
Does Brooks intend some particular meaning that I'm missing? Or does a computer really have fewer potential states than the software systems it can run?
Well, let's first think about Turing machines.
A Turing machine consists of an unbounded tape which contains symbols, a head and a small control unit which is a finite state automata that controls how the machine reads, moves and modifies the symbols on the tape.
Fact: there exist universal Turing machines, i.e. machines that read from the tape the description of an other Turing machine and execute it on some given input. In other words: even with just a finite number of states in the control unit such machines can simulate every possible other Turing machine.
Reading the description of a Turing machine is the same as reading a software program stored in memory.
In this sense if you count as the number of states of the hardware the number of states in the control unit, and if software is the description of a Turing machine written on the tape, then yes a finite hardware can simulate infinite softwares, yet the softwares surely contains Turing machines with more states than the one simulating it.
If you however consider as state the whole state of the computation, i.e. including the state of the tape, then you are right: every simulation corresponds to specific possible states in this sense and there are many states that are not valid, or are unreachable.
In the same way modern computers consists of a set of hardware that implements this control unit, and then memory which is our tape. If you do not consider the state of the memory as part of the state of the hardware, the same applies: a finite computer, given enough memory, could execute every possible program on every possible input, yet its controlling parts are only finite.
This said I wouldn't take such assertions too literally or too seriously...
The point is simply: software systems's number of states grows extremely rapidly.

Is a Turing machine a real device or an imaginary concept?

When I am studying about Turing machines and PDAs, I was thinking that the first computing device was the Turing machine.
Hence, I thought that there existed a practical machine called the Turing machine and its states could be represented by some special devices (say like flip-flops) and it could accept magnetic tapes as inputs.
Hence I asked how input strings are represented in magnetic tapes, but by the answer and by the details given in my book, I came to know that a Turing machine is somewhat hypothetical.
My question is, how would a Turing machine be implemented practically? For example, how it is used to check spelling errors in our current processors.
Are Turing machines outdated? Or are they still being used?
When Turing first devised what are now called Turing machines, he was doing so for purely theoretical reasons (they were used to prove the existence of undecidable problems) and without having actually constructed one in the real world. Fast forward to the present, and with the exception of hobbyists building Turing machines for the fun of doing so, TMs are essentially confined to Theoryland.
Turing machines aren't used in practice for several reasons. For starters, it's impossible to build a true TM, since you'd need infinite resources to construct the infinite tape. (You could imagine building TMs with a limited amount of tape, then adding more tape in as necessary, though.) Moreover, Turing machines are inherently slower than other models of computation because of the sequential nature of their data access. Turing machines cannot, for example, jump into the middle of an array without first walking across all the elements of the array that it wants to skip. On top of that, Turing machines are extremely difficult to design. Try writing a Turing machine to sort a list of 32-bit integers, for example. (Actually, please don't. It's really hard!)
This then begs the question... why study Turing machines at all? Fortunately, there are a huge number of reasons to do this:
To reason about the limits of what could possibly be computed. Because Turing machines are capable of simulating any computer on planet earth (or, according to the Church-Turing thesis, any physically realizable computing device), if we can show the limits of what Turing machines can compute, we can demonstrate the limits of what could ever hope to be accomplished on an actual computer.
To formalize the definition of an algorithm. Why is binary search an algorithm while the statement "guess the answer" is not? In order to answer this question, we have to have a formal model of what a computer is and what an algorithm means. Having the Turing machine as a model of computation allows us to rigorously define what an algorithm is. No one actually ever wants to translate algorithms into the format, but the ability to do so gives the field of algorithms and computability theory a firm mathematical grounding.
To formalize definitions of deterministic and nondeterministic algorithms. Probably the biggest open question in computer science right now is whether P = NP. This question only makes sense if you have a formal definition for P and NP, and these in turn require definitions if deterministic and nndeterministic computation (though technically they could be defined using second-order logic). Having the Turing machine then allows us to talk about important problems in NP, along with giving us a way to find NP-complete problems. For example, the proof that SAT is NP-complete uses the fact that SAT can be used to encode a Turing machine and it's execution on an input.
Hope this helps!
It is a conceptual device that is not realizable (due to the requirement of infinite tape). Some people have built physical realizations of a Turing machine, but it is not a true Turing machine due to physical limitations.
Here's a video of one:
Turing Machine are not exactly physical machines, instead they are basically conceptual machine. Turing concept is hypothesis and this is very difficult to implement in real world since we require infinite tapes for small and easy solution too.
It's a theoretical machine, the following paragraph from Wikipedia
A Turing machine is a theoretical device that manipulates symbols on a strip of tape according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm, and is particularly useful in explaining the functions of a CPU inside a computer.
This machine along with other machines like non-deterministic machine (doesn't exist in real) are very useful in calculating complexity and prove that one algorithm is harder than another or one algorithm is not solvable...etc
Turing machine (TM) is a mathematical model for computing devices. It is the smallest model that can really compute. In fact, the computer that you are using is a very big TM. TM is not outdated. We have other models for computation but this one was used to build the current computers and because of that, we owe a lot to Alan Turing who proposed this model in 1936.

What logic gates are required for Turing completeness?

My son has been playing Little Big Planet 2 lately, and I noticed that the game editor allows AND gates, OR gates and NOT gates... Is it Turing complete? If so, can anyone recommend a source for learning to turn those primitives into something like a higher level conditional if?
You need NOT and one of AND or OR to be able to do all binary logic.
This is DeMorgan's Law, basically.
However, this is not sufficient for Turing completeness.
For that you also need random (or reducably equivalent) access
(theoretically) infinite memory.
Odds are, you'll be able to build a flip flop (a D flip flop
is built using NANDs, so it's straightforward) using
the available logic gates. From those, you can build a
register, and with enough of those you'll be equipped
to build some simple programs.
A NAND gate is all that is required, everything can be built from that, so the three you have are plenty. Here's a course that takes you from logic gates, up through building a computer, all the way to writing an operating system: The Elements of Computing Systems:
Building a Modern Computer from First Principles
An idea: you should be able to construct a NAND gate, so you can then build a XOR gate. With XOR and AND you can build a half-adder. Combine half-adders to build a full-adder. That would be a start at least.
NAND and NOR are basic building blocks for other gates so chances are Turing completeness is just around the corner.
AND, OR and NOT is functionally complete, that is, all possible truth tables can be expressed. Which I believe also makes it turing complete, since you can construct a general purpose processor with any functionally complete set of gates
I'm know I'm late to the game here but yes. I play LBP2, and it has an AND, OR, NOT, XOR, NAND, NOR. You can also add and subtract signals, there is also ways to do binary in the game.
The only gates you need are NOT and OR. With those two you can build all other logic gates. For example, NOT(OR(NOT|NOT)) is an AND gate, OR(NOT|NOT) is NAND, NOT(OR()) is NOR, etc. The difficult one to make (and also most functionally useful) is XOR, which can be made with a tree of NAND gates, which in turn can be made with NOT and OR as shown above.
You can build logic circuit of any complexity with either NAND or NOR gates.
NAND is an AND with a NOT on the output pin.
NOR is an OR with a NOT on the output pin.
Any NAND-based circuit can be rebuilt using NOR's exclusively and vice versa.
So, you can build any logic circuit given only NAND gates. Or you can use just NOR gates. Or you can use NOT and AND gates. Or you can use NOT and OR gates. Or, you can also use AND, NOT and OR gates: you can certainly reduce the number of transistors by creating an optimal combination by using all three types of gates.
All this can be proven by boolean algebra using truth tables: any combination of truth tables can be built from a combination of above mentioned gates. When there are two inputs, there are 4 possible combinations of inputs, giving 16 possible truth tables. By using combinations of above mentioned gates you can create all of these 16 truth tables, and so, you don't need 16 different gates. This holds when you add more inputs and outputs, and even when you create registers and latches to create memory bits, CPU registers and/or any sequential logic circuits.
Theoretically speaking, an infinite number of NAND (inverted AND) logic gates can be used to build a Turing machine. This is because NAND and NOR are the universal logic gates.
In the real world, one can never build a Turing complete machine because infinite memory does not exist.
That's why all computers today are deterministic Finite State Machines.
Modern computers can be considered approximations of Turing machines to aid in program analysis.

What are the useful limits of Linear Bounded Automata compared to Turing Machines?

There are languages that a Turing machine can handle that an LBA can't, but are there any useful, practical problems that LBAs can't solve but TMs can?
An LBA is just a Turing machine with a finite tape, and actual computers have finite storage, so it would seem to me that there's nothing of practical importance that an LBA can't do. Except for the fact that a Linear Bounded Automaton has not just a finite tape, but a tape with a size that's a linear function of the size of the input. Does the linearity of the finiteness restrict the LBA in some way?
Are there problems that a LBA can't cope with, but an Exponentially Bounded Automaton could (if such things exist)?
I'm going to go out on a limb and say "no". Pretty much every programming language that we use today is context sensitive. (Actually not even context sensitive, only slightly stronger than context free, IIRC). And obviously, if we can't program it, we don't really care about it...
OTOH, this all depends on your definition of "interesting"... Ackerman's function clearly fits into this category.... is that interesting?
The Wikipedia article for context-sensitive languages states that any recursive
language (that is, recognizable by a Turing machine) whose decision is EXPSPACE-hard
is not context-sensitive, and therefore cannot be recognized by a LBA. They
give an example of such a language: the set of pairs of equivalent regular expressions including exponentiation.

Finding prime factors to large numbers using specially-crafted CPUs

My understanding is that many public key cryptographic algorithms these days depend on large prime numbers to make up the keys, and it is the difficulty in factoring the product of two primes that makes the encryption hard to break. It is also my understanding that one of the reasons that factoring such large numbers is so difficult, is that the sheer size of the numbers used means that no CPU can efficiently operate on the numbers, since our minuscule 32 and 64 bit CPUs are no match for 1024, 2048 or even 4096 bit numbers. Specialized Big Integer math libraries must be used in order to process those numbers, and those libraries are inherently slow since a CPU can only hold (and process) small chunks (like 32 or 64 bits) at one time.
Why can't you build a highly specialized custom chip with 2048 bit registers, and giant arithmetic circuits, much in the same way that we scaled from 8 to 16 to 32 to 64-bit CPUs, just build one a LOT larger? This chip wouldn't need most of the circuitry on conventional CPUs, after all it wouldn't need to handle things like virtual memory, multithreading or I/O. It wouldn't even need to be a general-purpose processor supporting stored instructions. Just the bare minimum to perform the necessary arithmetical calculations on ginormous numbers.
I don't know a whole lot about IC design, but I do remember learning about how logic gates work, how to build a half adder, full adder, then link together a bunch of adders to do multi-bit arithmetic. Just scale up. A lot.
Now, I'm fairly certain that there is a very good reason (or 17) that the above won't work (since otherwise one of the many people smarter than I am would have already done it) but I am interested in knowing why it won't work.
(Note: This question may need some re-working, as I'm not even sure yet if the question makes sense)
What #cube said, and the fact that a giant arithmetic logic unit would take more time for the logic signals to stabilize, and include other complications in digital design. Digital logic design includes something that you take for granted in software, namely that signals through combinational logic take a small but nonzero time to propagate and settle. A 32x32 multiplier needs to be designed carefully. A 1024x1024 multiplier would not only take a huge amount of physical resources in a chip, but it also would be slower than a 32x32 multiplier (though perhaps faster than a 32x32 multiplier computing all the partial products needed to perform a 1024x1024 multiply). Plus it's not only the multiplier that's the bottleneck: you've got memory pathways. You'd have to spend a bunch of time gathering the 1024 bits from a memory circuit that's only 32 bits wide, and storing the resulting 2048 bits back into the memory circuit.
Almost certainly it's better to get a bunch of "conventional" 32-bit or 64-bit systems working in parallel: you get the speedup w/o the hardware design complexity.
edit: if anyone has ACM access (I don't), perhaps take a look at this paper to see what it says.
Its because this speedup would be only in O(n), but the complexity of factoring the number is something like O(2^n) (with respect to the number of bits). So if you made this ├╝berprocessor and factorized the numbers 1000 times faster, I would only have to make the numbers 10 bits larger and we would be back on the start again.
As indicated above, the primary problem is simply how many possibilities you have to go through to factor a number. That being said, specialized computers do exist to do this sort of thing.
The real progress for this sort of cryptography is improvements in number factoring algorithms. Currently, the fastest known general algorithm is the general number field sieve.
Historically, we seem to be able to factor numbers twice as large each decade. Part of that is faster hardware, and part of it is simply a better understanding of mathematics and how to perform factoring.
I can't comment on the feasibility of an approach exactly like the one you described, but people do similar things very frequently using FPGAs:
Crack DES keys
Crack GSM conversations
Open source graphics card
Shamir & Tromer suggest a similar approach, using a kind of grid computing:
This article discusses a new design for a custom hardware
implementation of the sieving step, which
reduces [the cost of sieving, relative to TWINKLE,] to about $10M. The new device,
called TWIRL, can be seen as an extension of the
TWINKLE device. However, unlike TWINKLE it
does not have optoelectronic components, and can
thus be manufactured using standard VLSI technology
on silicon wafers. The underlying idea is to use
a single copy of the input to solve many subproblems
in parallel. Since input storage dominates cost, if the
parallelization overhead is kept low then the resulting
speedup is obtained essentially for free. Indeed, the
main challenge lies in achieving this parallelism efficiently while allowing compact storage of the input.
Addressing this involves myriad considerations, ranging
from number theory to VLSI technology.
Why don't you try building an uber-quantum computer and run Shor's algorithm on it?
"... If a quantum computer with a sufficient number of qubits were to be constructed, Shor's algorithm could be used to break public-key cryptography schemes such as the widely used RSA scheme. RSA is based on the assumption that factoring large numbers is computationally infeasible. So far as is known, this assumption is valid for classical (non-quantum) computers; no classical algorithm is known that can factor in polynomial time. However, Shor's algorithm shows that factoring is efficient on a quantum computer, so a sufficiently large quantum computer can break RSA. ..." -Wikipedia