Is my Sudoku algorithm considered an "expert system"? - artificial-intelligence

I wrote a code which has all the rules of Sudoku written into it (one occurence of a digit per column, line, and square). The code takes an input (unfilled sudoku grid), and returns a solution by translating logical clauses into DIMACS format and using a SAT solver.
Given that the algorithm respects rules, takes in data, and uses that data to form conclusion based on implications (eg if there is a 1 in the first cell, there cannot be a 1 in the second cell), is this code considered an "expert system"? Thank you.

Whether a program is an expert system is subjective, but I'd say unless your program is encoding non-trivial knowledge acquired from a domain expert, it's not an expert system. If you can't teach another person to practically do what your program is doing, it's not an expert system.
By that definition, what you've done is probably not an expert system since it would be too time consuming for a person to use the same technique. I've written a sudoku solver using a production system (https://sourceforge.net/p/clipsrules/code/HEAD/tree/branches/63x/examples/sudoku/) that I would consider to be an expert system. The encoded knowledge was acquired from websites with advanced techniques for humans to use for solving sudoku puzzles. All of the encoded techniques can be practically used by humans for solving puzzles (although some of the more complex techniques push that boundary).
Although my sudoku solver can solve much more complicated puzzles than I could, calling it an expert system is not an indication of its sophistication. There are better approaches for solving extremely complex sudoku puzzles than emulating approaches humans might take.

In the 80's, I had written a clone of the Emycin expert system engine. One important characteristic was the ability for the user to ask WHY the expert system got some conclusion. The system could reply (in an almost natural language) that it applied such and such rules to get to the conclusion.
With this kind of system, the knowledge is modeled and implemented (by a cognitician engineer) as an explicit set of rules. These rules are objects known by the engine. The engine can trigger the rules (forward or backward or maybe using metarules...) and can log the triggered rules and thus explain its conclusions.
(this is my sense for expert systems).

Related

How do algorithms differ from design patterns?

I am new to C programming; coming from an OOP PHP background.
I find C to be (no wonder) a much more difficult language. I had particularly lots of problems figuring out a couple of things on arrays at first: like there is no native associative array.
Now, this part I guess I'm figuring out little by little, but now I have a question regarding a conversation I had just yesterday with a C developer. She was explaining the binary search algorithm to me because I asked her whether there were libraries to do array related stuff in C or not because it seemed like a smarter solution than always re-inventing the wheel.
I would really love to learn more about algorithms in C, in particular what differences are there between algorithms and the design patterns I'm used to using in PHP?
Taking things in order: the extent of C's support for anything like an associative array would be qsort to sort an array of structures based on a key, and bsearch to find one based on a key. There are, of course, quite a few alternatives -- various other libraries have hash tables, balanced trees, etc. Exactly which will suit your purposes is hard to guess though.
Offhand, I don't know of many good books covering algorithms that use C as their primary vehicle for demonstration. A few obvious recommendations for books on algorithms in general (mostly language independent) would be:
The Art of Computer Programming by Donald Knuth. This is pretty much the class algorithms book. It's now (finally) up to four volumes. Knuth originally started on it in 1967, planning to write 7 volumes. Only three volumes were available for a long time. A fourth was added quite recently. At the rate he's going, it's only going to make it to 7 if Knuth lives to be well past 100 years old. Nonetheless, the parts that are there are extremely good -- but (warning!) he analyzes the algorithms in considerable detail; if you don't know at least a little calculus, a fair amount will probably be hard to follow.
Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein. IIRC, there's now a newer edition than I have, which adds yet another author. This is a large book (dropping it on your toes would be quite painful). It uses a fair amount of mathematical notation and such throughout, but if you're willing to work a little at looking up the notation, it's really pretty understandable. It covers quite a bit of important ground (e.g., graph algorithms) that are scheduled for later volumes of Knuth, but not (at least yet) available there.
Algorithms and Data Structures by Aho, Hopcraft and Ullman. This is (by a pretty fair margin) the smallest, lightest, and at least for most people probably the easiest of these to follow.
Though it's only available used anymore, if you can find a copy of Algorithms + Data Structures = Programs by Niklaus Wirth, that's what I'd really suggest. It uses Pascal (no surprise -- Niklaus Wirth invented Pascal), but that's enough like C that it doesn't cause a real problem. It doesn't go into as much depth as Knuth about each algorithm, but still enough to give a good feel for when one is likely to be a good choice versus another. For somebody in your position (some background in programming, but little in this area) it's my top recommendation.
Though I've said it before, I think it bears repeating: IMO, all of Robert Sedgewick's books on algorithms should be avoided. Algorithms in C++ is probably the worst of them, but the others are only marginally better. The code they include (again, especially the C++ version) is truly execrable, and the descriptions of algorithms are often incomplete and/or misleading. The most recent editions have fixed some of the problems, but (IMO) not nearly enough to qualify as something that should ever be recommended. If there was no alternative, you could probably get by with these, but given the number of alternatives that are dramatically superior, the only reason to read these at all is if somebody gives them to you, and you absolutely can't afford anything else.
As far as algorithms versus design patterns goes, the line can get blurry in places, but generally an algorithm is much more tightly defined. An algorithm will normally have a specific, tightly defined input which it processes in a specific way to produce an equally specific result/output. A design pattern tends to be more loosely defined, more generic. An algorithm can be generic as well (e.g., a sorting algorithms might require a type that defines a strict, weak ordering) but still has specific requirements on the type.
A design pattern tends to be somewhat more loosely defined. For example, the visitor pattern involves processing groups of objects -- but we don't want to modify the types of those objects when we decide we need to process them in a new and different way. We do that by defining the processes separately from the objects to be processed, along with how we'll traverse the groups of objects, and allow a process to work with each.
To look at it from a rather different direction, you can usually implement an algorithm with a function or a small group of functions. A design pattern tends to be oriented more toward the style in which you write your code, rather than just "here's a function, use it."
"Algorithms in C, Parts 1-5 (Bundle): Fundamentals, Data Structures, Sorting, Searching, and Graph Algorithms (3rd Edition)"
Cannot stress how good that series is.

What specific examples are there of knowing C making you a better high level programmer?

I know about the existance of question such as this one and this one. Let me explain.
Afet reading Joel's article Back to Basics and seeing many similar questions on SO, I've begun to wonder what are specific examples of situations where knowing stuff like C can make you a better high level programmer.
What I want to know is if there are many examples of this. Many times, the answer to this question is something like "Knowing C gives you a better feel of what's happening under the covers" or "You need a solid foundation for your program", and these answers don't have much meaning. I want to understand the different specific ways in which you will benefit from knowing low level concepts,
Joel gave a couple of examples: Binary databases vs XML, and strings. But two examples don't really justify learning C and/or Assembly. So my question is this: What specific examples are there of knowing C making you a better high level programmer?
My experience with teaching students and working with people who only studied high-level languages is that they tend to think at a certain high level of abstraction, and they assume that "everything comes for free". They can become very competent programmers, but eventually they have to deal with some code that has performance issues and then it comes to bite them.
When you work a lot with C, you do think about memory allocation. You often think about memory layout (and cache locality if that's an issue). You understand how and why certain graphics operations just cost a lot. How efficient or inefficient certain socket behaviors are. How buffers work, etc. I feel that using the abstractions in a higher level language when you do know how it is implemented below the covers sometimes gives you "that extra secret sauce" when thinking about performance.
For example, Java has a garbage collector and you can't directly assign things to memory directly. And yet, you can make certain design choices (e.g., with custom data structures) that affect performance because of the same reasons this would be an issue in C.
Also, and more generally, I feel that it is important for a power programmer to not only know big-O notation (which most schools teach), but that in real-life applications the constant is also important (which schools try to ignore). My anecdotal experience is that people with skills in both language levels tend to have a better understanding of the constant, perhaps because of what I described above.
In addition, many higher level systems that I have seen interface with lower level libraries and infrastructures. For instance, some communications, databases or graphics libraries. Some drivers for certain devices, etc. If you are a power programmer, you may eventially have to venture out there and it helps to at least have an idea of what is going on.
Knowing low level stuff can help a lot.
To become a racing driver, you have to learn and understand the basic physics of how tyres grip the road. Anyone can learn to drive pretty fast, but you need a good understanding of the "low level" stuff (forces and friction, racing lines, fine throttle and brake control, etc) to get those last few percent of performance that will allow you to win the race.
For example, if you understand how the CPU architecture works in your computer, you can write code that works better with it (e.g. if you know you have a certain CPU cache size or a certain number of bytes in each CPU cache line, you can arrange your data structures and the way that you access them to make the best use of the cache - for example, processing many elements of an array in order is often faster than processing random elements, due to the CPU cache). If you have a multi-core computer, then understanding how low level techniques like threading work can gave huge benefits (just as not understanding the low level can lead to disaster in threading).
If you understand how Disk I/O and caching works, you can modify file operations to work well with it (e.g. if you read from one file and write to another, working on large batches of data in RAM can help reduce I/O contention between the reading and writing phases of your code, and vastly improve throughput)
If you understand how virtual functions work, you can design high-level code that uses virtual functions well. If used incorrectly they can severely hamper performance.
If you understand how drawing is handled, you can use clever tricks to improve drawing speed. e.g. You can draw a chessboard by alternately drawing 64 white and black squares. But it is often faster to draw 32 white sqares and then 32 black ones (because you only have to change the drawing colour twice instead of 64 times). But you can actually draw the whole board black, then XOR 4 stripes across the board and 4 stripes down the board in white, and this can be much faster still (2 colour changes, and only 9 rectangles to draw instead of 64). This chessboard trick teaches you a very important programming skill: Lateral thinking. By designing your algorithm well, you can often make a big difference to how well your program operates.
Understanding C, or for that matter, any low level programming language, gives you an opportunity to understand things like memory usage (i.e. why is it a bad thing to create several million heavy objects), how pointers/object references work, etc.
The problem is that as we've created ever increasing levels of abstraction, we find ourselves doing a lot of 'lego block' programming, without understanding how the legos actually function. And by having almost infinite resources, we start treating memory and resources like water, and tend to solve problems by throwing more iron at the situation.
While not limited to C, there's a tremendous benefit to working at a low level with much smaller, memory constrained systems like the Arduino or old-school 8-bit processors. It lets you experience close to the metal coding in a much more approachable package, and after spending time squeezing apps into 512K, you will find yourself applying these skills at a larger level within your day to day programming.
So the language itself is not important, but having a deeper appreciation for how all of the bits come together, and how to work effectively at a level closer to the hardware is a set of skills beneficial to any software developer.
For one, knowing C helps you understand how memory works in the OS and in other high level languages. When your C# or Java program balloons on memory usage, understanding that references (which are basically just pointers) take memory too, and understand how many of the data structures are implemented (which you get from making your own in C) helps you understand that your dictionary is reserving huge amounts of memory that aren't actually used.
For another, knowing C can help you understand how to make use of lower level operating system features. You don't need this often, but sometimes you may need memory mapped files, or to use marshalling in C#, and C will greatly help understand what you're doing when that happens.
I think C has also helped my understanding of network protocols, but I can't put my finger on specific examples. I was reading another SO question the other day where someone was complaining about how C's bit-fields are 'basically useless' and I was thinking how elegantly C bit fields represent low-level network protocols. High level languages dealing with structures of bits always end up a mess!
In general, the more you know, the better programmer you will be.
However, sometimes knowing another language, such as C, can make you do the wrong thing, because there might be an assumption that is not true in a higher-level language (such as Python, or PHP). For example, one might assume that finding the length of a list might be O(N) where N is the length of the list. However, this is probably not the case in many high-level language instances. In Python, for most list-like things the cost is O(1).
Knowing more about the specifics of a language will help, but knowing more in general might lead one to make incorrect assumptions.
Just "knowing" C would not make you better.
But, if you understand the whole thing, how native binaries work, how does CPU work with it, what are architecture limitations, you may write a code which is easier for CPU.
For example, how L1/L2 caches affect your work, and how should you write your code to have more hits in L1/L2 caches. When working with C/C++ and doing heavy optimizations, you will have to go down to that kind of things.
It isn't so much knowing C as it is that C is closer to the bare metal than many other languages. You need to be more aware of how to allocate/deallocate memory because you have to do it yourself. Doing it yourself helps you understand the implications of many decisions that you make.
To me any language is acceptable as long as you understand how the compiler/interpreter (basically) maps your code onto the machine. It's a bit easier to do in a language that exposes this directly, but you should be able to, with a bit of reading, figure out how memory is allocated and organized, what sort of indexing patterns are more optimal than others, what constructs are more efficient for particular applications, etc.
More important, I think, is a good understanding of operating systems, memory architectures, and algorithms. If you understand how your algorithm works, why it would be better to choose one algorithm or data structure over another (e.g., HashSet vs. List), and how your code maps onto the machine, it shouldn't matter what language you are using.
This is my experience of how I learnt and taught myself programming, specifically, understanding C, this is going back to early 1990's so may be a bit antique, but the passion and the drive is important:
Learn to understand the low level principles of the computer, such as EGA/VGA programming, here's a link to the Simtel archive on the C programmer's guide to the PC.
Understanding how TSR's work
Download the whole archive of Bob Stout's snippets which is a big collection of C code that does one thing only - study them and understand it, not alone that, the collection of snippets strives to be portable.
Browse at the International Obfuscated C Code Contest (IOCCC) online, and see how the C code can be abused and understand the intracies of the language. The worst code abuse is the winner! Download the archives and study them.
Like myself, I loved the infamous Ponzo's C Tutorial which helped me immensely, unfortunately, the archive is very hard to find. If anyone knows of where to obtain them, please leave a comment and I will amend this answer to include the link. There is another one that I can remember - Coronado's [Generic?] C Tutorial, again, my memory on this one is hazy...
Look at Dr. Dobb's journal and C User Journal here - I do not know if you can still get them in print but they were a classic, can remember the feeling of holding a printed copy in my hand and tearing off home to type in the code to see what happens!
Grab an ancient copy of Turbo C v2 which I believe you can get from borland.com and just play with 16bit C programming to get a feel and mess with the pointers...sure it is ancient and old but playing with pointers on it is fine.
Understand and learn Pointers, link here to the legacy Simtel.net - a crucial link to achieving C Guru'ship for want of a better word, also you will find a host of downloads pertaining to the C programming language - I remember actually ordering the Simtel CD Archive and looking for the C stuff...
A couple of things that you have to deal directly with in C that other languages abstract away from you include explicit memory management (malloc) and dealing directly with pointers.
My girlfriend is one semester from graduating MIT (where they mainly use Java, Scheme, and Python) with a Computer Science degree, and she is currently working at a company whose codebase is in C++. For the first few days she had a difficult time understanding all the pointers/references/etc.
On the other hand, I found moving from C++ to Java very easy, because I was never confused about pass-references-by-value vs pass-by-reference.
Similarly, in C/C++ it is much more apparent that primitives are just the compiler treating the same sets of bits in different ways, as opposed to a language like Python or Ruby where everything is an object with its own distinct properties.
A simple (not entirely realistic) example to illustrate some of the advice above. Consider the seemingly harmless
while(true)
for(Iterator iter = foo.iterator(); iter.hasNext();)
bar.doSomething( iter.next() )
or the even higher level
while(true)
for(Baz b: foo)
bar.doSomething(b)
A possible problem here is that each time round the while loop a new object (the iterator) is created. If all you care about is programmer convenience, then the latter is definitely better. But if the loop has to be efficient or the machine is resource constrained then you are pretty much at the mercy of the designers of your high level language.
For example, a typical complaint for doing high-performance Java is having execution stop while garbage (such as all those allocated Iterator objects) is reclaimed. Not very good if your software is charged with tracking incoming missiles, auto-piloting a passenger jet, or just not leaving the user wondering why the GUI has stopped responding.
One possible solution (still in the higher-level language) would be to weaken the convenience of the iterator to something like
Iterator iter = new Iterator();
while(true)
for(foo.initAlreadyAllocatedIterator(iter); iter.hasNext();)
bar.doSomething(iter.next())
But this would only make sense if you had some idea about memory allocation...otherwise it just looks like a nasty API. Convenience always costs somewhere, and knowing lower-level stuff can help you identify and mitigate those costs.

Why do safety requirements like to discourage use of AI?

Seems that requirements on safety do not seem to like systems that use AI for safety-related requirements (particularly where large potential risks of destruction/death are involved). Can anyone suggest why? I always thought that, provided you program your logic properly, the more intelligence you put in an algorithm, the more likely this algorithm is capable of preventing a dangerous situation. Are things different in practice?
Most AI algorithms are fuzzy -- typically learning as they go along. For items that are of critical safety importance what you want is deterministic. These algorithms are easier to prove correct, which is essential for many safety critical applications.
I would think that the reason is twofold.
First it is possible that the AI will make unpredictable decisions. Granted, they can be beneficial, but when talking about safety-concerns, you can't take risks like that, especially if people's lives are on the line.
The second is that the "reasoning" behind the decisions can't always be traced (sometimes there is a random element used for generating results with an AI) and when something goes wrong, not having the ability to determine "why" (in a very precise manner) becomes a liability.
In the end, it comes down to accountability and reliability.
The more complex a system is, the harder it is to test.
And the more crucial a system is, the more important it becomes to have 100% comprehensive tests.
Therefore for crucial systems people prefer to have sub-optimal features, that can be tested, and rely on human interaction for complex decision making.
From a safety standpoint, one often is concerned with guaranteed predictability/determinism of behavior and rapid response time. While it's possible to do either or both with AI-style programming techniques, as a system's control logic becomes more complex it's harder to provide convincing arguments about how the system will behave (convincing enough to satisfy an auditor).
I would guess that AI systems are generally considered more complex. Complexity is usually a bad thing, especially when it relates to "magic" which is how some people perceive AI systems.
That's not to say that the alternative is necessarily simpler (or better).
When we've done control systems coding, we've had to show trace tables for every single code path, and permutation of inputs. This was required to insure that we didn't put equipment into a dangerous state (for employees or infrastructure), and to "prove" that the programs did what they were supposed to do.
That'd be awfully tricky to do if the program were fuzzy and non-deterministic, as #tvanfosson indicated. I think you should accept that answer.
The key statement is "provided you program your logic properly". Well, how do you "provide" that? Experience shows that most programs are chock full of bugs.
The only way to guarantee that there are no bugs would be formal verification, but that is practically infeasible for all but the most primitively simple systems, and (worse) is usually done on specifications rather than code, so you still don't know of the code correctly implements your spec after you've proven the spec to be flawless.
I think that is because AI is very hard to understand and that becomes impossible to maintain.
Even if a AI program is considered fuzzy, or that it "learns" by the moment it is released, it is very well tested to all know cases(and it already learned from it) before its even finished. Most of the cases this "learning" will change some "thresholds" or weights in the program and after that, it is very hard to really understand and maintain that code, even for the creators.
This have been changing in the last 30 years by creating languages easier to understand for mathematicians, making it easier for them to test, and deliver new pseudo-code around the problem(like mat lab AI toolbox)
As there is no accepted definition of AI, the question shall be more specific.
My answer is on adaptive algorithms merely employing parameter estimation - a kind of learning - to improve the safety of the output information. Even this is not welcome in functional safety although it may seem that the behaviour of a proposed algorithm is not only deterministic (all computer programs are) but also easy to determine.
Be prepared for the assessor asking you to demonstrate test reports covering all combinations of input data and failure modes. Your algorithm being adaptive means it depends not only on current input values but on many or all of the earlier values. You know that a full test coverage is impossible within the age of the universe.
One way to score is showing that previously accepted simpler algorithms (state of the art) are not safe. This shall be easy if you know your problem space (if not, keep away from AI).
Another possibility may exist for your problem: a compelling monitoring function indicating whether the parameter is estimated accurately.
There are enough ways that ordinary algorithms, when shoddily designed and tested, can wind up killing people. If you haven't read about it, you should look up the case of Therac 25. This was a system where the behaviour was supposed to be completely deterministic, and things still went horribly, horribly wrong. Imagine if it were trying to reason "intelligently", too.
"Ordinary algorithms" for a complex problem space tend to be arkward. On the other hand, some "intelligent" algorithms have a simple structure. This is especially true for applications of Bayesian inference. You just have to know the likelihood function(s) for your data (plural applies if the data separates into statistically independent subsets).
Likelihood functions can be tested. If the test cannot cover the tails far enough to reach the required confidence level, just add more data, for example from another sensor. The structure of your algorithm will not change.
A drawback is/was the CPU performance required for Bayesian inference.
Besides, mentioning Therac 25 is not helpful, since no algorithm at all was involved, just multitasking spaghetti code. Citing the authors, "[the] accidents were fairly unique in having software coding errors involved -- most computer-related accidents have not involved coding errors but rather errors in the software requirements such as omissions and mishandled environmental conditions and system states."

Why can Conway’s Game of Life be classified as a universal machine?

I was recently reading about artificial life and came across the statement, "Conway’s Game of Life demonstrates enough complexity to be classified as a universal machine." I only had a rough understanding of what a universal machine is, and Wikipedia only brought me as close to understanding as Wikipedia ever does. I wonder if anyone could shed some light on this very sexy statement?
Conway's Game of Life seems, to me, to be a lovely distraction with some tremendous implications: I can't make the leap between that and calculator? Is that even the leap that I should be making?
Paul Rendell implemented a Turing machine in Life. Gliders represent signals, and interactions between them are gates and logic that together can create larger components which implement the Turing machine.
Basically, any automatic machinery that can implement AND, OR, and NOT can be combined together in complex enough ways to be Turing-complete. It's not a useful way to compute, but it meets the criteria.
You can build a Turing machine out of Conway's life - although it would be pretty horrendous.
The key is in gliders (and related patterns) - these move (slowly) along the playing field, so can represent streams of bits (the presence of a glider for a 1 and the absence for a 0). Other patterns can be built to take in two streams of gliders (at right angles) and emit another stream of bits corresponding to the AND/OR/etc of the original two streams.
EDIT: There's more on this on the LogiCell web site.
Conway's "Life" can be taken even further: It's not only possible to build a Life pattern that implements a Universal Turing Machine, but also a Von Neumann "Universal Constructor:" http://conwaylife.com/wiki/Universal_constructor
Since a "Universal Constructor" can be programmed to construct any pattern of cells, including a copy of itself, Coway's "Life" is therefore capable of "self-replication," not just Universal Computation.
I highly recommend the book The Recursive Universe by Poundstone. Out of print, but you can probably find a copy, perhaps in a good library. It's almost all about the power of Conway's Life, and the things that can exist in a universe with that set of natural laws, including self-reproducing entities and IIRC, Darwinian evolution.
And Paul Chapman actually build a universal turing machine with game of life: http://www.igblan.free-online.co.uk/igblan/ca/ by building a "Universal Minsky Register Machine".
The pattern is constructed on a
lattice of 30x30 squares. Lightweight
Spaceships (LWSSs) are used to
communicate between components, which
have P60 logic (except for Registers -
see below). A LWSS takes 60
generations to cross a lattice square.
Every 60 generations, therefore, any
inter-component LWSS (pulse) is in the
same position relative to the square
it's in, allowing for rotation
.

When is theoretical computer science useful?

In class, we learned about the halting problem, Turing machines, reductions, etc. A lot of classmates are saying these are all abstract and useless concepts, and there's no real point in knowing them (i.e., you can forget them once the course is over and not lose anything).
Why is theory useful? Do you ever use it in your day-to-day coding?
True story:
When I got my first programming job out of graduate school, the guys that owned the company that I worked for were pilots. A few weeks after I was hired, one of them asked me this question:
There are 106 airports in Arkansas.
Could you write a program that would
find the shortest rout necessary to
land at each one of them?
I seriously thought he was quizzing me on my knowledge of the Traveling Salesman Problem and NP-Completeness. But it turns out he wasn't. He didn't know anything about it. He really wanted a program that would find the shortest path. He was surprised when I explained that there were 106-factorial solutions and finding the best one was a well-known computationally intractable problem.
So that's one example.
When I graduated from college, I assumed that I was on par with everyone else: "I have a BS in CS, and so do a lot of other people, and we can all do essentially the same things." I eventually discovered that my assumption was false. I stood out, and my background had a lot to do with it--particularly my degree.
I knew that there was one "slight" difference, in that I had a "B.S." in CS because my college was one of the first (supposedly #2 in 1987) in the nation to receive accreditation for its CS degree program, and I graduated in the second class to have that accreditation. At the time, I did not think that it mattered much.
I had also noticed during high school and in college that I did particularly well at CS--much better than my peers and even better than many of my teachers. I was asked for help a lot, did some tutoring, was asked to help with a research project, and was allowed to do independent study when no one else was. I was happy to be able to help, but I did not think much about the difference.
After college (USAFA), I spent four years in the Air Force, two of which were applying my CS degree. There I noticed that very few of my coworkers had degrees or even training related to computers. The Air Force sent me to five months of certification training, where I again found a lack of degrees or training. But here I started to notice the difference--it became totally obvious that many of the people I encountered did not REALLY know what they were doing, and that included the people with training or degrees. Allow me please to illustrate.
In my Air Force certification training were a total of thirteen people (including me). As Air Force officers or the equivalent, we all had BS degrees. I was in the middle based on age and rank (I was an O-2 amongst six O-1s and six O-3s and above). At the end of this training, the Air Force rubber-stamped us all as equally competent to acquire, build, design, maintain, and operate ANY computer or communication system for ANY part of the Department of Defense.
However, of the thirteen of us, only six had any form of computer-related degree; the other seven had degrees ranging from aeronautics to chemistry/biology to psychology. Of the six of us with CS degrees, I learned that two had never written a program of any kind and had never used a computer more than casually (writing papers, playing games, etc.). I learned that another two of us had written exactly one program on a single computer during their CS degree program. Only one other person and myself had written more than one program or used more than one kind of computer--indeed, we found that we two had written many programs and used many kinds of computers.
Towards the end of our five-month training, our class was assigned a programming project and we were divided into four groups to separately undertake it. Our instructors divided up the class in order to spread the "programming talent" fairly, and they assigned roles of team lead, tech lead, and developer. Each group was given a week to implement (in Ada) a full-screen, text-based user interface (this was 1990) for a flight simulator on top of an instructor-provided flight-mechanics library. I was assigned as tech lead for my team of four.
My team lead (who did not have a computer degree) asked the other three of us to divide up the project into tasks and then assigned a third of them to each of us. I finished my third of the tasks by the middle of that first day, then spent the rest of the day helping my other two teammates, talking to my team lead (BSing ;^), and playing on my computer.
The next morning (day two), my team lead privately informed me that our other two teammates had made no progress (one could not actually write an "if" statement that would compile), and he asked me to take on their work. I finished the entire project by mid-afternoon, and my team spent the rest of the day flying the simulator.
The other guy with the comparable CS degree was also assigned as a tech lead for his team, and they finished by the end of day three. They also began flying their simulator. The other two teams had not finished, or even made significant progress, by the end of the week. We were not allowed to help other teams, so it was left at that.
Meanwhile, by the middle of day three, I had noticed that the flight simulator just seemed to behave "wrong". Since one of my classmates had a degree in aeronautics, I asked him about it. He was mystified, then confessed that he did not actually know what made a plane fly!?! I was dumbfounded! It turns out that his entire degree program was about safety and crash investigations--no real math or science behind flight. On the other hand, I had maybe a minor in aeronautics (remember USAFA?), but we had designed wings and performed real wind tunnel tests. Therefore, I quietly spent the rest of the week rewriting the instructor-provided flight-mechanics library until the simulator flew "right".
Since then, I have spent nearly two decades as a contractor and occasionally as an employee, always doing software development plus related activities (DBA, architect, etc.). I have continued to find more of the same, and eventually I gave up on my youthful assumption.
So, what exactly have I discovered? Not every one is equal, and that is okay--I am not a better person because I can program effectively, but I am more useful IF that is what you need from me. I learned that my background really mattered:
growing up in a family of electricians and electrical engineers,
building electronics kits,
reading LITERALLY every computer book in the school/public libraries because I did not have access to a real computer,
then moving to a new city where my high school did have computers,
then getting my own computer as a gift,
going to schools that had computers of many different sizes and kinds (PCs to mainframes),
getting an accredited degree from a VERY good engineering school,
having to write lots of programs in different languages on different kinds of computers,
having to write hard programs (like my own virtual machine with a custom assembly language, or a Huffman compression implementation, etc.),
having to troubleshoot for myself,
building my own computers from parts and installing ALL the software,
etc.
Ultimately, I learned that my abilities are built on a foundation of knowing how computers work from the electrical level on up--discrete electronic components, circuitry, subsystems, interfaces, protocols, bits, bytes, processors, devices, drivers, libraries, programs, systems, networks, on up to the massive enterprise-class conglomerates that I routinely work on now. So, when the damn thing misbehaves, I know exactly HOW and WHY. And I know what cannot be done as well as what can. And I know a lot about what has been done, what has been tried, and what is left relatively unexplored.
Most importantly, after I have learned all that, I have learned that I don't know a damned thing. In the face of all that there is potentially to know, my knowledge is miniscule.
And I am quite content with that. But I recommend that you try.
Sure, it's useful.
Imagine a developer working on a template engine. You know the sort of thing...
Blah blah blah ${MyTemplateString} blah blah blah.
It starts out simple, with a cheeky little regular expression to peform the replacements.
But gradually the templates get a little more fancy, and the developer includes features for templatizing lists and maps of strings. To accomplish that, he writes a simple little grammar and generates a parser.
Getting very crafty, the template engine might eventually include a syntax for conditional logic, to display different blocks of text depending on the values of the arguments.
Someone with a theoretical background in CS would recognize that the template language is slowly becoming Turing complete, and maybe the Interpreter pattern would be a good way to implement it.
Having built an interpreter for the templates, a computer scientist might notice that the majority of templating requests are duplicates, regenerating the same results over and over again. So a cache is developed, and all requests are routed through the cache before performing the expensive transformation.
Also, some templates are much more complex than others and take a lot longer to render. Maybe someone gets the idea to estimate the execution of each template before rendering it.
But wait!!! Someone on the team points out that, if the template language really is Turing complete, then the task of estimating the execution time of each rendering operating is an instance of the Halting Problem!! Yikes, don't do that!!!
The thing about theory, in practice, is that all practice is based on theory. Theoretically.
The things I use most:
computational complexity to write algorithms that scale gracefully
understanding of how memory allocation, paging, and CPU caching work so I can write efficient code
understanding of data structures
understanding of threading, locking, and associated problems
As to that stuff on Turing machines etc. I think it is important because it defines the constraints under which we all operate. Thats important to appreciate.
it's the difference between learning algebra and being taught how to use a calculator
if you know algebra, you realize that the same problem may manifest in different forms, and you understand the rules for transforming the problem into a more concise form
if you only know how to use a calculator, you may waste a lot of time punching buttons on a problem that is either (a) already solved, (b) cannot be solved, or (c) is like some other problem (solved or unsolved) that you don't recognize because it's in a different form
pretend, for a moment, that computer science is physics... would the question seem silly?
A friend of mine is doing work on a language with some templates. I was asked in to do a little consulting. Part of our discussion was on the template feature, because if the templates were Turing complete, they would have to really consider VM-ish properties and how/if their compiler would support it.
My story is to this point: automata theory is still taught, because it still has relevance. The halting problem still exists and provides a limit to what you can do.
Now, do these things have relevance to a database jockey hammering out C# code? Probably not. But when you start moving to a more advanced level, you'll want to understand your roots & foundations.
Although I don't directly apply them in day-to-day work, I know that my education on formal computer science has affected my thinking process. I certainly avoid certain mistakes from the onset because I have the lessons learned from the formal approaches instilled in me.
It might seem useless while they're learning it; but I bet your classmate will eventually comes across a problem where they'll use what they were taught, or at least the thinking patterns behind it...
Wax on... Wax off... Wax on... Wax off... What does that have to do with Karate, anyways?
At one job I was assigned the task of improving our electrical distribution model's network tracing algorithm as the one they were using was too slow. The 3-phase network was essentially three n-trees (since loops aren't allowed in electrical networks). The network nodes were in the database and some of the original team couldn't figure out how to build an in-memory model so the tracing was done by successive depth SELECTs on the database, filtering on each phase. So to trace a node ten nodes from the substation would require at least 10 database queries (the original team members were database whizzes, but lacked a decent background in algorithms).
I wrote a solution that transformed the 3 n-tree networks of nodes from the database into a data structure where each node was stored once in a node structure array and the n-tree relationship was converted to three binary trees using doubly-linked pointers within the array so that the network could be easily traced in either direction.
It was at least two orders of magnitude faster, three on really long downstream traces.
The sad thing was that I had to practically teach a class in n-trees, binary trees, pointers, and doubly-linked lists to several of the other programmers who had been trained on databases and VB in order for them to understand the algorithms.
It's a classic dichotomy, between "how" and "what". Your classmates are looking at "how" to program software, and they're very focused on the near focus; from that perspective, the perspective of implementation, it seems like knowing things like halting states and Turing machines are unimportant.
"How" is very little of the actual work that you get expected to do with Computer Science, though. In fact, most successful engineers I know would probably put it at less than 20 percent of the actual job. "What" to do is by far more important; and for that, the fundamentals of Computer Science are critical. "What" you want to do relates much more to design than implementation; and good design is... well, let's just call it "non-trivial".
"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult." - C.A.R. Hoare
Good luck with your studies!
I think understanding the fundamental models of computation is useful: sure you never need to be able to translate a Turing machine into a register machine in practice, but learning how to see that two very different problems are really instances of the same concept is a critical skill.
Most knowledge is not "practical", but helps you connect dots in ways that you cannot anticipate, or gives you a richer vocabulary for describing more complex ideas.
It's not the specific problems that you study that matters, it's the principles that you learn through studying them. I use concepts about algorithms, data structures, programming languages, and operating systems every day at my job. If you work as a programmer you'll make decisions all the time that affect system performance. You need to have a solid foundation in the fundamental abstract concepts in order to make the right choices.
After i graduated from CS I thought similarly: the whole bunch of theories that we studied are completely useless in practice. This proved to be right for a short period of time, however the moment you deal with complex tasks, theory is definitely MORE VALUABLE than practice. every one after few years of coding can write some programs that "work" but not every one is able to understand how. no matter what most of us will deal at a certain point with performance issues, network delays, precission, scalability etc. At this stage the theory is critical. in order to design a good solution when dealing with complex systems is very important to know how the memory management works, the concepts of process and threads, how memory is assigned to them, or efficient data structures for performance and so on.
One time for example i was working on a project including plenty of mathematical calculations and at a certain point our software failed. while debugging i figured out that after some mathematical operation i received a number as DOUBLE of a value 1.000000000002 but from the mathematical perspective couldnt be > 1 which at some later stage in the program was giving the legendary NaN exception. i spent some time to figure this out but if i had paid more attention to the "approximation of Double and Float" lesson i would have not wasted that time.
If you work in a company that does groundbreaking work, it is important to be able to communicate to architects and developers what the benefits are. There is a lot of hype about all kinds of technologies and positioning yourself can be difficult. When you frame your innovation in scientific and theoretical terms you are definitely at an advantage and customers sense you are the real thing. I can tell folks: there is a new way to deal with state, encoding and nondeterminism (i.e. complexities) and you can definitely be more productive than you are today.
If you take the long view in your career learning about theory will give you depth, the depth you need to grow. The return on investment in learning your 5th or 6th programming language will be a lot less then learning your 2nd and 3rd. Exposure to theory will give you a sense for real engineering, about where the degrees of freedom are and how you can make the right trade-offs.
The important concepts 1) State, 2) Encoding, 3) Nondeterminism. If you don't know them they will not help you. What theory should provide you with is the big picture and a sense of how basic concepts fit together. It should help you hone your intuition.
Example: some of the answers above mention the halting problem and Turing machines. When I came across Turing's paper in college I did not feel enlightened at all. One day I came across Goedel's incompleteness theorem and Goedel numbering in particular. Things started to make a lot of sense. Years later I read about Georg Cantor at a bookstore. Now I really started to understand Turing machines and the halting problem. Try for yourself and look up "Cantor's Diagonal Argument" on Wikipedia. It is one of the most awesome things intellectually you will ever encounter.
Food for thought: A typical Turing machine is not the only way to design a state transition machine. A Turing machine with two rather than one tape would give you a lot more speed for a number of algorithms. http://www.math.ucla.edu/~ynm/papers/eng.ps
You can expose yourself to these insights more efficiently then I did by reading this book. Link at the bottom of this post. (At the very least, check out the table of contents on Amazon to get a taste of what this is all about):
I found the book by Rosenberg sensational. http://www.amazon.com/The-Pillars-Computation-Theory-Nondeterminism/dp/0387096388 If you have only one book on theory IMHO this should be the one.
I do not use it on a daily basis. But it gave me a lot of understanding that helps me each day.
I found that all I need for daily bliss from the CS theoretical world is the utterance of the mantra "Low coupling and High Cohesion". Roger S. Pressman made it scholarly before Steve McConnell made it fashionable.
Ya, I generally use state diagrams to design the shape and flow of the program.
Once it works in theory, I start coding and testing, checking off the states as i go.
I find that they are also a useful tool to explain the behavior of a process to other people.
Simple. For example: if I'm using RSACryptoServiceProvider I'd like to know what is that and why I can trust it.
Because C++ templates are actually some kind of lambda calculus. See www.cs.nott.ac.uk/types06/slides/michelbrink_types_06.pdf
I'm studying for my Distributed algorithms course now. There is a chapter about fault tolerance and it contains some proofs on the upper bound for how many processes can fail (or misbehave) so that the distributed algorithm can handle it correctly.
For many problems, the bound for misbehaving processes is up to one third of total number of processes. This is quite useful in my opinion because you know that it's pointless to try to develop a better algorithm (under given assumptions).
Even if theoretical courses aren't going to be used directly, it might help you think better of something.
You don't know what your boss is going to ask you to do, you may have to use something that you thought it won't be benefical, as Jeffrey L Whitledge said.
To be honest, I sort of disagree with a lot of the answers here. I wrote my first compiler (for fun; I really have too much coffee/free time) without having taken a course in compilers; basically I just scanned the code for another compiler and followed the pattern. I could write a parser in C off the top of my head, but I don't think I could remember how to draw a pushdown automaton if my life depended on it.
When I decided I wanted to put type inference in my toy (imperative) programming language, I first looked over probably five papers, staring at something called "typed lambda calculus" going what.... the.... ****....? At first I tried implementing something with "generic variables" and "nongeneric variables" and had no idea what was going on. Then I scrapped it all, and sat there with a notebook figuring out how I could implement it practically with support for all the things I needed (sub-typing, first-class functions, parameterized types, etc.) With a couple days of thinking & writing test programs, I blew away more than a weeks worth of trying to figure out the theoretical crap.
Knowing the basics of computing (i.e. how virtual memory works, how filesystems work, threading/scheduling, SMP, data structures) have all proved HIGHLY useful. Complexity theory and Big-O stuff has sometimes proved useful (especially useful for things like RDBMS design). The halting problem and automata/Turing Machine theory? Never.
I know this is old, but my short reply to those who claim that theory is 'useless' and that they can practice their profession without it is this:
Without the underlying theory, there is no practice.
Why is theory useful?
Theory is the underlying foundation on top of which other things are built. When theory is applied, practice is the result.
Consider computers today. The common computer today is modeled and built on top of the Turing Machine, which, to keep it simple, is an abstract/theoretical model for computation. This theoretical model lies at the foundation of computing, and all the computing devices we use today, from high-end servers to pocket phones, work because the underlying foundation is sound.
Consider algorithm analysis. In simple terms, algorithm analysis and time-complexity theory have been used to classify problems (e.g. P, NP, EXP, etc) as well as how the algorithms we have behave when trying to solve different problems in different classes.
Suppose one of your friends gets a job at some place X and, while there, a manager makes a few simple requests, such as these examples:
Ex 1: We have a large fleet of delivery vehicles that visit different cities across several states. We need you to implement a system to figure out what the shortest route for each vehicle is and choose the optimal one out of all the possibilities. Can you do it?
Thinking the theory is 'useless' your friends don't realize that they've just been given the Traveling Salesman Problem (TSP) and start designing this system without a second thought, only to discover their naive attempt to check all the possibilities, as originally requested, is so slow their system is unusable for any practical purposes.
In fact, they have no idea why the system works at an "acceptable" level when checking 5 cities, yet becomes very slow at 10 cities, and just freezes when going up to only 40 cities. They reason that it's only "2x and 8x more cities than the 5 city test" and wonder why the program does not simply require "2x and 8x more time" respectively...
Understanding the theory would've allowed them to realize the following, at least at a glance:
It's the TSP
The TSP is NP-hard
Their algorithm's order of growth is O(n!)
The numbers speak for themselves:
+--------------+-------+-----------------------------------------------------------------+
| No. Cities | O(N!) | Possibilities |
+--------------+-------+-----------------------------------------------------------------+
| 5 | 5! | 120 |
| 10 | 10! | 3,628,800 |
| 40 | 40! | 815,915,283,247,897,734,345,611,269,596,115,894,272,000,000,000 | <-- GG
+--------------+-------+-----------------------------------------------------------------+
They could've realized at the outset that their system was not going to work as they imagined it would. The system was later considered impractical and cancelled after a significant amount of time, effort, and other resources had been allocated to, and ultimately wasted on, the project --and all because thought "theory is useless".
So after this failure, the managers think "Well, maybe that system was underestimated; after all, there're a LOT of cities in our country and our computers are simply not as fast as we need them to be for our recently cancelled system to have been a success".
The management team blames slow computers as the cause of the project's failure. After all, they're not experts in CS theory, don't need to be, and those who're supposed to be the experts on the topic and could've informed them, didn't.
But they have another project in mind. A simpler one actually. They come the week later and ask say the following:
Ex 2: We have only a few servers and we have programmers who keep submitting programs that, due to unknown reasons, end up in infinite cycles and hogging down the servers. We need you to write a program that will process the code being submitted and detect whether the submitted program will cause an infinite cycle during its run or not, and decide whether the submitted program should be allowed to run on this basis. Can you do it?
Your dear friend accepts the challenge again and goes to work immediately. After several weeks of work, there're no results, your friend is stressed, and doesn't know what to do. Yet another failure... your friend now feels "dumb" for not having been able to solve this "simple problem"... after all, the request itself made it sound simple.
Unfortunately, your friend, while insisting that "theory is useless" didn't realize that the, allegedly simple, request was actually an intractable problem about decidability (i.e. the halting problem itself), and that there was no known solution for it. It was an impossible task.
Therefore, even starting work to solve that particular problem was an avoidable and preventable mistake. Had the theoretical framework to understand what was being requested been in place, they could've just proposed a different, and achievable, solution... such as implementing a monitoring process that can simply kill -SIGTERM <id> of any user process (as per a list of users) that monopolizes the CPU for some arbitrary/reasonable interval under certain assumptions (e.g. we know every program run should've terminated within 10 minutes, so any instance running for 20+ minutes should be killed).
In conclusion, practice without the theory is like a building without a foundation. Sooner or later, the right amount of pressure from the right angle will make it collapse in on itself. No exceptions.
Do you ever use it in your day-to-day coding?
Yes, but not directly. Rather, we rely on it indirectly. The caveat here is that different theoretical concepts will be more or less applicable depending on the problem domain you happen to be working on.
Surely, we:
use computers daily, which relies on computational models (e.g. turing machines)
write code, which relies on computability theory (e.g. what's even computable) and lambda calculus (e.g. for programming languages)
rely on color theory and models (e.g. RGB and CMYK color models) for color displays and printing, etc.
Euler's theorems in computer graphics so that matrices can be built to rotate objects about arbitrary axes, and so on...
It's a fact that someone who simply use a plane to travel doesn't need to understand the theory that even allowed planes to be built and fly in the first place... but when someone is expected to build said machines and make them work... can you really expect a good outcome from someone who doesn't understand even the principles of flight?
Was it really a coincidence that, for most of history, no one was able to build a flying machine (and a few even died testing theirs) until the Wright brothers understood certain theoretical concepts about flight and managed to put them into practice?
It's no coincidence. We have a lot of working technology today because the people who built them understood, and applied, the theoretical principles that allowed them to work in the first place.
I guess it depends on which field you go into.

Resources