In C-type languages, there is a strong emphasis on structs/records and objects from the very beginning and in every introductory book. Then, their complete systems are designed around managing such structs, their mutual relations and inheritance.
In Lisp documentation, you can usually find 1-2 pages about how Lisp "also" has a defstruct, a simple example, and thats usually it. Also, nesting of structures is never mentioned at all.
For someone coming from a C background, it first seems that organizing different data types hierarchically isnt the preferred method in Lisp, but apart from CLOS, which is a full blown object system and too complicated if you just want structs, and apart from craming everything into lists, there isnt an apparent way to transfer your C struct knowledge.
What is the idiomatic Lisp way of hierarchically organizing data which most resembles C structs?
--
I think the summary answer to my question would be: For beginner learning purposes, defstruct and/or plists, although "legacy features", can be used, since they most closely resemble C structs, but that they have been largerly superseded by the more flexible defclass/CLOS, which what most Lisp programs use today.
This was my first question on SO, so thanks everyone for your time answering it.
Use CLOS. It isn't complicated.
Otherwise use structures.
If you have a specific question how to use them, then just ask.
(defclass point ()
((x :type number)
(y :type number)))
(defclass rectangle ()
((p1 :type point)
(p2 :type point)
(color :type color)))
Stuff like that eventually leads to interfaces like Rectangles in CLIM (the Common Lisp Interface Manager).
History
To expand on it a bit: Historically 'structures' have been used in some low-level situations. Structures have single inheritance and slot access is 'fast'. Some Lisp dialects have more to structures than what Common Lisp offers. Then from the mid-70s on various forms of object-oriented representations have been developed for Lisp. Most of the representation of structured objects moved from structures to some kind of object-oriented Lisp extension. Popular during the 80s were class-based systems like Flavors, LOOPS and others. Frame-based or prototype-based systems like KEE Units or Object Lisp were also popular. The first Macintosh Common Lisp used Object Lisp for all its UI and IO facilities. The MIT Lisp machine used Flavors basically everywhere. Starting in the mid 80s ANSI CL was developed. A common OO-system was developed especially for Common Lisp: CLOS. It was based on Flavors and Loops. During that time mostly nothing was done to really improve structures - besides implementors finding ways to improve the implementation and providing a shallow CLOS integration. For example structures don't provide any packing of data. If there are two slots of 4 bits content, there is no way to instruct Common Lisp to encode both slots into a single 8 bit memory region.
As an example you can see in the Lisp Machine Manual, chapter on structures (PDF), that it had much more complex structures than what Common Lisp provides. Some of that was already present in Maclisp in the 70s: DEFSTRUCT in the Maclisp manual.
CLOS, the Common Lisp Object System
Most people would agree that CLOS is a nice design. It sometimes leads to 'larger' code, mostly because identifiers can get long. But there is some CLOS code, like the one in the AMOP book, that is really nicely written and shows how it is supposed to be used.
Over time implementors had to deal with the challenge that developers wanted to use CLOS, but also wanted to have the 'speed' of structures. Which is even more a task with the 'full' CLOS, which includes the almost standard Meta Object Protocol (MOP) for CLOS. So there are some tricks that implementors provide. During the 80s some software used a switch, so it could compiled using structures or using CLOS - CLX (the low-level Common Lisp X11 interface was an example). The reason: on some computers and implementations CLOS was much slower than structures. Today it would be unusual to provide such a compilation switch.
If I look today at a good Common Lisp implementation, I would expect that it uses CLOS almost everywhere. STREAMs are CLOS classes. CONDITIONs are CLOS classes. The GUI toolkit uses CLOS classes. The editor uses CLOS. It might even integrate foreign classes (say, Objective C classes) into CLOS.
In any non-toy Common Lisp implementation CLOS will be the tool to provide structured data, generic behavior and a bunch of other things.
As mentioned in some of the other answers, in some places CLOS might not be needed.
Common Lisp can return more than one value from a function:
(defun calculate-coordinates (ship)
(move-forward ship)
(values (ship-x ship)
(ship-y ship)))
One can store data in closures:
(defun create-distance-function (ship x y)
(lambda ()
(point-distance (ship-x ship) (ship-y ship) x y)))
For configuration one can use some kind of lists:
(defship ms-germany :initial-x 0 :initial-y 0)
You can bet that I would implement the ship model in CLOS.
A lesson from writing and maintaining CLOS software is that it needs to be carefully designed and CLOS is so powerful that one can create really complex software with it - a complexity which is often not a good idea. Refactor and simplify! Fortunately, for many tasks basic CLOS facilities are sufficient: DEFCLASS, DEFMETHOD and MAKE-INSTANCE.
Pointers to CLOS introductions
For a start, Richard P. Gabriel has his CLOS papers for download.
Also see:
http://cl-cookbook.sourceforge.net/clos-tutorial/index.html
A Brief Guide to CLOS
Book chapter from Practical Common Lisp, Object Reorientation, Classes
Book chapter from Practical Common Lisp, Object Reorientation, Generic Functions
C++ Coder’s Newbie Guide to Lisp-style OO
Book: The Art of the Metaobject Protocol. According to some guy named Alan Kay the most important computer science book in a decade, unfortunately written for Lispers ;-). The book explains how to modify or extend CLOS itself. It also includes a simple CLOS implementation as source. For normal users this book is not really needed, but the programming style is that of real Lisp experts.
Examples with defstruct are short and simple because there isn't much to say about them. C's structs are complicated:
memory management
complicated memory layout due to unions, inline nested structures
In C, structs are also used for other purposes:
to access memory
due to the lack of polymorphism or ability to pass a value of 'any' type: it is idiomatic to pass around a void*
due to inability to have other means of passing data; e.g., in Lisp you can pass a closure which has the needed data
due to the lack of advanced calling conventions; some functions accept their arguments inside structures
In Common Lisp, defstruct is roughly equivalent to Java's/C#'s class: single inheritance, fixed slots, can be used as specifiers in defmethods (analogous to virtual methods). Structures are perfectly usable for nested data structures.
Lisp programs tend not to use deeply nested structures (Lisp's source code being the primary exception) due to that often more simple representations are possible.
Property Lists (plists)
I think the idiomatic equivalent of a C struct is to not need to store the data in structs in the first place. I'd say at least 50% of the C-style code I've ported to Lisp, rather than storing the data in some elaborate structure, I just compute what it is I want to compute. C needs structs to store everything temporarily because its expressions are so weak.
If you a specific example of some C-style code, I'm sure we could demonstrate an idiomatic way to implement it in Lisp.
Beyond that, remember that Lisp's s-exps are hierarchical data. An if expression in Lisp, for example, is itself hierarchical data.
(defclass point ()
( x y z))
(defun distance-from-origin (point)
(with-slots (x y z)
point
(sqrt (+ (* x x) (* y y) (* z z)))))
I think this is kind of what I was looking for, can be found here:
http://cl-cookbook.sourceforge.net/clos-tutorial/index.html
y
Why not use hash tables? Every member in the struct can be a key in a hash table.
Related
Arrays are one of the fundamental data structures and I was reading them up on LeetCode and saw the following -
They exist in all programming languages, and are used as the basis for most other data structures.
The claim that they exist in all programming languages seems a bit extravagant to me but maybe that's just because I am a beginner.
So are Arrays really present in all programming languages and I've seen some quite cryptic ones while exploring Code Golf pages which seem to be intentionally made so, so do they still use Arrays or are there any alternatives to arrays as well?
Almost every claim about "all programming languages" is false, simply because of the fact that there are so many programming languages that for every claim there is a counter example.
So let's restrict to practical programming languages.
Then, yes, every practical programming language offers the possibility to collect things of similar type in an ordered collection.
However, depending in your definition of 'Array', and on the exact conditions when arrays 'exist' in a programming language, this need not be an array. For example:
Some scripting languages (such as Perl, if I remember correctly) only have dictionaries, and an array is simply a dictionary with numbers as keys.
Some languages only have linked lists (for example some functional and logical programming languages).
Some languages don't have arrays or lists, but are expressive enough that linked lists can be easily defined.
you can find some home made language that do not respect any standard like the one used in the application ProRealTime: the name of the language is ProRealCode and is used to program indicators or scanners on financial markets.
it has a feature to get older data with x[index], but you cannot create proper array and loop in to it.
The main problem of this home made language are:
they can have some bugs (so you do not know if your code is buggy or the execution machine)
you have no serious debugger
For me, an array is an abstraction that has a concept of the element count, and not necessarily maps it to contiguous memory. I think of Fortran arrays since it was the first programming language and it sets the expectations of what arrays are.
Here are some examples of (mainstream) languages without a concept of arrays:
Assembly Language
Logo, or turtle graphics.
XSLT
C
In lieu of arrays, some alternatives are pointers with offsets (C, assembly), linked lists, and other tree-like structures. Also, some languages may have the concept of a stack (assembly) which has some of the features of arrays. C++ does have arrays in the form of std::vector<T>, and other STL structures.
This is a community wiki, and so I hope the community can edit/add to this freely.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Member functions can be emulated in C by passing the this pointer explicitly. Virtual functions can be emulated by explicitly storing in every object a pointer to a global array of function pointers. Fine.
Now my question is, do people actually do this? I am wondering if it's worth teaching this technique, because I do not want to teach something to C freshmen that is practically never used in the real world.
(I need to fill the last day of a two-week introductory C course for people already familiar with OOP.)
Are there any relevant projects, libraries or frameworks that emulate OO in C in the manner described?
I've about twenty years experience in C. It was the first compiled language I learned and I've never needed to move on, so it's been C and only C, all the way. I write code constantly at work and at home. I have published a library of lock-free data structures. I think I'm a competent C programmer.
With regard to your question, OO consists of a number of concepts. One, for example, is instantiation, e.g a library with a new() and delete() and instances of a given entity (stack, list, etc). C supports this and it is, of course, a very functional and useful approach. I've used this approach for about fifteen years.
Many years ago I began experimenting with another OO concept, well supported in C++, inheritance. I wanted an entity which contained other entities. The problem then is exposing the API of the contained entites. You can do it, but the fact is, the C language does not naturally express such an concept and approach. It is not something I now use.
My advice is; a knife is a knife, a fork is a fork. You can use either as the other, but it doesn't work well. C does not naturally support some (important) OO concepts, such as inheritance. Don't try to make C do these things. If you want to do this, use C++.
Yes, they do.
Are there any relevant projects, libraries or frameworks that emulate OO in C in the manner described?
I wouldn't call it "emulating" just because there's no first-class language support. See GObject.
A lot of project uses the Object oriented paradigms in C codebase. For various reasons they don't use CPP directly. For system level or performance intensive projects, Other languages don't cut the deal. So its a battle between cpp and c.
Why people emulate OO in C instead of full blown CPP is topic of heated arguments. Linus torvalds once famously stated, CPP compilers are not trustworthy. He has little faith on CPP generated code.
Linux kernel is a good example of implementing OO design patterns in C. You can read about how Linux kernel did it in this lwn.net article series :
part1
part2
There is a extensive free document lying around in internet which covers a full range implementation OO design patterns in C.
ooc.pdf
You can find many other projects along the same road.
Examples:
pjsip
sofia
It may not be used in practice, but it is incredibly valuable to learn the concept of the equivalence between member functions and functions that take the object as the first parameter. Having this concept in the back of their head will help them in many problems they will encounter down the road.
Day in and day out I see people asking questions on Stack Overflow about why it doesn't work to point to pass a member function to something requiring function pointer, and things like that. They think that member functions are just some magical functions that are part of an object, and over-complicate the whole situation. If they had realized that member functions were equivalent to functions that took the object as the first parameter, then the problem they're having (that to call the method they would somehow need both the member function pointer as well as the object), as well as possible solutions (somehow pass the object in separately, or make some kind of closure that captures the object) becomes apparent. Apparently, too many people just pretend that OO is "magic" and don't understand this.
In functional programming, we often teach people how data structures and local variables and all that stuff could be written purely in terms of manipulation of functions. Not that this is practical -- it would probably be inefficient -- but this impresses upon them something about the power of functions. And it helps them to understand things in a different way. And maybe down the road if they write a compiler or something, these equivalences will come in handy.
Computer science is all about equivalences and reductions, and how to think about one problem in terms of another. We reduce SAT-3 to subset sum, not because that's actually how we would actually solve the SAT-3 problem, but because this teaches us that subset sum is NP-complete.
Every once in a while, I come across a piece of code written by someone else, where non-instance methods take a pointer to a structure as an argument, and I see a pattern and a light bulb goes off in my head, and I say, ah-ha, this can be re-factored into an instance method, because I know about this equivalence. So you see, knowing these equivalences also helps us to write better, simpler code.
Check out TI's "DSP Algorithm Standard" / xDAIS framework.
There's a generic C API that every conforming DSP algorithm implementation implements (sorry for the tautology). The need for all this "art" stems from several issues common in the DSP world:
relatively small RAMs
multiple data channels (often parallel/concurrent)
complex algorithm usage patterns
something else I forget
The standard and framework aim at making it easier for DSP engineers to use 3rd party DSP algorithms.
There's an interface to configure an algorithm instance and query its memory requirements (based on the configuration) and there are support functions that actually manage the memory.
Some memory areas, scratchpads, can be allocated temporarily and given to an algorithm instance when it's active and taken away from it when it's inactive and given to another instance, effectively shared.
There's also functionality (and APIs) to move instance memory buffers to defragment memory.
There's more, but I'd need to reread the docs to recall the details.
See IALG_*() and ALG_*() interface methods for example.
Also, there are tools to validate implementations of the generic APIs. 3rd parties can request official validation of them from TI.
Some relevant links: spru352g.pdf, spru360e.pdf.
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
C Analog To STL
Oftentimes, when programming in C, I find myself wishing that I had access to something like the vector or list classes from C++, and I end up implementing a sort of stripped-down version of some high-level data structure, but it takes a fair amount of time. I am wondering if anyone knows of a good library of high quality simple data structures in C?
Well, there's the GLib library, library, which is what is used to implement the GTK+ widget set. GLib uses GObject as a way to do OOP in C. But GObjectified C is famous for being very verbose.
Other than GLib as others have suggested, you could create your own generic types without any type safety, such as a linked list that uses void * as the generic pointer to the next stack object. Some have even used macros to create template-like behaviour in ANSI C with some reasonable success. Search around cplusplus.com's forums for an example or two (there aren't many more than that if I recall correctly).
Alternatively you could implement your own OOP scheme as described in the paper Object-Oriented Programming With ANSI C, though it's something I wouldn't recommend. You might as well take the time you'd spend creating that beast (and possibly not understanding how it works in the end) and learn some of GLib to get familiar with it.
Personally I prefer the first approach. You lose type safety and gain a bunch of casts, but it's the fastest to implement. The real problem with that method is the data that is stored. std::stack<int> converted to C is simply a struct stack that has a data field of type int. What about std::vector<std::stack<int> >, an addressable list of stacks of integers? The C standard cannot guarantee that casting from a pointer type to an integral type will work without some loss of information. In other words, while you can necessarily use pointers, you can't use plain integers to store information. Perhaps a few unrolled template specializations would work -- one for int, (one for long long?), one for double, and one for pointer types.
In the end, the best solution is to use a library like GLib, but if you prefer to roll your own, take care when creating it. You never know when you'll need a combination of types of items in a container.
I like GLib for this kind of thing; the code you end up writing is clean, the libraries are easy to use, stable, mature and cross-platform, and the documentation is excellent.
You may also want to take a look at Rusty Russell's CCAN project. I haven't used any code from it but I will definitely go dip my toe in the next time I have similar needs:
http://ccan.ozlabs.org/
The Idea
That nice snippets of C code should be moved out of junkcode directories and exposed to a wider world, where they can become something useful.
CCAN is loosely modelled after the successful CPAN project for Perl code development and
sharing.
I am new to C programming; coming from an OOP PHP background.
I find C to be (no wonder) a much more difficult language. I had particularly lots of problems figuring out a couple of things on arrays at first: like there is no native associative array.
Now, this part I guess I'm figuring out little by little, but now I have a question regarding a conversation I had just yesterday with a C developer. She was explaining the binary search algorithm to me because I asked her whether there were libraries to do array related stuff in C or not because it seemed like a smarter solution than always re-inventing the wheel.
I would really love to learn more about algorithms in C, in particular what differences are there between algorithms and the design patterns I'm used to using in PHP?
Taking things in order: the extent of C's support for anything like an associative array would be qsort to sort an array of structures based on a key, and bsearch to find one based on a key. There are, of course, quite a few alternatives -- various other libraries have hash tables, balanced trees, etc. Exactly which will suit your purposes is hard to guess though.
Offhand, I don't know of many good books covering algorithms that use C as their primary vehicle for demonstration. A few obvious recommendations for books on algorithms in general (mostly language independent) would be:
The Art of Computer Programming by Donald Knuth. This is pretty much the class algorithms book. It's now (finally) up to four volumes. Knuth originally started on it in 1967, planning to write 7 volumes. Only three volumes were available for a long time. A fourth was added quite recently. At the rate he's going, it's only going to make it to 7 if Knuth lives to be well past 100 years old. Nonetheless, the parts that are there are extremely good -- but (warning!) he analyzes the algorithms in considerable detail; if you don't know at least a little calculus, a fair amount will probably be hard to follow.
Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein. IIRC, there's now a newer edition than I have, which adds yet another author. This is a large book (dropping it on your toes would be quite painful). It uses a fair amount of mathematical notation and such throughout, but if you're willing to work a little at looking up the notation, it's really pretty understandable. It covers quite a bit of important ground (e.g., graph algorithms) that are scheduled for later volumes of Knuth, but not (at least yet) available there.
Algorithms and Data Structures by Aho, Hopcraft and Ullman. This is (by a pretty fair margin) the smallest, lightest, and at least for most people probably the easiest of these to follow.
Though it's only available used anymore, if you can find a copy of Algorithms + Data Structures = Programs by Niklaus Wirth, that's what I'd really suggest. It uses Pascal (no surprise -- Niklaus Wirth invented Pascal), but that's enough like C that it doesn't cause a real problem. It doesn't go into as much depth as Knuth about each algorithm, but still enough to give a good feel for when one is likely to be a good choice versus another. For somebody in your position (some background in programming, but little in this area) it's my top recommendation.
Though I've said it before, I think it bears repeating: IMO, all of Robert Sedgewick's books on algorithms should be avoided. Algorithms in C++ is probably the worst of them, but the others are only marginally better. The code they include (again, especially the C++ version) is truly execrable, and the descriptions of algorithms are often incomplete and/or misleading. The most recent editions have fixed some of the problems, but (IMO) not nearly enough to qualify as something that should ever be recommended. If there was no alternative, you could probably get by with these, but given the number of alternatives that are dramatically superior, the only reason to read these at all is if somebody gives them to you, and you absolutely can't afford anything else.
As far as algorithms versus design patterns goes, the line can get blurry in places, but generally an algorithm is much more tightly defined. An algorithm will normally have a specific, tightly defined input which it processes in a specific way to produce an equally specific result/output. A design pattern tends to be more loosely defined, more generic. An algorithm can be generic as well (e.g., a sorting algorithms might require a type that defines a strict, weak ordering) but still has specific requirements on the type.
A design pattern tends to be somewhat more loosely defined. For example, the visitor pattern involves processing groups of objects -- but we don't want to modify the types of those objects when we decide we need to process them in a new and different way. We do that by defining the processes separately from the objects to be processed, along with how we'll traverse the groups of objects, and allow a process to work with each.
To look at it from a rather different direction, you can usually implement an algorithm with a function or a small group of functions. A design pattern tends to be oriented more toward the style in which you write your code, rather than just "here's a function, use it."
"Algorithms in C, Parts 1-5 (Bundle): Fundamentals, Data Structures, Sorting, Searching, and Graph Algorithms (3rd Edition)"
Cannot stress how good that series is.
If a language has control structures and variables, but no support for arrays, lists, memory access and allocation, etc, can it be Turing-complete?
Maybe if there was no limit to the amount of variables you can create, you can simulate arrays by creating variables like array_1, array_2, ... array_6000 and manually loop through them, and somehow create complex data structures and recursion?
Edit: Even if you cannot access variables by name manipulation (array_10+i is not allowed)?
Certainly. Have a look at Lambda Calculus, which is one of the most minimal Turing Complete languages I've ever seen. Basically, all you have are lambdas (function literals); no assignment, no declaration, no data structures. It's all very very slimmed-down.
You can, however, simulate a linear data structure like a List by chaining functions together. It gets pretty verbose, but it's certainly possible and it's much nicer than having a large series of sequentially named variables.
Generally speaking, whether or not a language is Turing Complete has nothing to do with whether it has Arrays. Functional languages like SML and Haskell lack arrays, just like Lambda Calculus, and these are actually useful languages! Saying a language is "Turing Complete" is merely another way of saying that there is no Turing Computable function which cannot be expressed in said language. This is a surprisingly loose qualification, allowing many languages which would be completely impractical (like Lambda Calculus).
There's plenty of Turing-complete languages that don't even have the notion of a "variable"! Memory access and allocations are implementation details, so they're completely irrelevant. You have to realize that Turing machines and Turing completeness are very theoretical concepts, useful for proving things, but completely divorced from the reality of actual hardware.
Paul Graham has written a long, but very, very interesting essay on the history of computer languages where he describes the two very different main traditions of computer languages:
Lisp, Scheme, etc. - derived from theoretical considerations, very simple, yet conceptually powerful languages, but for the longest time impractical because of their complete disregard for what's easy and efficient to implement
Assembler, FORTRAN, C and pretty much all "mainstream" languages - derived more or less directly from what the hardware could do, easy to implement, efficient, but for the longest time inferior to the (older!) Lisp family in terms of expressiveness.
It sounds like you know only the second tradition, but Turing completeness is a concept that originates from the same principles as the first tradition and makes little sense if you don't know those principles.