Portable org-mode export?

Is it possible to create org-mode files that behave independently of local Emacs customization on export and, if possible, on display?
Org-mode has many features whose behavior depends on user customization. For instance, depending on org-use-sub-superscripts and org-export-with-sub-superscripts, the markup a^{b}_c may be displayed (and exported) as any of:
a with superscript b and subscript c
a with superscript b, followed by a literal _c
the literal text a^{b}_c
The export behavior can be normalized by requiring that the export be performed in a clean Emacs process (emacs -q), but that would be a bit of a hack, and it wouldn't help with the interactive behavior and display of the buffer.
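For reference, that clean-process hack is a one-liner in batch mode; a sketch, assuming the file is notes.org and HTML output is wanted (-Q skips both user and site init files):

    emacs -Q --batch notes.org -f org-html-export-to-html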
Is there some way to make org-mode ignore all user settings, except for those explicitly specified in the file?
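Within the file itself, Org's export options and file-local variables get part of the way there; a sketch (option syntax per the Org manual; this is not a complete solution):

    #+OPTIONS: ^:nil
    ...
    # Local Variables:
    # org-use-sub-superscripts: nil
    # End:

Here ^:nil pins the sub/superscript export behavior for this file only (^:{} would require braces instead), and the file-local variable governs display in the buffer, though Emacs may prompt before applying it.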

Related

Split C file by its functions

How can I automatically split a single C file with various functions in it into several files with only a single function each? Does anyone have a script or, say, a Notepad++ plugin that could do it? Thank you
It may not even be possible. If a file-scope static variable exists in one of the files, it is shared by all the functions of that file but is not accessible (even with the extern keyword) from functions in other files. And even without that, handling includes and global variables would be a nightmare.
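A minimal sketch of the problem (names invented for illustration):

    /* single.c -- both functions share `counter`, which has internal
       linkage; moving them into separate files would break this. */
    #include <stdio.h>

    static int counter;         /* file-scope static: invisible to other files */

    void bump(void) { counter++; }
    void show(void) { printf("%d\n", counter); }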
Anyway, on Unix/Linux, the good old ctags command should be close to your requirements: it does not split the files, but it creates an index file (called a tags file) containing the file and position of all functions from the specified C, Pascal, Fortran, yacc, lex, and Lisp sources. The man page says:
Using the tags file, ex [or vi, vim, etc.] can quickly locate these object definitions.
Depending upon the options provided to ctags, objects will consist of subroutines, typedefs, defines, structs, enums and unions.
You can either use it as-is (in the Unix world) or mimic it, on Windows for example.
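A typical invocation looks like this (classic Unix usage; exact options vary between ctags implementations):

    ctags *.c            # write an index of definitions into ./tags
    vi -t my_function    # open the editor positioned at my_function's definition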
For reasons explained in Serge Ballesta's answer, splitting a single C file into smaller pieces is not automatable in general.
And having several small files instead of a larger one is generally a bad idea: the code becomes less readable, and its execution could be slower (because the compiler has fewer inlining and optimization opportunities).
In some cases, you might want to split a big C file (e.g. more than ten thousand lines of source code) into a few smaller ones (e.g. at least a thousand lines of code each). This may require some work, like renaming static functions or variables to longer (and globally unique) names declared extern, moving some short functions (or adding some macros) into header files and declaring them static inline, etc. This cannot really be automated in the general case.
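As a sketch of the kind of manual edits involved (all names invented for illustration):

    /* big_util.h -- carved out of a hypothetical big.c */
    #ifndef BIG_UTIL_H
    #define BIG_UTIL_H

    void big_reset_state(void);        /* was: static void reset_state(void) */

    static inline int big_clamp(int v) /* short helper moved into the header */
    {
        return v < 0 ? 0 : v;
    }

    #endif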
My recommendation is often to merge a few small (but related) files into one single bigger one. As a rule of thumb, I would suggest having files of more than a thousand lines each, but YMMV.
In particular, there is no reason to have only one function definition in each of your source files. That practically forbids inlining (unless you compile with link-time optimization, a very expensive approach).
Look into existing free software projects (e.g. on github) for inspiration. Or look into the Linux kernel source code.
Splitting a C file into smaller ones (or conversely, merging several source files in a single bigger one) generally requires some code refactoring. In many cases, it is quite simple (perhaps even as trivial as copy & pasting some functions one by one); in some cases, it could be difficult. You should do it manually and incrementally (and enable all warnings in your compiler, to help you find mistakes in your refactoring; don't forget to recompile often!). You may want to improve your naming conventions in your code while you split it.
Of course you need a version control system (I recommend git), and you'll compile and commit your code several times while splitting it. You also need a good source code editor (I recommend GNU emacs, but it is a matter of taste; some people prefer vim, etc.).
You certainly don't want to automate C file splitting (you might write some scripts to help you, but it is generally not worth the trouble). You need to control that split.

Why is it mandatory to specify the module name at start of source file?

GHC insists that the module name equal the file name. But if they are the same, why does a Haskell compiler need both? It seems redundant to me. Is this just a language design mistake?
Besides the inconvenience, it also raises a problem: if I want to use two libraries that accidentally have the same top-level module name, I cannot disambiguate simply by renaming one library's folder. What is the idiomatic solution to this problem?
The Haskell language specification doesn't talk about files. It only talks about modules and their syntax. So there's clearly no language design mistake.
The GHC compiler (and many others) chose to follow a pattern of one module per file, and searching for modules in files with matching names. Seems like a decent strategy to me. Otherwise you'd need to provide the compiler with some mapping from module name to file name or an explicit list of every file in use.
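For example, under GHC's convention the module name encodes the path from a source root (the layout here is illustrative):

    -- File: src/Data/Queue.hs, with src/ given to GHC via -isrc.
    -- GHC resolves `import Data.Queue` by turning the dots into a path,
    -- which only works because this header matches the file's location.
    module Data.Queue (Queue, empty) where

    newtype Queue a = Queue [a]

    empty :: Queue a
    empty = Queue []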
I would say that one of the big reasons is that you don't always want the module name to be the file's path appended with its name. This is the same as in Java, C#, and many other languages that prefer an explicit namespace declaration in the source code: explicit is better than implicit in many cases. It gives the programmer control over their filenames without the module name being tied to the filename alone.
Imagine that I was a Japanese Haskell programmer, and my OS used Japanese characters for file names. I can write my source code using Japanese characters where possible, but I also want to export an API that uses ASCII characters. If module name and filename had to be identical, this would be impossible and would make it very difficult for people in other countries to use my library.
And as @chi has pointed out, if you have two packages with conflicting module names (a very rare occurrence, in my experience), you can always use package-qualified imports.
The Haskell language specification requires that every module start with a module header, and it does not mention files - it leaves total freedom to the implementing compilers regarding files. So the Haskell language lacks the ability to express where the files containing modules are. Because of this, some compilers (including the most important one, GHC) use a simple solution: the name of the module must match the path from an include directory to the file. This introduces the redundancy.
To avoid the redundancy, compilers could drop the language specification's requirement that each module start with a header. However, they chose not to, simply for the sake of conforming to the specification. Perhaps a GHC language extension could do this, but currently there is no such extension.
So the problem is a language design mistake, and lives on as legacy.
To combat possible name collisions among independent libraries, the GHC extension PackageImports (package-qualified imports) seems the best way.
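A minimal sketch of package-qualified imports (the package names and the exported `answer` are made up for illustration):

    {-# LANGUAGE PackageImports #-}
    module Main where

    -- Two dependencies both export a module named Data.Thing; the quoted
    -- package name picks which one each import refers to.
    import qualified "library-one" Data.Thing as One
    import qualified "library-two" Data.Thing as Two

    main :: IO ()
    main = print (One.answer, Two.answer)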

Code Style When Interfacing With Libraries (C)

Recently I ran into an interesting problem in terms of coding style. Consistency is a key attribute of good style, yet I inherited some code with some interesting style patterns.
The code in question basically ties two libraries together. The program itself isn't too large. Code for the utility functions that wrap the first library is found in a .h and .c file totaling a whopping 100 lines (for both files). Code that interfaces with the second library is found in a .c file that also contains main (a total of 300 lines for this single .c file). All functions written against the second library are made static, since they are highly specific to this particular implementation.
The problem I have is that each file has its own style. The programmer in question has a style of his own, but the first set of files follows the style of the first library, and the second file uses the style of the second library. As a result, code in each file is locally consistent in terms of style, but the program itself spans several code styles.
Style differences include using GLib types and functions like g_printf() in the second file and using C types and functions in the first set of files. In the second file (the one that interfaces with a library that uses GLib), there are portions that absolutely require use of GLib and others that do not. However, in order to maintain local consistency, the programmer used GLib throughout the file.
As a result, I'm wondering what the best practice is in this case in terms of code style. Here are the possible options as I see them.
Use the style from the first library in the first set of files and the style from the second library in the second set of files. This allows code from the library to match in terms of style and allows each set of files to be 100% consistent in terms of style locally, but not for the project as a whole.
Write the code in your own style, ignoring the style of the two libraries. GLib calls will be limited to where they are absolutely necessary; standard C libraries will be used elsewhere. Locally, the code will not match the style of each file's associated library calls, but from file to file the code should appear somewhat consistent.
Pick one library's style to go with. While this should make the project's code consistent, the code from this project will be inconsistent with the programmer's other projects. Also, the source file that has to follow the other library's style may look a bit off.
Looking forward to hearing any thoughts on this. I think this is the first time I've encountered a project's code shifting from one style to another with the same programmer. Thanks for your input and feedback.
In my opinion, use the coding style of your company (if you are coding for a company). It will help many others in the future.
If you have two different styles, maintain the coding style already present.
Change the style of coding in the library file only if necessary.
I see two different problems:
Coding style
Dependency on glib
As far as the coding style is concerned, it should be uniform across the project. Choose the coding style and rules that you prefer (from the first file, from the second, or one of your own choosing), but make it consistent across the whole project. That will require some effort, of course, but it will pay off in the future.
The dependency on glib should be isolated as much as possible, in order to let you switch libraries in the future. In C++ you would usually create a dependency on an abstract class, which is then inherited by a concrete class ("program to an interface, i.e. an abstract class, not to an implementation, i.e. a concrete class"). Since you are in C and don't have classes, try to mimic this behavior by decoupling the code.
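A minimal sketch of that decoupling in C (all names invented; only one file ends up including GLib headers):

    /* printer.h -- callers depend on this interface, not on GLib */
    typedef struct {
        void (*print)(const char *msg);
    } printer;

    printer make_glib_printer(void);

    /* printer_glib.c -- the sole file with a GLib dependency */
    #include <glib/gprintf.h>
    #include "printer.h"

    static void glib_print(const char *msg) { g_printf("%s", msg); }

    printer make_glib_printer(void) { return (printer){ .print = glib_print }; }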

Extract just the required functions from a C code project?

How can I extract just the required functions from a pile of C source files? Is there a tool which can be used on GNU/Linux?
Preferably FOSS, but the GNU/Linux is a hard requirement.
Basically I've got about 10 .h files; I'd like to grab part of the code and pull the required variables from the header files. Then I can make a single small .h file corresponding to the code I'm using in another project.
My terms might not be 100% correct.
One tool that you may or may not be aware of is cscope. It can be used to help you.
For a given set of files (more on what that means shortly), it gives you these options:
Find this C symbol:
Find this global definition:
Find functions called by this function:
Find functions calling this function:
Find this text string:
Change this text string:
Find this egrep pattern:
Find this file:
Find files #including this file:
Thus, if you know you want to use a function humungous_frogmondifier(), you can find where it is declared or defined by typing its name (or pasting its name) after 'Find this global definition'. If you then want to know what functions it calls, you use the next line. Once you've hit return after specifying the name, you will be given a list of the relevant lines in the source files above this menu on the screen. You can page through the list (if there are more entries than will fit on the screen), and at any time select one of the shown entries by number or letter, in which case cscope launches your editor on the file.
How about that list of files? If you run cscope in a directory without any setup, it will scan the source files in the directory and build its cross-reference. However, if you prefer, you can set up a list of file names in cscope.files and it will analyze those files instead. You can also include -I /path/to/directory on the cscope command line and it will find referenced headers in those directories too.
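A typical setup looks like this, as a sketch (paths illustrative; see cscope(1) for the full option list):

    find . -name '*.[ch]' > cscope.files   # explicit list of sources to index
    cscope -b -q -I /path/to/headers       # -b: just build the cross-reference
    cscope -d                              # browse without rebuilding the index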
I'm using cscope 15.7a on some sizeable projects - depending on which version of the project, between about 21,000 and 25,000 files (and some smaller ones with only 10-15 thousand files). It takes about half an hour to set up this project (so I carefully rebuild the indexes once per night, and use the files for the day, accepting that they are a little less accurate at the end of the day). It allows me to track down unused stuff, and find out where stuff is used, and so on.
If you're used to an IDE, it will be primitive. If you're used to curses-mode programs (vim, etc), then it is tolerably friendly.
You suggest (in comments to the main question) that you will be doing this more than once, possibly on different (non-library) code bases. I'm not sure I see the big value in this; I've been coding C on and off for 30+ years and don't feel the need to do this very often.
But given the assumption that you will, what you really want is a tool that can, for a given identifier in a system of C files and headers, find the definition of that identifier in those files and compute the transitive closure of all its dependencies. This defines a partial order over the definitions, based on the depends-on relationship. Finally, you want to emit the code for those definitions to an output file, in a linear order that honors that partial order. (You can simplify this a bit by insisting that the identifier you want is in a particular C compilation unit, but the rest stays the same.)
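In outline, that last step is a topological sort over the depends-on relation; a minimal illustration in C (the dependency table is hard-coded and purely hypothetical, where a real tool would derive it from symbol-table analysis):

    #include <stdio.h>

    #define N 4
    /* deps[i][j] != 0 means "definition i depends on definition j" */
    static const int deps[N][N] = {
        {0, 1, 1, 0},   /* my_func depends on helper and a struct */
        {0, 0, 0, 1},   /* helper depends on a constant           */
        {0, 0, 0, 0},
        {0, 0, 0, 0},
    };
    static const char *names[N] = { "my_func", "helper", "a struct", "a constant" };
    static int emitted[N];

    /* Emit everything i depends on before i itself: a depth-first walk
       whose post-order is a linearization of the partial order. */
    static void emit(int i)
    {
        if (emitted[i]) return;
        emitted[i] = 1;
        for (int j = 0; j < N; j++)
            if (deps[i][j]) emit(j);
        printf("emit %s\n", names[i]);
    }

    int main(void)
    {
        emit(0);   /* extract my_func plus its transitive closure */
        return 0;
    }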
Our DMS Software Reengineering Toolkit with its C Front End can be used to do this. DMS is a general purpose program transformation system, capable of parsing source files into ASTs and performing full name resolution (e.g., building symbol tables); it can also do flow analysis, but that isn't needed for your task. Given those ASTs and the symbol tables, it can be configured to compute this transitive dependency using the symbol table information, which records where symbols are defined in the ASTs. Finally, it can be configured to assemble the ASTs of interest into a linear order honoring the partial order.
We have done all this with DMS in the past, where the problem was to generate SOA-like interfaces based on other criteria; after generating the SOA code, the tool picked out all the dependencies for the SOA code and did exactly what was required. The dependency extraction machinery is part of the C front end.
A complication for the C world is that the preprocessor may get in the way; for the particular task we accomplished, the extraction was done over a specific configuration of the application and so the preprocessor directives were all expanded away. If you want this done and retain the C preprocessor directives, you'll need something beyond what DMS can do today. (We do have experimental work that captures macros and preprocessor conditionals in the AST but that's not ready for release to production).
You'd think this problem would be harder with C++, but it is not, because the preprocessor is used far more lightly in C++ programs. While we have not done extraction for C++, it would follow exactly the same approach as for C.
So that's the good part with respect to your question.
The not so good part from your point of view, perhaps, is that DMS isn't FOSS; it is a commercial tool designed to be used by my company and our customers to build custom analysis and transformation tools for all those tasks you can't get off the shelf, that make economic sense. Nor does DMS run natively on Linux, rather it is a Windows based tool. It can reach across the network using NFS to access files on other systems including Linux. DMS does run under Wine on Linux.

Are there any naming conventions when creating your own file suffix?

I'm working on a little game and figured I'd pack images and other resources into my own files to keep the directories neat. Then I thought: is there any convention for what I should call my file, or can I simply call it whatever without anyone ever caring? There are probably not a lot of strong opinions about this rather innocent subject, but I thought I'd better get some kind of response on it.
The most obvious would be to not use reserved characters.
< > : " / \ | ? *
Those are the ones for windows. Anyone care to add what characters are reserved on other systems?
There are some standard suffixes that I'm guessing shouldn't be used, unless the file actually conforms to the suffix's standard.
.bat .exe .dll .lib .cmd
And then there are all the image file types and whatnot, but those are about as obvious. What else?
What is your opinion? Should I name my suffix as uniquely as possible, say .maf (My Awesome File) or whatever... or should I be more informative and stick to a known suffix that might reveal what my file is actually doing there? Or perhaps a bland old .dat or .bin?
If you want to create something that you want to associate with your program, you do, of course, want it to be as unique as possible. When you come up with an extension, check with FILExt to see if it's conflicting with anything major.
If you just want to convey "this is a binary file, don't try to open it in notepad or tamper with it", I'd go with something like .bin, yes.
Unix platforms don't have filename restrictions (other than the NUL byte and the forward slash), so don't worry about any characters beyond what Windows doesn't like.
You can worry about using an extension that hasn't been used before, but unless you want a really long one, I'd say don't bother; you can always go with something generic like .dat or .bin. You don't actually need to use an extension at all, which (IMO) is just as good, unless you will be distributing some of these files separately from the program (for example, user-made maps; then you will want an extension, since users will be distributing the files).
Another minor point you might want to consider is that MS-DOS extensions can be at most three characters after the dot. Being DOS compatible isn't a huge issue (not an issue at all, really), but that's why you'll see that a lot of extensions are three characters.
Use what makes sense to you. I would avoid well-known extensions, as you have proposed, to avoid the files being accidentally opened by another application.
Most applications/games will use an extension that is related to the application's name or use (.doc, .psd, etc.).
Unless users are going to double-click on the files from Explorer, having a nice, informative, unique extension is not important, so you might want to go with .bin or .dat. However, good mechanisms exist for packing files together (.zip or .7z), so you might want to go with a standard packer and a standard extension.