Crypt type identification /etc/shadow - c

I know that the password field in /etc/shadow is prefixed with ${number}$ if it is not simply DES encrypted. What I am not able to find is a table that correlates the type of encryption to a given number.
For instance, $1$ would indicate MD5. It's the rest that escapes me (e.g. SHA-1, SHA-256, Twofish, Blowfish, etc.).
I've gone through the source to passwd and chpasswd as well as glibc, but I'm not finding what I'm looking for.
Would someone mind sharing a link to a web page, or even a clue as to where in glibc I might find such a table? I need to programmatically update passwords from within a program without using system() or exec*() calls. I'd like to write original code because I want to keep a uniform 3-clause BSD license and full copyright over my code.
Forgive me if this is a duplicate. I found lots of questions regarding how to parse /etc/shadow, but none that specifically ask how to identify the encryption type of the second field.
Edit:
For reference, here is the announcement from the discussion group that moved forward with implementing SHA (over DES), with support for BSD Blowfish.

The crypt(3) man page describes it in the NOTES section.

Case Sensitive Directory Path in Windows

I have reviewed the questions/answers asking whether or not directory/file names are case-sensitive in a Windows environment, as well as those discussing a need for case-sensitive searching [usually in Python, not C], so I think I understand the essential facts, but none of the postings cover my particular application architecture or the problem I am having resolving it.
So, let me briefly explain the application architecture of which I am speaking. The heart of the application is built using Adobe AIR. Yes, that means that much of the U/I involves the Flex framework, but the file handling problem I am needing help with has no dependency upon the Flex U/I part of the application.
As I am trying to process a very large list of recursive directory structures, I am using the low-level C runtime API via a well-behaved mechanism which AIR provides for such cases where access to the host's native environment is needed.
The suite of functions which I am using is FindFirstFile, FindNextFile and FindClose. If I write a stand-alone test program, it nicely lists the directories, sub-directories and files. The case of the directories and files is correctly shown -- just as it is in Windows Explorer, or using the dir command.
If, however, I launch precisely the same function via the Adobe ANE interface, I receive exactly the same output with the exception that all directory names will be reduced to lower case.
Now, I should clarify that when this code is being executed as a Native Extension, it is not passing data back to AIR, it is directly outputting the results in a file that is opened and closed entirely in the CRT world, so we are not talking about any sort of communication confusion via the passing of either text or byte arrays between two different worlds.
Without kludging up this forum with lots and lots of extraneous code, I think what will help anyone who is able to help me is these snippets:
// This is where the output gets written.
FILE* textFile = _wfopen (L"Peek.txt", L"wt,ccs=UTF-8");
WIN32_FIND_DATAW fdf;
HANDLE hFind = NULL;
wchar_t fullPath[2048];
// I am just showing the third argument as a literal to exemplify
// what, in reality is passed into the recursively-called function as
// a variable.
wsprintf (fullPath, L"\\\\?\\%ls\\*.*", L"F:\\");
hFind = FindFirstFile (fullPath, &fdf);
// After checking for success there appears a do..while loop
// inside which there is the expected check for the "." and ".."
// pseudo directories and a test of fdf.dwFileAttributes for
// file versus sub-directory.
// When the NextFile is a file a function is called to format
// the output in the textFile, like this:
fwprintf (textFile, L"%ls\t%ls\t%2.2x\t%4d/%02d/%02d/%02d/%02d/%02d \t%9ld.\n",
parentPath, fdf.cFileName,
(fdf.dwFileAttributes & 0x0f),
st.wYear, st.wMonth, st.wDay,
st.wHour, st.wMinute, st.wSecond,
fSize);
At that point parentPath will be a concatenated wide character string and the other file attributes will be of the types shown.
So, to summarize: All of this code works perfectly if I just write a stand-alone test. When, however, the code is running as a task called from an Adobe ANE, the names of all the sub-directory parts are reduced to lower case. I have tested every combination of file type attribute -- binary and text and encoding -- UTF-8 and UTF-16LE, but no matter what configuration I choose, the result remains the same: Standalone the API delivers case-correct strings, running as a task in a dll invoked from AIR, the same API delivers only lower-case strings.
First, my thanks to Messrs Ogilvie and Passant for helpful suggestions.
Second, I apologize for not really knowing the protocol here as a very infrequent visitor. If I am supposed to flag either response as helpful and therefore correct, let these words at least reflect that fact.
I am providing an answer which was discovered by taking the advice above.
A. I discovered several tools that helped me get a handle on the contents of the .exe and .dll files. I should add some detail that was not part of the original posting: I have purposely been using the mingw-w64 toolchain rather than Visual Studio for this development work. So, as it turns out, both ldd and dumpbin helped me get a handle on whether or not the two slightly-different build environments were perhaps leaving me with different dependencies.
B. When I saw that one output included a reference to FindFirstFileExW, a function I had once tried in order to solve what I thought was the problem, I thought I had perhaps found a reason for the different results. In the event, that was just a red herring, and I do not mean to waste the forum's time with my low level of experience and understanding, but it seems useful to note this sort of troubleshooting methodology as a possible assist to others.
C. So what was the problem? There was, indeed, a small difference in the code between the stand-alone and the ANE-integrated implementations of the recursive directory search. In the production ANE use case, there is logic to apply a level of filtering to the returned results. The actual application allows the user to qualify a search for duplicate files by interrogating parts of the parent string in addition to the filename string itself.
In one corner condition, the filter may be case-sensitive or case-insensitive, and I was using _wcslwr in the mistaken belief that it behaved in the nice, Unicode-compliant way that string-handling methods do in AIR/ActionScript 3. I did not notice that the function actually does an in-place replacement of the original string with one reduced to lowercase.
User error, not any untoward linking of non-standard CRT kernel functions by Adobe's Native Extension interoperability, was the culprit.

C struct introspection at runtime

Is there a facility for the C language that allows run-time struct introspection?
The context is this:
I've got a daemon that responds to external events, and for each event we carry around an execution context struct (the "context"). The context is big and messy, and contains references to all sorts of state.
Once the event has been handled, I would like to be able to run the context through a filter, and if it matches some set of criteria, drop a log message to help with debugging. However, since I hope to use this for field debugging, I won't know what criteria will be useful to filter on until run time.
My ideal solution would allow the user to, essentially, write a C-style boolean expression and have the program use that. Something like:
activate_filter context.response_time > 4.2 && context.event.event_type == foo_event
Ideas that have been tossed around so far include:
Providing a limited set of fields that we know how to access.
Wrapping all the relevant structs in some sort of macro that generates introspection tools at run time.
Writing a python script that knows where (versioned) headers live, generates C code and compiles it to a dll, which the daemon then loads and uses as a filter. Obviously this approach has some extra security considerations.
Before I start in on some crazy design goose chase, does anyone know of examples of this sort of thing in the wild? I've done some googling but haven't come up with much.
I would also suggest tackling this issue from another angle. The key words in your question are:
The context is big and messy
And that's where the issue is. Once you clean this up, you'll probably be able to come up with a clean logging facility.
Consider redefining all the fields in your context struct in some easy, pliable format, like XML: a simple XML schema that lists all the members of the struct, their types, and maybe some other metadata, even a comment that documents the field.
Then, throw together a quick and dirty stylesheet that reads the XML file and generates a compilable C struct, that your code actually uses. Then, a different stylesheet that cranks out robo-generated code that enumerates each field in the struct, and generates the code to convert each field into a string.
From that, bolting on a logging facility of some kind, with a user-provided filtering string becomes an easier task. You do have to come up with some way of parsing an arbitrary filtering string. Knowledge of lex and yacc would come in handy.
Things of this nature have been done before.
The XCB library is a C client library for the X11 protocol. The protocol defines various kinds of binary messages which are essentially simple structs that the client and the server toss to each other, over a socket. The way that libxcb is implemented, is that all X11 messages and all datatypes inside them are described in an XML definition, and a stylesheet robo-generates C struct definitions, and the code to parse them out, and provide a fairly clean C API to parse and generate X11 messages.
You are probably approaching this problem from the wrong side.
Logging is typically used to facilitate debugging. The program writes all sorts of events to a log file. To extract interesting entries filtering is applied to the log file.
Sometimes a program generates just too many events; logging libraries usually address this issue by offering verbosity control. Basically, a logging function takes an additional parameter telling the verbosity level of the current message. If the value is above the globally configured threshold, the message gets discarded. Some libraries even allow controlling the verbosity level on a per-module basis (e.g. Google's glog).
Another possible approach is to leverage the power of a debugger, since the debugger has access to all sorts of meta information. One can create a conditional breakpoint testing variables in scope for arbitrary conditions. Once the program stops, any information can be extracted from the scope. This can be automated using the scripting facilities provided by a debugger (gdb has great ones).
Finally there are tools generating glue code to use C libraries from scripting languages. One example is SWIG. It analyzes a header file and generates code allowing a scripting language to invoke functions, access structure fields, etc.
Your filter expression will become a program in, say, Lua (other scripting languages are supported as well). You invoke this program, passing in the pointer to the execution context struct (the "context"). Thanks to the accessors generated by SWIG, the Lua program can examine any field in the structure.
I generated introspection metadata with the SWIG CSV parser.
Suppose the code contains a class like the following:
class Bike {
public:
    int color;      // color of the bike
    int gearCount;  // number of configurable gears

    Bike() {
        // bla bla
    }
    ~Bike() {
        // bla bla
    }
    void operate() {
        // bla bla
    }
};
Then it will generate the following CSV metadata,
Bike|color|int|variable|public|
Bike|gearCount|int|variable|public|
Bike|operate|void|function|public|f().
Now it is easy to parse the CSV file with Python or C/C++ if needed.
import csv

with open('bike.csv', newline='') as csvfile:
    bike_metadata = csv.reader(csvfile, delimiter='|')
    # do your thing

Generate documentation for 2 languages with same code

Can I somehow generate documentation for 2 different languages from the same code? The problem is that I have a C API that is also exposed through a proprietary language that resembles VB.
So the exposed function in C is something like:
int Function (*PointerToObject)
While in VB it would be something like:
int Function (ByVal long PointerToObject)
I already opened another thread about this same problem, but at that time I did not know anything about Doxygen. The last couple of days I have been reading the documentation, and apparently it can create documentation for VB, but I have to have actual VB code for it to work, and I don't. The only thing I have is the original C and the swig output, also in C.
What I have in mind is some tool (doxygen, sphinx,...) that would enable me to create some kind of multi-language documentation from a single source (not valid doxygen, but explains the idea):
/*! \fn int Function(*PointerToObject)
* \brief A simple function.
* \Cparam[in] PointerToObject A pointer to the object being used.
* \VBparam[in] ByVal long PointerToObject A pointer to the object being used.
* \return An integer.
*/
It would be great if I could somehow integrate it with swig, since it is swig that identifies the correct VB type, but I guess I may be asking too much.
It is a bit complicated, if I am not clear enough please leave a comment I will try to explain it further.
I'm not positive how useful this will be, as I haven't done precisely what you're looking for (and it is a bit of a kludge), but under similar circumstances I came to the conclusion that our best bet was to generate a fluff object just for doxygen to document.
In our case we have an LDMud which has a few hundred driver-published external functions which don't exist in the LPC language the rest of the MUD is written in. We could parse it in its native C code as a separate documentation run and include the docs with our primary documentation. In our case we have fairly thorough plain-text documentation including what we should consider the definition of these functions to be, so we use this documentation instead to generate an object with dummy function definitions and parse the plaintext documentation into function-head comments in doxygen style. This obviously doesn't provide us the advantages of including code or in-code comments with the documentation, but it does allow us to generate the documentation in multiple forms, push it around in guides, make references to these functions from other documentation, etc.
I'm not directly familiar with C so I'm not sure if there's any introspective support for performing a programmatic inventory of the functions you need and generating dummy declarations from that information or not. If you don't and have a small number of functions I suspect it would be the best investment of your time to simply write the dummy declarations by hand. If you have a really large number of these functions it may be worth your time to use doxygen to parse the original code into XML documentation, and generate the dummy object(s) from the XML. Doxygen's XML can be a pain to deal with, but if all you really need is the declaration and arguments it is (relatively) simple.
As a final aside, if your documentation can really be thought of as internal and external, you might also want to use two or more configuration files to generate different sets of documentation saved to different locations and publish them separately to your internal/external audiences. It will make the task of providing the right amount of detail to each less frustrating. Along those lines you may also find the INTERNAL_DOCS option and the #internal / #endinternal commands useful.
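On that note, if you can live with running doxygen twice, its conditional sections may get you close to the \Cparam/\VBparam idea from the question: \if blocks gated by the ENABLED_SECTIONS configuration option. A sketch (the CDOCS/VBDOCS labels are invented):

```c
/*! \fn int Function(void *PointerToObject)
 *  \brief A simple function.
 *  \if CDOCS
 *    \param[in] PointerToObject A pointer to the object being used.
 *  \endif
 *  \if VBDOCS
 *    \param[in] PointerToObject ByVal long; a pointer to the object being used.
 *  \endif
 *  \return An integer.
 */
```

Then one Doxyfile sets ENABLED_SECTIONS = CDOCS and another sets ENABLED_SECTIONS = VBDOCS, and the two outputs are published separately.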

Access Client Write Key & Server Write Key from OpenSSL C API

I need to access the encryption (cipher, write) keys that are derived from the master key generated by the OpenSSL C API. I know I can access the master key using the SSL struct as follows:
ssl->session->master_key
Unfortunately looking at the OpenSSL code has not gotten me very far as the API is not very well documented and looking at GDB has been a pain as well. Are they exposed anywhere?
Thanks
I am working on the same thing as well.
Well, for SSLv3 connections, you can access the key block by:
ssl->s3->tmp.key_block
However, I managed to get this at the end of the ssl3_setup_key_block function, which is defined in the s3_enc.c file. I mean I changed the source code and recompiled OpenSSL, so I don't know yet whether it changes or is lost after returning from that function.
This key block is a byte string, and as defined in http://www.ietf.org/rfc/rfc2246.txt its structure is:
{
    clientWriteMACSecret,
    serverWriteMACSecret,
    clientWriteKey,
    serverWriteKey,
    clientWriteIV,
    serverWriteIV
}
The length of each section varies depending on the used cipher suite.
For SSLv2 connections I suppose you can again access these areas via ssl->s2->something. s2 is a pointer to an ssl2_state_st structure, which is defined in ssl2.h; you can look it up there.
I don't know how it is for TLSv1 connections. But there is also a d1 variable in the SSL structure.
As a personal note, the OpenSSL source code is really a bunch of trash: no good code comments, no good documentation. Wandering around the source code to find the simplest thing is a pain.

How to walk a directory in C

I am using glib in my application, and I see there are convenience wrappers in glib for C's remove, unlink and rmdir. But these only work on a single file or directory at a time.
As far as I can see, neither the C standard nor glib include any sort of recursive directory walk functionality. Nor do I see any specific way to delete an entire directory tree at once, as with rm -rf.
For what I'm doing I'm not worried about any complications like permissions, symlinks back up the tree (infinite recursion), or anything that would rule out a very naive implementation... so I am not averse to writing my own function for it.
However, I'm curious whether this functionality is out there somewhere in the standard libraries, GTK, or glib (or in some other easily reused C library) and I just haven't stumbled on it. Googling this topic generates a lot of false leads.
Otherwise my plan is to use this type of algorithm:
dir_walk(char* path, void (*callback)(char*)) {
    if (is_dir(path) && has_entries(path)) {
        entries = get_entries(path);
        for (entry in entries) { dir_walk(entry, callback); }
    }
    else { callback(path); }
}

dir_walk("/home/user/trash", remove);
Obviously I would build in some error handling and the like to abort the process as soon as a fatal error is encountered.
Have you looked at <dirent.h>? AFAIK this belongs to the POSIX specification, which should be part of the standard library of most, if not all C compilers. See e.g. this <dirent.h> reference (Single UNIX specification Version 2 by the Open Group).
P.S., before someone comments on this: no, this does not offer recursive directory traversal. But then I think this is best implemented by the developer; requirements can differ quite a lot, so a one-size-fits-all recursive traversal function would have to be very powerful. (E.g.: are symlinks followed? Should recursion depth be limited? etc.)
You can use GFileEnumerator if you want to do it with glib.
Several platforms include ftw and nftw: "(new) file tree walk". Checking the man page on an iMac shows that these are legacy, and new users should prefer fts. Portability may be an issue with either of these choices.
Standard C libraries are meant to provide primitive functionality. What you are talking about is composite behavior. You can easily implement it using the low level features present in your API of choice -- take a look at this tutorial.
Note that the "convenience wrappers" you mention for remove(), unlink() and rmdir(), assuming you mean the ones declared in <glib/gstdio.h>, are not really "convenience wrappers". What is the convenience in prefixing totally standard functions with a "g_"? (And note that I say this even though I introduced them in the first place.)
The only reason these wrappers exist is for file name issues on Windows, where these wrappers actually consist of real code; they take file name arguments in Unicode, encoded in UTF-8. The corresponding "unwrapped" Microsoft C library functions take file names in system codepage.
If you aren't specifically writing code intended to be portable to Windows, there is no reason to use the g_remove() etc wrappers.
