Does ext3 journal indirect/double indirect block in ordered mode? - filesystems

In the ext3 file system, when it is mounted in ordered mode, would indirect, double indirect, and triple indirect blocks be considered metadata blocks and get journaled?

Use the source, Luke - from Linux fs/ext3/inode.c (the line number in the link is correct as of this writing, but the URL points to Linus' head tree, so this may eventually change):
/*
 * Note that we always start a transaction even if we're not journalling
 * data. This is to preserve ordering: any hole instantiation within
 * __block_write_full_page -> ext3_get_block() should be journalled
 * along with the data so we don't crash and then get metadata which
 * refers to old data.
[ ... ]
static int ext3_ordered_writepage(struct page *page,
                                  struct writeback_control *wbc)
{
    struct inode *inode = page->mapping->host;
    struct buffer_head *page_bufs;
    handle_t *handle = NULL;
    int ret = 0;
    int err;
[ ... ]
    handle = ext3_journal_start(inode, ext3_writepage_trans_blocks(inode));
[ ... ]
    ret = block_write_full_page(page, ext3_get_block, wbc);
[ ... ]
    /*
     * And attach them to the current transaction. But only if
     * block_write_full_page() succeeded. Otherwise they are unmapped,
     * and generally junk.
     */
    if (ret == 0) {
        err = walk_page_buffers(handle, page_bufs, 0, PAGE_CACHE_SIZE,
                                NULL, journal_dirty_data_fn);
        if (!ret)
            ret = err;
    }
    walk_page_buffers(handle, page_bufs, 0,
                      PAGE_CACHE_SIZE, NULL, bput_one);
    err = ext3_journal_stop(handle);
    if (!ret)
        ret = err;
[ ... ]
    return ret;
}
[ ... ]
/*
 * ext3_truncate()
 *
 * We block out ext3_get_block() block instantiations across the entire
 * transaction, and VFS/VM ensures that ext3_truncate() cannot run
 * simultaneously on behalf of the same inode.
 *
 * As we work through the truncate and commit bits of it to the journal there
 * is one core, guiding principle: the file's tree must always be consistent on
 * disk. We must be able to restart the truncate after a crash.
 *
[ ... ]
 */
void ext3_truncate(struct inode *inode)
{
    handle_t *handle;
[ ... ]
    handle = start_transaction(inode);
[ ... ]
    /*
     * OK. This truncate is going to happen. We add the inode to the
     * orphan list, so that if this truncate spans multiple transactions,
     * and we crash, we will resume the truncate when the filesystem
     * recovers. It also marks the inode dirty, to catch the new size.
     *
     * Implication: the file must always be in a sane, consistent
     * truncatable state while each transaction commits.
     */
    if (ext3_orphan_add(handle, inode))
        goto out_stop;
[ ... ]
    if (n == 1) { /* direct blocks */
        ext3_free_data(handle, inode, NULL, i_data+offsets[0],
                       i_data + EXT3_NDIR_BLOCKS);
        goto do_indirects;
    }
[ ... ]
do_indirects:
    /* Kill the remaining (whole) subtrees */
    switch (offsets[0]) {
    default:
        nr = i_data[EXT3_IND_BLOCK];
        if (nr) {
            ext3_free_branches(handle, inode, NULL, &nr, &nr+1, 1);
            i_data[EXT3_IND_BLOCK] = 0;
        }
    case EXT3_IND_BLOCK:
        nr = i_data[EXT3_DIND_BLOCK];
        if (nr) {
            ext3_free_branches(handle, inode, NULL, &nr, &nr+1, 2);
            i_data[EXT3_DIND_BLOCK] = 0;
        }
    case EXT3_DIND_BLOCK:
        nr = i_data[EXT3_TIND_BLOCK];
        if (nr) {
            ext3_free_branches(handle, inode, NULL, &nr, &nr+1, 3);
            i_data[EXT3_TIND_BLOCK] = 0;
        }
    case EXT3_TIND_BLOCK:
        ;
    }
[ ... ]
    ext3_journal_stop(handle);
[ ... ]
}
This clearly shows that file extensions and truncations (i.e. the allocation and/or release of direct/indirect blocks of any level) are always transacted. There's also code in ext3_get_blocks() itself which will open a transaction even if called without a previously created transaction - that path is used for direct I/O writes.
So in short - yes, as long as the ext3 filesystem has a journal, modifications of direct/indirect blocks of any level are always transacted, whether or not file data itself is journalled.


How to write log files correctly?

How do I reliably write logs so that entries are atomic in a multi-threaded application? In addition, I would like to be able to rotate logs using the logrotate utility.
The simplest way to write logs is:
open/reopen the log file
write entries with printf()
close the log file at exit
Here is my example:
// default log level
static Cl_loglevl loglevel = LOGLEVEL_NONE;
// log file descriptor (open with Cl_openlog)
static FILE *logfd = NULL;
/**
 * @brief Cl_openlog - open log file
 * @param logfile - file name
 * @return FILE struct or NULL if failed
 */
FILE *Cl_openlog(const char *logfile, Cl_loglevl loglvl){
    if(logfd){
        Cl_putlog(LOGLEVEL_ERROR, "Reopen log file\n");
        fclose(logfd);
        logfd = NULL;
        char newname[PATH_MAX];
        snprintf(newname, PATH_MAX, "%s.old", logfile);
        if(rename(logfile, newname)) WARN("Can't rename old log file");
    }
    if(loglvl < LOGLEVEL_CNT) loglevel = loglvl;
    if(!logfile) return NULL;
    if(!(logfd = fopen(logfile, "w"))) WARN(_("Can't open log file"));
    return logfd;
}
/**
 * @brief Cl_putlog - put message to log file
 * @param lvl - message loglevel (if lvl > loglevel, message won't be printed)
 * @param fmt - format and the rest of the message
 * @return number of characters written to the file
 */
int Cl_putlog(Cl_loglevl lvl, const char *fmt, ...){
    if(lvl > loglevel || !logfd) return 0;
    char strtm[128];
    time_t t = time(NULL);
    struct tm *curtm = localtime(&t);
    strftime(strtm, 128, "%Y/%m/%d-%H:%M", curtm);
    int i = fprintf(logfd, "%s\t", strtm);
    va_list ar;
    va_start(ar, fmt);
    i += vfprintf(logfd, fmt, ar);
    va_end(ar);
    fflush(logfd);
    return i;
}
/**
* #brief Cl_putlog - put message to log file
* #param lvl - message loglevel (if lvl > loglevel, message won't be printed)
* #param fmt - format and the rest part of message
* #return amount of symbols saved in file
*/
int Cl_putlog(Cl_loglevl lvl, const char *fmt, ...){
if(lvl > loglevel || !logfd) return 0;
char strtm[128];
time_t t = time(NULL);
struct tm *curtm = localtime(&t);
strftime(strtm, 128, "%Y/%m/%d-%H:%M", curtm);
int i = fprintf(logfd, "%s\t", strtm);
va_list ar;
va_start(ar, fmt);
i += vfprintf(logfd, fmt, ar);
va_end(ar);
fflush(logfd);
return i;
}
Calling Cl_openlog allows rotating the log once. I can call this function in a SIGUSR1 handler and have logrotate send that signal. But it still remains unclear how to write to the file correctly in order to achieve atomicity of records.
I don't want to use external libraries like log4c for such a simple problem.
Either you ensure that all logging is done through a single thread, or you protect the logging function with a mutex.
For the first case, you can have a worker thread that polls on a pipe's read end; each thread then uses the pipe's write end to wake the worker thread and let it handle the message (allocated somewhere on the heap and passed by address through the pipe).
The same can be done from your SIGUSR1 handler (for logrotate).
Please note that writing less than PIPE_BUF bytes to a pipe() is guaranteed not to be interleaved (thus it is atomic).
Consequently, writing just a heap address through the pipe will always be atomic.
Last but not least, you may use a different logfile for each thread.
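Here is a minimal sketch of the pipe-based approach (my illustration, not code from the question; the names log_msg and log_worker are made up). Each producer heap-allocates a message and writes only its address to the pipe; since sizeof(char *) is far below PIPE_BUF, that write is atomic, and the single worker thread is the only writer to the log file:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int logpipe[2]; /* [0] read end (worker), [1] write end (producers) */

/* producer side: heap-allocate the message, send only its address */
static void log_msg(const char *text){
    char *copy = strdup(text);
    /* sizeof copy < PIPE_BUF, so this write is atomic */
    if(copy && write(logpipe[1], &copy, sizeof copy) != (ssize_t)sizeof copy)
        free(copy);
}

/* the worker thread is the only one that touches the log file */
static void *log_worker(void *arg){
    FILE *logfd = arg;
    char *msg;
    while(read(logpipe[0], &msg, sizeof msg) == (ssize_t)sizeof msg){
        fprintf(logfd, "%s\n", msg);
        fflush(logfd);
        free(msg);
    }
    return NULL;
}

int main(void){
    FILE *logfd = fopen("test.log", "a");
    if(!logfd || pipe(logpipe)) return 1;
    pthread_t tid;
    pthread_create(&tid, NULL, log_worker, logfd);
    log_msg("hello from the main thread");
    close(logpipe[1]); /* no more writers: the worker's read() returns 0 */
    pthread_join(tid, NULL);
    fclose(logfd);
    return 0;
}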
Well, at last I did this:
// array of all opened logs - for error/warning messages
static Cl_log *errlogs = NULL;
static int errlogsnum = 0;
/**
 * @brief Cl_createlog - create log file: init mutex, test file open ability
 * @param log - log structure
 * @return 0 if all OK
 */
int Cl_createlog(Cl_log *log){
    if(!log || !log->logpath || log->loglevel <= LOGLEVEL_NONE || log->loglevel >= LOGLEVEL_CNT) return 1;
    FILE *logfd = fopen(log->logpath, "a");
    if(!logfd){
        WARN("Can't open log file");
        return 2;
    }
    fclose(logfd);
    pthread_mutex_init(&log->mutex, NULL);
    // use a temporary pointer so the old array isn't leaked if realloc fails
    Cl_log *newlogs = realloc(errlogs, (errlogsnum + 1) * sizeof(Cl_log));
    if(!newlogs) return 3;
    errlogs = newlogs;
    memcpy(&errlogs[errlogsnum++], log, sizeof(Cl_log));
    return 0;
}
/**
 * @brief Cl_putlogt - put message to log file with/without timestamp
 * @param timest - ==1 to put timestamp
 * @param log - pointer to log structure
 * @param lvl - message loglevel (if lvl > loglevel, message won't be printed)
 * @param fmt - format and the rest of the message
 * @return number of characters written to the file
 */
int Cl_putlogt(int timest, Cl_log *log, Cl_loglevl lvl, const char *fmt, ...){
    if(!log || !log->logpath || log->loglevel < 0 || log->loglevel >= LOGLEVEL_CNT) return 0;
    if(lvl > log->loglevel) return 0;
    if(pthread_mutex_lock(&log->mutex)) return 0;
    int i = 0;
    FILE *logfd = fopen(log->logpath, "a");
    if(!logfd) goto rtn;
    if(timest){
        char strtm[128];
        time_t t = time(NULL);
        struct tm *curtm = localtime(&t);
        strftime(strtm, 128, "%Y/%m/%d-%H:%M", curtm);
        i = fprintf(logfd, "%s\t", strtm);
    }
    va_list ar;
    va_start(ar, fmt);
    i += vfprintf(logfd, fmt, ar);
    va_end(ar);
    fclose(logfd);
rtn:
    pthread_mutex_unlock(&log->mutex);
    return i;
}
In Cl_createlog() I just test that the given file can be opened and initialize the mutex. In Cl_putlogt() I open the given file, write to it, and close it. The mutex makes writes atomic across threads.
The array errlogs holds all opened logs used for writing critical messages.

Two-dimensional char array too large exit code 139

Hey guys, I'm attempting to read in workersinfo.txt and store it in a two-dimensional char array. The file is around 4,000,000 lines with around 100 characters per line. I want to store each file line in the array. Unfortunately, I get exit code 139 (a segmentation fault). I'm aware I have to use malloc() and free(), but I've tried a couple of things and haven't been able to make them work. Eventually I have to sort the array by ID number, but I'm stuck on declaring the array.
The file looks something like this:
First Name, Last Name,Age, ID
Carlos,Lopez,,10568
Brad, Patterson,,20586
Zack, Morris,42,05689
This is my code so far:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *ptr_file;
    char workers[4000000][1000];
    ptr_file = fopen("workersinfo.txt","r");
    if (!ptr_file)
        perror("Error");
    int i = 0;
    while (fgets(workers[i],1000, ptr_file)!=NULL){
        i++;
    }
    int n;
    for(n = 0; n < 4000000; n++)
    {
        printf("%s", workers[n]);
    }
    fclose(ptr_file);
    return 0;
}
The stack memory is limited. As you pointed out in your question, you MUST use malloc to allocate such a big (need I say HUGE) chunk of memory, as the stack cannot contain it.
You can use ulimit to review your system's limits (usually including the stack size limit).
On my Mac, the limit is 8MB. After running ulimit -a I get:
...
stack size (kbytes, -s) 8192
...
Or, test the limit using:
#include <sys/resource.h>

struct rlimit rlim;
getrlimit(RLIMIT_STACK, &rlim);
/* rlim.rlim_cur holds the current (soft) stack limit */
I truly recommend that you process each database entry separately.
As mentioned in the comments, declaring the array as static memory would, in most implementations, circumvent the stack.
Still, IMHO, allocating 400MB of memory (or 4GB, depending on which part of your question I look at) is bad form unless totally required - especially within a single function.
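For completeness, here is a minimal sketch (mine, not from the answer) of moving the array off the stack with malloc, keeping the question's 4,000,000 x 1000 dimensions. Note that this still asks for roughly 4GB, so it merely moves the problem from the stack to the heap:
#include <stdio.h>
#include <stdlib.h>

#define LINES 4000000
#define WIDTH 1000

int main(void) {
    /* pointer to an array of WIDTH chars: a heap-allocated 2D array */
    char (*workers)[WIDTH] = malloc(sizeof *workers * LINES);
    if (!workers) {
        perror("malloc");
        return 1;
    }
    /* ... fill with fgets(workers[i], WIDTH, fp) as in the question ... */
    free(workers);
    return 0;
}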
Follow-up Q1: How to deal with each DB entry separately
I hope I'm not doing your homework or anything... but I doubt your homework would include an assignment to load 400MB of data into the computer's memory... so, to answer the question in your comment:
The following sketch of single-entry processing isn't perfect - it's limited to 1KB of data per entry (which I thought to be more than enough for such simple data).
Also, I didn't allow for UTF-8 encoding or anything like that (I worked on the assumption that English would be used).
As you can see from the code, we read each line separately and perform error checks to verify that the data is valid.
To sort the file by ID, you might consider either reading two lines at a time and sorting them (this would be a slow sort), or creating a sorted node tree with the ID data and the position of the line in the file (get the position before reading the line). Once you have sorted the tree, you can sort the data...
...but the tree might get a bit big. Did you look up sorting algorithms?
#include <stdio.h>

// assuming this is the file structure:
//
//    First Name, Last Name,Age, ID
//    Carlos,Lopez,,10568
//    Brad, Patterson,,20586
//    Zack, Morris,42,05689
//
// then this might be your data structure per line:
struct DBEntry {
    char* last_name;       // a pointer to the last name
    char* age;             // a pointer to the age - could probably be an int
    char* id;              // a pointer to the ID
    char first_name[1024]; // the actual buffer...
    // I unified the first name and the buffer since the first name is first.
};

// each time, read only a single line, perform an error check for overflow
// and return the parsed data.
//
// returns 1 on success or 0 on failure.
int read_db_line(FILE* fp, struct DBEntry* line) {
    if (!fgets(line->first_name, 1024, fp))
        return 0;
    // parse data and check for possible overflow.
    // first, zero out data
    int pos = 0;
    line->age = NULL;
    line->id = NULL;
    line->last_name = NULL;
    // read each byte, looking for the EOL marker and the ',' separators
    while (pos < 1024) {
        if (line->first_name[pos] == ',') {
            // we encountered a divider. we should handle it.
            // if the ID field's location is already known, we have an excess comma.
            if (line->id) {
                fprintf(stderr, "Parsing error, invalid data - too many fields.\n");
                return 0;
            }
            // replace the comma with 0 (separate the strings)
            line->first_name[pos] = 0;
            if (line->age)
                line->id = line->first_name + pos + 1;
            else if (line->last_name)
                line->age = line->first_name + pos + 1;
            else
                line->last_name = line->first_name + pos + 1;
        } else if (line->first_name[pos] == '\n') {
            // we encountered a terminator. we should handle it.
            if (line->id) {
                // if we have the id string's position (the start marker), this is
                // a valid entry and we should process the data.
                line->first_name[pos] = 0;
                return 1;
            } else {
                // we reached an EOL without enough ',' separators; this is an
                // invalid line.
                fprintf(stderr, "Parsing error, invalid data - not enough fields.\n");
                return 0;
            }
        }
        pos++;
    }
    // we ran through all the data but there was no EOL marker...
    fprintf(stderr,
            "Parsing error, invalid data (data overflow or data too large).\n");
    return 0;
}

// the main program
int main(int argc, char const* argv[]) {
    // open file
    FILE* ptr_file;
    ptr_file = fopen("workersinfo.txt", "r");
    if (!ptr_file) {
        perror("File Error");
        return 1; // no point continuing without the file
    }
    struct DBEntry line;
    while (read_db_line(ptr_file, &line)) {
        // do what you want with the data... print it?
        printf(
            "First name:\t%s\n"
            "Last name:\t%s\n"
            "Age:\t\t%s\n"
            "ID:\t\t%s\n"
            "--------\n",
            line.first_name, line.last_name, line.age, line.id);
    }
    // close file
    fclose(ptr_file);
    return 0;
}
Follow-up Q2: Sorting 400MB-4GB of data
IMHO, 400MB is already touching on the issues related to big data. For example, a bubble sort on your database would be agonizing as far as performance goes (unless it's a one-time task, where performance might not matter).
Creating an array of DBEntry objects will eventually give you a larger memory footprint than the actual data...
This will not be the optimal way to sort large data.
The correct approach will depend on your sorting algorithm. Wikipedia has a decent primer on sorting algorithms.
Since we are handling a large amount of data, there are a few things to consider:
It would make sense to partition the work, so that different threads/processes sort different sections of the data.
We will need to minimize IO to the hard drive (it will slow the sorting significantly and prevent parallel processing on the same machine/disk).
One possible approach is to build a heap for a heap sort, storing only a sorting value and the entry's original position in the file.
Another option would be to employ a divide and conquer algorithm, such as quicksort; again, sorting only a computed sort value and the entry's position in the original file.
Either way, writing a decent sorting method will be a complicated task, probably involving threading, forking, tempfiles or other techniques.
Here's a simplified demo... it is far from optimized, but it demonstrates the idea of keeping only the sorting value and the file position of each entry in a sorted structure (an insertion-sorted linked list in the code below).
Be aware that using this code will be both relatively slow (although not that slow) and memory intensive...
On the other hand, it requires only about 24 bytes per entry. For 4 million entries that's 96MB, somewhat better than 400MB and definitely better than 4GB.
#include <stdlib.h>
#include <stdio.h>

// assuming this is the file structure:
//
//    First Name, Last Name,Age, ID
//    Carlos,Lopez,,10568
//    Brad, Patterson,,20586
//    Zack, Morris,42,05689
//
// then this might be your data structure per line:
struct DBEntry {
    char* last_name;       // a pointer to the last name
    char* age;             // a pointer to the age - could probably be an int
    char* id;              // a pointer to the ID
    char first_name[1024]; // the actual buffer...
    // I unified the first name and the buffer since the first name is first.
};

// this is a node in the sorted list (built by insertion sort while reading):
struct SortNode {
    struct SortNode* next; // a pointer to the next node
    fpos_t position;       // the DB entry's position in the file
    long value;            // the computed sorting value
}* top_sorting_node = NULL;

// this function will free all the memory used by the global sorting list
void clear_sort_heap(void) {
    struct SortNode* node;
    // as long as there is a first node...
    while ((node = top_sorting_node)) {
        // step forward.
        top_sorting_node = top_sorting_node->next;
        // free the original first node's memory
        free(node);
    }
}

// each time, read only a single line, perform an error check for overflow
// and return the parsed data.
//
// returns 0 on success, -1 on read failure or overflow (hard parse errors exit).
int read_db_line(FILE* fp, struct DBEntry* line) {
    if (!fgets(line->first_name, 1024, fp))
        return -1;
    // parse data and check for possible overflow.
    // first, zero out data
    int pos = 0;
    line->age = NULL;
    line->id = NULL;
    line->last_name = NULL;
    // read each byte, looking for the EOL marker and the ',' separators
    while (pos < 1024) {
        if (line->first_name[pos] == ',') {
            // we encountered a divider. we should handle it.
            // if the ID field's location is already known, we have an excess comma.
            if (line->id) {
                fprintf(stderr, "Parsing error, invalid data - too many fields.\n");
                clear_sort_heap();
                exit(2);
            }
            // replace the comma with 0 (separate the strings)
            line->first_name[pos] = 0;
            if (line->age)
                line->id = line->first_name + pos + 1;
            else if (line->last_name)
                line->age = line->first_name + pos + 1;
            else
                line->last_name = line->first_name + pos + 1;
        } else if (line->first_name[pos] == '\n') {
            // we encountered a terminator. we should handle it.
            if (line->id) {
                // if we have the id string's position (the start marker), this is
                // a valid entry and we should process the data.
                line->first_name[pos] = 0;
                return 0;
            } else {
                // we reached an EOL without enough ',' separators; this is an
                // invalid line.
                fprintf(stderr, "Parsing error, invalid data - not enough fields.\n");
                clear_sort_heap();
                exit(1);
            }
        }
        pos++;
    }
    // we ran through all the data but there was no EOL marker...
    fprintf(stderr,
            "Parsing error, invalid data (data overflow or data too large).\n");
    return -1; // report failure (returning 0 here would look like success)
}

// read and sort a single line from the database.
// return 0 if there was no data to sort. return 1 if data was read and sorted.
int sort_line(FILE* fp) {
    // allocate the memory for the node - use calloc for zeroed-out data
    struct SortNode* node = calloc(1, sizeof(*node));
    // store the position in the file
    fgetpos(fp, &node->position);
    // use a stack-allocated DBEntry for processing
    struct DBEntry line;
    // check that the read succeeded (read_db_line returns -1 on error)
    if (read_db_line(fp, &line)) {
        // free the node's memory
        free(node);
        // return no data (0)
        return 0;
    }
    // compute sorting value - I'll assume all IDs are numbers up to long size.
    sscanf(line.id, "%ld", &node->value);
    // insertion into a sorted list - a questionable sort algorithm... or a
    // questionable implementation.
    // Also, I'll be using pointers to pointers, so it might be a headache to read
    // (it's a headache to write, too...) ;-)
    struct SortNode** tmp = &top_sorting_node;
    // walk down the list until we meet a node with a larger value,
    // OR until the list is finished.
    while (*tmp && (*tmp)->value <= node->value)
        tmp = &((*tmp)->next);
    // update the node's `next` value.
    node->next = *tmp;
    // inject the new node into the list at the position we found
    *tmp = node;
    // return 1 (data was read and sorted)
    return 1;
}

// writes the next line in sorted order
int write_line(FILE* to, FILE* from) {
    struct SortNode* node = top_sorting_node;
    if (!node)    // are we done? top_sorting_node == NULL ?
        return 0; // return 0 - no data to write
    // step top_sorting_node forward
    top_sorting_node = top_sorting_node->next;
    // read data from one file to the other
    fsetpos(from, &node->position);
    free(node); // the node was unlinked above; without this it would leak
    char* buffer = NULL;
    ssize_t length;
    size_t buff_size = 0;
    length = getline(&buffer, &buff_size, from);
    if (length <= 0) {
        perror("Line Copy Error - Couldn't read data");
        free(buffer);
        return 0;
    }
    fwrite(buffer, 1, length, to);
    free(buffer); // getline allocates memory that we're in charge of freeing.
    return 1;
}

// the main program
int main(int argc, char const* argv[]) {
    // open files
    FILE *fp_read, *fp_write;
    fp_read = fopen("workersinfo.txt", "r");
    fp_write = fopen("sorted_workersinfo.txt", "w+");
    if (!fp_read) {
        perror("File Error");
        goto cleanup;
    }
    if (!fp_write) {
        perror("File Error");
        goto cleanup;
    }
    printf("\nSorting");
    while (sort_line(fp_read))
        printf(".");
    // write all sorted data to a new file
    printf("\n\nWriting sorted data");
    while (write_line(fp_write, fp_read))
        printf(".");
    // clean up - close files and make sure the sorting list is cleared
cleanup:
    printf("\n");
    if (fp_read)
        fclose(fp_read);
    if (fp_write)
        fclose(fp_write);
    clear_sort_heap();
    return 0;
}

getpwuid_r and getgrgid_r are slow, how can I cache them?

I am writing a clone of find while learning C. When implementing the -ls option I stumbled upon the problem that the getpwuid_r and getgrgid_r calls are really slow; the same applies to getpwuid and getgrgid. I need them to display user and group names for the ids that stat() provides.
For example, listing the whole filesystem gets 3x slower:
# measurements were made 3 times and the fastest run was recorded
# with getgrgid_r
time ./myfind / -ls > list.txt
real 0m4.618s
user 0m1.848s
sys 0m2.744s
# getgrgid_r replaced with 'return "user";'
time ./myfind / -ls > list.txt
real 0m1.437s
user 0m0.572s
sys 0m0.832s
I wonder how GNU find maintains such good speed. I've looked at the sources, but they are not exactly easy to understand or to reuse without special types, macros, etc.:
time find / -ls > list.txt
real 0m1.544s
user 0m0.884s
sys 0m0.648s
I thought about caching the uid/username and gid/groupname pairs in a data structure. Is that a good idea? How would you implement it?
You can find my complete code here.
UPDATE:
The solution was exactly what I was looking for:
time ./myfind / -ls > list.txt
real 0m1.480s
user 0m0.696s
sys 0m0.736s
Here is a version based on getgrgid (if you don't require thread safety):
char *do_get_group(struct stat attr) {
    struct group *grp;
    static unsigned int cache_gid = UINT_MAX;
    static char *cache_gr_name = NULL;
    /* skip getgrgid if we have the record in cache */
    if (cache_gid == attr.st_gid) {
        return cache_gr_name;
    }
    /* clear the cache */
    cache_gid = UINT_MAX;
    grp = getgrgid(attr.st_gid);
    if (!grp) {
        /*
         * the group is not found or getgrgid failed,
         * so return the gid as a string;
         * an unsigned int needs at most 10 chars.
         * the buffer is static so the pointer stays valid after return
         */
        static char group[11];
        if (snprintf(group, 11, "%u", attr.st_gid) < 0) {
            fprintf(stderr, "%s: snprintf(): %s\n", program, strerror(errno));
            return "";
        }
        return group;
    }
    cache_gid = grp->gr_gid;
    cache_gr_name = grp->gr_name;
    return grp->gr_name;
}
getpwuid:
char *do_get_user(struct stat attr) {
    struct passwd *pwd;
    static unsigned int cache_uid = UINT_MAX;
    static char *cache_pw_name = NULL;
    /* skip getpwuid if we have the record in cache */
    if (cache_uid == attr.st_uid) {
        return cache_pw_name;
    }
    /* clear the cache */
    cache_uid = UINT_MAX;
    pwd = getpwuid(attr.st_uid);
    if (!pwd) {
        /*
         * the user is not found or getpwuid failed,
         * so return the uid as a string;
         * an unsigned int needs at most 10 chars.
         * the buffer is static so the pointer stays valid after return
         */
        static char user[11];
        if (snprintf(user, 11, "%u", attr.st_uid) < 0) {
            fprintf(stderr, "%s: snprintf(): %s\n", program, strerror(errno));
            return "";
        }
        return user;
    }
    cache_uid = pwd->pw_uid;
    cache_pw_name = pwd->pw_name;
    return pwd->pw_name;
}
UPDATE 2:
Changed long to unsigned int.
UPDATE 3:
Added the cache clearing. It is absolutely necessary, because pwd->pw_name may point to a static area. getpwuid can overwrite its contents if it fails, or simply when it is called somewhere else in the program.
Also removed strdup. Since the output of getgrgid and getpwuid must not be freed, there is no need to require free for our wrapper functions.
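To illustrate why clearing the cache matters, here is a small sketch (my illustration, not from the original posts) of the static-storage hazard:
#include <pwd.h>
#include <stdio.h>

int main(void) {
    struct passwd *pwd = getpwuid(0);
    if (!pwd)
        return 1;
    char *name = pwd->pw_name; /* points into libc-owned static storage */
    char saved[64];
    snprintf(saved, sizeof saved, "%s", name); /* take a private copy first */
    getpwuid(1000); /* a later call may overwrite that storage */
    /* 'name' may now differ from 'saved'; only the copy is trustworthy */
    printf("copy: %s, possibly overwritten: %s\n", saved, name);
    return 0;
}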
The timings indeed cast strong suspicion on these functions.
Looking at your function do_get_group, there are some issues:
You call sysconf(_SC_GETPW_R_SIZE_MAX) on every call to do_get_group and do_get_user; definitely cache that (it will not change during the lifetime of your program), though you will not gain much by it.
You use attr.st_uid instead of attr.st_gid, which probably causes the lookup to fail for many files, possibly defeating any caching mechanism. Fix this first, it is a bug!
You return values that must not be passed to free() by the caller, such as grp->gr_name and "". You should always allocate the string you return. The same issue is probably present in do_get_user().
Here is a replacement for do_get_group with a one-shot cache. See if it improves the performance:
/*
 * @brief returns the groupname, or the gid if the group is not present on the system
 *
 * @param attr the entry attributes from lstat
 *
 * @returns the groupname if getgrgid_r() worked, otherwise the gid, as a string
 */
char *do_get_group(struct stat attr) {
    char *group;
    struct group grp;
    struct group *result;
    static size_t length = 0;
    static char *buffer = NULL;
    static gid_t cache_gid = -1;
    static char *cache_gr_name = NULL;
    if (!length) {
        /* only allocate the buffer once */
        long sysconf_length = sysconf(_SC_GETPW_R_SIZE_MAX);
        if (sysconf_length == -1) {
            sysconf_length = 16384;
        }
        length = (size_t)sysconf_length;
        buffer = calloc(length, 1);
    }
    if (!buffer) {
        fprintf(stderr, "%s: calloc(): %s\n", program, strerror(errno));
        return strdup("");
    }
    /* check the cache */
    if (cache_gid == attr.st_gid) {
        return strdup(cache_gr_name);
    }
    /* empty the cache */
    cache_gid = -1;
    free(cache_gr_name);
    cache_gr_name = NULL;
    if (getgrgid_r(attr.st_gid, &grp, buffer, length, &result) != 0) {
        fprintf(stderr, "%s: getgrgid_r(): %s\n", program, strerror(errno));
        return strdup("");
    }
    if (result) {
        group = grp.gr_name;
    } else {
        group = buffer;
        if (snprintf(group, length, "%ld", (long)attr.st_gid) < 0) {
            fprintf(stderr, "%s: snprintf(): %s\n", program, strerror(errno));
            return strdup("");
        }
    }
    /* load the cache */
    cache_gid = attr.st_gid;
    cache_gr_name = strdup(group);
    return strdup(group);
}
Whether the getpwuid and getgrgid calls are cached depends on how they are implemented and on how your system is configured. I recently wrote an implementation of ls and ran into a similar problem.
I found that on all modern systems I tested, the two functions are uncached unless you run the name service caching dæmon (nscd), in which case nscd makes sure that the cache stays up to date. It's easy to understand why: without an nscd, caching the information could lead to outdated output, which would be a violation of the specification.
I don't think you should rely on these functions caching the group and passwd databases, because they often don't. I implemented custom caching code for this purpose. If you don't need up-to-date information when the database contents change during program execution, this is perfectly fine to do.
You can find my implementation of such a cache here. I'm not going to publish it on Stack Overflow, as I do not wish to publish the code under the MIT license.
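For readers who want the flavor of such a cache, here is a small sketch (my illustration; it is not the implementation linked above). It keeps a fixed-size hash table indexed by uid, simply overwriting a slot on collision:
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

#define CACHE_SLOTS 256

struct uid_entry {
    uid_t uid;
    int valid;
    char name[64];
};

static struct uid_entry uid_cache[CACHE_SLOTS];

const char *cached_user_name(uid_t uid) {
    struct uid_entry *e = &uid_cache[uid % CACHE_SLOTS];
    if (e->valid && e->uid == uid)
        return e->name; /* cache hit: no getpwuid() call at all */
    struct passwd *pwd = getpwuid(uid);
    e->uid = uid;
    e->valid = 1;
    if (pwd)
        snprintf(e->name, sizeof e->name, "%s", pwd->pw_name);
    else /* unknown user: fall back to the numeric uid */
        snprintf(e->name, sizeof e->name, "%u", (unsigned)uid);
    return e->name;
}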

How to return a clean error on incorrect mount with VTreeFS?

When mounting a VTreeFS filesystem with a set of arguments (passed with -o when mounting), we want it to fail cleanly if the user doesn't supply the predefined arguments correctly. Currently we get the nasty error message below when we refuse to mount the filesystem and let main return 0. We basically want the filesystem not to be mounted when the arguments are incorrect.
Current situation
mount -t filesystemtest -o testarguments none /mnt/filesystemtest
Arguments invalid
RS: service 'fs_00021' exited uring initialization
filesystemtest 109710 0xab6e 0x65f1 0x618d 0x6203 0x98ba 0x1010
Request to RS failed: unknown error (error 302)
mount: couldn't run /bin/sercie up /sbin/filesystemtest -label 'fs_00021'-args ''
mount: Can't mount none on /mnt/filesystemtest/: unknown error
Preferred situation
mount -t filesystemtest -o testarguments none /mnt/filesystemtest
Arguments invalid
Basically, we want to know how to return a clean error message when not calling start_vtreefs, as below. The example below is not our actual code and doesn't actually use the arguments, but as an example, there should be a way to make this piece of code always fail cleanly (sorry for that):
#include <minix/drivers.h>
#include <minix/vtreefs.h>
#include <sys/stat.h>
#include <time.h>
#include <assert.h>

static void my_init_hook(void)
{
    /* This hook will be called once, after VTreeFS has initialized. */
    struct inode_stat file_stat;
    struct inode *inode;
    /* We create one regular file in the root directory. The file is
     * readable by everyone, and owned by root. Its size as returned by for
     * example stat() will be zero, but that does not mean it is empty.
     * For files with dynamically generated content, the file size is
     * typically set to zero.
     */
    file_stat.mode = S_IFREG | 0444;
    file_stat.uid = 0;
    file_stat.gid = 0;
    file_stat.size = 0;
    file_stat.dev = NO_DEV;
    /* Now create the actual file. It is called "test" and does not have an
     * index number. Its callback data value is set to 1, allowing it to be
     * identified with this number later.
     */
    inode = add_inode(get_root_inode(), "test", NO_INDEX, &file_stat, 0,
        (cbdata_t) 1);
    assert(inode != NULL);
}

static int my_read_hook(struct inode *inode, off_t offset, char **ptr,
    size_t *len, cbdata_t cbdata)
{
    /* This hook will be called every time a regular file is read. We use
     * it to dynamically generate the contents of our file.
     */
    static char data[26];
    const char *str;
    time_t now;
    /* We have only a single file. With more files, cbdata may help
     * distinguishing between them.
     */
    assert((int) cbdata == 1);
    /* Generate the contents of the file into the 'data' buffer. We could
     * use the return value of ctime() directly, but that would make for a
     * lousy example.
     */
    time(&now);
    str = ctime(&now);
    strcpy(data, str);
    /* If the offset is beyond the end of the string, return EOF. */
    if (offset > strlen(data)) {
        *len = 0;
        return OK;
    }
    /* Otherwise, return a pointer into 'data'. If necessary, bound the
     * returned length to the length of the rest of the string. Note that
     * 'data' has to be static, because it will be used after this function
     * returns.
     */
    *ptr = data + offset;
    if (*len > strlen(data) - offset)
        *len = strlen(data) - offset;
    return OK;
}

/* The table with callback hooks. */
struct fs_hooks my_hooks = {
    my_init_hook,
    NULL, /* cleanup_hook */
    NULL, /* lookup_hook */
    NULL, /* getdents_hook */
    my_read_hook,
    NULL, /* rdlink_hook */
    NULL  /* message_hook */
};

int main(int argc, char* argv[])
{
    if (argc == 1) {
        // We want it to fail right now!!!!
        printf("Arguments invalid. (pass option with -o)");
    }
    else {
        struct inode_stat root_stat;
        /* Fill in the details to be used for the root inode. It will be a
         * directory, readable and searchable by anyone, and owned by root.
         */
        root_stat.mode = S_IFDIR | 0555;
        root_stat.uid = 0;
        root_stat.gid = 0;
        root_stat.size = 0;
        root_stat.dev = NO_DEV;
        /* Now start VTreeFS. Preallocate 10 inodes, which is more than we'll
         * need for this example. No indexed entries are used.
         */
        start_vtreefs(&my_hooks, 10, &root_stat, 0);
        /* The call above never returns. The return below just keeps the
         * compiler happy. */
    }
    return 0;
}

tmpfile() on windows 7 x64

Running the following code on Windows 7 x64
#include <stdio.h>
#include <errno.h>

int main() {
    int i;
    FILE *tmp;
    for (i = 0; i < 10000; i++) {
        errno = 0;
        if (!(tmp = tmpfile()))
            printf("Fail %d, err %d\n", i, errno);
        else
            fclose(tmp); /* only close on success; fclose(NULL) is undefined */
    }
    return 0;
}
This gives errno 13 (Permission denied) on the 637th and 1004th calls; it works fine on XP (I haven't tried 7 x86). Am I missing something, or is this a bug?
I've hit a similar problem on Windows 8: tmpfile() was failing with the Win32 ERROR_ACCESS_DENIED error code. And yes, if you run the application with administrator privileges, it works fine.
I guess the problem is the one mentioned here:
https://lists.gnu.org/archive/html/bug-gnulib/2007-02/msg00162.html
Under Windows, the tmpfile function is defined to always create
its temporary file in the root directory. Most users don't have
permission to do that, so it will often fail.
I suspect this is a somewhat incomplete Windows port issue, so it should arguably be reported to Microsoft as a bug. (Why provide a tmpfile function if it's useless to most users?)
But who has time to tilt at Microsoft's windmills?! :-)
I've coded a similar implementation using GetTempPathW / GetModuleFileNameW / _wfopen. The code where I encountered this problem came from libjpeg. I'm attaching the whole source file here, but you can pick out the relevant part from jpeg_open_backing_store.
jmemwin.cpp:
//
// Windows port for jpeg lib functions.
//
#define JPEG_INTERNALS
#include <Windows.h> // GetTempFileName
#undef FAR           // Will be redefined - disable the warning
#include "jinclude.h"
#include "jpeglib.h"

extern "C" {

#include "jmemsys.h" // jpeg_ api interface.

//
// Memory allocation and freeing are controlled by the regular library
// routines malloc() and free().
//
GLOBAL(void *) jpeg_get_small (j_common_ptr cinfo, size_t sizeofobject)
{
    return (void *) malloc(sizeofobject);
}

GLOBAL(void) jpeg_free_small (j_common_ptr cinfo, void * object, size_t sizeofobject)
{
    free(object);
}

/*
 * "Large" objects are treated the same as "small" ones.
 * NB: although we include FAR keywords in the routine declarations,
 * this file won't actually work in 80x86 small/medium model; at least,
 * you probably won't be able to process useful-size images in only 64KB.
 */
GLOBAL(void FAR *) jpeg_get_large (j_common_ptr cinfo, size_t sizeofobject)
{
    return (void FAR *) malloc(sizeofobject);
}

GLOBAL(void) jpeg_free_large (j_common_ptr cinfo, void FAR * object, size_t sizeofobject)
{
    free(object);
}

//
// Used only by command line applications, not by static library compilation
//
#ifndef DEFAULT_MAX_MEM          /* so can override from makefile */
#define DEFAULT_MAX_MEM 1000000L /* default: one megabyte */
#endif

GLOBAL(long) jpeg_mem_available (j_common_ptr cinfo, long min_bytes_needed, long max_bytes_needed, long already_allocated)
{
    // jmemansi.c's jpeg_mem_available implementation was insufficient for some .jpg loads.
    MEMORYSTATUSEX status = { 0 };
    status.dwLength = sizeof(status);
    GlobalMemoryStatusEx(&status);
    if( status.ullAvailPhys > LONG_MAX )
        // Normally goes here, since new PCs have more than 4 GB of RAM.
        return LONG_MAX;
    return (long) status.ullAvailPhys;
}

/*
    Backing store (temporary file) management.
    Backing store objects are only used when the value returned by
    jpeg_mem_available is less than the total space needed. You can dispense
    with these routines if you have plenty of virtual memory; see jmemnobs.c.
*/
METHODDEF(void) read_backing_store (j_common_ptr cinfo, backing_store_ptr info, void FAR * buffer_address, long file_offset, long byte_count)
{
    if (fseek(info->temp_file, file_offset, SEEK_SET))
        ERREXIT(cinfo, JERR_TFILE_SEEK);
    size_t bytes_read = fread( buffer_address, 1, byte_count, info->temp_file);
    if (bytes_read != (size_t) byte_count)
        ERREXIT(cinfo, JERR_TFILE_READ);
}

METHODDEF(void)
write_backing_store (j_common_ptr cinfo, backing_store_ptr info, void FAR * buffer_address, long file_offset, long byte_count)
{
    if (fseek(info->temp_file, file_offset, SEEK_SET))
        ERREXIT(cinfo, JERR_TFILE_SEEK);
    if (JFWRITE(info->temp_file, buffer_address, byte_count) != (size_t) byte_count)
        ERREXIT(cinfo, JERR_TFILE_WRITE);
    // E.g. if you need to debug writes:
    //if( fflush(info->temp_file) != 0 )
    //    ERREXIT(cinfo, JERR_TFILE_WRITE);
}

METHODDEF(void)
close_backing_store (j_common_ptr cinfo, backing_store_ptr info)
{
    fclose(info->temp_file);
    // The file is deleted via the 'D' flag given at open time.
}

static HMODULE DllHandle()
{
    MEMORY_BASIC_INFORMATION info;
    VirtualQuery(DllHandle, &info, sizeof(MEMORY_BASIC_INFORMATION));
    return (HMODULE)info.AllocationBase;
}

GLOBAL(void) jpeg_open_backing_store(j_common_ptr cinfo, backing_store_ptr info, long total_bytes_needed)
{
    // Generate a unique filename...
    wchar_t path[ MAX_PATH ] = { 0 };
    wchar_t dllPath[ MAX_PATH ] = { 0 };
    GetTempPathW( MAX_PATH, path );
    // ...based on the .exe or .dll filename...
    GetModuleFileNameW( DllHandle(), dllPath, MAX_PATH );
    wchar_t* p = wcsrchr( dllPath, L'\\');
    wchar_t* ext = wcsrchr( p + 1, L'.');
    if( ext ) *ext = 0;
    wchar_t* outFile = path + wcslen(path);
    static int iTempFileId = 1;
    // ...and on the process id (so processes don't fight with each other)
    // plus a process-global id.
    wsprintfW(outFile, L"%s_%d_%d.tmp", p + 1, GetCurrentProcessId(), iTempFileId++ );
    // 'D' - temporary file, deleted when the last handle is closed.
    if ((info->temp_file = _wfopen(path, L"w+bD") ) == NULL)
        ERREXITS(cinfo, JERR_TFILE_CREATE, "");
    info->read_backing_store = read_backing_store;
    info->write_backing_store = write_backing_store;
    info->close_backing_store = close_backing_store;
} //jpeg_open_backing_store

/*
 * These routines take care of any system-dependent initialization and
 * cleanup required.
 */
GLOBAL(long)
jpeg_mem_init (j_common_ptr cinfo)
{
    return DEFAULT_MAX_MEM; /* default for max_memory_to_use */
}

GLOBAL(void)
jpeg_mem_term (j_common_ptr cinfo)
{
    /* no work */
}

} // extern "C"
I'm intentionally ignoring errors from some of the functions - have you ever seen GetTempPathW or GetModuleFileNameW fail?
A bit of a refresher from the manpage of tmpfile(), which returns a FILE*:
The file will be automatically deleted when it is closed or the program terminates.
My verdict on this issue: deleting a file on Windows is weird.
When you delete a file on Windows, for as long as something holds a handle to it you can't call CreateFile on anything with the same absolute path; it will fail with the NT error code STATUS_DELETE_PENDING, which gets mapped to the Win32 code ERROR_ACCESS_DENIED. This is probably where the errno 13 (EACCES, "Permission denied") is coming from. You can confirm this with a tool like Sysinternals Process Monitor.
My guess is that the CRT somehow wound up creating a file that has the same name as something it used before. I've sometimes witnessed that deleting files on Windows can appear asynchronous, because some other process (sometimes even an antivirus product, reacting to the fact that you've just closed a delete-on-close handle...) will keep a handle open to the file, so for some timing window you will see a visible file that you can't get a handle to without hitting delete pending/access denied. Or, it could be that tmpfile simply chose a filename that some other process was working on.
To avoid this sort of thing, you might want to consider another mechanism for temp files. For example, the Win32 function GetTempFileName lets you supply your own prefix, which makes a collision less likely. That function resolves race conditions by retrying when a create fails with "already exists", so be careful about deleting the temp filenames it generates: deleting a file cancels your right to use that name concurrently with other processes/threads.
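For illustration, here is a minimal sketch along those lines (my code, not from the answer; my_tmpfile is a made-up name). It creates the temp file in the user's temp directory rather than the root, and uses the MSVC-specific "D" mode flag (the same one the jmemwin.cpp code above relies on) to delete the file on close:
#include <stdio.h>
#include <windows.h>

static FILE *my_tmpfile(void) {
    char dir[MAX_PATH + 1], name[MAX_PATH];
    if (!GetTempPathA(sizeof dir, dir))
        return NULL;
    /* "my" is an arbitrary prefix; uUnique = 0 makes Windows pick a
       unique name and create the file immediately */
    if (!GetTempFileNameA(dir, "my", 0, name))
        return NULL;
    /* "D" is an MSVC fopen extension: delete the file when it is closed */
    return fopen(name, "w+bD");
}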
