It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
I have to create a UTF-8 file (say utf8_test.txt) on Linux (Ubuntu), in C.
I tried fopen(), but it creates the file depending on the locale; as the locale was en_IN, it created the file in ASCII, I guess.
Is there any interface or function by which I can specify the format of the file to open, or do I need to add some bytes at the beginning of the file so that the OS understands that it is a UTF-8 file?
Please give your valuable inputs.
Thank you.
GLib contains routines for working with UTF-8 text, and libiconv can be used to convert between various charsets, including UTF-8.
fopen can be used to write a binary stream of data. Your locale is not relevant.
You should check whether you are actually sending a UTF-8 byte stream to the file. You can do this by running a hex dump tool on the file, e.g. xxd, and seeing whether a UTF-8 sequence appears in the output.
If you do not have UTF-8 bytes in the file then the byte stream you are sending to the file is incorrect.
If you do have UTF-8 bytes in the file then your issue is just one of display.
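As a rough illustration of the point above, the following sketch writes a short string whose non-ASCII characters are spelled out as explicit UTF-8 bytes. The helper names write_bytes and write_utf8_sample and the sample text are made up for this example, and no byte order mark is written, on the assumption that whatever reads the file does not need one.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes len bytes to path exactly as given.
   Returns 0 on success, -1 on failure. */
int write_bytes(const char *path, const char *data, size_t len)
{
    FILE *fp = fopen(path, "wb");   /* "b" changes nothing on Linux but is harmless */
    if (fp == NULL)
        return -1;
    size_t written = fwrite(data, 1, len, fp);
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}

/* Hypothetical helper: writes a sample line containing two non-ASCII characters. */
int write_utf8_sample(const char *path)
{
    /* U+00E9 (e with acute accent) is the two bytes 0xC3 0xA9 in UTF-8;
       U+20AC (euro sign) is the three bytes 0xE2 0x82 0xAC. */
    static const char text[] = "caf\xC3\xA9 costs 2 \xE2\x82\xAC\n";
    return write_bytes(path, text, sizeof text - 1);   /* exclude the trailing '\0' */
}
```

After calling write_utf8_sample("utf8_test.txt"), a hex dump of the file should contain the bytes c3 a9 and e2 82 ac regardless of the locale the program ran under.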
Closed 9 years ago.
For a homework assignment I created a simple compression/decompression program that makes use of a naive implementation of run-length encoding. I've gotten my program working; compressing and decompressing any text file with a pretty large number of characters (e.g. the program source) works flawlessly. As an experiment I tried to compress/decompress the binary of the compression program itself. This resulted in a file that was much smaller than the original binary, and is obviously un-runnable. What is causing this data-loss?
My assumption was that it's related to how binary files are represented, but I can't figure much out past that.
Possible issues:
Your program opens the binary file in the text mode, which damages the '\r' and '\n' bytes
Your program incorrectly handles zero bytes, treating them as ends of strings ('\0') and not as data of its own
Your program uses char for the bytes of data (plain char is signed on many platforms) and works correctly only with non-negative values, which the ASCII chars of English text are, but fails with arbitrary byte values, which may be negative
Your program has an overflow somewhere which shows up only on big files
Your program has some other data-dependent bug
If the platform is Linux (as the question is tagged), there's no difference between binary and text modes, so it shouldn't be that. Even so, the files should be opened as binary.
I suspect that your problem is the program treats '\0' characters as terminators (or otherwise specially) instead of as valid data.
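To make the zero-byte and signedness points concrete, here is a minimal sketch of a run-length coder that works on unsigned char buffers with explicit lengths, so '\0' and bytes above 0x7F are treated like any other data. The (count, value) pair format and the names rle_encode and rle_decode are assumptions for illustration, not the asker's actual format.

```c
#include <stddef.h>

/* Encodes src[0..n) as (count, value) byte pairs into dst.
   dst must have room for 2 * n bytes. Returns the number of bytes written. */
size_t rle_encode(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t out = 0;
    size_t i = 0;
    while (i < n) {
        unsigned char value = src[i];
        size_t run = 1;
        /* Runs are capped at 255 so the count fits in one byte. */
        while (i + run < n && src[i + run] == value && run < 255)
            run++;
        dst[out++] = (unsigned char)run;
        dst[out++] = value;
        i += run;
    }
    return out;
}

/* Decodes (count, value) pairs from src[0..n) into dst.
   Returns the decoded length, or (size_t)-1 if the input is malformed
   or does not fit in dst_cap bytes. */
size_t rle_decode(const unsigned char *src, size_t n,
                  unsigned char *dst, size_t dst_cap)
{
    if (n % 2 != 0)
        return (size_t)-1;
    size_t out = 0;
    for (size_t i = 0; i < n; i += 2) {
        size_t run = src[i];
        if (run == 0 || run > dst_cap - out)
            return (size_t)-1;
        for (size_t k = 0; k < run; k++)
            dst[out++] = src[i + 1];
    }
    return out;
}
```

Because lengths are passed explicitly and nothing relies on strlen() or a terminating '\0', the same code should round-trip an executable as well as a text file.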
Closed 10 years ago.
AFAIK, a buffer overflow is achieved by overwriting memory adjacent to a C variable's buffer. This overwriting is used to spawn a shell which executes commands.
But what if the user running the program that is vulnerable to a buffer overflow has the shell disabled?
/etc/passwd:
user1:x:1000:1000:user1,,,,:/home/user1:/bin/false
sudo -u user1 /usr/bin/programname
"Shell disabled" only matters if you're actually logging in. If you're exploiting an already running program then you don't need to log in.
Exploits do not use the shell that is configured for a user. They normally carry binary code of their own, called shellcode, which is functionally equivalent to a primitive shell: it will start any chosen executable, for example a real shell program. The exploited program is then tricked into executing this code.
There are many different shellcodes available on the net, for example ones that do not include a '\0' byte, so they pass unharmed through C string handling, or ones that include only printable characters, valid Unicode strings, etc.
Closed 10 years ago.
I would like to read from a file into an unsigned character buffer as well as write that information back into a file. I am using an unsigned character buffer because I need to send this information over a UDP socket.
The problem is I can't seem to find a way to properly read the file into the buffer and write from the buffer back to a file.
Can anyone point a way to do this?
Thanks so much
Take a look at the write and read functions, or fread and fwrite. They should do the trick.
For example, you write a buffer to a file with:
int fd = open("file", O_CREAT | O_WRONLY, 0600);
write(fd, yourBuffer, numberOfCharactersToWrite);
The write function returns -1 on error and may write fewer bytes than requested, so read its manual page.
fwrite is very similar in usage; see its documentation.
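For the fread/fwrite route, a minimal sketch might look like the following. The helper names read_file and write_file are hypothetical, and the caller is assumed to supply a buffer large enough for the data it wants to handle in one piece.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: reads up to cap bytes from path into buf.
   Returns the number of bytes read, or -1 if the file cannot be
   opened or a read error occurs. */
long read_file(const char *path, unsigned char *buf, size_t cap)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    size_t n = fread(buf, 1, cap, fp);
    int failed = ferror(fp);   /* distinguishes a short read from an error */
    fclose(fp);
    return failed ? -1 : (long)n;
}

/* Hypothetical helper: writes len bytes from buf to path, replacing
   any existing content. Returns 0 on success, -1 on failure. */
int write_file(const char *path, const unsigned char *buf, size_t len)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t n = fwrite(buf, 1, len, fp);
    if (fclose(fp) != 0 || n != len)
        return -1;
    return 0;
}
```

The byte count returned by read_file is the length to pass along with the buffer when sending it over the socket, since the data may contain zero bytes and cannot be measured with strlen().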
Closed 10 years ago.
I'm trying to get the size of a file from the commandline in C using argv. I'm not too familiar with file i/o in C, so any pointers would be greatly appreciated. Thanks.
You've not stated the platform, but your C program is given an argument list when it is started, and the file names are strings in argv. The POSIX function you'd probably use is stat(); it takes the file name and a pointer to a struct stat, and puts the file's size into the st_size member of the structure.
The answer may be different on Windows; the POSIX subsystem will provide a stat() workalike (probably named _stat()), but there'll also be a native interface.
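On a POSIX system, the stat() approach could be sketched roughly as follows. The wrapper name file_size and the choice of long long for the result are assumptions made for this example.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical wrapper: returns the size of the file at path in bytes,
   or -1 if stat() fails (for example because the file does not exist). */
long long file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

A main() would check that argc is greater than 1, call file_size(argv[1]), and print the result with the %lld conversion.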
Closed 11 years ago.
I would like to extract file names and their corresponding MD5 sums from a checksum file in a format such as this:
MD5 (FreeBSD-8.2-RELEASE-amd64-bootonly.iso) = 2587cb3d466ed19a7dc77624540b0f72
I would prefer to do this locally within the program, which rules out awk and the like.
You can read lines easily enough using fgets(). Don't even think of using gets().
If you're reasonably confident you won't be dealing with file names containing close parentheses, you can use sscanf() to extract the bits and pieces. A plain %s conversion will not do for the file name, because it reads up to the next white space and swallows the close parenthesis; the scanset %[^)] stops in front of it:
char hash_type[16];
char file_name[1024];
char hash_value[128];
/* %[^)] reads the file name up to, but not including, the close parenthesis */
if (sscanf(line, "%15s (%1023[^)]) = %127s", hash_type, file_name, hash_value) == 3) {
    /* ...good to go... */
} else {
    /* ...something went wrong... */
}
Note the sizes specified in the sscanf() string compared to the variable definitions; there isn't an easy way to generalize that other than by using snprintf() to create the format string:
char format[32];
/* Builds "%15s (%1023[^)]) = %127s" from the array sizes above */
snprintf(format, sizeof(format), "%%%zus (%%%zu[^)]) = %%%zus",
         sizeof(hash_type)-1, sizeof(file_name)-1, sizeof(hash_value)-1);
Your alternative is some routine forward parsing to locate the hash type and the open parenthesis before the start of the file name, and some trickier backwards parsing, skipping over the hash value and finding the equals and the last close parenthesis, and then collecting the various parts.
You should be able to implement this with fopen(), fgets() and strchr() - but first you will need to nail down the format of the file more precisely (for example: what happens if the filename includes a ) character?)
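One possible shape for the strchr()-based parsing is sketched below, for a single line as returned by fgets(). It assumes the file name runs from the first open parenthesis to the last close parenthesis on the line, which is only one way of settling the question about a ) inside the file name; the function name parse_checksum_line is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: splits a line of the form "TYPE (name) = hash".
   Copies the name and hash into the caller's buffers as C strings.
   Returns 0 on success, -1 if the line does not fit the format or a
   field does not fit its buffer. */
int parse_checksum_line(const char *line,
                        char *name, size_t name_cap,
                        char *hash, size_t hash_cap)
{
    const char *open = strchr(line, '(');     /* first '(' starts the name */
    const char *close = strrchr(line, ')');   /* last ')' ends it */
    if (open == NULL || close == NULL || close < open)
        return -1;

    size_t name_len = (size_t)(close - open - 1);
    if (name_len >= name_cap)
        return -1;

    /* The hash must follow the close parenthesis as ") = <hash>". */
    if (strncmp(close, ") = ", 4) != 0)
        return -1;
    const char *h = close + 4;
    size_t hash_len = strcspn(h, " \r\n");    /* stop at the line ending */
    if (hash_len == 0 || hash_len >= hash_cap)
        return -1;

    memcpy(name, open + 1, name_len);
    name[name_len] = '\0';
    memcpy(hash, h, hash_len);
    hash[hash_len] = '\0';
    return 0;
}
```

Using strrchr() for the close parenthesis means a file name such as a)b.iso is kept whole, at the cost of assuming the hash itself never contains a parenthesis.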
I wouldn't advocate it in most languages, but why not just hit it up with POSIX regex?
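For comparison, a POSIX regex version might look like the sketch below, using regcomp() and regexec() from <regex.h>. The pattern, which accepts only the MD5 prefix and a hexadecimal digest, and the function name match_checksum_line are assumptions for illustration.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: matches "MD5 (name) = hexdigest" and copies the
   two captured fields into the caller's buffers as C strings.
   Returns 0 on success, -1 on no match or if a field does not fit. */
int match_checksum_line(const char *line,
                        char *name, size_t name_cap,
                        char *hash, size_t hash_cap)
{
    regex_t re;
    regmatch_t m[3];   /* m[0] whole match, m[1] file name, m[2] digest */

    /* Compiled on every call for brevity; a real program would compile once. */
    if (regcomp(&re, "^MD5 \\((.*)\\) = ([0-9a-fA-F]+)", REG_EXTENDED) != 0)
        return -1;
    int rc = regexec(&re, line, 3, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;

    size_t name_len = (size_t)(m[1].rm_eo - m[1].rm_so);
    size_t hash_len = (size_t)(m[2].rm_eo - m[2].rm_so);
    if (name_len >= name_cap || hash_len >= hash_cap)
        return -1;

    memcpy(name, line + m[1].rm_so, name_len);
    name[name_len] = '\0';
    memcpy(hash, line + m[2].rm_so, hash_len);
    hash[hash_len] = '\0';
    return 0;
}
```

The regmatch_t offsets are byte positions into the original line, so the fields are copied out rather than modified in place.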