I am currently attempting to read in Hex values from a text file.
There can be multiple lines of hex values, and each line can be as long as needed:
f53d6d0568c7c7ce
1307a7a1c84058
b41af04b24f3eb83ce
Currently, I put together a simple loop to read in the Hex values into an unsigned char line[500] with fscanf as such:
for(i=0; i < 500; i++)
{
if (fscanf(fp, "%02x", &line[i]) != 1) break;
}
At the current moment, this only reads in the first line. As well, it is definitely not the best approach to just throw in a random 500 there to read.
I was assuming I could use sscanf with fgets or something of that nature. But I was unsure if this would be the best approach.
If anyone could help point me in the right direction, I would greatly appreciate it.
You're on the right track with fgets() and sscanf(); that will let you size everything appropriately. If your data is really in that format, sscanf() might be overkill; you could just write a quick conversion loop yourself and save all those variadic function calls.
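As a rough sketch of that idea, a hand-written conversion of one line read by fgets() might look like the following. The helper names hex_value and decode_hex_line are made up for illustration, and the sketch assumes each line holds an even number of hex digits with nothing else on it:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: decode a line such as "f53d6d05\n" into out[].
 * Returns the number of bytes written, or -1 on an odd digit count,
 * a non-hex character, or when out[] is too small.
 */
static long decode_hex_line(const char *line, unsigned char *out, size_t cap)
{
    size_t len = strcspn(line, "\r\n");   /* ignore the trailing newline */

    if (len % 2 != 0 || len / 2 > cap)
        return -1;
    for (size_t i = 0; i < len; i += 2) {
        int hi = hex_value((unsigned char)line[i]);
        int lo = hex_value((unsigned char)line[i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i / 2] = (unsigned char)(hi * 16 + lo);
    }
    return (long)(len / 2);
}
```

Calling decode_hex_line() inside a while (fgets(buf, sizeof buf, fp)) loop handles one line per iteration, so the buffer size limits the line length rather than the total number of values.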
Note that sscanf is slow (library calls, memory usage, and overkill for this job). Moreover, it is dangerous when used carelessly (because of possible buffer overruns).
You would probably get better results with your own parser. It may take more source code, but it gives you the chance to control and expand your code exactly as needed, without compromising security and speed.
The usual way is to accumulate the hex digits one by one as you read them and build up the corresponding integer:
hexDigit = one letter from "0123456789ABCDEF" remapped to a number within 0-15
accumulating_number= accumulating_number * 16 + hexDigit
Here is a tiny standalone parser as a full example. It accepts lower and upper case and it ignores any non hex character (so you can use space or commas for better readability in the source):
#include <stdio.h>
#define SPLIT_CHAR_SIZE 8 // size of the hex numbers to parse (eg. 6 for RGB colors)
void do_something_with(unsigned int n)
{
printf("%08X ",n);
}
int main(int argc, char** argv)
{
FILE* fp= (argc!=2) ? stdin : fopen(argv[1],"r");
if(!fp) { fprintf(stderr,"Usage: %s fileToRead\n", argv[0]); return(-1); }
unsigned int i=0, accumulator=0;
int c; // int rather than char, so that EOF can be distinguished from valid characters
while((c= fgetc(fp)) != EOF) // you could parse a c-string via fscanf() to handle other file contents
{
// The "<<4" gives room for 4 more bits, aka a nibble, aka one hex digit, aka a number within [0,15]
if(c>='0' && c<='9')
accumulator= (accumulator<<4) | (c - '0');
else if(c>='a' && c<='f') // lower case
accumulator= (accumulator<<4) | (c - 'a' + 10);
else if(c>='A' && c<='F') // upper case
accumulator= (accumulator<<4) | (c - 'A' + 10);
else
continue; // skip all other (invalid) characters
// When you want to parse more than one hex number you can use something like this:
if(++i % SPLIT_CHAR_SIZE == 0)
{
do_something_with(accumulator);
accumulator= 0; // do not forget this
}
}
printf("\n");
return 0;
}
If you give this parser the following (somewhat weird) file content:
ca 53,
FF 00
aa bb cc dd
then the function do_something_with() will output this:
CA53FF00 AABBCCDD
I have a hexadecimal string like:
char str[] = "40004A0060007A0034006600";
I want to extract individual values from it like 0x40, 0x00, 0x4A, 0x00 etc.
How to do it?
Copy the 2 bytes of interest into a temporary 3 byte array.
Null terminate the 3 byte array to turn it into a string.
Call strtoul on this array from stdlib.h.
Alternatively, you could decode it manually, since it's a trivial thing to do. Mask out the nibbles, subtract some ASCII values or use a lookup table, then multiply the most significant nibble by 16 and add the least significant one.
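The three steps with the temporary array could be sketched like this. The function name byte_at and its error convention are assumptions for illustration, not part of the question:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: decode the hex pair at str[2*index] and str[2*index + 1].
 * Returns 0..255, or -1 if the pair is missing or is not two hex digits.
 */
static int byte_at(const char *str, size_t index)
{
    char tmp[3];

    if (strlen(str) < 2 * index + 2)
        return -1;
    memcpy(tmp, str + 2 * index, 2);   /* copy the two characters of interest */
    tmp[2] = '\0';                     /* null-terminate to make it a string */

    /* strtoul also accepts signs and leading spaces, so check the digits first */
    if (!isxdigit((unsigned char)tmp[0]) || !isxdigit((unsigned char)tmp[1]))
        return -1;
    return (int)strtoul(tmp, NULL, 16);
}
```

With the string from the question, byte_at(str, 0) gives 0x40 and byte_at(str, 2) gives 0x4A.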
NOTE: This answer applies to revision 3 of the question. Meanwhile, the question has been modified, thereby invalidating option #1 of my answer. As pointed out in the comments section of the question, this was not OP's fault, though.
You have two options:
1. Convert the string to an integer type, for example using the function strtoul or strtoull, and then use bit-shifting (>> operator) and bit-masking (& operator) to obtain the desired values. However, due to limitations in the range of values that the data types unsigned long and unsigned long long can represent, this option is only guaranteed to work with up to 8 hexadecimal digits with strtoul and 16 digits with strtoull. EDIT: Meanwhile, the question has been modified in such a way that the string is longer than 16 digits, so this solution is no longer viable.
2. Obtain the desired values by looking them up directly in the string. For example, if you are looking for the 3rd group of hexadecimal digits, then you will find them using str[4] and str[5]. This will give you two character values. If you want to convert these two hexadecimal characters to the number that they represent, then you can create a string from these two values and then use strtoul on that string.
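For strings short enough to fit the first option, the shift-and-mask step might be sketched as follows. The function name byte_from_value is hypothetical, and the sketch assumes the caller already knows the string holds exactly nbytes bytes (at most 4, so that the value fits in 32 bits):

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: byte number `index` (0 = leftmost pair) of a hex
 * string that holds exactly `nbytes` bytes. Returns 0..255, or -1 if the
 * arguments are out of range.
 */
static int byte_from_value(const char *str, size_t nbytes, size_t index)
{
    if (nbytes == 0 || nbytes > 4 || index >= nbytes)
        return -1;

    unsigned long value = strtoul(str, NULL, 16);
    unsigned shift = (unsigned)(8 * (nbytes - 1 - index));  /* leftmost byte is highest */

    return (int)((value >> shift) & 0xFFu);
}
```

The bytes are numbered from the left here so that the indices match the positions in the string.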
Since you seem to be a beginner, I broke the task up into its constituent parts. This is a very simple hex dump facility where each step your code needs to take is its own routine. It is a quick and dirty and rather imperfect implementation, but understanding how to improve it will help you learn and write your own.
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
int
nibble(uint8_t ch) {
if ((ch >= '0') && (ch <= '9')) {
return ch - '0';
}
if ((ch >= 'A') && (ch <= 'F')) {
return 10 + (ch - 'A');
}
if ((ch >= 'a') && (ch <= 'f')) {
return 10 + (ch - 'a');
}
/* should never get here if isxdigit was called first */
return -1;
}
int
next_byte(const char *in)
{
uint8_t hi = 16 * nibble(*in);
uint8_t lo = nibble(*(in + 1));
return hi + lo;
}
int points_to_byte(const char *in) {
return ((*in) && isxdigit((unsigned char)*in))
&& (*(in + 1)) && isxdigit((unsigned char)*(in + 1));
}
void
dump(const char *in) {
/* Decide what to do with input that is not a string of hex bytes */
for (int i = 0; points_to_byte(in + i); i += 2) {
printf("%d\n", next_byte(in + i));
}
}
int
main(int argc, char *argv[]) {
if (argc < 2) {
puts("Need hex strings as arguments");
return 1;
}
for (int i = 1; i < argc; ++i) {
dump(argv[i]);
}
}
When I compile this into an executable called t and run it with your input, this is the output I get:
$ ./t 400004a005b002000113efb29f73f57589343e70e5244162edf312e303030322e313420200043472d58585858000000000032303139303833585858585858000000505230474C5043343554334C3343
64
0
4
160
...
67
52
195
52
You also want to convert it into another string like:
char str1[] ="0x40,0x00,0x4A,0x00,0x60";
Since you do not control the input string, you are going to need to malloc the buffer for the output. That and storing the transformed output is left as an exercise.
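For readers who want to check their attempt at that exercise, here is one possible sketch. The function name to_prefixed_list is made up, and the choice to reject odd-length or non-hex input outright is an assumption:

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: turn "40004A" into a newly allocated "0x40,0x00,0x4A".
 * Returns NULL on empty, odd-length or non-hex input, or if malloc fails.
 * The caller is responsible for freeing the result.
 */
static char *to_prefixed_list(const char *in)
{
    size_t len = strlen(in);

    if (len == 0 || len % 2 != 0)
        return NULL;
    for (size_t i = 0; i < len; i++)
        if (!isxdigit((unsigned char)in[i]))
            return NULL;

    size_t nbytes = len / 2;
    /* "0xHH" is 4 characters per byte, plus a comma between bytes;
       the slot of the comma that the last byte does not need holds the '\0'. */
    char *out = malloc(nbytes * 5);
    if (out == NULL)
        return NULL;

    char *p = out;
    for (size_t i = 0; i < nbytes; i++)
        p += sprintf(p, "%s0x%c%c", i ? "," : "", in[2 * i], in[2 * i + 1]);
    return out;
}
```

The digits are copied through unchanged, so the case of the output follows the case of the input.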
I need to take some input that is a string, turn it into the equivalent hex number, and put it in place in a binary file.
char *fpath = argv[3];
FILE *f = fopen(fpath, "wb+");
char buf[1000];
char *input = "46AB";
unsigned int value = strtol(input , NULL, 16);
sprintf(buf, "\\x%x", value);
fseek(f, 2, SEEK_SET);
fputs(buf, f);
In the file it produces
5C 78 34 36 61
while I need it to look like
46 AB
Is there an elegant way of doing this?
Your sprintf call created the string
\x46ab
and you wrote those six characters to the file without further interpretation, so that's what you saw in the file (hex bytes 5c 78 34 36 61 62).
To get the hex-to-binary conversion you want, while avoiding byte order issues and anticipating the possibility of an arbitrary-length input string, you can do this one byte at a time with code like this:
char *p;
for(p = input; *p != '\0'; p += 2) {
unsigned int x;
if(sscanf(p, "%2x", &x) != 1) break;
putc(x, f);
}
This uses sscanf to convert the input string from hexadecimal, two characters (two hexadecimal digits, or one output byte) at a time.
I also tested it with a longer input string:
input = "0102034a4b4c";
This is quick-and-dirty, imperfect code. It will misbehave if input contains an odd number of characters, and it doesn't deal particularly gracefully with non-hexadecimal characters, either.
An improvement is to use the mildly obscure %n format specifier to scanf to discover exactly how many characters it consumed each time:
for(p = input; *p != '\0'; ) {
unsigned int x;
int n;
if(sscanf(p, "%2x%n", &x, &n) != 1) break;
putc(x, f);
p += n;
}
This should be more robust against malformed input strings, although I have not tested it exhaustively. (I hardly ever use functions in the scanf family, let alone the more obscure format specifiers like %n, but this is one of the few problems I know of where it's attractive, the alternatives being considerably more cumbersome.)
P.S. I got rid of your fseek call, because I wasn't sure what it was there for and it confused my testing. You can put it back in if you need it.
I have a file like this one:
1234 Homer 18.5
1223 Bart 25.5
9341 Lisa 30.0
3420 Marge 28.4
8730 Abram 26.7
1876 Barns 27.8
1342 Smiters 23.0
7654 Milhouse 29.7
How can I get the first part (for example 1234) of each line?
And how can I get the name (for example Homer) of each line?
I wrote this code below:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
int main()
{
char ch[25];
int i, num;
FILE *fp;
fp = fopen("studenti.txt","r"); // read mode
if( fp == NULL )
{
perror("Error while opening the file.\n");
exit(EXIT_FAILURE);
}
printf("The contents of numeri.txt file are :\n");
for(i = 0; i < 25; i++){
while( ( ch[i] = fgetc(fp) ) != EOF ){
if(!(ch[i] >= 'A' && ch[i] <= 'Z') && !( ch[i] >= 'a' && ch[i] <='z')){printf("%c",ch[i]);}
}}
fclose(fp);
return 0;
}
How can I do that?
This is what the fscanf function is for:
int n;
char name[25];
float x;
FILE* fp = ...
while (fscanf(fp, "%d%24s%f", &n, name, &x) == 3) {
// Do something with the data you just read:
printf("int=%d name='%s' float=%f\n", n, name, x);
}
Several things to note about the above:
fscanf returns the number of items it read from the file. Keep calling fscanf while it returns 3.
%24s means "a string of up to 24 characters in length". name has 25 characters, because the last one is used for null termination.
int and float parameters are passed to fscanf with an ampersand, because fscanf needs a pointer. The string, on the other hand, takes no ampersand, because the array name already converts to a pointer.
If you are sure of the text format, the simplest may be to use fscanf.
int num;
char name[1024];
float grade;
fscanf(fp, "%d %s %f", &num, name, &grade);
Be aware that if the name is longer than 1023 characters, you will have a buffer overflow (a field width such as %1023s prevents this). If you are not sure of the format, you need to check the return value of fscanf (see the man page).
The only thing you ever read from files is bytes.
The first step is to check if the bytes are valid characters, and convert the bytes into characters if necessary. This is not necessarily simple. If the bytes are supposed to be ASCII, then you might only need to check if the bytes are valid ASCII (e.g. not less than or equal to zero and not above 0x7F; and possibly not control characters like "delete" or "vertical tab").
However, where names are involved it's extremely unlikely that ASCII is adequate. This means you want something like UTF-8. In that case, at a minimum you need to check if the bytes are valid (variable length) UTF-8 sequences; in addition to checking for invalid characters (like "delete" or "vertical tab").
It is more complicated if you simply don't know what the bytes are. There are ways to auto-detect the character encoding (but they are heuristics, not 100% reliable).
The second step is parsing. Parsing typically has 2 equally important goals. The first goal is to convert the characters into a more easily processed form - e.g. like maybe a structure with 3 fields (an integer, string and float) representing each line of characters. The second goal of parsing is reporting any errors to the user in an easily understood manner.
For example, maybe the first number on each line must be a 4 digit code (like "0123"); and if there's only 3 digits (like "123") then you want to generate an error (e.g. "ERROR: CourseID too short on line 5 of file 'foo.txt'") so that it's easy for the user to know exactly what the problem is and easy for the user to fix it.
Note: I don't think I've ever seen code that uses fscanf() that is close to (what I consider) acceptable. There's almost never useful/descriptive error messages.
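To illustrate the kind of error reporting described above, here is a minimal sketch that parses one line at a time with sscanf() and names the file and line number when something is wrong. The struct layout, the function name parse_record and the message wording are all assumptions for illustration:

```c
#include <stdio.h>

struct record {            /* hypothetical layout for one input line */
    int id;
    char name[25];
    float grade;
};

/*
 * Parse one line such as "1234 Homer 18.5".
 * Returns 0 on success; otherwise prints a message naming the file and
 * line number to stderr and returns -1.
 */
static int parse_record(const char *line, const char *file, int lineno,
                        struct record *out)
{
    int consumed = 0;
    int n = sscanf(line, "%d %24s %f %n",
                   &out->id, out->name, &out->grade, &consumed);

    if (n != 3) {
        fprintf(stderr, "ERROR: expected \"id name grade\" on line %d of file '%s'\n",
                lineno, file);
        return -1;
    }
    if (line[consumed] != '\0') {   /* something follows the grade */
        fprintf(stderr, "ERROR: unexpected text after grade on line %d of file '%s'\n",
                lineno, file);
        return -1;
    }
    return 0;
}
```

A caller would read each line with fgets(), keep a running line counter, and pass both to parse_record(). Finer-grained checks, such as the length of the first field, would need the line to be inspected before or instead of the %d conversion.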
I am getting the user to input 4 numbers. They can be input: 1 2 3 4 or 1234 or 1 2 34 , etc. I am currently using
int array[4];
scanf("%1x%1x%1x%1x", &array[0], &array[1], &array[2], &array[3]);
However, I want to display an error if the user inputs too many numbers: 12345 or 1 2 3 4 5 or 1 2 345 , etc.
How can I do this?
I am very new to C, so please explain as much as possible.
Thanks for your help.
What I have now tried to do is:
char line[101];
printf("Please input");
fgets(line, 101, stdin);
if (strlen(line)>5)
{
printf("Input is too large");
}
else
{
array[0]=line[0]-'0'; array[1]=line[1]-'0'; array[2]=line[2]-'0'; array[3]=line[3]-'0';
printf("%d%d%d%d", array[0], array[1], array[2], array[3]);
}
Is this a sensible and acceptable way? It compiles and appears to work in Visual Studio. Will it compile and run as standard C?
OP is on the right track, but needs some adjustments to deal with errors.
The current approach, using scanf(), can detect problems but cannot recover from them well. Instead, use a fgets()/sscanf() combination.
char line[101];
if (fgets(line, sizeof line, stdin) == NULL) HandleEOForIOError();
unsigned arr[4];
char ch; /* %c stores into a char */
int cnt = sscanf(line, "%1x%1x%1x%1x %c", &arr[0], &arr[1], &arr[2],&arr[3],&ch);
if (cnt == 4) JustRight();
if (cnt < 4) Handle_TooFew();
if (cnt > 4) Handle_TooMany(); // cnt == 5
ch catches any lurking non-whitespace char after the 4 numbers.
Use %1u if looking for 1 decimal digit into an unsigned.
Use %1d if looking for 1 decimal digit into an int.
OP's second approach, array[0]=line[0]-'0'; ..., is not bad, but has some shortcomings. It does not perform good error checking (for non-numeric input), nor does it handle hexadecimal digits like the first approach. Further, it does not allow for leading or interspersed spaces.
Your question might be operating system specific. I am assuming it could be Linux.
You could first read an entire line with getline(3) (or readline(3), or even fgets(3) if you are willing to set an upper limit on your input line size), then parse that line (e.g. with sscanf(3), using the %n format specifier). Don't forget to test the result of sscanf (the number of items read).
So perhaps something like
int a=0,b=0,c=0,d=0;
char* line=NULL;
size_t linesize=0;
int lastpos= -1;
ssize_t linelen=getline(&line,&linesize,stdin);
if (linelen<0) { perror("getline"); exit(EXIT_FAILURE); };
int nbscanned=sscanf(line," %1d%1d%1d%1d %n", &a,&b,&c,&d,&lastpos);
if (nbscanned>=4 && lastpos==linelen) {
// be happy
do_something_with(a,b,c,d);
}
else {
// be unhappy
fprintf(stderr, "wrong input line %s\n", line);
exit(EXIT_FAILURE);
}
free(line); line=NULL;
And once you have the entire line, you could parse it by other means like successive calls of strtol(3).
Then, the issue is what happens if the stdin has more than one line. I cannot guess what you want in that case. Maybe feof(3) is relevant.
I believe that my solution might not be Linux specific, but I don't know. It probably should work on Posix 2008 compliant operating systems.
Be careful about the result of sscanf when using a %n conversion specification. The man page says that the standards may be contradictory on that corner case.
If your operating system is not Posix compliant (e.g. Windows), then you should find another way. If you are willing to limit the line size to e.g. 128, you might code
char line[128];
memset (line, 0, sizeof(line));
fgets(line, sizeof(line), stdin);
int linelen = (int)strlen(line); /* ssize_t is a Posix type, so use int here */
then append the sscanf and following code from the previous (i.e. first) code chunk (but without the last line calling free(line)).
What you are trying to get is 4 digits with or without spaces between them. For that, you can take a string as input and then check that string character by character and count the number of digits(and spaces and other characters) in the string and perform the desired action/ display the required message.
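A minimal sketch of that character-by-character check, assuming the line has already been read into a string (for example with fgets()) and that only decimal digits are wanted. The function name read_four_digits and its return codes are made up for illustration:

```c
#include <ctype.h>

/*
 * Hypothetical helper: extract exactly four decimal digits from line,
 * allowing whitespace (including the trailing newline) around them.
 * Returns 0 on success, 1 if there are too few digits, 2 if there are
 * too many, and 3 if any other character is found.
 */
static int read_four_digits(const char *line, int digits[4])
{
    int count = 0;

    for (const char *p = line; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (isdigit(c)) {
            if (count == 4)
                return 2;               /* a fifth digit: too many */
            digits[count++] = c - '0';
        } else if (!isspace(c)) {
            return 3;                   /* neither a digit nor whitespace */
        }
    }
    return count == 4 ? 0 : 1;
}
```

Each return code can then be mapped to its own message, so that 12345 and 1 2 3 are reported differently.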
You can't do that with scanf. The problem is that there are ways to make scanf search for something after the 4 numbers, but all of them will just sit there and wait for more user input if the user does NOT enter more. So you'd need to use fgets() and parse the string to do that (not gets(), which cannot limit the input length and was removed from the language in C11).
It would probably be easier for you to change your program, so that you ask for one number at a time - then you ask 4 times, and you're done with it, so something along these lines, in pseudo code:
i = 0
while i < 4
ask for number
scanf number and save in array at index i
i = i + 1
E.g
#include <stdio.h>
int main(void){
unsigned int array[4]; /* %x expects a pointer to unsigned int */
int ch;
size_t i, size = sizeof(array)/sizeof(*array);//4
i = 0;
while(i < size){
if(1!=scanf("%1x", &array[i])){
//printf("invalid input");
scanf("%*[^0123456789abcdefABCDEF]");//or "%*[^0-9A-Fa-f]"
} else {
++i;
}
}
if('\n' != (ch = getchar())){
printf("Extra input !\n");
scanf("%*[^\n]");//remove extra input
}
for(i=0;i<size;++i){
printf("%x", array[i]);
}
printf("\n");
return 0;
}
I want to read hex numbers from a text file into an unsigned integer so that I can execute machine instructions. It's just a simulation type thing that looks inside the text file and, according to the values and the corresponding instructions, outputs the new values in the registers.
For example, the instructions would be:
1RXY -> Save register R with value in memory address XY
2RXY -> Save register R with value XY
BRXY -> Jump to register R if xy is this and that etc..
ARXY -> AND register R with value at memory address XY
The text file contains something like this, each on a new line (in hexadecimal):
120F
B007
290B
My problem is copying each individual instruction into an unsigned integer...how do I do this?
#include <stdio.h>
int main(){
FILE *f;
unsigned int num[80];
f=fopen("values.txt","r");
if (f==NULL){
printf("file doesnt exist?!");
}
int i=0;
while (fscanf(f,"%x",num[i]) != EOF){
fscanf(f,"%x",num[i]);
i++;
}
fclose(f);
printf("%x",num[0]);
}
You're on the right track. Here are the problems I saw:
You need to exit if fopen() returns NULL - you're printing an error message but then continuing.
Your loop should terminate if i >= 80, so you don't read more integers than you have space for.
You need to pass the address of num[i], not the value, to fscanf.
You're calling fscanf() twice in the loop, which means you're throwing away half of your values without storing them.
Here's what it looks like with those issues fixed:
#include <stdio.h>
int main() {
FILE *f;
unsigned int num[80];
int i=0;
int rv;
int num_values;
f=fopen("values.txt","r");
if (f==NULL){
printf("file doesnt exist?!\n");
return 1;
}
while (i < 80) {
rv = fscanf(f, "%x", &num[i]);
if (rv != 1)
break;
i++;
}
fclose(f);
num_values = i;
if (i >= 80)
{
printf("Warning: Stopped reading input due to input too long.\n");
}
else if (rv != EOF)
{
printf("Warning: Stopped reading input due to bad value.\n");
}
else
{
printf("Reached end of input.\n");
}
printf("Successfully read %d values:\n", num_values);
for (i = 0; i < num_values; i++)
{
printf("\t%x\n", num[i]);
}
return 0;
}
You can also use the function strtol(). If you use a base of 16 it will convert your hex string value to an int/long.
errno = 0;
my_int = strtol(my_str, NULL, 16);
/* check errno */
Edit: One other note: various static analysis tools may flag things like atoi() and scanf() as unsafe. atoi() is considered obsolete because it does not check for errors the way strtol() does. scanf(), on the other hand, can cause a buffer overflow of sorts, since it does not check the type of the pointer passed to it. For instance, you could give scanf() a pointer to a short where the value being read is actually a long... and boom.
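Spelled out, the strtol() call with its error checks might look like the sketch below. The wrapper name parse_hex is hypothetical, and accepting one trailing newline (as left behind by fgets()) is an assumption about the input:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a whole string as a hexadecimal number.
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 */
static int parse_hex(const char *s, long *out)
{
    char *end;

    errno = 0;
    long v = strtol(s, &end, 16);

    if (end == s)
        return -1;          /* no hex digits at all */
    if (errno == ERANGE)
        return -1;          /* value does not fit in a long */
    if (*end == '\n')
        end++;              /* tolerate the newline kept by fgets() */
    if (*end != '\0')
        return -1;          /* trailing junk such as "12G4" */

    *out = v;
    return 0;
}
```

Checking the end pointer as well as errno matters because strtol() stops quietly at the first character it cannot convert.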
You're reading two numbers into each element of your array (so you lose half of them as you overwrite them). Try using just
while (i < 80 && fscanf(f,"%x",&num[i]) == 1)
i++;
for your loop. Comparing against 1 rather than EOF also stops the loop on input that is not a valid hex number.
edit
you're also missing the '&' to get the address of the array element, so you're passing a random garbage pointer to scanf and probably crashing. The -Wall option is your friend.
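In this case, all of your input is upper case hex. Note, however, that the case is not the problem: for the scanf family, %x and %X behave identically and both accept upper and lower case hex digits.
So changing %x to %X will not change the result; the missing '&' and the doubled fscanf call are what need fixing.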
Do you want each of the lines (each 4 characters long) separated in 4 different array elements? If you do, I'd try this:
/* read the line */
/* fgets(buf, ...) */
/* check it's correct, mind the '\n' */
/* ... strlen ... isxdigit ... */
/* put each separate input digit in a separate array member */
num[i++] = convert_xdigit_to_int(buf[j++]);
Where the function convert_xdigit_to_int() simply converts '0' (the character) to 0 (an int), '1' to 1, '2' to 2, ... '9' to 9, 'a' or 'A' to 10, ...
Of course that pseudo-code is inside a loop that executes until the file runs out or the array gets filled. Maybe putting the fgets() as the condition for a while(...)
while(/*there is space in the array && */ fgets(...)) {
}
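One possible shape for the convert_xdigit_to_int() function mentioned above is sketched here. Returning -1 for characters that are not hex digits is an assumption, since the answer does not say what should happen in that case:

```c
#include <ctype.h>

/* Returns 0..15 for a hex digit character, or -1 for anything else. */
static int convert_xdigit_to_int(char c)
{
    unsigned char u = (unsigned char)c;

    if (isdigit(u))
        return u - '0';                 /* '0'..'9' -> 0..9 */
    if (isxdigit(u))
        return tolower(u) - 'a' + 10;   /* 'a'..'f' or 'A'..'F' -> 10..15 */
    return -1;
}
```

If the line has already been validated with isxdigit(), the -1 case never occurs, but keeping it makes the function safe to call on unchecked input.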