Memcpy and splitting up a string from a pointer array

I am attempting to split up a string allocated to a pointer array and assign it into a matrix array. The number of characters that get assigned to each entry depends on the number (bytes) entered by the user, so I can't use a function like strtok() to split up the string, since I don't have an actual delimiter.
Anyways, if the number of bytes entered are 1, I can successfully fill my "matrix" from start to finish.
My issue comes when the input is either 2 or 4. For some reason, the code skips over the first character in my original pointer array and starts at the second. My suspicion was that memcpy() skips over the first char because it is stored at the pointer itself. I thought this might be the case since technically pointer arrays are not arrays but pointers to arrays, but that wouldn't really explain why the first char gets stored when the input byte count is 1.
Below is a little snippet of my code, which includes the dynamic array allocation, and use of memcpy().
Here is where I allocated the string and read in from a file:
char * fileArray = (char*)malloc(size*sizeof(char));
if (fileArray== NULL){
printf("NULL");
return 1;
}
fgets(fileArray, size+1, fp);
After a few lines in between, which calculated the amount of columns I would have to allocate for the matrix, I tried using memcpy.
char maTrix[numOfCols][bytes];
if (bytes == 1){
for (i = 0; i<numOfCols; i++) {
memcpy(&maTrix[i], &fileArray[i], bytes);
}
}
else if (bytes == 2 || bytes == 4) {
for (i = 0; i < numOfCols; i++) {
int k = i * bytes;
int p = k + bytes;
while (k < p) {
memcpy(&maTrix[i], &fileArray[k], bytes);
k++;
}
}
}
If my original theory is right, how would I go about correcting this issue?
The input file I am using contains something like this:
The highest forms of understanding we can achieve are laughter and human compass
ion. Richard Feynman
To make clearer what I am trying to do: I basically want to split the total number of characters into columns of n bytes (read: chars) each. I have padded the number of characters so that the string length is always divisible by n.
I'm looking for an output of something like this.
if bytes == 2:
maTrix[0][0] = 'T'
maTrix[0][1] = 'h'
maTrix[1][0] = 'e'
maTrix[1][1] = ' '
maTrix[2][0] = 'h'
And so on until the whole matrix is filled.
Instead I get:
maTrix[0][0] = 'h'
maTrix[0][1] = 'e'
maTrix[1][0] = ' '
maTrix[1][1] = 'h'
maTrix[2][0] = 'i'
The same is expected for an input of 4 bytes, just with fewer columns and 4 rows.
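The shifted output can be traced to the inner while loop rather than to memcpy() itself: every pass copies bytes characters into the same row, starting one character later each time, so the last pass (k = i * bytes + bytes - 1) is the one that sticks. For bytes == 2, row 0 ends up holding fileArray[1] and fileArray[2], which is exactly the 'h', 'e' shown above. A minimal sketch of one copy per row follows; split_into_rows is a hypothetical helper name, and the matrix is passed as a flat buffer on the assumption that its rows are stored back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `rows` chunks of `bytes` characters each from
 * `src` into `dst`, which is treated as a rows x bytes matrix stored
 * row by row. Assumes src holds at least rows * bytes characters. */
void split_into_rows(char *dst, const char *src, size_t rows, size_t bytes)
{
    assert(dst != NULL && src != NULL && bytes > 0);
    for (size_t i = 0; i < rows; i++) {
        /* One memcpy per row, starting at offset i * bytes. */
        memcpy(dst + i * bytes, src + i * bytes, bytes);
    }
}
```

With the declarations from the question this would be called as split_into_rows(&maTrix[0][0], fileArray, numOfCols, bytes);. Separately, fgets(fileArray, size+1, fp) may write size+1 bytes, so the buffer would need to be allocated with size+1 bytes as well.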

Related

Do arrays end with NULL in C programming?

I am a beginner to C and I was asked to calculate the size of an array without using the sizeof operator. So I tried out this code, but it only works for an odd number of elements. Do all arrays end with NULL, just like strings?
#include <stdio.h>
void main()
{
int a[] = {1,2,3,4,5,6,7,8,9};
int size = 0;
for (int i = 0; a[i] != '\0'; i++)
{
size++;
}
printf("size=%d\n", size);
}
No, in general, there is no default sentinel character for arrays.
As a special case, a char array that ends with a null terminator (ASCII value 0) is called a string. However, that's a special case, and not the standard.
> So I tried out this code, but it only works for odd number of elements.
Try your code with this array -
int a[] = {1,2,0,4,5,6,7,8,9}; /* 3 replaced with 0 */
and you will find the output will be size=2. Why?
Because of the for loop condition: a[i] != '\0'.
So, what happens when the for loop condition a[i] != '\0' is evaluated?
This '\0' is an integer character constant and its type is int. It is the same as 0. When a[i] is 0, the condition becomes false and the loop exits.
In your program, no element of array a has the value 0, so the condition is true for every element, the loop keeps iterating, and your program ends up reading past the end of the array, which is undefined behaviour.
> Do all arrays end with NULL just like string.
The answer is no. In C, neither arrays nor strings end with NULL; rather, a string is a one-dimensional array of characters terminated by, and including, the first null character '\0'.
To calculate size of array without using sizeof, what you need is total number of bytes consumed by array and size (in bytes) of type of elements of array. Once you have this information, you can simply divide the total number of bytes by size of an element of array.
#include <stdio.h>
#include <stddef.h>
int main (void) {
int a[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
ptrdiff_t size = ((char *)(&a + 1) - (char *)&a) / ((char *)(a + 1) - (char *)a);
printf("size = %td\n", size);
return 0;
}
Output:
# ./a.out
size = 9
Additional:
'\0' and NULL are not the same: '\0' is a character constant with the value 0, while NULL is a null pointer constant.

Function to Split a String into Letters and Digits in C

I'm pretty new to C, and I'm trying to write a function that takes a user input RAM size in B, kB, mB, or gB, and determines the address length. My test program is as follows:
int bitLength(char input[6]) {
char nums[4];
char letters[2];
for(int i = 0; i < (strlen(input)-1); i++){
if(isdigit(input[i])){
memmove(&nums[i], &input[i], 1);
} else {
//memmove(&letters[i], &input[i], 1);
}
}
int numsInt = atoi(nums);
int numExponent = log10(numsInt)/log10(2);
printf("%s\n", nums);
printf("%s\n", letters);
printf("%d", numExponent);
return numExponent;
}
This works correctly as it is, but only because I have that one line commented out. When I try to alter the 'letters' character array with that line, it changes the 'nums' character array to '5m2'
My string input is '512mB'
I need the letters to be able to tell if the user input is in B, kB, mB, or gB.
I am confused as to why the commented out line alters the 'nums' array.
Thank you.
In your input 512mB, "mB" is not a digit and is supposed to be handled by the commented-out code. When those characters are handled, i is 3 and 4. But because the length of letters is only 2, memmove(&letters[i], &input[i], 1); accesses letters[i] outside the bounds of the array, which is undefined behaviour. In this case it ends up writing into the memory of the nums array.
To fix it, keep a separate index for letters. Better still, keep one for both nums and letters, since i is the index into input.
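A minimal sketch of the separate-index idea, assuming the input carries no trailing newline. The helper name split_digits_letters and its capacity parameters are hypothetical, not part of the original code; characters that do not fit in a buffer are silently dropped here, which a real parser would probably report as an error.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: split `input` (e.g. "512mB") into its digit and
 * non-digit characters. Each output buffer has its own write index and its
 * own capacity, so neither can overflow into the other. Both outputs are
 * always null-terminated; characters that do not fit are dropped. */
void split_digits_letters(const char *input,
                          char *nums, size_t nums_cap,
                          char *letters, size_t letters_cap)
{
    assert(input != NULL && nums_cap > 0 && letters_cap > 0);
    size_t n = 0; /* next free slot in nums */
    size_t l = 0; /* next free slot in letters */

    for (size_t i = 0; input[i] != '\0'; i++) {
        if (isdigit((unsigned char)input[i])) {
            if (n + 1 < nums_cap)
                nums[n++] = input[i];
        } else if (l + 1 < letters_cap) {
            letters[l++] = input[i];
        }
    }
    nums[n] = '\0';
    letters[l] = '\0';
}
```

Called with char nums[5], letters[3]; and the input "512mB", this leaves "512" in nums and "mB" in letters.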
There are several problems in your code. @MarkSolus has already pointed out that you access letters out-of-bounds because you are using i as the index and i can be more than 1 when you do the memmove.
In this answer I'll address some of the other problems.
string size and termination
Strings in C need a zero-termination. Therefore arrays must be 1 larger than the string you expect to store in the array. So
char nums[4]; // Can only hold a 3 char string
char letters[2]; // Can only hold a 1 char string
Most likely you want to increase both arrays by 1.
Further, your code never adds the zero-termination. So your strings are invalid.
You need code like:
nums[some_index] = '\0'; // Add zero-termination
Alternatively you can start by initializing the whole array to zero. Like:
char nums[5] = {0};
char letters[3] = {0};
Missing bounds checks
Your loop is a for-loop using strlen as the stop condition. Now what would happen if I gave the input "123456789BBBBBBBB"? Well, the loop would go on and i would increment to values ..., 5, 6, 7, ... Then you would index the arrays with a value bigger than the array size, i.e. an out-of-bounds access (which is real bad).
You need to make sure you never access the array out-of-bounds.
No format check
Now what if I gave an input without any digits, e.g. "HelloWorld"? In this case nothing would be written to nums, so it will be uninitialized when used in atoi(nums). Again - real bad.
Further, there should be a check to make sure that the non-digit input is one of B, kB, mB, or gB.
Performance
This is not that important but... using memmove for copy of a single character is slow. Just assign directly.
memmove(&nums[i], &input[i], 1); ---> nums[i] = input[i];
How to fix
There are many, many different ways to fix the code. Below is a simple solution. It's not the best way but it's done like this to keep the code simple:
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#define DIGIT_LEN 4
#define FORMAT_LEN 2
int bitLength(char *input)
{
char nums[DIGIT_LEN + 1] = {0}; // Max allowed number is 9999
char letters[FORMAT_LEN + 1] = {0}; // Allow at max two non-digit chars
if (input == NULL) exit(1); // error - illegal input
if (!isdigit((unsigned char)input[0])) exit(1); // error - input must start with a digit
// parse digits (at max 4 digits)
int i = 0;
while(i < DIGIT_LEN && isdigit((unsigned char)input[i]))
{
nums[i] = input[i];
++i;
}
// parse memory format, i.e. rest of string must be one of B, kB, mB, gB
if ((strcmp(&input[i], "B") != 0) &&
(strcmp(&input[i], "kB") != 0) &&
(strcmp(&input[i], "mB") != 0) &&
(strcmp(&input[i], "gB") != 0))
{
// error - illegal input
exit(1);
}
strcpy(letters, &input[i]);
// Now nums and letters are ready for further processing
...
...
}

C: array of pointers pointing to the same value

I'm writing an IP forwarding program, and I need to splice the following routing table into a char* array
128.15.0.0 255.255.0.0 177.14.23.1
137.34.0.0 255.255.0.0 206.15.7.2
137.34.128.0 255.255.192.0 138.27.4.3
137.34.0.0 255.255.224.0 139.34.12.4
201.17.34.0 255.255.255.0 192.56.4.5
27.19.54.0 255.255.255.0 137.7.5.6
0.0.0.0 0.0.0.0 142.45.9.7
I have declared a two-dimensional array of char* pointers like so: char *routing[128][128];. Then I loop through each line and build a char temp[128] array, which then gets assigned to routing as shown below.
In other words, each pointer in the routing array will point to an IP address.
char *routing[128][128];
char line[128], temp[128];
int i = 0, j = 0, token_count = 0, line_count = 0;
while(fgets(line, sizeof(line), routing_table) != NULL){
// printf("%s\n", line);
token_count = 0;
for (i = 0; i < strlen(line); i++){
if (line[i] != ' ' && token_count == 2){
for (j = 0; j < 128; j++){
if (line[i+j] == ' ')
break;
temp[j] = line[i+j];
}
i += j;
routing[line_count][token_count] = temp; //ASSIGNMENT OCCURS HERE
token_count++;
}
else if (line[i] != ' ' && token_count == 1){
for (j = 0; j < 128; j++){
if (line[i+j] == ' ')
break;
temp[j] = line[i+j];
}
i += j;
routing[line_count][token_count] = temp;
token_count++;
}
else if (line[i] != ' ' && token_count == 0){
for (j = 0; j < 128; j++){
if (line[i+j] == ' ')
break;
temp[j] = line[i+j];
}
i += j;
routing[line_count][token_count] = temp;
token_count++;
}
}
line_count++;
}
The problem is that every time an assignment happens, it updates the entire array. So when I print out the array at the end, I get the last element repeated.
142.45.9.7
142.45.9.7
142.45.9.7
...
142.45.9.7
142.45.9.7
I figured I needed to dereference temp but that threw a warning and didn't even fix it.
What do I have wrong syntactically? Why are all pointers pointing to the same char string?
Right near the end, just before where you increment token_count, you've got this line:
routing[line_count][token_count] = temp;
And that's your problem. temp is an array of 128 characters -- what we call a string in C, if there's a null terminator in there somewhere. You're copying each IP address string into it in turn, and then for each one, you assign temp's address to a slot in one of your big two dimensional array of pointers. All those pointers, but they're all pointing to the same actual buffer.
So at the end of the loop, they all point to the last string you copied into temp. That's not what you want, and if temp is a local in a function, then you're in real trouble.
What you want is either a two dimensional array of actual char arrays (16 chars each would suffice for a dotted quad and a null terminator) -- but I'm a little rusty on C and won't risk leading you astray about how to declare that -- or else you need to allocate new memory for each string in your existing two dimensional array. You'd do that like so, in many C implementations:
routing[line_count][token_count] = strdup(temp);
strdup() is a handy function that allocates sufficient memory for the string you give it, copies the string into that memory, and returns a pointer to it. But it comes from POSIX and was not part of the C standard before C23, so it can't be counted on in all cases. If you're writing for one compiler and it's there, you're all set (this is often the case). But if you have any concerns about portability, you can get the same effect portably with malloc(), strlen() and memcpy().
Portable or not, you're allocating memory now, and when you're done with the routing table, you'll need to free the allocated memory by calling free(routing[i][j]) on every one of those pointers that you got back from strdup() (or from some other, more portable function).
To that end, you should call this before allocating anything:
memset(routing, 0, sizeof(routing));
...where sizeof(routing) covers all 128 x 128 pointers in the array (sizeof(char) * 128 * 128 would clear only part of it, because each element is a char *, not a char). I would put the two dimensions in #defined constants that I used for the loop and for the declaration, if I really had to do this in C, and didn't allocate the whole mess dynamically. If you first set the whole thing to zeroes, then every pointer in there starts out as NULL. Then when you loop through it to free the memory, you can just loop through each "row" freeing each pointer until you hit a NULL, and then stop.
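The copy-then-free pattern described above could be sketched as follows. dup_string is a hypothetical stand-in for strdup() built only from standard library calls, and free_routing, ROWS and COLS are likewise names invented for this illustration; the sketch assumes the table was zeroed before it was filled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ROWS 128
#define COLS 128

/* Hypothetical portable stand-in for strdup(): returns a malloc'd copy of
 * `s`, or NULL if allocation fails. The caller must free() the result. */
char *dup_string(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s) + 1; /* include the terminator */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Free every string in the table. Each row is walked until its first NULL
 * entry, which relies on the table having been zeroed before it was filled.
 * Returns the number of strings freed. */
size_t free_routing(char *routing[ROWS][COLS])
{
    size_t freed = 0;
    for (size_t r = 0; r < ROWS; r++) {
        for (size_t c = 0; c < COLS && routing[r][c] != NULL; c++) {
            free(routing[r][c]);
            routing[r][c] = NULL;
            freed++;
        }
    }
    return freed;
}
```

In the parsing loop the assignment would then read routing[line_count][token_count] = dup_string(temp);, so that every slot owns its own copy instead of pointing at the shared temp buffer.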
Thinking about another approach, it is always error-prone to parse ip addresses of command line output, like ifconfig, route, ip etc etc, so why not use the programmatic way to obtain the routing table information? For example, on Linux, RTNETLINK is the standard way to manipulate route table:
http://man7.org/linux/man-pages/man7/rtnetlink.7.html
It gives you all the information in well defined struct.
On windows, you can use Win32 API to do the same:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365953(v=vs.85).aspx
routing[line_count][token_count] = temp; //ASSIGNMENT OCCURS HERE
The contents of the temp array may be different at each iteration.
The address of the temp array is constant throughout the execution of the program.
So you are obviously assigning the same value to every entry in the routing array.

reading strings to a char array and then getting the size of the strings

I'm working on a project and I am stumped on this part.
I need to read words from stdin and place them in a char array, and use an array of pointers to point to each word, since they will be jagged. numwords is an int read in beforehand, representing the number of words.
char words[10000];
char *wordp[2000];
The problem is that I can only use the pointers to add the words. I can no longer use [] to help.
*wordp = words; //set the first pointer to the beginning of the char array.
while (t < numwords){
scanf("%s", *(wordp + t)) //this is the part I dont know
wordp = words + charcounter; //charcounter is the num of chars in the prev word
t++;
}
for(int i = 0;words+i != '\n';i++){
charcounter++;
}
any help would be great I am so confused when it comes to pointers and arrays.
Your code will be much more manageable if you use an additional pointer
into the buffer and increment that directly. That way you won't have to do any
mental math. You also need to advance that pointer yourself before
reading in the next string; scanf doesn't move it for you.
char buffer[10000];
char* words[200];
int number_of_words = 200;
int current_word_index = 0;
// This is what we are going to use to write to the buffer
char* current_buffer_ptr = buffer;
// Zero the buffer (a local array is not initialized automatically)
for (int i = 0; i < 10000; i++)
buffer[i] = '\0';
while (current_word_index < number_of_words) {
// Store a pointer to the current word before doing anything to it
words[current_word_index] = current_buffer_ptr;
// Read the word into the buffer (stop if there is no more input)
if (scanf("%s", current_buffer_ptr) != 1)
break;
// NOTE: The above call could also be written
// scanf("%s", words[current_word_index]);
// Move the buffer pointer to the '\0' that scanf wrote after the word.
while (*current_buffer_ptr != '\0')
current_buffer_ptr++;
// Step past that '\0' so the next word doesn't overwrite it
current_buffer_ptr++;
current_word_index += 1;
}
What you want to do is relatively straightforward. You've got an array of 10,000 chars for storage, and 2000 pointers. So to start with you'll want to assign the first pointer to the start of the array:
wordp[0] = &words[0];
In pointer form this is:
*(wordp + 0) = words + 0;
I've used the zeros to show how it relates to the arrays. In general, to set each pointer to each element:
*(wordp + i) == wordp[i]
words + i == &words[i]
So all you need to do is keep track of where you are in the pointer array, and as long as you've assigned correctly, the pointer array will keep track of the position in your char array.
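The bookkeeping described above can be sketched without any [] indexing. To keep the example easy to test, this version takes its words from a string rather than from stdin, which is an assumption of the sketch and not what the question does; store_words and its parameters are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy the whitespace-separated words of `input` into
 * `storage` back to back (each followed by '\0') and record where each word
 * starts in `wordp`. Only pointer arithmetic is used, no [] indexing.
 * Returns the number of words stored. */
size_t store_words(const char *input, char *storage, size_t cap,
                   char **wordp, size_t max_words)
{
    assert(input != NULL && storage != NULL && wordp != NULL);
    char *out = storage;               /* next free byte in storage */
    char *const end = storage + cap;   /* one past the last usable byte */
    size_t count = 0;

    while (count < max_words) {
        while (isspace((unsigned char)*input))
            input++;                   /* skip separators */
        if (*input == '\0' || end - out < 2)
            break;                     /* no more words, or no room left */

        *(wordp + count) = out;        /* this word starts here */
        while (*input != '\0' && !isspace((unsigned char)*input)
               && out < end - 1)
            *out++ = *input++;
        *out++ = '\0';                 /* terminate; next word follows */
        count++;
    }
    return count;
}
```

Because the words sit back to back, the length of word i (for every word but the last) is *(wordp + i + 1) - *(wordp + i) - 1; strlen(*(wordp + i)) works for all of them.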

Why are the elements in my char* array two bytes instead of four?

I am new to C, so forgive me if this question is trivial. I am trying to reverse a string, in
my case the letters a,b,c,d. I place the characters in a char* array, and declare a buffer
which will hold the characters in the opposite order, d,c,b,a. I achieve this result using
pointer arithmetic, but to my understanding each element in a char* array is 4 bytes, so when I do the following: buffer[i] = *(char**)letters + 4; I am supposed to be pointing at the
second element in the array. Instead of pointing to the second element, it points to the third. After further examination I figured that if I increment the base pointer by two
each time I would get the desired results. Does this mean that each element in the array
is two bytes instead of 4? Here is the rest of my code:
#include <stdio.h>
int main(void)
{
char *letters[] = {"a","b","c","d"};
char *buffer[4];
int i, add = 6;
for( i = 0 ; i < 4 ; i++ )
{
buffer[i] = *(char**)letters + add;
add -= 2;
}
printf("The alphabet: ");
for(i = 0; i < 4; i++)
{
printf("%s",letters[i]);
}
printf("\n");
printf("The alphabet in reverse: ");
for(i = 0; i < 4; i++)
{
printf("%s",buffer[i]);
}
printf("\n");
}
You're not making an array of characters: you're making an array of character strings -- i.e., an array of pointers to arrays of characters. I am not going to rewrite the whole program for you of course, but I'll start out with two alternative possible correct declarations for your main data structure:
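char letters[] = {'a','b','c','d', 0};
char * letters = "abcd";
The first declares an array of five characters: a, b, c, d followed by 0, the traditional ending for a character string in C. The second declares a pointer to a string literal holding the same five characters, which must not be modified.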
Another thing: rather than making assumptions about the size of things, use the language to tell you. For instance:
char *my_array[] = { "foo" , "bar" , "baz" , "bat" , } ;
// the size of an element of my_array
size_t my_array_element_size = sizeof(my_array[0]) ;
size_t alt_element_size = sizeof(*my_array) ; // an array decays to a pointer to its first element
// the number of elements in my_array
size_t my_array_element_cnt = sizeof(my_array) / sizeof(*my_array) ;
// the size of a char
size_t char_size = sizeof(*(my_array[0])) ; // size of a char
Another thing: understand your data structures (as noted above). You talk about chars, but your data structures are talking about strings. Your declarations:
char *letters[] = {"a","b","c","d"};
char *buffer[4];
get parsed as follows:
letters is an array of pointers to char (which happen to be nul-terminated C-style strings), and it's initialized with 4 elements.
Like letters, buffer is an array of 4 pointers to char, but uninitialized.
You are not actually dealing individual chars anywhere, even in the printf() statements: the %s specifier says the argument is a nul-terminated string. Rather, you're dealing with strings (aka pointers to char) and arrays of the same.
An easier way:
#include <stdio.h>
int main(void)
{
char *letters[] = { "a" , "b" , "c" , "d" , } ;
size_t letter_cnt = sizeof(letters)/sizeof(*letters) ;
char *buffer[sizeof(letters)/sizeof(*letters)] ;
for ( size_t i=0 , j=letter_cnt ; i < letter_cnt ; ++i )
{
buffer[--j] = letters[i] ;
}
printf("The alphabet: ");
for( int i = 0 ; i < letter_cnt ; ++i )
{
printf("%s",letters[i]);
}
printf("\n");
printf("The alphabet in reverse: ");
for( int i=0 ; i < letter_cnt ; i++ )
{
printf("%s",buffer[i]);
}
printf("\n");
}
BTW, is this homework?
This is a case of operator precedence. In buffer[i] = *(char**)letters + add;, the dereference (*) binds more tightly than +, so the expression is equivalent to (*(char**)letters) + add. The first part is the value of the first element of your array, that is, a pointer to the string "a". A string literal automatically gets a null byte appended, so this points at the two bytes 'a', '\0'. It happens that the compiler placed all four strings immediately after each other in memory, so stepping past the end of one string lands in the next: the bytes are a\0b\0c\0d\0. Notice that each letter is 2 bytes after the previous one. That is the only reason an increment of 2 appeared to work. Nothing guarantees this layout, and stepping past the end of a string literal is undefined behaviour, so you should never depend on it (it won't even work if you tried to re-reverse your other string). Instead, you need to put in parentheses to make sure the addition happens before the dereference. Pointer arithmetic on a char ** already advances in units of whole elements, so add must then count elements (3, 2, 1, 0) rather than bytes; you never multiply by the pointer size yourself. (As pointed out by Nicholas, you shouldn't assume the size of anything. If you do need the size of a pointer, get it from sizeof.)
buffer[i] = *((char**)letters + add);
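A small sketch of the parenthesised form in use, where the offset counts elements rather than bytes. reverse_pointers is a hypothetical helper written for this illustration, not code from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill `dst` with the `count` pointers of `src` in
 * reverse order. src + k already advances by k whole elements, so no
 * byte size appears anywhere. */
void reverse_pointers(char **dst, char **src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++) {
        /* Parentheses force the addition before the dereference. */
        *(dst + i) = *(src + (count - 1 - i));
    }
}
```

For the arrays in the question the call would be reverse_pointers(buffer, letters, 4);, after which buffer holds the pointers to "d", "c", "b", "a" in that order.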
char *letters[] = {"a","b","c","d"};
I think you didn't get the pointer arithmetic right. letters is an array of pointers, and adding 1 to it moves to the next element (the next string).
letters + 1 ; // Address of the second element, i.e., &letters[1], which points to "b"
char *letters[] = { "abc" , "def" } ;
(letters + 1) ; // Address of the second element, whose value points to the 'd' of "def"
*((*letters) + 1) ; // The second character of the first string, i.e., 'b'

Resources