Split character array into multiple parts - arrays

In C, how would I go about splitting a char array into multiple parts, then back into an array. I am looking to split into 10 parts. But to make sure when it's split, it's split at a space and not by the character count.
I would like to be able to split it into another array so I can just call the index for each of them. But I am rather new to C/C++. In Java, I assume I could create an Array and then call array[0]-array[9] to get the split values after the operation is complete.
I would have:
char *s1 = "An example array that can be present right here. With a lot more words than this. But this is just an example after all. So does it really matter currently?";
And would need to be split into 10 parts (doesn't need to be equal in length) but just in 10 semi-equal parts.

So there are two parts to this:
a function that can find the closest space character to the next split point
a function that finds the next wrap point and saves the resulting string
For the first function, we need to pass in the string, its size, the last wrap point, and the next potential wrap point. It needs to return the next actual wrap point. So the function will have a definition like this:
int wrappoint(const char *string, int size, int previous, int current);
To find the next wrap point we need to search for a space character from the current wrap point both forward (until the end of the string) and backward (until the previous wrap point):
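int before=-1, after=-1;
if(current > size) /* the guess can overshoot the end; never read past the terminator */
    current = size;
for(int i=current; i>previous; i--)
    if(string[i] == ' ')
        { before = i; break; }
for(int i=current; i<=size; i++)
    if(string[i] == ' ')
        { after = i; break; }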
At this point before and after will contain the indices of the nearest space characters, or -1 if a space was not found in that direction. Then we just need to return a valid next wrap point based on what we found:
if(before==-1 && after==-1)
return size;
else if(before==-1)
return after;
else if(after==-1)
return before;
else if(current-before < after-current)
return before;
else
return after;
For the second function, we can just pass the string and the number of parts. The function will dynamically allocate an array big enough to hold all the parts. Each array index will be a pointer to a dynamically allocated string. Then the function can return the array and it will be the caller's responsibility to free all that memory. So the function will have a definition like this:
char **split(const char *string, int parts) {
The first thing we need to do is find the length of the string and make an array to hold all the parts. We use calloc so that any indices we don't use will be set to NULL:
char **array = calloc(parts, sizeof(char *));
int size = strlen(string);
Then we have to loop through the string, calling the wrappoint function to find the start and end of each part, allocating space for that part and then copying that part into the array:
int previous = 0, current = 0;
for(int i = 0; i < parts && current < size; i++) {
current = wrappoint(string, size, previous, previous+size/parts);
array[i] = malloc(current-previous+1);
strncpy(array[i], string+previous, current-previous);
array[i][current-previous] = '\0';
previous = current+1;
}
Then we can call split, which returns an array containing the parts - but some of them might be NULL if there are fewer spaces in the string than the number of parts we asked for. We can display the array of parts like this:
for(int i=0; i<10 && array[i]; i++)
printf("%d: %s\n", i, array[i]);
And then when we are done we have to free all the memory that split allocated:
for(int i=0; i<10 && array[i]; i++)
free(array[i]);
free(array);
Here is the full code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Find the space nearest to `current`, looking back as far as `previous`
   and forward to the end of the string. Returns `size` if there is none. */
int wrappoint(const char *string, int size, int previous, int current) {
    int before=-1, after=-1;
    if(current > size) /* the guess can overshoot the end; never read past the terminator */
        current = size;
    for(int i=current; i>previous; i--)
        if(string[i] == ' ')
            { before = i; break; }
    for(int i=current; i<=size; i++)
        if(string[i] == ' ')
            { after = i; break; }
    if(before==-1 && after==-1)
        return size;
    else if(before==-1)
        return after;
    else if(after==-1)
        return before;
    else if(current-before < after-current)
        return before;
    else
        return after;
}
/* Split `string` into at most `parts` pieces, breaking only at spaces.
   Unused slots stay NULL; the caller frees each piece and the array. */
char **split(const char *string, int parts) {
    char **array = calloc(parts, sizeof(char *));
    if(array == NULL)
        return NULL;
    int size = strlen(string);
    int previous = 0, current = 0;
    for(int i = 0; i < parts && current < size; i++) {
        current = wrappoint(string, size, previous, previous+size/parts);
        array[i] = malloc(current-previous+1);
        if(array[i] == NULL) /* out of memory: stop, later slots stay NULL */
            break;
        strncpy(array[i], string+previous, current-previous);
        array[i][current-previous] = '\0';
        previous = current+1;
    }
    return array;
}
int main() {
    const char *string = "An example array that can be present right here. With a lot more words than this. But this is just an example after all. So does it really matter currently?";
    char **array = split(string,10);
    if(array == NULL)
        return 1;
    for(int i=0; i<10 && array[i]; i++)
        printf("%d: %s\n", i, array[i]);
    for(int i=0; i<10 && array[i]; i++)
        free(array[i]);
    free(array);
    return 0;
}
Try it at https://onlinegdb.com/uoDzM2toi
This example string:
An example array that can be present right here. With a lot more words than this. But this is just an example after all. So does it really matter currently?
Produces this output:
0: An example array
1: that can be present
2: right here. With
3: a lot more words
4: than this. But
5: this is just an
6: example after
7: all. So does it
8: really matter
9: currently?
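One edge case is worth noting. When the last slot is filled, the wrap point can land on a space before the end of the string, and the words after it are then left out. The sketch below is one possible variation, not part of the original answer: the names wrappoint_clamped and split_keep_tail are made up for illustration, the search is the same as above, and the only change is that the final slot always takes whatever text remains.

```c
#include <stdlib.h>
#include <string.h>

/* Same search as wrappoint above, with the guess clamped to the string length. */
static int wrappoint_clamped(const char *string, int size, int previous, int current)
{
    int before = -1, after = -1;
    if (current > size)
        current = size;
    for (int i = current; i > previous; i--)
        if (string[i] == ' ') { before = i; break; }
    for (int i = current; i < size; i++)
        if (string[i] == ' ') { after = i; break; }
    if (before == -1 && after == -1) return size;
    if (before == -1) return after;
    if (after == -1) return before;
    return (current - before < after - current) ? before : after;
}

/* Hypothetical variant of split: the last slot receives everything that is left. */
char **split_keep_tail(const char *string, int parts)
{
    if (parts <= 0)
        return NULL;
    char **array = calloc((size_t)parts, sizeof *array);
    if (array == NULL)
        return NULL;
    int size = (int)strlen(string);
    int previous = 0, current = 0;
    for (int i = 0; i < parts && current < size; i++) {
        /* On the final slot, skip the search and take the rest of the string. */
        current = (i == parts - 1)
            ? size
            : wrappoint_clamped(string, size, previous, previous + size / parts);
        size_t len = (size_t)(current - previous);
        array[i] = malloc(len + 1);
        if (array[i] == NULL) {
            /* Undo the partial work so the caller never sees a half-built result. */
            for (int k = 0; k < i; k++)
                free(array[k]);
            free(array);
            return NULL;
        }
        memcpy(array[i], string + previous, len);
        array[i][len] = '\0';
        previous = current + 1;
    }
    return array;
}
```

The result is displayed and freed exactly like the array returned by split. The trade-off is that the last part can end up noticeably longer than the others.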

Related

C: realloc works on Linux, but not on Windows

this is my first question on Stack Overflow, sorry if it's not well written.
I have a little problem. I wrote a program in C (I'm currently learning C, I am a newbie, my first language, don't say I should've learnt Python, please, because I'm doing just fine with C). So, I wrote this little program. It's an attempt of mine to implement a sorting algorithm (I made the algorithm myself, with no help or documentation, it's very inefficient I think, I was just fooling around, though I don't know whether the algorithm already exists or not). The only sorting algorithm I know is QuickSort.
In any case, here is the final program (has plenty of comments, to help me remember how it works if I'll ever revisit it):
// trying to implement my own sorting algorithm
// it works the following way:
// for an array of n integers, find the largest number,
// take it out of the array by deleting it, store it
// at the very end of the sorted array.
// Repeat until the original array is empty.
// If you need the original array, simply
// make a copy of it before sorting
/***************************************/
// second implementation
// same sorting algorithm
// main difference: the program automatically
// computes the number of numbers the user enters
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
int *sort(int *a, int n); // sort: the actual sorting function
char *read_line(char *str,int *num_of_chars); // read_line: reads input in string form
int *create_array(char *str, int n); // create_array: counts the num of integers entered and extracts them
// from the string the read_line function returns, forming an array
int size_of_array_to_be_sorted = 0; // of integers
int main(void)
{
int *array, i, *sorted_array, size = 3;
char *str = malloc(size + 1);
if (str == NULL)
{
printf("\nERROR: malloc failed for str.\nTerminating.\n");
exit(EXIT_FAILURE);
}
printf("Enter the numbers to be sorted: ");
str = read_line(str, &size);
array = create_array(str, size + 1);
sorted_array = sort(array, size_of_array_to_be_sorted);
printf("Sorted: ");
for (i = 0; i < size_of_array_to_be_sorted; i++)
printf("%d ", sorted_array[i]);
printf("\n\n");
return 0;
}
int *sort(int *a, int n)
{
int i, j, *p, *sorted_array, current_max;
sorted_array = malloc(n * (sizeof(int)));
if (sorted_array == NULL)
{
printf("ERROR: malloc failed in sort function.\nTerminating.\n");
exit(EXIT_FAILURE);
}
for (i = n - 1; i >= 0; i--) // repeat algorithm n times
{
current_max = a[0]; // initialize current_max with the first number in the array
p = a;
for (j = 0; j < n; j++) // find the largest integer in the array
if (current_max < a[j])
{
current_max = a[j];
p = (a + j); // make p point to the largest value found
}
*p = INT_MIN; // delete the largest value from the array
sorted_array[i] = current_max; // store the largest value at the end of the sorted_array
}
return sorted_array;
}
char *read_line(char *str, int *num_of_chars)
{
int i = 0; // num of chars initially
char ch, *str1 = str;
while ((ch = getchar()) != '\n')
{
str1[i++] = ch;
if (i == *num_of_chars) // gives str the possibility to
{ // dynamically increase size if needed
str1 = realloc(str, (*num_of_chars)++);
if (str1 == NULL)
{
printf("\nERROR: realloc failed in read_line.\nTerminating.\n");
exit(EXIT_FAILURE);
}
}
}
// at the end of the loop, str1 will contain the whole line
// of input, except for the new-line char. '\n' will be stored in ch
str1[i++] = ch;
str1[i] = '\0'; // store the null char at the end of the string
return str1;
}
int *create_array(char *str, int n)
{
int *array, i, j, k, num_of_ints = 0;
for (i = 0; i < n; i++) // computing number of numbers entered
if (str[i] == ' ' || str[i] == '\n')
num_of_ints++;
array = calloc((size_t) num_of_ints, sizeof(int)); // allocating necessary space for the array
if (array == NULL)
{
printf("\nERROR: calloc failed in create_array.\nTerminating.\n");
exit(EXIT_FAILURE);
}
k = 0;
i = 1; // populating the array
for (j = n - 1; j >= 0; j--)
{
switch (str[j])
{
case '0': case '1': case '2':
case '3': case '4': case '5':
case '6': case '7': case '8':
case '9': array[k] += ((str[j] - '0') * i);
i *= 10;
break;
case '-': array[k] = -array[k]; // added to support negative integers
default: i = 1;
if (str[j] == ' ' && (str[j - 1] >= '0' && str[j - 1] <= '9'))
/* only increment k
*right before a new integer
*/
k++;
break;
}
}
// the loop works in this way:
// it reads the str string from the end
// if it finds a digit, it will try to extract it from the
// string and store in array, by adding to one of the elements
// of array the current char - ASCII for '0', so that it actually gets a digit,
// times the position of that digit in the number,
// constructing the number in base 10: units have 1, tens 10, hundreds 100, and so on
// when it finds a char that's not a digit, it must be a space, so it resets i
// and increments k, to construct a new number in the next element of array
size_of_array_to_be_sorted = num_of_ints;
return array;
}
I've written everything myself, so if you think I use some bad methods or naive approaches or something, please tell me, in order for me to be able to correct them. Anyways, my problem is that I have these 'try to handle errors' if statements, after every call of malloc, calloc or realloc. I have a Linux machine and a Windows one. I wrote the program on the Linux one, which has 4GB of RAM. I wrote it, compiled with gcc, had to change a few things in order to make it work, and it runs flawlessly. I have no problem. I then copied it onto a USB drive and compiled it with mingw on my Windows machine, which has 8GB of RAM. I run it, and if I give it more than 3 2-digit integers, it displays
ERROR: realloc failed in read_line.
Terminating.
At least I know that the 'error handling' if statements work, but why does this happen? It's the same code, the machine has twice as much RAM, with most of it free, and it runs with no problem on Linux.
Does this mean that my code is not portable?
Is it something I don't do right?
Is the algorithm wrong?
Is the program very, very inefficient?
Sorry for the long question.
Thanks if you wanna answer it.
The line in question is:
str1 = realloc(str, (*num_of_chars)++);
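where *num_of_chars is the current size of str. Because you are using post-increment, the value passed to realloc is the old size; the variable is only incremented afterwards. So you haven't made str any bigger, but the code goes ahead and acts as if you had, writing past the end of the buffer. That is undefined behaviour: it can appear to work on one system and fail on another.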
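A possible way to restructure the reading loop is sketched below. It is not the asker's code: the name read_line_from, the FILE * parameter and the doubling growth policy are all choices made for this illustration. Besides computing the new size before calling realloc, it passes the most recent pointer to realloc (the original always passes str, which may already have been released by an earlier call), keeps one byte free for the terminating '\0', and stores the result of fgetc in an int so that EOF can be detected.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from `in` into a buffer that grows
   as needed. Returns NULL on allocation failure; the caller frees the result. */
char *read_line_from(FILE *in, size_t *len_out)
{
    size_t cap = 4, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int ch; /* int, not char, so that EOF can be told apart from real bytes */
    while ((ch = fgetc(in)) != EOF && ch != '\n') {
        if (len + 1 == cap) {              /* keep one byte for '\0' */
            size_t new_cap = cap * 2;      /* compute the NEW size first */
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);                 /* the old block is still valid here */
                return NULL;
            }
            buf = tmp;                     /* always continue with the latest pointer */
            cap = new_cap;
        }
        buf[len++] = (char)ch;
    }
    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

In the original program this would be called as read_line_from(stdin, &length). Note that, unlike the original read_line, this version does not store the trailing '\n', so create_array would need to count the last number some other way.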

C upper case to lower case

I am having an issue with lowercasing the words that are used as input. My program takes in words, sorts them alphabetically and removes duplicates. But I'd like to convert uppercase letters to lowercase first, so that words differing only in case count as the same word.
example: Apple changes to apple
my input:
./a.out Orange apple banana Apple banana
my output:
Apple
Orange
apple
banana
Here is what I am trying to achieve
output:
apple
banana
orange
Here is my code
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[]) {
int i, j, k, size;
size = argc -1;
char *key;
char* a[argc-1];
for (i = 2; i < argc; i++) {
key = argv[i];
j = i-1;
while (j >= 1 && strcmp(argv[j], key) > 0) {
argv[j+1] = argv[j];
j--;
}
argv[j+1] = key;
}
if (argc > 1){
for (i = 1; i < argc;){
puts(argv[i]);
while (argv[++i] != NULL && strcmp(argv[i - 1], argv[i] ) == 0)
continue;
}
}
return 0;
}
You have a list of words and you want to output them sorted, and only the unique ones. And you want to do it in a case insensitive fashion.
Get all the strings to the same case.
Sort the list of strings.
Don't output repeats.
C has no built in function to lower case a string, but it does have one to lower case characters: tolower, declared in ctype.h. So we write a function to lower case a whole string by iterating through it and lower casing each character.
void str_lower(char *str) {
    for( ; str[0] != '\0'; str++ ) {
        /* tolower expects an unsigned char value (or EOF) */
        str[0] = (char)tolower((unsigned char)str[0]);
    }
}
Then we need to sort. That's handled by the built in qsort function. To use it, you need to write a function that compares two strings and returns just like strcmp. In fact, your comparison function will just be a wrapper around strcmp to make qsort happy.
int compare_strings( const void *_a, const void *_b ) {
/* The arguments come in as void pointers to the strings
and must be cast. Best to do it early. */
const char **a = (const char **)_a;
const char **b = (const char **)_b;
/* Then because they're pointers to strings, they must
be dereferenced before being used as strings. */
return strcmp(*a, *b);
}
In order to handle any data type, the comparison function takes void pointers. They need to be cast back into char pointers. And it's not passed the string (char *) it's passed a pointer to the string (char **), again so it can handle any data type. So a and b need to be dereferenced. That's why strcmp(*a, *b).
Calling qsort means telling it the array you want to sort, the number of items, how big each element is, and the comparison function.
qsort( strings, (size_t)num_strings, sizeof(char*), compare_strings );
Get used to this sort of thing, you'll be using it a lot. It's how you work with generic lists in C.
The final piece is to output only unique strings. Since you have them sorted, you can simply check if the previous string is the same as the current string. The previous string is strings[i-1] BUT be sure not to try to check strings[-1]. There are two ways to handle that. The first is to skip the comparison when i < 1.
for( int i = 0; i < num_strings; i++ ) {
if( i < 1 || strcmp( strings[i], strings[i-1] ) != 0 ) {
puts(strings[i]);
}
}
Another way is to always output the first string and then start the loop from the second.
if( num_strings > 0 ) puts( strings[0] );
for( int i = 1; i < num_strings; i++ ) {
if( strcmp( strings[i], strings[i-1] ) != 0 ) {
puts(strings[i]);
}
}
This means some repeated code, but it simplifies the loop logic. This trade-off is worth it; complicated loops mean bugs. I botched the check on the first loop myself by writing if( i > 0 && strcmp ... ).
You'll notice I'm not working with argv... except I am. strings and num_strings are just a bit of bookkeeping so I didn't always have to remember to start with argv[1] or use argv+1 if I wanted to pass around the array of strings.
char **strings = argv + 1;
int num_strings = argc-1;
This avoids a whole host of off-by-one errors and reduces complexity.
I think you can put the pieces together from there.
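For reference, one way the three pieces might fit together is sketched below. The function name sort_unique_lower and the choice to compact the unique strings to the front of the array are made up for this illustration; the lower-casing, qsort call and duplicate check follow the steps described above.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Lower-case a string in place. */
static void str_lower(char *str)
{
    for (; *str != '\0'; str++)
        *str = (char)tolower((unsigned char)*str);
}

/* qsort hands us pointers to the array elements, i.e. pointers to char *. */
static int compare_strings(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Hypothetical helper: lower-cases and sorts `strings`, then moves the
   unique entries to the front. Returns how many unique entries there are. */
size_t sort_unique_lower(char **strings, size_t num_strings)
{
    if (num_strings == 0)
        return 0;
    for (size_t i = 0; i < num_strings; i++)
        str_lower(strings[i]);
    qsort(strings, num_strings, sizeof(char *), compare_strings);

    size_t kept = 1; /* the first string is always unique */
    for (size_t i = 1; i < num_strings; i++)
        if (strcmp(strings[i], strings[kept - 1]) != 0)
            strings[kept++] = strings[i];
    return kept;
}
```

From main this could be called as sort_unique_lower(argv + 1, (size_t)(argc - 1)), after which the first returned-count entries of argv + 1 can be printed with puts.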
There is a set of standard functions for checking and changing the type of characters in ctype.h. The one you are interested in is tolower(). You can #include <ctype.h> and then add a snippet like the following to pre-process your argv before doing the sorting:
for(i = 1; i < argc; i++) {
argv[i][0] = tolower(argv[i][0]);
}
That will only operate on the first character of each word. If you need to normalize the entire word:
for(i = 1; i < argc; i++) {
for(j = 0; argv[i][j]; j++) {
argv[i][j] = tolower(argv[i][j]);
}
}
Silly me, I was able to figure it out after looking at my code and realizing that I can do key[0] = tolower(key[0]);, which I had been doing before the pointer was pointing at the word.
for (i = 2; i < argc; i++) {
key = argv[i];
key[0] = tolower(key[0]);
j = i-1;
while (j >= 1 && strcmp(argv[j], key) > 0) {
argv[j+1] = argv[j];
j--;
}
argv[j+1] = key;
}
Which lower cases the first letter. And if I wanted to lower case all the letters, I would have used a for loop. Thank you everyone for your contribution. :)

Remove some elements from array and re-size array in C

Regards
I want to remove some elements from my array and resize it.
For example, my array is:
char get_res[6] = {0x32,0x32,0x34,0x16,0x00,0x00};
Now I want to remove the elements after 0x16, so my desired array is:
get_res[] = {0x32,0x32,0x34,0x16};
What is the solution?
You cannot resize arrays in C (unlike Python, for example). For real resizing, at least from an API user's point of view, use malloc, calloc, realloc, and free (realloc specifically).
Anyway, "resizing" an array can be imitated using
a delimiter; for example, a delimiter like 0xff could mark the end of the valid data in the array
Example:
#define DELIMITER 0xff
/* unsigned char, so that a stored 0xff really compares equal to DELIMITER */
void print_data(const unsigned char *data) {
    for (size_t i = 0; data[i] != DELIMITER; ++i)
        printf("%x", data[i]);
}
a member counter; count the number of valid data from the beginning of the array onward
Example:
size_t counter = 5;
void print_data(const unsigned char *data) {
    for (size_t i = 0; i < counter; ++i)
        printf("%x", data[i]);
}
Notes:
Use unsigned char for binary data. Plain char may be a signed type, in which case a byte such as 0xff is stored as a negative value and never compares equal to the delimiter.
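For the real resizing route mentioned at the start of this answer, a minimal sketch could look like the following. It assumes the data lives in a block obtained from malloc (a plain array such as get_res[6] cannot be passed to realloc), and the helper name truncate_after is made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: keeps everything up to and including the first
   `target` byte and shrinks the heap block to that size.
   `buf` must come from malloc/calloc/realloc; `*len` is updated. */
unsigned char *truncate_after(unsigned char *buf, size_t *len, unsigned char target)
{
    unsigned char *hit = memchr(buf, target, *len);
    if (hit == NULL)
        return buf; /* target not present: nothing to remove */

    size_t new_len = (size_t)(hit - buf) + 1;
    *len = new_len;

    /* If shrinking fails, the original block is still valid, so keep using it. */
    unsigned char *shrunk = realloc(buf, new_len);
    return shrunk != NULL ? shrunk : buf;
}
```

The caller must continue with the returned pointer, because realloc is allowed to move the block, and must eventually free it.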
There is no need to "remove" them. Just don't access them. Pretend like they don't exist. Same like in stacks, when you "pop" a value from the top of the stack, you just decrement the stack pointer.
Manipulating arrays in C isn't as easy as it is for vector in C++ or List in Java. There is no "remove element" in C. I mean that you have to do the job yourself, that is, create another array, copy only the elements you want to this new array, and free the memory occupied by the previous one (if it was dynamically allocated).
Can you do that? Do you want the code?
EDIT:
Try that. It's just a simple program that simulates the situation. Now, you have to see the example and adapt it to your code.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    char get_res[6] = {0x32,0x32,0x34,0x16,0x00,0x00};
    char target = 0x16;
    int i, length = 6; // or specify some way to get this number
    int pos = length - 1; // if the target is missing, keep everything
    for(i = 0; i < length; i++)
        if(get_res[i] == target) {
            pos = i;
            break;
        }
    pos = pos + 1; // number of elements to keep, including the target itself
    char *new_arr = malloc(pos);
    if(new_arr == NULL)
        return 1;
    for(i = 0; i < pos; i++) // copy only the elements we keep
        new_arr[i] = get_res[i];
    for(i = 0; i < pos; i++)
        printf("0x%02x ", (unsigned char)new_arr[i]);
    printf("\n");
    free(new_arr);
    return 0;
}

How to find an element in an array of structs in C?

I have to write a function that finds a product with given code from the given array. If product is found, a pointer to the corresponding array element is returned.
My main problem is that the given code should first be truncated to seven characters and only after that compared with array elements.
Would greatly appreciate your help.
struct product *find_product(struct product_array *pa, const char *code)
{
char *temp;
int i = 0;
while (*code) {
temp[i] = (*code);
code++;
i++;
if (i == 7)
break;
}
temp[i] = '\0';
for (int j = 0; j < pa->count; j++)
if (pa->arr[j].code == temp[i])
return &(pa->arr[j]);
}
Why don't you just use strncmp in a loop?
struct product *find_product(struct product_array *pa, const char *code)
{
for (size_t i = 0; i < pa->count; ++i)
{
if (strncmp(pa->arr[i].code, code, 7) == 0)
return &pa->arr[i];
}
return 0;
}
temp is a pointer which is uninitialized and you are dereferencing it which will lead to undefined behavior.
temp = malloc(size); // Allocate some memory size = 8 in your case
One more mistake I see is
if (pa->arr[j].code == temp[i]) // i is already indexing `\0`
should be
if (strcmp(pa->arr[j].code, temp) == 0) // strcmp returns 0 if both strings are the same
This code can completely be avoided if you can use strncmp()
As pointed out by others, you are using temp uninitialized and you are always comparing characters with '\0'.
You don't need a temp variable:
int strncmp ( const char * str1, const char * str2, size_t num );
Compare characters of two strings
Compares up to num characters of the
C string str1 to those of the C string str2.
/* Don't use magic numbers like 7 in the body of function */
#define PRODUCT_CODE_LEN 7
struct product *find_product(struct product_array *pa, const char *code)
{
for (int i = 0; i < pa->count; i++) {
if (strncmp(pa->arr[i].code, code, PRODUCT_CODE_LEN) == 0)
return &(pa->arr[i]);
}
return NULL; /* Not found */
}
When you write char* temp; you are just declaring an uninitialized pointer
In your case since you say that the code is truncated to 7 you could create a buffer
on the stack with place for the code
char temp[8];
Writing
temp[i] = (*code);
code++;
i++;
Can be simplified to:
temp[i++] = *code++;
In your loop
for (int j = 0; j < pa->count; j++)
if (pa->arr[j].code == temp[i])
return &(pa->arr[j]);
You are comparing the address of code with the character value of temp[i], and at that point temp[i] is always the terminating '\0' you have just written.
Instead what you want to do is compare what code points to and what temp contains:
for (int j = 0; j < pa->count; j++)
if (!strncmp(pa->arr[j].code, temp, 7))
return &(pa->arr[j]);
You should also return NULL; if nothing was found, seems you do not return anything.
Probably a good thing is also to make sure your temp[] always contains 7 characters.
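Putting these remarks together, the buffer-based version might look like the sketch below. The struct layouts are assumptions, since the question does not show the real definitions: code is taken to be a char array holding a null-terminated string, count is taken to be a size_t, and the in_stock field is only a placeholder.

```c
#include <stddef.h>
#include <string.h>

#define PRODUCT_CODE_LEN 7

/* Assumed layouts; the real definitions are not shown in the question. */
struct product {
    char code[PRODUCT_CODE_LEN + 1];
    int in_stock; /* placeholder field for illustration */
};

struct product_array {
    struct product *arr;
    size_t count;
};

struct product *find_product(struct product_array *pa, const char *code)
{
    /* Truncate the requested code to at most 7 characters plus '\0'. */
    char temp[PRODUCT_CODE_LEN + 1];
    size_t i = 0;
    while (i < PRODUCT_CODE_LEN && code[i] != '\0') {
        temp[i] = code[i];
        i++;
    }
    temp[i] = '\0';

    /* Compare string contents, not addresses. */
    for (size_t j = 0; j < pa->count; j++)
        if (strcmp(pa->arr[j].code, temp) == 0)
            return &pa->arr[j];
    return NULL; /* not found */
}
```

Because the truncated copy is compared with strcmp, a stored code must match the whole truncated string; a shorter prefix such as "XY" does not match a stored "XYZ".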

realloc() seems to affect already allocated memory

I am experiencing an issue where the invocation of realloc seems to modify the contents of another string, keyfile.
It's supposed to run through a null-terminated char* (keyfile), which contains just above 500 characters. The problem, however, is that the reallocation I perform in the while-loop seems to modify the contents of the keyfile.
I tried removing the dynamic reallocation with realloc and instead initialize the pointers in the for-loop with a size of 200*sizeof(int) instead. The problem remains, the keyfile string is modified during the (re)allocation of memory, and I have no idea why. I have confirmed this by printing the keyfile-string before and after both the malloc and realloc statements.
Note: The keyfile only contains the characters a-z, no digits, spaces, linebreaks or uppercase. Only a text of 26, lowercase letters.
int **getCharMap(const char *keyfile) {
char *alphabet = "abcdefghijklmnopqrstuvwxyz";
int **charmap = malloc(26*sizeof(int));
for (int i = 0; i < 26; i++) {
charmap[(int) alphabet[i]] = malloc(sizeof(int));
charmap[(int) alphabet[i]][0] = 0; // place a counter at index 0
}
int letter;
int count = 0;
unsigned char c = keyfile[count];
while (c != '\0') {
int arr_count = charmap[c][0];
arr_count++;
charmap[c] = realloc(charmap[c], (arr_count+1)*sizeof(int));
charmap[c][0] = arr_count;
charmap[c][arr_count] = count;
c = keyfile[++count];
}
// Just inspecting the results for debugging
printf("\nCHARMAP\n");
for (int i = 0; i < 26; i++) {
letter = (int) alphabet[i];
printf("%c: ", (char) letter);
int count = charmap[letter][0];
printf("%d", charmap[letter][0]);
if (count > 0) {
for (int j = 1; j < count+1; j++) {
printf(",%d", charmap[letter][j]);
}
}
printf("\n");
}
exit(0);
return charmap;
}
charmap[(int) alphabet[i]] = malloc(sizeof(int));
charmap[(int) alphabet[i]][0] = 0; // place a counter at index 0
You are writing beyond the end of your charmap array. So, you are invoking undefined behaviour and it's not surprising that you are seeing weird effects.
You are using the character codes as an index into the array, but they do not start at 0! They start at the character code for 'a', which is 97 in ASCII, so the loop writes to charmap[97] through charmap[122] in an array that only has room for 26 entries.
You should use alphabet[i] - 'a' as your array index (and likewise c - 'a' in the while loop).
The following piece of code is a source of troubles:
int **charmap = malloc(26*sizeof(int));
for (int i = 0; i < 26; i++)
charmap[...] = ...;
If sizeof(int) < sizeof(int*), then it will be performing illegal memory access operations.
For example, on 64-bit platforms, the case is usually sizeof(int) == 4 < 8 == sizeof(int*).
Under that scenario, by writing into charmap[13...25], you will be accessing unallocated memory.
Change this:
int **charmap = malloc(26*sizeof(int));
To this:
int **charmap = malloc(26*sizeof(int*));
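Both fixes (indexing with letter - 'a' and allocating room for pointers rather than ints) can be combined as in the sketch below. The names build_char_map and free_char_map are made up for this illustration, the debugging output is left out, and the code assumes the lowercase letters have consecutive character codes, as they do in ASCII.

```c
#include <stdlib.h>

#define LETTERS 26

/* Frees the first `rows` rows and then the table itself. */
static void free_rows(int **charmap, int rows)
{
    for (int i = 0; i < rows; i++)
        free(charmap[i]);
    free(charmap);
}

void free_char_map(int **charmap)
{
    if (charmap != NULL)
        free_rows(charmap, LETTERS);
}

/* Row i describes the letter 'a' + i: element 0 is the number of
   occurrences, elements 1..n are the positions in keyfile. */
int **build_char_map(const char *keyfile)
{
    int **charmap = malloc(LETTERS * sizeof *charmap); /* 26 pointers, not 26 ints */
    if (charmap == NULL)
        return NULL;
    for (int i = 0; i < LETTERS; i++) {
        charmap[i] = malloc(sizeof **charmap);
        if (charmap[i] == NULL) {
            free_rows(charmap, i);
            return NULL;
        }
        charmap[i][0] = 0; /* counter at index 0 */
    }
    for (int pos = 0; keyfile[pos] != '\0'; pos++) {
        int slot = keyfile[pos] - 'a'; /* 0..25 for 'a'..'z' */
        if (slot < 0 || slot >= LETTERS)
            continue; /* ignore anything that is not a lowercase letter */
        int n = charmap[slot][0] + 1;
        int *grown = realloc(charmap[slot], ((size_t)n + 1) * sizeof *grown);
        if (grown == NULL) {
            free_rows(charmap, LETTERS);
            return NULL;
        }
        grown[0] = n;
        grown[n] = pos;
        charmap[slot] = grown;
    }
    return charmap;
}
```

Using sizeof *charmap instead of spelling out the type keeps the allocation size tied to the pointer's declared type, which avoids the int versus int* mix-up entirely.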
