I have a few questions regarding arrays of strings in C.
I have a char *string and I split it every 4 characters into an array of strings called sep_str. So for example if char *string = "The sum";, then char **sep_str is:
0: |_| --> "The "
1: |_| --> "sum"
My first question is, in an array of strings in C (so an array of arrays of chars), will there be a null terminating character at the end of each sep_str[i], or just at the last position of sep_str? Here is how I copy string into an array of strings:
for (int i = 0; i < str_length; i++) {
sep_str[i/4][i%4] = *ptr;
ptr++;
}
My second question is, how would I reverse the elements of each string in sep_str? Here's how I did it, but I feel like it is stepping outside the array of the substring (so outside the element of sep_str):
// Reverse each element in the array
char temp;
for (int i = 0; i < num_strs; i++) {
for (int j = 0, k = 4; j < k; j++, k--) {
temp = sep_str[i][j];
sep_str[i][j] = sep_str[i][k];
sep_str[i][k] = temp;
}
}
Copying the strings looks fine to me. Since every chunk except possibly the last holds exactly 4 chars, you can do without the null terminator \0 as long as you never treat a chunk as a string. Otherwise you need to declare sep_str with 5 chars per row (a (length/4, rounded up) x 5 matrix), to store the \0 char at the end of each string.
To reverse a string you need to iterate from the start to the middle of the string, swapping the i-th char with the (length-i-1)-th. In your inner for loop, change k = 4 to k = 3.
You also need to take care of the last string, since the length might not be a multiple of four.
char temp;
for (int i = 0; i < (num_strs - 1); i++) {
for (int j = 0, k = 3; j < k; j++, k--) {
temp = sep_str[i][j];
sep_str[i][j] = sep_str[i][k];
sep_str[i][k] = temp;
}
}
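if (num_strs > 0) {
// i is out of scope here, so index the last chunk explicitly.
// The last chunk must be null-terminated for strlen() to work.
char *last = sep_str[num_strs - 1];
for (int j = 0, k = (int)strlen(last) - 1; j < k; j++, k--) {
temp = last[j];
last[j] = last[k];
last[k] = temp;
}
}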
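Putting the two points together, here is a minimal sketch, assuming every chunk is stored with its own terminator in a row of 5 chars. The names CHUNK, reverse_str and split_chunks are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* characters per chunk, as in the question */

/* Reverse a null-terminated string in place. */
void reverse_str(char *s)
{
    size_t len = strlen(s);
    if (len < 2)
        return;  /* nothing to swap; also avoids len - 1 wrapping around */
    for (size_t j = 0, k = len - 1; j < k; j++, k--) {
        char temp = s[j];
        s[j] = s[k];
        s[k] = temp;
    }
}

/* Copy src into rows of at most CHUNK characters and terminate every row.
 * out must have room for (strlen(src) + CHUNK - 1) / CHUNK rows.
 * Returns the number of rows written. */
size_t split_chunks(const char *src, char out[][CHUNK + 1])
{
    assert(src != NULL && out != NULL);
    size_t len = strlen(src);
    size_t rows = (len + CHUNK - 1) / CHUNK;
    for (size_t r = 0; r < rows; r++) {
        size_t n = len - r * CHUNK;  /* characters left from this row on */
        if (n > CHUNK)
            n = CHUNK;
        memcpy(out[r], src + r * CHUNK, n);
        out[r][n] = '\0';
    }
    return rows;
}
```

For "The sum" this gives the two rows "The " and "sum"; calling reverse_str on each row turns them into " ehT" and "mus". Because every row is terminated, the last, shorter row needs no special case.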
In a C string, there will be only one termination character. But if you need to tokenize the strings, then each string must be null terminated.
But before that -
char *string = "The sum"; // should be const char* string = "The sum";
The string literal in the above case resides in a read-only location and must not be modified. If you need to modify it, then use
char string[] = "The sum";
If you don't have the terminating character in your strings then yes, you will be outside the bounds of the array since you are accessing sep_str[i][4], which is not a valid location:
sep_str[0][0] = 'T'
sep_str[0][1] = 'h'
sep_str[0][2] = 'e'
sep_str[0][3] = ' '
However, I doubt that you want to have the null character at the beginning of your string, so you need k=3 in your for loop, not k=4.
My first question is, in an array of strings in C (so array of array of chars), will there be a null terminating character at the end of each sep_str[i], or just at the last position of sep_str?
Only at the end, but if you want to treat each individual chunk as its own string, you'll need to add the \0 yourself.
My second question is, how would I reverse the elements of each string in sep_str?
You could do it with pointers...
char temp;
// Point to start of string, `str` will decay to first memory position.
char *start = str;
// Point to the end of the string. You will need to `#include <string.h>`
// for `strlen()`. Otherwise, write a `while` loop that goes until `\0` to find
// the last position.
char *end = &str[strlen(str) - 1];
// Do until we hit the middle of the string.
while (start < end) {
// Need a temp char, no parallel assignment in C.
temp = *start;
// Swap chars. `start` and `end` are pointers, so dereference them
// instead of using them as array indexes.
*start++ = *end;
*end-- = temp;
}
Assuming str is your string.
Related
I was doing a LeetCode exercise that consists of deleting adjacent duplicate characters from a string until no two adjacent characters are equal. With some help I wrote code that solves most test cases, but the string length can be up to 10^5, and on one test case it exceeds the time limit, so I need some tips on how I can optimize it.
My code:
char res[100000]; //up to 10^5
char * removeDuplicates(char * s){
//int that verifies if any char from the string can be deleted
int ver = 0;
//do while loop that reiterates to eliminate the duplicates
do {
int lenght = strlen(s);
int j = 0;
ver = 0;
//for loop that if there are duplicates adds one to ver and deletes the duplicate
for (int i = 0; i < lenght ; i++){
if (s[i] == s[i + 1]){
i++;
j--;
ver++;
}
else {
res[j] = s[i];
}
j++;
}
//copying the res string into the s to redo the loop if necessary
strcpy(s,res);
//clear the res string
memset(res, '\0', sizeof res);
} while (ver > 0);
return s;
}
The code can't pass a speed test with a string whose length is close to the limit (10^5). I won't put it here because it's a really big text, but if you want to check it, it is test case 104 of the LeetCode Daily Problem.
If it was me doing something like that, I would basically do it like a simple naive string copy, but keep track of the last character copied and if the next character to copy is the same as the last then skip it.
Perhaps something like this:
char result[1000]; // Assumes no input string will be longer than this
unsigned source_index; // Index into the source string
unsigned dest_index; // Index into the destination (result) string
// Always copy the first character
result[0] = source_string[0];
// Start with 1 for source index, since we already copied the first character
for (source_index = 1, dest_index = 0; source_string[source_index] != '\0'; ++source_index)
{
if (source_string[source_index] != result[dest_index])
{
// Next character is not equal to last character copied
// That means we can copy this character
result[++dest_index] = source_string[source_index];
}
// Else: Current source character was equal to last copied character
}
// Terminate the destination string
result[dest_index + 1] = '\0';
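Note that the loop above only collapses runs of equal characters. If, as the question describes, removing a pair can bring two new equal neighbours together that must then be removed as well, a common alternative is to treat the output as a stack, which handles everything in a single pass. A minimal sketch, assuming the input is a modifiable null-terminated string; the name remove_adjacent_pairs is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Repeatedly remove pairs of equal adjacent characters, in place.
 * The prefix s[0..top) is used as a stack of the characters kept so far,
 * so the whole string is processed once: O(n) time, no extra buffer. */
char *remove_adjacent_pairs(char *s)
{
    assert(s != NULL);
    size_t top = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (top > 0 && s[top - 1] == s[i])
            top--;             /* pair found: pop the previous character */
        else
            s[top++] = s[i];   /* no pair: push the current character */
    }
    s[top] = '\0';
    return s;
}
```

Since top never exceeds i, writing into s[top] cannot overwrite a character that has not been read yet. With this approach "abbaca" becomes "ca": removing "bb" leaves "aaca", and removing "aa" leaves "ca".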
Could you help, please?
When I execute this code I get this output:
AAAAABBBBBCCCCCBBBBBCOMP¬ıd┐╔ LENGTH 31
There are some weird characters after the letters, even though I've allocated just 21 bytes.
#include <stdio.h>
#include <stdlib.h>
char * lineDown(){
unsigned short state[4] = {0,1,2,1};
char decorationUp[3][5] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
char * deco = malloc(21);
int k;
int p = 0;
for(int j = 0; j < 4; j++){
k = state[j];
for(int i = 0; i < 5; i++){
*(deco+p) = decorationUp[k][i];
p++;
}
}
return deco;
}
int main(void){
char * lineDOWN = lineDown();
int k = 0;
char c;
do{
c = *(lineDOWN+k);
printf("%c",*(lineDOWN+k));
k++;
}while(c != '\0');
printf("LENGTH %d\n\n",k);
}
The function does not build a string because the result array does not contain the terminating zero, though space for it was reserved when the array was allocated.
char * deco = malloc(21);
So you need to append the terminating zero to the array before exiting the function
//...
*(deco + p ) = '\0';
return deco;
}
Otherwise this do-while loop
do{
c = *(lineDOWN+k);
printf("%c",*(lineDOWN+k));
k++;
}while(c != '\0');
will have undefined behavior.
But even if you append the terminating zero to the array, the loop will count the length of the stored string incorrectly, because it increments the variable k even when the current character is the terminating zero.
Instead you should use a while loop. In this case the declaration of the variable c will be redundant. The loop can look like
while ( *( lineDOWN + k ) )
{
printf("%c",*(lineDOWN+k));
k++;
}
In this case this call
printf("\nLENGTH %d\n\n",k);
^^
will output the correct length of the string equal to 20.
And you should free the allocated memory before exiting the program
free( lineDOWN );
Some others wrote in their answers that the array decorationUp must be declared like
char decorationUp[3][6] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
but that is not necessary if you are not going to use the elements of the array as strings, and your program does not use them as strings.
Take into account that your program is full of magic numbers. Such a program is usually error-prone. Instead you should use named constants.
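As a rough illustration of the named-constants advice, here is a sketch of the function with the sizes spelled out once. The names PATTERN_LEN, STATE_COUNT and line_down are made up for this example, and each pattern row is given room for its terminator so that it can be written as an ordinary string literal:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum {
    PATTERN_LEN = 5,  /* characters per decoration pattern */
    STATE_COUNT = 4   /* number of patterns in the output line */
};

/* Build the decoration line; the caller must free() the result. */
char *line_down(void)
{
    static const unsigned short state[STATE_COUNT] = {0, 1, 2, 1};
    static const char decoration_up[][PATTERN_LEN + 1] = {
        "AAAAA", "BBBBB", "CCCCC"
    };
    char *deco = malloc(STATE_COUNT * PATTERN_LEN + 1);  /* +1 for '\0' */
    if (deco == NULL)
        return NULL;
    size_t p = 0;
    for (size_t j = 0; j < STATE_COUNT; j++) {
        memcpy(deco + p, decoration_up[state[j]], PATTERN_LEN);
        p += PATTERN_LEN;
    }
    assert(p == STATE_COUNT * PATTERN_LEN);
    deco[p] = '\0';  /* make the result a proper string */
    return deco;
}
```

Changing the pattern length or the number of states now means editing one constant instead of hunting for every 4, 5 and 21 in the code.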
In
char decorationUp[3][5] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
each string needs 6 characters to also hold the null character, even though in this case you do not use them as 'standard' strings but only as arrays of char. To get into the habit, always reserve the place for the ending null character;
you can do
char decorationUp[3][6] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
Note that it is unnecessary to give the first dimension; the compiler counts the rows for you.
Because in main you stop when you read the null character, you also need to place it at the end of deco, which is why 21 bytes have to be allocated. You reserved the place for the null character but never wrote it, and that produces undefined behavior because you read past the end of the allocated block.
*(deco+p) is hard to read; write deco[p] instead.
So for instance:
char * lineDown(){
unsigned short state[] = {0,1,2,1};
char decorationUp[][6] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
char * deco = malloc(4*5 + 1); /* a formula to explain why 21 is better than 21 directly */
int k;
int p = 0;
for(int j = 0; j < 4; j++){
k = state[j];
for(int i = 0; i < 5; i++){
deco[p] = decorationUp[k][i];
p++;
}
}
deco[p] = 0;
return deco;
}
When I print out the length of the temp string, it starts at a random number. The goal of this for loop is to filter out everything that's not a letter, and it works for the most part, but when I print out the filtered string it returns the filtered string but with some extra random characters before and after the string.
#define yes 1000
...
char stringed[yes] = "teststring";
int len = strlen(stringed);
char filt[yes];
for (int i = 0; i < len; i++) {
if (isalpha(stringed[i])) {
filt[strlen(filt)] = tolower(stringed[i]);
}
}
There are at least two problems with the line:
temp[strlen(temp)] = "\0";
The compiler should be shrieking about converting a pointer to an integer. You need '\0' and not "\0". (This might account for some of the odd characters; the least-significant byte of the address is probably stored over the null byte, making it and random other characters visible until the string printing comes across another null byte somewhere.)
With that fixed, the code carefully writes a null byte over the null byte that marks the end of the string.
You should probably not be using strlen() at this point (or at a number of other points where you use it in the loop).
You should be using i more in the loop. If your goal is to eliminate non-alpha characters, you probably need two indexes, one for 'next character to check' and one for 'next position to overwrite'. After the loop, you need to write over the 'next position to overwrite' with the null byte.
int j = 0; // Next position to overwrite
for (int i = 0; i < length; i++)
{
if (isalpha(text[i]))
temp[j++] = text[i];
}
temp[j] = '\0';
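Wrapped into a function, the two-index loop could look like the sketch below. The name filter_letters is made up for illustration; the lower-casing comes from the question, and the cast to unsigned char is there because isalpha() and tolower() must not be given negative char values.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copy only the letters of src into dst, converted to lower case.
 * dst must have room for strlen(src) + 1 bytes.
 * Returns the number of letters copied. */
size_t filter_letters(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    size_t j = 0;  /* next position to overwrite in dst */
    for (size_t i = 0; src[i] != '\0'; i++) {
        unsigned char c = (unsigned char)src[i];
        if (isalpha(c))
            dst[j++] = (char)tolower(c);
    }
    dst[j] = '\0';  /* terminate exactly once, after the loop */
    return j;
}
```

No strlen() call is needed on the destination at any point, so it does not matter that the destination buffer starts out uninitialized.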
For starters the character array
char temp[MAX];
is not initialized. It has indeterminate values.
So these statements
printf("NUM:[%i] CHAR:[%c] TEMP:[%c] TEMPSTRLEN:[%i]\n", i, text[i], temp[strlen(temp)], strlen(temp));
temp[strlen(temp)] = tolower(text[i]);
have undefined behavior because you may not apply the standard function strlen to uninitialized character array.
This statement
temp[strlen(temp)] = "\0";
is also invalid.
On the right side of the assignment there is the string literal "\0", which is implicitly converted to a pointer to its first character, so a pointer is being assigned to a char.
So these statements
length = strlen(temp);
printf("[%s]\n", temp);
do not make sense.
It seems what you mean is the following
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAX 1000
int main(void)
{
char text[MAX] = "teststring";
size_t length = strlen(text);
char temp[MAX] = { '\0' };
// or
//char temp[MAX] = "";
for ( size_t i = 0; i < length; i++)
{
if (isalpha( ( unsigned char )text[i] ) )
{
printf("NUM:[%zu] CHAR:[%c] TEMP:[%c] TEMPSTRLEN:[%zu]\n", i, text[i], temp[strlen(temp)], strlen(temp));
size_t n = strlen(temp);
temp[n] = (char)tolower((unsigned char)text[i]);
temp[n + 1] = '\0'; // terminate right after the character just written
}
}
length = strlen(temp);
printf( "[%s]\n", temp );
return 0;
}
The program output is
NUM:[0] CHAR:[t] TEMP:[] TEMPSTRLEN:[0]
NUM:[1] CHAR:[e] TEMP:[] TEMPSTRLEN:[1]
NUM:[2] CHAR:[s] TEMP:[] TEMPSTRLEN:[2]
NUM:[3] CHAR:[t] TEMP:[] TEMPSTRLEN:[3]
NUM:[4] CHAR:[s] TEMP:[] TEMPSTRLEN:[4]
NUM:[5] CHAR:[t] TEMP:[] TEMPSTRLEN:[5]
NUM:[6] CHAR:[r] TEMP:[] TEMPSTRLEN:[6]
NUM:[7] CHAR:[i] TEMP:[] TEMPSTRLEN:[7]
NUM:[8] CHAR:[n] TEMP:[] TEMPSTRLEN:[8]
NUM:[9] CHAR:[g] TEMP:[] TEMPSTRLEN:[9]
[teststring]
Edit: next time do not change your question so radically, because this can confuse readers of the question.
I have this string
char currentString[212] = { 0 };
and after I've used it once, I want to reset it.
I tried many ways, such as:
for (int k = 0; k < strlen(currentString); k++)
{
currentString[k] = '\0';
}
but it doesn't go through the loop more than once, and it sets only the first char to '\0'; the rest remain the same.
and I also tried:
currentString[0] = '\0';
yet I get the same result.
Any suggestions for what I can do?
thanks!
strlen will find the length by searching for the first occurrence of \0. So if you want to reset the whole array, you should change strlen(currentString) to sizeof currentString. However, do note that this will not work with pointers.
If you pass the array to a function, you cannot determine the size of the array afterwards, so this will not work:
void foo(char * arr) {
for (int k = 0; k < sizeof arr; k++)
arr[k] = '\0';
}
Instead you need to do like this:
void foo(char * arr, size_t size) {
for (size_t k = 0; k < size; k++)
arr[k] = '\0';
}
But of course there's no reason to write custom functions for this when memset is available.
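A minimal sketch of the memset variant, for comparison. The helper name clear_buffer is made up for illustration; it takes the size explicitly, so it works the same way for arrays and for pointers to allocated memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Set all size bytes of buf to zero. */
void clear_buffer(char *buf, size_t size)
{
    assert(buf != NULL);
    memset(buf, 0, size);
}
```

For an array that is in scope, the call would be clear_buffer(currentString, sizeof currentString); for a pointer, pass the size that was used when the memory was allocated.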
Imagine char currentString[] = "abc"; and then running your loop:
k = 0
Initially strlen(currentString) = 3: there are 3 characters before the '\0' byte. The loop condition k < strlen(currentString) is true.
k = 0 -> currentString[0] = '\0'
k++ -> k = 1
Then strlen(currentString) = 0 (as the first byte of currentString is now '\0', there are no characters before the '\0').
The loop condition k < strlen(currentString) is false -> 1 < 0.
So the loop will always run only one time.
If you want to write only zero bytes to a memory region, use memset
memset(currentString, 0, sizeof(currentString));
sets sizeof(currentString) bytes, starting at the address of currentString, to zero.
Setting the first byte to zero:
currentString[0] = '\0';
may be considered enough to "clear a string".
Setting the first byte to '\0' won't clear out currentString. You may think it does because C treats that byte as the string terminator, so if you print your string it will show up empty. But if you check the second byte you will still see the second char of your string. As others said, the best option to wipe out the string is:
memset(currentString, 0, sizeof(currentString));
This is also way safer and faster. Also, in C 0 and '\0' have the same value.
To zero the whole array:
char arr[SOMESIZE];
/* ... */
memset(arr, 0, sizeof(arr));
For a pointer, you need to know the size of the allocated memory, because sizeof returns only the size of the pointer itself, not of the referenced object:
char *p = malloc(SIZE);
/* ..... */
memset(p, 0 , SIZE);
It is never a good decision to calculate anything again and again. Instead you should calculate the strlen() only once.
That being said, in your case, doing so will solve the problem, as the reason it didn't work was that strlen() returned 0 right after the first round, since the length of the string became 0.
int n = strlen(currentString);
for (int k = 0; k < n; k++)
{
currentString[k] = '\0';
}
As marked in the code, the first printf() correctly prints only the i-th line of the matrix. But outside the loop, both printf() and strcat() act on the whole matrix from the i-th line on as a single string. This means that
printf("%s\n",m_cfr[0])
will print the whole matrix, but m_cfr[i] will print the whole matrix from the i-th line on. char* string is a single-line string with no spaces.
void trasp(char* string)
{
int row = strlen(string) / 5;
char m[row][5];
char m_cfr[row][5];
char cfr[row*5];
memset(cfr, 0, row * 5);
int key[5] = {3, 1, 2, 0, 4};
int k = 0;
int i, j;
for (i = 0 ; i < row ; i++)
{
strncpy(m[i], string + k, 5);
m[i][5] = '\0';
k += 5;
}
for (i = 0 ; i < row ; i++)
{
for (j = 0 ; j < 5 ; j++)
{
m_cfr[i][key[j]] = m[i][j];
}
m_cfr[i][5] = '\0';
printf("%s\n", m_cfr[i]); //--->prints only line i
}
printf("%s\n", m_cfr[0]); //prints whole matrix
strcat(cfr, m_cfr[0]); //concatenates whole matrix
printf("%s\n", cfr);
}
In your code, your array definition is
char m_cfr[row][5];
while you're accessing
m_cfr[i][5] = '\0';
/* ^
|
there is no 6th element
*/
You're facing an off-by-one error. Out-of-bounds memory access causes undefined behaviour.
Maybe you want to change the null-terminating statement to
m_cfr[i][4] = '\0'; //last one is null
%s expects a char* and prints everything until it encounters a \0. So,
printf("%s\n", m_cfr[i]);
printf("%s\n",m_cfr[0]);
strcat(cfr,m_cfr[0]);
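all exhibit undefined behavior. The types are fine: m_cfr[i] and m_cfr[0] are arrays of 5 chars, which decay to the char* that %s and strcat expect. The problem is that no row contains a \0 within its own 5 bytes, so printf and strcat keep reading into the following rows and eventually past the end of the array. Also, as SouravGhosh points out, using
m_cfr[i][5] = '\0';
and
m[i][5] = '\0';
is wrong, because index 5 is one past the end of a 5-element row. In practice, for every row but the last, the write lands on the first byte of the next row, which the next loop iteration then overwrites.
To fix both issues, give each row room for its terminator:
char m[row][6];
char m_cfr[row][6];
With that, m[i][5] = '\0' and m_cfr[i][5] = '\0' are in bounds and printf("%s\n", m_cfr[i]) prints only row i. To get the whole matrix into cfr, concatenate it row by row:
for (i = 0 ; i < row ; i++)
strcat(cfr, m_cfr[i]);
cfr then needs row*5 + 1 bytes to hold the terminator.
Note that a variable-length array cannot have an initializer, so something like
char m[row][5]={{0}};
does not compile. Use memset if you want the arrays zeroed.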
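If the intermediate matrices are not needed for anything else, the permutation can also be written straight into the output buffer, which avoids the per-row terminators altogether. This is a sketch of that alternative, not the code from the question; the names COLS and transpose_blocks are made up, and the key is the one from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { COLS = 5 };  /* characters per row, as in the question */

/* Permute every block of COLS characters of in according to key and
 * write the result to out as one string. out must have room for
 * strlen(in) + 1 bytes. Trailing characters that do not fill a whole
 * block are dropped, as in the question. */
void transpose_blocks(const char *in, char *out)
{
    static const int key[COLS] = {3, 1, 2, 0, 4};
    assert(in != NULL && out != NULL);
    size_t rows = strlen(in) / COLS;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < COLS; j++)
            out[i * COLS + key[j]] = in[i * COLS + j];
    out[rows * COLS] = '\0';  /* single terminator at the very end */
}
```

For the input "abcdefghij" the output is "dbcaeighfj": in each block the first character moves to position 3 and the fourth character moves to position 0, while the others stay in place.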