Attempting to split and store arrays similar to strtok - c

For an assignment in class, we have been instructed to write a program which takes a string and a delimiter and then takes "words" and stores them in a new array of strings. i.e., the input ("my name is", " ") would return an array with elements "my" "name" "is".
Roughly, what I've attempted is to:
Use a separate helper called number_of_delimiters() to determine the size of the array of strings
Iterate through the initial array to find the number of elements in a given string which would be placed in the array
Allocate storage within my array for each string
Store the elements within the allocated memory
Include directives:
#include <stdlib.h>
#include <stdio.h>
This is the separate helper:
int number_of_delimiters (char* s, int d)
{
int numdelim = 0;
for (int i = 0; s[i] != '\0'; i++)
{
if (s[i] == d)
{
numdelim++;
}
}
return numdelim;
}
This is the function itself:
char** split_at (char* s, char d)
{
int numdelim = number_of_delimiters(s, d);
int a = 0;
int b = 0;
char** final = (char**)malloc((numdelim+1) * sizeof(char*));
for (int i = 0; i <= numdelim; i++)
{
int sizeofj = 0;
while (s[a] != d)
{
sizeofj++;
a++;
}
final[i] = (char*)malloc(sizeofj);
a++;
int j = 0;
while (j < sizeofj)
{
final[i][j] = s[b];
j++;
b++;
}
b++;
final[i][j+1] = '\0';
}
return final;
}
To print:
void print_string_array(char* a[], unsigned int alen)
{
printf("{");
for (int i = 0; i < alen; i++)
{
if (i == alen - 1)
{
printf("%s", a[i]);
}
else
{
printf("%s ", a[i]);
}
}
printf("}");
}
int main(int argc, char *argv[])
{
print_string_array(split_at("Hi, my name is none.", ' '), 5);
return 0;
}
This currently returns {Hi, my name is none.}
After doing some research, I realized that the purpose of this function is either similar or identical to strtok. However, looking at the source code for this proved to be little help because it included concepts we have not yet used in class.
I know the question is vague, and the code rough to read, but what can you point to as immediately problematic with this approach to the problem?

The program has several problems.
while (s[a] != d) is wrong: there is no delimiter after the last word in the string, so the loop runs past the terminating null character.
final[i][j+1] = '\0'; is wrong: j+1 is one position too far.
The returned array is unusable unless you know beforehand how many elements it contains.

Just for explanation:
strtok will modify the array you pass in! After
char test[] = "a b c ";
for(char* t = test; strtok(t, " "); t = NULL);
test content will be:
{ 'a', 0, 'b', 0, 'c', 0, 0 }
You get subsequently these pointers to your test array: test + 0, test + 2, test + 4, NULL.
strtok remembers the pointer you pass to it internally (most likely, you saw a static variable in your source code...) so you can (and must) pass NULL the next time you call it (as long as you want to operate on the same source string).
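To see that in-place modification for yourself, a small sketch like the following may help. The helper name tokenize_in_place and its parameters are made up for this illustration; it simply records where each token returned by strtok starts inside the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the question): tokenizes buf in place
 * with strtok and stores the offset of each token relative to buf.
 * Returns the number of tokens recorded (at most max_tokens). */
size_t tokenize_in_place(char *buf, const char *delims,
                         size_t *offsets, size_t max_tokens)
{
    size_t count = 0;

    /* The first call receives the buffer, every later call receives NULL. */
    for (char *tok = strtok(buf, delims);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, delims))
    {
        offsets[count++] = (size_t)(tok - buf);
    }
    return count;
}
```

Called on char test[] = "a b c "; with " " as the delimiter string, this records three tokens at offsets 0, 2 and 4, and the spaces in test have been overwritten with null characters afterwards.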
You, in contrast, apparently want to copy the data. Fine, one can do so. But here we get a problem:
char** final = //...
return final;
void print_string_array(char* a[], unsigned int alen)
You just return the array, but you are losing length information!
How do you want to pass the length to your print function then?
char** tokens = split_at(...);
print_string_array(tokens, sizeof(tokens));
will fail, because sizeof(tokens) will always return the size of a pointer on your local system (most likely 8, possibly 4 on older hardware)!
My personal recommendation: create a null terminated array of c strings:
char** final = (char**)malloc((numdelim + 2) * sizeof(char*));
// ^ (!)
// ...
final[numdelim + 1] = NULL;
Then your print function could look like this:
void print_string_array(char* a[]) // no len parameter any more!
{
printf("{");
if(*a)
{
printf("%s", *a); // printing first element without space
for (++a; *a; ++a) // *a: checking, if current pointer is not NULL
{
printf(" %s", *a); // next elements with spaces
}
}
printf("}");
}
No problems with length any more. Actually, this is exactly the same principle C strings use themselves (the terminating null character, remember?).
Additionally, here is a problem in your own code:
while (j < sizeofj)
{
final[i][j] = s[b];
j++; // j will always point behind your string!
b++;
}
b++;
// thus, you need:
final[i][j] = '\0'; // no +1 !
For completeness (this was discovered by n.m. already, see the other answer): If there is no trailing delimiter in your source string,
while (s[a] != d)
will read beyond your input string (which is undefined behaviour and could result in your program crashing). You need to check for the terminating null character, too:
while(s[a] && s[a] != d)
Finally: how do you want to handle subsequent delimiters? Currently, you will insert empty strings into your array? Print out your strings as follows (with two delimiting symbols - I used * and + like birth and death...):
printf("*%s+", *a);
and you will see. Is this intended?
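Putting the points above together, one possible shape for a copying split is sketched below. It is only an illustration under a few assumptions: the names split_at_sketch and free_string_array are made up, the result is a NULL-terminated array, and consecutive delimiters deliberately yield empty strings so that the behaviour asked about above becomes visible.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: splits s at every occurrence of d and returns a NULL-terminated
 * array of newly allocated strings, or NULL if an allocation fails.
 * Consecutive delimiters produce empty strings. */
char **split_at_sketch(const char *s, char d)
{
    size_t count = 1;                        /* tokens = delimiters + 1 */
    for (const char *p = s; *p; ++p)
        count += (*p == d);

    char **out = malloc((count + 1) * sizeof *out);  /* +1 for the NULL */
    if (out == NULL)
        return NULL;

    size_t i = 0;
    const char *start = s;
    for (const char *p = s; ; ++p) {
        if (*p == d || *p == '\0') {         /* end of the current token */
            size_t len = (size_t)(p - start);
            out[i] = malloc(len + 1);        /* +1 for the terminating null */
            if (out[i] == NULL) {
                while (i > 0)
                    free(out[--i]);
                free(out);
                return NULL;
            }
            memcpy(out[i], start, len);
            out[i][len] = '\0';
            ++i;
            if (*p == '\0')
                break;
            start = p + 1;                   /* skip the delimiter */
        }
    }
    out[i] = NULL;                           /* terminate the array */
    return out;
}

/* Frees an array returned by split_at_sketch. */
void free_string_array(char **a)
{
    if (a == NULL)
        return;
    for (char **p = a; *p; ++p)
        free(*p);
    free(a);
}
```

For the input "a  b" (two spaces) and the delimiter ' ', this returns "a", "" and "b" followed by NULL. If empty strings are not wanted, the token could be skipped whenever its length is zero.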
Edit 2: The variant with pointer arithmetic (only):
char** split_at (char* s, char d)
{
int numdelim = 0;
char* t = s; // need a copy
while(*t)
{
numdelim += *t == d;
++t;
}
char** final = (char**)malloc((numdelim + 2) * sizeof(char*));
char** f = final; // pointer to current position within final
t = s; // re-assign t, using s as start pointer for new strings
for(;; ++t) // scan up to and including the terminating null
{
if(*t == d || *t == '\0') // end of a token found!
{
// can subtract pointers --
// as long as they point to the same array!!!
char* n = (char*)malloc(t - s + 1); // +1: terminating null
*f++ = n; // store in position pointer and increment it
while(s != t) // copy the string from start to current t
*n++ = *s++;
*n = 0; // terminate the new string
if(*t == '\0') // last token stored, done
break;
s = t + 1; // next token starts behind the delimiter
}
}
*f = NULL; // and finally terminate the string array
return final;
}

While I've now been shown a more elegant solution, I've found and rectified the issues in my code:
char** split_at (char* s, char d)
{
int numdelim = 0;
int x;
for (x = 0; s[x] != '\0'; x++)
{
if (s[x] == d)
{
numdelim++;
}
}
int a = 0;
int b = 0;
char** final = (char**)malloc((numdelim+1) * sizeof(char*));
for (int i = 0; i <= numdelim; i++)
{
int sizeofj = 0;
while ((s[a] != d) && (a < x))
{
sizeofj++;
a++;
}
final[i] = (char*)malloc(sizeofj + 1); // +1 for the terminating null
a++;
int j = 0;
while (j < sizeofj)
{
final[i][j] = s[b];
j++;
b++;
}
final[i][j] = '\0';
b++;
}
return final;
}
I consolidated what I previously had as a helper function, and modified some points where I incremented incorrectly.

Related

How to use two pointer to define a string isPalindrome?

Input: s = "A man, a plan, a canal: Panama"
Output: true
Explanation: "amanaplanacanalpanama" is a palindrome.
bool isPalindrome(char * s){
if(strlen(s) == 0) return true;
int m = 0;
for(int i = 0; i < strlen(s); i++)
if(isalnum(s[i])) s[m++] = tolower(s[i]);
int i = 0;
while(i<m)
if(s[i++] != s[--m]) return false;
return true;
}
My code's running time is 173ms. My instructor suggested that I use two pointers to improve the performance and memory usage, but I have no idea where to start.
Just position the two pointers like this
char* first = someString;
char* end = someString + strlen(someString) - 1;
Now, for the string to be a palindrome, the characters that first and end point to must be the same,
e.g. char someString[] = "1331";
So in the first iteration *first == *end, i.e. '1'.
Now move the pointers towards each other until there is nothing left to compare or until they differ:
++first, --end;
Now *first and *end are both '3',
and so on. If the pointers meet or pass each other without finding a difference, the string is a palindrome.
Something like this
#include <stdio.h>
#include <string.h>
int palindrome(char* str)
{
char* start = str;
char* end = str + strlen(str) - 1;
for (; start < end; ++start, --end )
{
if (*start != *end)
{
return 0;
}
}
return 1;
}
int main()
{
printf("palindrome: %d\n", palindrome("1331"));
printf("palindrome: %d\n", palindrome("132331"));
printf("palindrome: %d\n", palindrome("74547"));
return 0;
}
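You should add error checks (for example for a NULL pointer or an empty string, where str + strlen(str) - 1 would point before the start of the array); the function above has none.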
My code's running time is 173ms. My instructor suggested that I use two pointers to improve the performance and memory usage, but I have no idea where to start.
It's already running in O(n), so you cannot reduce the time complexity (except for the repeated call to strlen, see below), although there is some room for improving performance.
Your function does not declare any arrays and only uses a few variables, so the memory usage does not depend on the input size at all. The memory usage is already O(1) and very low, so it's not a real concern.
But if you want to do it with pointers, here is one:
bool isPalindrome(char * s){
size_t len = strlen(s);
if(len == 0) return true; // nothing to compare
char *a = s;
char *b = s + len - 1;
while(a < b) {
// Skip characters that are not alphanumeric
while( a < b && !isalnum((unsigned char)*a) ) a++;
while( a < b && !isalnum((unsigned char)*b) ) b--;
// We're done when the pointers have met in the middle
if(a == b) break;
// Perform the check
if(tolower((unsigned char)*a) != tolower((unsigned char)*b)) return false;
// Step to next character
a++;
b--;
}
return true;
}
When it comes to performance, your code has one real issue, and pointers don't solve it: you're calling strlen on every iteration of the first loop. That loop has to visit every character, because it is the one that filters and lower-cases the string, but the length only needs to be computed once. The comparison loop that follows already stops in the middle, so there is nothing to halve there.
for(int i = 0; i < strlen(s); i++)
should be
size_t len = strlen(s);
for(size_t i = 0; i < len; i++)
Another remark I have on your code is that it changes the input string. That's not necessary. If I have a function that is called isPalindrome I'd expect it to ONLY check if the string is a palindrome or not. IMO, the signature should be bool isPalindrome(const char * s)
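As an illustration of that last remark, here is a minimal sketch of a two-pointer check that takes a const char * and never writes to the input. The name is_palindrome_const is made up for this example; the casts to unsigned char are there because passing a negative char value to isalnum or tolower is undefined behaviour.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Sketch: two-pointer palindrome check that ignores non-alphanumeric
 * characters and case, and leaves the input untouched. */
bool is_palindrome_const(const char *s)
{
    const char *a = s;
    const char *b = s + strlen(s);   /* one past the last character */

    while (a < b) {
        if (!isalnum((unsigned char)*a)) { ++a; continue; }
        if (!isalnum((unsigned char)b[-1])) { --b; continue; }
        if (tolower((unsigned char)*a) != tolower((unsigned char)b[-1]))
            return false;
        ++a;
        --b;
    }
    return true;
}
```

Keeping b one past the last character means the pointer never has to move before the start of the string, so the empty string needs no special case.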

Check if Char Array contains special sequence without using string library on Unix in C

Let's assume we have a char array and a sequence. Next, we would like to check whether the char array contains the sequence WITHOUT using the <string.h> library: if yes, return true; if no, return false.
bool contains(char *Array, char *Sequence) {
// CONTAINS - Function
for (int i = 0; i < sizeof(Array); i++) {
for (int s = 0; s < sizeof(Sequence); s++) {
if (Array[i] == Sequence[i]) {
// How to check if Sequence is contained ?
}
}
}
return false;
}
// in Main Function
char *Arr = "ABCDEFG";
char *Seq = "AB";
bool contained = contains(Arr, Seq);
if (contained) {
printf("Contained\n");
} else {
printf("Not Contained\n");
}
Any ideas, suggestions, websites ... ?
Thanks in advance,
Regards, from ∆
The simplest way is the naive search function:
for (i = 0; i < lenS1; i++) {
for (j = 0; j < lenS2; j++) {
if (arr[i + j] != seq[j]) {
break; // seq is not present in arr at position i!
}
}
if (j == lenS2) {
return true;
}
}
Note that you cannot use sizeof because the value you seek is not known at compile time. sizeof will return the pointer size, so almost certainly four or eight, whatever strings you use. You need to calculate the string lengths explicitly, which in C is done by relying on the fact that the last character of a string is a zero:
lenS1 = 0;
while (arr[lenS1]) lenS1++;
lenS2 = 0;
while (seq[lenS2]) lenS2++;
An obvious and easy improvement is to limit i between 0 and lenS1 - lenS2, and if lenS1 < lenS2, immediately return false. Obviously if you haven't found "HELLO" in "WELCOME" by the time you've gotten to the 'L', there's no chance of five-character HELLO being ever contained in the four-character remainder COME:
if (lenS1 < lenS2) {
return false; // You will never find "PEACE" in "WAR".
}
lenS1minuslenS2 = lenS1 - lenS2;
for (i = 0; i <= lenS1minuslenS2; i++)
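Assembled into one function, the naive search with the length check might look like the sketch below. The name contains_sketch is made up for this example, and no <string.h> function is used. Note the inclusive upper bound: the last position where the sequence can still start is lenS1 - lenS2 itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the bounded naive search; the lengths are computed by hand. */
bool contains_sketch(const char *arr, const char *seq)
{
    size_t len_arr = 0, len_seq = 0;
    while (arr[len_arr]) len_arr++;
    while (seq[len_seq]) len_seq++;

    if (len_arr < len_seq)
        return false;                 /* seq cannot fit into arr */

    /* i runs up to and including len_arr - len_seq. */
    for (size_t i = 0; i <= len_arr - len_seq; i++) {
        size_t j = 0;
        while (j < len_seq && arr[i + j] == seq[j])
            j++;
        if (j == len_seq)
            return true;              /* every character of seq matched */
    }
    return false;
}
```

With this definition an empty sequence is reported as contained in any array, which mirrors what strstr does for an empty needle.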
Further improvements depend on your use case.
Looking for the same sequence among lots of arrays, looking for different sequences always in the same array, looking for lots of different sequences in lots of different arrays - all call for different optimizations.
The length and distribution of characters within both array and sequence also matter a lot, because if you know that there only are (say) three E's in a long string and you know where they are, and you need to search for HELLO, there's only three places where HELLO might fit. So you needn't scan the whole "WE WISH YOU A MERRY CHRISTMAS, WE WISH YOU A MERRY CHRISTMAS AND A HAPPY NEW YEAR" string. Actually you may notice there are no L's in the array and immediately return false.
A balanced option for an average use case (it does have pathological cases) might be supplied by the Boyer-Moore string matching algorithm (C source and explanation supplied at the link). This has a setup cost, so if you need to look for different short strings within very large texts, it is not a good choice (there is a parallel-search version which is good for some of those cases).
This is not the most efficient algorithm but I do not want to change your code too much.
size_t mystrlen(const char *str)
{
const char *end = str;
while(*end++);
return end - str - 1;
}
bool contains(char *Array, char *Sequence) {
// CONTAINS - Function
bool result = false;
size_t s, i;
size_t arrayLen = mystrlen(Array);
size_t sequenceLen = mystrlen(Sequence);
if(sequenceLen <= arrayLen)
{
for (i = 0; i < arrayLen; i++) {
for (s = 0; s < sequenceLen; s++)
{
if (Array[i + s] != Sequence[s])
{
break;
}
}
if(s == sequenceLen)
{
result = true;
break;
}
}
}
return result;
}
int main()
{
char *Arr = "ABCDEFG";
char *Seq = "AB";
bool contained = contains(Arr, Seq);
if (contained)
{
printf("Contained\n");
}
else
{
printf("Not Contained\n");
}
}
Basically this is strstr
const char* strstrn(const char* orig, const char* pat, int n)
{
const char* it = orig;
do
{
const char* tmp = it;
const char* tmp2 = pat;
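if (*tmp == *tmp2) {
while (*tmp == *tmp2 && *tmp != '\0') {
tmp++;
tmp2++;
}
// a match requires that the whole pattern was consumed
if (*tmp2 == '\0' && n-- == 0)
return it;
}
} while (*it++ != '\0');
return NULL;
}
The above returns a pointer to the n-th match (counting from zero) of the substring in the string, or NULL if there are fewer matches.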

Try to split string but got messy substrings

I am trying to split one string into 3-gram strings, but the resulting substrings always turn out messy. The length and char ** input... are needed, since I will use them as args later when calling the function from Python.
This is the function I wrote.
struct strArrIntArr getSearchArr(char* input, int length) {
struct strArrIntArr nameIndArr;
// flag of same bit
int same;
// flag/index of identical strings
int flag = 0;
// how many identical strings
int num = 0;
// array of split strings
char** nameArr = (char **)malloc(sizeof(char *) * (length - 2));
if ( nameArr == NULL ) exit(0);
// numbers of every split string
int* valueArr = (int* )malloc(sizeof(int) * (length-2));
if ( valueArr == NULL ) exit(0);
// loop length of search string -2 times (3-gram)
for(int i = 0; i<length-2; i++){
if(flag==0){
nameArr[i - num] = (char *)malloc(sizeof(char) * 3);
if ( nameArr[i - num] == NULL ) exit(0);
printf("----i------------%d------\n", i);
printf("----i-num--------%d------\n", i-num);
}
flag = 0;
// compare splitting string with existing split strings,
// if a string exists, it would not be stored
for(int k=0; k<i-num; k++){
same = 0;
for(int j=0; j<3; j++){
if(input[i + j] == nameArr[k][j]){
same ++;
}
}
// identical strings found, if all the three bits are the same
if(same == 3){
flag = k;
num++;
break;
}
}
// if the current split string doesn't exist yet
// put current split string to array
if(flag == 0){
for(int j=0; j<3; j++){
nameArr[i-num][j] = input[i + j];
valueArr[i-num] = 1;
}
}else{
valueArr[flag]++;
}
printf("-----string----%s\n", nameArr[i-num]);
}
// number of N-gram strings
nameIndArr.length = length- 2- num;
// array of N-gram strings
nameIndArr.charArr = nameArr;
nameIndArr.intArr = valueArr;
return nameIndArr;
}
To call the function:
int main(int argc, const char * argv[]) {
int length = 30;
char* input = (char *)malloc(sizeof(char) * length);
input = "googleapis.com.wncln.wncln.org";
// split the search string into N-gram strings
// and count the numbers of every split string
struct strArrIntArr nameIndArr = getSearchArr(input, length);
}
Below is the result. The strings from 17 are messy.
----i------------0------
----i-num--------0------
-----string----goo
----i------------1------
----i-num--------1------
-----string----oog
----i------------2------
----i-num--------2------
-----string----ogl
----i------------3------
----i-num--------3------
-----string----gle
----i------------4------
----i-num--------4------
-----string----lea
----i------------5------
----i-num--------5------
-----string----eap
----i------------6------
----i-num--------6------
-----string----api
----i------------7------
----i-num--------7------
-----string----pis
----i------------8------
----i-num--------8------
-----string----is.
----i------------9------
----i-num--------9------
-----string----s.c
----i------------10------
----i-num--------10------
-----string----.co
----i------------11------
----i-num--------11------
-----string----com
----i------------12------
----i-num--------12------
-----string----om.
----i------------13------
----i-num--------13------
-----string----m.w
----i------------14------
----i-num--------14------
-----string----.wn
----i------------15------
----i-num--------15------
-----string----wnc
---i------------16------
----i-num--------16------
-----string----ncl
----i------------17------
----i-num--------17------
-----string----clnsole
----i------------18------
----i-num--------18------
-----string----ln.=C:
----i------------19------
----i-num--------19------
-----string----n.wgram 馻绚s
----i------------20------
----i-num--------20------
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.oiles(騛窑=
----i------------26------
----i-num--------21------
-----string----.orSModu鯽蓼t
----i------------27------
----i-num--------22------
-----string----org
under win10, codeblocks 17.12, gcc 8.1.0
You are making life complicated for yourself in several places:
Don't count backwards: Instead of making num the count of duplicates, make it the count of unique trigraphs.
Scope variable definitions in functions as closely as possible. You have several uninitialized variables. You have declared them at the start of the function, but you need them only in local blocks.
Initialize as soon as you allocate. In your code, you use a flag to determine whether to create a new string. The code to allocate the string and the code to initialize it are in different blocks. Those blocks have the same flag as condition, but the flag is updated in between. This can get the two out of sync, and can even lead to bugs when you try to initialize memory that wasn't allocated.
It's probably better to keep the strings and their counts together in a struct. If anything, this will help you with sorting later. This also offers some simplification: Instead of allocating chunks of 3 bytes, keep a char array of four bytes in the struct, so that all entries can be properly null-terminated. Those don't need to be allocated separately.
Here's an alternative implementation:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
struct tri {
char str[4]; // trigraph: 3 chars and NUL
int count; // count of occurrences
};
struct stat {
struct tri *tri; // list of trigraphs with counts
int size; // number of trigraphs
};
/*
* Find string 'key' in list of trigraphs. Return the index
* in the array, or -1 if it isn't found.
*/
int find_trigraph(const struct tri *tri, int n, const char *key)
{
for (int i = 0; i < n; i++) {
int j = 0;
while (j < 3 && tri[i].str[j] == key[j]) j++;
if (j == 3) return i;
}
return -1;
}
/*
* Create an array of trigraphs from the input string.
*/
struct stat getSearchArr(char* input, int length)
{
int num = 0;
struct tri *tri = malloc(sizeof(*tri) * (length - 2));
for(int i = 0; i < length - 2; i++) {
int index = find_trigraph(tri, num, input + i);
if (index < 0) {
snprintf(tri[num].str, 4, "%.3s", input + i); // see [1]
tri[num].count = 1;
num++;
} else {
tri[index].count++;
}
}
for(int i = 0; i < num; i++) {
printf("#%d %s: %d\n", i, tri[i].str, tri[i].count);
}
struct stat stat = { tri, num };
return stat;
}
/*
* Driver code
*/
int main(void)
{
char *input = "googleapis.com.wncln.wncln.org";
int length = strlen(input);
struct stat stat = getSearchArr(input, length);
// ... do stuff with stat ...
free(stat.tri);
return 0;
}
Footnote 1: I find that snprintf(dst, n, "%.*s", len, src + offset) is useful for copying substrings: The result will not overflow the buffer and it will be null-terminated. There really ought to be a standard function for this, but strcpy may overflow and strncpy may leave the buffer unterminated.
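As a small illustration of the footnote, the idiom can be wrapped like this. The wrapper name copy_substring and its parameter order are assumptions made for this sketch, not an existing library function; the caller has to make sure that offset does not exceed the length of src.

```c
#include <stdio.h>

/* Hypothetical wrapper around the snprintf idiom: copies at most len
 * characters of src, starting at offset, into dst (capacity n bytes).
 * The result is always null-terminated if n > 0.
 * Returns whatever snprintf returns. */
int copy_substring(char *dst, size_t n, const char *src,
                   size_t offset, int len)
{
    return snprintf(dst, n, "%.*s", len, src + offset);
}
```

With a four-byte buffer, copy_substring(buf, sizeof buf, "googleapis", 3, 3) stores "gle". With a three-byte buffer the same kind of request is truncated to two characters plus the terminator instead of overflowing.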
This answer tries to fix the existing code instead of proposing alternative/better solutions.
After fixing the output
printf("-----string----%s\n", nameArr[i-num]);
in the question, there is still another important problem.
You want to store 3 characters in nameArr[i-num] and allocate space for 3 characters. Later you print it as a string in the code shown above. This requires a trailing '\0' after the 3 characters, so you have to allocate memory for 4 characters and either append a '\0' or initialize the allocated memory with 0. Using calloc instead of malloc would automatically initialize the memory to 0.
Here is a modified version of the source code
I also changed the initialization of the string value and its length in main() to avoid the memory leak.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct strArrIntArr {
int length;
char **charArr;
int *intArr;
};
struct strArrIntArr getSearchArr(char* input, int length) {
struct strArrIntArr nameIndArr;
// flag of same bit
int same;
// flag/index of identical strings
int flag = 0;
// how many identical strings
int num = 0;
// array of split strings
char** nameArr = (char **)malloc(sizeof(char *) * (length - 2));
if ( nameArr == NULL ) exit(0);
// numbers of every split string
int* valueArr = (int* )malloc(sizeof(int) * (length-2));
if ( valueArr == NULL ) exit(0);
// loop length of search string -2 times (3-gram)
for(int i = 0; i<length-2; i++){
if(flag==0){
nameArr[i - num] = (char *)malloc(sizeof(char) * 4);
if ( nameArr[i - num] == NULL ) exit(0);
printf("----i------------%d------\n", i);
printf("----i-num--------%d------\n", i-num);
}
flag = 0;
// compare splitting string with existing split strings,
// if a string exists, it would not be stored
for(int k=0; k<i-num; k++){
same = 0;
for(int j=0; j<3; j++){
if(input[i + j] == nameArr[k][j]){
same ++;
}
}
// identical strings found, if all three characters are the same
if(same == 3){
flag = k + 1; // index + 1, so that a match at index 0 is still non-zero
num++;
break;
}
}
// if the current split string doesn't exist yet
// put current split string to array
if(flag == 0){
for(int j=0; j<3; j++){
nameArr[i-num][j] = input[i + j];
}
valueArr[i-num] = 1;
nameArr[i-num][3] = '\0';
}else{
valueArr[flag - 1]++; // increment the count of the string found earlier
}
printf("-----string----%s\n", nameArr[i-num]);
}
// number of N-gram strings
nameIndArr.length = length- 2- num;
// array of N-gram strings
nameIndArr.charArr = nameArr;
nameIndArr.intArr = valueArr;
return nameIndArr;
}
int main(int argc, const char * argv[]) {
int length;
char* input = strdup("googleapis.com.wncln.wncln.org");
length = strlen(input);
// split the search string into N-gram strings
// and count the numbers of every split string
struct strArrIntArr nameIndArr = getSearchArr(input, length);
}
This other answer contains more improvements which I personally would prefer over the modified original solution.

remove a specified number of characters from a string in C

I can't write working code for a function that deletes N characters from the string S, starting from position P. How would you write such a function?
void remove_substring(char *s, int p, int n) {
int i;
if(n == 0) {
printf("%s", s);
}
for (i = 0; i < p - 1; i++) {
printf("%c", s[i]);
}
for (i = strlen(s) - n; i < strlen(s); i++) {
printf("%c", s[i]);
}
}
Example:
s: "abcdefghi"
p: 4
n: 3
output:
abcghi
But for a case like n = 0 and p = 1 it's not working!
Thanks a lot!
A few people have shown you how to do this, but most of their solutions are highly condensed, use standard library functions or simply don't explain what's going on. Here's a version that includes not only some very basic error checking but some explanation of what's happening:
void remove_substr(char *s, size_t p, size_t n)
{
// p is 1-indexed for some reason... adjust it.
p--;
// ensure that we're not being asked to access
// memory past the current end of the string.
// Note that if p is already past the end of
// string then p + n will, necessarily, also be
// past the end of the string so this one check
// is sufficient.
if(p + n > strlen(s)) // p + n == strlen(s) is fine: it removes the tail
return;
// Offset n to account for the data we will be
// skipping.
n += p;
// We copy one character at a time until we
// find the end-of-string character
while(s[n] != 0)
s[p++] = s[n++];
// And make sure our string is properly terminated.
s[p] = 0;
}
One caveat to watch out for: please don't call this function like this:
remove_substr("abcdefghi", 4, 3);
Or like this:
char *s = "abcdefghi";
remove_substr(s, 4, 3);
Doing so will result in undefined behavior, as string literals are read-only and modifying them is not allowed by the standard.
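A safe call passes a writable array instead. The sketch below shows the pattern together with a compact variant of the function so that it is self-contained. The name remove_substr_sketch is made up for this illustration, and it uses memmove rather than the character loop above.

```c
#include <string.h>

/* Compact variant for illustration: removes n characters starting at the
 * 1-based position p. Does nothing if the range is out of bounds. */
void remove_substr_sketch(char *s, size_t p, size_t n)
{
    size_t len = strlen(s);

    if (p == 0 || p - 1 + n > len)
        return;

    /* Move the tail, including the terminating null, over the gap. */
    memmove(s + p - 1, s + p - 1 + n, len - (p - 1 + n) + 1);
}

/* Safe way to call it: an array initialised from a literal is a
 * writable copy of that literal.
 *
 *     char s[] = "abcdefghi";
 *     remove_substr_sketch(s, 4, 3);   // s now holds "abcghi"
 */
```

The difference to the two calls above is only the declaration: char s[] = "..." creates an array that the program owns, while char *s = "..." merely points at the literal.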
Strictly speaking, you didn't implement a removal of a substring: your code prints the original string with a range of characters removed.
Another thing to note is that according to your example, the index p is one-based, not zero-based like it is in C. Otherwise the output for "abcdefghi", 4, 3 would have been "abcdhi", not "abcghi".
With this in mind, let's make some changes. First, your math is a little off: the last loop should look like this:
for (i = p+n-1; i < strlen(s); i++) {
printf("%c", s[i]);
}
Demo on ideone.
If you would like to use C's zero-based indexing scheme, change your loops as follows:
for (i = 0; i < p; i++) {
printf("%c", s[i]);
}
for (i = p+n; i < strlen(s); i++) {
printf("%c", s[i]);
}
In addition, you should return from the if at the top, or add an else:
if(n == 0) {
printf("%s", s);
return;
}
or
if(n == 0) {
printf("%s", s);
} else {
// The rest of your code here
...
}
or remove the if altogether: it's only an optimization, your code is going to work fine without it, too.
Currently, your code would print the original string twice when n is 0.
If you would like to make your code remove the substring and return a result, you need to allocate the result, and replace printing with copying, like this:
char *remove_substring(char *s, int p, int n) {
// You need to do some checking before calling malloc
if (n == 0) return s;
size_t len = strlen(s);
if (n < 0 || p < 0 || p+n > len) return NULL;
size_t rlen = len-n+1;
char *res = malloc(rlen);
if (res == NULL) return NULL;
char *pt = res;
// Now let's use the two familiar loops,
// except printf("%c"...) will be replaced with *pt++ = ...
for (int i = 0; i < p; i++) {
*pt++ = s[i];
}
for (int i = p+n; i < strlen(s); i++) {
*pt++ = s[i];
}
*pt='\0';
return res;
}
Note that this new version of your code returns dynamically allocated memory, which needs to be freed after use.
Here is a demo of this modified version on ideone.
Try copying the first part of the string, then the second
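char result[10];
const char input[] = "abcdefg";
int n = 3;
int p = 4;
size_t len = strlen(input);
strncpy(result, input, p); // the first p characters
strncpy(result+p, input+p+n, len-p-n); // everything behind the removed part
result[len-n] = '\0'; // strncpy does not terminate the string here
printf("%s", result);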
If you are looking to do this without the use of functions like strcpy or strncpy (which I see you said in a comment) then use a similar approach to how strcpy (or at least one possible variant) works under the hood:
void strnewcpy(char *dest, char *origin, int n, int p) {
// the parentheses are required: && binds tighter than =
while(p-- && (*dest++ = *origin++))
;
origin += n;
while((*dest++ = *origin++))
;
}
metacode:
allocate a buffer for the destination
declare a pointer s to your source string
advance the pointer "p-1" positions in your source string and copy them on the fly to destination
advance "n" positions
copy rest to destination
What did you try? Doesn't strcpy(s+p, s+p+n) work?
Edit: Fixed to not rely on undefined behaviour in strcpy:
void remove_substring(char *s, int p, int n)
{
p--; // 1 indexed - why?
memmove(s+p, s+p+n, strlen(s+p+n) + 1); // +1 also moves the terminating null
}
If your heart's really set on it, you can also replace the memmove call with a loop:
char *dst = s + p;
char *src = s + p + n;
size_t count = strlen(src) + 1; // characters to move, including the terminating null
for (size_t i = 0; i < count; i++)
*dst++ = *src++;
And if you do that, you can strip out the strlen call, too:
while ((*dst++ = *src++) != '\0');
But I'm not sure I recommend compressing it that much.

Reversing an array in C

I am new to cpp and have a question regarding arrays. The code I have below should create a reversed version of str and have it be stored in newStr. However, newStr always comes up empty. Can someone explain to me why this is happening even though I am assigning a value from str into it?
void reverse (char* str) {
char* newStr = (char*)malloc(sizeof(str));
for (int i=0;i<sizeof(str)/sizeof(char);i++) {
int index = sizeof(str)/sizeof(char)-1-i;
newStr [i] = str [index];
}
}
PS: I know that it is much more efficient to reverse an array by moving the pointer or by using the std::reverse function but I am interested in why the above code does not work.
As above commenters pointed out sizeof(str) does not tell you the length of the string. You should use size_t len = strlen(str);
void reverse (char* str) {
size_t len = strlen(str);
char* newStr = (char*)malloc(len + 1);
for (int i=0; i<len;i++) {
int index = len-1-i;
newStr[i] = str[index];
}
newStr[len] = '\0'; // Add terminator to the new string.
}
Don't forget to free any memory you malloc. I assume your function is going to return your new string?
Edit: +1 on the length to make room for the terminator.
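If the function is indeed meant to hand the new string back, a minimal sketch could look as follows. The name reversed_copy is made up for this example; the caller is responsible for calling free() on the result.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: returns a newly allocated reversed copy of str,
 * or NULL if the allocation fails. */
char *reversed_copy(const char *str)
{
    size_t len = strlen(str);
    char *out = malloc(len + 1);     /* +1 for the terminating null */

    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
        out[i] = str[len - 1 - i];
    out[len] = '\0';
    return out;
}
```

Returning the pointer is what makes the result visible to the caller; in the void version the only pointer to the new memory is lost when the function returns.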
The sizeof operator (it is not a function!) is evaluated at compile time. You are passing it a pointer to a region of memory that you claim holds a string. However, the length of this string isn't fixed at compile time. sizeof(str)/sizeof(char) will always yield the size of a pointer on your architecture, probably 8 or 4.
What you want is to use strlen to determine the length of your string.
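The difference can be made visible with two tiny functions. Their names are made up for this sketch; both receive the same pointer and report what sizeof and strlen see through it.

```c
#include <stddef.h>
#include <string.h>

/* A char* parameter carries no size information:
 * sizeof yields the size of the pointer itself. */
size_t size_seen_through_pointer(const char *str)
{
    return sizeof(str);              /* same as sizeof(char *) */
}

/* strlen counts the characters up to the terminating null at run time. */
size_t length_seen_through_pointer(const char *str)
{
    return strlen(str);
}
```

For char text[] = "hello"; the expression sizeof text in the scope of the array is 6 (five characters plus the terminator), the first function returns sizeof(char *) regardless of the contents, and the second returns 5.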
Alternatively, a more idiomatic way of doing this would be to use std::string (if you insist of reversing the string yourself)
std::string reverse(std::string str) {
for (std::string::size_type i = 0, j = str.size(); i+1 < j--; ++i) {
char const swap = str[i];
str[i] = str[j];
str[j] = swap;
}
return str;
}
Note that due to implicit conversion (see overload (5)), you can also call this function with your plain C-style char pointer.
There are two issues here:
The sizeof operator won't give you the length of the string. Rather, it gives you the size of a char* on the machine you are using. You can use strlen() instead to get the length of the string.
A C string is terminated by a null character ('\0'), which is why strlen() can return the correct length of the string. You need to make sure you are not accidentally copying the null character from your source string to the beginning of your destination string. Also, you need to add a null character at the end of your destination string, or you will get unexpected output.
#include <bits/stdc++.h>
using namespace std;
vector<string> split_string(string);
// Complete the reverseArray function below.
vector<int> reverseArray(vector<int> a) {
return {a.rbegin(), a.rend()};
}
int main()
{
ofstream fout(getenv("OUTPUT_PATH"));
int arr_count;
cin >> arr_count;
cin.ignore(numeric_limits<streamsize>::max(), '\n');
string arr_temp_temp;
getline(cin, arr_temp_temp);
vector<string> arr_temp = split_string(arr_temp_temp);
vector<int> arr(arr_count);
for (int i = 0; i < arr_count; i++) {
int arr_item = stoi(arr_temp[i]);
arr[i] = arr_item;
}
vector<int> res = reverseArray(arr);
for (int i = 0; i < res.size(); i++) {
fout << res[i];
if (i != res.size() - 1) {
fout << " ";
}
}
fout << "\n";
fout.close();
return 0;
}
vector<string> split_string(string input_string) {
string::iterator new_end = unique(input_string.begin(), input_string.end(), [] (const char &x, const char &y) {
return x == y and x == ' ';
});
input_string.erase(new_end, input_string.end());
while (input_string[input_string.length() - 1] == ' ') {
input_string.pop_back();
}
vector<string> splits;
char delimiter = ' ';
size_t i = 0;
size_t pos = input_string.find(delimiter);
while (pos != string::npos) {
splits.push_back(input_string.substr(i, pos - i));
i = pos + 1;
pos = input_string.find(delimiter, i);
}
splits.push_back(input_string.substr(i, min(pos, input_string.length()) - i + 1));
return splits;
}

Resources