Write a function in C language that:
Takes as its only parameter a sentence stored in a string (e.g., "This is a short sentence.").
Returns a string consisting of the number of characters in each word (including punctuation), with spaces separating the numbers. (e.g., "4 2 1 5 9").
I wrote the following program:
int main()
{
char* output;
char *input = "My name is Pranay Godha";
output = numChar(input);
printf("output : %s",output);
getch();
return 0;
}
char* numChar(char* str)
{
int len = strlen(str);
char* output = (char*)malloc(sizeof(char)*len);
char* out = output;
int count = 0;
while(*str != '\0')
{
if(*str != ' ' )
{
count++;
}
else
{
*output = count+'0';
output++;
*output = ' ';
output++;
count = 0;
}
str++;
}
*output = count+'0';
output++;
*output = '\0';
return out;
}
I was just wondering that I am allocating len amount of memory for output string which I feel is more than I should have allocated hence there is some wasting of memory. Can you please tell me what can I do to make it more memory efficient?
I see lots of little bugs. If I were your instructor, I'd grade your solution at "C-". Here's some hints on how to turn it into "A+".
char* output = (char*)malloc(sizeof(char)*len);
Two main issues with the above line. For starters, you are forgetting to "free" the memory you allocate. But that's easily forgiven.
Actual real bug. If your string was only 1 character long (e.g. "x"), you would only allocate one byte. But you would likely need to copy two bytes into the string buffer. a '1' followed by a null terminating '\0'. The last byte gets copied into invalid memory. :(
Another bug:
*output = count+'0';
What happens when "count" is larger than 9? If "count" was 10, then *output gets assigned a colon, not "10".
Start by writing a function that just counts the number of words in a string. Assign the result of this function to a variable call num_of_words.
Since you could very well have words longer than 9 characters, so some words will have two or more digits for output. And you need to account for the "space" between each number. And don't forget the trailing "null" byte.
If you think about the case in which a 1-byte unsigned integer can have at most 3 chars in a string representation ('0'..'255') not including the null char or negative numbers, then sizeof(int)*3 is a reasonable estimate of the maximum string length for an integer representation (not including a null char). As such, the amount of memory you need to alloc is:
num_of_words = countWords(str);
num_of_spaces = (num_of_words > 0) ? (num_of_words - 1) : 0;
output = malloc(num_of_spaces + sizeof(int)*3*num_of_words + 1); // +1 for null char
So that's a pretty decent memory allocation estimate, but it will definitely allocate enough memory in all scenarios.
I think you have a few other bugs in your program. For starters, if there are multiple spaces between each word e.g.
"my gosh"
I would expect your program to print "2 4". But your code prints something else. Likely other bugs exist if there are leading or trailing spaces in your string. And the memory allocation estimate doesn't account for the extra garbage chars you are inserting in those cases.
Update:
Given that you have persevered and attempted to make a better solution in your answer below, I'm going to give you a hint. I have written a function that PRINTs the length of all words in a string. It doesn't actually allocate a string. It just prints it - as if someone had called "printf" on the string that your function is to return. Your job is to extrapolate how this function works - and then modify it to return a new string (that contains the integer lengths of all the words) instead of just having it print. I would suggest you modify the main loop in this function to keep a running total of the word count. Then allocate a buffer of size = (word_count * 4 *sizeof(int) + 1). Then loop through the input string again to append the length of each word into the buffer you allocated. Good luck.
void PrintLengthOfWordsInString(const char* str)
{
if ((str == NULL) || (*str == '\0'))
{
return;
}
while (*str)
{
int count = 0;
// consume leading white space
while ((*str) && (*str == ' '))
{
str++;
}
// count the number of consecutive non-space chars
while ((*str) && (*str != ' '))
{
count++;
str++;
}
if (count > 0)
{
printf("%d ", count);
}
}
printf("\n");
}
The answer is: it depends. There are trade-offs.
Yes, it's possible to write some extra code that, before performing this action, counts the number of words in the original string and then allocates the new string based on the number of words rather than the number of characters.
But is it worth it? The extra code would make your program longer. That is, you would have more binary code, taking up more memory, which may be more than you gain. In addition, it will take more time to run.
By the way, you have a memory leak in your program, which is more of a problem.
As long as none of the words in the sentence are longer than 9 characters, the length of your output array needs only to be the number of words in the sentence, multiplied by 2 (to account for the spaces), plus an extra one for the null terminator.
So for the string
My name is Pranay Godha
...you need only an array of length 11.
If any of the words are ten characters or more, you'll need to calculate how many extra char your array will need by determining the length of the numeric required. (e.g. a word of length 10 characters clearly requires two char to store the number 10.)
The real question is, is all of this worth it? Unless you're specifically required (homework?) to use the minimal space required in your output array, I'd be minded to allocate a suitably large array and perform some bounds checking when writing to it.
Related
I want to write a function that converts CamelCase to snake_case without using tolower.
Example: helloWorld -> hello_world
This is what I have so far, but the output is wrong because I overwrite a character in the string here: string[i-1] = '_';.
I get hell_world. I don't know how to get it to work.
void snake_case(char *string)
{
int i = strlen(string);
while (i != 0)
{
if (string[i] >= 65 && string[i] <= 90)
{
string[i] = string[i] + 32;
string[i-1] = '_';
}
i--;
}
}
This conversion means, aside from converting a character from uppercase to lowercase, inserting a character into the string. This is one way to do it:
iterate from left to right,
if an uppercase character if found, use memmove to shift all characters from this position to the end the string one position to the right, and then assigning the current character the to-be-inserted value,
stop when the null-terminator (\0) has been reached, indicating the end of the string.
Iterating from right to left is also possible, but since the choice is arbitrary, going from left to right is more idiomatic.
A basic implementation may look like this:
#include <stdio.h>
#include <string.h>
void snake_case(char *string)
{
for ( ; *string != '\0'; ++string)
{
if (*string >= 65 && *string <= 90)
{
*string += 32;
memmove(string + 1U, string, strlen(string) + 1U);
*string = '_';
}
}
}
int main(void)
{
char string[64] = "helloWorldAbcDEFgHIj";
snake_case(string);
printf("%s\n", string);
}
Output: hello_world_abc_d_e_fg_h_ij
Note that:
The size of the string to move is the length of the string plus one, to also move the null-terminator (\0).
I am assuming the function isupper is off-limits as well.
The array needs to be large enough to hold the new string, otherwise memmove will perform invalid writes!
The latter is an issue that needs to be dealt with in a serious implementation. The general problem of "writing a result of unknown length" has several solutions. For this case, they may look like this:
First determine how long the resulting string will be, reallocating the array, and only then modifying the string. Requires two passes.
Every time an uppercase character is found, reallocate the string to its current size + 1. Requires only one pass, but frequent reallocations.
Same as 2, but whenever the array is too small, reallocate the array to twice its current size. Requires a single pass, and less frequent (but larger) reallocations. Finally reallocate the array to the length of the string it actually contains.
In this case, I consider option 1 to be the best. Doing two passes is an option if the string length is known, and the algorithm can be split into two distinct parts: find the new length, and modify the string. I can add it to the answer on request.
This is my code for two functions in C:
// Begin
void readTrain(Train_t *train){
printf("Name des Zugs:");
char name[STR];
getlinee(name, STR);
strcpy(train->name, name);
printf("Name des Drivers:");
char namedriver[STR];
getlinee(namedriver, STR);
strcpy(train->driver, namedriver);
}
void getlinee(char *str, long num){
char c;
int i = 0;
while(((c=getchar())!='\n') && (i<num)){
*str = c;
str++;
i++;
}
printf("i is %d\n", i);
*str = '\0';
fflush(stdin);
}
// End
So, with void getlinee(char *str, long num) function I want to get user input to first string char name[STR] and to second char namedriver[STR]. Maximal string size is STR (30 charachters) and if I have at the input more than 30 characters for first string ("Name des Zuges"), which will be stored in name[STR], after that I input second string, which will be stored in namedriver, and then printing FIRST string, I do not get the string from the user input (first 30 characters from input), but also the second string "attached" to this, I simply do not know why...otherwise it works good, if the limit of 30 characters is respected for the first string.
Here my output, when the input is larger than 30 characters for first string, problem is in the row 5 "Zugname", why I also have second string when I m printing just first one...:
Name des Zugs:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i is 30
Name des Drivers:xxxxxxxx
i is 8
Zugname: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxxxxxxxx
Drivername: xxxxxxxx
I think your issue is that your train->name is not properly terminated with '\0', as a consequence when you call printf("%s", train->name) the function keeps reading memory until it finds '\0'. In your case I guess your structure looks like:
struct Train_t {
//...
char name[STR];
char driver[STR];
//...
};
In getlinee() function, you write '\0' after the last character. In particular, if the input is more than 30 characters long, you copy the first 30 characters, then add '\0' at the 31-th character (name[30]). This is a first buffer overflow.
So where is this '\0' actually written? well, at name[30], even though your not supposed to write there. Then, if you have the structure above when you do strcpy(train->name, name); you will actually copy a 31-bytes long string: 30 chars into train->name, and the '\0' will overflow into train->driver[0]. This is the second buffer overflow.
After this, you override the train->driver buffer so the '\0' disappears and your data in memory basically looks like:
train->name = "aaa...aaa" // no '\0' at the end so printf won't stop reading here
train->driver = "xxx\0" // but there
You have an off-by-one error on your array sizes -- you have arrays of STR chars, and you read up to STR characters into them, but then you store a NUL terminator, requiring (up to) STR + 1 bytes total. So whenever you have a max size input, you run off the end of your array(s) and get undefined behavior.
Pass STR - 1 as the second argument to getlinee for the easiest fix.
Key issues
Size test in wrong order and off-by-one. ((c=getchar())!='\n') && (i<num) --> (i+1<num) && ((c=getchar())!='\n'). Else no room for the null character. Bad form to consume an excess character here.
getlinee() should be declared before first use. Tip: Enable all compiler warnings to save time.
Other
Use int c; not char c; to well distinguish the typical 257 different possible results from getchar().
fflush(stdin); is undefined behavior. Better code would consume excess characters in a line with other code.
void getlinee(char *str, long num) better with size_t num. size_t is the right size type for array sizing and indexing.
int i should be the same type as num.
Better code would also test for EOF.
while((i<num) && ((c=getchar())!='\n') && (c != EOF)){
A better design would return something from getlinee() to indicate success and identify troubles like end-of-file with nothing read, input error, too long a line and parameter trouble like str == NULL, num <= 0.
I believe you have a struct similar to this:
typedef struct train_s
{
//...
char name[STR];
char driver[STR];
//...
} Train_t;
When you attempt to write a '\0' to a string that is longer than STR (30 in this case), you actually write a '\0' to name[STR], which you don't have, since the last element of name with length STR has an index of STR-1 (29 in this case), so you are trying to write a '\0' outside your array.
And, since two strings in this struct are stored one after another, you are writing a '\0' to driver[0], which you immediately overwrite, hence when printing out name, printf doesn't find a '\0' until it reaches the end of driver, so it prints both.
Fixing this should be easy.
Just change:
while(((c=getchar())!='\n') && (i<num))
to:
while(((c=getchar())!='\n') && (i<num - 1))
Or, as I would do it, add 1 to array size:
char name[STR + 1];
char driver[STR + 1];
I'm facing an issue connected with printing one char from string in c.
The function takes from users two variables - number (number which should print character from string) and string. When I put as a string "Martin" and number is 5 then the output is "i". But when the number is larger than the string length something goes wrong and I actually don't know what's wrong.
PS. If the number is longer than string size it should print "Nothing".
void printLetter() {
char * string = (char*)malloc(sizeof(char));
int n;
printf("Number:\n");
scanf("%i", &n);
printf("String:\n");
scanf("%s", string);
if(n > strlen(string)) {
printf("nothing");
} else {
printf("%c\n", string[n+1]);
}
free(string);
}
There is no need for dynamic allocation here, since you do not know the length of the string in advance, so just do:
void printLetter() {
char string[100]; // example size 100
...
scanf("%99s", string); // read no more than your array can hold
}
A fun exercise would be to count the length of the string, allocate dynamically exactly as mush space as you need (+1 for the null terminator), copy string to that dynamically allocated space, use it as you wish, and then free it.
Moreover this:
printf("%c\n", string[n+1]);
should be written as this:
printf("%c\n", string[n-1]);
since you do not want to go out bounds of your array (and cause Undefined Behavior), or print two characters next of the requested character, since when I ask for the 1st character, you should print string[0], when I ask for the 2nd, you should print string[1], and so on. So you see why we need to print string[n-1], when the user asks for the n-th letter.
By the way, it's common to use a variable named i, and not n as in your case, when dealing with an index. ;)
In your code, this:
char * string = malloc(sizeof(char));
allocates memory for just one character, which is no good, since even if the string had one letter only, where would you put the null terminator? You know that strings in C should (almost) always be NULL terminated.
In order to allocate dynamically memory for a string of size N, you should do:
char * string = malloc((N + 1) * sizeof(char));
where you allocate space for N characters, plus 1 for the NULL terminator.
Couple of problems...
sizeof(char) is generally 1 byte. Hence malloc() is allocating only one byte of memory to string. Perhaps a larger block of memory is required? "Martin", for example, will require at least 6 bytes, plus the string termination character (seven bytes total).
printf("%c\n", string[n+1]) is perhaps not quite right...
String: Martin\0
strlen= 6
Offset: 0123456
n = 5... [n+1] = 6
The character being output is the string terminator '\0' at index 6.
This might work better:
void printLetter() {
char * string = malloc(100 * sizeof(char));
int n;
printf("Number:\n");
scanf("%i", &n);
printf("String:\n");
scanf("%s", string);
if(n > strlen(string)) {
printf("nothing");
} else {
printf("%c\n", string[n-1]);
}
free(string);
}
You are facing buffer overflow.
Take a look to this question, so it will show you how to manage your memory properly in such situation: How to prevent scanf causing a buffer overflow in C?
Alternatively you can ask for number of letter and allocate only that much memory + 1. Then fgets(string, n,stdin); because you don't need rest of the string :-)
I'm given a string as such:
"Hello World"
With 5 Spaces characters inbetween the two words. I want to remove all but one of the spaces between the two words . However, my code seems to only work when there is 3 spaces or less. I'm using memmove to try and accomplish this.
Here is what I've tried:
int main(void) {
char * word = malloc(sizeof(char)*16);
strcpy(word,"Hello World");
checkWords(word);
return 0;
}
void checkWords(char * word) {
int i;
for(i=0; i < strlen(word); i++) {
if(word[i] == ' ' )
memmove(&word[i],&word[i+1],strlen(word)+1);
}
printf("The string without spaces is %s\n",word);
}
The output here is "Hello World"
Not "Hello World"
If It try input such as:
"Hello World" gets me "Hello World" -->correct
"Hello World" gets me "Hello World" -->correct
anything greater than 3 spaces, gets me incorrect output. (I want to have one space between the two words.
There are four problems. One is undefined behaviour: Take a 1,000,000 byte string where just the last character is a space. You will be moving about one megabyte after the end of the string one byte forward. That's quite fatal.
One is just a bug: You examine every character position in the string only once. If you have two consecutive spaces, you move the second space into the position of the first one and leave it there.
And one is a performance issue: If you fix the first two problems, and then you take a 1,000,000 byte string consisting of spaces only, you will do one million memmoves each moving between 0 and 1 megabyte. That's 500 gigabyte moved. That takes time.
And another performance issue is calling strlen in a loop. If your string is one million bytes, you do one million calls to strlen, and each call will go through the whole string until the end, scanning the whole megabyte string for the trailing zero byte.
PS. I misinterpreted what you are trying to achieve: Your code will delete any single space, and delete one out of any pairs of spaces. So it leaves one space for every two spaces you had. So it won't work if there is one space. It works by coincidence for two or three spaces. And if you have lots of spaces, it will keep about half of them.
Copying the "right" side of the string potentially multiple times is inefficient. Use of strcpy() or memcpy() is not an efficient approach.
Repeatedly calling strlen() is also inefficient.
Suggest using two indexes and walk them through the string.
#include <stdbool.h>
void RemoveExtraSpaces(char * word) {
if (word[0]) {
size_t src = 1;
size_t dest = 1;
do {
if (word[src] != ' ' || word[src - 1] != ' ') {
word[dest] = word[src];
dest++;
}
} while (word[src++]);
}
printf("The string without spaces is `%s`\n", word);
}
Unclear what should happen with multiple spaces before the first word or after the last word. This code shrinks those to 1 space.
After accept variation - slight simplification. Inspired by #Joachim Pileborg good answer.
void RemoveExtraSpaces2(char * word) {
size_t src = 0;
size_t dest = 0;
do {
if (word[src] == ' ' && word[src + 1] == ' ') {
src++;
}
word[dest] = word[src];
dest++;
} while (word[src++]);
printf("The string without spaces is `%s`\n", word);
}
As this is too close to being a duplicate of Replace multiple spaces by single space in C, (BTW: which did not accept the best answer IMO), I am making this community wiki.
This works for me
for(i=0; i < strlen(word); i++) {
if(word[i] == ' ' )
{
while (word[i+1] == ' ' && word[i+1] != '\0')
memmove(&word[i],&word[i+1], strlen(word)-i);
}
}
Okay I have two problems with my solution to this problem, I was hoping I could get some help on. The problem itself is being able to print out #s in a specific format based on user input.
My questions are:
When I input 7, it outputs the correct solution, but when I output 8 (or higher), my buffer, for whatever reason add some garbage at the end, which I am unsure why it happens. I would add a picture but I don't have enough rep points for it :(
In my code, where I've inputted **HELPHERE**, I'm unsure why this gives me the correct solution. I'm confused because in the links I've read (on format specifiers) I thought that the 1 input (x in my case) specified how many spaces you wanted. I thought this would've made the solution x-n, as each consequent row, you'd need the space segment to decrease by 1 each time. Am I to understand that the array somehow reverses it's input into the printf statement? I'm confused because does that mean since the array increases by 1, on each subsequent iteration of the loop, it eats into the space area?
int main(void){
printf("Height: ");
int x = GetInt();
int n = 1;
int k=0;
char buff[x]; /* creates buffer where hashes will go*/
while(n<=x){ /* stops when getint value is hit*/
while(k<n) /* fill buffer on each iteration of loop with 1 more hashtag*/
{
buff[k] = '#';
k++;
}
printf("%*s",x, buff); /*makes x number of spaces ****HELPHERE*****, then prints buffer*/
printf(" ");
printf("%s\n",buff); /*prints other side of triangle */
/*printf("%*c \n",x-n, '\0');*/
n++;
}
}
Allocate enough memory and make sure the string is null terminated:
char buff[x+1];//need +1 for End of the string('\0')
memset(buff, '\0', sizeof(buff));//Must be initialized by zero
Print as many blanks as requested by blank-padding an empty string:
printf("%*s", x, "");
※the second item was written by Jonathan Leffler.
In printf("%*s",x, buff);, buff in not null character terminated.
Present code "worked" sometimes as buff was not properly terminated and the result was UB - undefined behavior. What likely happened in OP's case was that the buffer up to size 7, fortunately had '\0' in subsequent bytes, but not so when size was 8.
1) As per #BLUEPIXY, allocated a large enough buffer to accommodate the '#' and the terminating '\0' with char buff[x+1];
2) Change while loop to append the needed '\0'.
while (k<n) {
buff[k] = '#';
k++;
}
buff[k] = '\0';
3) Minor:insure x is valid.
if (x < 0) Handle_Error();
char buff[x];
4) Minor: Return a value for int main() such as return 0;.