How do I parse a string in C? - c

I am a beginner learning C; so, please go easy on me. :)
I am trying to write a very simple program that takes each word of a string into a "Hi (input)!" sentence (it assumes you type in names). Also, I am using arrays because I need to practice them.
My problem is that some garbage gets put into the arrays somewhere, and it messes up the program. I tried to figure out the problem but to no avail; so, it is time to ask for expert help. Where have I made mistakes?
p.s.: It also has an infinite loop somewhere, but it is probably the result of the garbage that is put into the array.
#include <stdio.h>
#define MAX 500 //Maximum Array size.
int main(int argc, const char * argv[])
{
int stringArray [MAX];
int wordArray [MAX];
int counter = 0;
int wordCounter = 0;
printf("Please type in a list of names then hit ENTER:\n");
// Fill up the stringArray with user input.
stringArray[counter] = getchar();
while (stringArray[counter] != '\n') {
stringArray[++counter] = getchar();
}
// Main function.
counter = 0;
while (stringArray[wordCounter] != '\n') {
// Puts first word into temporary wordArray.
while ((stringArray[wordCounter] != ' ') && (stringArray[wordCounter] != '\n')) {
wordArray[counter++] = stringArray[wordCounter++];
}
wordArray[counter] = '\0';
//Prints out the content of wordArray.
counter = 0;
printf("Hi ");
while (wordArray[counter] != '\0') {
putchar(wordArray[counter]);
counter++;
}
printf("!\n");
//Clears temporary wordArray for new use.
for (counter = 0; counter == MAX; counter++) {
wordArray[counter] = '\0';
}
wordCounter++;
counter = 0;
}
return 0;
}
Solved it! I needed to add the following if statement at the end, where I incremented the wordCounter. :)
if (stringArray[wordCounter] != '\n') {
wordCounter++;
}

You are using int arrays to represent strings, probably because getchar() returns an int. However, strings are better represented as char arrays, since that's what they are in C. The fact that getchar() returns an int is certainly confusing; it does so because it needs to be able to return the special value EOF, which doesn't fit in a char. Therefore it uses int, which is a "larger" type (able to represent more different values), so it can fit all the char values and EOF.
With char arrays, you can use C's string functions directly:
char stringArray[MAX];
if(fgets(stringArray, sizeof stringArray, stdin) != NULL)
printf("You entered %s", stringArray);
Note that fgets() will leave the end-of-line character(s) in the string, so you might want to strip them out. I suggest implementing an in-place function that trims off leading and trailing whitespace; it's a good exercise as well.
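The trimming exercise suggested above could be sketched roughly as follows. The function name trim_in_place is made up for illustration, and the sketch assumes the buffer holds a null-terminated string such as the one fgets() produces.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: trim leading and trailing whitespace from s in
 * place and return s. */
char *trim_in_place(char *s)
{
    size_t len = strlen(s);
    size_t start = 0;

    /* Drop trailing whitespace, including the '\n' that fgets() keeps. */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        len--;

    /* Skip leading whitespace. */
    while (start < len && isspace((unsigned char)s[start]))
        start++;

    /* Slide the remaining characters to the front and terminate. */
    memmove(s, s + start, len - start);
    s[len - start] = '\0';
    return s;
}
```

memmove() is used rather than memcpy() because the source and destination ranges overlap.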

for (counter = 0; counter == MAX; counter++) {
wordArray[counter] = '\0';
}
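You never enter this loop: the condition counter == MAX is false when counter is 0, so the body never runs. The condition should be counter < MAX.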
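A minimal sketch of the clearing loop with a condition that is true on entry. The wrapper function clear_word is hypothetical; MAX is the same constant as in the question.

```c
#define MAX 500 /* same array size as in the question */

/* Hypothetical helper: reset every element of the temporary word buffer. */
void clear_word(int wordArray[MAX])
{
    int counter;
    /* A for loop keeps running while its condition is true, so the test
     * has to be "<", not "==". */
    for (counter = 0; counter < MAX; counter++) {
        wordArray[counter] = '\0';
    }
}
```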

user1799795,
For what it's worth (now that you've solved your problem) I took the liberty of showing you how I'd do this given the restriction "use arrays", and explaining a bit about why I'd do it that way... Just beware that while I am an experienced programmer I'm no C guru... I've worked with guys who absolutely blew me into the C-weeds (pun intended).
#include <stdio.h>
#include <string.h>
#define LINE_SIZE 500
#define MAX_WORDS 50
#define WORD_SIZE 20
// Main function.
int main(int argc, const char * argv[])
{
int counter = 0;
// ----------------------------------
// Read a line of input from the user (ie stdin)
// ----------------------------------
char line[LINE_SIZE];
printf("Please type in a list of names then hit ENTER:\n");
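if ( fgets(line, LINE_SIZE, stdin) == NULL ) { // NULL means end-of-input or an error
fprintf(stderr, "You must enter something. Pretty please!\n");
return 1; // retrying would loop forever once stdin is at end-of-file
}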
// A note on that LINE_SIZE parameter to the fgets function:
// wherever possible it's a good idea to use the version of the standard
// library function that allows you to specify the maximum length of the
// string (or indeed any array) because that dramatically reduces the
// incidence of "string overruns", which are a major source of bugs in C
// programs.
// Also note that fgets includes the end-of-line character/sequence in
// the returned string, so you have to ensure there's room for it in the
// destination string, and remember to handle it in your string processing.
// -------------------------
// split the line into words
// -------------------------
// the current word
char word[WORD_SIZE];
int wordLength = 0;
// the list of words
char words[MAX_WORDS][WORD_SIZE]; // an array of up to 50 words of
// up to 19 characters each (plus the null terminator)
int wordCount = 0; // the number of words in the array.
// The below loop syntax is a bit cryptic.
// The "char *c=line;" initialises the char-pointer "c" to the start of "line".
// The " *c;" is ultra-shorthand for: "is the-char-at-c not equal to zero".
// All strings in c end with a "null terminator" character, which has the
// integer value of zero, and is commonly expressed as '\0' or 0 (not NULL,
// which is meant for pointers). In the C language any integer may be evaluated as a
// boolean (true|false) expression, where 0 is false, and (pretty obviously)
// everything-else is true. So: If the character at the address-c is not
// zero (the null terminator) then go-round the loop again. Capiche?
// The "++c" moves the char-pointer to the next character in the line. I use
// the pre-increment "++c" in preference to the more common post-increment
// "c++" because it's a smidge more efficient.
//
// Note that this syntax is commonly used by "low level programmers" to loop
// through strings. There is an alternative which is less cryptic and is
// therefore preferred by most programmers, even though it's not quite as
// efficient. In this case the loop would be:
// int lineLength = strlen(line);
// for ( int i=0; i<lineLength; ++i)
// and then to get the current character
// char ch = line[i];
// We get the length of the line once, because the strlen function has to
// loop through the characters in the array looking for the null-terminator
// character at its end (guess what its implementation looks like ;-)...
// which is inherently an "expensive" operation (totally dependent on the
// length of the string) so we at least avoid repeating this operation.
//
// I know I might sound like I'm banging on about not-very-much but once you
// start dealing with "real world" magnitude datasets then such habits,
// formed early on, pay huge dividends in the ability to write performant
// code the first time round. Premature optimisation is evil, but my code
// doesn't hardly ever NEED optimising, because it was "fairly efficient"
// to start with. Yeah?
for ( char *c=line; *c; ++c ) { // foreach char in line.
char ch = *c; // "ch" is the character value-at the-char-pointer "c".
if ( ch==' ' // if this char is a space,
|| ch=='\n' // or we've reached the EOL char
) {
// 1. add the word to the end of the words list.
// note that we copy only wordLength characters, because "word" has
// no null-terminator, and then terminate the copy ourselves so that
// printf's %s knows where the word ends.
strncpy(words[wordCount], word, wordLength);
words[wordCount++][wordLength] = '\0';
// 2. and "clear" the word buffer.
wordLength=0;
} else if (wordLength==WORD_SIZE-1) { // this word is too long
// so split this word into two words.
strncpy(words[wordCount], word, wordLength);
words[wordCount++][wordLength] = '\0';
wordLength=0;
word[wordLength++] = ch;
} else {
// otherwise: append this character to the end of the word.
word[wordLength++] = ch;
}
}
// -------------------------
// print out the words
// -------------------------
for ( int w=0; w<wordCount; ++w ) {
printf("Hi %s!\n", words[w]);
}
return 0;
}
In the real world one can't make such restrictive assumptions about the maximum length of words, or how many there will be, and if such restrictions are given they're almost always arbitrary and therefore proven wrong all too soon... so straight off the bat for this problem, I'd be inclined to use a linked list instead of the "words" array... wait till you get to "dynamic data structures"... You'll love 'em ;-)
Cheers. Keith.
PS: You're going pretty well... My advice is "just keep on truckin"... this gets a LOT easier with practice.

Related

Stdin + Dictionary Text Replacement Tool -- Debugging

I'm working on a project in which I have two main files. Essentially, the program reads in a text file defining a dictionary with key-value mappings. Each key has a unique value and the file is formatted like this where each key-value pair is on its own line:
ipsum i%##!
fubar fubar
IpSum XXXXX24
Ipsum YYYYY211
Then the program reads in input from stdin, and if any of the "words" match the keys in the dictionary file, they get replaced with the value. There is a slight thing about upper and lower cases -- this is the order of "match priority"
The exact word is in the replacement set
The word with all but the first character converted to lower case is in the replacement set
The word converted completely to lower case is in the replacement set
Meaning if the exact word is in the dictionary, it gets replaced, but if not the next possibility (2) is checked and so on...
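The three-step match priority above can be sketched as a helper that builds the form of the word to look up at each step. This is only an illustration: make_candidate and its parameters are hypothetical and do not come from the asker's program, and the dictionary lookup itself is left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build the form of `word` to look up at a given
 * priority step (0 = exact, 1 = all but the first character lowered,
 * 2 = everything lowered). Returns 0 if `out` is too small, 1 otherwise. */
int make_candidate(const char *word, int step, char *out, size_t outsize)
{
    size_t len = strlen(word);
    if (len + 1 > outsize)
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)word[i];
        int lower = (step == 2) || (step == 1 && i > 0);
        out[i] = lower ? (char)tolower(ch) : word[i];
    }
    out[len] = '\0';
    return 1;
}
```

With the sample dictionary shown earlier, the input word IPSUM has no exact match, but step 1 yields Ipsum, which is a key, so the search would stop there and never reach the all-lowercase form.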
My program passes the basic cases we were provided but then the terminal shows
that the output vs reference binary files differ.
I went into both files (not C files, but binary files), and one was super long with tons of numbers and the other just had a line of random characters. So that didn't really help. I also reviewed my code and made some small tests, but it seems okay? A friend recommended I make sure I'm accounting for the null terminator in processInput(), and I already was (or at least I think so, correct me if I'm wrong). I also changed the variable that holds the result of getchar() to an int to properly check for EOF, and allocated extra space for the char array. I also tried vimdiff and got more confused. I would love some help debugging this, please! I've been at it all day and I'm very confused.
There are multiple issues in the processInput() function:
the loop should not stop when the byte read is 0, you should process the full input with:
while ((ch = getchar()) != EOF)
the test for EOF should actually be done differently so the last word of the file gets a chance to be handled if it occurs exactly at the end of the file.
the cast in isalnum((char)ch) is incorrect: you should pass ch directly to isalnum. Casting as char is actually counterproductive because it will turn byte values beyond CHAR_MAX to negative values for which isalnum() has undefined behavior.
the test if(ind >= cap) is too loose: if word contains cap characters, setting the null terminator at word[ind] will write beyond the end of the array. Change the test to if (cap - ind < 2) to allow for a byte and a null terminator at all times.
you should check that there is at least one character in the word to avoid calling checkData() with an empty string.
char key[ind + 1]; is useless: you can just pass word to checkData().
checkData(key, ind) is incorrect: you should pass the size of the buffer for the case conversions, which is at least ind + 1 to allow for the null terminator.
the cast in putchar((char)ch); is useless and confusing.
There are some small issues in the rest of the code, but none that should cause a problem.
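To illustrate the point about isalnum(): an int that comes straight from getchar() can be passed as-is, but a byte taken from a char buffer should go through unsigned char first. A small sketch, where is_word_byte is a made-up helper name:

```c
#include <ctype.h>

/* Hypothetical helper: classify a byte taken from a char buffer.
 * The cast to unsigned char keeps the argument in the range that
 * isalnum() is defined for, even where plain char is signed. */
int is_word_byte(char c)
{
    return isalnum((unsigned char)c) != 0;
}
```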
Start by testing your tokeniser with:
$ ./a.out <badhash2.c >zooi
$ diff badhash2.c zooi
$
Does it work for binary files, too?:
$ ./a.out <./a.out > zooibin
$ diff ./a.out zooibin
$
Yes, it does!
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
void processInput(void);
int main(int argc, char **argv) {
processInput();
return 0;
}
void processInput() {
int ch;
char *word;
int len = 0;
int cap = 60;
word = malloc(cap);
while(1) {
ch = getchar(); // (1)
if( ch != EOF && isalnum(ch)) { // (2)
if(len+1 >= cap) { // (3)
cap += cap/2;
word = realloc(word, cap);
}
word[len++] = ch;
} else {
if (len) { // (4)
#if 0
char key[len + 1];
memcpy(key, word, len); key[len] = 0;
checkData(key, len);
#else
word[len] = 0;
fputs(word, stdout);
#endif
len = 0;
}
if (ch == EOF) break; // (5)
putchar(ch);
}
}
free(word);
}
I only repaired your tokeniser, leaving out the hash table and the search & replace stuff. It is now supposed to generate a verbatim copy of the input. (which is silly, but great for testing)
(1) If you want to allow binary input, you cannot use while((ch = getchar()) ...) : a NUL in the input would cause the loop to end. You must postpone testing for EOF, because there could still be a final word in your buffer ...&& ch != EOF)
(2) Treat EOF just like a space here: it could be the end of a word.
(3) You must reserve space for the NUL ('\0'), too.
(4) If len == 0 there would be no word, so no need to look it up.
(5) We treated EOF just like a space, but we don't want to write it to the output. Time to break out of the loop.

CS50: pset2 / initials:- I've got code that works but I feel like I am taking a shortcut with setting my array size

So I am working away on the 'less comfortable' version of the initials problem in CS50, and after beginning with very verbose code I've managed to whittle it down to this:
#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>
int c = 0;
int main(void)
{
string name = get_string();
int n = strlen(name);
char initials[10];
// first letter is always going to be the first initial
initials[0] = name[0];
// count through letters looking for spaces + add the first letter after a
// space to the initials array
for (int j = 0; j < n; j++)
{
if (name[j] == 32)
{
c += 1;
initials[c] += name[j+1];
}
}
// print out initials
for (int k = 0; k <= c; k++)
{
printf("%c", toupper(initials[k]));
}
printf("\n");
}
As it stands like that it passes, but I feel like I am copping out a little cos I just pick [10] out of the air for the initial array size which I know isn't good practice. To make it a little 'better' I've tried to run a 'for' loop to iterate through the name string and add up the number of spaces. I then want to make the array [spaces + 1] as if there are 2 spaces then there will be 3 initials. The code I am trying for that is:
string name = get_string();
int n = strlen(name);
for (int i = 0; i < n; i++)
{
if (name[i] == 32)
{
spaces +=1;
}
}
The thought is that I then make 'char initials[spaces + 1]' on the next line, but even before I can do that, compiling my code with just this 'for' loop returns a fail when I upload it for checking (although it compiles no problem). Even if I don't use any of the 'for' loops output the mere fact it is there gives me this error.
Where am I going wrong?
Any help on this would be much appreciated.
Thanks!
First of all, keep in mind that execution speed is most often more valuable than memory use. If you first go look for spaces and after that allocate memory, you have to iterate through the array twice. This is an optimization of memory use at the cost of execution speed. So it might make more sense to just allocate a "large enough" array of, let's say, 100 characters and keep the code that you have.
I then want to make the array [spaces + 1] as if there are 2 spaces then there will be 3 initials
Keep in mind that C strings are null terminated, so you need to allocate room for the null terminator too, spaces + 1 + 1.
compiling my code with just this 'for' loop returns a fail when I upload it for checking (although it compiles no problem). Even if I don't use any of the 'for' loops output the mere fact it is there gives me this error.
What error? Does it compile or does it not compile? Your text contradicts itself.
Make sure you initialize spaces to zero.
As a side note, never use "magic numbers" in C code. In if (name[i] == 32), the 32 is gibberish to anyone who can't recite the ASCII table from memory. In addition, it is non-portable to systems with other character sets that might not use the same code for a space. Instead write:
if (name[i] == ' ')
In my opinion, a good approach to cater for such situations is the one the library function snprintf uses: it requires you to pass in the string to fill and the size of that string. It ensures that the string isn't overrun and that the string is zero-terminated.
The function returns the number of characters that would have been written had the string been large enough. You can now do one of two things: guess a reasonable buffer size and accept that the string will be cut short occasionally, or call the function with a zero length, use the return value to allocate a char buffer, and then fill it with a second call.
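For reference, here is roughly what the two-call pattern looks like with snprintf itself. The helper make_greeting and the "Hi %s!" format are only illustrative choices.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format a greeting into a freshly allocated buffer
 * that is exactly large enough. Returns NULL on failure; the caller is
 * responsible for calling free() on the result. */
char *make_greeting(const char *name)
{
    /* First call: with size 0 nothing is written, and the return value
     * is the number of characters the full result needs. */
    int needed = snprintf(NULL, 0, "Hi %s!", name);
    if (needed < 0)
        return NULL;

    char *buf = malloc((size_t)needed + 1); /* +1 for the terminator */
    if (buf == NULL)
        return NULL;

    /* Second call: fill the buffer for real. */
    snprintf(buf, (size_t)needed + 1, "Hi %s!", name);
    return buf;
}
```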
Applying this approach to your initials problem:
int initials(char *ini, int max, const char *str)
{
int prev = ' '; // pretend there's a space before the string
int n = 0; // actual number of initials
while (*str) {
if (prev == ' ' && *str != ' ') {
if (n + 1 < max) ini[n] = *str;
n++;
}
prev = *str++;
}
if (n < max) {
ini[n] = '\0';
} else if (max > 0) {
ini[max - 1] = '\0'; // last valid index; ini[max] would be out of bounds
}
return n;
}
You can then either use the fixed-size buffer approach:
char *name = "Theodore Quick Brown Fox";
char ini[4];
initials(ini, sizeof(ini), name);
puts(ini); // prints "TQB", "F" is truncated
Or the two-step dynamic-size approach:
char *name = "Theodore Quick Brown Fox";
int n;
n = initials(NULL, 0, name);
char ini[n + 1];
initials(ini, sizeof(ini), name);
puts(ini); // prints "TQBF"
(Note that this implementation of initials will ignore multiple spaces and spaces at the end or at the beginning of the string. Your look-one-ahead function will insert spaces in these cases.)
You know your initials array can't be any bigger than the name itself; at most, it can't be more than half as big (every other character is a space). So use that as your size. The easiest way to do that is to use a variable-length array:
size_t n = strlen( name ); // strlen returns a size_t type, not int
char initials[n/2+1]; // n/2+1 is not a *constant expression*, so this is
// a variable-length array.
memset( initials, 0, sizeof initials ); // since initials is a VLA, we can't use an initializer
// in the declaration.
The only problem is that VLA support may be iffy - VLAs were introduced in C99, but made optional in C2011.
Alternately, you can use a dynamically-allocated buffer:
#include <stdlib.h>
...
size_t n = strlen( name );
char *initials = calloc( n/2+1, sizeof *initials ); // calloc initializes memory to 0
/**
* code to find and display initials
*/
free( initials ); // release memory before you exit your program.
Although, if all you have to do is display the initials, there's really no reason to store them - just print them as you find them.
Like others have suggested, use the character constant ' ' instead of the ASCII code 32 for comparing against a space:
if ( name[j] == ' ' )
or use the isspace library function (which will return true for spaces, tabs, newlines, etc.):
#include <ctype.h>
...
if ( isspace( name[j] ) )
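The "just print them as you find them" idea mentioned above might look something like this. It is a sketch, not the CS50 reference solution: print_initials and its FILE * parameter are assumptions made so the example stands alone.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: print the upper-cased first letter of every word
 * in name, followed by a newline, without storing the initials anywhere.
 * Returns how many initials were printed. */
int print_initials(const char *name, FILE *out)
{
    int count = 0;
    int prev_was_space = 1; /* treat the start of the string as a boundary */

    for (; *name != '\0'; name++) {
        unsigned char ch = (unsigned char)*name;
        if (isspace(ch)) {
            prev_was_space = 1;
        } else {
            if (prev_was_space) {
                fputc(toupper(ch), out);
                count++;
            }
            prev_was_space = 0;
        }
    }
    fputc('\n', out);
    return count;
}
```

Because nothing is stored, the question of how big to make the initials array goes away entirely; calling print_initials(name, stdout) is enough.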

K&R - Recursive descent parser - strcat

What would be the reason for out[0] = '\0'; on the main() function?
It does seem to be working without it.
Code
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAXTOKEN 100
enum { NAME, PARENS, BRACKETS };
int tokentype;
char token[MAXTOKEN]; /*last token string */
char name[MAXTOKEN]; /*identifier name */
char datatype[MAXTOKEN]; /*data type = char, int, etc. */
char out[1000];
void dcl(void);
void dirdcl(void);
int gettoken(void);
/*
Grammar:
dcl: optional * direct-dcl
direct-dcl: name
(dcl)
direct-dcl()
direct-dcl[optional size]
*/
int main() /* convert declaration to words */
{
while (gettoken() != EOF) { /* 1st token on line */
/* 1. gettoken() gets the datatype from the token */
strcpy(datatype, token);
/* 2. Init out to end of the line? */
/* out[0] = '\0'; */
/* parse rest of line */
dcl();
if (tokentype != '\n')
printf("syntax error\n");
printf("%s: %s %s\n", name, out, datatype);
}
return 0;
}
int gettoken(void) /* return next token */
{
int c, getch(void);
void ungetch(int);
char *p = token;
/* Skip blank spaces and tabs */
while ((c = getch()) == ' ' || c == '\t')
;
if (c == '(') {
if ((c = getch()) == ')') {
strcpy(token, "()");
return tokentype = PARENS;
} else {
ungetch(c);
return tokentype = '(';
}
} else if (c == '[') {
for (*p++ = c; (*p++ = getch()) != ']'; )
;
*p = '\0';
return tokentype = BRACKETS;
} else if (isalpha(c)) {
/* Reads the next character of input */
for (*p++ = c; isalnum(c = getch()); ) {
*p++ = c;
}
*p = '\0';
ungetch(c); /* Get back the space, tab */
return tokentype = NAME;
} else
return tokentype = c;
}
/* dcl: parse a declarator */
void dcl(void)
{
int ns;
for (ns = 0; gettoken() == '*'; ) /* count *'s */
ns++;
dirdcl();
while (ns-- > 0)
strcat(out, " pointer to");
}
/* dirdcl: parse a direct declarator */
void dirdcl(void)
{
int type;
if (tokentype == '(') {
dcl();
if (tokentype != ')')
printf("error: missing )\n");
}
else if (tokentype == NAME) /* variable name */ {
strcpy(name, token);
printf("token: %s\n", token);
}
else
printf("error: expected name or (dcl)\n");
while ((type = gettoken()) == PARENS || type == BRACKETS) {
if (type == PARENS)
strcat(out, " function returning");
else {
strcat(out, " array");
strcat(out, token);
strcat(out, " of");
}
}
}
You need out[0] to be zero in order for strcat to work.
While this line
out[0] = '\0';
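was required prior to the introduction of static initialization rules, it is no longer required for the first strcat, because static arrays, such as out[], are initialized to all zeros. Note, however, that in this program the assignment sits inside the while loop, so it also empties out before each new line is parsed; without it, the text built for earlier declarations stays in out and later output accumulates.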
According to initialization rules of C99,
...
if it has arithmetic type, it is initialized to (positive or unsigned) zero.
if it is an aggregate, every member is initialized (recursively) according to these rules.
It is resetting the char array (aka string) to empty array. (removing junk values)
like we use:
int i = 0;
before doing something like:
i += 1;
so that a junk value doesn't get added in.
So just a '\0' at index 0 of the array says that the array holds an empty string, and the strcat function starts appending from index 0, overwriting the junk values at the other indexes of the array.
If the program works without resetting the array, it is because out is a global array and therefore starts out zero-filled, but it is good practice to reset it.
In short: In this particular case it's not strictly necessary, but in many other cases that look suspiciously similar, it is, so most people do it as "good style". So why would it be necessary?
There is no such thing as "empty" memory. There is no such thing as a "length". Unless you explicitly keep track of it, or define your own.
Memory is just bytes, which are numbers from 0 to 255. Since 0 is just as valid a number as 255, there is no way to tell whether a byte is used or not. You can "add up" several bytes if you need larger numbers, but everything is built out of bytes, in the end. Text is simply mapped to a number. A couple decades ago it was decided which number represents which character. So if you see a byte with the value 32, it could be a 32. Or it could be the 32nd letter in the computer's alphabet (which is the space character).
When you receive a string and you don't know how much text you will be dealing with, what you usually do is you reserve a large block of bytes. This is what char out[1000]; above does. But how do you tell where the text ends? How much of the 1000 bytes you've already used?
Well, in the old days, some people would just declare another variable, say, int length; and keep track of how many bytes they've used so far. The designers of C went a different route. They decided to pick a very rare character and use that as a marker. They picked the character with the value 0 for that (That is not the character '0'. The character '0' actually is the 48th letter of a computer's alphabet).
So you can just look at all the bytes in your string from the start, and if a character is > 0, you know it is used. If you reach a 0 character, you know this is the end of your string. There are various advantages to either approach. An int uses 4 bytes, an additional 0-character only 1. On the other hand, if you use an int, a string can also contain a 0-character, it's just another character, nobody cares.
Whenever you write "foo" in C, what C actually does is reserve room for 4 bytes, for 'f', 'o', 'o' and for the 0 to indicate the end. When you write "" in C, what it does is reserve room for a single byte, the 0. So that you can tell that the string is empty.
So, what is memory filled with before you put something into it at startup? Well, in most cases, it is just garbage. Whatever was in that memory the last time it was used (after all, you have limited RAM, so when you quit one application on your computer, its memory can get re-used for the next app you launch after that). These will be random numbers, often outside of the range of common characters.
So, if you want strcat to see out as an empty string, you need to give it a block of memory that starts with this 0 value character. If you just leave memory like it is, there might be some random characters in it. Your buffer might contain "jbhasugaudq7e1723876123798dbkda0skno§§^^%$#-9H0HWDZmwus0/usr/local/bin"
or whatever was in that memory before. If you now appended some text to it, it would think the stuff before the first 0 (which is just randomly in this place) was a valid string, and append it to that. It will only know that this string is supposed to be empty, if you put a 0 right at the start.
So why did I say it is "not strictly necessary"? Well, because in your case, out is a global variable, and global variables are special because they automatically get cleared to 0 when your application starts up (or assigned any value that you assign them when you declare them).
However, this is only true for global variables (both regular globals and static globals). So many programmers make it a habit to always initialize their blocks of bytes. That way, if someone later decides to change a global into a local variable, or copy-and-pastes the code to another spot to use with a local variable, they do not have to worry about forgetting to add this statement.
This is especially useful as random memory often contains 0 characters. So depending on what program you previously used, you might not notice you forgot the initial 0 because there happened to be one already in there. And only later, when one of your users runs this application, they get garbage at the start of their string.
Does that clarify things a bit?
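A small sketch of the point about strcat() needing a buffer that starts with the 0 character. The function describe is hypothetical and only mimics what dcl() does with out.

```c
#include <string.h>

/* Hypothetical helper: start a fresh description in out, then append to
 * it. The caller must make sure out is large enough for the result. */
void describe(char *out, int pointer_levels)
{
    out[0] = '\0';                  /* make out the empty string first */
    while (pointer_levels-- > 0)
        strcat(out, " pointer to"); /* strcat looks for the first '\0' */
}
```

Since the function resets out itself, it behaves the same whether the buffer is a zero-filled global, a local array full of leftovers, or a buffer that still holds the result of an earlier call.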

How to correctly input a string in C

I am currently learning C, so I wanted to make a program that asks the user to input a string and outputs the number of characters that were entered. The code compiles fine, and when I enter just 1 character it works. But when I enter 2 or more characters, no matter how many, it always says there is just one character and crashes after that. This is my code and I can't figure out what is wrong.
int main(void)
{
int siz;
char i[] = "";
printf("Enter a string.\n");
scanf("%s", i);
siz = sizeof(i)/sizeof(char);
printf("%d", siz);
getch();
return 0;
}
I am currently learning to program, so if there is a way to do it using the same scanf() function I will appreciate that since I haven't learned how to use any other function and probably won't understand how it works.
Please, FORGET that scanf exists. The problem you are running into, whilst caused mostly by your understandable inexperience, will continue to BITE you even when you have experience - until you stop.
Here is why:
scanf will read the input, and put the result in the char buffer you provided. However, it will make no check to make sure there is enough space. If it needs more space than you provided, it will overwrite other memory locations - often with disastrous consequences.
A safer method uses fgets - this is a function that does broadly the same thing as scanf, but it will only read in as many characters as you created space for (or: as you say you created space for).
Other observation: sizeof can only evaluate the size known at compile time : the number of bytes taken by a primitive type (int, double, etc) or size of a fixed array (like int i[100];). It cannot be used to determine the size during the program (if the "size" is a thing that changes).
Your program would look like this:
#include <stdio.h>
#include <string.h>
#define BUFLEN 100 // your buffer length
int main(void) // <<< for correctness, include 'void'
{
int siz;
char i[BUFLEN]; // <<< now you have space for a 99 character string plus the '\0'
printf("Enter a string.\n");
fgets(i, BUFLEN, stdin); // read the input, copy at most BUFLEN-1 characters to i
siz = sizeof(i)/sizeof(char); // it turns out that this will give you the answer BUFLEN
// probably not what you wanted. 'sizeof' gives size of array in
// this case, not size of string
// also note:
siz = strlen(i) - 1; // strlen is a function that is declared in string.h
// it produces the string length
// subtract 1 if you don't want to count \n
printf("The string length is %d\n", siz); // don't just print the number, say what it is
// and end with a newline: \n
printf("hit <return> to exit program\n"); // tell user what to do next!
getc(stdin);
return 0;
}
I hope this helps.
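The difference between sizeof and the string length can be shown in isolation. In this sketch both function names are made up, and strcspn() is swapped in for "strlen minus one" so that the count is still right when the text has no trailing newline.

```c
#include <stddef.h>
#include <string.h>

#define BUFLEN 100

/* Hypothetical helper: the capacity of the array, fixed at compile time. */
size_t buffer_capacity(void)
{
    char buf[BUFLEN] = "hello";
    return sizeof buf;       /* always BUFLEN, whatever text buf holds */
}

/* Hypothetical helper: the number of characters before the first '\n',
 * or the whole length if there is no newline. */
size_t text_length(const char *s)
{
    return strcspn(s, "\n");
}
```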
update you asked the reasonable follow-up question: "how do I know the string was too long".
See this code snippet for inspiration:
#include <stdio.h>
#include <string.h>
#define N 50
int main(void) {
char a[N];
char *b;
printf("enter a string:\n");
b = fgets(a, N, stdin);
if(b == NULL) {
printf("an error occurred reading input!\n"); // e.g. end of input was reached before any character was read
return 0;
}
if (strlen(a) == N-1 && a[N-2] != '\n') { // used all space, didn't get to end of line
printf("string is too long!\n");
}
else {
printf("The string is %s which is %zu characters long\n", a, strlen(a)-1); // all went according to plan; strlen returns size_t, hence %zu
}
}
Remember that when you have space for N characters, the last character (at location N-1) must be a '\0' and since fgets includes the '\n' the largest string you can input is really N-2 characters long.
This line:
char i[] = "";
is equivalent to:
char i[1] = {'\0'};
The array i has only one element, the program crashes because of buffer overflow.
I suggest using fgets() to replace scanf() like this:
#include <stdio.h>
#include <string.h>
#define MAX_LEN 1024
int main(void)
{
char line[MAX_LEN];
if (fgets(line, sizeof(line), stdin) != NULL)
printf("%zu\n", strlen(line) - 1);
return 0;
}
The length is decremented by 1 because fgets() would store the new line character at the end.
The problem is here:
char i[] = "";
You are essentially creating a char array with a size of 1 due to setting it equal to "";
Instead, use a buffer with a larger size:
char i[128]; /* You can also malloc space if you desire. */
scanf("%s", i);
See the link below to a similar question if you want to include spaces in your input string. There is also some good input there regarding scanf alternatives.
How do you allow spaces to be entered using scanf?
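If you want to stay with the scanf family for now, a field width in the conversion keeps it from writing past the buffer. The sketch below uses sscanf() on a string so that it stands alone; first_word and WORD_BUF are hypothetical names, and the same "%127s" conversion can be passed to scanf() to read from stdin.

```c
#include <stdio.h>

#define WORD_BUF 128

/* Hypothetical helper: copy the first whitespace-delimited word of input
 * into buf, never writing more than WORD_BUF bytes. Returns 1 if a word
 * was read and 0 otherwise. */
int first_word(const char *input, char buf[WORD_BUF])
{
    /* The width is WORD_BUF - 1, leaving room for the terminating '\0'. */
    return sscanf(input, "%127s", buf) == 1;
}
```

Note that the width has to be written out as a number inside the format string, so it must be kept in step with the buffer size by hand.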
That's because char i[] = ""; is actually a one-element array.
Strings in C are stored as the text which ends with \0 (char of value 0). You should use bigger buffer as others said, for example:
char i[100];
scanf("%s", i);
Then, when calculating length of this string you need to search for the \0 char.
int length = 0;
while (i[length] != '\0')
{
length++;
}
After running this code length contains length of the specified input.
You need to allocate space where it will put the input data. In your program, you can allocate space like:
char i[] = " ";
This is OK only as long as the input is shorter than the initializer string, so an explicitly sized array or malloc is better. Check out the man pages.

How to erase every occurrence of vowels in a string

In a school assignment we are asked to remove every occurrence of vowels from a string.
So:
"The boy kicked the ball" would result in
"Th by kckd th bll"
Whenever a vowel is found, all the subsequent characters somehow have to shift left, or at least that's my approach. Being that I just started learning C, it may very well be that it's a ridiculous approach.
What I'm trying to do is: When I hit the first vowel, I "shift" the next char ([i+1]) to the current pos (i). The shifting then has to continue for every subsequent character, so int startshift is set to 1 so the first if block executes on every subsequent iteration.
The first if block also test to see if the next char is a vowel. Without such a test any character preceding a vowel would "transform" to the adjacent vowel, and every vowel except the first would still be present. However this resulted in every vowel being replaced by the preceding char, hence the if else block.
Anyway, this ugly code is what I've come up with so far. (The names used for the char* pointers make no sense (I just don't know what to call them), and having two sets of them is probably redundant.)
char line[70];
char *blank;
char *hlp;
char *blanktwo;
char *hlptwo;
strcpy(line, temp->data);
int i = 0;
int j;
while (line[i] != '\n') {
if (startshift && !isvowel(line[i+1])) { // need a test for [i + 1] is vowel
blank = &line[i+1]; // blank is set to point to the value of line[i+1]
hlp = &line[i]; // hlp is set to point to the value of line[i]
*hlp = *blank; // shifting left
} else if (startshift && isvowel(line[i+1])) {
blanktwo = &line[i+1];
hlptwo = &line[i];
*hlptwo = *blanktwo;
//*hlptwo = line[i + 2]; // LAST MOD, doesn't work
}
for (j = 0; j < 10; j++) { // TODO: j < NVOWELS
if (line[i] == vowels[j]) { // TODO: COULD TRY COPY EVERYTHING EXCEPT VOWELS
blanktwo = &line[i+1];
hlptwo = &line[i];
*hlptwo = *blanktwo;
startshift = 1;
}
}
i++;
}
printf("%s", line);
The code doesn't work.
with text.txt:
The boy kicked the ball
He kicked it hard
./oblig1 remove test.txt produces:
Th boy kicked the ball
e kicked it hard
NB. I've omitted the outer while loop used for iterating the lines in the text file.
Just some food for thought, since this is homework and I don't want to spoil the fun:
You might also tackle this problem without using a second 'temp->data' buffer. If the given input string is in a modifiable memory chunk, like
char data[] = "The boy kicked the ball";
You could also write a program which maintains two pointers into the buffer:
One pointer points to the position in the string where the next non-vowel would need to be written; this pointer is advanced whenever a non-vowel was written.
The second pointer points to the position in the string where the next character to consider is read from; this pointer is advanced whenever a character is read.
If you think about it, you can see that the first pointer will not advance as fast as the second pointer (since every character is read, but not every character is written out - vowels are skipped).
If you go for this route, consider that you may need to terminate the string properly.
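For anyone who has already had a go at it, here is one possible shape of the two-pointer approach. It is a sketch that assumes plain ASCII vowels and a modifiable buffer; the name remove_vowels is made up.

```c
#include <string.h>

/* Hypothetical helper: remove ASCII vowels from s in place using a read
 * pointer and a write pointer. */
void remove_vowels(char *s)
{
    char *write = s;                      /* next slot to fill */
    for (const char *read = s; *read != '\0'; read++) {
        if (strchr("aeiouAEIOU", *read) == NULL)
            *write++ = *read;             /* keep non-vowels only */
    }
    *write = '\0';                        /* terminate the shorter string */
}
```

Applied to a char array holding "The boy kicked the ball", this leaves "Th by kckd th bll" in the array, matching the expected result in the question.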
Try using std containers and objects (note that this is C++, not C):
#include <iostream>
#include <string>
#include <vector>
std::string editStr = "qweertadoi";
std::vector<char> vowels{'i', 'o', 'u', 'e', 'a'};
int main() {
for(unsigned int i = 0; i<editStr.size(); i++){
for(char c: vowels){
if(editStr.at(i) == c){
editStr.erase(i--,1);
break;
}
}
}
std::cout << editStr << std::endl;
return 0;
}
