Reading a string character by character in C

I am trying to read a string character by character in C. Since there is no string class, there are no functions to help with this. Here is what I want to do. I have:
char m[80]; // after some concatenation, m finally contains:
m = 12;256;2;
Now I want to count how many characters there are between the semicolons. In this example, there are 2, 3 and 1 characters respectively. How can I do this?
Thank you

What do you mean "there are no functions to help this"? There are. If you want to read a string, check out the function fgets.
On to the problem at hand, let's say you have this:
char m[80] = "12;256;2";
And you want to count the characters between the semi-colons. The easiest way is to use strchr.
char *p = m;
char *pend = m + strlen(m);
char *plast;
int count;
while( p != NULL ) {
plast = p;
p = strchr(p, ';');
if( p != NULL ) {
// Found a semi-colon. Count characters and advance to next char.
count = p - plast;
p++;
} else {
// Found no semi-colon. Count characters from the last start to the end of the string.
// (p is NULL here, so measure from plast instead.)
count = pend - plast;
}
printf( "Number of characters: %d\n", count );
}
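A variation on the same idea, as a minimal sketch: strcspn returns the number of characters before the next delimiter (or the end of the string), so the segment lengths can be collected without pointer subtraction and without modifying the string. The helper name segment_lengths and its parameters are hypothetical, and skipping empty segments is an assumption about what the asker wants.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores the length of each non-empty segment of s
 * (segments are separated by any character in delims) in out[], writing at
 * most max entries. Returns the number of lengths stored. */
size_t segment_lengths(const char *s, const char *delims, size_t *out, size_t max)
{
    size_t n = 0;
    while (*s != '\0' && n < max) {
        /* Number of characters before the next delimiter or the end. */
        size_t len = strcspn(s, delims);
        if (len > 0)
            out[n++] = len;
        s += len;
        if (*s != '\0')
            s++; /* step over the delimiter */
    }
    return n;
}
```

Calling segment_lengths(m, ";", out, 8) on the string from the question should store one length per segment, whether or not the string ends with a semi-colon.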

Well, I'm not sure we're supposed to write the code for you here, just correct it. But...
int strcount, charcount = 0, numcharcount = 0, num_char[10] = {0};
//10 or however many segments you expect
for (strcount = 0; m[strcount] != '\0'; strcount++) {
if (m[strcount] == ';') {
num_char[numcharcount++] = charcount;
charcount = 0;
} else {
charcount++;
}
}
This will store the number of characters in each segment that ends with a ; in an array. A final segment with no trailing ; is counted but not stored.
It is kind of sloppy, I'll admit, but it will work for what you asked.

If you don't mind modifying your string then the easiest way is to use strtok.
#include <string.h>
#include <stdio.h>
int main(void) {
char m[80] = "12;256;2;";
char *p;
for (p = strtok(m, ";"); p; p = strtok(NULL, ";"))
printf("%s = %zu\n", p, strlen(p)); /* strlen returns size_t, so use %zu */
}

Related

Convert String into Array of Strings in C

I'm trying to divide a string of alphabetically sorted words char *str = "a/apple/arm/basket/bread/car/camp/element/..."
into an array of strings alphabetically like so:
arr[0] = "a/apple/arm"
arr[1] = "basket/bread"
arr[2] = "car/camp"
arr[3] = ""
arr[4] = "element"
...
I'm not very skilled in C, so my approach was going to be to declare:
char arr[26][100];
char curr_letter = "a";
and then iterate over each char in the string looking for "/" followed by char != curr_letter, then strcpy that substring to the correct location.
I'm not sure if my approach is very good, let alone how to implement it properly.
Any help would be greatly appreciated!

So we basically loop through the string and check whether we have found the split character, and we also check that the next character is not the curr_letter.
We keep track of the consumed length and the current length (used later by memcpy to copy the current string into the array).
When we find a position where we can add the current string to the array, we allocate space and copy the string into it as the next element of the array. We also add current_length to consumed, and reset current_length.
We use due_to_end to tell whether the current string ends at a / or at the end of the input, and drop the trailing / accordingly.
Try:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main() {
char *str = "a/apple/arm/basket/bread/car/camp/element/...";
char split_char = '/';
char nosplit_char = 'a';
char **array = NULL;
int num_elts = 0;
// read all the characters one by one, and add to array if
// your condition is met, or if the string ends
int current_length = 0; // holds the current length of the element to be added
int consumed = 0; // holds how much we already added to the array
for (int i = 0; i < strlen(str); i++) { // loop through string
current_length++; // increment first
int due_to_end = 0;
if ( ( str[i] == split_char // check if split character found
&& ( i != (strlen(str) - 1) // check if its not the end of the string, so when we check for the next character, we don't overflow
&& str[i + 1] != nosplit_char ) ) // check if the next char is not the curr_letter(nosplit_char)
|| (i == strlen(str) - 1 && (due_to_end = 1))) { // **OR**, check if end of string
array = realloc(array, (num_elts + 1) * sizeof(char *)); // allocate space in the array
array[num_elts] = calloc(current_length + 1, sizeof(char)); // allocate space for the string
memcpy(array[num_elts++], str + consumed, (due_to_end == 0 ? current_length - 1 : current_length)); // copy the string to the current array offset's allocated memory, and remove 1 character (slash) if this is not the end of the string
consumed += current_length; // add what we consumed right now
current_length = 0; // reset current_length
}
}
for (int i = 0; i < num_elts; i++) { // loop through all the elements for overview
printf("%s\n", array[i]);
free(array[i]);
}
free(array);
}

Yes, the approach that you specify in your question seems good, in principle. However, I see the following problem:
Using strcpy will require a null-terminated source string. This means if you want to use strcpy, you will have to overwrite the / with a null character. If you don't want to have to modify the source string by writing null characters into it, then an alternative would be to use the function memcpy instead of strcpy. That way, you can specify the exact number of characters to copy and you don't require the source string to have a null terminating character. However, this also means that you will somehow have to count the number of characters to copy.
On the other hand, instead of using strcpy or memcpy, you could simply copy one character at a time from str into arr[0], until you encounter the next letter, and then copy one character at a time from str into arr[1], and so on. That solution may be simpler.
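As a minimal sketch of the memcpy route described above: the caller supplies the start of the substring and the number of characters to copy, and the copy is null-terminated by hand. The helper name copy_span and its size check are illustrative assumptions, not part of the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the n characters starting at src into dst and
 * null-terminates the result. Returns 0 on success, or -1 if dst (of size
 * dst_size) cannot hold n characters plus the terminator. */
int copy_span(char *dst, size_t dst_size, const char *src, size_t n)
{
    if (n >= dst_size)
        return -1;
    memcpy(dst, src, n); /* src does not need to be null-terminated */
    dst[n] = '\0';
    return 0;
}
```

For the example input, copy_span(arr[0], sizeof arr[0], str, 11) would copy the 11 characters of "a/apple/arm" without touching the source string.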
In accordance with the community guidelines for homework questions, I will not provide a full solution to your problem at this time.
EDIT: Since another answer has already provided a full solution which uses memcpy, I will now also provide a full solution, which uses the simpler approach mentioned above of copying one character at a time:
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#define NUM_LETTERS 26
#define MAX_CHARS_PER_LETTER 99
int main( void )
{
//declare the input string
char *str =
"a/apple/arm/basket/bread/car/camp/element/"
"frog/glass/saddle/ship/water";
//declare array which holds all the data
//we must add 1 for the terminating null character
char arr[NUM_LETTERS][MAX_CHARS_PER_LETTER+1];
//this variable will store the current letter that we
//have reached
char curr_letter = 'a';
//this variable will store the number of chars that are
//already used in the current letter, which will be a
//number between 0 and MAX_CHARS_PER_LETTER
int chars_used = 0;
//this variable stores whether the next character is
//the start of a new word
bool new_word = true;
//initialize the arrays to contain empty strings
for ( int i = 0; i < NUM_LETTERS; i++ )
arr[i][0] = '\0';
//read one character at a time
for ( const char *p = str; *p != '\0'; p++ )
{
//determine whether we have reached the end of a word
if ( *p == '/' )
{
new_word = true;
}
else
{
//determine whether we have reached a new letter
if ( new_word && *p != curr_letter )
{
//write terminating null character to string of
//previous letter, overwriting the "/"
if ( chars_used != 0 )
arr[curr_letter-'a'][chars_used-1] = '\0';
curr_letter = *p;
chars_used = 0;
}
new_word = false;
}
//verify that buffer is large enough
if ( chars_used == MAX_CHARS_PER_LETTER )
{
fprintf( stderr, "buffer overflow!\n" );
exit( EXIT_FAILURE );
}
//copy the character
arr[curr_letter-'a'][chars_used++] = *p;
}
//the following code assumes that the string pointed to
//by "str" will not end with a "/"
//write terminating null character to string
arr[curr_letter-'a'][chars_used] = '\0';
//print the result
for ( int i = 0; i < NUM_LETTERS; i++ )
printf( "%c: %s\n", 'a' + i, arr[i] );
}
This program has the following output:
a: a/apple/arm
b: basket/bread
c: car/camp
d:
e: element
f: frog
g: glass
h:
i:
j:
k:
l:
m:
n:
o:
p:
q:
r:
s: saddle/ship
t:
u:
v:
w: water
x:
y:
z:
Here is another solution which uses strtok:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NUM_LETTERS 26
#define MAX_CHARS_PER_LETTER 99
int main( void )
{
//declare the input string
char str[] =
"a/apple/arm/basket/bread/car/camp/element/"
"frog/glass/saddle/ship/water";
//declare array which holds all the data
//we must add 1 for the terminating null character
char arr[NUM_LETTERS][MAX_CHARS_PER_LETTER+1];
//this variable will store the current letter that we
//have reached
char curr_letter = 'a';
//this variable will store the number of chars that are
//already used in the current letter, which will be a
//number between 0 and MAX_CHARS_PER_LETTER
int chars_used = 0;
//initialize the arrays to contain empty strings
for ( int i = 0; i < NUM_LETTERS; i++ )
arr[i][0] = '\0';
//find first token
char *p = strtok( str, "/" );
//read one token at a time
while ( p != NULL )
{
int len;
//determine whether we have reached a new letter
if ( p[0] != curr_letter )
{
curr_letter = p[0];
chars_used = 0;
}
//count length of string
len = strlen( p );
//verify that buffer is large enough to copy string
if ( chars_used + len >= MAX_CHARS_PER_LETTER )
{
fprintf( stderr, "buffer overflow!\n" );
exit( EXIT_FAILURE );
}
//add "/" if necessary
if ( chars_used != 0 )
{
arr[curr_letter-'a'][chars_used++] = '/';
arr[curr_letter-'a'][chars_used] = '\0';
}
//copy the word
strcpy( arr[curr_letter-'a']+chars_used, p );
//update number of characters used in buffer
chars_used += len;
//find next token
p = strtok( NULL, "/" );
}
//print the result
for ( int i = 0; i < NUM_LETTERS; i++ )
printf( "%c: %s\n", 'a' + i, arr[i] );
}

Odd behavior removing duplicate characters in a C string

I am using the following method in a program used for simple substitution-based encryption. This method is specifically used for removing duplicate characters in the encryption/decryption key.
The method is functional, as is the rest of the program, and it works for 99% of the keys I've tried. However, when I pass it the key "goodmorning" or any key consisting of the same letters in any order (e.g. "dggimnnooor"), it fails. Keys containing more characters than "goodmorning" work, as do keys with fewer characters.
I ran the executable through lldb with the same arguments and it works. I've cloned my repository on a machine running CentOS, and it works as is.
But I get no warnings or errors on compile.
//setting the key in main method
char * key;
key = removeDuplicates(argv[2]);
//return 1 if char in word
int targetFound(char * charArr, int num, char target){
int found = 0;
if(strchr(charArr,target))
found = 1;
return found;
}
//remove duplicate chars
char * removeDuplicates(char * word){
char * result;
int len = strlen(word);
result = malloc (len * sizeof(char));
if (result == NULL)
errorHandler(2);
char ch;
int i;
int j;
for( i = 0, j = 0; i < len; i++){
ch = word[i];
if(!targetFound(result, i, ch)){
result[j] = ch;
j++;
}
}
return result;
}
Per request: if "feather" was passed in to this function the resulting string would be "feathr".

As R Sahu already said, you are not terminating your string with a NUL character. I'm not going to explain here why this is needed, but in C you always need to terminate your strings with a NUL character, which is '\0'. However, this is not the only problem with your code.
The main problem is that the function strchr, which you call to find out whether your result already contains some character, expects a NUL-terminated string, but your buffer is never NUL terminated while you keep appending characters to it.
To solve your problem, I would suggest using a map instead. Record every character you have already seen in the map; if a character isn't in the map yet, add it both to the map and to the result. This is simpler (no need to call strchr or any other function), faster (no need to scan the whole string every time), and most importantly correct.
Here's a simple solution:
char *removeDuplicates(char *word){
char *result, *map, ch;
int i, j;
map = calloc(256, 1);
if (map == NULL)
// Maybe you want some other number here?
errorHandler(2);
// Add one char for the NUL terminator:
result = malloc(strlen(word) + 1);
if (result == NULL)
errorHandler(2);
for(i = 0, j = 0; word[i] != '\0'; i++) {
ch = word[i];
// Check if you already saw this character (index as unsigned char,
// since a plain char may be negative):
if(map[(unsigned char)ch] == 0) {
// If not, add it to the map:
map[(unsigned char)ch] = 1;
// And to your result string:
result[j] = ch;
j++;
}
}
// Correctly NUL terminate the new string and release the map:
result[j] = '\0';
free(map);
return result;
}
Why does this work on other machines, but not on your machine?
You are a victim of undefined behavior. A program with undefined behavior can act differently on different compilers and systems. In this particular case, strchr typically just keeps searching through memory until it finds a '\0' character, and who knows where a '\0' could be in memory after your string? If the bytes following your characters happen to be zero, the search stops right away and you get a correct result; if they hold leftover data, strchr may report a match for a character you never stored. This is both dangerous and incorrect, because the program reads outside the memory reserved for it. A correct result on one machine is not something to take for granted, and you should always avoid undefined behavior.

I see a couple of problems in your code:
You are not terminating the output with the null character.
You are not allocating enough memory to hold the null character when there are no duplicate characters in the input.
As a consequence, your program has undefined behavior.
Change
result = malloc (len * sizeof(char));
to
result = malloc (len+1); // No need for sizeof(char)
Add the following before the function returns.
result[j] = '\0';
The other problem, the main one, is that you are using strchr on result, which is not a null terminated string when you call targetFound. That also caused undefined behavior. You need to use:
char * removeDuplicates(char * word){
char * result;
int len = strlen(word);
result = malloc (len+1);
if (result == NULL)
{
errorHandler(2);
}
char ch;
int i;
int j;
// Make result an empty string.
result[0] = '\0';
for( i = 0, j = 0; i < len; i++){
ch = word[i];
if(!targetFound(result, i, ch)){
result[j] = ch;
j++;
// Null terminate again so that next call to targetFound()
// will work.
result[j] = '\0';
}
}
return result;
}
A second option is to not use strchr in targetFound. Use num instead and implement the equivalent functionality.
int targetFound(char * charArr, int num, char target)
{
for ( int i = 0; i < num; ++i )
{
if ( charArr[i] == target )
{
return 1;
}
}
return 0;
}
That will allow you to avoid assigning the null character to result so many times. You will need to null terminate result only at the end.
char * removeDuplicates(char * word){
char * result;
int len = strlen(word);
result = malloc (len+1);
if (result == NULL)
{
errorHandler(2);
}
char ch;
int i;
int j;
for( i = 0, j = 0; i < len; i++){
ch = word[i];
// Only the first j characters of result have been written so far.
if(!targetFound(result, j, ch)){
result[j] = ch;
j++;
}
}
result[j] = '\0';
return result;
}

method that takes a string, in this format "aabbaadddc". Encode the string by counting the consecutive letters. Ex: "a2b2a2d3c1"

I have written some code, but I am not able to get the desired output. I am making a mistake somewhere, but I am unable to trace what exactly it is.
Below is my code.
#include<stdio.h>
#include<string.h>
int main()
{
char c[] = "aabbaadddc",*ptr = c,prev,t[10],*tptr = t;
int i,j,count = 0;
while(*ptr != '\0') {
prev = *ptr;
count = 0;
while(*ptr == prev) {
count++;
ptr++;
}
*tptr = prev;
tptr++;
*tptr = (char)count;
tptr++;
}
*tptr = '\0';
printf("%s\n", t);
return 0;
}
I am expecting "a2b2a2d3c1" in the t string. I want to know my mistakes.

You should change the line *tptr = (char)count; to
*tptr = (char)(count + '0');
to add the ASCII value of '0' to your counter.
But I would change the code, because it has several problems:
If the input is abcde (strlen = 5), the result should be a1b1c1d1e1 (strlen = 10). The output can be twice as long as the input, so a fixed t[10] gives a buffer overflow.
If some character occurs 10 or more times in a row, we get an incorrect result string.
To make this algorithm right, I would use another buffer large enough to store all values, and add the values (chars and numbers) there. To convert an int to a char[], use snprintf (itoa is also an option, but it is not part of standard C).
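A minimal sketch of that suggestion, assuming the caller provides the output buffer and its size: snprintf writes each character followed by its count, so runs of 10 or more are handled, and the size check prevents the overflow. The function name rle_encode and its return convention are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: run-length encodes src into dst, e.g. "aabbb" becomes
 * "a2b3". Returns 0 on success, or -1 if dst (of size dst_size) is too small. */
int rle_encode(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;
    if (dst_size == 0)
        return -1;
    dst[0] = '\0';
    while (*src != '\0') {
        char ch = *src;
        size_t count = 0;
        /* Count the run; this stops at the terminator because ch != '\0'. */
        while (*src == ch) {
            count++;
            src++;
        }
        /* snprintf returns the length it wanted to write, so a value that
         * does not fit in the remaining space means dst is too small. */
        int n = snprintf(dst + used, dst_size - used, "%c%zu", ch, count);
        if (n < 0 || (size_t)n >= dst_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

A buffer of 2 * strlen(src) + 1 bytes is enough when no run is longer than 9; longer runs need more digits per run but produce fewer runs, so the check in the loop is what actually guards the buffer.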

From what I see
*tptr = prev;
tptr++;
*tptr = (char)count;
is the mistake.
It should be
*tptr = count+48;
I am not sure how you are going to manage double-digit numbers with this algorithm. I would suggest you look at the sprintf function. It would be helpful in your case.

t is too short - it has no room for the terminating \x00!
Furthermore, you're inserting a character \x02 (from the counter), not '2'. That means you would need to add '0' in order to get the right output:
*tptr = (char)count + '0';

#include<stdio.h>
#include<string.h>
int main()
{
char c[] = "aabbaadddc",*ptr = c,prev;
char t[10]; // t is too small - there is no room for the string terminator
char *tptr = t;
int i,j,count = 0;
while(*ptr != '\0') {
prev = *ptr;
count = 0;
while(*ptr == prev) {
count++;
ptr++;
}
*tptr = prev;
tptr++;
*tptr = (char)count; // You need to add the ascii value for 0
tptr++;
}
*tptr = '\0';
printf("%s\n", t);
return 0;
}
Some notes and observations:
Your code is not very general. The encoding should be in a function and main should hold the code to test the function (calling it and printing the result).
Your code can only handle up to 9 consecutive letters.
The encoding algorithm isn't reversible if the source string contains digits.

How to reverse a sentence in C and Perl

If the sentence is
"My name is Jack"
then the output should be
"Jack is name My".
I did a program using strtok() to separate the words and then push them onto a stack,
popping them and concatenating.
Is there any other, more efficient way than this?
Is it easier to do in Perl?

Whether it is more efficient or not will be something you can test but in Perl you could do something along the lines of:
my $reversed = join( " ", reverse( split( / /, $string ) ) );

Perl makes this kind of text manipulation very easy, you can even test this easily on the shell:
echo "run as fast as you can" | perl -lne 'print join $",reverse split /\W+/'
or:
echo "all your bases are belong to us" | perl -lne '@a=reverse /\w+/g;print "@a"'

The strategy for C could be this:
1) Reverse the characters of the string. This results in the words being in the right general position, albeit backward.
2) Reverse the characters of each word in the string.
We will need one function to reverse characters in a buffer:
/*
* Reverse characters in a buffer.
*
* If provided "My name is Jack", modifies the input to become
* "kcaJ si eman yM".
*/
void reverse_chars(char * buf, int cch_len)
{
char * front = buf, *back = buf + cch_len - 1;
while (front < back)
{
char tmp = *front;
*front = *back;
*back = tmp;
front ++;
back --;
}
}
To break the input buffer into words, we need a function which returns the number of non-space characters at the start of the buffer (isspace() is declared in <ctype.h>). strtok() modifies the buffer and is harder to use in place.
int word_len(char *input)
{
char * p = input;
while (*p && !isspace(*p))
p++;
return p - input;
}
Finally, we will need a function which uses those two helpers to achieve the strategy described in the first paragraph.
/*
* Reverse words in a buffer.
*
* Given the input "My name is Jack", modifies the input to become
* "Jack is name My"
*/
void reverse_words(char *input)
{
int cch_len = strlen(input);
/* Part 1: Reverse the string characters. */
reverse_chars(input, cch_len);
char * p = input;
/* Part 2: Loop over one word at a time. */
while (*p)
{
/* Skip leading spaces */
while (*p && isspace(*p))
p++;
if (*p)
{
/* Advance one complete word. */
int cch_word = word_len(p);
reverse_chars(p, cch_word);
p += cch_word;
}
}
}

You've gotten a couple of versions in C, but they strike me as a bit more verbose than is probably really necessary. Absent a reason to do otherwise, I'd consider something like this:
#define MAX 32
char *words[MAX];
char word[256];
int pos = 0;
for (pos=0; pos<MAX && scanf("%255s", word) == 1; pos++) /* scanf returns EOF, which is nonzero, at end of input */
words[pos] = strdup(word);
while (--pos >= 0)
printf("%s ", words[pos]);
One possible "intermediate" level between C and Perl would be C++:
std::istringstream input("My name is Jack");
std::vector<std::string> words((std::istream_iterator<std::string>(input)),
std::istream_iterator<std::string>());
std::copy(words.rbegin(), words.rend(),
std::ostream_iterator<std::string>(std::cout, " "));

Here is a C idea that uses a little recursion to do the stacking for you:
void rev(char * x){
char * p;
if((p = strchr(x, ' ')) != NULL){
rev(p+1);
printf("%.*s ", (int)(p-x), x); /* the precision argument must be an int */
}
else{
printf("%s ", x);
}
}

Some fun with a little help from regexp and perl special variables :)
$_ = "My name is Jack";
unshift @_, "$1 " while /(\w+)/g;
print @_;
EDIT
And a killer (by now):
$,=' ';print reverse /\w+/g;
Little explanation: $, is special variable for print output separator. Of course you can do it in shorter way without this special var:
print reverse /\w+ ?/g;
but the result might not be as satisfactory as the example above.

Using reverse:
my @words = split / /, $sentence;
my $newSentence = join(' ', reverse @words);

It's probably a lot easier to do in Perl, but...
char *strrtok(char *str, const char *delim)
{
int i, j;
for (i = strlen(str) - 1; i > 0; i--)
{
// Sets the furthest set of contiguous delimiters to null characters
if (strchr(delim, str[i]))
{
j = i + 1;
while (i >= 0 && strchr(delim, str[i])) // check the index before reading str[i]
{
str[i] = '\0';
i--;
}
return &(str[j]);
}
}
return str;
}
This should work similarly to strtok() in reverse, but you continue to pass the pointer to the original string location rather than passing NULL after the first call. Also, you should get empty strings for start and end cases.
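A minimal usage sketch, assuming a single space as the only delimiter: because the same pointer is passed on every call, the loop is finished once the returned token is the start of the string. strrtok is repeated here (with the index checked before the read) so that the block compiles on its own; the driver name reverse_words_into and its output buffer are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same idea as strrtok() above: cuts off and returns the last token of str. */
char *strrtok(char *str, const char *delim)
{
    int i, j;
    for (i = (int)strlen(str) - 1; i > 0; i--) {
        if (strchr(delim, str[i])) {
            j = i + 1;
            /* Overwrite the run of delimiters before the token. */
            while (i >= 0 && strchr(delim, str[i])) {
                str[i] = '\0';
                i--;
            }
            return &str[j];
        }
    }
    return str;
}

/* Hypothetical driver: writes the words of s into out in reverse order,
 * separated by single spaces. s is modified; the output is cut short if
 * out (of size out_size) is too small. */
void reverse_words_into(char *s, char *out, size_t out_size)
{
    size_t used = 0;
    char *tok;
    if (out_size == 0)
        return;
    out[0] = '\0';
    do {
        tok = strrtok(s, " ");
        int n = snprintf(out + used, out_size - used, "%s%s",
                         used ? " " : "", tok);
        if (n < 0 || (size_t)n >= out_size - used)
            return;
        used += (size_t)n;
    } while (tok != s); /* the first word is returned as s itself */
}
```

The input must be a writable array (for example char s[] = "My name is Jack";), not a string literal, because strrtok writes null characters into it.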

C version:
#include <stdio.h>
#include <string.h>
int main()
{
char s[] = "My name is Jack";
char t[100];
int i = 0, j = 0, k = 0;
for(i = strlen(s) - 1 ; i >= 0 ;i--)
{
if(s[i] == ' ' || i == 0)
{
j = i == 0 ? i : i + 1;
for(j = j; s[j] != '\0'; j++) t[k++] = s[j];
t[k++] = ' ';
s[i] = '\0';
}
}
t[k] = '\0';
printf("%s\n", t);
return 0;
}

C example
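#include <stdlib.h>
#include <string.h>
/* Returns a malloc'd copy of str with its characters (not its words)
 * reversed, or NULL if allocation fails. The caller must free it. */
char * srtrev (char * str) {
    size_t l = strlen(str);
    size_t i = 0;
    char * rev = malloc(l + 1);
    if (rev == NULL)
        return NULL;
    while(l != 0)
    {
        rev[i++] = str[ --l];
    }
    rev[i] = '\0';
    return rev;
}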

C - Largest String From a Big One

So pray tell, how would I go about getting the largest contiguous string of letters out of a string of garbage in C? Here's an example:
char *s = "(2034HEY!!11 th[]thisiswhatwewant44";
Would return...
thisiswhatwewant
I had this on a quiz the other day...and it drove me nuts (still is) trying to figure it out!
UPDATE:
My fault guys, I forgot to include the fact that the only function you are allowed to use is the strlen function. Thus making it harder...

Use strtok() to split your string into tokens, using all non-letter characters as delimiters, and find the longest token.
To find the longest token you will need to organise some storage for tokens - I'd use linked list.
As simple as this.
EDIT
Ok, if strlen() is the only function allowed, you can first find the length of your source string, then loop through it and replace all non-letter characters with the null character '\0' - basically that's what strtok() does.
Then you need to go through your modified source string second time, advancing one token at a time, and find the longest one, using strlen().

This sounds similar to the standard UNIX 'strings' utility.
Keep track of the longest run of printable characters terminated by a NUL.
Walk through the bytes until you hit a printable character. Start counting. If you hit a non-printable character, stop counting and throw away the starting point. If you hit a NUL, check whether the length of the current run is greater than the previous record holder. If so, record it, and start looking for the next string.

What defines the "good" substrings compared to the many others -- being lowercase alphas only? (i.e., no spaces, digits, punctuation, uppercase, &c)?
Whatever the predicate P that checks for a character being "good", a single pass over s applying P to each character lets you easily identify the start and end of each "run of good characters", and remember and pick the longest. In pseudocode:
longest_run_length = 0
longest_run_start = longest_run_end = null
status = bad
for i in (all indices over s):
    if P(s[i]): # current char is good
        if status == bad: # previous one was bad
            current_run_start = current_run_end = i
            status = good
        else: # previous one was also good
            current_run_end = i
    else: # current char is bad
        if status == good: # previous one was good -> end of run
            current_run_length = current_run_end - current_run_start + 1
            if current_run_length > longest_run_length:
                longest_run_start = current_run_start
                longest_run_end = current_run_end
                longest_run_length = current_run_length
        status = bad
# if a good run ends with end-of-string:
if status == good: # previous one was good -> end of run
    current_run_length = current_run_end - current_run_start + 1
    if current_run_length > longest_run_length:
        longest_run_start = current_run_start
        longest_run_end = current_run_end
        longest_run_length = current_run_length
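The pseudocode can be sketched in C roughly as follows, with the predicate P passed as a function pointer. The name longest_run and its signature are hypothetical. Passing isalpha from <ctype.h> is only one possible predicate and would not satisfy the strlen-only restriction from the question's update; a hand-written range check can be passed instead.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical C rendering of the pseudocode: returns a pointer to the start
 * of the longest run of characters accepted by is_good and stores the run's
 * length in *len. Returns NULL (with *len == 0) if no character qualifies.
 * On ties the first run wins. */
const char *longest_run(const char *s, int (*is_good)(int), size_t *len)
{
    const char *best = NULL; /* start of the longest run so far */
    const char *run = NULL;  /* start of the current run, NULL if none */
    size_t best_len = 0;

    for (;; s++) {
        if (*s != '\0' && is_good((unsigned char)*s)) {
            if (run == NULL)
                run = s; /* previous char was bad: a new run starts */
        } else {
            if (run != NULL) { /* a run has just ended */
                size_t run_len = (size_t)(s - run);
                if (run_len > best_len) {
                    best = run;
                    best_len = run_len;
                }
                run = NULL;
            }
            if (*s == '\0')
                break; /* the end of the string also ends a run */
        }
    }
    *len = best_len;
    return best;
}
```

Treating the terminator as one more "bad" character removes the need for the duplicated end-of-string check in the pseudocode.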

Why use strlen() at all?
Here's my version which uses no function whatsoever.
#ifdef UNIT_TEST
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#endif
/*
// largest_letter_sequence()
// Returns a pointer to the beginning of the largest letter
// sequence (including trailing characters which are not letters)
// or NULL if no letters are found in s
// Passing NULL in `s` causes undefined behaviour
// If the string has two or more sequences with the same number of letters
// the return value is a pointer to the first sequence.
// The parameter `len`, if not NULL, will have the size of the letter sequence
//
// This function assumes an ASCII-like character set
// ('z' > 'a'; 'z' - 'a' == 25; ('a' <= each of {abc...xyz} <= 'z'))
// and the same for uppercase letters
// Of course, ASCII works for the assumptions :)
*/
const char *largest_letter_sequence(const char *s, size_t *len) {
const char *p = NULL;
const char *pp = NULL;
size_t curlen = 0;
size_t maxlen = 0;
while (*s) {
if ((('a' <= *s) && (*s <= 'z')) || (('A' <= *s) && (*s <= 'Z'))) {
if (p == NULL) p = s;
curlen++;
if (curlen > maxlen) {
maxlen = curlen;
pp = p;
}
} else {
curlen = 0;
p = NULL;
}
s++;
}
if (len != NULL) *len = maxlen;
return pp;
}
#ifdef UNIT_TEST
void fxtest(const char *s) {
char *test;
const char *p;
size_t len;
p = largest_letter_sequence(s, &len);
if (len && (len < 999)) {
test = malloc(len + 1);
if (!test) {
fprintf(stderr, "No memory.\n");
return;
}
strncpy(test, p, len);
test[len] = 0;
printf("%s ==> %s\n", s, test);
free(test);
} else {
if (len == 0) {
printf("no letters found in \"%s\"\n", s);
} else {
fprintf(stderr, "ERROR: string too large\n");
}
}
}
int main(void) {
fxtest("(2034HEY!!11 th[]thisiswhatwewant44");
fxtest("123456789");
fxtest("");
fxtest("aaa%ggg");
return 0;
}
#endif

While I waited for you to post this as a question I coded something up.
This code iterates through a string passed to a "longest" function, and when it finds the first of a sequence of letters it sets a pointer to it and starts counting the length of it. If it is the longest sequence of letters yet seen, it sets another pointer (the 'maxStringStart' pointer) to the beginning of that sequence until it finds a longer one.
At the end, it allocates enough room for the new string and returns a pointer to it.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int isLetter(char c){
return ( (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') );
}
char *longest(char *s) {
char *newString = 0;
int maxLength = 0;
char *maxStringStart = 0;
int curLength = 0;
char *curStringStart = 0;
do {
//reset the current string length and skip this
//iteration if it's not a letter
if( ! isLetter(*s)) {
curLength = 0;
continue;
}
//increase the current sequence length. If the length before
//incrementing is zero, then it's the first letter of the sequence:
//set the pointer to the beginning of the sequence of letters
if(curLength++ == 0) curStringStart = s;
//if this is the longest sequence so far, set the
//maxStringStart pointer to the beginning of it
//and start increasing the max length.
if(curLength > maxLength) {
maxStringStart = curStringStart;
maxLength++;
}
} while(*s++);
//return null pointer if there were no letters in the string,
//or if we can't allocate any memory.
if(maxLength == 0) return NULL;
if( ! (newString = malloc(maxLength + 1)) ) return NULL;
//copy the longest string into our newly allocated block of
//memory (see my update for the strlen() only requirement)
//and null-terminate the string by putting 0 at the end of it.
memcpy(newString, maxStringStart, maxLength);
newString[maxLength] = 0; // index maxLength is the last byte of the maxLength + 1 allocated
return newString;
}
int main(int argc, char *argv[]) {
int i;
for(i = 1; i < argc; i++) {
printf("longest all-letter string in argument %d:\n", i);
printf(" argument: \"%s\"\n", argv[i]);
printf(" longest: \"%s\"\n\n", longest(argv[i]));
}
return 0;
}
This is my solution in simple C, without any data structures.
I can run it in my terminal like this:
~/c/t $ ./longest "hello there, My name is Carson Myers." "abc123defg4567hijklmnop890"
longest all-letter string in argument 1:
argument: "hello there, My name is Carson Myers."
longest: "Carson"
longest all-letter string in argument 2:
argument: "abc123defg4567hijklmnop890"
longest: "hijklmnop"
~/c/t $
the criteria for what constitutes a letter could be changed in the isLetter() function easily. For example:
return (
(c >= 'a' && c <= 'z') ||
(c >= 'A' && c <= 'Z') ||
(c == '.') ||
(c == ' ') ||
(c == ',') );
would count periods, commas and spaces as 'letters' also.
as per your update:
replace memcpy(newString, maxStringStart, maxLength); with:
int i;
for(i = 0; i < maxLength; i++)
newString[i] = maxStringStart[i];
however, this problem would be much more easily solved with the use of the C standard library:
char *longest(char *s) {
int longest = 0;
int curLength = 0;
char *curString = 0;
char *longestString = 0;
char *tokens = " ,.!?'\"()#$%\r\n;:+-*/\\";
// strtok() returns NULL when s contains no tokens, so test before calling strlen()
for (curString = strtok(s, tokens); curString != NULL; curString = strtok(NULL, tokens)) {
curLength = strlen(curString);
if( curLength > longest ) {
longest = curLength;
longestString = curString;
}
}
char *newString = 0;
if( longest == 0 ) return NULL;
if( ! (newString = malloc(longest + 1)) ) return NULL;
strcpy(newString, longestString);
return newString;
}

First, define "string" and define "garbage". What do you consider a valid, non-garbage string? Write down a concrete definition you can program - this is how programming specs get written. Is it a sequence of alphanumeric characters? Should it start with a letter and not a digit?
Once you get that figured out, it's very simple to program. Start with a naive method of looping over the "garbage" looking for what you need. Once you have that, look up useful C library functions (like strtok) to make the code leaner.

Another variant.
#include <stdio.h>
#include <string.h>
int main(void)
{
char s[] = "(2034HEY!!11 th[]thisiswhatwewant44";
int len = strlen(s);
int i = 0;
int biggest = 0;
char* p = s;
while (p[0])
{
if (!((p[0] >= 'A' && p[0] <= 'Z') || (p[0] >= 'a' && p[0] <= 'z')))
{
p[0] = '\0';
}
p++;
}
for (; i < len; i++)
{
if (s[i] && strlen(&s[i]) > biggest)
{
biggest = strlen(&s[i]);
p = &s[i];
}
}
printf("%s\n", p);
return 0;
}
