C: extract numbers from a string

I have a bunch of strings structured like this one
Trim(2714,8256)++Trim(10056,26448)++Trim(28248,49165)
and what I want to do is to save all the numbers into an array (for the sake of this answer let's say I want to save the numbers of just one string).
My plan was to find the position of the first digit of every number and just read the number with sscanf, but as much as I've thought about it, I couldn't find a proper way to do so. I've read a lot about strstr, but it is used to search for a string within another string, so I would have to search for the exact number or handle 10 cases to cover 0 to 9.
Thanks in advance for your support!

You could try something like this:
Walk the string until you find the first digit (use isdigit)
Use strtoul to extract the number starting at that position
strtoul returns the number
the second argument (endptr) points to the next character in the string, following the extracted number
Rinse, repeat
Alternatively you could tokenize the string (using "(,+)") and try to strtoul everything.

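#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
int main(void) {
    int arr[10], idx = 0, d, l = 0;
    const char *p, *str = "Trim(2714,8256)++Trim(10056,26448)++Trim(28248,49165)";
    /* stop at the end of the string or when arr is full */
    for (p = str; *p != 0 && idx < 10; p += l) {
        l = 1; /* default: advance by one character */
        if (isdigit((unsigned char)*p)) {
            /* %n stores the number of characters consumed, so p skips the whole number */
            if (sscanf(p, "%d%n", &d, &l) == 1)
                arr[idx++] = d;
        }
    }
    for (l = 0; l < idx; l++) {
        printf("%d\n", arr[l]);
    }
    return 0;
}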

You can also try using YACC or Lex, which will format your string as you want.

Here is how I would think about the code:
start loop over source array characters
if the character in the current position (of the source array) is a digit
copy it to the destination array (in the current position of the destination array)
move to the next position in the destination array
move to the next position in the source array
if the end of the source string is reached, exit loop
make sure that the destination string is terminated properly (i.e. by '\0')
Note that we are counting with two different counters: one for the source array, which increments on every loop iteration, and one for the destination array, which increments only when a digit is found.
Checking whether a character is a digit can be done with the function isdigit(), which requires the header file ctype.h.
Another way to check whether a character is a digit is to compare its value against the ASCII table:
character '0' equals 48 and character '9' equals 57, so a character within that range is a digit; otherwise it is not. You can actually compare directly with the character constants:
if (character >= '0' && character <= '9') printf("%c is a digit", character);

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
int main() {
int array[8], count=0, data;
const char *s = "Trim(+2714,8256)++Trim(10056,26448)++Trim(28248,49165)";
char *p;
while(*s) {
if(isdigit((unsigned char)*s) || (*s=='-' && isdigit((unsigned char)s[1]))){
data = strtol(s, &p, 10);
s = p;
array[count++] = data;
} else
++s;
}
{//test print
int i;
for(i=0;i<count;++i)
printf("%d\n", array[i]);
}
return 0;
}

Related

Add up digit from char array in c language

I am new to C programming and trying to make a program to add up the digits from the input like this:
input = 12345 <= 5 digit
output = 15 <= add up digit
I tried to convert the char at each index to int, but it doesn't seem to work! Can anyone help?
Here's my code:
#include <stdio.h>
#include <string.h>
int main(){
char nilai[5];
int j,length,nilai_asli=0,i;
printf("nilai: ");
scanf("%s",&nilai);
length = strlen(nilai);
for(i=0; i<length; i++){
int nilai1 = nilai[i];
printf("%d",nilai1);
}
}
Output:
nilai: 12345
4950515253

You have two problems with the code you show.
First let's talk about the problem you ask about... You display the encoded character value. All characters in C are encoded in one way or another. The most common encoding scheme is called ASCII, where the digits are encoded with '0' starting at 48 up to '9' at 57.
Using this knowledge it should be quite easy to figure out a way to convert a digit character to the integer value of the digit: Subtract the character '0'. As in
int nilai1 = nilai[i] - '0'; // "Convert" digit character to its integer value
Now for the second problem: Strings in C are really called null-terminated byte strings. That null-terminated bit is quite important, and all string functions (like strlen) will look for the terminator to know where the string ends.
When you input five characters for the scanf call, the scanf function will write the null terminator at the sixth position of the five-element array. That is out of bounds and leads to undefined behavior.
You can solve this by either making the array longer, or by telling scanf not to write more characters into the array than it can actually fit:
scanf("%4s", nilai); // Read at most four characters
// which will fit with the terminator in a five-element array

First of all, your buffer isn't big enough. String input is null-terminated, so if you want to read in your input 12345 of 5 digits, you need a buffer of at least 6 chars:
char nilai[6];
And if your input is bigger than 5 chars, then your buffer has to be bigger, too.
But the problem with adding up the digits is that you're not actually adding up anything. You're just assigning to int nilai1 over and over and discarding the result. Instead, put int nilai1 before the loop and increase it in the loop. Also, to convert from a char to the int it represents, subtract '0'. All in all this part should look like this:
int nilai1 = 0;
for (i = 0; i < length; i++) {
nilai1 += nilai[i] - '0';
}
printf("%d\n", nilai1);

For starters according to the C Standard the function main without parameters shall be declared like
int main( void )
This character array
char nilai[5];
can not contain a string with 5 digits. Declare the array with at least one more character to store the terminating zero of a string.
char nilai[6];
In the call of scanf
scanf("%s",&nilai);
remove the operator & before the name nilai. And such a call is unsafe. You could use for example the standard function fgets.
This call
length = strlen(nilai);
is redundant and moreover the variable length should be declared having the type size_t.
This loop
for(i=0; i<length; i++){
int nilai1 = nilai[i];
printf("%d",nilai1);
}
does not make sense at all: it prints the character codes and never adds anything up.
The program could look like this
#include <stdio.h>
#include <ctype.h>
int main(void)
{
enum { N = 6 };
char nilai[N];
printf( "nilai: ");
fgets( nilai, sizeof( nilai ), stdin );
int nilai1 = 0;
for ( const char *p = nilai; *p != '\0'; ++p )
{
if ( isdigit( ( unsigned char ) *p ) ) nilai1 += *p - '0';
}
printf( "%d\n", nilai1 );
return 0;
}
Its output might look like
nilai: 12345
15

How to count the number of distinct characters in common between two strings?

How can a program count the number of distinct characters in common between two strings?
For example, if s1="connect" and s2="rectangle", the count is being displayed as 5 but the correct answer is 4; repeating characters must be counted only once.
How can I modify this code so that the count is correct?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i,j,count=0;
char s1[100],s2[100];
scanf("%s",s1);//string 1 is inputted
scanf("%s",s2);//string 2 is taken as input
for(i=1;i<strlen(s1);i++)
{
for(j=1;j<strlen(s2);j++)
{
if(s1[i]==s2[j])//compare each char of both the strings to find common letters
{
count++;//count the common letters
break;
}
}
}
printf("%d",count);//display the count
}
The program is to take two strings as input and display the count of the common characters in those strings. Please let me know what's the problem with this code.

If repeating characters must be ignored, the program must 'remember' the characters which were already encountered. You could do this by storing the characters which were processed in a character array and then consulting this array while processing the other characters.
You could use a counter variable to keep track of the number of common characters like
int ctr=0;
char s1[100]="connect", s2[100]="rectangle", t[100]="";
Here, t is the character array where the examined characters will be stored. Its size is made to be same as the size of the largest of the other 2 character arrays.
Now use a loop like
for(int i=0; s1[i]; ++i)
{
if(strchr(t, s1[i])==NULL && strchr(s2, s1[i])!=NULL)
{
t[ctr++]=s1[i];
t[ctr]=0;
}
}
t initially has an empty string. Characters which were previously absent in t are added to it via the body of the loop which will be executed only if the character being examined (ie, s1[i]) is not in t but is present in the other string (ie, s2).
strchr() is a function with a prototype
char *strchr( const char *str, int c );
strchr() finds the first occurrence of c in the string pointed to by str. It returns NULL if c is not present in str.
Your usage of scanf() may cause trouble.
Use
scanf("%99s",s1);
(where 99 is one less than the size of the array s1) instead of
scanf("%s",s1);
to prevent overflow problems. Also check that the return value of scanf() is 1; scanf() returns the number of successful assignments that it made.
Or use fgets() to read the string.
Read this post to see more about this.
And note that array indexing starts from 0. So in your loops, the first character of the strings are not checked.
So it should've been something like
for(i=0;i<strlen(s1);i++)
instead of
for(i=1;i<strlen(s1);i++)

Here's a solution that avoids quadratic O(N²) or cubic O(N³) time algorithms — it is linear time, requiring one access to each character in each of the input strings. The code uses a pair of constant strings rather than demanding user input; an alternative might take two arguments from the command line and compare those.
#include <limits.h>
#include <stdio.h>
int main(void)
{
int count = 0;
char bytes[UCHAR_MAX + 1] = { 0 };
char s1[100] = "connect";
char s2[100] = "rectangle";
for (int i = 0; s1[i] != '\0'; i++)
bytes[(unsigned char)s1[i]] = 1;
for (int j = 0; s2[j] != '\0'; j++)
{
int k = (unsigned char)s2[j];
if (bytes[k] == 1)
{
bytes[k] = 0;
count++;
}
}
printf("%d\n",count);
return 0;
}
The first loop records which characters are present in s1 by setting an appropriate element of the bytes array to 1. It doesn't matter whether there are repeated characters in the string.
The second loop detects when a character in s2 was in s1 and has not been seen before in s2, and then both increments count and marks the character as 'no longer relevant' by setting the entry in bytes back to 0.
At the end, it prints the count — 4 (with a newline at the end).
The use of (unsigned char) casts is necessary in case the plain char type on the platform is a signed type and any of the bytes in the input strings are in the range 0x80..0xFF (equivalent to -128..-1 if the char type is signed). Using negative subscripts would not lead to happiness. The code does also assume that you're working with a single-byte code set, not a multi-byte code set (such as UTF-8). Counts will be off if you are dealing with multi-byte characters.
The code in the question is at minimum a quadratic algorithm because for each character in s1, it could step through all the characters in s2 only to find that it doesn't occur. That alone requires O(N²) time. Both loops also use a condition based on strlen(s1) or strlen(s2), and if the optimizer does not recognize that the value returned is the same each time, then the code could scan each string on each iteration of each loop.
Similarly, the code in the other two answers as I type (Answer 1 and Answer 2) is also quadratic or worse because of its loop structures.
At the scale of 100 characters in each string, you probably won't readily spot the difference, especially not in a single iteration of the counting. If the strings were bigger — thousands or millions of bytes — and the counts were performed repeatedly, then the difference between the linear and quadratic (or worse) algorithms would be much bigger and more easily detected.
I've also played marginally fast'n'loose with the Big-O notation. I'm assuming that N is the size of the strings, and they're sufficiently similar in size that treating N₁ (the length of s1) as approximately equal to N₂ (the length of s2) isn't going to be a major problem. The 'quadratic' algorithms might be more formally expressed as O(N₁•N₂) whereas the linear algorithm is O(N₁+N₂).

Based on what you expect as output you should keep track which char you used from the second string. You can achieve this as follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i, j, count = 0, skeep;
char s1[100], s2[100], s2Used[100] = {0};
scanf("%s", s1); //string 1 is inputted
scanf("%s", s2); //string 2 is taken as input
for (i = 0; i<strlen(s1); i++)
{
skeep = 0;
for (j = 0; j < i; j++)
{
if (s1[j] == s1[i])
{
skeep = 1;
break;
}
}
if (skeep)
continue;
for (j = 0; j<strlen(s2); j++)
{
if (s1[i] == s2[j] && s2Used[j] == 0) //compare each char of both the strings to find common letters
{
//printf("%c\n", s1[i]);
s2Used[j] = 1;
count++;//count the common letters
break;
}
}
}
printf("%d", count);//display the count
}

C: Replacing a substring within a string using loops

I am struggling with the concept of replacing substrings within strings. This particular exercise does not want you to use built in functions from <string.h> or <strings.h>.
Given the string made up of two lines below:
"Mr. Fay, is this going to be a battle of wits?"
"If it is," was the indifferent retort, "you have come unarmed!"
I have to replace a substring with another string.
This is what I have so far, and I'm having trouble copying the substring to a new array, and replacing the substring with the new string:
#include <stdio.h>
#include <string.h>
int dynamic();
int main()
{
char str[]="\n\"Mr. Fay, is this going to be a battle of wits?\" \n\"If it is,\" was the indifferent retort, \"you have come unarmed!\"";
int i, j=0, k=0, l=0, n=0;
unsigned int e = n-2;
char data[150];
char newData[150];
char newStr[150];
printf("Give me a substring from the string");
gets(data);
printf("Give me a substring to replace it with");
gets(newData);
dynamic();
for (i=0; str[i] != '\0'; i++)
{
if (str[i] != data[j])
{
newStr[l] = str[i];
l++;
}
else if ((str[i+e] == data[j+e]) && (j<n))
{
newStr[l] = newData[j];
j++;
l++;
e--;
}
else if ((str[i+e] == data[j+e]) && (j>=n))
{
j++;
e--;
}
else
{
newStr[l] = str[i];
l++;
}
}
printf("original string is-");
for (k=0; k<n; k++)
printf("%c",str[k]);
printf("\n");
printf("modified string is-");
for(k=0; k<n; k++)
printf("%c",newStr[k]);
printf("\n");
}
int dynamic()
{
char str[]="\n\"Mr. Fay, is this going to be a battle of wits?\" \n\"If it is,\" was the indifferent retort, \"you have come unarmed!\"";
int i, n=0;
for (i=0; str[i] != '\0'; i++)
{
n++;
}
printf("the number of characters is %d\n",n);
return (n);
}

I tried your problem and got output for my code. Here is the code-
EDIT- THIS IS THE EDITED MAIN CODE
#include <stdio.h>
#include <string.h>
int var(char *); //function declaration. I am telling CPU that I will be using this function in the later stage with one argument of type char *
int main() //main function
{
char *str="\n\"Mr. Fay, is this going to be a battle of wits?\" \n\"If it is,\" was the indifferent retort, \"you have come unarmed!\"";
int i,j=0,k=0,l=0;
char data[] = "indifferent";
char newData[] = "nonchalant";
char newStr[150];
//here 'n' is returned from the 'var' function and is received in form of r,r1,r2,r3.
int r=var(str); //getting the length of str from the function 'var' and storing in 'r'
int r1=var(data); //getting the length of data from the function 'var' and storing in 'r1'
int r2=var(newData); //getting the length of newData from the function and storing in 'r2'
unsigned int e=r1-2; //r1-2 because r1 is the data to be replaced. and string index starts from 0. Here r1 is of length 12. but we dont need to check last
//character because it is null character and the index starts from 0. not from 1. so, it is 0 to 11 and 11th is '\0'. so "12-"2"=10" characters to be compared.
for(i=0;str[i]!='\0';i++)
{
if(str[i]!=data[j])
{
newStr[l]=str[i];
l++;
}
else if((str[i+e]==data[j+e]) && (j<r2))
{
newStr[l]=newData[j];
j++;
l++;
e--;
}
else if((str[i+e]==data[j+e]) && (j>=r2))
{
j++;
e--;
}
else
{
newStr[l]=str[i];
l++;
}
}
int r3=var(newStr); //getting the length of str from the function and storing in 'r'
printf("original string is-");
for(k=0;k<r;k++)
printf("%c",str[k]);
printf("\n");
printf("modified string is-");
for(k=0;k<r3;k++)
printf("%c",newStr[k]);
printf("\n");
} // end of main function
// Below is the new function called 'var' to get the character length
//'var' is the function name and it has one parameter. I am returning integer. so, it is int var.
int var(char *stri)//common function to get length of strings and substrings
{
int i,n=0;
for(i=0;stri[i]!='\0';i++)
{
n++; //n holds the length of a string.
}
// printf("the number of characters is %d\n",n);
return (n); //returning this 'n' wherever the function is called.
}
Let me explain few parts of the code-
I have used unsigned int e, because I don't want 'e' to go negative.(I will explain more about this later).
In the first for loop, I am checking whether my string has reached the end.
In the first 'IF' condition, I am checking whether the first character of the string is NOT-EQUAL to the first character of the word which needs to be replaced. If the condition is satisfied, copy the original string as usual.
ELSE IF, i.e. (the first character of the string is EQUAL to the first character of the word), then check the next few characters to make sure that the word matches. Here, I used 'e' because it will check the condition for str[i+e] and data[j+e]. Example: ai is not equal to ae. If I had not used 'e' in the code, newData would have been written into newStr right after checking the first character alone. I used 'e'=5 because the probability of the 1st letter and the 5th letter both being the same in data and str is low. You can use 'e'=4 also. There is no rule that you have to use 'e'=5 only.
Now, I am decrementing 'e' and checking whether the letters in the string is same or no. I can't increment because, there is a certain limit of size of a string. As, I used unsigned int, 'e' won't go down below 0.
ELSE, (this means that only first letter is matching, the 5th letter of str and data are not matching), print the str in newstr.
In the last FOR loop, I have used k<114 because, that much characters are there in the string. (You can write a code to find how many characters are there in a string. No need to manually count).
And lastly, I have used the conditions (j<10) and (j>=10) along with the ELSE-IF condition because, in the first ELSE-IF, the new data is of size 10. So, even if the word to be replaced is longer than 10, say 12 for example, I don't need the extra 2 characters to be stored in the new data. So, if the size is more than 10, just bypass that in the next ELSE-IF condition. Note that this 10 is the size of the new word, so it varies if your word is smaller or bigger. And in the second ELSE-IF, I am not incrementing 'l' (l++) because here I am not putting anything in newStr. I am just bypassing it, so I didn't increment.
I tried my best to put the code in words. If you have any doubt, you can ask again. I will be glad to help. And this code is NOT OPTIMAL. The numerical values used vary with the words/strings you use. Of course, I can write generalized code for that (to fetch the numerical values automatically from the strings), but I didn't write that code here. This code works for your problem. You can change a few variables like 'e' and the ELSE-IF part and try to understand how the code works. Play with it.
EDIT-
#include <stdio.h>
int main()
{
char str[]="\n\"Mr. Fay, is this going to be a battle of wits?\" \n\"If it is,\" was the indifferent retort, \"you have come unarmed!\"";// I took this as string. The string which u need to calculate the length, You have to pass that as the function parameter.
int i,n=0;
for(i=0;str[i]!='\0';i++)
{
n++;
}
printf("the number of characters is %d\n",n);
return (n);
}// If you execute this as a separate program, you will get the number of characters in the string. Basically, you just have to modify this code to act as a separate function and when calling the function, you have to pass correct arguments.
//Use Pointers in the function to pass arguments.

Writing a program in C with the function isAlphabetic to determine if a string strictly contains alphabetic letters or not

This is what I have so far.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int value;
char c='Z';
char alph[30]="there is a PROF 1 var orada";
char freq[27];
int i;
// The function isAlphabetic will accept a string and test each character to
// verify if it is an alphabetic character ( A through Z , lowercase or uppercase)
// if all characters are alphabetic characters then the function returns 0.
// If a nonalphabetic character is found, it will return the index of the nonalpabetic
// character.
value = isAlphabetic(alph);
if (value == 0)
printf("\n The string is alphabetic");
else
printf("Non alphabetic character is detected at position %d\n",value);
return EXIT_SUCCESS;
}
int isAlphabetic(char *myString) {
}
What I'm confused about is how to have the program scan through a string to detect exactly where a non-alphabetic character is, if any. I'm guessing it'll involve counting all the characters in the string first?

Not going to provide the answer via code (as someone else did), but consider:
A string in C is nothing more than an array of characters and a null terminator.
You can iterate through each item in an array using [] (i.e., input[i]) to check its value against an ASCII table for example.
Your function can exit as soon as it finds one value that is not alphabetic.
There are certainly other ways to solve this problem, but my assumption is that at this level, your professor would be a bit suspicious if you started using a bunch of libraries / tools you haven't been taught.

Let's take your questions one at a time:
...how will I have the program scan through a string...
"Scan through a string" means you skin the cat with a loop:
char xx[] = "ABC DEF 123 456";
int ii;
/* for, while, do while; pick your poison */
for (ii = 0; xx[ii] != '\0'; ++ii)
{
/* Houston, we're scanning. */
}
...to detect...
"Detect" means you skin the cat with a comparison of some sort:
char a, b;
a == b; /* equality of two char's */
a >= b; /* greater-than-or-equal-to relationship of two char's */
a < b; /* I'll bet you can guess what this does now */
...exactly where a non alphabetic character is...
Well by virtue of scanning you'll know "exactly where" due to your index.

Scan from the first character to the last. Begin with a counter variable set to 0.
Each time you move to the next character, do counter++; this gives you the index of the current character.
If you find a non-alphabetic character, return counter right there.

I will give you a hint :
#include <stdio.h>
int main()
{
char c = '1';
printf("%d",c-48); //notice this
return 0;
}
Output : 1
Should be more than enough to solve it on your own now :)

printf() isn't being executed

I wanted to write a program which counts the occurrences of each letter in a string, then prints one of each letter followed by the count for that letter.
For example:
aabbcccd -
Has 2 a, 2 b, 3 c, and 1 d
So I'd like to convert and print this as:
a2b2c3d1
I wrote code (see below) to perform this count/conversion but for some reason I'm not seeing any output.
#include<stdio.h>
main()
{
char array[]="aabbcccd";
char type,*count,*cp=array;
while(cp!='\0'){
type=*cp;
cp++;
count=cp;
int c;
for(c=1;*cp==type;c++,cp++);
*count='0'+c;
}
count++;
*count='\0';
printf("%s",array);
}
Can anyone help me understand why I'm not seeing any output from printf()?

char array[]="aabbcccd";
char type,*count,*cp=array;
while(cp!='\0'){
cp is a pointer to the start of the array. Comparing it with '\0' compares the pointer itself against a null pointer, which it never is, so the loop can't end.
You need to dereference the pointer to get what it's pointing at:
while(*cp != '\0') {
...
Also, you have a ; after your for loop, skipping the contents of it:
for(c=1;*cp==type;c++,cp++); <-- this ; makes it not execute the code beneath it
After fixing both of those problems the code produces an output:
mike#linux-4puc:~> ./a.out
a1b1c2cd
Not the one you wanted yet, but that fixes your problems with "printf not functional"

Incidentally, this code has a few other major problems:
You try to write past the end of the string if the last character appears once (you write a '1' where the trailing '\0' was, and a '\0' one character beyond that).
Your code doesn't work if a character appears more than 9 times ('0' + 10 is ':').
Your code doesn't work if a character appears more than 2 times ("dddd" doesn't become "d4"; it becomes "d4dd").

Probably line-buffering. Add a \n to your printf() formatting string. Also your code is very scary, what happens if there are more than 9 of the same character in a row?

1) error correction
while(*cp!='\0'){
and not
while(cp!='\0'){
2) advice
Do not write the result into array[]; use another array for the result. It is cleaner and easier.

I tried to solve your question quickly and this is my code:
#include <stdio.h>
#define SIZE 255
int main()
{
char input[SIZE] = "aabbcccd";/*input string*/
char output[SIZE]={'\0'};/*where output string is stored*/
char seen[SIZE]={'\0'};/*store all chars already counted*/
char *ip = input;/*input pointer=ip*/
char *op = output;/*output pointer = op*/
char *sp = seen;/*seen pointer=sp*/
char c,count;
int i,j,done;
i=0;
while(i<SIZE && input[i]!='\0')
{
c=input[i];
//don't count if already searched:
done=0;
j=0;
while(j<SIZE)
{
if(c==seen[j])
{
done=1;
break;
}
j++;
}
if(done==0)
{//if i never searched char 'c':
*sp=c;
sp++;
*sp='\0';
//count how many "c" there are into input array:
count = '0';
j=0;
while(j<SIZE)
{
if(ip[j]==c)
{
count++;
}
j++;
}
*op=c;
op++;
*op=count;
op++;
}
i++;
}
*op='\0';
printf("input: %s\n",input);
printf("output: %s\n",output);
return 0;
}
It's not good code for several reasons (I don't check the array sizes when writing new elements, I could stop the searches at the first empty item, and so on), but you could think of it as a starting point and improve it. You could also take a look at the standard library for copying substrings and so on (e.g. strncpy).
