I'm looking for a simple solution for stripping numbers from a string.
Example: "GA1UXT4D9EE1" => "GAUXTDEE"
The digits occur at unpredictable positions inside the string, so I cannot rely on functions such as scanf().
I'm new at programming in C.
Thanks for any help.
I will give you some tips:
You need to create a new string.
Iterate over the original string.
Check whether the current character lies between the ASCII values of the digits '0' and '9'.
If not, add it to the new string.
char stringToStrip[128];
char stripped[128];
strcpy(stringToStrip,"GA1UXT4D9EE1");
const int stringLen = strlen(stringToStrip);
int j = 0;
char currentChar;
for( int i = 0; i < stringLen; ++i ) {
currentChar = stringToStrip[i];
if ((currentChar < '0') || (currentChar > '9')) {
stripped[j++] = currentChar;
}
}
stripped[j] = '\0';
Iterate through the string and check each character's ASCII value.
for(i = 0; i < strlen(str); i++)
{
if(str[i] >= '0' && str[i] <= '9') // '0' is 48 and '9' is 57 in ASCII
{
// do something
}
}
I agree that walking through the string is an easy way to do it, but there is also a library function that makes the check simpler: isdigit(). The C++ reference linked below has a good example. (Don't worry, this also works in C, where it is declared in <ctype.h>.)
http://www.cplusplus.com/reference/cctype/isdigit/
Here is the code to do it.
int i;
int strLength = strlen(OriginalString);
int resultPosCtr = 0;
char *result = malloc(strLength + 1); // Allocates room for the characters plus the terminating '\0'.
for(i = 0; i < strLength; i++){
if(!isdigit((unsigned char)OriginalString[i])){
result[resultPosCtr] = OriginalString[i];
resultPosCtr++;
}
}
result[resultPosCtr] = '\0'; // This line adds the terminating null character ('\0') that marks the end of a C-style string.
Everyone has it right.
Create a new char[] A.K.A. C style string.
Iterate over the original string
Check to see if the character at that iteration is a number
if not add to new string
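The four steps above can be put together in one small function. This is only a sketch: the name strip_digits and the dst_size parameter are my own additions (not from the answers above), and the size check is there so that a short destination buffer is never overrun.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies src into dst while skipping decimal digits.
   Writes at most dst_size - 1 characters plus the terminating '\0'
   and returns the number of characters written (excluding the '\0').
   strip_digits and dst_size are hypothetical names for this sketch. */
size_t strip_digits(const char *src, char *dst, size_t dst_size)
{
    size_t j = 0;                      /* write position in dst */

    if (dst_size == 0)
        return 0;                      /* no room even for the terminator */

    for (size_t i = 0; src[i] != '\0' && j + 1 < dst_size; i++) {
        /* isdigit expects a value representable as unsigned char */
        if (!isdigit((unsigned char)src[i]))
            dst[j++] = src[i];
    }
    dst[j] = '\0';                     /* terminate the new string */
    return j;
}
```

Called as strip_digits("GA1UXT4D9EE1", out, sizeof out) with a large enough out, it should leave "GAUXTDEE" in out.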
I have written a C program that contains a char array 'long_string' that looks something like this:
long_string[16] = "AHDAHDAHDAHDAHDA";
I wish to replace the letters in the string as follows:
A-0, H-1, D-2.
Could somebody tell me how I could achieve this? I tried to look online, but most of the examples show the conversion of letters to their ASCII values, which is not what I need. Thank you in advance for your time :)
The way you have defined your string, it won't be null terminated (16 elements are not enough to also fit the null terminator). Other than that, what you want should be fairly easy:
int i = 0;
char long_string[] = "AHDAHDAHDAHDAHDA";
int len = strlen(long_string);
for(i = 0; i<len; i++)
{
if(long_string[i] == 'A')
long_string[i] = '0';
else if(long_string[i] == 'H')
long_string[i] = '1';
// etc.
}
int i = 0;
for( ; i < size ; i++ ){
switch( long_string[i] ){
case 'A':
long_string[i] = '0';
break;
// and so on...
}
}
If you want to translate uppercase alpha chars, you can use a lookup table and index it with the char value less 'A', e.g.:
//                   ABCDEFGHIJKLMNOPQRSTUVWXYZ
const char xlat[] = "0  2   1                  ";
..
..
newChar=xlat[oldChar-'A'];
or, for what you seem to want, the more general form:
const char xlat[]=("\x00\x20\x20\x02\x20\x20\x20\x01\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20");
Note that translating the chars into a set that includes '\0' will render the output array unusable as a C-style string.
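To make the lookup-table idea concrete, here is a minimal sketch that maps to the printable digits '0', '1' and '2', so the result stays a valid C string. The names xlat_table and translate, and the '?' marker for "no mapping", are my own choices for illustration. It also assumes an ASCII-like character set in which 'A'..'Z' are contiguous.

```c
#include <stddef.h>

/* Lookup table indexed by (c - 'A'): A (index 0) -> '0',
   D (index 3) -> '2', H (index 7) -> '1'.
   '?' is an arbitrary marker meaning "no mapping" (an assumption of this sketch). */
static const char xlat_table[] = "0??2???1??????????????????";

/* Replaces every mapped uppercase letter of s in place.
   Unmapped characters are left unchanged. */
void translate(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] >= 'A' && s[i] <= 'Z') {   /* assumes contiguous letters */
            char mapped = xlat_table[s[i] - 'A'];
            if (mapped != '?')
                s[i] = mapped;
        }
    }
}
```

Declaring the array as char long_string[] = "AHDAHDAHDAHDAHDA"; (so the terminator fits) and calling translate(long_string) should turn it into "0120120120120120".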
I am new to programming in C and am trying to write a simple function that will normalize a char array. At the end I want to return the length of the new char array. I am coming from Java, so I apologize if I'm making mistakes that seem simple. I have the following code:
/* The normalize procedure normalizes a character array of size len
according to the following rules:
1) turn all upper case letters into lower case ones
2) turn any white-space character into a space character and,
shrink any n>1 consecutive whitespace characters to exactly 1 whitespace
When the procedure returns, the character array buf contains the newly
normalized string and the return value is the new length of the normalized string.
*/
int
normalize(unsigned char *buf, /* The character array contains the string to be normalized*/
int len /* the size of the original character array */)
{
/* use a for loop to cycle through each character and the built in c functions to analyze it */
int i;
if(isspace(buf[0])){
buf[0] = "";
}
if(isspace(buf[len-1])){
buf[len-1] = "";
}
for(i = 0;i < len;i++){
if(isupper(buf[i])) {
buf[i]=tolower(buf[i]);
}
if(isspace(buf[i])) {
buf[i]=" ";
}
if(isspace(buf[i]) && isspace(buf[i+1])){
buf[i]="";
}
}
return strlen(*buf);
}
How can I return the length of the char array at the end? Also does my procedure properly do what I want it to?
EDIT: I have made some corrections to my program based on the comments. Is it correct now?
/* The normalize procedure normalizes a character array of size len
according to the following rules:
1) turn all upper case letters into lower case ones
2) turn any white-space character into a space character and,
shrink any n>1 consecutive whitespace characters to exactly 1 whitespace
When the procedure returns, the character array buf contains the newly
normalized string and the return value is the new length of the normalized string.
*/
int
normalize(unsigned char *buf, /* The character array contains the string to be normalized*/
int len /* the size of the original character array */)
{
/* use a for loop to cycle through each character and the built in c functions to analyze it */
int i = 0;
int j = 0;
if(isspace(buf[0])){
//buf[0] = "";
i++;
}
if(isspace(buf[len-1])){
//buf[len-1] = "";
i++;
}
for(i;i < len;i++){
if(isupper(buf[i])) {
buf[j]=tolower(buf[i]);
j++;
}
if(isspace(buf[i])) {
buf[j]=' ';
j++;
}
if(isspace(buf[i]) && isspace(buf[i+1])){
//buf[i]="";
i++;
}
}
return strlen(buf);
}
The canonical way of doing something like this is to use two indices, one for reading, and one for writing. Like this:
int normalizeString(char* buf, int len) {
int readPosition, writePosition;
bool hadWhitespace = false;
for(readPosition = writePosition = 0; readPosition < len; readPosition++) {
if(isspace(buf[readPosition])) {
if(!hadWhitespace) buf[writePosition++] = ' ';
hadWhitespace = true;
} else if(...) {
...
}
}
return writePosition;
}
Warning: This handles the string according to the given length only. While using a buffer + length has the advantage of being able to handle any data, this is not the way C strings work. C-strings are terminated by a null byte at their end, and it is your job to ensure that the null byte is at the right position. The code you gave does not handle the null byte, nor does the buffer + length version I gave above. A correct C implementation of such a normalization function would look like this:
int normalizeString(char* string) { //No length is passed, it is implicit in the null byte.
char* in = string, *out = string;
bool hadWhitespace = false;
for(; *in; in++) { //loop until the zero byte is encountered
if(isspace(*in)) {
if(!hadWhitespace) *out++ = ' ';
hadWhitespace = true;
} else if(...) {
...
}
}
*out = 0; //add a new zero byte
return out - string; //use pointer arithmetic to retrieve the new length
}
In this code I replaced the indices by pointers simply because it was convenient to do so. This is simply a matter of style preference, I could have written the same thing with explicit indices. (And my style preference is not for pointer iterations, but for concise code.)
if(isspace(buf[i])) {
buf[i]=" ";
}
This should be buf[i] = ' ', not buf[i] = " ". You can't assign a string to a character.
if(isspace(buf[i]) && isspace(buf[i+1])){
buf[i]="";
}
This has two problems. One is that you're not checking whether i < len - 1, so buf[i + 1] could be off the end of the string. The other is that buf[i] = "" won't do what you want at all. To remove a character from a string, you need to use memmove to move the remaining contents of the string to the left.
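For reference, deleting a single character with memmove could look roughly like the sketch below. remove_char_at is a hypothetical helper name, and the function assumes s is a properly null-terminated string.

```c
#include <string.h>

/* Removes the character at index pos from the null-terminated string s.
   Does nothing if pos is at or past the end of the string.
   remove_char_at is a hypothetical helper for this sketch. */
void remove_char_at(char *s, size_t pos)
{
    size_t len = strlen(s);

    if (pos >= len)
        return;

    /* Shift the tail, including the terminating '\0', one position left.
       memmove (not memcpy) is needed because the regions overlap. */
    memmove(s + pos, s + pos + 1, len - pos);
}
```

Each call shifts the whole tail of the string, so removing many characters this way costs quadratic time in the worst case. For a full normalization pass, a separate read and write index is usually the cheaper option.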
return strlen(*buf);
This should be return strlen(buf). *buf is a character, not a string.
The notations like:
buf[i]=" ";
buf[i]="";
do not do what you think/expect. You will probably need to create two indexes to step through the array — one for the current read position and one for the current write position, initially both zero. When you want to delete a character, you don't increment the write position.
Warning: untested code.
int i, j;
for (i = 0, j = 0; i < len; i++)
{
if (isupper(buf[i]))
buf[j++] = tolower(buf[i]);
else if (isspace(buf[i]))
{
buf[j++] = ' ';
while (i+1 < len && isspace(buf[i+1]))
i++;
}
else
buf[j++] = buf[i];
}
buf[j] = '\0'; // Null terminate
You replace the arbitrary white space with a plain space using:
buf[i] = ' ';
You return:
return strlen(buf);
or, with the code above:
return j;
Several mistakes in your code:
You cannot assign buf[i] with a string, such as "" or " ", because the type of buf[i] is char, while a string literal is an array of char that decays to char*.
You are reading from buf and writing into buf using index i. This poses a problem, as you want to eliminate consecutive white-spaces. So you should use one index for reading and another index for writing.
In C/C++, a native string is an array of characters that ends with 0. So in essence, you can simply iterate buf until you read 0 (you don't need to use the len variable at all). In addition, since you are "truncating" the input string, you should set the new last character to 0.
Here is one optional solution for the problem at hand:
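int normalize(char* buf)
{
    int i = 0; /* read index */
    int j = 0; /* write index */
    while (buf[i] != 0)
    {
        unsigned char c = (unsigned char)buf[i++];
        if (isspace(c))
        {
            buf[j++] = ' '; /* a run of whitespace becomes one space */
            while (isspace((unsigned char)buf[i]))
                i++;
        }
        else
        {
            buf[j++] = (char)tolower(c); /* copies every other character */
        }
    }
    buf[j] = 0;
    return j;
}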
you should write:
return strlen(buf)
instead of:
return strlen(*buf)
The reason:
buf is of type char* - it's an address of a char somewhere in the memory (the one in the beginning of the string). The string is null terminated (or at least should be), and therefore the function strlen knows when to stop counting chars.
*buf will dereference the pointer, resulting in a char, which is not what strlen expects.
Not much different than the others, but this assumes an array of unsigned char rather than a C string.
tolower() does not itself need the isupper() test.
int normalize(unsigned char *buf, int len) {
int i = 0;
int j = 0;
int previous_is_space = 0;
while (i < len) {
if (isspace(buf[i])) {
if (!previous_is_space) {
buf[j++] = ' ';
}
previous_is_space = 1;
} else {
buf[j++] = tolower(buf[i]);
previous_is_space = 0;
}
i++;
}
return j;
}
#OP:
The posted code implies that leading and trailing spaces should either be shrunk to one character or removed entirely.
The answer above simply shrinks leading and trailing spaces to a single ' '.
To eliminate trailing and leading spaces:
int i = 0;
int j = 0;
while (len > 0 && isspace(buf[len-1])) len--;
while (i < len && isspace(buf[i])) i++;
int previous_is_space = 0;
while (i < len) { ...
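Putting the trimming lines and the earlier loop together, a complete version might look like the sketch below. The name normalize_trim is hypothetical. Like the function above, it works on a buffer plus length and does not write a terminating '\0'.

```c
#include <ctype.h>

/* Normalizes buf[0..len-1] in place: drops leading and trailing whitespace,
   collapses inner whitespace runs to one ' ', lowercases everything else.
   Returns the new length. No '\0' is written.
   normalize_trim is a hypothetical name for this sketch. */
int normalize_trim(unsigned char *buf, int len)
{
    int i = 0;                 /* read index */
    int j = 0;                 /* write index */
    int previous_is_space = 0;

    while (len > 0 && isspace(buf[len - 1])) len--;  /* drop trailing whitespace */
    while (i < len && isspace(buf[i])) i++;          /* skip leading whitespace */

    for (; i < len; i++) {
        if (isspace(buf[i])) {
            if (!previous_is_space)
                buf[j++] = ' ';
            previous_is_space = 1;
        } else {
            buf[j++] = (unsigned char)tolower(buf[i]);
            previous_is_space = 0;
        }
    }
    return j;
}
```

If the caller wants a C string afterwards, it can set buf[result] = '\0' itself, since the result is never larger than the original len.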
I have a C string with a value of the form x.x.x, where each x can be 1 to 9. What is a good algorithm to make it x.x.8, i.e. with the last digit fixed at 8?
I am thinking of using strtok function.
Use this function:
void mask_string(char s[]) {
    int j = 0;
    while (s[j] != '\0')
        j++;
    if (j > 0) /* guard against an empty string */
        s[j-1] = '8';
}
This works even if the string does not have a fixed length.
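An alternative sketch, in case you would rather anchor on the last '.' than on the end of the string: strrchr finds the last dot, and the character right after it is overwritten. The name set_last_field is hypothetical, and the function assumes the last field is a single character, as in the question.

```c
#include <string.h>

/* Overwrites the character after the last '.' in s with '8'.
   Returns 1 on success, 0 if s has no '.' or nothing follows the last '.'.
   set_last_field is a hypothetical name for this sketch. */
int set_last_field(char *s)
{
    char *dot = strrchr(s, '.');   /* pointer to the last '.', or NULL */

    if (dot == NULL || dot[1] == '\0')
        return 0;

    dot[1] = '8';
    return 1;
}
```

With char v[] = "1.2.3"; a call to set_last_field(v) should leave "1.2.8" in v. No strtok is needed for this.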
I am trying to loop over a char *str to find out how many lines it contains:
char *str = "test1\ntest2\ntest3";
int lines = 0;
for(int i = 0 ; i < ?? ; i ++ )
{
if(str[i] == '\n') {
lines++;
}
}
I am not sure what to put in place of the ??. My questions are:
1. Do I need to use strlen(str) + 1?
2. When str is "test1\ntest2\ntest3\n", does the code still count the lines correctly?
I am using gcc, by the way. Thanks.
Every string literal ends with \0, the null character, which marks the end of the string.
So you can do this:
for(int i = 0 ; str[i]!='\0' ; i ++ )
To extend the existing good answers: the idiomatic way of looping through a C string is
const char *s = "abc\ndef\nghi\n";
int lines = 0;
int nonempty = 0;
while (*s) {
nonempty = 1;
if (*s++ == '\n') lines++;
}
If you don't want to count the last empty line as a separate line, then add
if (nonempty && s[-1] == '\n' && lines > 0) lines--;
after the while loop.
Take the length of the string and iterate through all characters.
const size_t length = strlen(str);
for(size_t i = 0 ; i < length ; i ++ )
{
if(str[i] == '\n') {
lines++;
}
}
The following will deliver the same result regardless if the last character is a newline or not.
char *abc = "test1\ntest2\ntest3";
int lines = 0;
{
bool lastWasNewline = true;
char * p = abc;
for (; *p; ++p) {
if (lastWasNewline) ++lines;
lastWasNewline = *p == '\n';
}
}
1.I mean do I need to use strlen(str) + 1 ?
No. Just use str[i] in place of i < ??; this tests for the 0 character which terminates the string.
2.when the abc is "test1\ntest2\ntest3\n",does the code still calculate correct lines?
No, your code assumes that the input is broken into one input line per buffer line[j].
In place of ??, put strlen(abc), and make sure to #include <string.h>.
For better efficiency do
int length= strlen(abc);
and then use i < length
Or use str[i]!= '\0'
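To tie the answers together, here is a small sketch of a function that loops until the terminating '\0' and gives the same count whether or not the text ends with a newline. The name count_lines is hypothetical, and the counting rule (a final line without '\n' still counts as a line) is a design choice, not something the question fixes.

```c
#include <stddef.h>

/* Counts the lines in the null-terminated string s.
   A last line without a trailing '\n' still counts as a line;
   an empty string has zero lines.
   count_lines is a hypothetical name for this sketch. */
size_t count_lines(const char *s)
{
    size_t lines = 0;
    int at_line_start = 1;     /* true before the first character of a line */

    for (; *s != '\0'; s++) {  /* stop at the terminator, no strlen needed */
        if (at_line_start)
            lines++;
        at_line_start = (*s == '\n');
    }
    return lines;
}
```

Under this rule both "test1\ntest2\ntest3" and "test1\ntest2\ntest3\n" should give 3.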
How can I remove the '\n' from each string in this array?
I know that I can do something like this for a simple C-String, but I failed at using it in this case
cmd[strcspn(cmd, "\n")] = '\0';
I am also not sure if that would be the proper way or not.
The String will never contain any space or \n in the middle. They are also of a static length (6).
#include <stdio.h>
#include <stdlib.h>
unsigned char cmd[][6] = {
{"r123\n"},
{"r999\n"},
{"l092\n"},
{"l420\n"}};
int main(void) {
int i;
for(i = 0; i < (sizeof(cmd) / sizeof(cmd[0])); i++) {
printf("%s\n", cmd[i]);
}
}
Just do it by hand, it's easy!
If the newline is guaranteed to be present and to be the last char in every word, then do it like this:
for (i = 0; i < elem_number; ++i){
cmd[i][strlen(cmd[i])-1] = 0;
}
If, on the other hand, you are unsure how many whitespace characters there will be at the end, but you know they will only appear at the end (there might be none at all!), then use this:
for (i = 0; i < elem_number; ++i){
for (j = 0; cmd[i][j] != 0; ++j){
if (isspace(cmd[i][j]))
cmd[i][j] = 0;
}
}
Voila!
If there will be whitespaces in the middle, then you have to define the desired behaviour: cut only the trailing whitespaces, cut the string in many little ones, or something completely different.
Oh, and one other sidenote:
Everyone else seems to be using '\0'. In C, '\0' and 0 are equivalent, i.e. if ('\0' == 0) { ... } evaluates to true.
Sidenote 2: I used elem_number because I did not know if the number of elements is a parameter or hardcoded / known in advance. Substitute whatever is appropriate.
Setting a character in a char array to \0 will truncate the string at that character.
So in your example setting the 5th character will do the job.
cmd[i][4] = '\0';
If the string can be shorter than 4 characters, don't hard-code 4 but use strlen(cmd[i])-1 instead.
Maybe you can use strrchr? Use it in a loop if the string may contain several line breaks.
for(i = 0; i < sizeof(cmd)/sizeof(cmd[0]); i++) {
    char *nl = strrchr((char *)cmd[i], '\n');
    if (nl != NULL) *nl = '\0'; // strrchr returns NULL when no '\n' is found
}
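For completeness, the strcspn one-liner from the question can also be applied row by row. The sketch below assumes every row is null-terminated (true for the 5-character literals in a 6-byte row). CMD_LEN and strip_newlines are names I made up for the example.

```c
#include <string.h>

#define CMD_LEN 6   /* row width from the question; the macro name is hypothetical */

/* Truncates each row at its first '\n'. Rows without a newline are left
   unchanged, because strcspn then returns the index of the existing '\0'.
   Assumes every row is null-terminated. */
void strip_newlines(unsigned char cmds[][CMD_LEN], size_t count)
{
    for (size_t i = 0; i < count; i++) {
        char *row = (char *)cmds[i];   /* strcspn takes const char * */
        row[strcspn(row, "\n")] = '\0';
    }
}
```

It would be called as strip_newlines(cmd, sizeof(cmd) / sizeof(cmd[0])); before the printing loop. Unlike strlen(cmd[i])-1, this form is also safe for an empty row.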