String manipulations - c

Is there a proper way to copy just the part of a string after a certain point?
Party City 1422 Evergreen Street
I use strpbrk() to copy the name out, and I could always just tokenize the string by whitespace, but is there a string function or technique for copying out a specific section that isn't at the beginning, such as just [1422 Evergreen Street], or for deleting the first portion of the string?

If you want to specify it by starting position and length, you can always use strncpy and a bit of pointer arithmetic.
EDIT: When you know the text that the part you want starts with, you can use
char *pos = strstr(src, "1422");
if (pos != NULL) /* strstr returns NULL when there is no match */
    strcpy(dst, pos);

If you know the indexes of the first and last characters of the substring you want, you can do this with strncpy. The following snippet copies substringLength characters from inputStr, starting at startIndex. outputStr must point to at least substringLength + 1 bytes, and strncpy does not add the terminating null here, so you have to write it yourself.
char * inputStr;
char * outputStr;
strncpy(outputStr, inputStr + startIndex, substringLength);
outputStr[substringLength] = '\0';
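The bounds checks and the termination can also be wrapped in a small helper. This is only a minimal sketch: the name copy_substring and the choice to clamp rather than fail are assumptions, not something from the answers here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy up to `len` characters of `src`, starting at
 * index `start`, into `dst` (capacity `dstsz`), always null-terminating.
 * Returns the number of characters copied, or (size_t)-1 if `start` is
 * past the end of `src` or `dstsz` is 0. */
size_t copy_substring(char *dst, size_t dstsz, const char *src,
                      size_t start, size_t len)
{
    size_t srclen = strlen(src);

    if (dstsz == 0 || start > srclen)
        return (size_t)-1;
    if (len > srclen - start)
        len = srclen - start;   /* clamp to what the source actually has */
    if (len > dstsz - 1)
        len = dstsz - 1;        /* clamp to the destination capacity */

    memcpy(dst, src + start, len);
    dst[len] = '\0';
    return len;
}
```

With the address from the question, copy_substring(out, sizeof out, "Party City 1422 Evergreen Street", 11, 21) would leave "1422 Evergreen Street" in out.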

If you want to split at the location of a particular string, you can do something like this:
#include <stdio.h>
#include <string.h>
#define MAX_STRING 1024
int main(void) {
    char myleftBuffer[MAX_STRING]="";
    char myrightBuffer[MAX_STRING]="";
    char mystring[]="Party City 1422 Evergreen Street";
    char *start = strstr(mystring, "1422");
    if(start) {
        strcpy(myrightBuffer, start);
        /* myleftBuffer is zero-initialized, so the copy stays null-terminated */
        strncpy(myleftBuffer, mystring, (start - mystring));
    }
    printf("%s -> %s\n", myleftBuffer, myrightBuffer);
    return 0;
}
Which outputs:
Party City -> 1422 Evergreen Street

Actually, strncpy is not a particularly good choice for the task at hand. It always pads the copy with null characters until it has written the full count you pass, which is generally pretty wasteful, and it leaves the result unterminated when the source is longer than that count (it was originally designed for putting file names into the Unix file system; it's good for that, but not really much else).
I think I'd use sscanf. Assuming we always want to copy from the first digit to the end of the string, you could do something like this:
char street_name[256];
sscanf(input_buffer, "%*[^0-9]%255[^\n]", street_name);
FWIW, the %*[^0-9] part skips over characters until it reaches something in the range 0-9 (yes, I know it looks like a regex, but scanf and company support it too). The * in it means to scan but not assign what it finds. The %255[^\n] means to read and assign until the next newline in the input, or up to 255 characters, whichever comes first.
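The sscanf call can be wrapped so that input which already starts with a digit is handled too: a scanset conversion such as %*[^0-9] has to match at least one character, so on its own it assigns nothing in that case. A minimal sketch, where the function name and the fixed 256-byte output size are assumptions:

```c
#include <stdio.h>

/* Hypothetical wrapper around the sscanf call above. Copies everything from
 * the first digit up to the next newline into `out`, which must hold at
 * least 256 bytes. Returns 1 on success, 0 if nothing was captured. */
int extract_from_first_digit(const char *input, char *out)
{
    /* Input already starts with a digit: there is nothing to skip. */
    if (input[0] >= '0' && input[0] <= '9')
        return sscanf(input, "%255[^\n]", out) == 1;

    /* Usual case: skip one or more non-digits, then capture. */
    return sscanf(input, "%*[^0-9]%255[^\n]", out) == 1;
}
```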

/* Note: strlcpy is a BSD extension, not part of standard C. */
int split_at(const char *in, const char *match, char *buf, size_t len)
{
    const char *pos;
    if( (pos = strstr(in, match)) == NULL )
        return -1; // No match
    else if( pos == in )
        return 0; // match is at the very start (or match is empty): nothing to split off
    if( strlcpy(buf, pos, len) >= len )
        fprintf(stderr, "WARNING: match truncated: %s\n", buf);
    return 1;
}

This is probably impossible in the general case, and you would do better to get the input in separate fields, but if that's not an option, something like the following should work:
size_t street_extract(char* ret,size_t retsz,const char* addr)
{
    size_t i,nwrote;
    for(i=0; addr[i] ;i++)
    {
        if(addr[i]!=' ') continue; /* only check at start of word */
        /* '0' and '9' are digits too, and looking ahead avoids stepping past the terminator */
        if('0' <= addr[i+1] && addr[i+1] <= '9') { i++; break; } /* found street number */
    }
    if(!addr[i]) return (size_t)-1; /* not found */
    for(nwrote=0; addr[i+nwrote] && nwrote+1 < retsz ;nwrote++)
    {
        ret[nwrote] = addr[i+nwrote];
    }
    if(retsz) ret[nwrote] = 0;
    while(addr[i+nwrote]) nwrote++;
    return nwrote; /* full length of the result, which may be more than fit in ret */
}
Modify and error-check as needed.

Related

Using strncpy to remove part of a char*

I am trying to remove a certain part of my string using strncpy, but I am running into some issues.
This is what my two char* variables hold.
trimmed holds, for example, "127.0.0.1/8|rubbish|rubbish2|", which begins with an address prefix.
backportion holds "|rubbish|rubbish2|".
What I want to do is remove the backportion part from trimmed. So far I have this:
char* extractPrefix(char buf[1024]){
int count = 0;
const char *divider = "|";
char *c = buf;
char *trimmed;
char *backportionl;
while(*c){
if(strchr(divider,*c)){
count++;
if(count == 5){
++c;
trimmed = c;
//printf("Statement: %s\n",trimmed);
}
if(count == 6){
backportionl = c;
}
}
c++;
}
strncpy(trimmed,backportionl,sizeof(backportionl));
printf("Statement 2: %s\n", trimmed);
This gives me an error about backportionl being a char* instead of a char.
Is there any way I can fix this, or a better way to trim this char* to get what I want?
Here's one way that works for a list of dividers, similar to how strtok works the first time it's called:
char *extractPrefix(char *buf, const char *dividers)
{
size_t div_idx = strcspn(buf, dividers);
if (buf[div_idx] != 0)
buf[div_idx] = 0;
return buf;
}
If you don't want the original buffer modified, you can use strndup, assuming your platform supports the function (Windows doesn't; you'd need to code it yourself). Don't forget to free the pointer that is returned when you're done with it:
char *extractPrefix(const char *buf, const char *dividers)
{
size_t div_idx = strcspn(buf, dividers);
return strndup(buf, div_idx);
}
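Where strndup is missing, a replacement is short to write. The following is only a sketch under the assumption that plain malloc is acceptable; my_strndup is a made-up name, chosen so that it cannot clash with a real strndup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strndup(): returns a newly allocated copy of at
 * most `n` characters of `s`, always null-terminated, or NULL if the
 * allocation fails. The caller must free() the result. */
char *my_strndup(const char *s, size_t n)
{
    size_t len = 0;
    char *copy;

    while (len < n && s[len] != '\0')   /* length of s, capped at n */
        len++;

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```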
Alternatively, you could just return the number of characters (or some value less than 0 if the number of characters in the prefix won't fit in an int):
int pfxlen(const char *buf, const char *dividers)
{
size_t div_idx = strcspn(buf, dividers);
if (div_idx > (size_t)INT_MAX)
return -1;
return (int)div_idx;
}
and use it like this:
int n;
const char *example = "127.0.0.1/8|rubbish|rubbish2|";
n = pfxlen(example, "|");
if (n >= 0)
printf("Prefix: %.*s\n", n, example);
else
fprintf(stderr, "prefix too long\n");
Obviously you have a number of options. It's really up to you which one you want to use.
Welp, this is stupid, but I fixed my issue in basically one line. So here goes:
trimmed[strchr(trimmed,'|')-trimmed] = '\0';
printf("Statement 2: %s\n", trimmed);
So by using strchr to get the index where 'backportion' starts in the trimmed char*, I was able to fix the issue.
Thanks internet, for not much.
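The one-line fix above assumes that trimmed always contains a '|'; strchr returns NULL when it does not, and the index arithmetic then goes wrong. A minimal sketch of the same idea with that check added, using a hypothetical helper name:

```c
#include <string.h>

/* Hypothetical helper: cut `s` at the first occurrence of `delim`.
 * Returns 1 if the string was shortened, 0 if `delim` was not found
 * (in which case `s` is left untouched). */
int truncate_at(char *s, char delim)
{
    char *p = strchr(s, delim);

    if (p == NULL)
        return 0;
    *p = '\0';      /* everything from the delimiter onwards is dropped */
    return 1;
}
```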
Disclaimer: I'm not sure whether I correctly understood what you actually want to achieve. Some examples would probably be helpful.
I am trying to remove a certain part of my string [..]
I have no idea what you're trying in your code, but this is pretty easy to achieve with strstr, strlen and memmove:
First, find the position of the string you want to remove using strstr. Then copy what's behind that found string to the position where the found string starts.
char cut_out_first(char * input, char const * unwanted) {
assert(input); assert(unwanted);
char * start = strstr(input, unwanted);
if (start == NULL) {
return 0;
}
char * rest = start + strlen(unwanted);
memmove(start, rest, strlen(rest) + 1);
return 1;
}

How do I split a string by character position in c

I'm using C to read in an external text file. The input is not great and looks like this:
0PAUL 22 ACACIA AVENUE 02/07/1986RN666
As you can see, I have no obvious delimiter, and sometimes the values have no space between them. However, I do know how many characters long each value should be when split, as follows:
id = 1
name = 20
house number = 5
street name = 40
date of birth = 10
reference = 5
I've set up a structure I want to hold this information in, and have tried using fscanf to read in the file.
However, I find that something along the lines of the following just isn't doing what I need:
fscanf(file_in, "%1d, %20s", person.id[i], person.name[i]);
(The actual line I use attempts to grab all input but you should see where I'm going...)
The long term intention is to reformat the input file into another output file which would be made a little easier on the eye.
I appreciate I'm probably going about this all the wrong way, but I would hugely appreciate it if somebody could set me on the right path. If you're able to take it easy on me in regard to an obvious lack of understanding, I'd appreciate that also.
Thanks for reading
Use fgets to read one line at a time, then extract each field from the input line. Warning: no range checks are performed on the buffers, so take care to size them appropriately.
For example, something like this (I haven't compiled it, so it may contain errors):
void copy_substr(const char * pBuffer, int content_size, int start_idx, int end_idx, char * pOutBuffer)
{
end_idx = end_idx > content_size ? content_size : end_idx;
int j = 0;
for (int i = start_idx; i < end_idx; i++)
pOutBuffer[j++] = pBuffer[i];
pOutBuffer[j] = 0;
return;
}
void test_solution()
{
char buffer_char[200];
fgets(buffer_char,sizeof(buffer_char),stdin); // use your own FILE handle instead of stdin
int len = strlen(buffer_char);
char temp_buffer[100];
// Read the first field: str[0..1), i.e. only character 0 (length 1)
int field_size = 1;
int field_start_ofs = 0;
copy_substr(buffer_char, len, field_start_ofs, field_start_ofs + field_size, temp_buffer);
}
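scanf is a good way to do it; you just need to read the line into a buffer and call sscanf several times with the right offsets.
For example (assuming person.id[i] and person.street_number[i] are ints and person.name[i] is a char array of at least 21 bytes):
char buffer[100];
fgets(buffer, sizeof buffer, file_in); /* fscanf with "%s" would stop at the first space */
sscanf(buffer, "%1d", &person.id[i]);
sscanf(buffer+1, "%20s", person.name[i]); /* stops at the first space, so this only suits single-word names */
sscanf(buffer+1+20, "%5d", &person.street_number[i]);
and so on.
I feel like it is the easiest way to do it.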
Please also consider using an array of your struct instead of a struct of arrays, it just feels wrong to have person.id[i] and not person[i].id
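With an array of structs, each record can be filled from one line by walking the fixed widths in order. Here is a minimal sketch using the widths listed in the question; the struct layout, the field names, and the helpers take_field and parse_person are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Field widths come from the question; each array has one extra byte for
 * the terminating null. */
struct person {
    char id[1 + 1];
    char name[20 + 1];
    char house_number[5 + 1];
    char street[40 + 1];
    char birth_date[10 + 1];
    char reference[5 + 1];
};

/* Copy `width` characters starting at *pos (fewer if the line ends early),
 * terminate the field, and advance *pos past what was copied. */
static void take_field(char *dst, const char **pos, size_t width)
{
    size_t n = 0;

    while (n < width && (*pos)[n] != '\0' && (*pos)[n] != '\n')
        n++;
    memcpy(dst, *pos, n);
    dst[n] = '\0';
    *pos += n;
}

/* Fill one record from one input line. Padding spaces are kept as-is. */
void parse_person(struct person *p, const char *line)
{
    const char *pos = line;

    take_field(p->id, &pos, 1);
    take_field(p->name, &pos, 20);
    take_field(p->house_number, &pos, 5);
    take_field(p->street, &pos, 40);
    take_field(p->birth_date, &pos, 10);
    take_field(p->reference, &pos, 5);
}
```

A caller would declare something like struct person people[100] and call parse_person(&people[i], line) for each line read with fgets. Trimming the padding inside each field is a separate step.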
If you have fixed column widths, you can use pointer arithmetic to access substrings of your string str. If you have a starting index begin,
printf("%s", str + begin) ;
will print the substring beginning at begin and up to the end. If you want to print a string of a certain length, you can use printf's precision specifier .*, which takes a maximum length as additional argument:
printf("%.*s", length, str + begin) ;
If you want to copy the string to a temporary buffer, you could use strncpy, but you have to write the terminating null yourself, because strncpy does not add one when the source is at least as long as the count. You could also use snprintf according to the above pattern:
char buf[length + 1];
snprintf(buf, sizeof(buf), "%.*s", length, str + begin) ;
This will extract leading and trailing spaces, which is probably not what you want. You could write a function to strip the unwanted whitespace; there should be plenty of examples here on SO.
You could also strip the whitespace when copying the substring. The example code below does this with the isspace function/macro from <ctype.h>:
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
int extract(char *buf, const char *str, int len)
{
    const char *end = str + len;
    int tail = -1;
    int i = 0;
    // skip leading white space (isspace needs an unsigned char value)
    while (str < end && *str && isspace((unsigned char)*str)) str++;
    // copy string, remembering where the last non-space character ends
    while (str < end && *str) {
        if (!isspace((unsigned char)*str)) tail = i + 1;
        buf[i++] = *str++;
    }
    if (tail < 0) tail = i;
    buf[tail] = '\0';
    return tail;
}
int main()
{
char str[][80] = {
"0PAUL               22   ACACIA AVENUE                     02/07/1986RN666",
"1BOB                1    POLK ST                           01/04/1988RN802",
"2ALICE              99   WEST HIGHLAND CAUSEWAY            28/06/1982RN774"
};
int i;
for (i = 0; i < 3; i++) {
char *p = str[i];
char id[2];
char name[20];
char number[6];
char street[35];
char bday[11];
char ref[11];
extract(id, p + 0, 1);
extract(name, p + 1, 19);
extract(number, p + 20, 5);
extract(street, p + 25, 34);
extract(bday, p + 59, 10);
extract(ref, p + 69, 10);
printf("<person id='%s'>\n", id);
printf(" <name>%s</name>\n", name);
printf(" <house>%s</house>\n", number);
printf(" <street>%s</street>\n", street);
printf(" <birthday>%s</birthday>\n", bday);
printf(" <reference>%s</reference>\n", ref);
printf("</person>\n\n");
}
return 0;
}
There's a danger here, however: when you access a string at a certain position str + pos, you should make sure that you don't go beyond the actual string length. For example, your string may be terminated after the name. When you then access the birthday, you access valid memory, but it might contain garbage.
You can avoid this problem by padding the full string with spaces.
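Padding can be done right after each line is read. A minimal sketch, assuming the line came from fgets (so it may still end in a newline); pad_line is a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strip a trailing newline from `line` and pad it with
 * spaces up to `width` characters. `size` is the capacity of `line`.
 * Returns 1 on success, 0 if `width` characters plus the terminator do not
 * fit (the line is left untouched in that case). */
int pad_line(char *line, size_t size, size_t width)
{
    size_t len = strcspn(line, "\n");   /* length without the newline */

    if (width + 1 > size)
        return 0;
    if (len < width) {
        memset(line + len, ' ', width - len);
        len = width;
    }
    line[len] = '\0';
    return 1;
}
```

After a successful call with a width that covers the last field, every offset str + pos inside that width points at real characters, either data or padding.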

Scanning in more than one word in C

I am trying to make a program which needs to scan in more than one word, and I do not know how to do this when the length is not known in advance.
My first port of call was scanf, however this only scans in one word (I know you can do scanf("%d %s",temp,temporary);, but I do not know how many words there will be), so I looked around and found fgets. One issue with this is that I cannot find out how to make it move on to the code that follows, e.g.
scanf("%99s",temp);
printf("\n%s",temp);
if (strcmp(temp,"edit") == 0) {
editloader();
}
would run editloader(), while:
fgets(temp,99,stdin);
while(fgets(temporary,sizeof(temporary),stdin))
{
sprintf(temp,"%s\n%s",temp,temporary);
}
if (strcmp(temp,"Hi There")==0) {
editloader();
}
will not move onto the strcmp() code, and will stick on the original loop. What should I do instead?
I would scan one word per loop iteration with scanf() and then append it to the "main" string (with strcat() rather than strcpy(), which would overwrite what is already there).
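The appending step is where buffer overflows tend to creep in, so it helps to check the remaining space each time. A minimal sketch with a hypothetical helper append_word; the single-space separator is an assumption:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append `word` to `dst` (capacity `dstsz`), putting a
 * single space in between if `dst` is not empty. Returns 1 on success, or
 * 0 if the result would not fit (dst is left unchanged). */
int append_word(char *dst, size_t dstsz, const char *word)
{
    size_t used = strlen(dst);
    size_t sep = used > 0 ? 1 : 0;
    size_t need = strlen(word);

    if (used + sep + need + 1 > dstsz)
        return 0;
    if (sep)
        dst[used++] = ' ';
    memcpy(dst + used, word, need + 1);   /* copies the terminator too */
    return 1;
}
```

A loop such as while (scanf("%99s", word) == 1 && append_word(temp, sizeof temp, word)) still needs a stopping condition of its own, for example end of input or a sentinel word, because scanf("%s") skips over newlines.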
Maybe you can use getline. I have used it in VC++ (the C++ istream method linked below); in C, getline() is a POSIX function rather than part of the standard C library, so it is only an option if your platform provides it.
Check here: http://www.daniweb.com/software-development/c/threads/253585
http://www.cplusplus.com/reference/iostream/istream/getline/
Hope you find what you are looking for.
I use this to read from stdin and get the same format that you would get by passing as arguments... so that you can have spaces in words and quoted words within a string. If you want to read from a specific file, just fopen it and change the fgets line.
#include <stdio.h>
#include <stdlib.h> /* malloc */
#include <string.h> /* strlen */
void getargcargvfromstdin(){
char s[255], **av = (char **)malloc(255 * sizeof(char *));
unsigned char i, pos, ac;
for(i = 0; i < 255; i++)
av[i] = (char *)malloc(255 * sizeof(char));
enum quotes_t{QUOTED=0,UNQUOTED}quotes=UNQUOTED;
while (fgets(s,255,stdin)){
i=0;pos=0;ac=0;
while (i<strlen(s)) {
/* '!'=33, 'ÿ'=-1, '¡'=-95 outside of these are non-printables */
if ( quotes && ((s[i] < 33) && (s[i] > -1) || (s[i] < -95))){
av[ac][pos] = '\0';
if (av[ac][0] != '\0') ac++;
pos = 0;
}else{
if (s[i]=='"'){ /* support quoted strings */
if (pos==0){
quotes=QUOTED;
}else{ /* support \" within strings */
if (s[i-1]=='\\'){
av[ac][pos-1] = '"';
}else{ /* end of quoted string */
quotes=UNQUOTED;
}
}
}else{ /* printable ascii characters */
av[ac][pos] = s[i];
pos++;
}
}
i++;
}
//your code here ac is the number of words and av is the array of words
}
}
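If the input exceeds the buffer size, you simply can't read it in one call.
You will have to read it in several passes.
The maximum size you can scan with scanf() is set by the buffer you pass it, so give it a real buffer and a matching field width:
char name[100];
scanf("%99s",name);
Read this: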
http://sekrit.de/webdocs/c/beginners-guide-away-from-scanf.html
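Reading in several passes can be done with repeated fgets calls into a buffer that grows as needed. A minimal sketch; read_line is a hypothetical name, and the starting capacity of 64 and the doubling strategy are arbitrary choices:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one whole line of any length from `in` using repeated fgets calls.
 * Returns a malloc'd string without the trailing newline (the caller frees
 * it), or NULL at end of input or if an allocation fails. */
char *read_line(FILE *in)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), in) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';        /* complete line: drop the newline */
            return buf;
        }
        if (cap - len < 2) {            /* buffer full: double it, keep reading */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }

    if (len > 0)
        return buf;                     /* last line had no newline */
    free(buf);
    return NULL;
}
```

Because the newline is removed, a comparison like strcmp(line, "Hi There") == 0 from the question works directly on the returned string.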

How to do a split of a string

Hi, I would like to know how to split a string in C without #include.
Multiple ways of doing that, which I'll just explain and not write for you as this can only be a homework (or self-enhancement exercise, so the intent is the same).
Either you split the string into multiple strings that you re-allocate into a multi-dimensional array,
or you simply cut the string on separators and add terminal '\0' where appropriate and just copy the starting address of each sub-string to an array of pointers.
The approach for the splitting is similar in both cases, but in the second one you don't need to allocate any memory (but modify the original string), while in the first one you create safe copies of each sub-string.
You were not specific about the splitting, so I don't know if you wanted to cut on substrings, a single character, a list of potential separators, etc.
Good luck.
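For reference, the second approach (cut in place and collect pointers) fits in a few lines and needs no #include at all. This is a sketch under assumptions: it splits on a single character, and the name split_in_place and the max_parts limit are made up for illustration.

```c
/* No #include needed: only built-in types are used.
 * Replace each separator `sep` in `str` with '\0' and store the start of
 * every piece in `parts`. At most `max_parts` pointers are stored; if the
 * limit is reached, the last piece keeps the rest of the string, remaining
 * separators included. Returns the number of pieces stored. */
unsigned long split_in_place(char *str, char sep,
                             char **parts, unsigned long max_parts)
{
    unsigned long count = 0;

    if (max_parts == 0)
        return 0;

    parts[count++] = str;               /* first piece starts at the beginning */
    while (*str != '\0') {
        if (*str == sep) {
            if (count == max_parts)
                break;                  /* no room for another piece */
            *str = '\0';                /* terminate the previous piece */
            parts[count++] = str + 1;   /* next piece starts after the separator */
        }
        str++;
    }
    return count;
}
```

The string passed in must be writable (an array, not a string literal), and the pointers in parts stay valid only as long as that array does.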
Find the point where you would like to split it.
Make two buffers large enough to contain the data.
Use strcpy() or do it manually (see example).
in this code I assume you have a string str[] and would like to split it at the first comma:
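int count; /* declared outside the loop because it is used afterwards */
for(count = 0; str[count] != '\0'; count++) {
    if(str[count] == ',')
        break;
}
if(str[count] == '\0')
    return 0; /* no comma found */
char *s1 = malloc(strlen(str) - count); /* part after the comma, plus '\0' */
strcpy(s1, (str+count+1)); // get part after
char *s2 = malloc(count + 1); /* part before the comma, plus '\0' */
for(int count1 = 0; count1 < count; count1++)
    s2[count1] = str[count1]; // get part before
s2[count] = '\0';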
got it? ;)
Assuming I have complete control of the function prototype, I'd do this (make this a single source file (no #includes) and compile, then link with the rest of the project)
If #include <stddef.h> counts as part of the "without #include" restriction (it shouldn't), then use unsigned long instead of size_t in the code below.
#include <stddef.h>
/* split of a string in c without #include */
/*
** `predst` destination for the prefix (before the split character)
** `postdst` destination for the postfix (after the split character)
** `src` original string to be split
** `ch` the character to split at
** returns the length of `predst`
**
** it is UB if
** src does not contain ch
** predst or postdst has no space for the result
*/
size_t split(char *predst, char *postdst, const char *src, char ch) {
size_t retval = 0;
while (*src != ch) {
*predst++ = *src++;
retval++;
}
*predst = 0;
src++; /* skip over ch */
while ((*postdst++ = *src++) != 0) /* void */;
return retval;
}
Example usage
char a[10], b[42];
size_t n;
n = split(b, a, "forty two", ' ');
/* n is 5; b has "forty"; a has "two" */

How do I set the number of characters output in a fprintf '%s' format using a variable?

I need to write a variable number of characters to a file. For example, let's say I want to print 3 characters. "TO" would print "TO" to a file. "LongString of Characters" would print "Lon" to a file.
How can I do this? (The number of characters is defined in another variable.) I know that fprintf(file,"%10s",string) is possible, but there the 10 is hard-coded.
This one corresponds to your example:
fprintf(file, "%*s", 10, string);
but you mentioned a maximum as well, to also limit the number:
fprintf(file, "%*.*s", 10, 10, string);
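In these format strings the first * is the minimum field width (padding) and the .* is the precision, which for %s is the maximum number of characters printed. If only truncation is wanted, the precision alone is enough. A minimal sketch, with print_prefix as a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: write at most `max_chars` characters of `s` to
 * `out`, with no padding. Returns what fprintf returns: the number of
 * characters written, or a negative value on error. */
int print_prefix(FILE *out, const char *s, int max_chars)
{
    if (max_chars < 0)
        max_chars = 0;   /* a negative precision would be treated as omitted */
    return fprintf(out, "%.*s", max_chars, s);
}
```

With the strings from the question and a limit of 3, this writes "Lon" for "LongString of Characters" and "TO" for "TO". Note that the precision argument must be an int, so a size_t length needs a cast.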
I believe you need "%.*s" (note the dot: "%*s" alone only sets a minimum width), and you'll need to pass the length as an int before the string.
As an alternative, why not try this:
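void print_limit(char *string, size_t num)
{
    size_t len = strlen(string);
    if (num > len)
        num = len; /* short strings: never index past the terminator */
    char c = string[num];
    string[num] = 0;
    fputs(string, file); /* file: an already opened FILE *, as in the question */
    string[num] = c;
}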
Temporarily truncates the string to the length you want and then restores it. Sure, it's not strictly necessary, but it works, and it's quite easy to understand.
As an alternative, you can build a null-terminated copy of your buffer and pass that to printf, as I do in this code:
void print(char *val, size_t size) {
    char *p = (char *)malloc(size+1); // +1 for the terminating null
    char current;
    if (p == NULL) return;
    for (size_t i=0; i < size; ++i) {
        current = val[i];
        if (current != '\0') {
            p[i] = current;
        } else {
            p[i] = '.'; // keep in mind that \0 was replaced by a dot
        }
    }
    p[size] = '\0'; // terminate once, which also covers size == 0
    printf("%s", p);
    free(p);
}
But this is the wrong way to solve the problem. You should use fprintf with the format "%.*s".

Resources