To find the substring in a given text (C program) - c

char *substring(char *text, int position, int length)
{
int i, j=0;
char *temp ;
for(i=position-1; i<position+length-1; i++)
{
temp[j++] = text[i];
}
temp[j] = '\0';
return temp;
}
Hi, what is the error in the code above? I am trying to run this on a Fedora machine, and it gives me a run-time error: "Segmentation fault". What is this error all about, and why does this code trigger it?
Thanks.

temp is uninitialized.

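You need to allocate memory for temp; currently it is just an uninitialized pointer that does not point at any storage you own. You can use malloc for this, but note that the caller will then need to ensure that this storage is subsequently freed.
For example (note that position is zero-based here, unlike in the question):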
char *substring(const char *text, int position, int length)
{
char *temp = malloc(length + 1);
int i, j;
for (i = position, j = 0; i < position + length; i++, j++)
{
temp[j] = text[i];
}
temp[j] = '\0';
return temp;
}

It means that your code has violated some restriction set up by the operating system; in this case, you are writing to memory that you do not have the right to write to.
This is because your temp variable is just an uninitialized pointer: it doesn't contain the address of memory where you are allowed to write.
If you expect to write length + 1 characters, it must point to at least that many bytes of space.
Since you expect to return the string, you need to either make the buffer static (but that can be dangerous) or allocate the space dynamically:
if((temp = malloc(length + 1)) == NULL)
return NULL;
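Putting the pieces above together, a complete version might look like the following minimal sketch. It assumes a 0-based position (the question's code is 1-based) and clamps length to the end of text; both are choices made for this illustration, not something the answers above prescribe.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of up to `length` characters of `text`,
 * starting at the 0-based index `position`. Returns NULL on bad arguments
 * or allocation failure. The caller must free() the result. */
char *substring(const char *text, size_t position, size_t length)
{
    if (text == NULL)
        return NULL;

    size_t text_len = strlen(text);
    if (position > text_len)
        return NULL;                      /* start lies past the end */
    if (length > text_len - position)
        length = text_len - position;     /* clamp to what is available */

    char *temp = malloc(length + 1);
    if (temp == NULL)
        return NULL;

    for (size_t i = 0; i < length; i++)
        temp[i] = text[position + i];
    temp[length] = '\0';
    return temp;
}
```

A caller would check the result against NULL, use it, and then pass it to free().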

I am making a copy of the substring into another buffer; this is just the simple way of extracting one substring of a given string.
I hope I have done that very simple approach in the correct manner.
Also, the methods given by SysAdmin look pretty complex, but thanks for the suggestion anyway; I will try to learn those as well. If you can tell me whether I have implemented this very basic pattern-searching algorithm correctly, that would be very kind.
Thanks.

While the answer is obvious (temp is not initialized),
here is a suggestion.
If your intention is to find a substring in another string,
a few alternatives are:
1. use C strstr(...)
2. Rabin-Karp method
3. Knuth-Morris-Pratt method
4. Boyer-Moore method
Update:
Initially I thought this question was about finding a substring (based on the title).
Anyway, this looks like a strchr() implementation.
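For the first alternative, a minimal sketch of locating one string inside another with the standard strstr() is shown here. The wrapper name find_substring and its index-or-minus-one return convention are assumptions made for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns the 0-based index of the first occurrence of `needle` in
 * `haystack`, or -1 if it does not occur. */
long find_substring(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1;
}
```

strstr() itself returns a pointer into haystack (or NULL), so the subtraction only converts that pointer into an offset.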

It is obvious from the code that you forgot to allocate or initialize the pointer temp. It does not point at any valid storage.
You have to use either malloc or strdup and do the rest. You may also want to explore using strncpy (and null-terminating the result yourself) to simplify the code.

Related

Insert a string inside of another string (without an intermediate string)

I came across this question and it asks if it's possible to write the function
void insert(char* M, char* T, int i)
that inserts the string T inside M starting from the index i, without using an intermediate string... I tried to use realloc but I think there's a problem when the original string M is a lot smaller than the result, my theory is that realloc changes the address of the string to be able to represent the new string.
For example: M="Wg" T="ron" and i=1; the result should be M="Wrong".
I'm using the following code:
void insert(char* M,char* T,int i)
{
int l;
l=strlen(M);
M=realloc(M,l+strlen(T)+1);
for(int j = l-1; j >= i; j--)
{
M[j+strlen(T)]=M[j];
}
for(int j = 0;j < strlen(T); j++)
{
M[i+j]=T[j];
}
M[l+strlen(T)]='\0'; //from what i've tested the string M is correct.
}
and using this declaration:
char *s=malloc(3);
char *c=malloc(18);
strcpy(s,"as");
strcpy(c,"bcdefghijklmnopqr");
insert(s,c,1); //this example does not work on my machine.
I hope this clarifies the question.
So is there a way to do it?
Example of a possible implementation using memmove. Explanations are in the comments
#include <stdio.h>
#include <string.h>
void insertString(char* M, const char* T, size_t index)
{
// ASSUMES there's enough space in M for this operation
// get the original lengths of each string
size_t Mlen = strlen(M);
size_t Tlen = strlen(T);
if (index < Mlen)
{
// M+index+Tlen is the destination position where the remaining characters in M will start
// M+index is the index where T will be inserted
// Mlen-index is the remaining number of characters in M that need to move
memmove(M+index+Tlen, M+index, Mlen-index);
// copy the T string to the space we just created
memcpy(M+index, T, Tlen);
// NUL terminate the new string
M[Mlen + Tlen] = '\0';
}
else
{
// simply strcat if the index falls outside the range of M
strcat(M, T);
}
}
If you're not allowed to use memmove or memcpy, it's simple enough to roll your own.
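As a rough sketch of what rolling your own could look like, the version below replaces memmove and memcpy with two plain loops. The name insertStringManual is made up for this example, and it keeps the same assumption that M already has enough room.

```c
#include <string.h>

/* Same contract as insertString above: M must have room for
 * strlen(M) + strlen(T) + 1 bytes. */
void insertStringManual(char *M, const char *T, size_t index)
{
    size_t Mlen = strlen(M);
    size_t Tlen = strlen(T);
    if (index > Mlen)
        index = Mlen;              /* out-of-range index: append */

    /* Shift the tail (including the terminating NUL) right by Tlen,
     * walking backwards so no byte is overwritten before it is moved. */
    for (size_t k = Mlen + 1; k > index; k--)
        M[k - 1 + Tlen] = M[k - 1];

    /* Fill the gap with T (without its terminator). */
    for (size_t k = 0; k < Tlen; k++)
        M[index + k] = T[k];
}
```

Copying the tail from the end towards the front is what makes the overlapping move safe, which is the job memmove does in the version above.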
[This isn't really an answer; it's a clarification that's too complicated for a comment.]
If you can assume that the caller looks like
char string[6] = "Wg";
insert(string, "ron", 1);
(or with the string array having any size greater than 5), then you can write insert() easily.
If you can assume that the caller looks like
char *string = malloc(3);
strcpy(string, "Wg");
insert(string, "ron", 1);
then you can almost write insert() using realloc to make the string larger, except you have no way to return the possibly-new (that is, possibly moved) value of string.
If the caller might look like
char *string = "Wg";
insert(string, "ron", 1);
or even
char *string = "Wg\0\0\0";
insert(string, "ron", 1);
then you definitely cannot write insert(), because you cannot assume that the pointed-to string is writable (and on many platforms it will not be).
So, in general, the answer is: "No". You cannot write a general-purpose version of insert() that will work under all circumstances.
Note, too, that if you were to assume that the string were in malloc'ed memory and that you could use realloc (as in my second example), that code would not work for strings that were not malloc'ed (that is, it would not work for callers like my first example), and it would have no portable way of knowing, based on the pointer passed to it, whether it would be able to safely use realloc or not.
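One way around the second case, if you are free to change the signature, is to return the possibly moved pointer to the caller. The sketch below assumes M always comes from malloc and that T does not point into M; the name insert_realloc is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* ASSUMES M was obtained from malloc/calloc/realloc and that T does not
 * point into M. Returns the (possibly moved) string, or NULL on failure,
 * in which case M is left untouched and still owned by the caller. */
char *insert_realloc(char *M, const char *T, size_t index)
{
    size_t Mlen = strlen(M);
    size_t Tlen = strlen(T);
    if (index > Mlen)
        index = Mlen;              /* out-of-range index: append */

    char *grown = realloc(M, Mlen + Tlen + 1);
    if (grown == NULL)
        return NULL;

    /* Move the tail plus its NUL terminator, then drop T into the gap. */
    memmove(grown + index + Tlen, grown + index, Mlen - index + 1);
    memcpy(grown + index, T, Tlen);
    return grown;
}
```

The caller has to write s = insert_realloc(s, c, 1); (ideally through a temporary, so that s is not lost if NULL comes back), which is exactly what the void signature in the question cannot express.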

How to fix segfault caused by a realloc going out of bounds?

Hello, and thanks in advance for your help. As I am new to posting questions, I welcome any feedback on how this question has been asked. I have searched SO extensively without finding what I was looking for.
I'm still working on it, and I'm not really good at C.
My purpose is to extract data from certain specific tags in a given XML file and write it to a file. My issue arises because, as I try to fill up the data struct I created for this purpose, at a certain point the realloc() function gives me a pointer to an address that's out of bounds.
If you look at this example
#include <stdio.h>
int main() {
char **arrayString = NULL;
char *testString;
testString = malloc(sizeof("1234567890123456789012345678901234567890123456789"));
strcpy(testString, "1234567890123456789012345678901234567890123456789");
int numElem = 0;
while (numElem < 50) {
numElem++;
arrayString = realloc(arrayString, numElem * sizeof(char**));
arrayString[numElem-1] = malloc(strlen(testString)+1);
strcpy(arrayString[numElem-1], testString);
}
printf("done\n");
return 0;
}
it does a similar, but simplified, thing to my code. Basically, it tries to fill up the char** with C strings, but it segfaults. (Yes, I understand I am using strcpy and not its safer alternatives, but as far as I understand it copies up to and including the '\0', which is automatically appended when you write a string between "", and that's all I need.)
I'll explain in more depth below.
In this code I make use of libxml2, but you don't need to know it to help me.
I have a custom struct declared this way:
struct List {
char key[24][15];
char **value[15];
int size[15];
};
struct List *list; //i've tried to make this static after reading that it could make a difference but to no avail
It is filled up with the necessary key values. list->size[] is initialized with zeros, to keep track of how many values I've inserted in value.
value is declared this way because, for each key, I need an array of char* to store each and every value associated with it. (I thought this through, but it could be the wrong approach and I welcome suggestions, though that's not the purpose of the question.)
I loop through the XML file, and for each node I do a strcmp between the name of the node and each of my keys. When there is a match, the index of that key is used as an index into the value matrix. I then try to extend the allocated memory for the C string matrix, and afterwards for the single char*.
The "broken" code follows, where
read is the index of the key mentioned above.
reader is the xmlNode
string contained the name of the xmlNode but is then freed, so consider it a new char*
list is the struct declared above
if (xmlTextReaderNodeType(reader) == 3 && read >= 0)
{
/* pull out the node value */
xmlChar *value;
value = xmlTextReaderValue(reader);
if (value != NULL) {
free(string);
string=strdup(value);
/*increment array size */
list->size[read]++;
/* allocate char** */
list->value[read]=realloc(list->value[read],list->size[read] * sizeof(char**));
if (list->value[read] == NULL)
return 16;
/*allocate string (char*) memory */
list->value[read][list->size[read]-1] = realloc(list->value[read][list->size[read]-1], sizeof(char*)*sizeof(string));
if (list->value[read][list->size[read]-1] == NULL)
return 16;
/*write string in list */
strcpy(list->value[read][list->size[read]-1], string);
}
/*free memory*/
xmlFree(value);
}
xmlFree(name);
free(string);
I'd expect this to allocate the char**, and then the char*, but after a few iterations of this code (which is a function wrapped in a while loop) I get a segfault.
Analyzing this with gdb (I'm not an expert with it, I just learned it on the fly), I noticed that the code does seem to work as expected for 15 iterations. At the 16th iteration, after the size is incremented, list->value[read][list->size[read]-1] points to 0x51, which gdb marks as an address out of bounds. The realloc only changes it to 0x3730006c6d782e31, still marked as out of bounds. I would expect it to point at the last allocated value.
Here is an image of that: https://imgur.com/a/FAHoidp
How can I properly allocate the needed memory without going out of bounds?
Your code has quite a few problems:
You are not including all the appropriate headers. How did you get this to compile? If you are using malloc and realloc, you need to #include <stdlib.h>. If you are using strlen and strcpy, you need to #include <string.h>.
Not really a mistake, but unless you are applying sizeof to a type name you don't need the enclosing parentheses.
Stop using sizeof str to get the length of a string. The correct and safe approach is strlen(str)+1. If you apply sizeof to a pointer someday you will run into trouble.
Don't use sizeof(type) as argument to malloc, calloc or realloc. Instead, use sizeof *ptr. This will avoid your incorrect numElem * sizeof(char**) and instead replace it with numElem * sizeof *arrayString, which correctly translates to numElem * sizeof(char*). This time, though, you were saved by the pure coincidence that sizeof(char**) == sizeof(char*), at least on GCC.
If you are dynamically allocating memory, you must also deallocate it manually when you no longer need it. Use free for this purpose: free(testString);, free(arrayString);.
Not really a mistake, but if you want to cycle through elements, use a for loop, not a while loop. This way your intention is known by every reader.
This code compiles fine on GCC:
#include <stdio.h> //NULL, printf
#include <stdlib.h> //malloc, realloc, free
#include <string.h> //strlen, strcpy
int main()
{
char** arrayString = NULL;
char* testString;
testString = malloc(strlen("1234567890123456789012345678901234567890123456789") + 1);
strcpy(testString, "1234567890123456789012345678901234567890123456789");
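for (int numElem = 1; numElem <= 50; numElem++) //50 elements, as in the original loop
{
arrayString = realloc(arrayString, numElem * sizeof *arrayString);
arrayString[numElem - 1] = malloc(strlen(testString) + 1);
strcpy(arrayString[numElem - 1], testString);
}
for (int i = 0; i < 50; i++) //free each string before the array that holds them
free(arrayString[i]);
free(arrayString);
free(testString);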
printf("done\n");
return 0;
}
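To tie the points above back to the XML code in the question, here is a minimal sketch of the "append one string to a growing array" pattern. The names StringList, string_list_append and string_list_free are hypothetical. Note that the slot added by realloc holds an indeterminate value, so it is filled from a fresh malloc rather than being passed to realloc, and that the size comes from strlen, not sizeof.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: a growable array of owned C strings. */
struct StringList {
    char **items;
    size_t count;
};

/* Appends a copy of `s`. Returns 0 on success, -1 on allocation failure
 * (the list contents and count are unchanged in that case). */
int string_list_append(struct StringList *list, const char *s)
{
    /* Grow through a temporary so the old block is not lost on failure. */
    char **grown = realloc(list->items, (list->count + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    list->items = grown;

    /* The new slot holds an indeterminate value after realloc, so it must
     * be assigned from malloc, never passed to realloc itself. */
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, len);

    list->items[list->count++] = copy;
    return 0;
}

/* Frees every string and then the array. */
void string_list_free(struct StringList *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = 0;
}
```

In the question's struct, each value[read] / size[read] pair would play the role of one such list, starting out as NULL and 0.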

Abort trap: 6 error with arrays in c

The following code compiled fine yesterday for a while, started giving the abort trap: 6 error at one point, then worked fine again for a while, and again started giving the same error. All the answers I've looked up deal with strings of some fixed specified length. I'm not very experienced in programming so any help as to why this is happening is appreciated. (The code is for computing the Zeckendorf representation.)
If I simply use printf to print the digits one by one instead of using strings the code works fine.
#include <string.h>
// helper function to compute the largest fibonacci number <= n
// this works fine
void maxfib(int n, int *index, int *fib) {
int fib1 = 0;
int fib2 = 1;
int new = fib1 + fib2;
*index = 2;
while (new <= n) {
fib1 = fib2;
fib2 = new;
new = fib1 + fib2;
(*index)++;
if (new == n) {
*fib = new;
}
}
*fib = fib2;
(*index)--;
}
char *zeckendorf(int n) {
int index;
int newindex;
int fib;
char *ans = ""; // I'm guessing the error is coming from here
while (n > 0) {
maxfib(n, &index, &fib);
n -= fib;
maxfib(n, &newindex, &fib);
strcat(ans, "1");
for (int j = index - 1; j > newindex; j--) {
strcat(ans, "0");
}
}
return ans;
}
Your guess is quite correct:
char *ans = ""; // I'm guessing the error is coming from here
That makes ans point to a read-only array of one character, whose only element is the string terminator. Trying to append to this will write out of bounds and give you undefined behavior.
One solution is to dynamically allocate memory for the string, and if you don't know the size beforehand then you need to reallocate to increase the size. If you do this, don't forget to add space for the string terminator, and to free the memory once you're done with it.
Basically, you have two approaches when you want to receive a string from a function in C:
1. The caller allocates a buffer (either statically or dynamically) and passes it to the callee as a pointer and a size. The callee writes the data to the buffer. If it fits, the callee returns success as a status. If it does not fit, it returns an error. You may decide that in such a case either the buffer is left untouched or it contains as much data as fits in the size. You can choose whatever suits you better; just document it properly for future users (including yourself in the future).
2. The callee allocates the buffer dynamically, fills it and returns a pointer to it. The caller must free the memory to avoid a memory leak.
In your case the zeckendorf() function can determine how much memory is needed for the string. The index of the largest Fibonacci number not exceeding the parameter determines the length of the result. Add 1 for the terminating zero and you know how much memory you need to allocate.
So, if you choose the first approach, you need to pass two additional parameters to the zeckendorf() function, char *buffer and int size, and write to buffer instead of ans. You also need some marker to know whether it is the first iteration of the while() loop. If it is, check the condition index+1<=size after maxfib(n, &index, &fib);. If the condition is true, you can proceed with your function. If not, you can return an error immediately.
For second approach initialize the ans as:
char *ans = NULL;
after maxfib(n, &index, &fib); add:
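if(ans==NULL) {
ans=malloc(index+1);
if(ans==NULL) return NULL; /* allocation failed */
ans[0]='\0'; /* strcat needs an empty, terminated string to append to */
}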
and continue as you did. Return ans from the function. Remember to call free() in the caller when the result is no longer needed, to avoid a memory leak.
In both cases remember to write the terminating \0 to buffer.
There is also a third approach. You can declare ans as:
static char ans[20];
inside zeckendorf(). The function then behaves as in the first approach, but the buffer and its size are hardcoded. I recommend to #define BUFSIZE 20, declare the variable as static char ans[BUFSIZE]; and use BUFSIZE when checking the available size. Please be aware that this works only in a single-threaded environment, and that every call to zeckendorf() will overwrite the previous result. Consider the following code.
char *a,*b;
a=zeckendorf(10);
b=zeckendorf(15);
printf("%s\n",a);
printf("%s\n",b);
The zeckendorf() function always returns the same pointer. So a and b would point to the same buffer, where the string for 15 would be stored. So, you either need to store the result somewhere, or do the processing in the proper order:
a=zeckendorf(10);
printf("%s\n",a);
b=zeckendorf(15);
printf("%s\n",b);
As a rule of thumb, the majority (if not all) of the standard C library functions on Linux use either the first or the third approach.
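A minimal sketch of the second approach (the callee allocates, the caller frees) could look like this. It is not the asker's code: instead of calling maxfib repeatedly it builds a small table of Fibonacci numbers first, and the name zeckendorf_alloc and the assumption that int is at most 32 bits wide are specific to this illustration.

```c
#include <stdlib.h>

/* Returns the Zeckendorf representation of n as a newly allocated string
 * of '0'/'1' digits, most significant digit first. Returns NULL if
 * n <= 0 or if allocation fails. The caller must free() the result.
 * ASSUMES int is at most 32 bits wide (the table has 46 slots). */
char *zeckendorf_alloc(int n)
{
    if (n <= 0)
        return NULL;

    /* Collect the Fibonacci numbers 1, 2, 3, 5, 8, ... that are <= n. */
    long long fibs[46];
    size_t count = 0;
    long long a = 1, b = 2;
    while (a <= n) {
        fibs[count++] = a;
        long long next = a + b;
        a = b;
        b = next;
    }

    /* One digit per collected number, plus the terminator. */
    char *ans = malloc(count + 1);
    if (ans == NULL)
        return NULL;

    /* Greedy choice from the largest number down. */
    long long rest = n;
    for (size_t i = 0; i < count; i++) {
        long long f = fibs[count - 1 - i];
        if (f <= rest) {
            ans[i] = '1';
            rest -= f;
        } else {
            ans[i] = '0';
        }
    }
    ans[count] = '\0';
    return ans;
}
```

Because the required length is known before anything is written, the digits can be stored by index and no strcat (or realloc) is needed.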

Copying unsigned char in C

I want to use memcpy but it seems to me that it's copying the array from the start?
I wish to copy from A[a] to A[b]. So, instead I found an alternative way,
void copy_file(char* from, int offset, int bytes, char* to) {
int i;
int j = 0;
for (i = offset; i <= (offset+bytes); i++) to[i] = from[j++];
}
I'm getting seg faults but I don't know where I am getting this seg fault from?
each entry holds 8 bytes so my second attempt was
void copy_file(char* from, int offset, int bytes, char* to) {
int i;
int j = 0;
for (i = 8*offset; i <= 8*(offset+bytes); i++) to[i] = from[j++];
}
but still seg fault. If you need more information please don't hesitate to ask!
I'm getting seg faults but I don't know where I am getting this seg fault from?
Primary Suggestion: Learn to use a debugger. It provides helpful information about erroneous instruction(s).
To answer your query about the code snippet shown in the question above:
Check the incoming pointers (to and from) against NULL before dereferencing them.
Put a check on the boundary limits for indexes used. Currently they can overrun the allocated memory.
To use memcpy() properly:
as per the man page, the signature of memcpy() indicates
void *memcpy(void *dest, const void *src, size_t n);
it copies n bytes from the address pointed to by src to the address pointed to by dest.
Also, a very very important point to note:
The memory areas must not overlap.
So, to copy A[a] to A[b], you may write something like
memcpy(destbuf, &A[a], (b-a) );
it seems to me that memcpy copying the array from the start
No, it does not. In fact, memcpy does not have the slightest idea that it is copying from or to an array. It treats its arguments as pointers to unstructured memory blocks.
If you wish to copy from A[a] to A[b], pass an address of A[a] and the number of bytes between A[b] and A[a] to memcpy, like this:
memcpy(Dest, &A[a], (b-a) * sizeof(A[0]));
This would copy the content of A from index a, inclusive, to index b, exclusive, into a memory block pointed to by Dest. If you wish to apply an offset to Dest as well, use &Dest[d] for the first parameter. Multiplication by sizeof is necessary for arrays of types other than char, signed or unsigned.
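As an illustration of the call above, here is a small sketch for unsigned char arrays. The helper name copy_range and its half-open [first, last) convention are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Copies elements src[first] .. src[last - 1] into dest.
 * Returns the number of elements copied. dest must have room for them
 * and must not overlap the source range. */
size_t copy_range(unsigned char *dest, const unsigned char *src,
                  size_t first, size_t last)
{
    if (last <= first)
        return 0;                  /* empty or inverted range */
    size_t count = last - first;
    memcpy(dest, &src[first], count * sizeof src[0]);
    return count;
}
```

For unsigned char the multiplication by sizeof src[0] is a no-op, but keeping it makes the same line correct if the element type ever changes.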
Change the last line from
for (i = offset; i <= (offset+bytes); i++)
to[i] = from[j++];
to
for (i = offset; i <= bytes; i++,j++)
to[j] = from[i];
This works fine for me. I have treated offset as the index of the first element to copy and bytes as the index of the last one, i.e. it copies from[offset] through from[bytes] into to[], starting at to[0].

C memory free confusion

I have a short question about my code. I've created two situations, or examples, for testing.
example 1:
char *arr[1000000];
int i = 0;
for (; i < 1000000; i++){
char *c = (char *) calloc(1, sizeof(char) * 10);
free(c);
}
example 2:
char *arr[1000000];
int i = 0;
for (; i < 1000000; i++){
char *c = (char *) calloc(1, sizeof(char) * 10);
arr[i] = c;
free(arr[i]);
arr[i] = NULL;
}
The difference between the examples: the pointer is stored in an array before the memory is freed.
When I run example 1 it frees all memory. When I run example 2 it doesn't free all memory.
I've searched and looked but couldn't figure it out.
Why is the result of example 2 different from example 1?
My common sense tells me examples 1 and 2 should give the same result, but in practice they don't. I use Linux top to check memory usage.
The results are the same. I'm not sure why you think there are differences.
It is caused by demand-paging. The process has the address space for the array (that is: pagetable entries exist for it) but there is no memory attached to it (yet). The loop assigns to (eventually) all the memory pages that belong to array[], so at the end of the loop all pages have been "faulted-in".
As a proof of concept, you can replace the loop with:
for (; i < 1000000; i++){
arr[i] = "hello, world!";
}
And the result will probably be (almost) the same as in example 2.
Both are the same.
Since you use top for reading memory the difference can be explained with compiler optimizations. For example, the array in example one can be completely optimized out.
For checking memory issues, you should use valgrind or a similar tool.

Resources