I'm testing whether a URL has one of certain extensions, and I have to do this about 100M times. I'm trying to pass the URL without the query string so I can compare its last 3 characters against some conditions.
My question is: can I pass only http://www.e.com/msusa/DisplayContactPage.jsp to testExtension, without modifying url in main and without calling strdup on the string?
int testExtension(char *url) {
// compare last 3 chars against possible extensions
// return 0 if matches extension, else return 1
return match ? 0 : 1;
}
int main () {
char *url = "http://www.e.com/msusa/DisplayContactPage.jsp?q=string&t=12";
testExtension(url);
}
I can certainly do:
if ((end=strchr(url, '?')))
*end=0;
but that modifies url
Steps you can take:
Find the '?' in the URL.
char* cp = strchr(url, '?');
If you find it, cp marks the end of the path. If you don't find it, point cp at the terminating null character instead. Then move the pointer back by three.
Check that the previous character is a '.'. That is the start of the extension. Pass the pointer to testExtension.
if ( cp == NULL )
{
// No query string: the path ends at the terminating null character.
cp = url + strlen(url);
}
if ( cp - url < 4 )
{
// Too short to hold ".ext". Deal with the condition.
}
cp -= 3;
if ( *(cp-1) != '.' )
{
// Deal with the condition.
}
// Call testExtension.
testExtension(cp);
Make sure you don't access anything beyond the '?' or the null character in testExtension.
If you are not sure about the number of characters in the extension, you can use:
char* cp = strchr(url, '?');
if ( cp == NULL )
{
// No query string: start from the terminating null character.
cp = url + strlen(url);
}
// Move the pointer back until you find the '.'
while ( *cp != '.' && cp > url )
{
--cp;
}
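The steps above can be folded into one read-only helper. This is only a sketch: the name url_has_extension and its ext parameter are made up for illustration, and it returns 1 on a match, which is the opposite of the 0-on-match convention in the question. It uses strcspn() to find the end of the path, so the URL is neither modified nor copied.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the path part of url (everything
 * before the first '?') ends with '.' followed by ext, else 0.
 * The URL is only read, never modified or copied. */
int url_has_extension(const char *url, const char *ext)
{
    assert(url != NULL && ext != NULL);

    size_t path_len = strcspn(url, "?");  /* length up to '?' or the end */
    size_t ext_len = strlen(ext);

    if (path_len < ext_len + 1)           /* no room for ".ext" */
        return 0;

    const char *start = url + path_len - ext_len;
    return start[-1] == '.' && memcmp(start, ext, ext_len) == 0;
}
```

Calling url_has_extension(url, "jsp") on the URL from the question looks only at the characters before the '?'.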
There are a few ways to approach this.
Option 1: Operate on the substring
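static const int EXTENSION_LEN = 3;
static const char EXTENSION[] = "jsp"; // example extension to match
int testExtension(const char *url) {
    // strchr() returns a pointer to the '?', or NULL if there is none
    const char *end = strchr(url, '?');
    if (end == NULL) {
        end = url + strlen(url);
    }
    if (end - url > EXTENSION_LEN) {
        return (0 == strncmp(EXTENSION, end - EXTENSION_LEN, EXTENSION_LEN));
    }
    else {
        return 0;
    }
}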
Depending on how many times you test the same URL, the overhead of searching for the '?' (linear in the length of the base URL) may become significant. You could avoid it by making a copy of the extension (note that you don't need to strdup() the whole URL; copying just the extension is enough).
Option 2: Copy the substring to a new buffer
int testExtension(const char *extension) {
return (0 == strncmp(EXTENSION, extension, EXTENSION_LEN));
}
int main() {
char ext[EXTENSION_LEN + 1]; // one extra byte for the terminator
const char *url = "http://www.e.com/msusa/DisplayContactPage.jsp?q=string&t=12";
int testResult = 0;
const char *end = strchr(url, '?'); // NULL if there is no query string
if ( end != NULL && end - url > EXTENSION_LEN ) {
for ( int idx = 0; idx < EXTENSION_LEN; ++idx ) {
ext[idx] = end[idx - EXTENSION_LEN];
}
ext[EXTENSION_LEN] = 0; // null-terminate
testResult = testExtension(ext);
}
}
If you have a lot of extensions to test against, then a hash table or other data structure may be necessary to achieve decent performance.
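As one example of such a data structure, a sorted array searched with bsearch() is often enough before reaching for a hash table. The sketch below is only an illustration: the list known_exts and the names cmp_ext and is_known_extension are invented here, and it assumes every extension is exactly 3 characters long.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative list only; it must stay sorted for bsearch(). */
static const char *const known_exts[] = { "asp", "htm", "jsp", "php" };

static int cmp_ext(const void *key, const void *elem)
{
    /* key points at 3 extension characters (not necessarily terminated);
     * elem points at one entry of known_exts. */
    return strncmp((const char *)key, *(const char *const *)elem, 3);
}

/* Returns 1 if the 3 characters at ext appear in known_exts, else 0. */
int is_known_extension(const char *ext)
{
    assert(ext != NULL);
    return bsearch(ext, known_exts,
                   sizeof known_exts / sizeof known_exts[0],
                   sizeof known_exts[0], cmp_ext) != NULL;
}
```

Because the comparison looks at 3 characters only, the pointer into the URL can be passed directly, without copying or terminating the extension.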
I was writing the code below and was wondering: if I want to split a string but still retain the original string, is this the best method?
Should the caller provide the char ** array, or should the function split make an additional malloc call and manage the memory for the char ** itself?
Also, is this the most efficient method, or could I optimize the code further?
I have not debugged the code yet, and I am still undecided whether the caller or the function should manage the char ** pointer.
#include <stdio.h>
#include <stdlib.h>
size_t split(const char * restrict string, const char splitChar, char ** restrict parts, const size_t maxParts){
size_t size = 100;
size_t partSize = 0;
size_t len = 0;
size_t newPart = 1;
char * tempMem;
/*
* We just reserve one long block of memory.
* Reaching the split character marks the boundary of a new part.
*/
char * mem = (char*) malloc( sizeof(char) * size );
if ( mem == NULL ) return 0;
for ( size_t i = 0; string[i] != 0; i++ ) {
// If it is a split char we at a new part
if ( string[i] == splitChar) {
// If the last character was not the split character
// Then mem[len] = 0 and increase the len by 1.
if (newPart == 0) mem[len++] = 0;
newPart = 1;
continue;
} else {
// If this is a new part
// and not a split character
// we make a new pointer
if ( newPart == 1 ){
// if reach maxpart we break.
// It is okay here, to not worry about memory
if ( partSize == maxParts ) break;
parts[partSize++] = &mem[len];
newPart = 0;
}
mem[len++] = string[i];
if ( len == size ){
// if ran out of memory realloc.
tempMem = (char*)realloc(mem, sizeof(char) * (size << 1) );
// if fail quit loop
if ( tempMem == NULL ) {
// If we can't get more memory the last part cannot be terminated,
// so drop it and report only the completed parts.
return partSize - 1;
}
// realloc may have moved the block: rebase the pointers stored so far.
if ( tempMem != mem ) {
for ( size_t k = 0; k < partSize; k++ ){
parts[k] = tempMem + (parts[k] - mem);
}
}
size = size << 1;
mem = tempMem;
}
}
}
// If we got here and still in a newPart that is fine no need
// an additional character.
if ( newPart != 1 ) mem[len++] = 0;
// realloc to give back the unneed memory
if ( len < size ) {
tempMem = (char*) realloc(mem, sizeof(char) * len );
// If the resizing did not fail but yielded a different
// memory block;
if ( tempMem != NULL && tempMem != mem ){
for ( size_t i = 0; i < partSize; i++ ){
parts[i] = tempMem + (parts[i] - mem);
}
}
}
return partSize;
}
int main(){
char * tStr = "This is a super long string just to test the str str adfasfas something split";
char * parts[10];
size_t len = split(tStr, ' ', parts, 10);
for (size_t i = 0; i < len; i++ ){
printf("%zu: %s\n", i, parts[i]);
}
}
What is "best" is very subjective, as well as use case dependent.
I personally would keep the parameters as input only, define a struct to contain the split result, and probably return it by value. The struct would probably contain pointers to allocated memory, so I would also create a helper function to free that memory. The parts might be stored as a list of strings (copying the string data) or as index and length pairs into the original string (no string copies needed, but the original string needs to remain valid).
But there are dozens of very different ways to do this in C, and all a bit klunky. You need to choose your flavor of klunkiness based on your use case.
About being "more optimized": unless you are coding for a very small embedded device or something similar, always choose code that is more robust, clearer, easier to use and harder to misuse over code that is more micro-optimized. The useful kind of optimization turns, for example, O(n^2) into O(n log n). Turning O(3n) into O(2n) in a single function is almost always completely irrelevant (you are not going to do string splitting in a game engine's inner rendering loop...).
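To make the index and length idea concrete, here is a minimal sketch. It is a simplified variation rather than the exact design described above: the caller supplies the array of pairs, so nothing is allocated at all. The names struct part and split_spans are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: one part of the original string. */
struct part {
    size_t offset;  /* index of the first character in the original */
    size_t len;     /* number of characters in the part */
};

/* Records up to max_parts non-empty parts of s separated by sep and
 * returns how many were stored. Nothing is allocated or copied, so s
 * must stay valid for as long as the parts are used. */
size_t split_spans(const char *s, char sep, struct part *parts, size_t max_parts)
{
    assert(s != NULL && parts != NULL);

    size_t count = 0;
    size_t i = 0;

    while (s[i] != '\0' && count < max_parts) {
        while (s[i] == sep)              /* skip a run of separators */
            i++;
        if (s[i] == '\0')
            break;
        size_t start = i;
        while (s[i] != '\0' && s[i] != sep)
            i++;
        parts[count].offset = start;
        parts[count].len = i - start;
        count++;
    }
    return count;
}
```

A part can be printed without copying it, for example with printf("%.*s\n", (int)parts[k].len, s + parts[k].offset). The trade-off is that the parts are not null-terminated strings, so they cannot be handed to functions that expect one.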
I would like to use the following function to compare two char arrays:
if(strcmp((PtrTst->cDatVonCom),szGeraeteAntwort)==0)
My problem is that PtrTst->cDatVonCom[5000] has a different size than szGeraeteAntwort[255], and the contents look slightly different (excerpt from the log file):
PtrTst->cDatVonCom:
04/16/19 12:53:36 AB A{CR}{LF}
0 0{CR}{LF}
szGeraeteAntwort:
04/16/19 12:53:36 AB A 0 0{CR}{LF}
Could I check if the command (in this case AB A) is the same in both?
The command can change, and it must be the same in both for the if statement to pass.
UPDATE:
Both char arrays are always present, and I need to check whether szGeraeteAntwort is contained in PtrTst->cDatVonCom.
In C# I would use something like cDatVonCom.Contains(...) to check whether they are the same.
You have two strings whose logical content you want to compare, but their literal presentation may vary. In particular, there may be CR/LF line termination sequences inserted into one or both, which are not significant for the purposes of the comparison. There are many ways to approach this kind of problem, but a common one is to define a unique canonical form for your strings, convert both strings to that form, and compare the results. In this case, the canonical form would presumably be one without any CR or LF characters.
The most general way to approach this is to create canonicalized copies of your strings. This accounts for the case where you cannot modify the strings in-place. For example:
/*
* src - the source string
* dest - a pointer to the first element of an array that should receive the result.
* dest_size - the capacity of the destination buffer
* Returns 0 on success, -1 if the destination array has insufficient capacity
*/
int create_canonical_copy(const char src[], char dest[], size_t dest_size) {
static const char to_ignore[] = "\r\n";
const char *start = src;
size_t dest_length = 0;
int rval = 0;
while (*start) {
size_t segment_length = strcspn(start, to_ignore);
if (dest_length + segment_length + 1 > dest_size) { // the +1 is for the terminator
rval = -1;
break;
}
memcpy(dest + dest_length, start, segment_length);
dest_length += segment_length;
start += segment_length;
start += strspn(start, to_ignore);
}
dest[dest_length] = '\0';
return rval;
}
You might use that like so:
char tmp1[255], tmp2[255];
if (create_canonical_copy(PtrTst->cDatVonCom, tmp1, 255) != 0) {
// COMPARISON FAILS: cDatVonCom has more non-CR/LF data than szGeraeteAntwort
// can even accommodate
return -1;
} else if (create_canonical_copy(szGeraeteAntwort, tmp2, 255) != 0) {
// should not happen, given that szGeraeteAntwort's capacity is the same as tmp2's.
// If it does, then szGeraeteAntwort must not be properly terminated
assert(0);
return -1;
} else {
return strcmp(tmp1, tmp2);
}
That assumes you are comparing the strings for equality only. If you were comparing them for order as well, then you could still use this approach, but you would need to be more careful about canonicalizing as much data as the destination can accommodate, and about properly handling the data-too-large case.
A function that compares the strings while skipping over some characters could be used.
#include <stdio.h>
#include <string.h>
int strcmpskip ( char *match, char *against, char *skip) {
if ( ! match && ! against) { //both are NULL
return 0;
}
if ( ! match || ! against) {//one is NULL
return 1;
}
while ( *match && *against) {//both are not zero
while ( skip && strchr ( skip, *match)) {//skip not NULL and *match is in skip
match++;
if ( ! *match) {//zero
break;
}
}
while ( skip && strchr ( skip, *against)) {//skip not NULL and *against is in skip
against++;
if ( ! *against) {//zero
break;
}
}
if ( *match != *against) {
break;
}
if ( *match) {//not zero
match++;
}
if ( *against) {//not zero
against++;
}
}
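// skip any ignorable characters left at the end of either string
while ( skip && *match && strchr ( skip, *match)) match++;
while ( skip && *against && strchr ( skip, *against)) against++;
return *match - *against;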
}
int main( void) {
char line[] = "04/16/19 12:53:36 AB A\r\n 0 0\r\n";
char text[] = "04/16/19 12:53:36 AB A 0 0\r\n";
char ignore[] = "\n\r";
if ( strcmpskip ( line, text, ignore)) {
printf ( "do not match\n");
}
else {
printf ( "match\n");
}
return 0;
}
There are several things you can do; here are two:
Parse both strings (e.g. using sscanf() or something fancier), ignoring the newlines during parsing. Now you'll have the different fields (or an indication that one of the lines can't be parsed properly, which is an error anyway). Then you can compare the commands.
Use a regular expression matcher on those two strings, to obtain just the command while ignoring everything else (treating CR and LF as newline characters essentially), and compare the commands. Of course you'll need to write an appropriate regular expression.
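The parsing approach might look like the sketch below. It rests on an assumption taken from the log excerpt in the question, not on any known specification: each record starts with a date and a time, and the next two whitespace-separated words form the command. The name same_command and the 15-character field limit are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed record layout: "<date> <time> <word1> <word2> ...", where the
 * two words after the timestamp form the command. Returns 1 if both
 * records parse and carry the same command, 0 otherwise. */
int same_command(const char *a, const char *b)
{
    char a1[16], a2[16], b1[16], b2[16];

    assert(a != NULL && b != NULL);

    /* %*s skips a field without storing it; %s skips leading whitespace,
     * which includes spaces, CR and LF alike. */
    if (sscanf(a, "%*s %*s %15s %15s", a1, a2) != 2)
        return 0;
    if (sscanf(b, "%*s %*s %15s %15s", b1, b2) != 2)
        return 0;

    return strcmp(a1, b1) == 0 && strcmp(a2, b2) == 0;
}
```

If the real records have a different layout, only the format string needs to change.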
I'm trying to organize chunks of an apache log file into an array. For example, assume my apache file has a line like this:
[a] [b] [ab] [abc] file not found: /something
What I want to achieve is an array (let's name it ext) so that:
ext[0] = a
ext[1] = b
ext[2] = ab
ext[3] = abc
I then reserve enough space for 20 entries at 5000 characters each via:
char ext[20][5000];
Then I attempt to call my extraction function as follows:
extract("[a] [b] [ab] [abc]",18,ext);
Ideally, the string is replaced with the variable holding the data and the 18 is replaced with the variable showing the actual string size, but I'm using this data as an example.
The extract function won't compile.
It's complaining that in:
char s[20][5000]=*extr,*p,*l=longstring;
there's an invalid initializer. I'm guessing s[20][5000]=*extr is the culprit, but I'm trying to set up a character array that I can fill using index values and then pass back to the caller.
It then complains:
warning: passing argument 3 of 'extract' from incompatible pointer type
Am I forced to strictly use pointers and mathematics to calculate offsets or is there a way to pass actual char array with the ability to modify them using index values like I tried to do?
long extract(char* longstring,long sz,char **extr){
unsigned long sect=0,si=0,ssi=0;
char s[20][5000]=*extr,*p,*l=longstring;
while (sz-- > 0){
if (*l=='['){sect=1;p=s[si++];if (si > 20){break;}}
if (*l==']'){sect=0;}else{
if (sect==1){*p++=*l;}
}
l++;
}
}
UPDATE:
As suggested, I made minor changes and my code is now as follows:
Mainline:
char ext[20][5000];
extract("[a] [b] [ab] [abc]",18,(char**)ext);
printf("%s\n",ext);
return 0;
Function:
long extract(char* longstring, long sz, char **extr) {
unsigned long sect = 0, si = 0, ssi = 0;
char **s = extr, *p, *l = longstring;
while (sz-- > 0) {
if (*l == '[') {
sect = 1;
p = s[si++];
if (si > 20) {
break;
}
}
if (*l == ']') {
sect = 0;
} else {
if (sect == 1) {
*p++ = *l;
}
}
l++;
}
}
And now I get a segmentation fault. I'm not sure why, since I set the start of one string via p=s[si++] and then increment p as I add data. I even changed p=s[si++] to p=s[si++][0] in an attempt to get the address of the first character at a particular index, but then the compiler shows "warning: assignment makes pointer from integer without a cast".
This uses a scanset, %[], to parse the string. The scan skips leading whitespace and then scans a [. Then the scanset reads characters that are not a ]. Finally a ] is scanned. The %n specifier reports the number of characters processed and that is added to offset to advance through the string. The 4999 prevents writing too many characters to the string [5000].
#include <stdio.h>
#include <stdlib.h>
int extract ( char* longstring,char (*extr)[5000]) {
int used = 0;
int offset = 0;
int si = 0;
while ( ( sscanf ( longstring + offset, " [%4999[^]]]%n", extr[si], &used)) == 1) {
//one item successfully scanned
si++;
offset += used;
if ( si >= 20) { // the array has 20 rows, so stop before writing to extr[20]
break;
}
}
return si;
}
int main( int argc, char *argv[])
{
char ext[20][5000];
int i = 0;
int result = 0;
result = extract("[a] [b] [ab] [abc]", ext);
for ( i = 0; i < result; i++) {
printf("ext[%d] %s\n",i,ext[i]);
}
return 0;
}
I have a function from which I want to return a string (i.e. an array of chars) with no spaces at all. This is my code, which as far as I understand is not right:
char *ignoreSpace( char helpArr[], int length ){
int i = 0; int j = 0;
char withoutSpace[length];
while ( i < length ){
/*if not a space*/
if ( isspace( helpArr[i] ) == FALSE )
withoutSpace[j] = helpArr[i];
i++;
}
return *withoutSpace;
}
My intention in the line:
return *withoutSpace;
Is to return the content of the array withoutSpace so I could parse a string with no spaces at all.
Can you please tell me how can I make it any better?
Your current solution will lose the result of withoutSpace when the function returns as it is only defined in that function's scope.
A better pattern would be to accept a third argument to the function, which is a pointer to a char[] to write the result into, in much the same way the standard functions do (e.g. strcpy).
char* ignoreSpace(char* src, char* dst, int length) {
// copy from src to dst, ignoring spaces
// ...
// ...
return dst;
}
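One possible way to fill in that skeleton is sketched below. It departs from the signature above in two labeled ways: the function is named copy_without_spaces, and the length parameter is replaced by dst_size, the capacity of the destination buffer, so the function can refuse to overflow it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copies src to dst without whitespace. dst_size is the capacity of dst,
 * terminator included (an assumed replacement for the length parameter).
 * Returns dst, or NULL if the result does not fit. */
char *copy_without_spaces(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);

    if (dst_size == 0)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        if (isspace((unsigned char)src[i]))
            continue;                 /* drop whitespace */
        if (j + 1 >= dst_size)        /* keep one byte for the terminator */
            return NULL;
        dst[j++] = src[i];
    }
    dst[j] = '\0';
    return dst;
}
```

The cast to unsigned char matters because passing a negative char value to isspace() is undefined behavior.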
Try this (assuming null terminated string)
void ignoreSpace(char *str) {
int write_pos = 0, read_pos = 0;
for (; str[read_pos]; ++read_pos) {
if (!isspace((unsigned char)str[read_pos])) {
str[write_pos++] = str[read_pos];
}
}
str[write_pos] = 0;
}
You cannot return a pointer to a local variable from a function, because as soon as you leave the function all local variables are destroyed and no longer valid.
You must either
allocate space with malloc in your function and return a pointer to that allocated memory, or
not return a pointer from the function but directly modify the original string.
First solution :
char *ignoreSpace(char helpArr[], int length)
{
int i=0; int j=0;
char *withoutSpace = malloc(length + 1) ; /* +1: the loop also copies the terminator at helpArr[length] */
while(i <= length)
{
/*if not a space*/
if(isspace(helpArr[i]) == FALSE)
withoutSpace[j++] = helpArr[i];
i++;
}
return withoutSpace;
}
Second solution:
char *ignoreSpace(char helpArr[], int length)
{
int i=0; int j=0;
while(i <= length)
{
/*if not a space*/
if(isspace(helpArr[i]) == FALSE)
helpArr[j++] = helpArr[i];
i++;
}
return helpArr;
}
There are some other small corrections in my code. Finding out which ones is left as an exercise to the reader.
You don't increment j, ever. In the case that the current character of the source string is not a space, you probably would like to store it in your output string and then also increment the j by one; so that you'd store the next possible character into the next slot instead of overwriting the 0th one again and again.
So change this:
...
withoutSpace[j] = helpArr[i];
...
into this:
...
withoutSpace[j++] = helpArr[i];
...
And then also append your withoutSpace with a 0 or '\0' (they are the same), so that any string processing function may know its end. Also return the pointer, since you should do that, not the *withoutSpace or withoutSpace[0] (they are the same):
char *ignoreSpace( char helpArr[], int length ){
int i = 0; int j = 0;
char * withoutSpace = malloc( (length + 1) * sizeof * withoutSpace ); // <-- changed this; +1 leaves room for the terminator
while ( i < length ){
/*if not a space*/
if ( isspace( helpArr[i] ) == FALSE )
withoutSpace[j++] = helpArr[i]; // <-- replaced j with j++
i++;
}
withoutSpace[j] = 0; // <-- added this
return withoutSpace;
}
And then you should be good to go, assuming that you can have variable-length arrays.
Edit: Well, variable-length arrays or not, you had better just use dynamic memory allocation with malloc or calloc, because otherwise, as per the comments, you'd be returning a pointer to a local variable. Of course, this requires you to manually free the allocated memory in the end.
As part of learning C, I wrote the following code to combine a directory name with a file name. E.g. combine("/home/user", "filename") will result in /home/user/filename. This function is expected to work across platforms (at least on all popular Linux distributions and on 32- and 64-bit Windows).
Here is the code.
const char* combine(const char* path1, const char* path2)
{
if(path1 == NULL && path2 == NULL) {
return NULL;
}
if(path2 == NULL || strlen(path2) == 0) return path1;
if(path1 == NULL || strlen(path1) == 0) return path2;
char* directory_separator = "";
#ifdef WIN32
directory_separator = "\\";
#else
directory_separator = "/";
#endif
char p1[strlen(path1)]; // (1)
strcpy(p1, path1); // (2)
char *last_char = &p1[strlen(path1) - 1]; // (3)
char *combined = malloc(strlen(path1) + 1 + strlen(path2));
int append_directory_separator = 0;
if(strcmp(last_char, directory_separator) != 0) {
append_directory_separator = 1;
}
strcpy(combined, path1);
if(append_directory_separator)
strcat(combined, directory_separator);
strcat(combined, path2);
return combined;
}
I have the following questions regarding the above code.
Consider the lines numbered 1, 2 and 3. All three lines exist only to get the last element of the string. It looks like I am writing too much code for such a small thing. What is the correct method to get the last element of a char* string?
To return the result, I am allocating a new string using malloc. I am not sure this is the right way to do it. Is the caller expected to free the result? How can I indicate to the caller that they have to free the result? Is there a less error-prone method available?
How do you rate the code (Poor, Average, Good)? What are the areas that can be improved?
Any help would be great.
Edit
Fixed all the issues discussed and implemented the changes suggested. Here is the updated code.
void combine(char* destination, const char* path1, const char* path2)
{
if(path1 == NULL && path2 == NULL) {
strcpy(destination, "");;
}
else if(path2 == NULL || strlen(path2) == 0) {
strcpy(destination, path1);
}
else if(path1 == NULL || strlen(path1) == 0) {
strcpy(destination, path2);
}
else {
char directory_separator[] = "/";
#ifdef WIN32
directory_separator[0] = '\\';
#endif
const char *last_char = path1;
while(*last_char != '\0')
last_char++;
int append_directory_separator = 0;
if(strcmp(last_char, directory_separator) != 0) {
append_directory_separator = 1;
}
strcpy(destination, path1);
if(append_directory_separator)
strcat(destination, directory_separator);
strcat(destination, path2);
}
}
In the new version, the caller has to allocate a large enough buffer and pass it to combine. This avoids the malloc and free issue. Here is the usage:
int main(int argc, char **argv)
{
const char *d = "/usr/bin";
const char* f = "filename.txt";
char result[strlen(d) + strlen(f) + 2];
combine(result, d, f);
printf("%s\n", result);
return 0;
}
Any suggestions for more improvements?
And there is a memory leak:
const char *one = combine("foo", "file");
const char *two = combine("bar", "");
//...
free(one); // needed
free(two); // disaster!
Edit: Your new code looks better. Some minor stylistic changes:
Double semi-colon ;; in line 4.
In line 6, replace strlen(path2) == 0 with path2[0] == '\0' or just !path2[0].
Similarly in line 9.
Remove loop determining last_char, and use const char last_char = path1[strlen(path1) - 1];
Change if(append_directory_separator) to if(last_char != directory_separator[0]). And so you don't need the variable append_directory_separator any more.
Have your function also return destination, similar to strcpy(dst, src), which returns dst.
Edit: And your loop for last_char has a bug: it always returns the end of path1, and so you could end up with a double slash // in your answer. (But Unix will treat this as a single slash, unless it is at the start). Anyway, my suggestion fixes this--which I see is quite similar to jdmichal's answer. And I see that you had this correct in your original code (which I admit I only glanced at--it was too complicated for my taste; your new code is much better).
And two more, slightly-more subjective, opinions:
I would use stpcpy(), to avoid the inefficiency of strcat(). (Easy to write your own, if need be.)
Some people have very strong opinions about strcat() and the like as being unsafe. However, I think your usage here is perfectly fine.
The only time you use last_char is in the comparison that checks whether the last character is a separator.
Why not replace it with this:
/* Retrieve the last character, and compare it to the directory separator character. */
char directory_separator = '\\';
if (path1[strlen(path1) - 1] != directory_separator) /* a separator is needed only when path1 does not already end with one */
{
append_directory_separator = 1;
}
If you want to account for the possibility of multiple character separators, you can use the following. But be sure when allocating the combined string to add strlen(directory_separator) instead of just 1.
/* First part is retrieving the address of the character which is
strlen(directory_separator) characters back from the end of the path1 string.
This can then be directly compared with the directory_separator string. */
char* directory_separator = "\\";
if (strcmp(&(path1[strlen(path1) - strlen(directory_separator)]), directory_separator))
{
append_directory_separator = 1;
}
The less error-prone method would be to have the user give you the destination buffer and its length, much the way strncpy works. This makes it clear that they must manage allocating and freeing the memory.
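A minimal sketch of that interface, assuming snprintf() is available (C99 or later). The name combine_n, its return convention and the hard-coded '/' separator are choices made for this illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of combine(): writes "path1/path2" into dst, which
 * has room for dst_size bytes. Returns 0 on success and -1 if the result
 * was truncated or dst_size is 0. Uses '/' only, for brevity. */
int combine_n(char *dst, size_t dst_size, const char *path1, const char *path2)
{
    assert(dst != NULL && path1 != NULL && path2 != NULL);

    size_t len1 = strlen(path1);
    /* Add a separator only when both parts are non-empty and path1
     * does not already end with one. */
    const char *sep =
        (len1 > 0 && *path2 != '\0' && path1[len1 - 1] != '/') ? "/" : "";

    int n = snprintf(dst, dst_size, "%s%s%s", path1, sep, path2);
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}
```

Since snprintf() never writes more than dst_size bytes, a buffer that is too small yields an error code instead of a buffer overflow.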
The process seems decent enough. I think there's just some specifics that can be worked on, mostly with doing things in a clunky way. But you are doing well, in that you can already recognize that happening and ask for help.
This is what I use:
#if defined(WIN32)
# define DIR_SEPARATOR '\\'
#else
# define DIR_SEPARATOR '/'
#endif
void combine(char *destination, const char *path1, const char *path2) {
if (path1 && *path1) {
size_t len = strlen(path1);
strcpy(destination, path1);
if (destination[len - 1] == DIR_SEPARATOR) {
if (path2 && *path2) {
strcpy(destination + len, (*path2 == DIR_SEPARATOR) ? (path2 + 1) : path2);
}
}
else {
if (path2 && *path2) {
if (*path2 == DIR_SEPARATOR)
strcpy(destination + len, path2);
else {
destination[len] = DIR_SEPARATOR;
strcpy(destination + len + 1, path2);
}
}
}
}
else if (path2 && *path2)
strcpy(destination, path2);
else
destination[0] = '\0';
}
Maybe I'm a bit late to this, but I improved the updated code so that it also works with paths containing something like "/../".
/*
* Combine two paths into one. Note that the function
* will write to the specified buffer, which has to
* be allocated beforehand.
*
* #dst: The buffer to write to
* #pth1: Part one of the path
* #pth2: Part two of the path
*/
void joinpath(char *dst, const char *pth1, const char *pth2)
{
if(pth1 == NULL && pth2 == NULL) {
strcpy(dst, "");
}
else if(pth2 == NULL || strlen(pth2) == 0) {
strcpy(dst, pth1);
}
else if(pth1 == NULL || strlen(pth1) == 0) {
strcpy(dst, pth2);
}
else {
char directory_separator[] = "/";
#ifdef WIN32
directory_separator[0] = '\\';
#endif
const char *last_char = pth1 + strlen(pth1) - 1; /* last character of pth1, which is non-empty here */
int append_directory_separator = 0;
if(strcmp(last_char, directory_separator) != 0) {
append_directory_separator = 1;
}
strcpy(dst, pth1);
if(append_directory_separator)
strcat(dst, directory_separator);
strcat(dst, pth2);
}
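char *rm, *fn;
size_t l;
while((rm = strstr (dst, "/../")) != NULL) {
/* find the '/' that starts the component before "/../" */
for(fn = (rm - 1); fn >= dst; fn--) {
if(*fn == '/') {
l = strlen(rm + 4);
/* the regions overlap, so use memmove; +1 also copies the terminator */
memmove(fn + 1, rm + 4, l + 1);
break;
}
}
if(fn < dst) {
/* no parent component to remove; stop instead of looping forever */
break;
}
}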
}
Just a little remark in order to improve your function:
Windows does support both '/' and '\\' separators in paths. So I should be able to perform the following call:
const char* path1 = "C:\\foo/bar";
const char* path2 = "here/is\\my/file.txt";
char destination [ MAX_PATH ];
combine ( destination, path1, path2 );
An idea when writing a multiplatform project could be to convert '\\' to '/' in any input path (from user input, loaded files...), then you will only have to deal with '/' characters.
Regards.
A quick glance shows:
you are using C++-style comments (//), which are not standard in C89 (they have been allowed since C99)
you are declaring variables part way down the code, which C89 does not allow either (C99 and later do). In C89 they must be declared at the start of a block.
your string p1 at #1 has 1 too many bytes written to it at #2 because strlen returns the length of a string and you need 1 more byte for the null terminator.
the malloc does not allocate enough memory - you need length of path1 + length of path2 + length of separator + null terminator.