Splitting a string without a delimiter in C - c

I want to extract the substring described below from the following string (the contents of a log.txt file).
Since there are no delimiters, I can't use strtok().
So how could I do it?
Log file's contents:
[INFO][2019-10-2323:21:45.638]{"cmd":"set","objects":[{"type":"switch","data":["zwave-dc53:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"DC530401010001","reqid": "0001"}
[INFO][2019-10-2323:22:11.936]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-dc53:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"DC53010401000000","reqid": "0001"}
[INFO][2019-10-2323:22:29.232]{"cmd":"set","objects":[{"type":"switch","data":["zwave-dc53:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"DC530401010001","reqid": "0002"}
[INFO][2019-10-2323:22:29.256]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-dc53:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"DC53010401000000","reqid": "0002"}
[INFO][2019-10-2323:22:33.192]{"cmd":"set","objects":[{"type":"switch","data":["zwave-dc53:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"DC530401010001","reqid": "0003}
[INFO][2019-10-2323:22:48.075]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-dc53:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"DC53010401000000","reqid": "0003"}
[INFO][2019-10-2323:22:48.098]{"cmd":"set","objects":[{"type":"switch","data":["zwave-dc53:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"DC530401010001","reqid": "0004"}
[INFO][2019-10-2323:22:52.034]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-dc53:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"DC53010401000000","reqid": "0004"}
[INFO][2019-10-2323:25:58.509]{"cmd":"set","objects":[{"type":"switch","data":["zwave-dc53:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"DC530401010001","reqid": "0005"}
[INFO][2019-10-2323:26:42.425]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-dc53:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"DC53010401000000","reqid": "0005"}
[INFO][2019-10-2323:27:15.467]{"cmd":"set","objects":[{"type":"switch","data":["zwave-dc53:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"DC530401010001","reqid": "0006"}
[INFO][2019-10-2323:27:42.030]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-dc53:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"DC53010401000000","reqid": "0006"}
[INFO][2019-10-2323:32:45.088]{"cmd":"set","objects":[{"type":"switch","data":["zwave-ffa2:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"FFA20401010001","reqid": "0033"}
[INFO][2019-10-2323:33:11.934]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-ffa2:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"FFA2010401000000","reqid": "0007"}
[INFO][2019-10-2323:36:39.262]{"cmd":"set","objects":[{"type":"switch","data":["zwave-ffa2:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"FFA20401010001","reqid": "0008"}
[INFO][2019-10-2323:36:39.267]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-ffa2:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"FFA2010401000000","reqid": "0008"}
[INFO][2019-10-2323:36:39.267]{"cmd":"set","objects":[{"type":"switch","data":["zwave-ffa2:4-1"],"execution":{"command":"OnOff","params":{"on":true}}}],"raw":"FFA20401010001","reqid": "0022"}
[INFO][2019-10-2323:36:39.332]{"cmd":"status","objects":[{"bridge_key":"zwave","data":[{"hash":"zwave-ffa2:8-0","states":{"OnOff":{"on":false}}}],"type":"switch"}],"raw":"FFA2010401000000","reqid": "0009"}
The substring I want to find is the raw data's value, for example: FFA2010401000000

This should work to extract the raw data, assuming it's hex-encoded data (error checking, memory freeing, and proper headers are omitted for clarity):
#define RAW_STR ",\"raw\":\""
FILE *logfile = fopen( filename, "rb" );
char *line = NULL;
size_t len = 0;
for ( ;; )
{
ssize_t bytesRead = getline( &line, &len, logfile );
if ( bytesRead == -1 )
{
break;
}
char *rawData = strstr( line, RAW_STR );
if ( !rawData )
{
continue;
}
// jump over the "raw":" string to the actual value
rawData += strlen( RAW_STR );
// assume the data is hex
unsigned long long value = strtoull( rawData, NULL, 16 );
...
}
This simple method depends on the log file being consistently formatted. If the log file doesn't always have the RAW_STR in that exact format, it won't work.
I've also assumed you're running on a POSIX system and have access to getline().
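Since the question asks for the substring itself rather than a number, the same strstr() idea can also copy the hex digits into a buffer. The following is only a minimal sketch: the function name extract_raw, the RAW_KEY macro and the return-code convention are assumptions made for illustration, and it still relies on the "raw" field being written exactly as in the sample log.

```c
#include <string.h>

#define RAW_KEY "\"raw\":\""

/* Copies the value of the "raw" field from one log line into out.
 * Returns 0 on success, -1 if the key is missing, the closing quote
 * is missing, or the value does not fit in out (outsz bytes). */
int extract_raw(const char *line, char *out, size_t outsz)
{
    const char *start = strstr(line, RAW_KEY);
    if (start == NULL)
        return -1;
    start += strlen(RAW_KEY);          /* skip past "raw":" */

    const char *end = strchr(start, '"');
    if (end == NULL)
        return -1;

    size_t n = (size_t)(end - start);
    if (n + 1 > outsz)                 /* need room for the terminator */
        return -1;

    memcpy(out, start, n);
    out[n] = '\0';
    return 0;
}
```

Called on each line returned by getline(), for example extract_raw(line, buf, sizeof buf), this should leave FFA2010401000000 in buf for the last line of the sample log.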

You might find regular expressions useful in this scenario. Specifically, regexec() seems appropriate here:
int regexec(const regex_t *preg, const char *string, size_t nmatch,
regmatch_t pmatch[], int eflags);
Or use a proper JSON parser. Check out the C library section on the JSON format description website.
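As a rough illustration of the regexec() suggestion, the sketch below compiles a POSIX extended regular expression with one capture group and copies the captured hex digits out of the line. The function name extract_raw_regex and the exact pattern are assumptions; the pattern presumes the value consists only of hex digits, as in the sample log.

```c
#define _POSIX_C_SOURCE 200809L
#include <regex.h>
#include <string.h>

/* Extracts the hex value of the "raw" field using a POSIX extended regex.
 * Returns 0 on success, -1 on no match, compile failure, or short buffer. */
int extract_raw_regex(const char *line, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[2];   /* m[0] = whole match, m[1] = captured hex digits */
    int rc = -1;

    if (regcomp(&re, "\"raw\":\"([0-9A-Fa-f]+)\"", REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, line, 2, m, 0) == 0) {
        size_t n = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (n < outsz) {
            memcpy(out, line + m[1].rm_so, n);
            out[n] = '\0';
            rc = 0;
        }
    }
    regfree(&re);
    return rc;
}
```

Compiling the pattern on every call keeps the sketch short; when processing a whole log file it would probably make more sense to call regcomp() once and reuse the compiled regex_t for every line.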

Related

how to use zpipe inflate on a binary unsigned char

I have a compressed and base64 encoded string that I want to decompress with zpipe.
I did the tutorial here and it worked great. I b64 decoded the string first, saved it to a file and then used the inf() function to decompress it.
int ret;
char *b64_string = (char *)read_b64_string();
size_t my_string_len;
unsigned char *my_string = b64_decode_ex(b64_string, strlen(b64_string), &my_string_len);
free(b64_string);
write_decoded_b64_to_file(my_string, my_string_len);
free(my_string);
ret = inf();
and then I changed the inf() function to hardcoded files:
int inf()
{
FILE *source;
FILE *dest;
source = fopen("/path/to/my/b64decoded_file/", "r");
dest = fopen("/path/to/my/decompressed_file/", "w");
Now I want to change the inf() function to make it work when the binary is passed as an argument.
int ret;
size_t my_string_len;
unsigned char *my_string = b64_decode_ex(b64_string, strlen(b64_string), &my_string_len);
ret = inf(my_string);
I think I identified this line
strm.avail_in = fread(in, 1, CHUNK, source);
as the one where I have to read in the binary. fread is only for files though. How can I read this binary file in without a file?
Just use fmemopen() to open my_string as a file.
source = fmemopen(my_string, my_string_len, "rb");
(I put in the b in "rb" by habit. Never hurts. Can help.)
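To show how that fits the chunked fread() loop, here is a small sketch that reads a memory buffer through a FILE * in CHUNK-sized pieces. The function name read_buffer_as_stream, the tiny CHUNK value and the byte sum are made up for the demonstration; in the real inf() the fread() result would go into strm.avail_in instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

#define CHUNK 4   /* deliberately tiny so the loop runs several times */

/* Reads a memory buffer through a FILE* in CHUNK-sized pieces, in the
 * same style as the fread() call in inf(), and returns the total number
 * of bytes read. The bytes are summed into *sum so the caller can
 * check that the contents came through intact. */
size_t read_buffer_as_stream(unsigned char *buf, size_t len, unsigned long *sum)
{
    unsigned char in[CHUNK];
    size_t total = 0, n;
    FILE *source = fmemopen(buf, len, "r");   /* POSIX.1-2008 */

    *sum = 0;
    if (source == NULL)
        return 0;

    while ((n = fread(in, 1, CHUNK, source)) > 0) {
        for (size_t i = 0; i < n; i++)
            *sum += in[i];
        total += n;
    }
    fclose(source);
    return total;
}
```

Embedded zero bytes are not a problem here, because fmemopen() is given the explicit length rather than relying on a terminator.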

Search Binary File for a Pattern

I need to search for a binary pattern in a binary file.
How can I do it?
I tried the strstr() function, treating the file and the pattern as strings, but it's not working.
(The pattern is also a binary file.)
This is what I tried:
void isinfected(FILE *file, FILE *sign, char filename[], char filepath[])
{
char* fil,* vir;
int filelen, signlen;
fseek(file, 0, SEEK_END);
fseek(sign, 0, SEEK_END);
filelen = ftell(file);
signlen = ftell(sign);
fil = (char *)malloc(sizeof(char) * filelen);
if (!fil)
{
printf("unseccesful malloc!\n");
}
vir = (char *)malloc(sizeof(char) * signlen);
if (!vir)
{
printf("unseccesful malloc!\n");
}
fseek(file, 0, SEEK_CUR);
fseek(sign, 0, SEEK_CUR);
fread(fil, 1, filelen, file);
fread(vir, 1, signlen, sign);
if (strstr(vir, fil) != NULL)
log(filename, "infected",filepath );
else
log(filename, "not infected", filepath);
free(vir);
free(fil);
}
For any binary handling you should never use one of the strXX functions, because these only (and exclusively) work on C-style zero terminated strings. Your code is failing because the strXX functions cannot look beyond the first binary 0 they encounter.
As your basic idea with strstr appears correct (and only fails because it works on zero terminated strings only), you can replace it with memmem, which does the same on arbitrary data. Since memmem is a GNU C extension (see also Is there a particular reason for memmem being a GNU extension?), it may not be available on your system and you need to write code that does the same thing.
For a very basic implementation of memmem you can use memchr to scan for the first binary character, followed by memcmp if it found something:
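void *my_memmem(const void *big, size_t big_len, const void *little, size_t little_len)
{
    const unsigned char *start = big;
    const unsigned char *iterator = start;
    const unsigned char *last;
    if (little_len == 0)
        return (void *)big;
    if (big_len < little_len)
        return NULL;
    /* last position at which little still fits inside big */
    last = start + (big_len - little_len);
    while (iterator <= last)
    {
        /* find the next byte equal to the first byte of little */
        iterator = memchr(iterator, ((const unsigned char *)little)[0], (size_t)(last - iterator) + 1);
        if (iterator == NULL)
            return NULL;
        if (!memcmp(iterator, little, little_len))
            return (void *)iterator;
        iterator++;
    }
    return NULL;
}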
There are better implementations possible, but unless memmem is an important function in your program, it'll do the job just fine.
The basic idea is to check if vir matches the beginning of fil. If it doesn't, then you check again, starting at the second byte of fil, and repeating until you find a match or until you've reached the end of fil. (This is essentially what a simple implementation of strstr does, except that strstr treats 0 bytes as a special case.)
int i;
for (i = 0; i <= filelen - signlen; ++i) {
    if (memcmp(vir, fil + i, signlen) == 0) {
        return true; // vir found in fil
    }
}
return false; // vir is not in fil
This is the "brute force" approach. It can get very slow if your files are long. There are advanced searching algorithms that can potentially make this much faster, but this is a good starting point.
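Putting the pieces together, a sketch of the whole check might look like the following. Two details of the question's code are worth noting: fseek(file, 0, SEEK_CUR) does not move back to the start, so the following fread() begins at end-of-file, and strstr() takes the data to search as its first argument and the pattern as its second. The helper names read_all and contains_bytes are invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads an entire seekable stream into a malloc'd buffer; the caller
 * frees it. Returns NULL on failure and stores the byte count in *len. */
unsigned char *read_all(FILE *fp, size_t *len)
{
    if (fseek(fp, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(fp);
    if (size < 0)
        return NULL;
    rewind(fp);   /* go back to the start before reading */

    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL)
        return NULL;
    *len = fread(buf, 1, (size_t)size, fp);
    return buf;
}

/* Returns 1 if the needle bytes occur anywhere in the haystack, else 0.
 * An empty needle is treated as "not found". */
int contains_bytes(const unsigned char *hay, size_t hay_len,
                   const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > hay_len)
        return 0;
    for (size_t i = 0; i <= hay_len - needle_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return 1;
    }
    return 0;
}
```

With both files opened in binary mode ("rb"), the check would then be along the lines of contains_bytes(fil, filelen, vir, signlen), with the file contents as the haystack and the signature as the needle.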

ASN1_TIME_print functionality without BIO?

As described in this question: Openssl C++ get expiry date, there is the possibility to write an ASN1 time into a BIO buffer and then read it back into a custom buffer buf:
BIO *bio;
int write = 0;
bio = BIO_new(BIO_s_mem());
if (bio) {
if (ASN1_TIME_print(bio, tm))
write = BIO_read(bio, buf, len-1);
BIO_free(bio);
}
buf[write]='\0';
return write;
How could this be achieved without using BIO at all? The ASN1_TIME_print function is only present when OPENSSL_NO_BIO is not defined. Is there a way to write the time directly into a given buffer?
You can try the sample code below. It doesn't use BIO, but should give you the same output as the OP's example. If you don't trust the ASN1_TIME string, you'll want to add some error checking for:
notBefore->data is > 10 chars
each char value is between '0' and '9'
values for year, month, day, hour, minute, second
type
You should test for the type (i.e. UTC), if you expect multiple types.
You should also test whether or not the date/time is GMT and add that to the string if you want the output to match exactly as if using BIOs. see:
openssl/crypto/asn1/t_x509.c - ASN1_UTCTIME_print or ASN1_GENERALIZEDTIME_print
ASN1_TIME* notBefore = NULL;
int len = 32;
char buf[len];
struct tm tm_time;
notBefore = X509_get_notBefore(x509_cert);
// Format ASN1_TIME with type UTC into a tm struct
if(notBefore->type == V_ASN1_UTCTIME){
strptime((const char*)notBefore->data, "%y%m%d%H%M%SZ" , &tm_time);
strftime(buf, sizeof(char) * len, "%h %d %H:%M:%S %Y", &tm_time);
}
// Format ASN1_TIME with type "Generalized" into a tm struct
if(notBefore->type == V_ASN1_GENERALIZEDTIME){
// I didn't look this format up, but it shouldn't be too difficult
}
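If strptime() is not available, or you want the error checks listed above, the digits can also be parsed by hand. The sketch below needs no OpenSSL headers: it takes the raw time string and its length, handles both the UTCTime form (YYMMDDHHMMSSZ) and the GeneralizedTime form (YYYYMMDDHHMMSSZ), and only accepts times ending in Z. The function name format_asn1_time is made up, the output layout is merely intended to resemble what ASN1_TIME_print produces, and the two-digit-year rule (50-99 means 19xx, 00-49 means 20xx) is the one from RFC 5280.

```c
#include <stdio.h>
#include <string.h>

/* Formats an ASN.1 time string ("YYMMDDHHMMSSZ" for UTCTime or
 * "YYYYMMDDHHMMSSZ" for GeneralizedTime) as e.g. "Oct 23 23:21:45 2019 GMT".
 * Returns the number of characters written, or -1 on malformed input
 * or if buf is too small. */
int format_asn1_time(const char *data, size_t datalen, char *buf, size_t buflen)
{
    static const char *const mon[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    int f[7];                      /* up to seven two-digit fields */
    size_t nfields;

    if (datalen == 13)      nfields = 6;   /* UTCTime */
    else if (datalen == 15) nfields = 7;   /* GeneralizedTime */
    else return -1;
    if (data[datalen - 1] != 'Z')
        return -1;

    for (size_t i = 0; i < nfields; i++) {
        char a = data[2 * i], b = data[2 * i + 1];
        if (a < '0' || a > '9' || b < '0' || b > '9')
            return -1;
        f[i] = (a - '0') * 10 + (b - '0');
    }

    int year, k;
    if (nfields == 6) {            /* two-digit year */
        year = f[0] + (f[0] >= 50 ? 1900 : 2000);
        k = 1;
    } else {                       /* four-digit year */
        year = f[0] * 100 + f[1];
        k = 2;
    }
    int month = f[k], day = f[k + 1], hour = f[k + 2];
    int minute = f[k + 3], second = f[k + 4];
    if (month < 1 || month > 12 || day < 1 || day > 31 ||
        hour > 23 || minute > 59 || second > 59)
        return -1;

    int n = snprintf(buf, buflen, "%s %2d %02d:%02d:%02d %d GMT",
                     mon[month - 1], day, hour, minute, second, year);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

With the structure fields used above, the call would presumably be format_asn1_time((const char *)notBefore->data, notBefore->length, buf, sizeof buf); newer OpenSSL versions may require accessor functions instead of direct field access.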
I think this should be possible, at least in terms of writing the time directly into a given buffer -- but you'll still need to use BIOs.
Ideally, BIO_new_mem_buf would suit, given that it creates an in-memory BIO using a given buffer as the source. Unfortunately, that function treats the given buffer as read-only, which is not what we want. However, we can create our own function (let's call it BIO_new_mem_buf2), based on the BIO_new_mem_buf source code:
BIO *BIO_new_mem_buf2(void *buf, int len)
{
BIO *ret;
BUF_MEM *b;
size_t sz;
if (!buf) {
BIOerr(BIO_F_BIO_NEW_MEM_BUF, BIO_R_NULL_PARAMETER);
return NULL;
}
sz = (size_t)len;
if (!(ret = BIO_new(BIO_s_mem())))
return NULL;
b = (BUF_MEM *)ret->ptr;
b->data = buf;
b->length = sz;
b->max = sz;
return ret;
}
This is just like BIO_new_mem_buf, except that a) the len argument must indicate the size of the given buffer, and b) the BIO is not marked "readonly".
With the above, you should now be able to call:
ASN1_TIME_print(bio, tm)
and have the time appear in your given buffer.
Note that I have not tested the above code, so YMMV. Hope this helps!

Can fseek() be used to insert data into the middle of a file? - C

I know that the function fseek() can be used to output data to a specific location in a file. But I was wondering: if I use fseek() to move to the middle of the file and then output data, would the new data overwrite the old data? For example, if I had a file containing 123456789 and I used fseek() to output newdata after the 5, would the file contain 12345newdata6789 or would it contain 12345newdata?
Writing data in the "middle" of a file will overwrite existing data. So you would have '12345newdata'.
EDIT: As mentioned in the comments below, it should be noted that this overwrites data without truncating the rest of the file. As an extended version of your example, if you wrote newdata after the 5 in a file containing 1234567890ABCDEFG, you would then have 12345newdataCDEFG, not 12345newdata.
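Both cases are easy to try out. The following sketch writes some initial contents to a temporary file, seeks to an offset, writes the new data there and reads the whole file back; the function name overwrite_demo and its parameters are made up for this demonstration.

```c
#include <stdio.h>
#include <string.h>

/* Writes initial to a temporary file, seeks to offset, writes patch there,
 * and reads the whole file back into out (NUL-terminated, outsz >= 1).
 * Returns 0 on success, -1 if the temporary file cannot be created. */
int overwrite_demo(const char *initial, long offset, const char *patch,
                   char *out, size_t outsz)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    fputs(initial, fp);
    fseek(fp, offset, SEEK_SET);   /* move into the middle of the file */
    fputs(patch, fp);              /* overwrites in place; nothing is shifted */

    rewind(fp);
    size_t n = fread(out, 1, outsz - 1, fp);
    out[n] = '\0';
    fclose(fp);
    return 0;
}
```

Calling overwrite_demo("123456789", 5, "newdata", buf, sizeof buf) leaves 12345newdata in buf, while starting from 1234567890ABCDEFG gives 12345newdataCDEFG.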
Yes, it lets you do that; such files are called "random access files". Imagine you already have a formatted file (the record structure is in place, but the records are empty). You can then fill whichever "slots" you want, or overwrite a slot that already holds data.
typedef struct{
int number;
char name[ 20 ];
char lastname[ 20 ];
float score;
}students_t;
/* Supposing that you formatted the file already and the file is opened. */
/* Imagine the students are listed each one has a record. */
void modifyScore( FILE * fPtr ){
    students_t student = { 0, "", "", 0.0 };
    int record;
    float nscore;
    printf( "Enter the number of the student:" );
    scanf( "%d", &record );
    printf( "Enter the new Score:" );
    scanf( "%f", &nscore ); // this is a seek example so I will not complicate things.
    /* Seek to ( record - 1 ), because the file starts at position 0 but the list starts at 1 */
    fseek( fPtr, ( record - 1 ) * sizeof ( students_t ), SEEK_SET );
    /* Now you can read and copy the slot */
    fread( &student, sizeof( students_t ), 1, fPtr );
    /* Seek again because the read moved the file position. */
    fseek( fPtr, ( record - 1 ) * sizeof ( students_t ), SEEK_SET );
    student.score = nscore;
    /* Overwrite the record; only the score is altered. */
    fwrite( &student, sizeof( students_t ), 1, fPtr );
}
This is how it works (picture obtained from Deitel-How to program in C 6th Edition):
You probably know this, but fseek() merely moves the associated position indicator and doesn't dictate per se whether the subsequent output function will overwrite or insert.
You're probably using fwrite() or some other plain vanilla output function, and these will overwrite, giving you "12345newdata" instead of the inserted variant.
On the other hand, you could roll your own inserting function (I don't think there's a stock stdio.h function for this), and call this after fseek() to get your desired insertion.
Something like this could suffice:
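/* Inserts len bytes at the current position of fp, which must be opened
   for update (e.g. "r+b"). Needs <stdio.h> and <stdlib.h>. */
int insert(const void *ptr, size_t len, FILE *fp) {
    long pos = ftell(fp);
    // measure the tail that has to be shifted
    fseek(fp, 0, SEEK_END);
    size_t taillen = (size_t)(ftell(fp) - pos);
    char *tail = malloc(taillen ? taillen : 1);
    if (!tail)
        return -1;
    // save the tail before overwriting
    fseek(fp, pos, SEEK_SET);
    taillen = fread(tail, 1, taillen, fp);
    // write the new data, then the saved tail after it
    fseek(fp, pos, SEEK_SET);
    fwrite(ptr, 1, len, fp);
    fwrite(tail, 1, taillen, fp);
    free(tail);
    return 0;
}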
(Error handling on fread() and fwrite() left out for brevity.)

fwrite() and file corruption

I'm trying to write a wchar array to a file in C, but the output is corrupted: irrelevant data such as variable names and paths like this
c.:.\.p.r.o.g.r.a.m. .f.i.l.e.s.\.m.i.c.r.o.s.o.f.t. .v.i.s.u.a.l. .s.t.u.d.i.o. 1.0...0.\.v.c.\.i.n.c.l.u.d.e.\.x.s.t.r.i.n.g..l.i.s.t...i.n.s.e.r.t
are written to the file along with the correct data (example). I have confirmed that the buffer is null-terminated and contains proper data.
Here's my code:
myfile = fopen("logs.txt","ab+");
fseek(myfile,0,SEEK_END);
long int size = ftell(myfile);
fseek(myfile,0,SEEK_SET);
if (size == 0)
{
wchar_t bom_mark = 0xFFFE;
size_t written = fwrite(&bom_mark,sizeof(wchar_t),1,myfile);
}
// in another func
while (true)
{
[..]
unsigned char Temp[512];
iBytesRcvd = recv(sclient_socket,(char*)&Temp,iSize,NULL);
if(iBytesRcvd > 0 )
{
WCHAR* unicode_recv = (WCHAR*)&Temp;
fwrite(unicode_recv,sizeof(WCHAR),wcslen(unicode_recv),myfile);
fflush(myfile);
}
[..]
}
What could be causing this?
recv() will not null-terminate Temp, so wcslen() runs past the bytes actually written by recv(). You will get correct results if you use iBytesRcvd as the byte count for fwrite() instead of calling wcslen() and hoping the received data is correctly terminated (with a wide null character, that is):
fwrite(unicode_recv, 1, iBytesRcvd, myfile);
