Issues printing whilst using getline() - c

I'm just having a bit of difficulty with a print. Basically, I have code and I'm assigning values to bestmatch[], which is defined as being of type line_t (see struct at bottom).
As you can see, I am storing values for bestmatch[].score (double), bestmatch[].index (int) and bestmatch[].buf (string). When I print them, as shown in the code block below, bestmatch[i].index and bestmatch[i].score print correctly; however, bestmatch[i].buf does not print at all.
Just to confuse matters more (for myself at least), if I print bestmatch[i].buf at the end of scorecmp (first code block), it prints fine. I've got my call to scorecmp down the very bottom for reference.
Why is it that it is printing index and score fine, but not buf? Or even more, how can I fix this behaviour?
Thank you for your help! Please let me know if you need any additional information
The print, appearing in main, is as follows (for reference, TOP_SCORING_MAX is the number of elements in bestmatch[]):
int i;
for (i = 0; i<TOP_SCORING_MAX; i++) {
if (bestmatch[i].score != -1) {
printf("line\t%d, score = %6.3f and string is %s \n",
bestmatch[i].index,bestmatch[i].score, bestmatch[i].buf);
}
}
And in case you would like the struct:
typedef struct line_t {
char* buf;
int lineLength;
int wordCount;
int index;
double score;
} line_t;
This is my call to scorecmp:
scorecmp(linePtr, bestmatch);

You need to copy the content of the strings, not just the pointers, because they seem to be destroyed, freed, or mutilated before you print them:
bestmatch[j].buf = strdup(linePtr->buf);
Don't forget to free the copied string at the end.
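A minimal sketch of that fix, since the body of scorecmp is not shown. The helper names store_match and free_match are hypothetical, and the sketch assumes the destination slot does not already own a string (if it does, free the old one first):

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <stdlib.h>
#include <string.h>

/* Same layout as the struct in the question. */
typedef struct line_t {
    char *buf;
    int lineLength;
    int wordCount;
    int index;
    double score;
} line_t;

/* Hypothetical helper: copy *src into *dst, duplicating the text so that
   dst->buf stays valid after src->buf is overwritten or freed.
   Returns 0 on success, -1 if strdup fails (dst is left unchanged). */
int store_match(line_t *dst, const line_t *src)
{
    char *copy = strdup(src->buf);
    if (copy == NULL)
        return -1;
    *dst = *src;      /* copies the numeric fields (and the shared pointer) */
    dst->buf = copy;  /* replace the shared pointer with the private copy */
    return 0;
}

/* Hypothetical helper: release the copy made by store_match. */
void free_match(line_t *m)
{
    free(m->buf);
    m->buf = NULL;
}
```

Inside scorecmp this would replace a plain pointer assignment, e.g. store_match(&bestmatch[j], linePtr), with one free_match call per used slot once the results have been printed.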

The getline function is the preferred method for reading lines of text from a stream.
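The other standard functions, such as gets, fgets and scanf, are less robust: gets cannot limit how much it reads, and fgets and scanf read into a buffer whose size you must fix in advance.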
The getline function reads an entire line from a stream, up to and including the next newline character.
This function takes three parameters:
A pointer to a pointer to a block of memory allocated with malloc or calloc. This parameter is of type char**; when the function returns, the pointer it refers to points at the line read by getline.
A pointer to a variable of type size_t. This variable holds the size in bytes of the block of memory referred to by the first parameter.
The stream from which to read the line.
The block of memory referred to by the first parameter is merely a suggestion. getline will automatically enlarge it via realloc as needed, so there is never a shortage of space, which is one reason why this function is so safe. Not only that, but it also tells you the new size of the block by updating the variable that the second parameter points to.
That being said, getline reuses whatever buffer you hand it, so if every line has to stay available in its own buffer (as bestmatch[].buf requires), then before each call to getline you first need to:
Set maxSz to a reasonable size.
Set line.buf = malloc(maxSz).
Set the value of maxSz not too large, in order to reduce the amount of redundant memory used.
Set the value of maxSz not too small, in order to reduce the number of times getline calls realloc.
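getline also accepts a NULL buffer pointer together with a size of 0, in which case it allocates the buffer itself. The sketch below uses that form to give every line its own buffer; read_lines is a hypothetical helper name, not something from the question:

```c
#define _POSIX_C_SOURCE 200809L  /* for getline */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: read up to max lines from fp, storing each one in
   its own malloc'd buffer in lines[]. Returns the number of lines read.
   The caller frees every lines[i]. */
size_t read_lines(FILE *fp, char **lines, size_t max)
{
    size_t count = 0;
    while (count < max) {
        char *buf = NULL;  /* NULL with size 0 asks getline to allocate */
        size_t cap = 0;
        ssize_t len = getline(&buf, &cap, fp);
        if (len == -1) {   /* end of input or error */
            free(buf);     /* getline may still have allocated a buffer */
            break;
        }
        lines[count++] = buf;  /* keep this buffer; the next call gets a new one */
    }
    return count;
}
```

Because no buffer is shared between iterations, a pointer saved from an earlier line is not overwritten when the next line is read. Each stored line still ends with its newline character, if the input had one.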

Related

Freeing a C pointer after altering its value

Can I free a pointer such as:
unsigned char *string1=NULL;
string1=malloc(4096);
After altering its value like:
*string1+=2;
Can free(string1) recognize the corresponding memory block to free after incrementing it (for example to point to a portion of a string), or do I need to keep the original pointer value for freeing purposes?
For example, for an implementation of the Visual Basic 6 function LTrim in C, I need to pass **string as a parameter, but in the end I will return *string+=string_offset_pointer to start beyond any blank spaces/tabs.
I think that here I am altering the pointer so if I do this in this way I will need to keep a copy of the original pointer to free it. It will probably be better to overwrite the non-blank contents into the string itself and then terminate it with 0 to avoid requiring an additional copy of the pointer just to free the memory:
void LTrim(unsigned char **string)
{
unsigned long string_length;
unsigned long string_offset_pointer=0;
if(*string==NULL)return;
string_length=strlen(*string);
if(string_length==0)return;
while(string_offset_pointer<string_length)
{
if(
*(*string+string_offset_pointer)!=' ' &&
*(*string+string_offset_pointer)!='\t'
)
{
break;
}
string_offset_pointer++;
}
*string+=string_offset_pointer;
}
It would probably be best to make the function to overwrite the string with a substring of it but without altering the actual value of the pointer to avoid requiring two copies of it:
void LTrim(unsigned char **string)
{
unsigned long string_length;
unsigned long string_offset_pointer=0;
unsigned long string_offset_rebase=0;
if(*string==NULL)return;
string_length=strlen(*string);
if(string_length==0)return;
//Detect the first leftmost non-blank
//character:
///
while(string_offset_pointer<string_length)
{
if(
*(*string+string_offset_pointer)!=' ' &&
*(*string+string_offset_pointer)!='\t'
)
{
break;
}
string_offset_pointer++;
}
//Copy the non-blank spaces over the
//originally blank spaces at the beginning
//of the string, from the first non-blank
//character up to the string length:
///
while(string_offset_pointer<string_length)
{
*(*string+string_offset_rebase)=
*(*string+string_offset_pointer);
string_offset_rebase++;
string_offset_pointer++;
}
//Terminate the newly-copied substring
//with a null byte for an ASCIIZ string.
//If the string was fully blank we will
//just get an empty string:
///
*(*string+string_offset_rebase)=0;
//Free the now unused part of the
//string. It assumes that realloc()
//will keep the current contents of our
//memory buffers and will just truncate them,
//like in this case where we are requesting
//to shrink the buffer:
///
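{
unsigned char *shrunk=realloc(*string,strlen((char *)*string)+1);
if(shrunk!=NULL)*string=shrunk; //realloc may return a different address
}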
}
Since you're actually doing
unsigned char *string1=NULL;
string1=malloc(4096);
*string1+=2;
free(string1);
free(string1) IS being passed the result of a malloc() call.
The *string1 += 2 will - regardless of the call of free() - have undefined behaviour if string1[0] is uninitialised. (i.e. If there is some operation that initialises string1[0] between the second and third lines above, the behaviour is perfectly well defined).
If the asterisk is removed from *string1 += 2 to form a statement string1 += 2 then the call of free() will have undefined behaviour. It is necessary for free() to be passed a value that was returned by malloc() (or calloc() or realloc()) that has not otherwise been deallocated.
The value passed to free() must be a pointer returned by malloc(), calloc(), or realloc(). Any other value results in undefined behavior.
So you have to save the original pointer if you modify it. In your code you don't actually modify the pointer, you just increment the contents of the location that it points to, so you don't have that problem.
Why is the language specified this way, you might ask? It allows for very efficient implementations. A common design is to store the size of the allocation in the memory locations just before the data. So the implementation of free() simply reads the memory before the address it is given to determine how much to reclaim. If you give it some other address, there's no way for it to know that this is in the middle of an allocation and that it needs to scan back to the beginning to find the information.
A more complicated design would keep a list of all the allocations, and then determine which one the address points into. But this would use more memory and would be much less efficient, since it would have to search for the containing allocation.
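As a rough illustration of the rule, the sketch below advances a second pointer past the leading blanks and frees the block through the original one. Both helper names (skip_leading_blanks, trimmed_length) are hypothetical and only serve the example:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the first character of s that
   is not a space or tab. The result points INTO s; it is not a new
   allocation and must never be passed to free(). */
char *skip_leading_blanks(char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/* Hypothetical helper showing the ownership rule: the block is freed
   through the original pointer, not the advanced one. Returns the length
   of text without its leading blanks, or -1 if allocation fails. */
long trimmed_length(const char *text)
{
    char *original = malloc(strlen(text) + 1);  /* the value free() needs */
    if (original == NULL)
        return -1;
    strcpy(original, text);

    char *trimmed = skip_leading_blanks(original);  /* may differ from original */
    long len = (long)strlen(trimmed);

    free(original);  /* correct: exactly the value malloc returned */
    return len;
}
```

Calling free(trimmed) instead would be undefined behaviour whenever the string starts with a blank, because trimmed would then point into the middle of the block.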

Dynamically alloc size of array based on user input in C

I'm new to C so I'm having a little trouble handling everything Java already did for me in the background.
Basically what I would like to achieve is this:
Declare an array of char with no specified size
Ask the user for an input string (a single word or a phrase)
Set the size of the previous array of char to the length of the input string (dynamically)
Put the input string inside the char array
I've tried using scanf but it doesn't seem to handle a string as an array of char(?), so I'm not able to work on the variable.
I've also read about malloc() functions which dynamically allocates space for an array so I could use it to set the size of the array as the strlen of the string and then put '\0' at the end (just like .asciiz in some assembly language) but I can't figure out how to correlate malloc and input string.
Any help would be appreciated!
Thanks for your attention
You can use getline to read an entire line and not have to worry about managing memory during that read. The function was only standardized in POSIX.1-2008, so if you’re using glibc you’ll need to compile with -D_POSIX_C_SOURCE=200809L, for example.
To summarize the linked documentation: getline takes a pointer to a string, and will allocate memory entirely for you if the string is NULL and the size is 0. It returns −1 if it fails to allocate memory (e.g. there’s more input than free memory) or if the end of the input is reached immediately. You always have to free the memory allocated in this way, even if it fails.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
size_t input_size = 0;
char* input_line = NULL;
if (getline(&input_line, &input_size, stdin) == -1) {
free(input_line);
perror("Failed to read input");
return EXIT_FAILURE;
}
printf("Got input: '%s'\n", input_line);
free(input_line);
}
C does not provide a way for you to create an array of unspecified size. Generally, to do this sort of thing, you must create an array of some size (e.g., using malloc) and start reading user input. If the user input continues too long, you increase the size of the array (using realloc) and continue reading.
Once you reach the end of whatever the user is inputting, then you can reduce the array to match the actual size (again using realloc) if desired.
A consequence of this is that you cannot read the user input in one go. You must write code that reads portions of specific size, either character-by-character or as many characters as fit in the array you have created so far.
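The grow-as-you-read approach can be sketched as follows. read_line is a hypothetical helper, and the starting capacity of 16 and the doubling factor are arbitrary choices, not requirements:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line from fp into a growing buffer.
   Returns a malloc'd, NUL-terminated string without the newline, or NULL
   on allocation failure or if end of input is reached before any
   character is read. The caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (len + 1 == cap) {               /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {             /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';

    char *fit = realloc(buf, len + 1);      /* optional: shrink to fit */
    return fit != NULL ? fit : buf;
}
```

A caller would use it as char *input = read_line(stdin); and free(input) when done. Reading one character at a time keeps the sketch short; reading in chunks with fgets follows the same pattern.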

Malloc, realloc, and returning pointers in C

I am trying to get information from an html page. I use curl to get the html page, then try to parse it and store the information I need in a character array, but I do not know what the size of the array should be. Keep in mind this is for an assignment, so I won't be giving too much code. I am supposed to dynamically allocate memory, but since I do not know what size I need, I have to keep allocating memory with realloc. Everything is fine within the function, but once it has returned, there is nothing stored within the pointer. Here is the code. Also, if there is some library that would do this for me and you know about it, could you link me to it? It would make my life a whole lot easier. Thank you!
char * parse(int * input)
{
char * output = malloc(sizeof(char));
int start = 270;
int index = start;
while(input[index]!='<')
{
output = realloc(output, (index-start+1)*sizeof(char));
output[index-start]=input[index];
index++;
}
return output;
}
The strchr function finds the first occurrence of its second argument in its first argument.
So here you'd have to find a way to run strchr starting at input[start], passing it the character '<' as second argument and store the length that strchr finds. This then gives you the length that you need to allocate for output.
Don't forget the '\0' character at the end.
Use a library function to copy the string from input to output.
Since this is an assignment, you'll probably find out the rest by yourself ...
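Here is a way to read a line dynamically:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
size_t mem=270;
char *str=malloc(mem);
if(str==NULL||fgets(str,(int)mem,stdin)==NULL){//allocation failed or no input at all
free(str);
return EXIT_FAILURE;
}
while(strchr(str,'\n')==NULL){//no newline yet: we ran out of space
char *tmp=realloc(str,mem*2);//double the amount of space
if(tmp==NULL)break;//cannot grow: keep what we have
str=tmp;
mem*=2;
//read the rest (hopefully) of the line into the new space, starting at the old terminator
if(fgets(str+mem/2-1,(int)(mem/2+1),stdin)==NULL)break;//end of input
}
printf("%s",str);
free(str);
return 0;
}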
Your output needs to end with '\0'. A pointer is just a pointer to the beginning of the string, and has no length, so without a '\0' (NUL) as a sentinel, you don't know where the end is.
You generally don't want to call realloc for every individual new character. It would usually make more sense to malloc() output to be the strlen() of input and then realloc() it once at the end.
Alternatively, you should double it in size each time you realloc it instead of just adding one byte. That requires you to keep track of the current allocated length in a separate variable though, so that you know when you need to realloc.
You might read up on the function strcspn, it can be faster than using a while loop.
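To make the strcspn suggestion concrete, here is a rough sketch that measures the length first and allocates once. It assumes the page is held as an ordinary NUL-terminated char string (the question passes an int *), and copy_until_tag is a hypothetical name:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the part of text that starts at offset start
   and ends just before the first '<' (or at the end of text if there is
   none). Assumes text is NUL-terminated and start lies within it.
   Returns a malloc'd string the caller frees, or NULL on failure. */
char *copy_until_tag(const char *text, size_t start)
{
    const char *from = text + start;
    size_t len = strcspn(from, "<");  /* number of chars before '<' */
    char *out = malloc(len + 1);      /* +1 for the terminator */
    if (out == NULL)
        return NULL;
    memcpy(out, from, len);
    out[len] = '\0';                  /* the sentinel the caller relies on */
    return out;
}
```

Because the length is known before allocating, no realloc is needed at all, and the result is always terminated.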

Memory corruption in the last byte

I have a function that returns a pointer to a structure as the following :
//header file
typedef struct {
unsigned char *buffer;
uint8_t len;
} T_ABC_PACKET;
In the main file, I create a pointer to the structure, call the function, and try to print the result:
T_ABC_PACKET *pct = NULL;
pct = function_that_return_the_packet();
printf("value of packet is %s \n", pct->buffer);
The output of this print is always the same. I expect the buffer to hold 8 bytes, and the last byte is always corrupted memory:
value is 10000357`�2U
but if I print the buffer inside the function :
T_ABC_PACKET* function_that_return_the_packet(void) {
T_ABC_PACKET *pct = NULL;
char string_temp[80];
//some more initialization...
pct->buffer = (unsigned char *)string_temp;
pct->len = 5;
printf("value of packet is %s \n", pct->buffer);
return pct;
}
The value printed inside the function is 10000357f; only the last character is corrupted.
The output is consistent: no matter how many times I run the program, only the last character is corrupted in the caller of the function.
I understand that one possible cause is a memory leak, but I checked carefully and cannot find any. How do I get pct->buffer to hold the correct contents?
It looks like you are returning a pointer to a local variable which is undefined behavior, string_temp is local to function_that_return_the_packet and will not exist after you exit that function.
As Daniel suggested using strdup is probably the simplest way to fix the problem:
pct->buffer = strdup(string_temp);
Just make sure you check that it did not fail. You could of course also use malloc and then strcpy.
Once you fix the undefined behavior of returning a pointer to a local (see Shafik Yaghmour's answer), you still have undefined behavior: it appears that the buffer is not null-terminated, so the %s format specifier reads past it and stops only when it finds an unrelated \0.
If you know that the buffer's length cannot exceed eight, you can copy its content up to pct->len into a char buffer, then insert a terminator at the end:
char tmpBuf[9]; // max length is 8, plus one for null terminator
memcpy(tmpBuf, pct->buffer, pct->len);
tmpBuf[pct->len] = '\0';
printf("value of packet is %s \n", tmpBuf);
This is the source of the problem:
pct->buffer = (unsigned char *)string_temp;
'string_temp' is allocated on the stack. When the function returns, that storage is no longer yours: it may be overwritten at any later point, or it may happen to survive, as in your case, where everything except the last byte did.
You should:
Use strdup() instead of assignment in that line.
When you are done with the whole structure, use free() to release that string before releasing the structure itself.
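Put together, the two steps might look like the sketch below. packet_create and packet_destroy are hypothetical names, and the sketch assumes the packet content is a NUL-terminated string of at most 255 bytes so that it fits the uint8_t length field:

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char *buffer;
    uint8_t len;
} T_ABC_PACKET;

/* Hypothetical constructor: the struct and its buffer both live on the
   heap, so they remain valid after this function returns.
   Returns NULL if text is too long or an allocation fails. */
T_ABC_PACKET *packet_create(const char *text)
{
    size_t n = strlen(text);
    if (n > UINT8_MAX)
        return NULL;

    T_ABC_PACKET *pct = malloc(sizeof *pct);
    if (pct == NULL)
        return NULL;

    pct->buffer = (unsigned char *)strdup(text);  /* heap copy, NUL-terminated */
    if (pct->buffer == NULL) {
        free(pct);
        return NULL;
    }
    pct->len = (uint8_t)n;
    return pct;
}

/* Hypothetical destructor: release the string first, then the struct. */
void packet_destroy(T_ABC_PACKET *pct)
{
    if (pct == NULL)
        return;
    free(pct->buffer);
    free(pct);
}
```

Since strdup copies the terminator as well, printing pct->buffer with %s in the caller is well defined.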

Malloc has junk for C string?

I'm new to C, so feel free to correct mistakes.
I have some code that somewhat goes like this:
// some variables declared here like int array_size
char* cmd = (char*)malloc(array_size*sizeof(char));
for(;;){
// code here sets cmd to some string
free(cmd);
array_size = 10;
cmd = (char*)malloc(array_size*sizeof(char));
// print 1
printf(cmd);
printf("%d\n", strlen(cmd));
// repeat above for some time and then break
}
So I do the loop for a while and see what it prints. What I expected was every time the string would be empty and the length would be 0. However, that is not the case. Apparently sometimes malloc gets memory with junk and prints that out and that memory with junk has a length != 0. So I was thinking about solving this by setting all char in a new char string to '\0' when malloc returns; however, I'm pretty sure I just did something wrong. Why is it even after I free the string and do a whole new malloc that my string comes with junk unlike the first malloc? What am I doing wrong?
malloc just allocates the memory and nothing more. It makes no promises about what is in the memory. Specifically, it does not initialize memory. If you want allocated memory to be zeroed out, you can either do it manually with memset or simply call calloc (which is essentially malloc with zeroing out of memory).
malloc does not initialise the memory. You are just lucky the first time around.
Also if it is junk and contains a % symbol you are going to have other problems.
No you did nothing wrong - malloc does not guarantee the memory will be set to 0, only that it belongs to your process.
In general, setting newly allocated memory to zero is unneeded, so C never clears it implicitly; doing so would cost extra clock cycles on every allocation.
There is a rather convenient function, memset, to set it if you need to.
Your code segment has, at a minimum, the following problems.
You don't ever need to multiply by sizeof(char) - it's always one.
You cast the return value of malloc. This can hide errors that would otherwise be detected, such as if you forget to include the header with the malloc prototype (so it assumes int return code).
malloc is not required to do anything with the memory it gives you, nor will it necessarily give you the same block you just freed. You can initialise it to an empty string with a simple *cmd = '\0'; after every malloc if that's what you need.
printf (cmd) is dangerous if you don't know what cmd contains. If it has a format specifier character (%), you will get into trouble. A better way is printf ("%s", cmd).
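The two initialisation options mentioned in the answers can be sketched as small helpers. Both names are hypothetical; the first writes only the terminator, the second zeroes the whole block through calloc:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate size bytes and make them hold an empty
   string. Only the first byte is written; the rest stays uninitialised.
   Returns NULL if size is 0 or the allocation fails. */
char *new_empty_string(size_t size)
{
    if (size == 0)
        return NULL;
    char *s = malloc(size);  /* no cast and no sizeof(char) needed */
    if (s != NULL)
        s[0] = '\0';
    return s;
}

/* Hypothetical helper: allocate size bytes that are all zero. */
char *new_zeroed_buffer(size_t size)
{
    return calloc(size, 1);
}
```

With either helper, strlen(cmd) is 0 right after allocation, and printing with printf("%s", cmd) avoids treating the contents as a format string.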
