How to test if a char array is null? - arrays

I've been doing (in C)
char array[100];
if (array == NULL)
something;
which is very wrong (which I have finally learned since my program doesn't work). What is the equivalent where I could test a new array to see if nothing has been put in it yet?
Also, how do you make an array empty/clean it out?
I know there are other posts on this topic, but I couldn't find a straightforward answer.

An array declared with
char array[100]
always has 100 characters in it.
By "cleaning out" you may mean assigning a particular character to each slot, such as the character '\0'. You can do this with a loop, or one of several library calls to clear memory or move memory blocks.
Look at memset -- it can "clear" or "reset" your array nicely.
If you are working with strings, which are special char arrays terminated with a zero, then to test for an empty string, check whether the first character is '\0' (see this SO question). Otherwise, if you have a regular character array not intended to represent text, write a loop to make sure all entries in the array are your special blank character, whatever you choose it to be.
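A minimal sketch of those three operations, assuming a plain char buffer whose size is known to the caller. The helper names clear_chars, is_empty_string and all_blank are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Set every slot of the buffer to '\0' ("cleaning it out"). */
static void clear_chars(char *buf, size_t size)
{
    assert(buf != NULL);
    memset(buf, 0, size);
}

/* A C string is empty when its first character is the terminator. */
static int is_empty_string(const char *s)
{
    assert(s != NULL);
    return s[0] == '\0';
}

/* For non-text data: check that every slot holds the chosen blank value. */
static int all_blank(const char *buf, size_t size, char blank)
{
    assert(buf != NULL);
    for (size_t i = 0; i < size; i++)
        if (buf[i] != blank)
            return 0;
    return 1;
}
```

With char array[100]; a call such as clear_chars(array, sizeof array) works because sizeof is applied to the array itself, not to a pointer.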
You can also declare your character array like so:
char* array = malloc(100);
or even
char* array = NULL;
but that is a little different. In this case the array being NULL means "no array has been allocated" which is different from "an array has been allocated but I have not put anything in it yet."
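The two states can be told apart with two different tests. This is only a sketch; new_empty_buffer is a hypothetical helper, not a library function:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a buffer and mark it as an empty string. */
static char *new_empty_buffer(size_t size)
{
    assert(size > 0);
    char *buf = malloc(size);
    if (buf != NULL)
        buf[0] = '\0';   /* allocated, but nothing stored in it yet */
    return buf;          /* NULL means the allocation failed */
}
```

Here array == NULL answers "is there a buffer at all?", while array[0] == '\0' answers "is the buffer empty?". The second test is only valid once the first has ruled out NULL.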

Related

C - Splitting C-String into words without reallocating memory

I'm trying to split a string (const char*) into words and saving the individual words in an array of char-pointer (char**).
My problem is not the splitting part but that I'm not allowed to allocate any memory. I need to use the input string as my memory, but since it's a const char* I'm not able to modify it.
My first thought was to change all whitespaces into '\0' and save the position of the beginning of the words in the array, which of course is not possible since the input string is const.
The declaration of the function looks like this:
int breakIntoWords(const char *line, int maxWords, char** words);
The function returns the number of words in line and maxWords is the size of the word-array.
Everything I found either used arrays as input strings or allocated memory with malloc.
There is no solution to the problem as posed. You can obtain a pointer to the start of each word, but in order to use the source string as the storage for separate word strings you must modify it by replacing delimiters with string terminators, as you considered doing.
If the task indeed supposes that you will alter the input line to use it for storage of several separate strings, then it seems that it is inherently incorrect for the function's line parameter to be const-qualified. Such qualification is inconsistent with the job the function is supposed to perform. Moreover, if you are supposed to assign pointers into the string pointed to by line into words, then the fact that words is not const-qualified also presents a conflict.
The only plausible solution I see to the problem described is to remove the const qualifier from your line parameter.
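With the const qualifier removed, the in-place approach from the question could look roughly like this. It is a sketch under two assumptions: whitespace is the only delimiter, and the caller passes a writable buffer (not a string literal):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Splits `line` in place by overwriting whitespace with '\0'.
   Stores at most maxWords word pointers in `words` and returns
   the total number of words found in the line. */
static int breakIntoWords(char *line, int maxWords, char **words)
{
    assert(line != NULL && words != NULL);
    int count = 0;
    char *p = line;

    while (*p != '\0') {
        /* Skip whitespace before the next word. */
        while (isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;

        /* Record the word start if there is still room in `words`. */
        if (count < maxWords)
            words[count] = p;
        count++;

        /* Advance to the end of the word and terminate it in place. */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if (*p != '\0')
            *p++ = '\0';
    }
    return count;
}
```

No memory is allocated: every entry in words points into the caller's own buffer, which is why that buffer must stay alive and unmodified for as long as the words are used.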

difference between strlen(string) and strlen( *string)

Let's say I have an array of strings that are all of same size.
char strings[][MAX_LENGTH];
what would be the difference between strlen(strings) and strlen(*strings)?
I know that strings by itself would be the address of the first string in the array,
but what is *strings?
First, don't do this. C will allow you to do lots of things that are a bad idea. This doesn't mean you ought to do it. :)
While strlen(strings) passes the wrong pointer type and should produce a compiler warning, the two calls are effectively identical. The reason is that with this definition:
char strings[][MAX_LENGTH];
The allocation for this ends up being one contiguous block. Within that block of memory, there are no "structures" or management devices that can be used to identify where individual strings start and stop. This creates an interesting situation.
*strings is the same as strings[0], the first row of the array. Effectively, *strings and strings both point to precisely the same memory location, although they have different types. This means that calling strlen on either one of them will return the null-delimited length of the first string in the array.
However, I must reiterate... Don't do this.
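A type-correct alternative is to index the row you want. The sketch below assumes a value for MAX_LENGTH, since the question does not give one, and row_length is a made-up helper name:

```c
#include <assert.h>
#include <string.h>

#define MAX_LENGTH 16   /* assumed value; the question does not give one */

/* Length of the i-th string in an array of fixed-size rows. */
static size_t row_length(char strings[][MAX_LENGTH], size_t i)
{
    assert(strings != NULL);
    /* strings[i] has type char[MAX_LENGTH] and decays to char *,
       which is what strlen expects. *strings is just strings[0]. */
    return strlen(strings[i]);
}
```

Writing strings[0] instead of *strings also makes it obvious which string is being measured.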

C - is there a way to work with strings which have NULL character in the middle

Is it possible to have strings with NULL character somewhere except the end and work with them? Like get their size, use strcat, etc?
I have some ideas:
1) Write your own function for getting length (or something else), which is going to iterate over a string. If it meets a NULL char, it is going to check the next char of the string. If it is not NULL - continue counting chars. But it may (and WILL!) eventually lead to situation when you are reading memory OUTSIDE of the char array. So it is a bad idea.
2) Use sizeof(array)/sizeof(type), eg sizeof(input)/sizeof(char). That is going to work pretty good I think.
Do you have any other ideas on how this can be done? Maybe there are some function which I am not aware of (C newbie alert :))?
The only really safe method I can think of is to use "Pascal"-type strings (that is, something that has a string header and assorted other data associated with it).
Something like this:
typedef struct {
int len, allocated;
char *data;
} my_string;
You would then have to implement pretty much every string manipulation function yourself. Keeping both the length of the string and the size of the allocation lets the allocation be larger than the current contents, which can make repeated string concatenation cheaper (it allows an amortized O(1) append).
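As an illustration of that append, here is a sketch using the struct above. my_string_append is a hypothetical helper, the starting capacity of 16 and the doubling policy are arbitrary choices, and integer overflow of len is not handled:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int len, allocated;   /* bytes in use, bytes reserved */
    char *data;           /* may contain '\0' anywhere */
} my_string;

/* Append n raw bytes, doubling the allocation when it runs out.
   Returns 0 on success, -1 on allocation failure. */
static int my_string_append(my_string *s, const char *bytes, int n)
{
    assert(s != NULL && bytes != NULL && n >= 0);
    if (n == 0)
        return 0;
    if (s->len + n > s->allocated) {
        int new_size = s->allocated > 0 ? s->allocated : 16;
        while (new_size < s->len + n)
            new_size *= 2;
        char *p = realloc(s->data, (size_t)new_size);
        if (p == NULL)
            return -1;   /* the old buffer is still valid */
        s->data = p;
        s->allocated = new_size;
    }
    /* memcpy copies by count, so embedded '\0' bytes are preserved. */
    memcpy(s->data + s->len, bytes, (size_t)n);
    s->len += n;
    return 0;
}
```

A string starts out as my_string s = {0, 0, NULL}; and is released with free(s.data). The length is always s.len, never strlen(s.data).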
You can have an array of char, either statically or dynamically allocated, that contains a zero byte in the middle, but only the part up to and including the zero can be considered a "string" in the standard C sense. Only that part will be recognized or considered by the standard library's string functions.
You can use a different terminator -- say two zeroes in a row -- and write your own string functions, but that just pushes off the problem. What happens when you need two zeroes in the middle of your string? In any case, you need to exercise even more care in this case than in the ordinary string case to ensure that your custom strings are properly terminated. You also have to be certain to avoid using them with the standard string functions.
If your special strings are stored in char array of known size then you can get the length of the overall array via sizeof, but that doesn't tell you what portion of the array contains meaningful data. It also doesn't help with any of the other string functions you might want to perform, and it does nothing for you if your handle on the pseudo-strings is a char *.
If you are contemplating custom string functions anyway, then you should consider string objects that have an explicit length stored with them. For example:
struct my_string {
unsigned allocated, length;
char *contents;
};
Your custom functions then handle objects of that type, being certain to do the right thing with the length member. There is no explicit terminator, so these strings can contain any char value. Also, you can be certain not to mix these up with standard strings.
As long as you store the length of the array of chars then you can have strings with nul characters or even without a terminating nul.
struct MyString
{
int length;
char* buffer;
};
And then you would have to write all your equivalent functions for managing the string.
The bstring library http://bstring.sourceforge.net and Microsoft's BSTR (uses wide chars) are existing libraries that work in this way and also offer some compatibility with C-style strings.
pros - getting the length of the string is quick
cons - the strings need to be dynamically allocated.

Working with Pointers and Strcpy in C

I'm fairly new to the concept of pointers in C. Let's say I have two variables:
char *arch_file_name;
char *tmp_arch_file_name;
Now, I want to copy the value of arch_file_name to tmp_arch_file_name and add the word "tmp" to the end of it. I'm looking at them as strings, so I have:
strcpy(&tmp_arch_file_name, &arch_file_name);
strcat(tmp_arch_file_name, "tmp");
However, when strcat() is called, both of the variables change and are the same. I want one of them to change and the other to stay intact. I have to use pointers because I use the names later for the fopen(), rename() and delete() functions. How can I achieve this?
What you want is:
strcpy(tmp_arch_file_name, arch_file_name);
strcat(tmp_arch_file_name, "tmp");
You are just copying the pointers (and other random bits until you hit a 0 byte) in the original code, that's why they end up the same.
As shinkou correctly notes, make sure tmp_arch_file_name points to a buffer of sufficient size (it's not clear if you're doing this in your code). Simplest way is to do something like:
char buffer[256];
char* tmp_arch_file_name = buffer;
Before you use pointers, you need to allocate memory. Assuming that arch_file_name is assigned a value already, you should calculate the length of the result string, allocate memory, do strcpy, and then strcat, like this:
char *arch_file_name = "/temp/my.arch";
// Add lengths of the two strings together; add one for the \0 terminator:
char * tmp_arch_file_name = malloc((strlen(arch_file_name)+strlen("tmp")+1)*sizeof(char));
strcpy(tmp_arch_file_name, arch_file_name);
// ^ this and this ^ are pointers already; no ampersands!
strcat(tmp_arch_file_name, "tmp");
// use tmp_arch_file_name, and then...
free(tmp_arch_file_name);
First, you need to make sure those pointers actually point to valid memory. As they are, they're either NULL pointers or arbitrary values, neither of which will work very well:
char *arch_file_name = "somestring";
char tmp_arch_file_name[100]; // or malloc
Then you cpy and cat, but with the pointers, not pointers-to-the-pointers that you currently have:
strcpy (tmp_arch_file_name, arch_file_name); // i.e., no "&" chars
strcat (tmp_arch_file_name, "tmp");
Note that there is no bounds checking going on in this code - the sample doesn't need it since it's clear that all the strings will fit in the allocated buffers.
However, unless you totally control the data, a more robust solution would check sizes before blindly copying or appending. Since it's not directly related to the question, I won't add it in here, but it's something to be aware of.
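One possible way to do that check, shown only as a sketch, is to let snprintf build the name and report whether it fit. make_tmp_name is a hypothetical helper name:

```c
#include <assert.h>
#include <stdio.h>

/* Write "<name>tmp" into dst.
   Returns 0 on success, -1 if dst is too small (dst is then truncated). */
static int make_tmp_name(char *dst, size_t dst_size, const char *name)
{
    assert(dst != NULL && name != NULL && dst_size > 0);
    int needed = snprintf(dst, dst_size, "%stmp", name);
    /* snprintf returns the length it wanted to write, excluding '\0'. */
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;
    return 0;
}
```

For example, make_tmp_name(tmp_arch_file_name, sizeof tmp_arch_file_name, arch_file_name) works when tmp_arch_file_name is declared as an array; with a malloc'd buffer, pass the allocated size instead of sizeof.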
The & operator is the address-of operator, that is it returns the address of a variable. However using it on a pointer returns the address of where the pointer is stored, not what it points to.

C Programming: Find Length of a Char* with Null Bytes

If I have a character pointer that contains NULL bytes is there any built in function I can use to find the length or will I just have to write my own function? Btw I'm using gcc.
EDIT:
Should have mentioned the character pointer was created using malloc().
If you have a pointer then the ONLY way to know the size is to store the size separately or have a unique value which terminates the string (typically '\0'). If you have neither of these, it simply cannot be done.
EDIT: since you have specified that you allocated the buffer using malloc then the answer is the paragraph above. You need to either remember how much you allocated with malloc or simply have a terminating value.
If you happen to have an array (like: char s[] = "hello\0world";) then you could resort to sizeof(s), which is 12 here because it includes the final terminator. But be very careful: the moment you try it with a pointer, you will get the size of the pointer, not the size of an array. (By contrast, strlen(s) would equal 5, since it counts up to the first '\0'.)
In addition, arrays decay to pointers when passed to functions. So if you pass the array to a function, you are back to square one.
NOTE:
void f(int *p) {}
and
void f(int p[]) {}
and
void f(int p[10]) {}
are all the same. In all 3 versions, p is a pointer, not an array.
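Because of that decay, any function that needs the size has to receive it as a separate argument. A small sketch, with count_nul_bytes as a made-up helper:

```c
#include <assert.h>
#include <stddef.h>

/* Inside the function `buf` is only a pointer, so sizeof buf would be
   the pointer size. The caller passes the real size explicitly. */
static size_t count_nul_bytes(const char *buf, size_t size)
{
    assert(buf != NULL);
    size_t count = 0;
    for (size_t i = 0; i < size; i++)
        if (buf[i] == '\0')
            count++;
    return count;
}
```

The caller, where s is still an array, can write count_nul_bytes(s, sizeof s). For a malloc'd buffer, the size is whatever was passed to malloc and must be remembered by the program.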
How do you know where the string ends, if it contains NULL bytes as part of it? Certainly no built in function can work with strings like that. It'll interpret the first null byte as the end of the string.
If you want the length, you'll have to store it yourself. Keep in mind that no standard library string functions will work correctly on strings like these.
You'll need to keep track of the length yourself.
C strings are null terminated, meaning that the first null character signals the end of the string. All builtin string functions rely on this, so if you have a buffer that can contain NULLs as part of the data then you can't use them.
Since you're using malloc then you may need to keep track of two sizes: the size of your allocated buffer, and how many characters within that buffer constitute valid data.

Resources