strcpy issue with char arrays in structs in C

So I'm working on a program to take in assembly code in a text file and produce the corresponding machine code. However, I'm running into an issue when I'm trying to assign values to the members of the AssemblyLine struct. What happens is that when ".fill" is the opcode, arg0 is concatenated to it, and there are also issues with arg0 if I assign a value to arg0 first. It is important to note that this only happens when the opcode is ".fill". For example, if the opcode is "add", the values are what I intended them to be.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct AssemblyLine {
char opcode[5];
char arg0[7];
char arg1[7];
char arg2[7];
_Bool initialized;
};
struct Label {
char name[7];
int address;
_Bool initialized;
};
int main(void)
{
struct AssemblyLine line;
strcpy(line.opcode, ".fill");
strcpy(line.arg0, "5");
printf("%s\n", line.opcode);
return 0;
}
The output for this program is:
.fill5
My intention is that the output would just be:
.fill
I'm really confused about what would be causing this. Sorry if the answer is really obvious, this is my first time working in C, though I have programmed in C++ before. I was at first thinking that there was no null terminating character, but the string is read fine until after I use the second strcpy. Is fill used as a key word for strcpy or something? I thought maybe it had to do with the '.' but that didn't affect anything when the opcode was ".lw".
Sorry that this post is so long! Thanks for any help!

Your array isn't big enough. ".fill" is six characters including the terminating null, but you only allocate memory for five with char opcode[5]. You need to make your array bigger.

The string ".fill" is 5 characters plus 1 terminating zero character, which makes 6 characters.
But the array 'opcode' is only 5 characters long, so the trailing zero is written past its end, which in this struct layout lands in the first byte of 'arg0'.
After that, you copy "5" (2 characters including the zero) to 'arg0', which overwrites that zero.
Because 'printf' prints until a zero character occurs, it reads past the bounds of 'opcode' and continues into 'arg0'.

Related

How to add text to a variable in C?

I'm learning C at the moment and I've run into a problem: it's hard to understand what is going on with variables and adding text to a variable in C.
I know C does not treat strings and characters the way other programming languages do.
If I understand it correctly from my book:
I have to define a variable before I can use it, that's ok for me.
So if I write this code, where I want to print the variable text_1, it fails:
#include <stdio.h>
int main()
{
char text_1[];
text_1[] = "Testing";
printf("Test 1 is: %s", text_1);
return(0);
}
But if I do it this way, it works:
#include <stdio.h>
int main()
{
char text_1[] = "Testing";
printf("Test 1 is: %s", text_1);
return(0);
}
In some other programming languages I can do it this way:
Dim a as string
a="Testing"
print("Testing 1 is:", a) --> or similar option to print out the variable 'a'.
What is the correct way to do this in C?
Thank you.
Oops... C is a rather low-level language compared to Java, Python or Ruby, or even Basic or JavaScript:
it has no notion of a text string; its standard library only works with null-terminated character arrays
arrays are not first-class citizens: except at initialization time, the language can only process array elements, not the whole array
Long story short, an array has a size that is defined only once (at definition time) and will never change during the lifetime of the array. char text_1[] = "Testing"; is an idiomatic initialization: the size of the array is set to the number of characters of the string literal + 1 for the terminating null, so here 8. After that, text_1 can hold other strings of at most 7 characters + 1 terminating null, if you copy the characters in, for example with strcpy.
To go back to your code, char text_1[]; declares an array of incomplete type: no size is given and there is no initializer to deduce one from. For a local variable that is an error, so you cannot use it. The 2 correct ways would be:
char text_1[] = "Testing"; // idiomatic initialization of an 8-character array
or
char text_1[8]; // definition of an uninitialized char array of size 8
strcpy(text_1, "Testing"); // copy a string into the array
Not really sexy, but C language is like that...
Ok, in conclusion, what is the correct way to set up a variable holding a string so that I can print it to the screen or compare it with other variables that also hold strings?
I like this way, but I'm not sure whether it is correct to use:
char * text_1; // btw, what does this `*` mean here?
text_1 = "Some text...";
printf("Variable text is: %s\n",text_1);
but this way is also acceptable to me:
char text_1[n]; // where n is a number of max char
strcpy(text_1, "Some text...");
printf("Variable text is: %s\n",text_1);

Why does this program NOT segfault? [duplicate]

Usually, this question is probably phrased in a positive way, becoming the next member in the club of duplicate questions - this one hopefully isn't. I have written a simple program to reverse a string in C. Here it is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char arr[4] = "TEST";
char rev[2];
int j = 0;
for(int i = 0; i < 4; i++) {
j = 4 - i - 1;
rev[j] = arr[i];
}
printf("%s\n",rev);
}
When I define char arr and char rev to be of size 4, everything works fine. When I leave arr size out I get unexpected repeat output like "TSTTST". When I define rev to be an array of 2 chars, I do not get a segfault, yet in the loop I am trying to access its third and fourth element. As far as my relatively limited understanding tells me, accessing the third element in an array of length two should segfault, right? So why doesn't it?
EDIT:
Interestingly enough, when I leave the loop out like so
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char arr[4] = "TEST";
char rev[2] = "00";
printf("%s\n",rev);
}
it prints "00TEST". What happened here? Some kind of overflow? I even restarted the terminal, recompiled and ran again.
EDIT 2:
I have been made aware that this is indeed a duplicate. However, most of the suggested duplicates referred to C++, which this isn't. I think this is a good question for new C programmers to learn about and understand undefined behavior. I, for one, didn't know that accessing an array out of bounds does not always cause a SEGFAULT. Also, I learned that I have to terminate string literals myself, which I falsely believed was done automatically. This is partly wrong: it is added automatically - the C99 Standard (TC3) says in 6.4.5 String literals that terminating nulls are added in translation phase 7. As per this answer and the answers for this question, char arrays are also null-terminated, but this is only safe if the array has the correct length (string length + 1 for null-terminator).
char rev[2] reserves 2*sizeof(char) bytes of memory for the array rev. You are accessing memory beyond that array. It may or may not cause errors.
It might appear to work fine, but it isn't very safe at all. By writing data outside the allocated block of memory you are overwriting some data you shouldn't. This is one of the greatest causes of segfaults and other memory errors, and what you're observing with it appearing to work in this short program is what makes it so difficult to hunt down the root cause.
When you do rev[2] or rev[3] you are accessing the addresses rev + 2 and rev + 3, which are not part of rev. Since it's a small program and nothing critical happens to be stored there, it's not causing any visible errors.
With respect to the edit:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char arr[4] = "TEST";
char rev[2] = "00";
printf("%s\n",rev);
}
%s prints until a null character is encountered. The sizes you have given arr and rev do not leave room for the null, so try changing the declarations as follows:
char arr[5] = "TEST";
char rev[3] = "00";
The program will then work as intended, because arr will contain TEST\0 and rev will contain 00\0, where \0 is the null character in C.
Give this article a read, it'll solve most of your queries.

Explanation on how does the memcpy function behaves? [duplicate]

#include <stdio.h>
#include <string.h>
char lists[10][25];
char name[10];
int main(void)
{
scanf("%s" , lists[0]);
memcpy(name , lists[0], 25);
printf("%s\n" , name);
}
In the above code I am predefining the size of character array "name" as 10.
Now when I gave the input as :
Input - abcdefghijklmnopqrstuvwxy
The output I got was the same string : abcdefghijklmnopqrstuvwxy
Shouldn't I get the output as: abcdefghij?
How is this possible even though the size of the array is limited to 10?
Because memcpy doesn't know the size of the destination it is writing into, and you happened to get away with where the extra data was written. You might not on another platform, with a different compiler, or with different optimisation settings.
When passing the size parameter to memcpy(), it's a good idea to take the size of the destination memory into account.
When using char arrays, if you want to be safer about not overrunning memory, you can use strncpy() with the destination size. Be aware that it does not write the terminating null character when the source is at least as long as the count you pass, so you still have to terminate the destination yourself.
To start with, arrays behave like pointers in most expressions: the array name converts to a pointer to its first element. In C there are no length checks as there are in Java, for example.
When you write char a[2]; space for 2 chars is reserved in memory.
For example, let the memory be
|1|2|3|4|5|6|7|8|9|10|11|12|
 a
a refers to address 1. a[0] = 0; is equivalent to *(a+0) = 0, meaning: write 0 to the address a plus offset 0.
So if you try to write to an address that you have not allocated, unexpected things can happen.
For example, let's say we have char a[2];char b[2]; and the memory map is
|1|2|3|4|5|6|7|8|9|10|11|12|
 a   b
Then a[2] = 0 has the same effect as b[0] = 0 (formally this is undefined behavior, but it is what typically happens with that layout). If the address is not mapped into your process at all, a segmentation fault is raised instead.
Try the program (it may work with no optimizations of the compiler):
#include <stdio.h>
#include <string.h>
char a[4];
char b[4];
int main(void)
{
scanf("%s" , a); // input "12345678"
printf("%s\n" , b); // print "5678"
}
memcpy just copies the number of bytes you specify from one address to the other.
In your example, you were lucky because all the addresses you accessed were assigned to your program (inside your program's memory pages).
In C/C++ you are responsible for handling memory correctly. Also, keep in mind that strings end at the char \0, so an array char str[10]; holds at most 9 chars plus the \0.

strcat() in c language programming

When I run this code I always get a problem in my IDE. Can you give me a solution?
#include<stdio.h>
#include<string.h>
int main(void)
{
char cname[4]="mahe";
strcat(cname, "Karim");
printf("%s",cname);
getch();
return 0;
}
Your array isn't big enough. The original array isn't big enough to hold the null byte at the end of its initial value, so strcat() can't find the end of the string. And then you're adding to it, which writes outside the array. These are both causing undefined behavior.
It needs to be declared large enough to hold the original string, the string you're adding to it, and the trailing null byte. So it has to be at least 10 bytes (4+5+1).
char cname[10] = "mahe";
strcat(cname, "Karim");
printf("%s\n", cname);
Change char cname[4] to char cname[10]. Because you set the size to 4, there is no room to append anything after the initial 4 chars.
So, change the size. That's it.

gets and puts to get and print a string

I'm trying to get and print a string with gets and puts, but I get a segmentation fault when I use them together.
This is the code I'm trying to get working. [I type the string "prova" to test it]
int main()
{
char *s;
gets(s);
puts(s);
return 0;
}
If I change "gets" to "scanf" I get the same error.
If I change "puts" to "printf("%s", s)" I get the output.
If I declare char *s = "prova" and then call puts(s) I get the output.
I also tried changing char *s; to char s[] but I get the same error.
Where am I going wrong with this? Thank you very much.
I know gets is bad; it's just because I'm doing an exercise from "C how to program, fifth edition" by Deitel and Deitel.
You have multiple problems with that piece of code. To start with, gets has been deprecated since the C99 standard, and in the C11 standard it has been removed. The reason is that it is not safe: it has no bounds checking, so it can write beyond the bounds of the memory you pass to it, leading to buffer overflows.
Secondly, you use the uninitialized local variable s. The value of an uninitialized variable is indeterminate, and will be seemingly random. Using an uninitialized local variable leads to undefined behavior, which often leads to crashes.
Another problem arises if you initialize s to point to a string literal. String literals are effectively read-only arrays of characters, and attempting to write to one will again lead to undefined behavior.
You need to allocate some room for the string:
char s[256];
gets(s);
puts(s);
But gets is bad. (It doesn't know how big your buffer is, so what happens if more than 255 characters are read?)
The most important mistake is that you are declaring a char pointer but not reserving the space in memory where the characters will be stored, so you have a pointer that points to some random memory address that you shouldn't use. The "right" thing to do would be:
#include <stdio.h>
#include <stdlib.h>
#define LENGTH 20
int main()
{
char *s;
s=malloc(sizeof(char)*LENGTH); //here you make the pointer point to a memory address that you can use
gets(s);
puts(s);
free (s);
return 0;
}
It is also strongly recommended to avoid gets, because that function doesn't check the length of the input. Use fgets instead, which lets you pass the buffer size; you only need to set the data stream to stdin.
The code will be:
#include <stdio.h>
#include <stdlib.h>
#define LENGTH 20
int main()
{
char *s;
s=malloc(sizeof(char)*LENGTH);
if (s == NULL) return 1; // allocation failed
if (fgets(s,LENGTH,stdin) != NULL) // reads at most LENGTH-1 characters
puts(s);
free(s);
return 0;
}
