Below is a section of a Tokenizer I built. The user types a string they wish to tokenize, that string is stored into a char array, and a null character ('\0') is placed as soon as the string ends. That section of the code seems to work fine after having tested it a few times.
The problem I'm getting occurs later on in the code when I make an array (tokenArray) of arrays (newToken). I use functions to get number of tokens and token length.
I entered the string "testing pencil calculator." I then store each token into an array. The problem is when I go to print the contents of the array, the loop that I have printing stops before it should.
Here's a sample input/output. My comments (not in the code) are marked with //
$testing pencil calculator //string entered
complete index: 0 //index of the entire array, not the tokenized array
token length: 7 //length of 1st token "testing"
pointer: 0xbf953860
tokenIndex: 0 //index of the token array (array of arrays)
while loop iterations: 4 //number of times the while loop where i print is iterated. should be 7
test //the results of printing the first token
complete index: 8
token length: 6 //next token is "pencil"
tokenIndex: 1
while loop iterations: 5 //should be 6
penci //stops printing at penci
complete index: 15
token length: 10 //final token is "calculator"
pointer: 0xbf953862
tokenIndex: 2
while loop iterations: 5 //should be 10
calcu //stops printing at calcu
For the life of me, I simply cannot figure out why the while loop is exiting before it is supposed to. I doubt this is the only problem with my methodology, but until I can figure this out, I can't address other bugs.
Below is a section of my code that is responsible for this:
from main:
completeString[inputsize] = '\0';
char tokenArray[numTokens+1];
tokenArray[numTokens] = '\0';
putTokensInArray(tokenArray, completeString);
method where I'm getting errors:
char ** putTokensInArray(char tokenArray[], char * completeString){
int completeIndex = 0;
int tokenIndex = 0;
while(tokenArray[tokenIndex] != '\0'){
int tokenLength = tokenSize(completeString, completeIndex);
char newToken [tokenLength+1];
newToken[tokenLength] = '\0';
tokenArray[tokenIndex] = *newToken;
printf("\ncomplete index: %d", completeIndex);
printf("\ntoken length: %d", tokenLength);
printf("\ntokenIndex: %d\n", tokenIndex);
int i = 0;
while(newToken[i] != '\0'){
newToken[i] = completeString[i + completeIndex];
i++;
}
completeIndex += (tokenLength+1);
printf("while loop iterations: %d\n", i);
for(int j = 0; newToken[j] != '\0'; j++){
printf("%c", newToken[j]);
}
tokenIndex++;
tokenLength = 0;
}//big while loop
}//putTokensInArray Method
I have tried several things but just cannot get a grasp of it. I'm new to C, so it's entirely possible I'm making pointer mistakes or accessing memory I shouldn't be. On that note, how would I use malloc() and free() here? I've been reading about them and they seem to be what I need, but I haven't been able to implement them.
You are passing an uninitialized character array to your function putTokensInArray. Later in that function, the while loop condition checks every element for \0, starting from index 0. However, since the array is uninitialized, those characters could be ANY characters. There could be a \0 anywhere before the element at index numTokens, which would end the loop early.
To fix this problem, pass the length of the character array, i.e. numTokens, to putTokensInArray as an additional argument. Then use the following condition in your while loop instead:
while(tokenIndex < numTokens){
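On the malloc() and free() part of the question, here is a minimal sketch of one way to build an array of token strings on the heap. The names split_tokens and free_tokens are hypothetical and not from the original code, and the sketch assumes tokens are separated by single or repeated spaces. Note that the inner copy loop in the question has the same kind of problem as the outer one: it tests newToken[i] != '\0' on memory that has not been filled yet, so counting with a known length (as done here) avoids it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `input` on spaces into a heap-allocated,
   NULL-terminated array of heap-allocated strings. The number of tokens
   is stored in *count. Returns NULL if an allocation fails. */
char **split_tokens(const char *input, size_t *count)
{
    size_t n = 0;
    int in_token = 0;

    /* First pass: count the tokens so we know how many pointers we need. */
    for (const char *p = input; *p != '\0'; ++p) {
        if (*p != ' ' && !in_token) {
            in_token = 1;
            ++n;
        } else if (*p == ' ') {
            in_token = 0;
        }
    }

    char **tokens = malloc((n + 1) * sizeof *tokens);
    if (tokens == NULL)
        return NULL;

    /* Second pass: copy each token into its own allocation. */
    const char *p = input;
    for (size_t idx = 0; idx < n; ++idx) {
        while (*p == ' ')
            ++p;                          /* skip separators */
        size_t len = strcspn(p, " ");     /* length of this token */
        tokens[idx] = malloc(len + 1);    /* +1 for the '\0' */
        if (tokens[idx] == NULL) {
            while (idx > 0)
                free(tokens[--idx]);      /* undo earlier allocations */
            free(tokens);
            return NULL;
        }
        memcpy(tokens[idx], p, len);
        tokens[idx][len] = '\0';
        p += len;
    }
    tokens[n] = NULL;
    *count = n;
    return tokens;
}

/* Every malloc needs a matching free: first each token, then the array. */
void free_tokens(char **tokens, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        free(tokens[i]);
    free(tokens);
}
```

A caller would pass the input string and the address of a size_t, loop from 0 to count-1 to print the tokens, and then call free_tokens once it is done with them.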
I've been trying to understand why my string/char array loses the value assigned to it in the for loop as soon as the loop ends. The value for token2 is a user input that gets shunted into the "token2" variable earlier on in the code. I have several checks prior to this portion and token2 is populated as expected.
int dayInt2;
char dayToken2[3];
//For loop to parse out the month portion of second token.
for (int i = 3; i < 5; i++)
{
dayToken2[i] = token2[i];
printf("For loop parse: %c\n", token2[i]);
}
dayToken2[3] = '\0'; //Add null pointer to the last character space of the string array.
printf("dayToken2 value: %s\n", dayToken2); //Debugging to check the value in dayToken2.
dayInt2 = atoi(dayToken2); //converts day
printf("dayInt2 value: %s\n", dayInt2); //Debugging to check if the string conversion to int worked.
dayToken2[3] is outside the array. Indexes start at zero, so the maximum index is 2.
printf("dayInt2 value: %s\n", dayInt2); is wrong. You are printing an integer but telling printf to expect a string. It has to be printf("dayInt2 value: %d\n", dayInt2);
dayToken2[i] = token2[i]; is also wrong, as i goes from 3 to 4. It has to be dayToken2[i - 3] = token2[i];. You can also use memcpy for that: memcpy(dayToken2, token2 + 3, 2);
When you define your char array with size 3, there are only dayToken2[0], dayToken2[1], and dayToken2[2]. Your for loop runs i from 3 to 4; write the digits to indexes 0 and 1 of dayToken2 instead, and put the terminator at index 2.
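Putting the fixes above together, a corrected version might look like the sketch below. The function name parse_day is hypothetical, and the sketch assumes the two day digits sit at indexes 3 and 4 of token2 (for example in a string shaped like "12/25/2020"), which is what the loop bounds in the question suggest.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extracts the two-digit day from token2.
   Assumes the day occupies indexes 3 and 4 of the string.
   Returns -1 if the string is too short. */
int parse_day(const char *token2)
{
    char dayToken2[3];                 /* two digits plus the '\0' */

    if (strlen(token2) < 5)
        return -1;

    memcpy(dayToken2, token2 + 3, 2);  /* copy into indexes 0 and 1 */
    dayToken2[2] = '\0';               /* last valid index is 2 */

    return atoi(dayToken2);
}
```

The result is an int, so it would be printed with %d, for example printf("dayInt2 value: %d\n", parse_day(token2));.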
Hey, I am trying to understand pointers. I wrote a program in which I read words from the keyboard and store them in an array. After that I want to print the first character of the first word (that is what I expected), but it prints the first character of the second word.
What is going wrong?
#include<stdio.h>
#include<stdlib.h>
int main ()
{
int i=0;
char array[5][10];
for(i = 0 ; i < 5 ; i++)
{
gets(array[i]);
}
printf("\n");
char *p;
p=&array[0][10];
printf("%c",*p);
return 0;
}
The position in the array you are looking for doesn't exist, so the program is showing a random value.
Array indexes go from 0 to (n-1), 'n' being 5 or 10 in your case. If you pick a different position inside the range of the array you will get the correct answer.
Try changing this part of the code ('a' has to be a value from 0 to 4 and 'b' has to be a value from 0 to 9):
p=&array[a][b];
Pointers are addresses in memory.
The first word occupies offsets 0 to 9,
the second word offsets 10 to 19.
p=&array[0][10]; points to offset 10 (the 11th element), which is the first letter of the second word, and not a random value as the previous post suggests.
That said, NEVER use gets:
Why is the gets function so dangerous that it should not be used?
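As a hedged sketch of the usual replacement, fgets takes the buffer size and therefore cannot write past the end of the row. The helper name read_word is hypothetical; it also strips the newline that fgets keeps.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into buf, storing at
   most size-1 characters, and removes the trailing newline if present.
   Returns 1 on success and 0 on end of file or error. */
int read_word(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the newline, if any */
    return 1;
}
```

In the program above, the call gets(array[i]) would become read_word(stdin, array[i], sizeof array[i]), and the first character of the first word is then *p with p = &array[0][0].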
I'm pretty new to C, and I'm trying to write a function that takes a user input RAM size in B, kB, mB, or gB, and determines the address length. My test program is as follows:
int bitLength(char input[6]) {
char nums[4];
char letters[2];
for(int i = 0; i < (strlen(input)-1); i++){
if(isdigit(input[i])){
memmove(&nums[i], &input[i], 1);
} else {
//memmove(&letters[i], &input[i], 1);
}
}
int numsInt = atoi(nums);
int numExponent = log10(numsInt)/log10(2);
printf("%s\n", nums);
printf("%s\n", letters);
printf("%d", numExponent);
return numExponent;
}
This works correctly as it is, but only because I have that one line commented out. When I try to alter the 'letters' character array with that line, it changes the 'nums' character array to '5m2'
My string input is '512mB'
I need the letters to be able to tell if the user input is in B, kB, mB, or gB.
I am confused as to why the commented out line alters the 'nums' array.
Thank you.
In your input 512mB, "mB" is not made of digits, so it is supposed to be handled by the commented-out code. When those characters are handled, i is 3 and 4. But because the length of letters is only 2, executing memmove(&letters[i], &input[i], 1); accesses letters[i] outside the bounds of the array, which is undefined behaviour. In this case it happens to write into the memory of the nums array.
To fix it, you have to keep a separate index for letters. Or better, for both nums and letters, since i is the index into input.
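A minimal sketch of that idea, with one index per destination array. The function name split_size and the capacity parameters are hypothetical additions, and both capacities are assumed to be at least 1 so there is room for the terminator.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies the digits of input into nums and all
   other characters into letters. Each destination has its own index,
   so neither is written out of bounds. Extra characters are dropped. */
void split_size(const char *input,
                char *nums, size_t numsCap,
                char *letters, size_t lettersCap)
{
    size_t n = 0;   /* next free slot in nums */
    size_t l = 0;   /* next free slot in letters */

    for (size_t i = 0; input[i] != '\0'; ++i) {
        if (isdigit((unsigned char)input[i])) {
            if (n + 1 < numsCap)        /* keep room for '\0' */
                nums[n++] = input[i];
        } else {
            if (l + 1 < lettersCap)     /* keep room for '\0' */
                letters[l++] = input[i];
        }
    }
    nums[n] = '\0';
    letters[l] = '\0';
}
```

With char nums[5] and char letters[3], calling split_size("512mB", nums, sizeof nums, letters, sizeof letters) leaves "512" in nums and "mB" in letters.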
There are several problems in your code. #MarkSolus has already pointed out that you access letters out of bounds because you are using i as the index and i can be more than 1 when you do the memmove.
In this answer I'll address some of the other problems.
string size and termination
Strings in C need a zero termination. Therefore an array must be 1 larger than the string you expect to store in it. So
char nums[4]; // Can only hold a 3 char string
char letters[2]; // Can only hold a 1 char string
Most likely you want to increase both arrays by 1.
Further, your code never adds the zero-termination. So your strings are invalid.
You need code like:
nums[some_index] = '\0'; // Add zero-termination
Alternatively you can start by initializing the whole array to zero. Like:
char nums[5] = {0};
char letters[3] = {0};
Missing bounds checks
Your loop is a for-loop using strlen as stop-condition. Now what would happen if I gave the input "123456789BBBBBBBB" ? Well, the loop would go on and i would increment to values ..., 5, 6, 7, ... Then you would index the arrays with a value bigger than the array size, i.e. out-of-bounds access (which is real bad).
You need to make sure you never access the array out-of-bounds.
No format check
Now what if I gave an input without any digits, e.g. "HelloWorld"? In this case nothing would be written to nums, so it will be uninitialized when used in atoi(nums). Again - real bad.
Further, there should be a check to make sure that the non-digit input is one of B, kB, mB, or gB.
Performance
This is not that important but... using memmove for copy of a single character is slow. Just assign directly.
memmove(&nums[i], &input[i], 1); ---> nums[i] = input[i];
How to fix
There are many, many different ways to fix the code. Below is a simple solution. It's not the best way but it's done like this to keep the code simple:
#define DIGIT_LEN 4
#define FORMAT_LEN 2
int bitLength(char *input)
{
char nums[DIGIT_LEN + 1] = {0}; // Max allowed number is 9999
char letters[FORMAT_LEN + 1] = {0}; // Allow at max two non-digit chars
if (input == NULL) exit(1); // error - illegal input
if (!isdigit(input[0])) exit(1); // error - input must start with a digit
// parse digits (at max 4 digits)
int i = 0;
while(i < DIGIT_LEN && isdigit(input[i]))
{
nums[i] = input[i];
++i;
}
// parse memory format, i.e. rest of string must be one of B, kB, mB, gB
if ((strcmp(&input[i], "B") != 0) &&
(strcmp(&input[i], "kB") != 0) &&
(strcmp(&input[i], "mB") != 0) &&
(strcmp(&input[i], "gB") != 0))
{
// error - illegal input
exit(1);
}
strcpy(letters, &input[i]);
// Now nums and letter are ready for further processing
...
...
}
I am creating a program for a college assignment where the user is required to input artists and their songs. The program then sorts them alphabetically and shuffles them. Artist names are stored in an array called artists[][80] and song names are stored in songsArtistx, where x is a number from 1 to 4. I initialised all arrays to be filled with the null terminator, '\0'. For the program to work, I need to find the number of songs entered (there has to be at least 1, and there can be at most 3). To achieve this, I am using a function called checkSongs:
int checkSongs(char songsOfAnArtist[][80])
{
int i,numOfSongs;
//Loop goes through 4 possible artists.
for (i=0;i<4;i++)
{
//Assume there are 3 songs for each artits, and decrement by 1 each time an empty string occurs.
numOfSongs = 3;
if (songsOfAnArtist[i][0]=='\0' || songsOfAnArtist [i][0] == '\n')
{
numOfSongs--;
break;
}
}
return numOfSongs;
}
However, this function gives me a faulty result for when the number of songs is less than 3. Here is an example from the command line, and also a screenshot of the variables from the debugger:
In the photo above, the numbers on the last line indicate the number of artists entered (which is correct in this case) and the number of songs in songsArtist1, songsArtist2, songsArtist3, songsArtist4 respectively. The last number is the number of artists again.
How do I alter my code so that checkSongs returns the number of songs entered for each artists?
Below is also an excerpt from the main file which could be relevant to the question:
//Get all inputs from command line: artists and songs
getInputs(artists,songsArtist1,songsArtist2,songsArtist3,songsArtist4);
//Use checkArtists to store the number of entered artists in variable 'numOfArtists'
numOfArtists = checkArtists(artists);
printf("%d ",numOfArtists);
//Use check songs to store number of songs per artist in array 'numSongsPerArtists'
numSongsPerArtist[0] = checkSongs(songsArtist1);
numSongsPerArtist[1] = checkSongs(songsArtist2);
numSongsPerArtist[2] = checkSongs(songsArtist3);
numSongsPerArtist[3] = checkSongs(songsArtist4);
//DEBUG
printf("%d ",numSongsPerArtist[0]);
printf("%d ",numSongsPerArtist[1]);
printf("%d ",numSongsPerArtist[2]);
printf("%d ",numSongsPerArtist[3]);
printf("%d ",numOfArtists);
Here are the arrays:
//The array containing artists names
char artists[4][80];
//The array containing the sorted artists
char sortedArtists[4][80];
//Songs for Artist 1
char songsArtist1[3][80];
//Songs for Artist 2
char songsArtist2[3][80];
//Songs for Artist 3
char songsArtist3[3][80];
//Songs for Artist 4
char songsArtist4[3][80];
//The total number of artists (Note it can be less than 4)
int numOfArtists = 0;
//The total number of songs for each artist (Note that less than 3 songs can be provided for each artist)
int numSongsPerArtist[4] = {0,0,0,0};
When you write a function that takes an array as an argument, it should always ask
for the length from the caller, unless the end of the array is marked somehow
(like '\0' for strings). If you later change your program to accept more or
fewer songs and you forget to update the loop conditions, you are in a
world of trouble. The caller knows the size of the array, either because it
created the array or because the array was passed to it along with its size. This is
standard behaviour for the functions in the standard C library.
So I'd rewrite your function as:
int checkSongs(char songsOfAnArtist[][80], size_t len)
{
int numOfSongs = 0;
for(size_t i = 0; i < len; ++i)
{
if(songsOfAnArtist[i][0] != 0 && songsOfAnArtist[i][0] != '\n')
numOfSongs++;
}
return numOfSongs;
}
And then calling the function
numSongsPerArtist[0] = checkSongs(songsArtist1, sizeof songsArtist1 / sizeof *songsArtist1);
numSongsPerArtist[1] = checkSongs(songsArtist2, sizeof songsArtist2 / sizeof *songsArtist2);
numSongsPerArtist[2] = checkSongs(songsArtist3, sizeof songsArtist3 / sizeof *songsArtist3);
numSongsPerArtist[3] = checkSongs(songsArtist4, sizeof songsArtist4 / sizeof *songsArtist4);
This is better because if you later change from char songsArtist3[3][80]; to
char songsArtist3[5][80];, you don't have to rewrite the boundaries in the
loop conditions. Every artist can have a different number of slots for songs.
And when you store the song names, make sure that the source string is no
longer than 79 characters; use for example strncpy and make sure to write the terminating '\0' byte.
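For instance, a bounded copy into one 80-byte slot could be sketched as follows. The helper name store_song and the SONG_LEN macro are hypothetical; the point is that strncpy does not write a terminator when the source fills the whole limit, so the last byte is set explicitly.

```c
#include <string.h>

#define SONG_LEN 80   /* assumed to match the second dimension of the arrays */

/* Hypothetical helper: copies at most SONG_LEN-1 characters of src
   into dest and always terminates the result. Longer names are cut. */
void store_song(char dest[SONG_LEN], const char *src)
{
    strncpy(dest, src, SONG_LEN - 1);
    dest[SONG_LEN - 1] = '\0';   /* strncpy does not guarantee this */
}
```

A call such as store_song(songsArtist1[0], name) then never writes outside the 80-byte row, however long name is.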
If you keep getting wrong results, it may be that the songsArtist
variables are not initialized correctly. Please check your getInputs function
so that all songsArtist[i][0] are set to 0 on initialization. If you do
that, you should get the correct results.
For example:
int main(void)
{
...
char songsArtist1[3][80];
char songsArtist2[3][80];
char songsArtist3[3][80];
char songsArtist4[3][80];
memset(songsArtist1, 0, sizeof songsArtist1);
memset(songsArtist2, 0, sizeof songsArtist2);
memset(songsArtist3, 0, sizeof songsArtist3);
memset(songsArtist4, 0, sizeof songsArtist4);
...
}
Add a condition in the block where you break out of the loop:
if(numOfSongs==0)
break;
I would also recommend using unsigned types for these variables, as the numbers can never be less than 0.
And put numOfSongs = 3; outside the for loop, as others suggested.
I have 2 arrays:
data[256] = "1#2#3#4#5"
question[256][256];
I need to split the numbers between the # characters into the second array.
For example:
question[0][] = 1
question[1][] = 2
question[2][] = 3
question[3][] = 4
question[4][] = 5
It doesn't matter whether the # is kept or not.
This is what I wrote:
int i = 0, j = 0;
data = "1#2#3#4#5";
for (i = 0 ; i < strlen(data) ; i++)
{
for (j ; data[j] != '#' ; j++)
{
question[i][j] = data[j];
}
j++
}
printf ("%s\n", question);
The problem is that it works until the first #, and then stops.
It only puts the part before the first # into question, and then stops.
(Basically, I'm supposed to get the same output when printing both data and question.)
There are a few problems.
First, printf only prints the string until the first terminating zero character ('\0'), which comes after the first "part" in question (even though there are other parts). Instead, you will need to print all of the rows:
for (i=0; i<255; ++i) {
printf("%s\n", question[i]);
}
Make sure you null-terminate ('\0') the rows of question beforehand, so you don't print garbage for uninitialized rows. Or just keep track of the index of the last good row and iterate up to that.
Also, the loop
for (j; data[j] != '#'; j++)
will stop at the first '#', and all subsequent iterations of the outer loop will start from the same j (which is the index of the '#'), so the inner loop is skipped in further iterations. You will need to advance j after the inner loop.
You will also need to keep a lastj position, one past the last '#', so you can compute the offset of j from it and index into question[i] properly. Set lastj to the value of j after the extra advancement suggested in the previous paragraph. The second index of question should then be j-lastj.
Yet another thing about the inner loop: as it is, it will advance past the end of the string in data after the last '#', so you will have to check for null termination as well.
Also, make sure you null-terminate the strings in question, otherwise printf will produce garbage (and possibly segfault when reaching memory not allocated to your program). Just write
question[i][j-lastj] = '\0';
after the inner loop (j will point just past the last written index at the end of the inner loop).
Yet one more thing: do not iterate i up to the length of data, as you will not need to touch that many elements (and will likely index past the end of data in the inner loop). Use a while loop instead, incrementing i only until j has covered all of data in the inner loop.
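The points above can be combined into a sketch like the following. The function name split_hash and the MAX_LEN macro are hypothetical, and the sketch assumes data is shorter than MAX_LEN characters (as with data[256] in the question), so every part fits in one row of question.

```c
#include <stddef.h>

#define MAX_LEN 256   /* assumed row length of question, as in the question */

/* Hypothetical helper: splits data on '#' into the rows of question.
   Returns the number of rows written (at most maxParts). */
size_t split_hash(const char *data, char question[][MAX_LEN], size_t maxParts)
{
    size_t i = 0;       /* current row in question */
    size_t j = 0;       /* current position in data */
    size_t lastj = 0;   /* position just after the previous '#' */

    while (i < maxParts && data[j] != '\0') {
        /* Copy until the next '#' or the end of the string. */
        while (data[j] != '#' && data[j] != '\0') {
            question[i][j - lastj] = data[j];
            j++;
        }
        question[i][j - lastj] = '\0';   /* terminate this row */
        i++;

        if (data[j] == '#')
            j++;                         /* step over the '#' */
        lastj = j;
    }
    return i;
}
```

The return value is the index of the last good row plus one, so the printing loop can run from 0 up to that count instead of over all 256 rows.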
Note: look up strtok to make the tokenization easier on your part
I would use something like strchr to get the location of the next '#'.
The algorithm is something like this: You get the position of the next '#', and if there is none found then set next to the end of the string. Then copy from the current beginning of the string to next position into where you want it. Remember to terminate the copied string! Set the beginning of the string to one beyond next. Repeat until beginning is beyond the end of the data.
Edit: Code for my solution:
char *beg = data;
char *end = data + strlen(data); /* Points to one beyond the end of 'data' */
for (int i = 0; beg < end; i++)
{
char *next = strchr(beg, '#'); /* Find next '#' */
if (next == NULL)
next = end; /* No more '#': copy up to the end of the string */
memcpy(question[i], beg, next - beg); /* Copy to array */
question[i][next - beg] = '\0'; /* Remember to terminate string */
beg = next + 1; /* Point to next possible number */
}
Note: Not tested. Might be off by one with the copying, might have to be next - beg - 1. (Even after 25 years of C programming, I always seem to get that wrong on the first try... :) )
There is a much simpler way to do this. Use strtok to tokenize the string by "#", and then use strcpy to copy the tokenized strings into your question array. For example (not tested):
char *pcur = strtok(data, "#"); /* first call: pass the string itself */
int i = 0;
while (pcur != NULL)
{
    strcpy(question[i], pcur);
    printf ("%s\r\n", question[i++]);
    pcur = strtok(NULL, "#"); /* later calls: pass NULL to continue */
}
As shown in the above example, incrementing i moves the question array index to the next position, and passing NULL to strtok on the later calls makes it continue from just past the previous token. Note that strtok overwrites each '#' in data with '\0', so data must be a writable array and not a string literal.
See also:
http://msdn.microsoft.com/en-us/library/2c8d19sb.aspx
http://msdn.microsoft.com/en-us/library/kk6xf663.aspx