Copying valid strings to 2d array in C - c

I am checking whether a function returns true and, if it does, printing the strings that another function I was given reports as valid. At the moment it prints the valid strings correctly, but it also prints empty lines, which seem to correspond to the invalid strings.
How can I make these empty lines go away?
Here is my code:
int main()
{
int i, count = 0;
char input[10];
char validStr[10][60] = {""};
for (i = 0; i < 60; ++i){
if(fgets(input,10, stdin) == NULL){
break;
}
input[strcspn(input,"\n")] = '\0';
if(checkIfValid(input)){
memcpy(validStr[i],input,sizeof(input));
count++;
}
}
printf("%d\n",count);
for (int j = 0 ; j < count; ++j){
printf("%s\n",validStr[j]);
}
}
The count shows that only the valid strings are being counted, but as you can tell from the picture, blank lines are printed as well.
Note: For various reasons the program needs to follow the current order so the output is printed after the first for loop.
Thanks in advance!

Instead of this:
if(checkIfValid(input)){
memcpy(validStr[i],input,sizeof(input));
count++;
}
This:
if(checkIfValid(input)){
memcpy(validStr[count],input,sizeof(input));
count++;
}
As others have pointed out in the comments, you should also guard that string copy against overflow. May I suggest:
if(checkIfValid(input)){
char* dst = validStr[count];
size_t MAXLEN = 10;
strncpy(dst, input, MAXLEN);
dst[MAXLEN-1] = '\0';
count++;
}

Continuing from the comment, if you want to store the entire string, you need to provide adequate space for the nul-terminating character.
AAAAAAAAAA
QELETIURTE
...
contain strings that are 10 characters long and will not fit in input as declared char[10].
Instead of looping with a for, let the return value of fgets() control your read-loop, and keep count as a second loop condition so that you protect your array bounds, e.g.
#include <stdio.h>
#include <string.h>
#define MAXC 128 /* if you need a constant, #define one (or more) */
#define NSTR 10
int checkIfValid (const char *s) { (void)s; return 1; } /* stub: accepts every string */
int main(void)
{
size_t count = 0;
char input[MAXC];
char validStr[NSTR][MAXC] = {""};
while (count < NSTR && fgets (input, sizeof input, stdin)) {
input[strcspn(input,"\n")] = '\0';
if(checkIfValid(input)){
strcpy (validStr[count], input);
count++;
}
}
printf ("%zu\n",count);
for (size_t j = 0 ; j < count; ++j) {
printf("%s\n",validStr[j]);
}
}
(adjust your array declaration for 60 strings of 10 characters each)
If you want to cut off at 9 characters and ensure the strings are nul-terminated, @selbie has that covered.
Example Use/Output
With your data (as good as I could read it) in dat/validstr.txt you could do:
$ ./bin/validstring <dat/validstr.txt
6
AAAAAAAAAA
QELETIURTE
321qweve
sdsdsdfFF
GRSGGFDDSS
toLotssAAA

Related

Why is this code producing an infinite loop?

#include <Stdio.h>
#include <string.h>
int main(){
char str[51];
int k = 1;
printf("Enter string\n");
scanf("%s", &str);
for(int i = 0; i < strlen(str); i++){
while(str[k] != '\0')){
if(str[i] == str[k]){
printf("%c", str[i]);
k++;
}
}
}
return 0;
}
It is simple C code that checks for duplicate characters in string and prints the characters. I am not understanding why it is producing an infinite loop. The inner while loop should stop when str[k] reaches the null terminator but the program continues infinitely.
Points to know
You don't need to pass the address of str to scanf(); the array name already decays to a pointer, so write scanf("%s", str)
Don't use a bare "%s"; use "%<WIDTH>s" (for example "%50s" for char str[51]) to avoid buffer overflow
Always check whether the scanf() conversion succeeded by checking its return value
Use size_t to iterate over arrays
Calling strlen(str) in the loop condition rescans the string on every iteration, which adds O(n) work per pass; test str[i] != '\0' instead. Many modern compilers will hoist the call for you, but don't rely on it.
#include <Stdio.h> is wrong: the header is stdio.h, and header names are case-sensitive on many systems
Calls to printf() that need no formatting can be replaced with puts() and putchar(); modern compilers often make this substitution themselves
while(str[k] != '\0')){ has an extra closing parenthesis
Initialize str with {0}, which sets every element of str to 0
Better Implementation
My approach is to create a zero-initialized table of counters (256 entries, one per possible character value) and, for each character in str, increment the counter at that character's value. Afterwards, print every character whose count is greater than 1.
Time Complexity = O(n), where n is the length of the string
Space Complexity = O(NO_OF_CHARACTERS), where NO_OF_CHARACTERS is 256
Final Code
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
static void print_dup(const char *str)
{
size_t *count = calloc(1 << CHAR_BIT, sizeof(size_t));
if (count == NULL) /* allocation failed: nothing to report */
return;
for(size_t i = 0; str[i]; i++)
{
count[(unsigned char)str[i]]++;
}
for (size_t i = 0; i < (1 << CHAR_BIT); i++)
{
if(count[i] > 1)
{
printf("`%c`, count = %zu\n", (int)i, count[i]); /* %c expects an int */
}
}
free(count);
}
int main(void) {
char str[51] = {0};
puts("Enter string:");
if (scanf("%50s", str) != 1)
{
perror("bad input");
return EXIT_FAILURE;
}
print_dup(str);
return EXIT_SUCCESS;
}
Read your code in English: You only increment variable k if character at index k is equal to character at index i. For any string that has different first two characters you will encounter infinite loop: char at index i==0 is not equal to char at index k==1, so k is not incremented and while(str[k]!=0) loops forever.

Random Characters Appearing When Printing Arrays

I'm relatively new to coding array functions in C. After numerous tries, I've decided to surrender and ask for help.
I want the user to input words and store them in the 2d array words. The problem is that it prints the words but also prints out random characters.
#include "mp1_lib.h"
void get_words(int n, char words[][16])
{
char c = ' ';
char check;
for(int x=0; x <= n; x++)
{
for(int y=0; y < 16; y++)
{
c = getchar();
check = c;
if (check == '\n')
{
break;
}
words[x][y] = c;
}
}
}
void print_words(int n, char words[][16])
{
for(int x=1; x <= n; x++)
{
for(int y=0; y < 16; y++)
{
if (words[x][y] == '\n')
{
break;
}
putchar(words[x][y]);
}
printf("\n");
}
}
In C, a string is an array of characters that ends with the nul-terminating character '\0', which marks the end of the string's contents within the array. That is how string functions such as strlen, or printf with the "%s" format specifier, know where the string stops.
If you do not nul-terminate the array of characters, then it is not a string, it is simply an array, and you cannot pass an unterminated array to any function expecting a string, because the function won't know where the string ends. In the case of printf, it will print whatever unspecified characters happen to be in memory until it comes upon a '\0' (or it SegFaults).
If you don't nul-terminate the words in your array, then you need some other way to store the number of characters in each word so your print function knows where to stop. If you have a two-letter word like "Hi" in a 16-char array, you can only print 2 characters from the array; if the array is uninitialized, you will simply get gibberish printed for characters 3-16.
Your second problem is: "How do you know how many words you have stored in your array?" You don't return a value from get_words, so unless you change the function type to int and return the number of words stored, your only other option is to pass a pointer to an integer and update the value at that address so it is available back in the calling function. Either way is fine; you generally only make a value available through a pointer if you are already returning another value and need a second way to make an updated value visible back in the calling function (main() here).
Putting those pieces together, and passing a pointer to the number of words to a rewritten getwords so the count is available back in main() (so you know how many words prnwords has to print), you could do something similar to the following:
#include <stdio.h>
#include <ctype.h>
#define MAXC 16 /* if you need constants, define them */
#define MAXW 32
void getwords (char (*words)[MAXC], int *n)
{
int col = 0; /* column count */
while (*n < MAXW) { /* while words < MAXW */
int c = getchar(); /* read char */
/* column reaches MAXC-1 or if whitespace or EOF */
if (col == MAXC - 1 || isspace(c) || c == EOF) {
if (col) { /* if col > 0 */
words[(*n)++][col] = 0; /* nul-terminate, increment n */
col = 0; /* set col to zero */
}
if (c == EOF) /* if char EOF - all done */
return;
}
else /* otherwise - just add char to word */
words[*n][col++] = c;
}
}
void prnwords (char (*words)[MAXC], int n)
{
for (int i = 0; i < n; i++) /* loop over each of n-words & print */
printf ("words[%2d]: %s\n", i, words[i]);
}
int main (void) {
char words[MAXW][MAXC] = {""}; /* initialize words all zero */
int nwords = 0; /* number of words zero */
getwords (words, &nwords);
prnwords (words, nwords);
return 0;
}
(note: when reading characters into the words array, you must check the number of characters read against the maximum characters per word (MAXC) and the number of words against the maximum number of words/rows in your array (MAXW) to prevent writing outside of your array bounds, which would invoke Undefined Behavior in your program)
(note: the ctype.h header was included to simplify checking whether the character read was whitespace (e.g. a space, tab, or newline). If you can't use it, then simply use an if (c == ' ' || c == '\t' || c == '\n') instead.)
Example Use/Output
$ echo "my dog has fleas and my cat has none" | ./bin/getwords
words[ 0]: my
words[ 1]: dog
words[ 2]: has
words[ 3]: fleas
words[ 4]: and
words[ 5]: my
words[ 6]: cat
words[ 7]: has
words[ 8]: none
I'm not too familiar with C, but it appears that you are not adding the newline character to the words array in get_words.
check = c;
if (check == '\n')
{
break;
}
words[x][y] = c;
So when printing in print_words this will never be true.
if (words[x][y] == '\n')
{
break;
}
That means that whatever happens to be in the memory location is what will get printed.
Your words have neither the newline character (which makes your code print garbage) nor the terminating nul character (which means they are not valid C strings). At least add words[x][y]='\n' before breaking out of the inner loop. Or, rather, move the if check after the assignment words[x][y]=c;. And yes, the loop should go from 0 to n-1.
As a side note, you do not need the variable check: just use c.
I tried to assign space as a placeholder for the 15 characters and it worked. Thanks, everyone! :)
#include "mp1_lib.h"
void get_words(int n, char words[][16])
{
char c = ' ';
char check;
for(int x=0; x < n; x++)
{
for(int y=0; y < 16; y++)
{
words[x][y] = ' ';
}
}
for(int x=0; x < n; x++)
{
for(int y=0; y < 16; y++)
{
c = getchar();
check = c;
if (check == '\n')
{
break;
}
words[x][y] = c;
}
}
}
void print_words(int n, char words[][16])
{
for(int x=0; x < n; x++)
{
for(int y=0; y < 16; y++)
{
putchar(words[x][y]);
}
printf("\n");
}
}

Program runs too slowly with large input - C

The goal for this program is for it to count the number of instances that two consecutive letters are identical and print this number for every test case. The input can be up to 1,000,000 characters long (thus the size of the char array to hold the input). The website which has the coding challenge on it, however, states that the program times out at a 2s run-time. My question is, how can this program be optimized to process the data faster? Does the issue stem from the large char array?
Also: I get a compiler warning "assignment makes integer from pointer without a cast" for the line str[1000000] = "". What does this mean, and how should it be handled instead?
Input:
number of test cases
strings of capital A's and B's
Output:
Number of duplicate letters next to each other for each test case, each on a new line.
Code:
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main() {
int n, c, a, results[10] = {};
char str[1000000];
scanf("%d", &n);
for (c = 0; c < n; c++) {
str[1000000] = "";
scanf("%s", str);
for (a = 0; a < (strlen(str)-1); a++) {
if (str[a] == str[a+1]) { results[c] += 1; }
}
}
for (c = 0; c < n; c++) {
printf("%d\n", results[c]);
}
return 0;
}
You don't need the line
str[1000000] = "";
scanf() adds a null terminator when it parses the input and writes it to str. This line is also writing beyond the end of the array, since the last element of the array is str[999999].
The reason you're getting the warning is that the type of str[1000000] is char, but a string literal is an array of char that decays to char * in this assignment.
To speed up the program, take the call to strlen() out of the loop.
size_t len = strlen(str);
for (size_t a = 0; a + 1 < len; a++) { /* a + 1 < len also handles an empty string */
...
}
str[1000000] = "";
This does not do what you think it does, and it overflows the buffer, which results in undefined behaviour. Valid indices run from 0 up to, but not including, sizeof(str). So either declare the array with one more element (1000001) or use index 999999 to access the last element. To get rid of the compiler warning, assign a character rather than a string literal:
str[1000000] = '\0';
Or
str[999999] = '\0';
Depending on what you did to fix it.
As to optimizing, you should look at the assembly and go from there.
count the number of instances that two consecutive letters are identical and print this number for every test case
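For efficiency, the code needs a new approach, as suggested by @john bollinger & @molbdnilo
#include <ctype.h> /* isalpha */
#include <stdio.h> /* EOF, printf */
#include <string.h> /* strlen */
void ReportPairs(const char *str, size_t n) {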
int previous = EOF;
unsigned long repeat = 0;
for (size_t i=0; i<n; i++) {
int ch = (unsigned char) str[i];
if (isalpha(ch) && ch == previous) {
repeat++;
}
previous = ch;
}
printf("Pair count %lu\n", repeat);
}
char *testcase1 = "test1122a33";
ReportPairs(testcase1, strlen(testcase1));
or directly from input and "each test case, each on a new line."
int ReportPairs2(FILE *inf) {
int previous = EOF;
unsigned long repeat = 0;
int ch;
while ((ch = fgetc(inf)) != '\n') {
if (ch == EOF) return ch;
if (isalpha(ch) && ch == previous) {
repeat++;
}
previous = ch;
}
printf("Pair count %lu\n", repeat);
return ch;
}
while (ReportPairs2(stdin) != EOF);
Unclear how OP wants to count "AAAA" as 2 or 3. This code counts it as 3.
One way to dramatically improve the run-time of your code is to limit the number of times you read from stdin, that is, to process the input in bigger chunks. You can do this in a number of ways, but probably one of the most efficient is fread. Even reading in 8-byte chunks can provide a big improvement over reading a character at a time. One example of such an implementation, considering capital letters [A-Z] only, would be:
#include <stdio.h>
#define RSIZE 8
int main (void) {
char qword[RSIZE] = {0};
char last = 0;
size_t i = 0;
size_t nchr = 0;
size_t dcount = 0;
/* read up to 8-bytes at a time */
while ((nchr = fread (qword, sizeof *qword, RSIZE, stdin)))
{ /* compare each byte to byte before */
for (i = 1; i < nchr && qword[i] && qword[i] != '\n'; i++)
{ /* if not [A-Z] continue, else compare */
if (qword[i-1] < 'A' || qword[i-1] > 'Z') continue;
if (i == 1 && last == qword[i-1]) dcount++;
if (qword[i-1] == qword[i]) dcount++;
}
last = qword[i-1]; /* save last for comparison w/next */
}
printf ("\n sequential duplicated characters [A-Z] : %zu\n\n",
dcount);
return 0;
}
Output/Time with 868789 chars
$ time ./bin/find_dup_digits <dat/d434839c-d-input-d4340a6.txt
sequential duplicated characters [A-Z] : 434893
real 0m0.024s
user 0m0.017s
sys 0m0.005s
Note: the string was actually a string of '0's and '1's run with a modified test of if (qword[i-1] < '0' || qword[i-1] > '9') continue; rather than the test for [A-Z]...continue, but your results with 'A's and 'B's should be virtually identical. 1000000 would still be significantly under .1 seconds. You can play with the RSIZE value to see if there is any benefit to reading a larger (suggested 'power of 2') size of characters. (note: this counts AAAA as 3) Hope this helps.

Split string into array in C

Currently I'm trying to take a binary string, say 100101010, and split it into groups of three, so 100 101 010. Here's what I've written so far, for some reason it only prints the first group, 100 and then nothing after that.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(){
int i;
char *line = NULL;
free(line);
scanf("%ms", &line);
printf("%d\n", strlen(line));
for(i=0; i < strlen(line); ++i) {
if ( i % 3 == 0 ){
sprintf(line, "%c%c%c", line[i],line[i+1],line[i+2]);
printf(line);
}
}
}
sprintf(line, "%c%c%c", line[i],line[i+1],line[i+2]); writes your 3 characters into line, so you overwrite your original string with the first group of 3. strlen(line) is now 3, so by the time i reaches the next multiple of 3 (i == 3), the condition i < strlen(line) is false and the loop stops.
Try:
/* Since 'line' and its contents don't change in the loop we can
* avoid the overhead of strlen() calls by doing it once and saving the
* result.
*/
int len = strlen(line);
/* As mentioned in the comments, you could do
* for(i = 0; i < len; i+=3) and then you don't need the
* if (i%3) check inside the loop
*/
for(i=0; i < len; ++i) {
if ( i % 3 == 0 ){
/* This could be refactored to a loop
* or sprintf() to a different string but I say sprintf is overkill
* in this scenario...
*/
char buffer[4];
buffer[0] = line[i];
buffer[1] = line[i+1];
buffer[2] = line[i+2];
buffer[3] = '\0';
printf("%s\n", buffer);
// Or just use puts() since we're not really doing
// any formatting.
}
}
strlen(line) is reevaluated on each pass through the for loop, and you're changing the data that line points to inside the for loop by calling sprintf. Your sprintf makes line a 3-character string, hence you get only one trip through the loop in which i%3 is zero.

C program Need help fixing my code for a word sort program

Hi, I am still new to C and have been working on this word sort program for some time now. The guidelines are:
Write a program that sorts a series of words entered by the user. Assume that each word is no more than 20 characters long. Stop reading when the user enters an empty word. Store each word in a dynamically allocated string, using an array of pointers (use the read_line function). After all lines have been read sort the array. Then use a loop to print the words in sorted order.
The problem I seem to be having is that the program will accept words but when I enter the empty word it goes to a new line and nothing happens. Any help or advice would be greatly appreciated. Here is my code so far.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define LEN 20
#define LIM 20
int read_line(char str[], int n);
void sort_str(char *list[], int n);
int alpha_first(char *list[], int min_sub, int max_sub);
int main(void)
{
char *list[LIM];
char *alpha[LIM];
char word_str[LEN];
int word, i, j, num_count = 0;
for(;;){
printf("Enter a word: ");
scanf("%s", &word);
if(word == NULL)
break;
else
read_line(word_str, LEN);
list[i] = malloc(strlen(word_str) + 1);
strcpy(list[i], word_str);
alpha[i] = list[i];
}
sort_str(alpha, i);
for(i = 0; i < num_count; ++i){
printf("Sorted: ");
puts(list[i]);
}
return (0);
}
int read_line(char str[], int n)
{
int ch, i = 0;
while ((ch = getchar()) != '\n')
if (i < n)
str[i++] = ch;
str[i] = '\0';
return i;
}
void sort_str(char *list[], int n)
{
int i, index_of_min;
char *temp;
for (i= 0; i < n - 1; ++i) {
index_of_min = alpha_first(list, i, n - 1);
if (index_of_min != i) {
temp = list[index_of_min];
list[index_of_min] = list[i];
list[i] = temp;
}
}
}
int alpha_first(char *list[], int min_sub, int max_sub){
int i, first;
first = min_sub;
for(i = min_sub + 1; i <= max_sub; ++i){
if(strcmp(list[i], list[first]) < 0){
first = i;
}
}
return (first);
}
Your logic flow is flawed. If a word is entered, the scanf() will eat it from stdin and store a null-terminated string at the address of the integer 'word'. Any more than 3/7 chars entered, (32/64 bit, allowing for the null terminator), will start corrupting the stack. read_line() will then only have the line terminator to read from stdin, (assuming the UB doesn't blow it up first).
The problem I seem to be having is that the program will accept words but when I enter the empty word it goes to a new line and nothing happens.
There are several problems with this:
char word_str[LEN];
int word, i, j, num_count = 0;
/* ... */
scanf("%s", &word);
if(word == NULL)
break;
First, scanf("%s", &word) scans whitespace-delimited strings, and to that end it skips leading whitespace, including newlines. You cannot read an "empty word" that way, though you can fail to read a word at all if the end of the input is reached (or an I/O error occurs) before any non-whitespace characters are scanned.
Second, you are passing an inappropriate pointer to scanf(). You should pass a pointer to a character array, but you instead pass a pointer to an int. It looks like maybe you wanted to scan into word_str instead of into word.
Third, your scanf() format does not protect against buffer overflow. You should provide a field width to limit how many characters can be scanned. Moreover, you need to be sure to leave room for a string terminator.
Fourth, you do not check the return value of scanf(). If it fails to match any characters to the field, then it will not store any. Since it returns the number of fields that were successfully scanned (or an error indicator), you can detect this condition.
One way to correct the scanf() and "empty word" test would be:
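int result;
scanf("%*[ \t]"); /* skip leading blanks, if any */
result = scanf("%19[^ \t\n]", word_str); /* read up to 19 word characters */
scanf("%*[^\n]"); /* discard the rest of the line, if any */
getchar(); /* consume the newline itself */
if (result < 1) break;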
(That assumes a fixed maximum word length of 19 to go with your declared array length of 20.) You have several additional problems in your larger code, chief among them that read_line() attempts to read the same data you just read via scanf() (in fact, that function looks altogether pointless). Also, you never update num_count, the index i is never initialized before it is used in list[i], and after calling sort_str() you lose track of the number of strings you've read by assigning a new value to variable i.
There may be other problems, too.

Resources