C: String functions without string.h library


I have homework to do and I need some help. I didn't come here to get someone to do my work, just to help me.
I have to create my own string.h "library" (actually a header and a function file), so I am forbidden to use #include <string.h> and #include <ctype.h>. They also recommended that we not use malloc, but it was just a recommendation, not a prohibition.
For most of the functions I know how to write them.
I planned to save "strings" like arrays of chars like:
char array[50];
But I came to a problem of creating toupper and tolower functions.
Sure, I can make huge switch cases or a lot if (else if's) like this:
if(string[i]=='a') { string[i]='A'; }
else if(string[i]=='b') { string[i]='B'; }
.
.
.
else if(string[i]=='z') { string[i]='Z'; }
But is there any better solution?
Strings are going to be created randomly so they will look somewhat like this:
ThisISSomESTRing123.
So after toupper function, randomly generated string should look like this:
THISISSOMESTRING123.
Also how would you create puts function (printing) so everything would be in same row? Just "printf" inside "for" loop?

Your system probably uses ASCII. In ASCII, the codepoints of lowercase characters and uppercase characters are sequential.
So we can just do:
void to_upper(char *message) {
while (*message) {
if (*message >= 'a' && *message <= 'z')
*message = *message - 'a' + 'A';
message++;
}
}
Other character encodings can become much more complicated. For example, EBCDIC doesn't have contiguous characters. And UTF-8 introduces a world of problems because of the numerous languages that you need to support.
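If the code also has to work on an encoding where the letters are not contiguous, one possible approach is to look each character up in an explicit alphabet instead of doing arithmetic on it. The following is only a sketch: the name my_to_upper_portable is made up for illustration, and it handles only the 26 basic Latin letters, not UTF-8 or locale-specific letters.

```c
/* Uppercases a string in place without assuming that 'a'..'z' are
 * contiguous and without using string.h or ctype.h.
 * my_to_upper_portable is a hypothetical name chosen for this sketch. */
void my_to_upper_portable(char *message)
{
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    for (; *message != '\0'; message++) {
        /* Linear search of the lowercase alphabet: at most 26 steps. */
        for (int i = 0; lower[i] != '\0'; i++) {
            if (*message == lower[i]) {
                *message = upper[i];   /* same position, other case */
                break;
            }
        }
    }
}
```

Swapping the roles of lower and upper gives the matching lowercase conversion. The search costs up to 26 comparisons per character, which is the price of not relying on the encoding.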

You can check whether the character is within a range, say between 'a' and 'z', and then add or subtract 32, which is the difference between the ASCII values of uppercase and lowercase letters. See the ASCII table here: http://www.asciitable.com/

Just do
if (string[i] >= 'a' && string[i] <= 'z') {
string[i] = string[i] - 'a' + 'A';
}
It converts lowercase to uppercase by mapping a..z to 0..25 and then adding the ASCII value of 'A'.

There's a neat trick for converting ASCII cases. Consider these characters:
A - binary 01000001
a - binary 01100001
As we can see, the difference is in the 6th bit, counting from the right. Indeed, the difference between uppercase and lowercase ASCII chars is 2^5 = 32. So, to convert a letter to uppercase, simply AND it with 0xDF (11011111) to set that bit to 0. In this way you don't even have to check if the character is in uppercase already.
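Note that this breaks every non-letter character that has that bit set: the space, the digits and most punctuation (0x20 to 0x3F), as well as the backtick, {, |, } and ~ (0x60 and 0x7B to 0x7E). For example, '1' (0x31) becomes the control character 0x11. So the trick is safe without an if only when the string contains nothing but letters; a string with digits, like the one in the question, still needs the range check.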
Note: Only use that as a cool trick for this homework. Normally you should just use proper, tested solutions (aka string.h).
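A guarded variant of the bit trick might look like the sketch below. It assumes an ASCII-compatible encoding, and the name to_upper_bits is invented for this example. The range check keeps digits and punctuation intact, because characters such as '1' (0x31) also have the 0x20 bit set.

```c
/* Clears bit 5 (0x20) for 'a'..'z' only. Assumes ASCII, where that
 * bit is the sole difference between a lowercase letter and its
 * uppercase counterpart. to_upper_bits is a hypothetical name. */
void to_upper_bits(char *s)
{
    for (; *s != '\0'; s++) {
        if (*s >= 'a' && *s <= 'z')
            *s = (char)(*s & 0xDF);   /* 0xDF == 11011111 in binary */
    }
}
```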

void toUpper(char *arr) {
while (*arr) {
if (*arr >= 'a' && *arr <= 'z')
*arr = *arr - 'a' + 'A';
arr++;
}
}
Use this function to make all the letters in a string uppercase: just call toUpper and pass your array as the parameter.
To print the array, loop over its elements with a for loop and print each character.
for(int i=0;arr[i] !='\0';i++)
{
printf("%c",arr[i]);
}
This prints every element of the array, that is, the whole string, on one line.
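For the puts part of the question, a minimal sketch could write one character at a time with putchar and finish with a newline, as the standard puts does. The name my_puts and the choice to return the number of characters written are assumptions made for this example; the standard only promises a non-negative value on success.

```c
#include <stdio.h>

/* Writes s followed by a newline, one character at a time.
 * Returns the number of characters written (newline included),
 * or EOF if a write fails. my_puts is a hypothetical name. */
int my_puts(const char *s)
{
    int count = 0;

    for (; *s != '\0'; s++) {
        if (putchar((unsigned char)*s) == EOF)
            return EOF;
        count++;
    }
    if (putchar('\n') == EOF)
        return EOF;
    return count + 1;
}
```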

Related

Condition for checking whether a letter is between 'j' and 'p'

The code should accept a character and it should check whether its between 'j' and 'p'.
If it is between 'j' and 'p' it should print yes or else it should print no.
I have tried to do something about it but the only ideas I got is this:
if (a=='j' || a=='k' || a=='l' || a=='m' || a=='n' || a=='o' || a=='p')
{
printf("YES");
}
else
{
printf("NO");
}

You can avoid all the alternative tests by using a function like strchr():
if (strchr("jklmnop", a)) {
puts("YES");
} else {
puts("NO");
}
The obvious approach is to do something like
if (a >= 'j' && a <= 'p') {
// ...
}
but that has a problem if you want to write portable code.
The C standard only requires that the characters '0' through '9' appear consecutively and in order. If you're following the standard to a t, you shouldn't assume that 'j' through 'p' appear together and can be used with a pair of >= and <= tests. If you add additional qualifications like requiring an ASCII compatible character set, it's a different story.

It depends on what you mean by "between j and p".
If you mean "Only lowercase English letters between j and p", then one portable way of writing it down is
if (strchr("jklmnop", a)) ...
If you mean "Character codes between that of 'j' and that of 'p', in whatever encoding is used by the machine", then one portable way of writing it down is
if (a >= 'j' && a <= 'p') ...
If your encoding is ASCII, then the two notions above strictly coincide for any range of English letters.
If your encoding is EBCDIC, then they coincide for the range j..p, but not say for the range i..p.
It is guaranteed that all English letters between j and p are included in the range of the codes in any standards-compliant encoding, but there might be additional, non-English-letter characters in the same range.
Finally, for completeness, if by "between j and p" you mean "letters of the user's language, whatever it is, that are between j and p", then one correct way of writing it down is probably
setlocale (LC_ALL, ""); // first statement of the program
...
if (strcoll(a, "j") >= 0 && strcoll(a, "p") <= 0) ...
Note that here, a is not a character as above, but a string. It is up to you to ensure that it contains a single character of the user's language (which is not the same thing as a single char element). Ensuring this is very non-trivial.
TL;DR
if (a >= 'j' && a <= 'p') will probably work for whatever task you currently have, but don't assume it will always work.

try this code
#include <stdio.h>
int main()
{
char a;
printf("enter the letter :");
scanf("%c",&a);
if(a>='j' && a<='p')
printf("yes the letter is between j and p");
else
printf("No the given letter is not between j and p");
}

Every letter has an ASCII code, which is an integer, so you can do it like this:
char a;
scanf("%c",&a);
if(a>='j'&&a<='p')
{
printf("YES");
}
else
{
printf("NO");
}

Assuming lowercase letters form a contiguous block in the execution set, which is true for ASCII, you can write:
if (a >= 'j' && a <= 'p') ...
Assuming a is an int containing a char value, you can write this as a single test, although a good compiler should generate the same code for the more readable test above:
if ((unsigned)(a - 'j') <= (unsigned)('p' - 'j')) ...
You could also test if a is in a set of characters, which will work regardless of the target encoding:
if (a != 0 && strchr("jklmnop", a)) ...
The test for a != 0 can be removed if you know a cannot be a null byte.
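To compare the three tests side by side, they can be wrapped in small functions. This is only a sketch with made-up function names; the first two assume that 'j' through 'p' are contiguous, which holds for ASCII, while the third does not depend on the encoding.

```c
#include <string.h>

/* Readable two-comparison range test. */
int in_range(int a)
{
    return a >= 'j' && a <= 'p';
}

/* Single-comparison form: values below 'j' wrap around to large
 * unsigned numbers and therefore fail the test. */
int in_range_single(int a)
{
    return (unsigned)(a - 'j') <= (unsigned)('p' - 'j');
}

/* Set membership. The a != 0 guard is needed because strchr also
 * matches the terminating null byte of the search string. */
int in_set(int a)
{
    return a != 0 && strchr("jklmnop", a) != NULL;
}
```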

Character literals (this is a character literal: 'a') are just numbers. Almost all computers use an ASCII-compatible encoding (there are very few exceptions).
Assuming ASCII, 'a' is 97 to your computer and 'j' is 106. If you write a=='j', you are basically writing a==106. Character literals are just syntactic sugar: they make the code a lot easier for humans to read, but the computer does not care.
This means you have to check whether a is between 106 and 112. You probably know a better way to do that than your current one. But instead of 106 and 112, write 'j' and 'p', because that is easier to read.

CamelCase to snake_case in C without tolower

I want to write a function that converts CamelCase to snake_case without using tolower.
Example: helloWorld -> hello_world
This is what I have so far, but the output is wrong because I overwrite a character in the string here: string[i-1] = '_';.
I get hell_world. I don't know how to get it to work.
void snake_case(char *string)
{
int i = strlen(string);
while (i != 0)
{
if (string[i] >= 65 && string[i] <= 90)
{
string[i] = string[i] + 32;
string[i-1] = '_';
}
i--;
}
}

This conversion means, aside from converting a character from uppercase to lowercase, inserting a character into the string. This is one way to do it:
Iterate from left to right.
If an uppercase character is found, use memmove to shift all characters from this position to the end of the string one position to the right, then assign the to-be-inserted value to the current character.
Stop when the null terminator (\0) has been reached, indicating the end of the string.
Iterating from right to left is also possible, but since the choice is arbitrary, going from left to right is more idiomatic.
A basic implementation may look like this:
#include <stdio.h>
#include <string.h>
void snake_case(char *string)
{
for ( ; *string != '\0'; ++string)
{
if (*string >= 65 && *string <= 90)
{
*string += 32;
memmove(string + 1U, string, strlen(string) + 1U);
*string = '_';
}
}
}
int main(void)
{
char string[64] = "helloWorldAbcDEFgHIj";
snake_case(string);
printf("%s\n", string);
}
Output: hello_world_abc_d_e_fg_h_ij
Note that:
The size of the string to move is the length of the string plus one, to also move the null-terminator (\0).
I am assuming the function isupper is off-limits as well.
The array needs to be large enough to hold the new string, otherwise memmove will perform invalid writes!
The latter is an issue that needs to be dealt with in a serious implementation. The general problem of "writing a result of unknown length" has several solutions. For this case, they may look like this:
1. First determine how long the resulting string will be, reallocate the array, and only then modify the string. Requires two passes.
2. Every time an uppercase character is found, reallocate the string to its current size + 1. Requires only one pass, but frequent reallocations.
3. Same as 2, but whenever the array is too small, reallocate the array to twice its current size. Requires a single pass and less frequent (but larger) reallocations. Finally, reallocate the array to the length of the string it actually contains.
In this case, I consider option 1 to be the best. Doing two passes is an option if the string length is known, and the algorithm can be split into two distinct parts: find the new length, and modify the string. I can add it to the answer on request.
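A two-pass version might look like the sketch below. It is a variation on the first option rather than the answer's own code: instead of reallocating the caller's array, it writes into a freshly allocated buffer, which avoids shifting characters altogether. The name snake_case_copy is invented here, and an ASCII-compatible encoding is assumed.

```c
#include <stdlib.h>

/* Returns a newly allocated snake_case copy of src, or NULL if the
 * allocation fails. The caller must free the result.
 * snake_case_copy is a hypothetical name; ASCII is assumed. */
char *snake_case_copy(const char *src)
{
    /* Pass 1: each uppercase letter needs one extra byte for '_'. */
    size_t len = 0;
    for (const char *p = src; *p != '\0'; p++)
        len += (*p >= 'A' && *p <= 'Z') ? 2 : 1;

    char *out = malloc(len + 1);   /* +1 for the null terminator */
    if (out == NULL)
        return NULL;

    /* Pass 2: copy, inserting '_' and lowercasing as needed. */
    char *q = out;
    for (const char *p = src; *p != '\0'; p++) {
        if (*p >= 'A' && *p <= 'Z') {
            *q++ = '_';
            *q++ = (char)(*p - 'A' + 'a');
        } else {
            *q++ = *p;
        }
    }
    *q = '\0';
    return out;
}
```

Because the output length is known before any byte is written, no write can land outside the buffer.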

Is there a simple, portable way to determine the ordering of two characters in C?

According to the standard:
The values of the members of the execution character set are implementation-defined.
(ISO/IEC 9899:1999 5.2.1/1)
Further in the standard:
...the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous.
(ISO/IEC 9899:1999 5.2.1/3)
It appears that the standard requires that the execution character set includes the 26 uppercase and 26 lowercase letters of the Latin alphabet, but I see no requirement that these characters be ordered in any way. I only see an order stipulation for the decimal digits.
This would seem to imply that, strictly speaking, there is no guarantee that 'a' < 'b'. Now, the letters of the alphabet are in order in each of ASCII, UTF-8, and EBCDIC. But for ASCII and UTF-8 we have 'A' < 'a', while for EBCDIC we have 'a' < 'A'.
It might be nice to have a function in ctype.h that compares alphabetic characters portably. Short of this or something similar, it seems to me that one must look in the locale to find the value of CODESET and proceed accordingly, but this doesn't seem simple.
My gut tells me that this is almost never an issue; for most cases alphabetical characters can be handled by converting to lowercase, because for the most commonly used character sets the letters are in order.
The question: given two chars
char c1;
char c2;
is there a simple, portable way to determine if c1 precedes c2 alphabetically? Or do we assume that the lowercase and uppercase characters always occur in sequence, even though this does not appear to be guaranteed by the standard?
To clarify any confusion, I am really just interested in the 52 letters of the Latin alphabet that are guaranteed by the standard to be in the execution character set. I realize that other sets of letters are important, but it seems that we can't even know about the ordering of this small subset of letters.
Edit
I think that I need to clarify a bit more. The issue, as I see it, is that we commonly think of the 26 lowercase letters of the Latin alphabet as being ordered. I would like to be able to assert that 'a' comes before 'b', and we have a convenient way of expressing this in code as 'a' < 'b', when we give 'a' and 'b' integral values. But the standard gives no assurances that the above code will evaluate as expected. Why not? The standard does guarantee this behavior for the digits 0-9, and this seems sensible. If I want to determine if one letter-char precedes another, say for sorting purposes, and if I want this code to be truly portable, it seems like the standard offers no help. Now I have to rely on the convention that ASCII, UTF-8, EBCDIC, etc. have adopted that 'a' < 'b' should be true. But this isn't really portable unless the only character sets used rely on this convention; this may be true.
This question originated for me in another question thread: Check if a letter is before or after another letter in C. Here, a few people suggested that you could determine the order of two letters stored in chars using inequalities. But one commenter pointed out that this behavior is not guaranteed by the standard.

strcoll is designed for this purpose. Simply set up two strings of one character each. (normally you want to compare strings, not characters).
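A small wrapper along these lines could build the two one-character strings. It is a sketch with an invented name, compare_chars_coll. Keep in mind that in the default "C" locale strcoll behaves like strcmp, so the program would have to call setlocale(LC_COLLATE, "") first to get the user's collation order.

```c
#include <string.h>

/* Compares two characters with strcoll by wrapping each one in a
 * one-character string. Returns <0, 0 or >0 like strcmp.
 * compare_chars_coll is a hypothetical name for this sketch. */
int compare_chars_coll(char c1, char c2)
{
    char s1[2] = { c1, '\0' };
    char s2[2] = { c2, '\0' };

    return strcoll(s1, s2);
}
```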

There are historically used codes that don't simply order the alphabet. Baudot, for example, puts vowels before consonants, so 'A' < 'B', but 'U' < 'B' as well.
There are also codes like EBCDIC that are ordered, but with gaps. So in EBCDIC, 'I' < 'J', but 'I' + 1 != 'J'.

You could probably just make a table that maps the characters the standard guarantees to exist to their ASCII character numbers. E.g.,
#include <limits.h>
static char mytable[] = {
['a'] = 0x61,
['b'] = 0x62,
// ...
['A'] = 0x41,
['B'] = 0x42,
// ...
};
The compiler will map every listed character of the current character set (which may be any crazy character set) to its ASCII code, and the characters that are not guaranteed to exist will be mapped to zero. Then you can use this table for ordering whenever needed.
As you said,
char c1;
char c2;
Could portably be verified to be alphabetically ordered by checking
(c1 < sizeof(mytable) && c2 < sizeof(mytable) ? mytable[c1] < mytable[c2] : 0)
I've actually used this on a research project which runs on ASCII and EBCDIC for predictable ordering, but it's portable enough to work on any character set. Edit: I've deliberately left the size of the table empty, so that it is computed as the minimum needed, because of the DeathStation 9000, on which a byte might have 32 bits and hence CHAR_MAX could be up to 4294967295 or greater.

For A-Z,a-z in a case-insensitive manner (and using compound literals):
char ch = foo();
az_rank = strtol((char []){ch, 0}, NULL, 36);
For 2 char that are known to be A-Z,a-z but may be ASCII or EBCDIC.
int compare2alpha(char c1, char c2) {
int mask = 'A' ^ 'a'; // Only 1 bit is different between upper/lower
return (c1 | mask) - (c2 | mask);
}
Alternatively, if char is limited to 256 different values, you could use a look-up table that maps each char to its rank. Of course, the table contents are platform dependent.
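One way to avoid hard-coding platform-dependent table contents is to fill the table at run time from the character literals themselves. The sketch below assumes an 8-bit char, uses the invented name alpha_rank, and is not thread-safe because of the lazy initialization.

```c
#include <limits.h>

/* Returns 1..26 for a/A..z/Z and 0 for any other character.
 * The table is built from character literals on first use, so it
 * adapts to the execution character set. alpha_rank is a
 * hypothetical name; the lazy initialization is not thread-safe. */
int alpha_rank(char c)
{
    static unsigned char rank[UCHAR_MAX + 1];
    static int ready = 0;

    if (!ready) {
        static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
        static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

        for (int i = 0; i < 26; i++) {
            rank[(unsigned char)lower[i]] = (unsigned char)(i + 1);
            rank[(unsigned char)upper[i]] = (unsigned char)(i + 1);
        }
        ready = 1;
    }
    return rank[(unsigned char)c];
}
```

Two letters can then be ordered case-insensitively by comparing alpha_rank(c1) with alpha_rank(c2).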

With C11, code could use _Static_assert() to ensure, at compile time, that characters have the desired ordering.
An advantage of this approach is that the overwhelming majority of character encodings already meet the desired A-Z requirement. Should a novel or esoteric platform use something different, it may require coding or customization that is not foreseeable, and the best the code can do in that case is fail to compile.
Example use
// Sample case-insensitive string sort routine that ensures
// 1) 'A' < 'B' < 'C' < ... < 'Z'
// 2) 'a' < 'b' < 'c' < ... < 'z'
int compare_string_case_insensitive(const void *a, const void *b) {
_Static_assert('A' < 'B', "A-Z order unexpected");
_Static_assert('B' < 'C', "A-Z order unexpected");
_Static_assert('C' < 'D', "A-Z order unexpected");
// Other 21 _Static_assert() omitted for brevity
_Static_assert('Y' < 'Z', "A-Z order unexpected");
_Static_assert('a' < 'b', "a-z order unexpected");
_Static_assert('b' < 'c', "a-z order unexpected");
_Static_assert('c' < 'd', "a-z order unexpected");
// Other 21 _Static_assert() omitted for brevity
_Static_assert('y' < 'z', "a-z order unexpected");
const char *sa = (const char *)a;
const char *sb = (const char *)b;
int cha, chb;
do {
cha = toupper((unsigned char) *sa++);
chb = toupper((unsigned char) *sb++);
} while (cha && cha == chb);
return (cha > chb) - (cha < chb);
}

looking for numbers in a character array

I'm writing a program that reads lines from a file
I need to print out the numbers, the lines read are stored in a character array:
char line[255];
//code to read line from file here
for (c=0;c<256;c++)
{
if (line[c]<58 && line[c]>47)
printf("the character is: %c\n",line[c]);
}
the configuration file has the following lines:
buttons 3
the result I'd like to get is the character is 3, instead I get 3,9,4,4
Hope I've provided sufficient information.
thanks

Your if-statement works on ASCII, but the magic numbers hide its intent.
You can express it much more clearly as:
if ('0' <= line[c] && line[c] <= '9')
{
printf("the character is: %c\n",line[c]);
}
Your loop runs for 256 characters, even though line holds only 255 elements and the input "buttons 3" has only 9 characters. You're reading past the end of the string (and, at index 255, past the end of the array), and likely finding 9, 4, 4 there by chance.
You want:
for (int c=0; c < 255; ++c) // line has 255 elements, so valid indices are 0..254
{
if (line[c] == '\0') // If the end of the input is found, stop the loop.
{
break;
}
if ('0' <= line[c] && line[c] <= '9')
{
printf("the character is: %c\n",line[c]);
}
}
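The same idea can be packaged as a function that stops at the terminator on its own and uses isdigit from ctype.h instead of comparing against character codes. This is a sketch: the name print_digits and the returned count are additions made for this example.

```c
#include <ctype.h>
#include <stdio.h>

/* Prints every decimal digit found in a null-terminated line and
 * returns how many were found. print_digits is a hypothetical name. */
int print_digits(const char *line)
{
    int found = 0;

    for (; *line != '\0'; line++) {
        /* Cast to unsigned char: isdigit is undefined for negative values. */
        if (isdigit((unsigned char)*line)) {
            printf("the character is: %c\n", *line);
            found++;
        }
    }
    return found;
}
```

Since the loop is bounded by the terminator rather than by a fixed count, it never reads past the text that was actually stored in the array.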

An extension of abelenky's post:
abelenky presents 2 (out of many) solutions to the problem. An important aspect of writing code is readability. abelenky's first solution maximizes readability.
if (line[c] >= '0' && line[c] <= '9')
{
printf("the character is: %c\n",line[c]);
}
Most programmers know that ASCII characters are mapped to integer values, but not everyone can readily recall the range of values associated with each type of character (digits, lowercase letters, capital letters, etc.).
This is why C supports the single quotes: ' '
The C standard guarantees that the codes for '0' through '9' are consecutive and increasing, so using '0' and '9' in your condition is both correct and more readable. Adopting a more legible coding style will make life easier for you and for anyone who reads your code.
Happy coding!

How to return the index of ascii char in C

I have a table mapping the 26 ASCII letters of the English alphabet to their corresponding Morse code strings:
typedef struct a_look_tab {
char table[asciiNum][MORSE_MAX+1];
} ALookTab;
The rows are indexed 0 for a, 1 for b, and so on. How do I return the index (an int) of the Morse code entry for a character?
In other words, the function takes the ASCII character to convert as its parameter and returns the index for that character. How do we do that?

The simplest portable way to convert the char to an index is this type of construction:
/* Returns -1 if c is not an upper- or lower-case alphabetical character */
int char_to_index(char c)
{
static const char * const alphabet = "abcdefghijklmnopqrstuvwxyz";
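/* strchr also matches the terminating null byte, so reject c == 0 first */
const char *p = c ? strchr(alphabet, tolower((unsigned char)c)) : NULL;
return p ? (int)(p - alphabet) : -1;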
}
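Tying such an index function to the table from the question might look like the following sketch. ALPHA_COUNT stands in for the question's asciiNum, MORSE_MAX is assumed to be 4, and morse and morse_for are names invented for this example; only the 26 letters are covered.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define ALPHA_COUNT 26   /* hypothetical stand-in for asciiNum */
#define MORSE_MAX   4    /* assumed: longest letter code is 4 symbols */

typedef struct a_look_tab {
    char table[ALPHA_COUNT][MORSE_MAX + 1];
} ALookTab;

/* International Morse code for a..z, in alphabetical order. */
static const ALookTab morse = {{
    ".-",   "-...", "-.-.", "-..",  ".",    "..-.", "--.",  "....", "..",
    ".---", "-.-",  ".-..", "--",   "-.",   "---",  ".--.", "--.-", ".-.",
    "...",  "-",    "..-",  "...-", ".--",  "-..-", "-.--", "--.."
}};

/* Returns the Morse string for a letter, or NULL for anything else. */
const char *morse_for(char c)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    const char *p;

    if (c == '\0')          /* strchr would match the terminator */
        return NULL;
    p = strchr(alphabet, tolower((unsigned char)c));
    return p ? morse.table[p - alphabet] : NULL;
}
```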

You need to convert a character, such as 'a', into its index in the table. According to your specification, the table begins with the Morse code for 'a', so 'a' should map to the index 0, 'b' should map to 1, and so on.
The simplest such mapping could be implemented like this:
int char_to_index(char c)
{
return tolower((unsigned char)c) - 'a';
}
This subtracts the ASCII code for 'a' from the given letter, which will turn 'a' into 0, and so on.
Unfortunately, this only works if the computer running the program encodes the letters of the alphabet using a system that assigns contiguous codes to the letters. Not all computers are like this. A more portable function could do the mapping explicitly, like so:
int char_to_index2(char c)
{
switch(tolower((unsigned char)c))
{
case 'a': return 0;
case 'b': return 1;
case 'c': return 2;
/* and so on */
default: return -1; /* not a letter */
}
}
This is more verbose code-wise, but more portable.
UPDATE: I added calls to tolower() to both functions to make them a bit more robust.

Note that the C standard doesn't require ASCII, and this code won't work under EBCDIC, but 99% of the time this won't matter.
I believe what you're looking for is much simpler than you think. Character literals like 'c' and '0' are actually ints, not chars: they're converted down to char on assignment, and can just as easily be converted back up. So this is what (I think) you want:
#include <ctype.h> // for tolower()
char *func(ALookTab *a, char c)
{
if(isalpha(c))
return a->table[tolower(c) - 'a'];
if(isdigit(c))
return a->table[c - '0' + 26];
// handle special characters
}
Note that this code assumes that your morse code is stored as the 26 alphabet characters, the 10 digits, and then other special characters in whatever order you choose.
