Several years ago at school I had to write a swap function that swaps two integers. I wanted to do this using bitwise operations without using a third variable, so I came up with this:
void swap( int * a, int * b ) {
*a = *a ^ *b;
*b = *a ^ *b;
*a = *a ^ *b;
}
I thought it was good, but when my function was tested by the school's correction program it found an error (of course, when I asked, they didn't want to tell me what it was). To this day I don't know what went wrong, so I wonder: in which cases would this method not work?
I wanted to do this using bitwise operations without using a third variable
Do you mind if I ask why? Was there a practical reason for this limitation, or was it just an intellectual puzzle?
when my function was tested by the school's correction program it found an error
I can't be sure what the correction program was complaining about, but one class of inputs this sort of solution is known to fail on is exemplified by
int x = 5;
swap(&x, &x);
printf("%d\n", x);
This prints 0, not 5.
You might say, "Why would anyone swap something with itself?"
They probably wouldn't, as I've shown it, but perhaps you can imagine that, in a mediocrely-written sort algorithm, it might end up doing the equivalent of
if(a[i] < a[j]) {
/* they are in order */
} else {
swap(&a[i], &a[j]);
}
Now, if it ever happens that i and j are the same, the swap function will wrongly zero out a[i].
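As a minimal sketch of two ways around the aliasing problem (the names swap_guarded and swap_tmp are made up for this illustration, and nothing beyond standard C is assumed):

```c
/* XOR swap with an aliasing guard: when both pointers refer to the same
   object, x ^ x would zero it, so return early instead. */
void swap_guarded(int *a, int *b)
{
    if (a == b)
        return;
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

/* The conventional version: a temporary makes aliasing harmless. */
void swap_tmp(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

With either version, swapping x with itself leaves x at 5. The temporary-variable form is usually the one to prefer, since it needs no special case.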
See also What is the difference between two different swapping function?
I am currently making a program that approximates the Schroedinger equation, and for my initial conditions, my professor said to begin with a gaussian. The formula I'm using for that is this (apologies, I don't know how to do equations in markdown):
p(x) = ( 1/sqrt(2 * PI) ) * e^( -1/2 * (x-u)^2 / o )
I am starting with u=0 and o=1 for simplicity's sake, and so the way I use it in my program is like this:
double gaussian(double x) {
return (1/sqrt(2*M_PI)) * exp((-.5) * pow(x, 2));
}
void initial_conditions(int m, complex *values[], double dx) {
for (size_t i = 0; i < m; i++)
{
values[i]->real = gaussian(i * dx);
}
}
Compiled by: gcc project1.c -lm -o project1
But that produces a segfault every time I have run it. As far as I can tell, it should work, but I am somewhat of a novice to C. I have determined it is specifically that equation that is producing the error by using printf statements to narrow the place of error down, and it always gets to that specific whole formula and return statement and then dies.
Any advice or help would be appreciated.
complex *values[] is weird and unnatural. I can't see the invocation, but I have managed to convince myself that this really should be complex values[].
A complex is far too simple a thing to want to allocate each one individually on the heap; almost always an array of complex would be allocated in a single call to malloc() (or possibly even a stack allocated array by caller).
Going by the names, I can guess with decent confidence that the calling code didn't allocate each individual complex in values but just allocated values itself. The crash is then the -> dereference in values[i]->real, because values[i] is an uninitialized pointer. It's likely you want (carrying forward the single array) values[i].real = ... ; values[i].imag = 0;
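A sketch of the single-array layout, under some assumptions: the asker's complex type is not shown, so cplx below is a hypothetical stand-in with real and imag members, and profile() is a placeholder for gaussian() chosen only so the example compiles without -lm. make_initial_conditions is likewise an invented helper.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the asker's complex type. */
typedef struct {
    double real;
    double imag;
} cplx;

/* Placeholder for gaussian(): any double -> double function fits here. */
static double profile(double x)
{
    return 1.0 / (1.0 + x * x);
}

/* values points at m contiguous cplx objects, so elements are
   accessed with '.', not '->'. */
void initial_conditions(size_t m, cplx values[], double dx)
{
    for (size_t i = 0; i < m; i++) {
        values[i].real = profile((double)i * dx);
        values[i].imag = 0.0;
    }
}

/* One allocation for the whole array; the caller frees it.
   Returns NULL if the allocation fails. */
cplx *make_initial_conditions(size_t m, double dx)
{
    cplx *values = malloc(m * sizeof *values);
    if (values != NULL)
        initial_conditions(m, values, dx);
    return values;
}
```

A caller would write something like cplx *v = make_initial_conditions(m, dx); and later free(v);.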
I was writing C code for selection sort. It worked fine when the swap was done with a third variable, but when I changed the swap to the method without a third variable, as shown in the code comment below, it produces wrong output (zeros at some positions). I cannot figure out why this is happening.
I have tried swapping two numbers without a third variable in another program under the same kind of conditions, and it works fine there. Why not in my selection sort program?
#include<stdio.h>
void selectsort(int * ,int);//selection sort function
int main(){
int a[5];
int i,n=5;
for(i=0;i<5;i++)
scanf("%d",&a[i]);
selectsort(a,n);
printf("Sorted Array is:\n");
for(i=0;i<5;i++)
printf("%d\n",a[i]);
}
/* Below is selection sort function definition*/
void selectsort(int*p ,int q){
int i,j,h,temp;
for(i=0;i<q-1;i++){
h=i;
for(j=i+1;j<q;j++){
if(p[h]>p[j]){
h=j;
}
}
/* below code is to swap the two numbers ( p[i] and p[h]) without
using third variable , but it is NOT WORKING here
(giving wrong output) BUT WORKING IF THIRD VARIABLE IS USED.Why?*/
p[i]=p[i]+p[h];
p[h]=p[i]-p[h];
p[i]=p[i]-p[h];
}
}
Your values of h and i are not guaranteed to be different.
When they are equal, the swap not only fails to swap anything, it also destroys the value: the element ends up as 0.
void selectsort(int*p ,int q){
int i,j,h,temp;
for(i=0;i<q-1;i++){
h=i; // <=== Here you start with identical values
for(j=i+1;j<q;j++){
if(p[h]>p[j]){
h=j; // This may or may not be executed.
}
}
// Here h can still be at same value as i.
// What happens in this case is shown in the comments below:
p[i]=p[i]+p[h]; // p[i]=p[i]+p[i]; ==> p[i] *=2;
p[h]=p[i]-p[h]; // p[i]=p[i]-p[i]; ==> p[i] = 0;
p[i]=p[i]-p[h]; // p[i]=p[i]-p[i]; ==> p[i] = 0;
}
}
You could add something like this before doing the swapping:
if (i==h)
continue;
Note:
Apart from academic cases I would not suggest using such an approach.
Swapping without a temporary variable has lots of downsides:
Only works for integer types
Needs handling for overflow etc.
Needs handling for identical storage locations.
Needs extra arithmetic operations causing more code and longer execution time
Is confusing readers and harder to maintain
It also only has one advantage
Saves stack storage for 1 variable.
If your goal is to confuse readers, then you should search for a version using XOR instead of arithmetic. ;)
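For comparison, here is a sketch of the same selection sort with an ordinary temporary. The name selectsort_tmp and the use of size_t for the length are choices made for this example, not part of the question's code.

```c
#include <stddef.h>

/* Selection sort using a temporary variable for the swap. */
void selectsort_tmp(int *p, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        size_t h = i;                  /* index of the smallest element so far */
        for (size_t j = i + 1; j < n; j++) {
            if (p[j] < p[h])
                h = j;
        }
        if (h != i) {                  /* optional here, mandatory for +/- or XOR swaps */
            int tmp = p[i];
            p[i] = p[h];
            p[h] = tmp;
        }
    }
}
```

With a temporary, the h != i test only saves a redundant self-swap; with the arithmetic or XOR variants it is what keeps already-placed elements from being zeroed.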
I see lots of people use subtraction in a qsort comparator function. I think it is wrong because when dealing with these numbers: int nums[]={-2147483648,1,2,3}; INT_MIN = -2147483648;
int compare (const void * a, const void * b)
{
return ( *(int*)a - *(int*)b );
}
I wrote this function to test:
#include <stdio.h>
#include <limits.h>
int compare (const void * a, const void * b)
{
return ( *(int*)a - *(int*)b );
}
int main(void)
{
int a = 1;
int b = INT_MIN;
printf("%d %d\n", a,b);
printf("%d\n",compare((void *)&a,(void *)&b));
return 0;
}
The output is:
1 -2147483648
-2147483647
but a > b so the output should be positive.
I have seen many books write like this. I think it is wrong; it should be written like this when dealing with int types:
int compare (const void * a, const void * b)
{
if(*(int *)a < *(int *)b)
return -1;
else if(*(int *)a > *(int *)b)
return 1;
else
return 0;
}
I just cannot figure out why many books and web sites write in such a misleading way.
If you have any different view, please let me know.
I think it is wrong
Yes, a simple subtraction can lead to int overflow which is undefined behavior and should be avoided.
return *(int*)a - *(int*)b; // Potential undefined behavior.
A common idiom is to subtract two integer compares. Various compilers recognize this and create efficient well behaved code. Preserving const-ness also is good form.
const int *ca = a;
const int *cb = b;
return (*ca > *cb) - (*ca < *cb);
why many books and web sites write in such a misleading way.
return *a - *b; is conceptually easy to digest - even if it provides the wrong answer with extreme values - often learner code omits edge conditions to get the idea across - "knowing" that values will never be large.
Or consider the complexities of comparing long doubles with regard to NaN.
Your understanding is absolutely correct. This common idiom cannot be used for int values.
Your proposed solution works correctly, although it would be more readable with local variables to avoid so many casts:
int compare(const void *a, const void *b) {
const int *aa = a;
const int *bb = b;
if (*aa < *bb)
return -1;
else if (*aa > *bb)
return 1;
else
return 0;
}
Note that modern compilers will generate the same code with or without these local variables: always prefer the more readable form.
A more compact solution with the same exact result is commonly used although a bit more difficult to understand:
int compare(const void *a, const void *b) {
const int *aa = a;
const int *bb = b;
return (*aa > *bb) - (*aa < *bb);
}
Note that this approach works for all numeric types, but will return 0 for NaN floating point values.
As for your remark: I just cannot figure out why many books and web sites write in such a misleading way:
Many books and websites contain mistakes, and so do most programs. Many programming bugs get caught and squashed before they reach production if the program is tested wisely. Code fragments in books are not tested, and although they never reach production, the bugs they contain do propagate virally via unsuspecting readers who learn bogus methods and idioms. A very bad and lasting side effect.
Kudos to you for catching this! You have a rare skill among programmers: you are a good reader. There are far more programmers who write code than programmers who can read code correctly and see mistakes. Hone this skill by reading other people's code, on stack overflow or from open source projects... And do report the bugs.
The subtraction method is in common use, I have seen it in many places like you and it does work for most value pairs. This bug may go unnoticed for eons. A similar problem was latent in the zlib for decades: int m = (a + b) / 2; causes a fateful integer overflow for large int values of a and b.
The author probably saw it used and thought the subtraction was cool and fast, worth showing in print.
Note however that the erroneous function does work correctly for types smaller than int: signed or unsigned char and short, if these types are indeed smaller than int on the target platform, which the C Standard does not mandate.
Indeed similar code can be found in The C Programming Language by Brian Kernighan and Dennis Ritchie, the famous K&R C bible by its inventors. They use this approach in a simplistic implementation of strcmp() in chapter 5. The code in the book is dated, going all the way back to the late seventies. Although it has implementation defined behavior, it does not invoke undefined behavior in any but the rarest architectures among which the infamous DeathStation-9000, yet it should not be used to compare int values.
You are correct, *(int*)a - *(int*)b poses a risk of integer overflow and ought to be avoided as a method of comparing two int values.
It is possible it could be valid code in a controlled situation where one knows the values are such that the subtraction will not overflow. In general, though, it should be avoided.
The reason why so many books are wrong is likely the root of all evil: the K&R book. In chapter 5.5 they try to teach how to implement strcmp:
int strcmp(char *s, char *t)
{
int i;
for (i = 0; s[i] == t[i]; i++)
if (s[i] == '\0')
return 0;
return s[i] - t[i];
}
This code is questionable since char has implementation-defined signedness. Ignoring that, and ignoring that they fail to use const correctness as in the standard C version, the code otherwise works, partially because it relies on implicit type promotion to int (which is ugly), partially since they assume 7 bit ASCII, and the worst case 0 - 127 cannot underflow.
Further down in the book, 5.11, they try to teach how to use qsort:
qsort((void**) lineptr, 0, nlines-1,
(int (*)(void*,void*))(numeric ? numcmp : strcmp));
Ignoring the fact that this code invokes undefined behavior, since strcmp is not compatible with the function pointer int (*)(void*, void*), they teach to use the above method from strcmp.
However, looking at their numcmp function, it looks like this:
/* numcmp: compare s1 and s2 numerically */
int numcmp(char *s1, char *s2)
{
double v1, v2;
v1 = atof(s1);
v2 = atof(s2);
if (v1 < v2)
return -1;
else if (v1 > v2)
return 1;
else
return 0;
}
Ignoring the fact that this code will crash and burn if an invalid character is found by atof (such as the very likely locale issue with . versus ,), they actually manage to teach the correct method of writing such a comparison function. Since this function uses floating point, there's really no other way to write it.
Now someone might want to come up with an int version of this. If they do it based on the strcmp implementation rather than the floating point implementation, they'll get bugs.
Overall, just by flipping a few pages in this once canonical book, we already found some 3-4 cases of reliance on undefined behavior and 1 case of reliance on implementation-defined behavior. So it is really no wonder if people who learn C from this book write code full of undefined behavior.
First, it's of course correct that an integer overflow during the comparison could create serious problems for you.
On the other hand, doing a single subtraction is cheaper than going through an if/then/else, and the comparison gets performed O(n log n) times on average in a quicksort (O(n^2) in the worst case), so if this sort is performance-critical and we can get away with it, we may want to use the difference.
It will work fine so long as all the values are in some range of size less than 2^31, because then their differences have to be smaller. So if whatever is generating the list you want to sort is going to keep values between a billion and minus one billion then you're fine using subtraction.
Note that checking that the values are in such a range prior to the sort is an O(n) operation.
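One way such an O(n) pre-check might look, as a sketch: subtraction_is_safe is a hypothetical helper that reports whether every pairwise difference in the array fits in an int.

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 if x - y cannot overflow int for any two elements x, y
   of v, i.e. if max - min <= INT_MAX. Single pass over the data. */
int subtraction_is_safe(const int *v, size_t n)
{
    if (n == 0)
        return 1;
    int lo = v[0], hi = v[0];
    for (size_t i = 1; i < n; i++) {
        if (v[i] < lo) lo = v[i];
        if (v[i] > hi) hi = v[i];
    }
    /* hi - lo overflows exactly when lo < 0 and hi > INT_MAX + lo;
       INT_MAX + lo itself cannot overflow when lo is negative. */
    return lo >= 0 || hi <= INT_MAX + lo;
}
```

If the check passes, the subtraction comparator is well defined for that particular array; if it fails, fall back to the comparison-based comparator.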
On the other hand if there's a chance that the overflow could happen, you'd want to use something like the code you wrote in your question
Note that lots of stuff you see doesn't explicitly take overflow into account; it's just that maybe that's more expected in something that's more obviously an "arithmetic" context.
I am new to SO. I have a question that I was asked in an interview and that, for the life of me, I cannot wrap my head around. I can solve it with a while/for loop, but the interviewer specifically asked not to use them. I even discussed it with a few friends of mine, but we were unable to solve it. I would appreciate it if someone could provide pointers.
Question is:
for given array
s[] = {5,1,0,4,2,3}
length of array is not given.
If length of array is 5, content is guaranteed to be between 0 to 5.
There is no repetition of numbers.
Sample example length(s, 3)
- a[3] = 4 , a[4] = 2, a[2] = 0, a[0] = 5, a[5] =3 returns length of 4 .
For given condition write subroutine int length (s, 3) - to find the number of steps it takes to find given value -
Additional conditions
You cannot use any loop statements like for, while and so on -
You cannot use any global or static variables.
You cannot call other routines inside this routine
You cannot modify given function parameters - it stays length (s, n) only
You cannot change original array too
Alternative solution that does not modify the array at all, but hides an extra parameter inside top 16 bits of x:
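int length(int *s, int x){
int start = (x >> 16) - 1; /* start index + 1 lives in the top 16 bits */
int idx = x & 0xFFFF; /* current index lives in the low 16 bits */
if (start < 0)
start = x; /* first call: nothing packed yet */
if (s[idx] == start)
return 0;
return 1 + length(s, ((start + 1) << 16) + s[idx]);
}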
This will fail if there are too many elements in the array, but I suspect any other recursive solution is likely to hit a stack overflow by that point in any case.
I think I found the answer eventually. No, I didn't crack it myself, but I found it online :) Here is the solution:
int length(int * s, int x){
if(s[x] < 0){
return -1;
}
else{
s[x] = -s[x];
int len = length(s, -s[x]);
s[x] = -s[x];
return len + 1;
}
}
I don't think it contradicts any of your conditions. I just didn't pass the array as a parameter (that isn't really a problem; you can change that yourself).
#include <stdbool.h>
int s[] = {5,1,0,4,2,3};
bool col[100]; // marks visited indices, to detect the cycle
int rec(int n)
{
if(col[n])return 0;
col[n]=true;
int sum=0;
sum = 1+rec(s[n]);
return sum;
}
The interviewer is probing your understanding of algorithms and programming paradigms, trying to understand your training, background, and depth. The interviewer has a challenging task: identifying capable developers with minimal evidence. Thus the interviewer presents a constructed problem that (they hope) elicits the desired knowledge (does candidate X know how to solve problem Y, or understand concept Z), perhaps because the interviewer believes the desired answer indicates the candidate knows the expected body of knowledge.
Modern languages provide several repetition structures (commands, statements), some which pre-test (check condition before entering statement-block), and some which post-test (check condition after performing statement block at least once). Here are examples,
Pre-test
while(condition) statement-block
for(initializer;condition;step) statement-block
Post-test
do statement-block while(condition)
repeat statement-block until(condition)
do statement-block until(condition)
These can all be written as conditional (choice) structures with branching (goto),
Pre-test
label:
if(condition)
statement-block;
goto label;
else
nil;
endif
Post-test
label:
statement-block;
if(condition)
goto label;
endif
You can also use recursion, where you call the same function as long as condition holds (or until condition met, depending upon positive or negative logic),
Pre-test
function recurse(args)
if(condition)
statement-block
recurse(revised args);
endif
return
end #function
Post-test
function recurse(args)
statement-block
if(condition)
recurse(revised args);
endif
return;
end
You would learn about recursion in an algorithms, or perhaps a computability course. You would learn about conditional branching in a compiler, high performance computing, or systems class. Your compiler course might examine techniques for detecting 'tail-recursion', and how to rewrite the function call into a loop.
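To make the equivalence concrete, here is a sketch of one small computation (the sum 1 + 2 + ... + n, chosen arbitrarily) written as a pre-test loop, as a conditional branch, and as a tail-recursive function. All three function names are invented, and all assume n >= 0.

```c
/* Pre-test loop. */
int sum_while(int n)
{
    int total = 0;
    while (n > 0) {
        total += n;
        n--;
    }
    return total;
}

/* The same loop expressed as a conditional branch. */
int sum_goto(int n)
{
    int total = 0;
again:
    if (n > 0) {
        total += n;
        n--;
        goto again;
    }
    return total;
}

/* Tail-recursive form: the accumulator carries the running total, so
   the recursive call is the last thing the function does. */
int sum_rec(int n, int total)
{
    if (n > 0)
        return sum_rec(n - 1, total + n);
    return total;
}
```

The recursive version is started as sum_rec(n, 0). Because the call is in tail position, it is the kind of function a compiler can turn back into a loop.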
Here is the problem, restated,
given array, s[] = {5,1,0,4,2,3}
array length unknown
content between [0 .. length], not repeated, no duplicates
write subroutine which provides the number of steps to find given value
That is,
int length( array s, int member ) --> position
Examine the conditions (constraints) on the problem,
Array length unknown - Solution must work for variable range of inputs
Cannot use loop statements (for, while, etc) - This suggests either the interviewer wants conditional branch or recursion.
Cannot use global or static variables - Does this suggest interviewer wants a recursive/functional-programming solution? Conditional-branch also provides this.
Cannot call other routines inside this routine - Does interviewer mean functions other than same function, or call any function (what does interviewer mean by 'other').
Cannot modify function parameters, stays length(s,n) - Declaring local (stack) variables is allowed. This could mean pass by value, make a local copy, etc. But not destructive modifications.
Cannot change original array - Definitely no destructive modifications. Possible 'hint' (ok to make local copy?), or further indication that you should use conditional-branch?
Here are two solutions, and a test driver (note, I have named them lengthi, iterative, and lengthr, recursive).
#include <stdio.h>
/* conditional branch */
int lengthi( int s[], int member )
{
int position=0;
AGAIN:
if( s[position] == member ) return(position);
++position;
goto AGAIN;
return(-1);
}
/* recursive */
int lengthr( int s[], int member )
{
if( s[0] == member ) return(0);
return( 1+lengthr(s+1,member) );
}
int
main(int argc,char* argv[])
{
int s1[] = {0,1,2,3,4,5,6,7,8,9};
int s2[] = {1,2,3,4,9,8,7,6,0,5};
int s3[] = {2,4,6,8,0,1,3,5,7,9};
printf("%d at %d\n",3,lengthr(s1,3));
printf("%d at %d\n",3,lengthr(s2,3));
printf("%d at %d\n",3,lengthr(s3,3));
printf("%d at %d\n",3,lengthi(s1,3));
printf("%d at %d\n",3,lengthi(s2,3));
printf("%d at %d\n",3,lengthi(s3,3));
}
Since we are supposed to find the number of steps (iterations, function calls), that is asking for the ordinal position in the list, not the C index (zero based) position.
This is an interview question, and not a programming problem (per se), so probably better suited for the Programmers.stackexchange site. I might give the interviewer an entertaining answer, or their desired answer.
I have a program that I'm trying to decode. It is translated to C from another language (whose name is not spoken here), and as I want to understand how it works, I am slowly rewriting the code and simplifying it to use all the nice logical constructs C has to offer.
The following little bit keeps popping up in my code, with varying values of X and Y:
ptr[X]--;
while(ptr[X])
{
ptr[X]--;
ptr += Y;
}
ptr is of type char *, and I can't really make assumptions about the state of the array at any point because it's pretty deeply embedded in loops and dependent on input and output. I can successfully "simplify" that to:
for(ptr[X]--; ptr[X]; ptr[X]--, ptr += Y);
But that's just awful. Ever so slightly better is:
for(ptr[X]--; ptr[X]; ptr += Y) ptr[X]--;
If anyone can come up with a better simplification of the above code, I would greatly appreciate it. This occurs in no fewer than five places and is impairing my ability to simplify and understand the flow control, so if anyone can provide a more concise/readable version, that would be awesome. If anyone can just offer any sort of fancy insight into that code, that would be awesome too, although I basically understand what it does.
Insight into the code for a specific X and/or Y can also help. Y tends to be between -2 and 2, and X is usually 1, for what it's worth.
ptr[X] is equivalent to *(ptr + X), so we can rewrite it as follows:
for((*(ptr + X))--; *(ptr + X); (*(ptr + X))--, ptr += Y);
Now there's a lot of redundancy here, so we can simplify this to:
char *ptr_plus_x = ptr + X;
for((*ptr_plus_x)--; *ptr_plus_x; (*ptr_plus_x)--, ptr_plus_x += Y);
Then we can get rid of ptr_plus_x entirely:
ptr += X;
for((*ptr)--; *ptr; (*ptr)--, ptr += Y);
In English: we decrement the memory location at offset X. Then, as long as the current location is nonzero, we decrement it once more and step Y locations onward, so we visit offsets X, X+Y, X+2Y, X+3Y, ... Note that inside the loop the test for 0 happens before the decrement, so every location after the first is tested as found. The loop therefore stops at the first location in that sequence that already holds 0 (or immediately, if the location at X held 1 to begin with). Each location passed over is decremented once, except the first, which is decremented twice.
If Y is 1, then we decrement a run of consecutive memory locations going forwards, stopping at the first 0. If Y is -1, the same thing happens, but searching backwards from offset X. If Y is 0, the pointer never moves and the loop simply keeps decrementing the location at X until it reaches 0. If Y is any other value, the search pattern skips various entries.
It's not a very intuitive function, so I can see why you're confused.
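One way to get a feel for the snippet is to wrap it in a helper and run it on a small known buffer. This is only a sketch: dec_scan is a hypothetical name, and the loop body is copied from the question unchanged.

```c
/* Hypothetical wrapper around the loop from the question. Returns the
   final value of ptr so the caller can see how far it moved. */
char *dec_scan(char *ptr, int X, int Y)
{
    ptr[X]--;
    while (ptr[X]) {
        ptr[X]--;
        ptr += Y;
    }
    return ptr;
}
```

For example, starting from the buffer {3, 2, 0, 9} with X = 0 and Y = 1, the buffer ends up as {1, 1, 0, 9} and the returned pointer sits two cells past the start.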
I'll throw in:
ptr[X]--;
while (ptr[X]--) ptr+=Y;
first evaluate, then decrement (for while condition, that is)
Edit: OK, i'll hate myself in the morning. Goto's are ok at this level, right?
dec: ptr[X]--;
while (ptr[X]){
ptr+=Y;
goto dec;
}
(i honestly dont know whether to leave this or not.)
EDIT2: so, how about this one? (tcc didn't complain)
while (ptr[X]--?ptr[X]--,ptr+=Y:0){}
EDIT 2 1/2;
//longshot
while (ptr[X]--?ptr[X]--,ptr+=Y, ptr[X]:0){}
If all else fails..
EDIT3: Last one for tonight.
while (ptr[X]--?ptr[X]--,ptr+=Y:0){
if (!ptr[X]) break;
}//good luck with this, it has been very amusing.
The website for it-which-shall-not-be-named states:
The semantics of the it-which-shall-not-be-named commands can also
be succinctly expressed in terms of C, as follows (assuming that p has
been previously defined as a char*):
> becomes ++p;
< becomes --p;
+ becomes ++*p;
- becomes --*p;
. becomes putchar(*p);
, becomes *p = getchar();
[ becomes while (*p) {
] becomes }
So it seems like it should be fairly easy to convert it over to C.
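As an illustration of the table, here is a hand translation of the tiny program ++>+++[<+>-]< which adds 2 and 3. The function name and the 8-cell tape are arbitrary choices for this sketch.

```c
/* Hand translation of "++>+++[<+>-]<" using the table above.
   Leaves the sum of 2 and 3 in the first cell and returns it. */
int add_two_and_three(void)
{
    char tape[8] = {0};
    char *p = tape;

    ++*p; ++*p;           /* ++   cell 0 = 2 */
    ++p;                  /* >    */
    ++*p; ++*p; ++*p;     /* +++  cell 1 = 3 */
    while (*p) {          /* [    */
        --p;              /* <    */
        ++*p;             /* +    */
        ++p;              /* >    */
        --*p;             /* -    */
    }                     /* ]    */
    --p;                  /* <    */
    return *p;
}
```

Each pass of the while loop moves one unit from cell 1 to cell 0, so the loop ends with cell 1 at 0 and cell 0 holding 5.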
EDIT: Here is the Hello World BF converted to C++.
It's quite simple as is, already. Instead of trying to write fewer statements, I would rather try to grasp the intent and add some comments.
An example of 'a' meaning of the snippet: decrease all elements of a column (X) of a matrix of Y columns. You would need that to draw a vertical line of +'ses, for instance, in a language that has no direct assignment.
You could clarify this meaning by showing the indices directly:
// set elements of column to cGoal
for( int decrementsToGoal = cGoal; decrementsToGoal != 0; --decrementsToGoal ) {
// decrease all elements of column X
for( int row = cMaxRows; M[ row*matrixsizeY + columnX ]; --row ) {
--M[ row*matrixsizeY + columnX ];
}
}
Good luck :)