C treating variable declared to be a double as an int

Question

I am trying to solve a problem posed in this question which asks that a program be written that calculates logarithms of numbers without including <math.h>.

I wrote the following program:

#include <stdio.h>

#define ln2 0.693147180559945

double power (double argu, int expo)
{
    double power = 1;
    for (int i = 0; i < expo; i++) { power *= argu; }
    return power;
}

double log_2 (double argu)
{
    int i = 0;
    for (; (double)(1 << i) <= argu; i++) {}
    i -= 1;
    double who = (double)i, fra = 0;
    for (int j = 1, k = 1; k < 450; j *= -1, k += 1)
    {
        fra += ((double)j / (double)k) * (power (((argu - (double)(1 << i)) / (double)(1 << i)), (double)k));
    }
    return who + (fra / ln2);
}

double log_b (unsigned int base, double argu)
{
    return ((log_2(argu)) / (log_2(base)));
}

int main (void)
{
    for (;;)
    {
        unsigned int base;
        double argu;

        printf ("Give me a nonzero positive integer base, b = ");
        if (scanf ("%u", &base) != 1) { printf ("Error!!"); break; }
        printf ("Give me a nonzero positive rational argument, x = ");
        if (scanf ("%lf", &argu) != 1) { printf ("Error!!"); break; }
        printf ("log_b(x) = %.12lf\n\n", log_b (base, argu));
    }
    return 1;
}

I used the following method; approximated the log base 2 using left shift and Taylor expanded around that approximation, see function log_2(); then changed the base from 2 by applying the fact that ratios of logarithms are base independent, see function log_b().

My programme works for other numbers but give it an argument larger than a billion, it hangs. I suspect, as 2^32 roughly equals 10⁹, the computer is treating the variable double argu declared in the main() as an int instead of a double. Is my suspicion correct? If yes, then why is a double being treated as an int? On the other hand, if not, then why is the program not accepting arguments larger than a billion, a double should be able to accommodate numbers smaller than or equal to 2⁵³ right?

for (; (double)(1 << i) <= argu; i++) {} should be for (; (double)(1ULL << i) <= argu; i++) {} so you don't hit int limits — Iłya Bursov
– Iłya Bursov, Commented Sep 20 at 11:45
unsigned long long is 2^64 - 1 max, which means that i is at most 63. DBL_MAX_EXP, however, is 1024. The only way not to hit the limit is to eschew integral types altogether. — Nil Admirari
– Nil Admirari, Commented Sep 20 at 12:16
@chux I tested the program using 10 as base and powers of 10s as argument. I was incrementing k by 50 each time until all the 12 digits after decimal point of the result disappeared. It turned out all the arguments converge within 50 iterations of the for loop except multiples of 3, like 1000, 1000000 and 1000000000. Especially 1000 takes 450 iterations in order to disappear all the 12 digits after decimal point in result. This problem has been solved by chqrlie in his answer below. Now the for loop doesn't need to be this long any longer and converges within 35 iterations. — uran42
– uran42, Commented Sep 21 at 2:15
@uran42, Thanks for that info. If magic numbers are determined via testing, be sure to test with floating point values in the extreme range, including sub-normals and min/max values. The "all the 12 digits" itself is a magic number, why 12 when for common double 17 significant digits is a better goal given the common double encoding of 53 binary digits. Instead I'd recommend to analyze the code and the double encoding and instead of 450, 35, etc, and use constants from <float.h> or run time tests to drive the loop count. IAC, glad you found a better approach. — chux
– chux, Commented Sep 21 at 3:54

chqrlie · Accepted Answer · 2025-09-21 07:21:31Z

6

Your interpretation for the expression (double)(1 << i) <= argu is incorrect: 1 << i is evaluated using integer arithmetics, because << requires integer operands and it uses int because the constant 1 has int type. Hence if the result does not fit in an int or if the shift amount is greater or equal to the width of type int, 32 on most current architectures, the behavior is undefined.

You can use a larger type to accommodate up to 2⁶³: 1ULL can be shifted left at least 63 positions.

Yet your approach is still flawed for numbers less or equal to 1.0 where you try and compute 1 << -1 and for numbers larger or equal to 2⁶⁴ that will still cause an infinite loop.

Instead of mixing integer and floating point arithmetics, you can just use double arithmetics:

double log_2(double argu) {
    double who = 0;
    while (argu >= 2) {
        who += 1;
        argu /= 2;
    }
    double x = argu - 1;  // x > -1.0 && x < 1.0
    // Compute log(1+x) using the Taylor series first discovered by Nicolaus Mercator
    double frac = 0;
    for (int j = 1, k = 1; k < 450; j = -j, k++) {
        frac += (double)j / (double)k * power(x, k);
    }
    return who + frac / ln2;
}

Note that this function can be simplified, computing the power of -x incrementally, thereby removing the need for function power():

double log_2(double argu) {
    double who = 0;
    while (argu >= 2) {
        who += 1;
        argu /= 2;
    }
    double x = argu - 1;  // x > -1.0 && x < 1.0
    // Compute log(1+x) using the Taylor series first discovered by Nicolaus Mercator
    double frac = 0;
    double xk = x;
    for (int k = 1; k < 450; k++) {
        frac += xk / k;
        xk *= -x;
    }
    return who + frac / ln2;
}

Note however that the loop can be exited after much less than 450 iterations in most cases, but if the argument is too close to the base value, it does not converge fast enough for 450 to suffice. Try 2 and 1.999 and you get 1.000302003013 which is obviously wrong!

To mitigate this problem, we can adjust the value of argu so x is in the range [-1/3 : 1/3]: this ensures quicker convergence of the loop in less than 40 iterations (less than 33 in my tests). We can also add a test to break from the loop if frac does no changes.

Here is a faster and more reliable version:

double log_2(double argu) {
    double who = 0;
    while (argu < 0.667) {
        who -= 1;
        argu *= 2;
    }
    while (argu > 1.334) {
        who += 1;
        argu /= 2;
    }
    double x = argu - 1;  // x >= -0.33 && x <= 0.33
    // Compute log(1+x) using the Taylor series
    // first discovered by Nicolaus Mercator
    double frac = 0;
    double xk = x;
    for (int k = 1; k < 40; k++) {
        double frac0 = frac;
        frac += xk / k;
        xk *= x;
        if (frac == frac0)
            break;
    }
    return who + frac / ln2;
}

edited Sep 21 at 7:21

answered Sep 20 at 13:46

chqrlie

153k12 gold badges146 silver badges232 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

uran42 Sep 20 at 14:34

What should be done about the convergence issue? Multiples of 3s like 3,6 and 9 cause problems. Other numbers converge with as little of 50 iterations of the for loop, but the number 3 requires 450 iterations.

chqrlie Sep 20 at 17:40

I posted a simple fix to ensure quicker convergence.

Nil Admirari · Accepted Answer · 2025-09-20 11:37:41Z

2

The problem is in the loop:

for (; (double)(1 << i) <= argu; i++) {}

A simple program to show what the loop is actually computing:

#include <stdio.h>

int main() {
    double argu = 9'000'000'000;
    printf("%f\n", argu);

    int i = 0;
    while (true) {
        double pow = (double) (1 << i);
        printf("%d: %f\n", i, pow);
        if (pow > argu || i >= 40) {
            break;
        }
        ++i;
    }
}

Here's the output:

9000000000.000000
0: 1.000000
1: 2.000000
2: 4.000000
3: 8.000000
4: 16.000000
5: 32.000000
6: 64.000000
7: 128.000000
8: 256.000000
9: 512.000000
10: 1024.000000
11: 2048.000000
12: 4096.000000
13: 8192.000000
14: 16384.000000
15: 32768.000000
16: 65536.000000
17: 131072.000000
18: 262144.000000
19: 524288.000000
20: 1048576.000000
21: 2097152.000000
22: 4194304.000000
23: 8388608.000000
24: 16777216.000000
25: 33554432.000000
26: 67108864.000000
27: 134217728.000000
28: 268435456.000000
29: 536870912.000000
30: 1073741824.000000
31: -2147483648.000000
32: 1.000000
33: 2.000000
34: 4.000000
35: 8.000000
36: 16.000000
37: 32.000000
38: 64.000000
39: 128.000000
40: 256.000000

An int has 32 bits, shifting by more than that triggers undefined behaviour:

For signed lhs with nonnegative values, the value of LHS << RHS is LHS * 2RHS if it is representable in the promoted type of lhs, otherwise the behavior is undefined.

answered Sep 20 at 11:37

Nil Admirari

4193 silver badges10 bronze badges

5 Comments

Tom Karzes Sep 20 at 13:08

You should avoid using library function names (e.g. pow) as variable names. It's just asking for problems.

Nil Admirari Sep 20 at 13:35

pow is local to the while-loop and isn't polluting a global namespace or anything. The header in which pow() is declared isn't even included.

ad absurdum Sep 20 at 16:31

@NilAdmirari -- yes, but it's a bad habit, bad style, and can cause problems in code that lives longer than a toy example. You should at least include a disclaimer so that learners don't think that this is an ok thing to do in real code.

Nil Admirari Sep 21 at 13:03

The code in this example contains undefined behaviour. A quote from the standard is given, with a bold emphasis for learners to notice. This code shouldn't be used in production at all, since the problems with it are way more serious than just bad style.

ad absurdum Sep 21 at 13:38

@NilAdmirari -- good that you pointed out potential undefined behavior. Yet the code here still models bad style, hence the comments. Also note that you haven't provided a quote from the Standard, but from cppreference.com; not a bad reference, but not the Standard and inaccuracies are not unknown there. One more inaccuracy in your answer: "an int has 32 bits". An int must be at least 16 bits wide; 32 bits is common (it was not always so), but INT_WIDTH may be even more.

Collectives™ on Stack Overflow

C treating variable declared to be a double as an int

2 Answers 2

3 Comments

5 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

5 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related