November 25, 2013 will

Finding the first bit set with Python

Here's a Python gotcha that I spent some time tracking down. I'm writing it up in the spirit of saving developers a debugging headache in the future.

I had an integer with a single bit set, and I wanted to find the index of that bit. For example, 4 in binary is 00000100. The 1 is at the third position from the right, which should give an index of 2 – since the first position is 0.

You can do this in two ways; either check each bit in turn until you find a 1, or you can use math as a shortcut. I chose the math solution:

>>> import math
>>> myint = 4
>>> int(math.log(myint, 2))
2

Simple right? Finally staying awake in high school maths paid off. So simple that it was the last bit of code I suspected to be broken (spoiler: it was).

This is Python 2.7 which still has two types of integer; type int and arbitrary long integer type long. I was testing with ints because that's what you get when you type 4. However the numbers I was getting out of the Django db where longs. Look what happens with the above code when you use 4L rather than 4:

>>> myint = 4L
>>> int(math.log(myint, 2))
1

And that was the result of my headache. Longs and ints are generally interchangeable. But not in this case. math.log gives a different result with long and ints. Which we can see here.

>>> math.log(4, 2)
2.0
>>> math.log(4L, 2)
1.9999999999999998

That tiniest of rounding errors for the long version would be insignificant for most applications, but not if you are converting to an integer and discarding the fractional part. The fix is simple. Round the return value of math.log to the nearest whole.

>>> int(round(math.log(4L, 2)))
2

If you are working with Python 3, this problem goes away. Another reason to migrate if you have the option!

Use Markdown for formatting
*Italic* **Bold** `inline code` Links to [Google](http://www.google.com) > This is a quote > ```python import this ```
your comment will be previewed here
gravatar
Walter Dörwald
The following is even simpler:

 >>> 4 .bit_length() - 1
2

This works in Python 3 and Python 2.7 (with both ints and longs).
gravatar
orzel

Finding the first bit is an operation inherently belonging to “discrete math”, and it feels really weird/bad to use continuum math to do so.
Just copy/paste some python ffs() example on the web (1s google search example : http://stackoverflow.com/questions/5520655/return-index-of-least-significant-bit-in-python).
Those are really quick, don't worry about optimization.

@walter: bit_length() is not what the author asks. Your example only works because there's only one bit set in ‘4’.
gravatar
Craig McQueen
The proposed solution doesn't seem to work well if multiple bits are set. E.g.:

 int(round(math.log(6L, 2)))

That should return 2, but it returns 3.
gravatar
Peter Hansen
@orzel: your response to walter's suggestion seems wrong. Using bit_length()-1 with any of 4, 5, 6, or 7 produces the same result. And the description of that routine (“number of bits necessary to represent self in binary”) does seem exactly like what one would need (less 1) to answer the original question.
gravatar
Peter Hansen
Followup to my comment above: oh… unless you're not trying to find the index of the *most significant bit* that's set. For some reason that's how I read the problem. It would work for that case, or for the case where only one bit is set (i.e. the stated problem). It would not work for the cast where you want to find the index of the least significant bit that's set, so if that was the real question then bit_length() wouldn't be right.
gravatar
Will McGugan
Just to clarify, for my particular purposes there is only ever one bit set. So bit_length works well for this case.