CS 100 Binary representation of integers

There are 3 main methods for representing integers in binary. Only one of these
methods has been in use for the last 20-25 years, however, for reasons which we
will explain below. All three methods represent positive integers in exactly the
same way--they differ only in the representation of negative numbers and of 0.

Binary means "base 2". Valid digits are 0 and 1; in the familiar base 10, valid digits are 0-9.
375 in base ten means three hundred and seventy-five, of course--3 hundreds, 7 tens, and 5 1's,
or 3 x 10 to the power 2 (i.e. 100) plus 7 x 10 to the power 1 (i.e. 10) plus 5 x 10 to the power 0
(i.e. 1) or 3x100 + 7x10 + 5x1. 8375 would add 8 x 10 to the power 3 (i.e. 1000) to 375. Binary
numbers are done in a very similar manner--but remember that only the digits 0 and 1 are
used. 01101 in base 2 means 0x2 to the power 4 (i.e. 16) plus 1x2 to the power 3 (i.e. 8) plus
1x2 to the power 2 (i.e. 4) plus 0x2 to the power 1 (i.e. 2) plus 1x2 to the power 0 (i.e. 1):
0x16 + 1x8 + 1x4 + 0x2 + 1x1 = 8 + 4 + 1 = 13 in the familiar decimal base 10 system.

Computers store values in 2 primary ways. The first is as simple characters--e.g. the
character 2 rather than the number 2. You may think "what's the difference?" But think of the value
2 in your computer or pocket calculator as compared to the character 2 in this sentence or
the character 2 in an address such as 1122 Volunteer Boulevard. The 1 and 2 characters in
that address are not intended to be manipulated--we do not intend to do multiplication or
addition, etc, on them, any more than we would want to do an addition on the character V
in that address. On the other hand, suppose we need the value 2--we want to multiply
something by the value 2. Then we need a representation of the value 2 that can have
arithmetic operations performed on it. This second kind of representation is called
"internal binary" and is stored in the computer's memory as a string of 0's and 1's.
Integers are normally stored in 32 bits (i.e. 32 0's and 1's). 2 would be stored as the
following (I've grouped the bits in blocks of 8 bits): 00000000 00000000 00000000 00000010.
Following the method shown above, this is 0x2 to the power 31 plus 0x2 to the power 30,
etc., plus 1x2 to the power 1 (i.e. 1x2 = 2) plus 0x2 to the power 0. Only the 1x2 to the
power 1 contributes anything to the total, hence the above 32-bit number is how the
value 2 gets stored.
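
The positional arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function names binary_to_decimal and to_32_bits are made up for this example, and Python itself is an assumption (the notes do not name a language).

```python
def binary_to_decimal(bits):
    """Sum digit x 2**position, with position 0 at the rightmost digit."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total


def to_32_bits(value):
    """Show a nonnegative value as 32 bits, grouped in blocks of 8."""
    bits = format(value, "032b")
    return " ".join(bits[i:i + 8] for i in range(0, 32, 8))


print(binary_to_decimal("01101"))  # 13
print(to_32_bits(2))               # 00000000 00000000 00000000 00000010

# Characters versus values: the character 2 is not the number 2.
print("2" + "2")                   # 22 (the characters are joined, not added)
print(2 + 2)                       # 4  (the values are added)
```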

Representation can be done as "signed" or "unsigned" values. "Unsigned" means that
the value is nonnegative--i.e. positive or zero. Negative numbers (e.g. -15) cannot be
represented as unsigned values. "Signed" means that we can have both positive and
negative numbers, plus the value zero. The computer must always know how a value
is stored--if the storage is misinterpreted, all kinds of problems can result. All
negative values (i.e. we're talking about "signed" values) always have the leftmost bit
in the representation a 1--this is called the "sign bit". For positive numbers, the sign
bit is always 0. For signed values, you must specify how many bits are in the
representation--in computers, this is usually 32 bits, but may be 16, and occasionally
64 or 128 bits. Why is this specification important? Because for signed values you
must always know which bit is the sign bit. To illustrate--suppose we have 1101. We
saw above that 01101 = 13. But if the leftmost bit of 1101 is the sign bit, then we
have a negative value--and that negative value is NOT -13! For the
examples below we will use 6-bit signed values to help simplify things. Using 32-bit
signed values would bog us down in worrying about too many bits and would get us
too far away from the point of the discussion.
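
To make the 1101 example concrete, here is a hedged sketch that reads a bit string under each of the three methods described in the sections below. The function name read_signed and the method labels are invented for this illustration. Whichever method is used, the 4-bit pattern 1101 never comes out as -13.

```python
def read_signed(bits, method):
    """Interpret a bit string whose leftmost bit is the sign bit."""
    n = len(bits)
    unsigned = int(bits, 2)
    if bits[0] == "0":
        return unsigned                   # positive: same in all three methods
    if method == "sign-and-magnitude":
        return -int(bits[1:], 2)          # remaining bits are the magnitude
    if method == "ones-complement":
        return unsigned - (2 ** n - 1)    # same as flipping the bits and negating
    if method == "twos-complement":
        return unsigned - 2 ** n
    raise ValueError("unknown method: " + method)


print(read_signed("01101", "twos-complement"))     # 13 (sign bit is 0)
print(read_signed("1101", "sign-and-magnitude"))   # -5
print(read_signed("1101", "ones-complement"))      # -2
print(read_signed("1101", "twos-complement"))      # -3
```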

Binary addition. The computer adds binary (not decimal) values, but the same
principles apply. Think how you would add 465 + 138:

     465
   + 138
   -----
     ???
You would start with the rightmost column--the unit's column--and add 5 and 8
to get 13. But 13 is not a single digit--it is 1x10 + 3x1. You write 3 in the unit's
position in the answer and carry the 1 over to the 10's column. You now add 1 (the
carry-in, as it's called) + 6 + 3 = 10 = 1x10 + 0x1. You write down 0 in the 10's
position and carry the 1 to the hundred's column. 1 (the carry-in) + 4 + 1 = 6
and there is a "carry-out" of 0 to the thousands column, so you're done--the
result is 603. In binary addition, let's add 001101 (=13) and 000101 (=5):

     001101
   + 000101
   --------
     ??????
You start with the rightmost column. 1 + 1 = 2--but this is 2 in base 10. 2 in
base 10 is 10 in binary (why?!). DO NOT say to yourself that this is TEN in
binary--it's one-zero, or 1x2 to the power 1 plus 0x2 to the power 0 = 2 + 0
= 2 in base 10. Just as with adding decimal numbers, when the added total
is equal to or greater than the base, whether the base is 2, 10, or whatever,
you're going to have to carry into the next column. So we write down a 0
in the rightmost column and carry a 1 to the next column. In the next column--
the 2's column (in decimal, this would be the 10's column) we add 1 (the carry-in)
+ 0 + 0 = 1 with a carry-out of 0. In the 4's column we add 0 (the carry-in) + 1 + 1
= 2, so we write down 0 and the carry-out is 1. If the carry-in had been 1 we
would have added 1 + 1 + 1 = 3. 3 = 1x2 to the power 1 plus 1x2 to the power 0 =
2 + 1, so we would have written a 1 and the carry out would have been a 1. In
the 8's column we add 1 (the carry-in from the 4's column) + 1 + 0 = 2, so we
write down a 0 and carry a 1 to the 16's column. In the 16's column, we add
1 + 0 + 0 = 1, with a carry-out of 0. Therefore:

     001101
   + 000101
   --------
     010010    1x2 to the power 4 (i.e. 16) plus 1x2 to the power 1 (i.e. 2) = 18.

13 + 5 = 18, so this checks out.
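
The column-by-column procedure just described can be sketched as follows. This is an illustrative sketch only: the name add_binary is made up, and the function assumes both bit strings have the same length.

```python
def add_binary(a, b):
    """Add two equal-length bit strings column by column, right to left.

    Returns (result_bits, carry_out), where carry_out is the carry
    leaving the leftmost column.
    """
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry   # 0, 1, 2, or 3
        result.append(str(total % 2))             # digit written in this column
        carry = total // 2                        # carry into the next column
    return "".join(reversed(result)), carry


print(add_binary("001101", "000101"))  # ('010010', 0), i.e. 13 + 5 = 18
```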
----------------------------------------------------------------

Sign-and-magnitude representation. 001101 is a positive number (the sign bit is 0)
and represents 13 in 6 bits. To obtain the value -13, we start with +13 (001101) and
change the sign bit to 1: 101101 (= -13 in sign-and-magnitude rep). This is the
easiest of the 3 representations to understand. Getting a negative value is very easy.
But no computer nowadays (as far as I know) uses this representation. Why? Speed
in carrying out arithmetic operations is vitally important, and using this form of
representation would slow a computer down dramatically when it does arithmetic.
Adding two positive or two negative values is not too bad. But suppose we want to
add a positive value to a negative one--e.g. 13 + (-5). Think of what you would do
yourself with decimal numbers. You were asked to do the following addition when
you were in the 2nd (3rd?) grade.   We will also show the equivalent in binary.

      13              001101 = +13 sign-and-magnitude
  + (-5)            + 100101 =  -5 sign-and-magnitude
  ------            --------
      ??              ??????
Adding +13 and -5 isn't as straightforward as adding +13 and +5. What did you do
all those years ago? Panic!!!!??? When you added a positive and a negative value
together, you would have to do a subtraction rather than an addition. How did you
know that you couldn't do a nice straightforward addition? When you saw that the
signs of the two values were different.   You knew you had to do a subtraction. But
subtract what from what? You learned that you had to subtract the smaller
absolute value (i.e. 5) from the larger absolute value (i.e. 13) and that the
result takes the sign of the number with the larger absolute value. So you did
13 - 5 = 8. If the teacher asked you to add (-13) + 5 you would have subtracted 5
from 13 to get 8, and then made the final result -8. The same kind of problems occur
when adding a positive and a negative sign-and-magnitude binary number. So, in full:
1) Do the two values have the same sign?
   a) YES. Add the two binary numbers--but NOT the column with the sign bit.
        Make the sign bit of the result the same as the sign bit of the two numbers.
   b) NO.  Compare the absolute values of the two numbers (i.e. everything but
        the sign bits).
        1) If the first value is smaller in absolute value than the second, subtract it
             from the second value. Attach the sign of the second value to the result.
        2) If the first value is greater than or equal to the second in absolute value,
             subtract the second value from the first and attach the sign of the first
             value to the result.
>>>> note: binary subtraction works just like decimal subtraction--you have
to be able to "borrow" from the column to the left <<<<
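
The steps above can be sketched in Python to show how much decision-making this takes. This is a hedged illustration, not how any real hardware did it: the name add_sign_magnitude is invented, the magnitudes are added and subtracted with ordinary Python integers rather than bit by bit with borrows, and the error raised when the result does not fit is an added assumption.

```python
def add_sign_magnitude(a, b):
    """Add two sign-and-magnitude bit strings of equal length."""
    width = len(a) - 1                       # bits available for the magnitude
    sign_a, mag_a = a[0], int(a[1:], 2)
    sign_b, mag_b = b[0], int(b[1:], 2)

    if sign_a == sign_b:
        # Step 1a: same sign, so add the magnitudes and keep that sign.
        sign, magnitude = sign_a, mag_a + mag_b
    elif mag_a < mag_b:
        # Step 1b(1): subtract the first from the second, keep the second sign.
        sign, magnitude = sign_b, mag_b - mag_a
    else:
        # Step 1b(2): subtract the second from the first, keep the first sign.
        sign, magnitude = sign_a, mag_a - mag_b

    if magnitude >= 2 ** width:
        raise OverflowError("magnitude does not fit in the available bits")
    return sign + format(magnitude, "0{}b".format(width))


print(add_sign_magnitude("001101", "100101"))  # 001000, i.e. 13 + (-5) = +8
print(add_sign_magnitude("101101", "000101"))  # 101000, i.e. (-13) + 5 = -8
```

Note the comparisons and branches needed before any arithmetic can start. That extra work is the overhead discussed in these notes.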

There is a further complication: 000000 = 0 (zero). What does 100000 represent?
Minus zero, which is the same as 0. So we have two different ways of representing
the value 0. Adding a positive and a negative number together
was a complication for you back in the 2nd grade--and it's a worse problem
for the computer. It adds a LOT of overhead to addition. It also adds
complications to subtraction--you can subtract one decimal number from
another--but suppose you try to subtract a negative number from a positive
one? Right! You're back to just doing addition again. Sign-and-magnitude
representations are easy to understand--but they slow down arithmetic
operations in the computer.
------------------------------------------------------

One's complement representation. This representation is no longer used--
but it does make arithmetic easier for the computer than sign-and-magnitude
does. To get -13 in 1's complement, first start with +13 = 001101.
Then you simply flip the bits--0's become 1's, 1's become 0's, so that
-13 = 110010. Note that this has made the sign bit a 1--indicating a
negative number. REMEMBER!!! For positive numbers, all three methods
give exactly the same representation--but negative values differ. So let's
add +13 + (-5):

     001101 = +13
   + 111010 = -5
   --------
     ??????
The good news is that unlike sign-and-magnitude, we do not have to worry
about different signs and we do not have to worry about which absolute
value is larger. In fact we can just go ahead and add bits--and unlike S+M
we include the sign bit in the addition.

     001101 = +13
   + 111010 = -5
   --------
     000111   with a carry-out from the sign bit column of 1.

We note that the result--judging from the sign bit--is positive, which we
expect. That's a good sign (pun intended). But 000111 = +7, which is not
so good--the answer should be +8. Well, the complication with 1's
complement is that we now need to add the carry-out from the sign bit
back into the result:

     000111
   +      1
   --------
     001000 = +8

This is the correct result. If the carry-out had been 0, adding 0 to the result
would not change the result. Arithmetic is MUCH faster with 1's complement than
with S+M. Note also: 000000 = 0 and 111111 = -0, so we again have two different
representations of the value 0.
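
Here is a short sketch of 1's complement negation and addition, including the step of adding the carry-out back in. The names ones_complement and add_ones_complement are made up for this example, and the sketch uses Python integers and bit masks as a shortcut instead of walking the columns one at a time.

```python
def ones_complement(bits):
    """Flip every bit: 0 becomes 1 and 1 becomes 0."""
    return "".join("1" if bit == "0" else "0" for bit in bits)


def add_ones_complement(a, b):
    """Add two 1's complement bit strings of equal length, sign bit included."""
    width = len(a)
    total = int(a, 2) + int(b, 2)
    carry_out = total >> width          # carry out of the sign bit column
    total &= (1 << width) - 1           # keep only the low `width` bits
    total += carry_out                  # add the carry-out back in
    return format(total, "0{}b".format(width))


print(ones_complement("001101"))                 # 110010, i.e. -13
print(add_ones_complement("001101", "111010"))   # 001000, i.e. 13 + (-5) = +8
print(ones_complement("000000"))                 # 111111, i.e. minus zero
```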
-----------------------------------------------------------

2's complement. This is the representation your computer actually uses.
To get this representation for a negative value, first get the 1's complement:

     001101 = +13     110010 = -13 (1's complement)

Then add 1 to the 1's complement:

     110010
   +      1
   --------
     110011 = -13 in 2's complement

It's a little slower for the computer to get the representation in the first
place--but once the computer has the representation, arithmetic is very
fast--faster than 1's complement and much faster than S+M. So we'll add +13 and -5:

     001101 = +13
   + 111011 = -5 (2's complement)
   --------
     001000   with a carry-out from the sign bit of 1.

In 2's complement, unlike 1's complement, we simply ignore the carry-out from
the sign bit. 001000 = +8, which is the result we expected. There is no adding
the carry-out back in, so this makes 2's complement faster than 1's complement.
Also--what about zero? 000000 = 0, and 111111 = -0 in 1's complement. To get
the 2's complement of a negative number, add 1 to the 1's complement:

     111111 = -0 (1's complement)
   +      1
   --------
     000000   and we ignore the carry-out from the sign bit.

The representation of zero is unique with 2's complement.
(If you are suspicious here, you might ask: "In S+M and 1's complement there are
2 representations of 0, and in 2's complement there is only 1 representation,
so where did that missing representation go?" The answer is that with 6 bits,
S+M and 1's complement can represent values from +31 to -31--e.g. 011111 = +31.
With 6 bits in 2's complement you can represent values from +31 to -32 (100000),
a slightly greater range of values.)
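
The 2's complement rules above (flip the bits, add 1, and ignore any carry-out from the sign bit) can be sketched like this. The names twos_complement and add_twos_complement are invented for illustration, and masking with Python integers stands in for "ignore the carry-out".

```python
def twos_complement(bits):
    """Negate a value: flip every bit, then add 1, dropping any carry-out."""
    width = len(bits)
    mask = (1 << width) - 1                 # e.g. 111111 for 6 bits
    flipped = int(bits, 2) ^ mask           # the 1's complement
    return format((flipped + 1) & mask, "0{}b".format(width))


def add_twos_complement(a, b):
    """Add two equal-length bit strings, ignoring the carry-out of the sign bit."""
    width = len(a)
    mask = (1 << width) - 1
    return format((int(a, 2) + int(b, 2)) & mask, "0{}b".format(width))


print(twos_complement("001101"))                 # 110011, i.e. -13
print(add_twos_complement("001101", "111011"))   # 001000, i.e. 13 + (-5) = +8
print(twos_complement("000000"))                 # 000000, so zero is unique
```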
---------------------------------------------------------
+1 and -1 in the various representations (6 bits)
        S+M       1's C     2's C
        ------    ------    ------
+1      000001    000001    000001    note: all the same for positive
-1      100001    111110    111111    note: all different for negative
---------------------------------------------------------
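
A table like the one above can be generated for other values with a small sketch. The function name encode is hypothetical, and it deliberately restricts itself to values from -31 to +31 (for 6 bits) so that all three representations exist; -32 fits only in 2's complement.

```python
def encode(value, width=6):
    """Return the (S+M, 1's complement, 2's complement) bit strings for value."""
    limit = 2 ** (width - 1) - 1            # 31 when width is 6
    if abs(value) > limit:
        raise ValueError("value does not fit in all three representations")
    mask = (1 << width) - 1                 # e.g. 111111 for 6 bits
    positive = format(abs(value), "0{}b".format(width))
    if value >= 0:
        return positive, positive, positive
    sign_magnitude = "1" + positive[1:]                            # set the sign bit
    ones = format(abs(value) ^ mask, "0{}b".format(width))         # flip every bit
    twos = format((abs(value) ^ mask) + 1, "0{}b".format(width))   # then add 1
    return sign_magnitude, ones, twos


for value in (1, -1, 13, -13):
    print(value, *encode(value))
```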

OVERFLOW. For a given number of bits, not all numbers can be
represented correctly. For example, in 2's complement, you can represent
values from +31 (011111) to -32 (100000). A value such as +43 cannot be
correctly represented--there are not enough bits. Watch what happens when we
add +20 (010100) and +18 (010010):

     010100 (+20)
   + 010010 (+18)
   --------
     100110

We have added two positive numbers and gotten a negative result, which cannot
be correct! OVERFLOW means that the result is not valid. The test is this: if,
on addition (etc), the carry into the sign bit is not the same as the carry out
from the sign bit, then overflow has occurred. In the above example, the carry
into the sign bit was a 1, and the carry out was a 0. Try adding +2 (000010)
and +5 (000101). The carry into the sign bit is 0 and the carry out is 0: the
result is valid.
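
The overflow test can be sketched by doing the addition column by column and remembering the carry into and out of the sign bit column. The name add_with_overflow_check is made up for this illustration, and the function assumes two 2's complement bit strings of the same length.

```python
def add_with_overflow_check(a, b):
    """Add two 2's complement bit strings; return (result_bits, overflow)."""
    carry = 0
    carry_into_sign = 0
    result = []
    for column, (bit_a, bit_b) in enumerate(zip(reversed(a), reversed(b))):
        if column == len(a) - 1:
            carry_into_sign = carry          # carry arriving at the sign bit
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))
        carry = total // 2
    carry_out_of_sign = carry                # carry leaving the sign bit
    overflow = carry_into_sign != carry_out_of_sign
    return "".join(reversed(result)), overflow


print(add_with_overflow_check("010100", "010010"))  # ('100110', True):  20 + 18
print(add_with_overflow_check("000010", "000101"))  # ('000111', False): 2 + 5
```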