CS 100 Binary representation of integers
There are 3 main methods for representing integers in binary. Only
one of these methods has been used in the last 20-25 years, however,
for reasons which we will explain below. For all three methods, the
representation of positive integers is exactly the same--they differ
only in the representation of negative numbers and of 0.
Binary means "base 2". Valid digits are 0 and 1: in the
familiar base 10, valid digits are 0-9.
375 in base ten means three hundred and seventy-five, of course--3
hundreds, 7 tens, and 5 1's,
or 3 x 10 to the power 2 (i.e. 100) plus 7 x 10 to the power 1
(i.e. 10) plus 5 x 10 to the power 0
(i.e. 1) or 3x100 + 7x10 + 5x1. 8375 would add 8 x 10 to
the power 3 (i.e. 1000) to 375. Binary
numbers are done in a very similar manner--but remember that only the
digits 0 and 1 are
used. 01101 in base 2 means 0x2 to the power 4 (i.e. 16)
plus 1x2 to the power 3 (i.e. 8) plus
1x2 to the power 2 (i.e. 4) plus 0x2 to the power 1 (i.e. 2) plus 1x2
to the power 0 (i.e. 1):
0x16 + 1x8 + 1x4 +0x2 + 1x1 = 8 + 4 + 1 = 13 in the
familiar decimal base 10 system.
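The place-value sum above can be checked with a short Python sketch. The function name binary_to_decimal is just an illustrative choice, not something from these notes; it multiplies each digit by the matching power of 2 and adds up the results.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum digit x 2**position, where position 0 is the rightmost digit."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        if digit not in "01":
            raise ValueError(f"not a binary digit: {digit!r}")
        total += int(digit) * 2 ** position
    return total


print(binary_to_decimal("01101"))  # 13
```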
Computers store values in 2 primary ways: first, as simple
characters--e.g. the character 2
rather than the number 2. You may think "what's the
difference?" But think of the value
2 in your computer or pocket calculator as compared to the character 2
in this sentence or
the character 2 in an address such as 1122 Volunteer Boulevard.
The 1 and 2 characters in
that address are not intended to be manipulated--we do not intend to do
multiplication or
addition, etc, on them, any more than we would want to do an addition
on the character V
in that address. On the other hand, suppose we need the value
2--we want to multiply
something by the value 2. Then we need a representation of the
value 2 that can have
arithmetic operations performed on it--such a representation is called
"internal binary"
and is stored in the computer's memory as a string of 0's and
1's. Integers are normally
stored in 32 bits (i.e. 32 0's and 1's). 2 would be stored as the
following (I've grouped the
bits in blocks of 8 bits): 00000000 00000000 00000000
00000010. From the earlier notes
above, this is 1x2 to the power 1 (i.e. 1x2 = 2) plus 0x2 to the power
31, 0x2 to the power
30, etc, and 0x2 to the power 0. Only the 1x2 to the power 1
brings anything positive to
the total, hence the above 32 bit number is how the value 2 gets stored.
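As a rough illustration of the 32-bit layout just described, the following Python sketch prints a nonnegative value as 32 bits grouped in blocks of 8. The helper name to_unsigned_bits and its width parameter are assumptions made for this example.

```python
def to_unsigned_bits(value: int, width: int = 32) -> str:
    """Return value as a width-bit binary string, grouped in blocks of 8 bits."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} unsigned bits")
    bits = format(value, f"0{width}b")  # zero-padded binary digits
    return " ".join(bits[i:i + 8] for i in range(0, width, 8))


print(to_unsigned_bits(2))  # 00000000 00000000 00000000 00000010
```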
Representation can be done as "signed" or "unsigned" values.
"Unsigned" means that
the value is nonnegative--i.e. positive or zero. Negative numbers
(e.g. -15) cannot be
represented as unsigned values. "Signed" means that we can have
both positive and
negative numbers, plus the value zero. The computer must always
know how a value
is stored--if the storage is misinterpreted, all kinds of problems can
result. All
negative values (i.e. we're talking about "signed" values) always have the leftmost bit
in the representation a 1--this is called the "sign bit". For
positive numbers, the sign
bit is always 0. For signed values, you must specify how many
bits are in the
representation--in computers, this is usually 32 bits, but may be 16,
and occasionally
64 or 128 bits. Why is this specification important?
Because you must always be
aware--for signed values--which bit is the sign bit. To
illustrate--suppose we
have 1101. We saw above that 01101 = 13. But if the
leftmost bit of 1101 is the sign
bit, then we have a negative value--and that negative value is NOT
-13! For the
examples below we will use 6-bit signed values to help simplify
things. Using 32-bit
signed values would bog us down in worrying about too many bits and
would get us
too far away from the point of the discussion.
Binary addition. The computer adds binary (not decimal) values,
but the same
principles apply. Think how you would add 465 + 138:
465
+ 138
--------
?????
You would start with the rightmost column--the unit's column--and add 5
and 8
to get 13. But 13 is not a single digit--it is 1x10 + 3x1.
You write 3 in the unit's
position in the answer and carry the 1 over to the 10's column.
You now add 1 (the
carry-in, as it's called) + 6 + 3 = 10 = 1x10 + 0x1. You
write down 0 in the 10's
position and carry the 1 to the hundred's column. 1 (the
carry-in) + 4 + 1 = 6
and there is a "carry-out" of 0 to the thousands column, so you're
done--the
result is 603. In binary addition, let's add 001101 (=13)
and 000101 (=5):
001101
+ 000101
---------
????????
You start with the rightmost column. 1 + 1 = 2---but this is 2 in
base 10. 2 in
base 10 is 10 in binary (why?!). DO NOT say to yourself that this
is TEN in
binary--it's one-zero, or 1x2 to the power 1 plus 0x2 to the power 0 =
2 + 0
= 2 in base 10. Just as with adding decimal numbers, when the
added total
is equal to or greater than the base, whether the base is 2, 10, or
whatever,
you're going to have to carry into the next column. So we write
down a 0
in the rightmost column and carry a 1 to the next column. In the
next column--
the 2's column (in decimal, this would be the 10's column) we add 1
(the carry-in)
+ 0 + 0 = 1 with a carry-out of 0. In the 4's column we add 0
(the carry-in) + 1 + 1
= 2, so we write down 0 and the carry-out is 1. If the carry-in
had been 1 we
would have added 1 + 1 + 1 = 3. 3 = 1x2 to the power 1 plus 1x2
to the power 0=
2 + 1, so we would have written a 1 and the carry out would have been a
1. In
the 8's column we add 1 (the carry-in from the 4's column) + 1 + 0 = 2,
so we
write down a 0 and carry a 1 to the 16's column. In the 16's
column, we add
1 + 0 + 0 = 1, with a carry-out of 0. Therefore:
001101
+ 000101
--------
  010010 = 1x2 to the power 4 (i.e. 16) plus 1x2 to the power 1 (i.e. 2) = 18.
13 + 5 = 18, so this checks out.
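The column-by-column procedure above can be sketched in Python as follows. The name add_binary is hypothetical; the function walks the columns from right to left, writes down the low digit of each column total, and carries the rest into the next column, just as in the worked example.

```python
def add_binary(a: str, b: str) -> tuple:
    """Add two equal-width bit strings; return (sum bits, carry-out of leftmost column)."""
    if len(a) != len(b):
        raise ValueError("both values must have the same number of bits")
    carry = 0
    digits = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column_total = int(bit_a) + int(bit_b) + carry  # 0, 1, 2 or 3
        digits.append(str(column_total % 2))            # digit written in this column
        carry = column_total // 2                       # carried to the next column
    return "".join(reversed(digits)), carry


print(add_binary("001101", "000101"))  # ('010010', 0)
```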
----------------------------------------------------------------
Sign-and-magnitude representation. 001101 is a positive number
(the sign bit is 0)
and represents 13 in 6 bits. To obtain the value -13, we start
with +13 (001101) and
change the sign bit to 1: 101101 (= -13 in sign-and-magnitude
rep). This is the
easiest of the 3 representations to understand. Getting a
negative value is very easy.
But no computer nowadays (as far as I know) uses this
representation. Why? Speed
in carrying out arithmetic operations is vitally important, and using
this form of
representation would slow a computer down dramatically when it does
arithmetic.
Adding two positive or two negative values is not too bad. But
suppose we want to
add a positive value to a negative one--e.g. 13 + (-5). Think of
what you would do
yourself with decimal numbers. You were asked to do the following
addition when
you were in the 2nd (3rd?) grade. We will also show the
equivalent in binary.
     13            001101 = +13 sign-and-magnitude
 + (-5)          + 100101 =  -5 sign-and-magnitude
-------          --------
  ?????            ??????
Adding +13 and -5 isn't as straightforward as adding +13 and +5.
What did you do
all those years ago? Panic!!!!??? When you added a positive
and a negative value
together, you would have to do a subtraction rather than an
addition. How did you
know that you couldn't do a nice straightforward addition? When
you saw that the
signs of the two values were different. You knew you had to
do a subtraction. But
subtract what from what? You learned that you had to subtract the
smaller
absolute value (i.e. 5) from the larger absolute value (i.e. 13) and
that at the end
the sign of the result is the sign of the value with the larger
absolute value. So you did 13 - 5 = 8.
If the teacher asked you to add (-13) + 5 you would have
subtracted 5 from 13
to get 8, and then made the final result -8. The same kind of
problems occur when
adding a positive and a negative sign-and-magnitude binary
number. So, in full:
1) Do the two values have the same sign?
   a) YES. Add the two binary numbers--but NOT the column with the
      sign bit. Make the sign bit of the result the same as the sign
      bit of the two numbers.
   b) NO. Compare the absolute values of the two numbers (i.e.
      everything but the sign bits).
      1) If the first value is smaller in absolute value than the
         second, subtract it from the second value. Attach the sign
         of the second value to the result.
      2) If the first value is greater than or equal to the second
         in absolute value, subtract the second value from the first
         and attach the sign of the first value to the result.
>>>>note: binary subtraction works just like decimal
subtraction--you have
to be able to "borrow" from the column to the
left<<<<<<<
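The decision procedure above might look like this in Python. This is only a sketch: the name sm_add is made up for illustration, and the magnitudes are added and subtracted with ordinary Python integers rather than bit-by-bit borrowing, which is an assumption made to keep the example short.

```python
def sm_add(a: str, b: str) -> str:
    """Add two equal-width sign-and-magnitude bit strings using the rules above."""
    width = len(a)
    sign_a, mag_a = a[0], int(a[1:], 2)  # sign bit, then magnitude bits
    sign_b, mag_b = b[0], int(b[1:], 2)
    if sign_a == sign_b:
        # Same sign: add the magnitudes, keep the common sign.
        sign, magnitude = sign_a, mag_a + mag_b
    elif mag_a < mag_b:
        # Different signs, first is smaller: subtract it from the second.
        sign, magnitude = sign_b, mag_b - mag_a
    else:
        # Different signs, first is greater or equal: subtract the second.
        sign, magnitude = sign_a, mag_a - mag_b
    if magnitude >= 2 ** (width - 1):
        raise OverflowError("magnitude does not fit in the available bits")
    return sign + format(magnitude, f"0{width - 1}b")


print(sm_add("001101", "100101"))  # 001000, i.e. +13 + (-5) = +8
print(sm_add("101101", "000101"))  # 101000, i.e. (-13) + 5 = -8
```

Note how much branching is needed before any arithmetic happens; that is the overhead the notes are talking about.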
In addition: 000000 = 0 (zero). What does 100000
represent? minus zero,
which is the same as 0. So we have two different ways of
representing the
value 0--a complication. Adding a positive and a negative number
together
was a complication for you back in the 2nd grade--and it's a worse
problem
for the computer. It adds a LOT of overhead to addition. It
also adds
complications to subtraction--you can subtract one decimal number from
another--but suppose you try to subtract a negative number from a
positive
one? Right! You're back to just doing addition again.
Sign-and-magnitude
representations are easy to understand--but they slow down arithmetic
operations in the computer.
------------------------------------------------------
One's complement representation. This representation is no longer
used--
but it does make arithmetic easier for the computer than does
sign-and-magnitude. To get -13 in 1's complement, first start with +13 =
001101.
Then you simply flip the bits--0's become 1's, 1's become 0's, so that
-13 = 110010. Note that this has made the sign bit a
1--indicating a
negative number. REMEMBER!!! For positive numbers, all
three methods
give exactly the same representation--but negative values differ.
So let's
add +13 + (-5):
001101 = +13
+ 111010 = -5
-----------
??????
The good news is that unlike sign-and-magnitude, we do not have to worry
about different signs and we do not have to worry about which absolute
value is larger. In fact we can just go ahead and add bits--and
unlike S+M
we include the sign bit in the addition.
  001101 = +13
+ 111010 = -5
--------
  000111   with a carry-out of 1 from the sign bit column.
We note that the result--judging from the sign bit--is positive, which
we
expect. That's a good sign (pun intended). But 000111 = +7,
which is not
so good--the answer should be +8. Well, the complication with 1's
complement is that we now need to add the carry-out from the sign bit
back into the result:
  000111
+      1
--------
  001000 = +8
This is the correct result. If the carry-out had been 0, adding
0 to the result would not change it. Arithmetic is MUCH faster with
1's complement than with S+M. Note also: 000000 = 0 and
111111 = -0, so we again have two different representations of the
value 0.
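Here is a small Python sketch of 1's complement negation and addition, including the step of adding the carry-out from the sign bit back into the result. The function names are illustrative, and the bit strings are converted to Python integers for the arithmetic, which is a shortcut assumed for this example.

```python
def ones_complement(bits: str) -> str:
    """Flip every bit: 0's become 1's and 1's become 0's."""
    return "".join("1" if bit == "0" else "0" for bit in bits)


def ones_complement_add(a: str, b: str) -> str:
    """Add two equal-width 1's complement values, sign bit included."""
    width = len(a)
    total = int(a, 2) + int(b, 2)
    carry_out = total >> width      # carry-out from the sign bit column (0 or 1)
    total &= (1 << width) - 1       # keep only the low width bits
    total += carry_out              # add the carry-out back into the result
    return format(total, f"0{width}b")


print(ones_complement("000101"))                # 111010, i.e. -5
print(ones_complement_add("001101", "111010"))  # 001000, i.e. +8
```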
-----------------------------------------------------------
2's complement. This is the representation your computer actually
uses.
To get this representation for a negative value, first get the 1's
complement:
001101 = +13
110010 = -13 (1's complement).
Then add 1 to the 1's complement:
  110010
+      1
--------
  110011 = -13 in 2's complement.
It's a little slower for the computer to
get the representation in the first place--but once the computer has
the
representation, arithmetic is very fast--faster than 1's complement and
much
faster than S+M. So we'll add +13 and -5:
  001101 = +13
+ 111011 = -5 (2's complement)
--------
  001000   with a carry-out of 1 from the sign bit.
In 2's complement, unlike 1's complement, we simply ignore the
carry-out from
the sign bit. 001000 = +8, which is the result we expected.
There is no adding
the carry-out back in, so this makes 2's complement faster than 1's
complement.
Also--what about zero? 000000 = 0. 111111 = -0 in 1's complement. To get
the 2's complement of a negative number, add 1 to the 1's complement:
  111111 = -0 (1's complement)
+      1
--------
  000000   and we ignore the carry-out from the sign bit.
The representation of zero is unique with 2's complement.
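The 2's complement steps above (flip the bits, add 1, and ignore any carry-out from the sign bit) can be sketched in Python like this. The function names are hypothetical, and taking the remainder modulo 2 to the power of the width is simply one convenient way to throw away the carry-out.

```python
def twos_complement_negate(bits: str) -> str:
    """Flip every bit, add 1, and ignore any carry-out from the sign bit."""
    width = len(bits)
    flipped = "".join("1" if bit == "0" else "0" for bit in bits)
    return format((int(flipped, 2) + 1) % 2 ** width, f"0{width}b")


def twos_complement_add(a: str, b: str) -> str:
    """Add two equal-width 2's complement values, ignoring the final carry-out."""
    width = len(a)
    return format((int(a, 2) + int(b, 2)) % 2 ** width, f"0{width}b")


print(twos_complement_negate("001101"))         # 110011, i.e. -13
print(twos_complement_negate("000101"))         # 111011, i.e. -5
print(twos_complement_add("001101", "111011"))  # 001000, i.e. +8
print(twos_complement_negate("000000"))         # 000000, zero has one form
```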
(If you are suspicious here and ask "In S+M and 1's complement there are
2 representations of 0, and in 2's complement there is only 1
representation, so where did that missing representation go?", the
answer is that with 6 bits in S+M and 1's complement, you can represent
values from +31 to -31--e.g. 011111 = +31. With 6 bits in 2's complement
you can represent values from +31 to -32, a slightly greater range of
values: the bit pattern 100000, which is -0 in S+M, is -32 in 2's
complement.)
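One way to convince yourself of these ranges is to decode all 64 six-bit patterns under each representation and look at the smallest value, the largest value, and how many patterns mean zero. The decode function and the scheme labels "sm", "ones" and "twos" below are names invented for this sketch.

```python
def decode(bits: str, scheme: str) -> int:
    """Interpret bits as a signed value under 'sm', 'ones' or 'twos'."""
    width = len(bits)
    raw = int(bits, 2)
    if bits[0] == "0":
        return raw                          # positive values: same in all three
    if scheme == "sm":
        return -(raw - 2 ** (width - 1))    # drop the sign bit, negate the rest
    if scheme == "ones":
        return -(2 ** width - 1 - raw)      # flip all the bits, then negate
    if scheme == "twos":
        return raw - 2 ** width             # flip, add 1, then negate
    raise ValueError(f"unknown scheme: {scheme}")


for scheme in ("sm", "ones", "twos"):
    values = [decode(format(n, "06b"), scheme) for n in range(64)]
    print(scheme, min(values), max(values), values.count(0))
# sm -31 31 2
# ones -31 31 2
# twos -32 31 1
```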
---------------------------------------------------------
+1 and -1 in the various representations (6 bits)
        S+M       1's C     2's C
-----------------------------------
+1      000001    000001    000001   note: all the same for positive
-1      100001    111110    111111   note: all different for negative
---------------------------------------------------------
OVERFLOW. For a given number of bits, not all numbers can be
represented correctly. For example, in 2's complement, you can
represent
values from +31 (011111) to -32 (100000). A value such as +43
cannot be
correctly represented--there are not enough bits. Watch what
happens when we
add +20 (010100) and + 18 (010010):
  010100 (+20)
+ 010010 (+18)
--------
  100110
We have added two positive numbers and gotten a negative result,
which cannot be correct! OVERFLOW means that the result is not valid.
In 2's complement addition, overflow has occurred whenever the carry
into the sign bit is not the same as the carry out from the sign bit.
In the above example, the carry into the sign bit was a 1, and the
carry out was a 0. Try adding +2 (000010) and +5 (000101).
The carry into the sign bit is 0 and the carry out is 0: the
result is valid.
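The overflow test described above can be sketched in Python as follows. The function name and the way the two carries are computed (by adding the non-sign bits separately) are choices made for this illustration; real hardware gets both carries directly from its adder.

```python
def add_with_overflow_check(a: str, b: str) -> tuple:
    """Add two equal-width 2's complement values.

    Returns (result bits, carry into sign bit, carry out of sign bit, overflow).
    """
    width = len(a)
    x, y = int(a, 2), int(b, 2)
    low_mask = (1 << (width - 1)) - 1                 # every bit except the sign bit
    carry_in = ((x & low_mask) + (y & low_mask)) >> (width - 1)
    carry_out = (x + y) >> width
    result = format((x + y) % (1 << width), f"0{width}b")
    return result, carry_in, carry_out, carry_in != carry_out


print(add_with_overflow_check("010100", "010010"))  # ('100110', 1, 0, True)
print(add_with_overflow_check("000010", "000101"))  # ('000111', 0, 0, False)
```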