# Bit

*This article is about the unit of information. For other uses, see Bit (disambiguation).*

A **bit** (**bi**nary digi**t**) is a digit in the binary numeral system, which uses base 2, so there are only two possible values: 0 and 1. For example, the number 10010111 is 8 bits long. Binary digits are almost always used as the basic unit of information storage and communication in digital computing and digital information theory. Information theory also often uses the natural unit of information, called either a *nit* or a *nat*. Quantum computing uses the *qubit*, a unit of quantum information that can exist in a superposition of the two values.
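
As a quick illustration, here is a minimal Python sketch (not part of the original article) checking the length in bits and the base-10 value of the example number above:

```python
# The 8-bit example number from the paragraph above, as a string of bits.
bits = "10010111"

print(len(bits))     # 8   -- the number is 8 bits long
print(int(bits, 2))  # 151 -- its value in base 10

# Equivalently, each position i (counted from the right) that holds a 1
# contributes 2**i to the value.
print(sum(2**i for i, b in enumerate(reversed(bits)) if b == "1"))  # 151
```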

The bit is also a unit of measurement, the information capacity of one binary digit. It has the symbol **bit**, and less formally **b** (see discussion below). The unit is also known as the **shannon**, with symbol **Sh**.

## Binary digit

Claude E. Shannon first used the word *bit* in a 1948 paper. He attributed its origin to John W. Tukey, who had written a Bell Labs memo on 9 January 1947 in which he contracted "binary digit" to simply "bit". Interestingly, Vannevar Bush had written in 1936 of "bits of information" that could be stored on the punch cards used in the mechanical computers of that time.[1]

A bit of storage is like a light switch; it can be either on (1) or off (0). A single bit is a one or a zero, a true or a false, a "flag" which is "on" or "off", or in general, the quantity of information required to distinguish two mutually exclusive *states* from each other. Gregory Bateson defined a bit as "a difference that makes a difference".[1]

The bit is the smallest unit of storage used in computing.

## Representation

Bits can be represented in many forms. For example, in the circuitry of most computing devices, bits are represented as voltage levels. In some devices, a 1 (true value) is represented by a positive voltage and a 0 (false value) by a negative voltage; in other devices, a 0 is represented by zero volts.

On CD-ROMs, bits are represented as "pits" and "lands". A pit, as the name implies, is a small indentation in the disc surface that scatters away the laser light that reads it, while a land is the flat reflective surface between pits. The light of the reading laser is reflected from a land back toward the laser, which picks up that light with a sensor. Pits represent 0 (false value), while lands represent 1 (true value).

CD-Rs work on the same principle, except that they use a dye layer, rather than physical pits and lands, to create the reflective and non-reflective regions.

Bits can also be represented magnetically, as on magnetic tapes and cassettes.

## Unit

It is important to differentiate between the use of "bit" in referring to a discrete storage unit and the use of "bit" in referring to a statistical unit of information. The bit, as a discrete storage unit, can by definition store only 0 or 1. A statistical bit is the amount of information that, *on average*, can be stored in a discrete bit. It is thus the amount of information carried by a choice between two equally likely outcomes. One bit corresponds to about 0.693 nats (ln(2)), or 0.301 hartleys (log_{10}(2)).
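
These conversion factors follow from the change-of-base rule for logarithms. A minimal Python check of the figures quoted above:

```python
import math

# One bit expressed in other units of information.
print(math.log(2))    # ~0.693 nats     (ln 2)
print(math.log10(2))  # ~0.301 hartleys (log10 2)

# One bit is the information in a choice between two equally
# likely outcomes: -log2 of a probability of 1/2.
print(-math.log2(0.5))  # 1.0
```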

Consider, for example, a computer file of 1,000 0s and 1s that can be losslessly compressed to a file of 500 0s and 1s (on average, over all files of that kind). Although the original file occupies 1,000 bits of storage, it has at most 500 bits of information entropy, since information is not destroyed by lossless compression. A file cannot carry more bits of information than it has bits of storage. When these two ideas need to be distinguished, the name *bit* is sometimes used for data storage while *shannon* is used for the statistical bit; most of the time, however, the meaning is clear from the context.
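
One way to see how a file can hold fewer information bits than storage bits is the binary entropy function, which gives the average information per stored bit when 0s and 1s are not equally likely. A small illustrative Python sketch (the 90%-zeros example is an assumption for illustration, not taken from the text):

```python
import math

def binary_entropy(p: float) -> float:
    """Average information, in bits, of one 0/1 symbol with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

n = 1000  # storage bits in the file

# Equally likely 0s and 1s: every storage bit carries a full bit of information.
print(n * binary_entropy(0.5))  # 1000.0

# A biased file (90% zeros) carries far less information than its storage size,
# which is what makes lossless compression possible.
print(round(n * binary_entropy(0.1)))  # 469
```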

## Abbreviation/symbol

No uniform agreement has yet been reached about the official unit symbols for bit and byte. One commonly quoted standard, the International Electrotechnical Commission's IEC 60027, specifies that "bit" should be the unit symbol for the unit bit (e.g., "kbit" for kilobit), but it does not yet define any symbol for the unit byte.

The other commonly quoted relevant standard, IEEE 1541, specifies "b" as the unit symbol for bit and "B" as that for byte. This convention is also widely used in computing, but has so far not been considered acceptable internationally, for several reasons:

- both these symbols are already used for other units: "b" for barn and "B" for bel;
- "bit" is already short for "binary digit", so there is little reason to abbreviate it any further;
- it is customary to start a unit symbol with an uppercase letter only if the unit was named after a person (see also Claude Émile Jean-Baptiste Litre);
- instead of byte, the term octet (unit symbol: "o") is used in some fields and in some French-speaking countries, which adds to the difficulty of agreeing on an international symbol;
- "b" is occasionally also used for byte, along with "bit" for bit.

The unit bel is rarely used by itself (only as decibel, "dB"), so the chances of conflict with "B" for byte are quite small, even though both units are very commonly used in the same fields (e.g., telecommunication).

The combination of the symbols "bit" for bit and "B" for byte is also widely used in computing. Confusion between "b" and "B" seems to be common enough to have inspired the creation of a dedicated website, *b is not B*.

## More than one bit

A byte is a collection of bits, originally variable in size but now almost always eight bits. Eight-bit bytes, also known as *octets*, can represent 256 values (2^{8} values, 0–255). A four-bit quantity is known as a *nibble*, and can represent 16 values (2^{4} values, 0–15). A rarely used term, *crumb*, can refer to a two-bit quantity, and can represent 4 values (2^{2} values, 0–3).
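
A short Python sketch of these groupings, extracting the nibbles and crumbs of a single byte with shifts and masks:

```python
byte = 0b10010111  # one 8-bit byte (decimal 151)

high_nibble = (byte >> 4) & 0xF  # upper 4 bits: 0b1001 = 9
low_nibble  = byte & 0xF         # lower 4 bits: 0b0111 = 7

# The four 2-bit "crumbs", from most significant to least significant.
crumbs = [(byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]

print(high_nibble, low_nibble)  # 9 7
print(crumbs)                   # [2, 1, 1, 3]
```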

"Word" is a term for a slightly larger group of bits, but it has no standard size. It represents the size of one register in a Computer-CPU. In the IA-32 architecture, 16 bits are called a "word" (with 32 bits being a double word or dword), but other architectures have word sizes of 8, 32, 64, 80 or others.

Terms for large quantities of bits can be formed using the standard range of prefixes, e.g., kilobit (kbit), megabit (Mbit) and gigabit (Gbit). Note that much confusion exists regarding these units and their abbreviations (see above).

When a bit within a group of bits such as a byte or word is to be referred to, it is usually specified by a number from 0 (not 1) upwards corresponding to its position within the byte or word. However, 0 can refer to either the most significant bit or to the least significant bit depending on the context, so the convention being used must be known.
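
The difference between the two numbering conventions can be sketched in Python; the helper names `bit_lsb0` and `bit_msb0` are illustrative, not standard terminology:

```python
def bit_lsb0(value: int, i: int) -> int:
    """Bit i counted from the least significant (rightmost) end."""
    return (value >> i) & 1

def bit_msb0(value: int, i: int, width: int = 8) -> int:
    """Bit i counted from the most significant (leftmost) end of a word of the given width."""
    return (value >> (width - 1 - i)) & 1

v = 0b10010111
print(bit_lsb0(v, 3))  # 0 -- fourth bit from the right
print(bit_msb0(v, 3))  # 1 -- fourth bit from the left
```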

Certain bitwise computer processor instructions (such as *bit set*) operate at the level of manipulating bits rather than manipulating data interpreted as an aggregate of bits.
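
In most programming languages, the same effects are achieved with bitwise operators rather than dedicated instructions. A minimal Python sketch of setting, clearing, toggling, and testing a bit:

```python
flags = 0b00000000

flags |= 1 << 3         # set bit 3      -> 0b00001000
flags &= ~(1 << 3)      # clear bit 3    -> 0b00000000
flags ^= 1 << 0         # toggle bit 0   -> 0b00000001

is_set = bool(flags & (1 << 0))  # test bit 0
print(is_set)                    # True
```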

Telecommunications or computer network transfer rates are usually described in terms of bits per second (*bps*), not to be confused with baud.
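
Since transfer rates count bits while file sizes are usually quoted in bytes, a factor of eight enters any transfer-time estimate. A small Python illustration (the file size and link speed are assumed values) of why the b/B distinction above matters:

```python
# How long does a 1,000,000-byte file take over an 8 Mbit/s link?
file_bytes = 1_000_000
link_bits_per_second = 8_000_000

seconds = file_bytes * 8 / link_bits_per_second
print(seconds)  # 1.0 -- not 0.125, because the rate is in bits, not bytes
```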

## See also

- Integral data type
- Bitstream
- Information entropy
- Qubit
- Binary arithmetic
- Ternary numeral system

## Notes

1. George Dyson, *Darwin Among the Machines: The Evolution of Global Intelligence*. 1997. ISBN 0-201-40649-7.
