
Code point

In character encoding terminology, a code point or code position is any of the numerical values that make up the code space (see the Glossary of Unicode Terms). Many code points represent single characters, but they can also have other meanings, such as formatting.
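As a small illustration of the point that not every code point stands for a visible character, the sketch below uses Python's standard unicodedata module to look up the name and general category of two code points. The helper name describe is made up for this example; the category codes ("Lu" for an uppercase letter, "Cf" for a format character) come from the Unicode Character Database that ships with Python.

```python
import unicodedata

def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME [category]' for a code point (hypothetical helper)."""
    ch = chr(code_point)
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{code_point:04X} {name} [{unicodedata.category(ch)}]"

print(describe(0x41))    # an ordinary letter
print(describe(0x200E))  # a formatting code point with no glyph of its own
```

The first line reports a letter category, the second a format category, even though both are simply numbers in the same code space.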

For example, the character encoding scheme ASCII comprises 128 code points in the range 0hex to 7Fhex, Extended ASCII comprises 256 code points in the range 0hex to FFhex, and Unicode comprises 1,114,112 code points in the range 0hex to 10FFFFhex. The Unicode code space is divided into seventeen planes (the basic multilingual plane, and 16 supplementary planes), each with 65,536 (= 2^16) code points. Thus the total size of the Unicode code space is 17 × 65,536 = 1,114,112.
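The arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant names and the plane_of helper are invented for the example, and it relies on the fact that each plane holds 2^16 code points, so the plane number is the code point shifted right by 16 bits.

```python
PLANE_SIZE = 2 ** 16                        # 65,536 code points per plane
PLANE_COUNT = 17                            # plane 0 (BMP) plus 16 supplementary planes
CODE_SPACE_SIZE = PLANE_COUNT * PLANE_SIZE  # total number of code points
MAX_CODE_POINT = CODE_SPACE_SIZE - 1        # highest valid code point

def plane_of(code_point: int) -> int:
    """Return the plane (0 to 16) that contains the given code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"outside the Unicode code space: {code_point:#x}")
    return code_point >> 16

print(CODE_SPACE_SIZE)      # 1114112
print(hex(MAX_CODE_POINT))  # 0x10ffff
print(plane_of(ord("A")))   # 0, the basic multilingual plane
print(plane_of(0x1F600))    # 1, the first supplementary plane
```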

The notion of a code point is used for abstraction, to distinguish both:
  • the number from an encoding as a sequence of bits, and
  • the abstract character from a particular graphical representation (glyph).

This is because one may wish to make these distinctions to:

  • encode a particular code space in different ways, or
  • display a character via different glyphs.
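The first distinction, between the number and its encoding, can be seen by taking one code point and serializing it in several ways. The sketch below uses only Python built-ins; the choice of the character "é" (U+00E9) and the variable names are arbitrary.

```python
ch = "é"
code_point = ord(ch)  # the abstract number, independent of any encoding

# The same code point written out as bytes under three Unicode encodings.
# Big-endian variants are used so that no byte order mark is prepended.
encodings = ("utf-8", "utf-16-be", "utf-32-be")
encoded = {name: ch.encode(name) for name in encodings}

print(f"U+{code_point:04X}")
for name, data in encoded.items():
    print(f"{name:10} {data.hex(' ')}")
```

The code point stays U+00E9 throughout; only the byte sequence changes with the encoding. The second distinction, between a character and its glyph, is a matter for fonts and rendering and is not visible at this level.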

For Unicode, the particular sequence of bits is called a code unit. In the UCS-4 encoding, every code point is encoded as a 4-byte (octet) binary number, while in the UTF-8 encoding, different code points are encoded as sequences from one to four bytes long, forming a self-synchronizing code. See comparison of Unicode encodings for details.

Code points are normally assigned to abstract characters. An abstract character is not a graphical glyph but a unit of textual data. However, code points may also be left reserved for future assignment (most of the Unicode code space is unassigned), or given other designated functions.
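To make the variable-length and self-synchronizing properties of UTF-8 concrete, here is a minimal sketch. It assumes well-formed UTF-8 input, and the helper names is_continuation and sequence_start are invented for the example. It relies on the fact that every UTF-8 continuation byte has the bit pattern 10xxxxxx, so the start of a character can be found from any byte offset by stepping backwards.

```python
def is_continuation(byte: int) -> bool:
    """True if the byte has the form 10xxxxxx (a UTF-8 continuation byte)."""
    return (byte & 0b1100_0000) == 0b1000_0000

def sequence_start(data: bytes, index: int) -> int:
    """Step back from index to the first byte of the enclosing sequence."""
    while index > 0 and is_continuation(data[index]):
        index -= 1
    return index

text = "a€😀"  # U+0061, U+20AC, U+1F600
data = text.encode("utf-8")

utf8_lengths = [len(ch.encode("utf-8")) for ch in text]       # varies per code point
utf32_lengths = [len(ch.encode("utf-32-be")) for ch in text]  # fixed width

print(utf8_lengths)             # [1, 3, 4]
print(utf32_lengths)            # [4, 4, 4]
print(sequence_start(data, 6))  # 4: byte 6 lies inside the 4-byte sequence starting at 4
```

A fixed-width encoding such as UCS-4 needs no such search, at the cost of spending four bytes on every code point.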

The distinction between a code point and the corresponding abstract character is not pronounced in Unicode, but is evident for many other encoding schemes, where numerous code pages may exist for a single code space.

The concept of a code point is part of Unicode's solution to a difficult conundrum faced by character encoding developers in the 1980s. If they added more bits per character to accommodate larger character sets, that design decision would also constitute an unacceptable waste of then-scarce computing resources for Latin script users (who constituted the vast majority of computer users at the time), since those extra bits would always be zeroed out for such users. The code point avoids this problem by breaking the old idea of a direct one-to-one correspondence between characters and particular sequences of bits.

See also
  • Combining character
  • Text-based (computing)
  • Replacement character
  • Unicode collation algorithm

