An identifier is a name that identifies (that is, labels the identity of) either a unique object or a unique class of objects, where the "object" or class may be an idea, physical countable object (or class thereof), or physical noncountable substance (or class thereof). The abbreviation ID often refers to identity, identification (the process of identifying), or an identifier (that is, an instance of identification). An identifier may be a word, number, letter, symbol, or any combination of those.
The words, numbers, letters, or symbols may follow an code (wherein letters, digits, words, or symbols stand for (represent) ideas or longer names) or they may simply be arbitrary. When an identifier follows an encoding system, it is often referred to as a code or ID code. For instance the ISO/IEC 11179 metadata registry standard defines a code as system of valid symbols that substitute for longer values in contrast to identifiers without symbolic meaning. Identifiers that do not follow any encoding scheme are often said to be arbitrary IDs; they are arbitrarily assigned and have no greater meaning. (Sometimes identifiers are called "codes" even when they are actually arbitrary, whether because the speaker believes that they have deeper meaning or simply because they are speaking casually and imprecisely.)
The unique identifier ( UID) is an identifier that refers to only one instance—only one particular object in the universe. A part number is an identifier, but it is not a unique identifier—for that, a serial number is needed, to identify each instance of the part design. Thus the identifier "Model T" identifies the class (model) of automobiles that Ford's Model T comprises; whereas the unique identifier "Model T Serial Number 159,862" identifies one specific member of that class—that is, one particular Model T car, owned by one specific person.
The concepts of name and identifier are denotation equal, and the terms are thus denotatively ; but they are not always connotation synonymous, because and ID numbers are often connotatively distinguished from names in the sense of traditional natural language naming. For example, both "Jamie Zawinski" and "Netscape employee number 20" are identifiers for the same specific human being; but normal English-language connotation may consider "Jamie Zawinski" a "name" and not an "identifier", whereas it considers "Netscape employee number 20" an "identifier" but not a "name". This is an emic indistinction rather than an etic one.
ID codes may inherently carry metadata along with them. For example, when you know that the food package in front of you has the identifier "2011-09-25T15:42Z-MFR5-P02-243-45", you not only have that data, you also have the metadata that tells you that it was packaged on September 25, 2011, at 3:42pm UTC, manufactured by Licensed Vendor Number 5, at the Peoria, IL, USA plant, in Building 2, and was the 243rd package off the line in that shift, and was inspected by Inspector Number 45.
Arbitrary identifiers might lack metadata. For example, if a food package just says 100054678214, its ID may not tell anything except identity—no date, manufacturer name, production sequence rank, or inspector number. In some cases, arbitrary identifiers such as sequential serial numbers leak information (i.e. the German tank problem). Opaque identifiers—identifiers designed to avoid leaking even that small amount of information—include "really " and Version 4 UUIDs.
Which character sequences constitute identifiers depends on the lexical grammar of the language. A common rule is alphanumeric sequences, with underscore also allowed, and with the condition that it not begin with a digit (to simplify lexing by avoiding confusing with ) – so foo, foo1, foo_bar, _foo are allowed, but 1foo is not – this is the definition used in earlier versions of C and C++, Python 2, and many other languages. Later versions of these languages, along with many other modern languages support almost all Unicode characters in an identifier. However, a common restriction is not to permit whitespace characters and language operators; this simplifies tokenization by making it free-form and context-free. For example, forbidding + in identifiers (due to its use as a binary operation) means that a+b and a + b can be tokenized the same, while if it were allowed, a+b would be an identifier, not an addition. Whitespace in identifier is particularly problematic, as if spaces are allowed in identifiers, then a clause such as if rainy day then 1 is legal, with rainy day as an identifier, but tokenizing this requires the phrasal context of being in the condition of an if clause. Some languages do allow spaces in identifiers, however, such as ALGOL 68 and some ALGOL variants – for example, the following is a valid statement: '''real''' half pi; which could be entered as .real. half pi; (keywords are represented in boldface, concretely via stropping). In ALGOL this was possible because keywords are syntactically differentiated, so there is no risk of collision or ambiguity, spaces are eliminated during the line reconstruction phase, and the source was processed via scannerless parsing, so lexing could be context-sensitive.
In most languages, some character sequences have the lexical form of an identifier but are known as keywords – for example, if is frequently a keyword for an if clause, but lexically is of the same form as ig or foo namely a sequence of letters. This overlap can be handled in various ways: these may be forbidden from being identifiers – which simplifies tokenization and parsing – in which case they are reserved words; they may both be allowed but distinguished in other ways, such as via stropping; or keyword sequences may be allowed as identifiers and which sense is determined from context, which requires a context-sensitive lexer. Non-keywords may also be reserved words (forbidden as identifiers), particularly for forward compatibility, in case a word may become a keyword in future. In a few languages, e.g., PL/1, the distinction is not clear.
The scope, or accessibility within a program of an identifier can be either local or global. A global identifier is declared outside of functions and is available throughout the program. A local identifier is declared within a specific function and only available within that function.
For implementations of programming languages that are using a compiler, identifiers are often only compile time entities. That is, at runtime the compiled program contains references to memory addresses and offsets rather than the textual identifier tokens (these memory addresses, or offsets, having been assigned by the compiler to each identifier).
In languages that support reflection, such as interactive evaluation of source code (using an interpreter or an incremental compiler), identifiers are also runtime entities, sometimes even as first-class objects that can be freely manipulated and evaluated. In Lisp, these are called symbols.
Compilers and interpreters do not usually assign any semantic meaning to an identifier based on the actual character sequence used. However, there are exceptions.
The inverse is also possible, where multiple resources are represented with the same identifier (discussed below).
|international (via ISV)|
|U.S. and NATO|
|originated in U.S.; today international (via ISV)|
|originated in U.S.; today international|
|Handle System Namespace, international scope|
|originated in Germany; today international|
|originated in E.U.; may be seen internationally|
|many scopes, e.g., specific computer systems|
|ISBN is part of the EAN Namespace; international scope|
|U.S., with some international bibliographic usefulness|
|many scopes, e.g., banks, governments|
|Many different systems|
|U.S., with some international bibliographic usefulness|
|many scopes, e.g., company-specific, government-specific|