
%Part{Dtypes, Root = "CLM.MSS"}
% Chapter of Common Lisp Manual.  Copyright 1984, 1988, 1989 Guy L. Steele Jr.

\clearpage\def\pagestatus{FINAL PROOF}

\ifx \rulang\Undef

\chapter{Data Types}
\label{DTYPES}

Common Lisp provides a variety of types of data objects.  It is important to
note that in Lisp it is data objects that are typed, not variables.
Any variable can have any Lisp object as its value.
(It is possible to make an explicit declaration that a variable will
in fact take on one of only a limited set of values.  However, such
a declaration may always be omitted, and the program will still run correctly.
Such a declaration merely constitutes advice from the user
that may be useful in gaining efficiency.  See \cdf{declare}.)

In Common Lisp, a data type is a (possibly infinite) set of
Lisp objects.  Many Lisp objects belong to more than one
such set, and so it doesn't always make sense to ask what is \emph{the} type
of an object; instead, one usually asks only whether an object belongs
to a given type.  The predicate \cdf{typep} may be used to ask
whether an object belongs to a given type,
and the function \cdf{type-of} returns \emph{a} type
to which a given object belongs.
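
For instance, the following forms sketch how one object can belong to several
types at once.  The results noted hold in any conforming implementation,
except for that of \cdf{type-of}, which is implementation-dependent:
\begin{lisp}
(typep 3 'integer)~~~~~;\textrm{True} \\
(typep 3 'rational)~~~~;\textrm{Also true: every integer is a rational} \\
(typep 3 'number)~~~~~~;\textrm{Also true} \\
(typep 3 'string)~~~~~~;\textrm{False} \\
(type-of 3)~~~~~~~~~~~~;\textrm{Some type containing \cd{3}, perhaps \cdf{fixnum}}
\end{lisp}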

The data types defined in Common Lisp are arranged into a hierarchy (actually
a partial order) defined by the subset relationship.
Certain sets of objects, such as the set of numbers or the
set of strings, are interesting enough to deserve labels.
Symbols are used for most
such labels (here, and throughout this book, the word ``symbol''
refers to atomic symbols, one kind of Lisp object,
elsewhere known as literal atoms).  See chapter~\ref{DTSPEC}
for a complete description of type specifiers.

The set of all objects is specified
by the symbol {\true}.  The empty data type, which contains no objects, is
denoted by {\nil}.
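
These two types are the extremes of the partial order.  As a small
illustration, the standard function \cdf{subtypep} (whose first value is true
when its first argument is known to be a subtype of its second) can be used
as follows:
\begin{lisp}
(subtypep 'integer 'rational)~~;\textrm{True} \\
(subtypep 'integer 't)~~~~~~~~~;\textrm{True: every type is a subtype of {\true}} \\
(subtypep 'nil 'integer)~~~~~~~;\textrm{True: {\nil} is a subtype of every type} \\
(typep 3 'nil)~~~~~~~~~~~~~~~~~;\textrm{False: no object is of type {\nil}}
\end{lisp}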

The following categories of Common Lisp objects are of particular interest:
numbers, characters, symbols, lists, arrays, structures, and functions.
There are others as well.
Some of these categories
have many subdivisions.  There are also standard types defined to
be the union
of two or more of these categories.  The categories listed above, while they
are data types, are neither more nor less ``real'' than other data types;
they simply constitute a particularly useful slice across
the type hierarchy for expository purposes.

Here are brief descriptions of various Common Lisp data types.
The remaining sections of this chapter go into more detail
and also describe notations for objects
of each type.  Descriptions of Lisp functions that operate
on data objects of each type appear in later chapters.

\begin{itemize}
\item
\emph{Numbers} are provided in various forms and representations.
Common Lisp provides a true integer data type: any integer,
positive or negative, has in principle a representation as a
Common Lisp data object, subject only to total memory limitations (rather than
machine word width).
A true rational data type is provided: the quotient of two integers,
if not an integer, is a ratio.
Floating-point numbers of various ranges and precisions are also
provided, as well as
Cartesian complex numbers.

\item
\emph{Characters} represent printed glyphs such as letters
or text formatting operations.  Strings are one-dimensional
arrays of characters.
Common Lisp provides for a rich character set, including ways to
represent characters of various type styles.

\item
\emph{Symbols} (sometimes called \emph{atomic symbols} for emphasis
or clarity) are named data objects.  Lisp provides machinery
for locating a symbol object, given its name (in the form
of a string).  Symbols have \emph{property lists}, which in effect
allow symbols to be treated as record structures with an extensible
set of named components, each of which may be any Lisp object.
Symbols also serve to name functions and variables within programs.

\item
\emph{Lists} are sequences represented in the form of linked cells
called \emph{conses}.  There is a special object (the symbol {\nil})
that is the empty list.  All other lists are built recursively by adding a new
element to the front of an existing list.  This is done by
creating a new \emph{cons}, which is an object having two components
called the \emph{car} and the \emph{cdr}.  The \emph{car} may hold anything,
and the \emph{cdr} is made to point to the previously existing list.
(Conses may actually be used completely generally as two-element
record structures, but their most important use is to represent
lists.)

\item
\emph{Arrays} are dimensioned collections of objects.
An array can have any non-negative number of dimensions and is indexed
by a sequence of integers.  A general array can have any Lisp object as
a component; other types of arrays are specialized for efficiency
and can hold only certain types of Lisp objects.
It is possible for two arrays, possibly with differing dimension information,
to share the same set of elements (such that modifying one array modifies
the other also) by causing one to be \emph{displaced} to the other.
One-dimensional arrays of any kind are called \emph{vectors}.
One-dimensional arrays of characters are called \emph{strings}.
One-dimensional arrays of bits (that is, of integers whose values are 0 or 1)
are called \emph{bit-vectors}.

\item
\emph{Hash tables} provide an efficient way of mapping any
Lisp object (a \emph{key}) to an associated object.

\item
\emph{Readtables} are used to control the built-in expression parser
\cdf{read}.

\item
\emph{Packages} are collections of symbols that serve as name spaces.
The parser recognizes symbols by looking up character sequences
in the current package.

\item
\emph{Pathnames} represent names of files in a fairly implementation-independent
manner.  They are used to interface to the external file system.

\item
\emph{Streams} represent sources or sinks of data, typically characters
or bytes.  They are used to perform I/O, as well as for internal
purposes such as parsing strings.

\item
\emph{Random-states} are data structures used to encapsulate the state
of the built-in random-number generator.

\item
\emph{Structures} are user-defined record structures, objects that
have named components.  The \cdf{defstruct} facility is used
to define new structure types.  Some Common Lisp implementations may
choose to implement certain system-supplied data types,
such as \emph{bignums}, \emph{readtables}, \emph{streams},
\emph{hash tables}, and \emph{pathnames}, as structures,
but this fact will be invisible to the user.

\item
\emph{Conditions} are objects used to affect control flow in certain
conventional ways by means of signals and handlers that intercept those signals.
In particular, errors are signaled by raising particular conditions,
and errors may be trapped by establishing handlers for those conditions.

\item
\emph{Classes} determine the structure and behavior of other
objects, their \emph{instances}.  Every Common Lisp data object
belongs to some class.  (In some ways the CLOS class system is
a generalization of the system of type specifiers of the first edition of this book,
but the class system augments the type system rather than supplanting it.)

\item
\emph{Methods} are chunks of code that operate on arguments
satisfying a particular pattern of classes.  Methods are
not functions; they are not invoked directly on arguments
but instead are bundled into generic functions.

\item
\emph{Generic functions} are functions that contain, among other
information, a set of methods.  When invoked, a generic function
executes a subset of its methods.  The subset chosen for execution
depends in a specific way on the classes or identities of the arguments
to which it is applied.
\end{itemize}

These categories are not always mutually exclusive.
The required relationships among the various data types are
explained in more detail in section~\ref{DATA-TYPE-RELATIONSHIPS}.
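
A few of the categories above can be illustrated briefly.  The forms below are
only a sketch using standard functions; the variable names \cdf{a} and \cdf{b}
are chosen for the example.  The last two forms show one object, {\nil},
belonging to two categories at once:
\begin{lisp}
(cons 1 '(2 3))~~~~~~~~~~~~~~;\textrm{The list \cd{(1 2 3)}} \\
(car '(1 2 3))~~~~~~~~~~~~~~~;\textrm{\cd{1}} \\
(cdr '(1 2 3))~~~~~~~~~~~~~~~;\textrm{The list \cd{(2 3)}} \\
(typep "abc" 'vector)~~~~~~~~;\textrm{True: a string is a one-dimensional array} \\
(typep \#*1011 'bit-vector)~~;\textrm{True} \\
(let* ((a (make-array 6 :initial-element 0)) \\
~~~~~~~(b (make-array '(2 3) :displaced-to a))) \\
~~(setf (aref b 1 0) 7) \\
~~(aref a 3))~~~~~~~~~~~~~~~~;\textrm{\cd{7}: \cdf{a} and \cdf{b} share their elements} \\
(typep '() 'list)~~~~~~~~~~~~;\textrm{True: the empty list} \\
(typep '() 'symbol)~~~~~~~~~~;\textrm{True: the empty list is the symbol {\nil}}
\end{lisp}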

\section{Numbers}

Several kinds of numbers are defined in Common Lisp.
They are divided into \emph{integers}; \emph{ratios};
\emph{floating-point numbers}, with names provided for
up to four different floating-point representations; and
\emph{complex numbers}.  Integers, ratios, and floating-point numbers
together constitute the \emph{reals}.

The \cdf{number} data type encompasses all kinds of
numbers.  For convenience, there are names for some
subclasses of numbers as well.  Integers and ratios are of
type \cdf{rational}.  Rational numbers and floating-point
numbers are of type \cdf{real}.  Real numbers and complex
numbers are of type \cdf{number}.

Although the names of these types were chosen with the
terminology of mathematics in mind, the correspondences
are not always exact.  Integers and ratios model the
corresponding mathematical concepts directly.  Numbers
of type \cdf{float} may be used to approximate real
numbers, both rational and irrational.  The \cdf{real} type
includes all Common Lisp numbers that represent
mathematical real numbers, though there are
mathematical real numbers (irrational numbers)
that do not have an exact Common Lisp representation.
Only \cdf{real} numbers may be ordered using the \cdf{<}, \cdf{>}, \cdf{<=},
and \cdf{>=} functions.
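
These relationships can be checked directly.  The following sketch uses only
\cdf{typep} and \cdf{<}:
\begin{lisp}
(typep 3 'rational)~~~~~~~~~;\textrm{True: integers are rational} \\
(typep 2/3 'real)~~~~~~~~~~~;\textrm{True: rationals are real} \\
(typep 1.5 'real)~~~~~~~~~~~;\textrm{True: floating-point numbers are real} \\
(typep \#C(0 1) 'real)~~~~~~;\textrm{False: a complex number is not a real} \\
(typep \#C(0 1) 'number)~~~~;\textrm{True} \\
(< 1 3/2 2.0)~~~~~~~~~~~~~~~;\textrm{True: reals of different kinds may be compared}
\end{lisp}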

\subsection{Integers}
\label{INTEGERS-SECTION}
\indexterm{integer}

The \cdf{integer} data type is intended to represent mathematical integers.
Unlike most programming languages, Common Lisp in principle imposes no limit on
the magnitude of an integer; storage
is automatically allocated as necessary to represent large integers.

In every Common Lisp implementation there is a range of integers that are
represented more efficiently than others; each such integer is called a
\emph{fixnum}, and an integer that is not a fixnum is called a
\emph{bignum}.
Common Lisp is designed to hide this distinction as much as possible;
the distinction between fixnums and bignums is visible to
the user in only a few places where the efficiency of representation is
important.  Exactly which integers are
fixnums is implementation-dependent; typically they will be those
integers in the range $-2^{n}$ to $2^{n}-1$,
inclusive, for some \emph{n} not less than 15.
See \cdf{most-positive-fixnum} and \cdf{most-negative-fixnum}.
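
As an illustration, the boundary between fixnums and bignums can be observed
as follows (the actual value of \cdf{most-positive-fixnum} is
implementation-dependent):
\begin{lisp}
(typep most-positive-fixnum 'fixnum)~~~~~~~~~;\textrm{True} \\
(typep (+ most-positive-fixnum 1) 'fixnum)~~~;\textrm{False: the sum is a bignum} \\
(typep (+ most-positive-fixnum 1) 'integer)~~;\textrm{True: it is still an integer}
\end{lisp}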

The type \cdf{fixnum} must be a supertype
of the type \cd{(signed-byte 16)}; additionally, the value
of \cdf{array-dimension-limit} must be a fixnum (implying that the implementor
should choose the range of fixnums to be large enough to accommodate the
largest size of array to be supported).

\beforenoterule
\begin{rationale}
This specification allows programmers to declare variables in portable code
to be of type \cdf{fixnum} for efficiency.  Fixnums are guaranteed to
encompass at least the set of 16-bit signed integers
(compare this to the data type \cd{short int} in the C programming language).
In addition, any valid array index must be a fixnum, and therefore variables
used to hold array indices (such as a \cdf{dotimes} variable)
may be declared \cdf{fixnum} in portable code.
\end{rationale}
\afternoterule
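
For example, a loop over the elements of a vector might be written as
follows.  This is only a sketch: the function name \cdf{sum-vector} is
hypothetical, and the declaration may be omitted without changing the result.
\begin{lisp}
(defun sum-vector (v) \\
~~(let ((sum 0)) \\
~~~~(dotimes (i (length v) sum) \\
~~~~~~(declare (fixnum i))~~~~~~;\textrm{Every valid array index is a fixnum} \\
~~~~~~(incf sum (aref v i)))))
\end{lisp}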

Integers are ordinarily written in decimal notation, as a sequence
of decimal digits, optionally preceded by a sign and optionally followed
by a decimal point.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\=\kill
\>0~~~~~\';\textrm{Zero} \\*
\>-0~~~~~\';\textrm{This \emph{always} means the same as \cd{0}} \\*
\>+6~~~~~\';\textrm{The first perfect number} \\
\>28~~~~~\';\textrm{The second perfect number} \\
\>1024.~~~~~\';\textrm{Two to the tenth power} \\*
\>-1~~~~~\';\textrm{$e^{\pi i}$} \\*
\>15511210043330985984000000.~~~~~\';\textrm{25 factorial (25!), probably a bignum}
\end{lisp}

Integers may be notated in radices other than ten.
The notation
\begin{lisp}
\#\emph{nn}r\emph{ddddd}     \textrm{or}     \#\emph{nn}R\emph{ddddd}
\end{lisp}
means the integer in radix-\emph{nn} notation denoted by the digits
\emph{ddddd}.  More precisely, one may write \cd{\#}, a non-empty sequence
of decimal digits representing an unsigned decimal integer \emph{n},
\cdf{r} (or \cdf{R}), an optional sign, and a sequence of radix-\emph{n}
digits, to indicate an integer written in radix \emph{n} (which must be
between 2 and 36, inclusive).  Only legal digits
for the specified radix may be used; for example, an octal number may
contain only the digits 0 through 7.  For digits above 9,
letters of the alphabet of either
case may be used in order.  Binary, octal, and
hexadecimal radices are useful enough to warrant the special
abbreviations \cd{\#b} for \cd{\#2r}, \cd{\#o} for \cd{\#8r}, and
\cd{\#x} for \cd{\#16r}.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~\=\kill
\>\#2r11010101~~~~~\';\textrm{Another way of writing \cd{213} decimal} \\
\>\#b11010101~~~~~\';\textrm{Ditto} \\
\>\#b+11010101~~~~~\';\textrm{Ditto} \\
\>\#o325~~~~~\';\textrm{Ditto, in octal radix} \\
\>\#xD5~~~~~\';\textrm{Ditto, in hexadecimal radix} \\
\>\#16r+D5~~~~~\';\textrm{Ditto} \\
\>\#o-300~~~~~\';\textrm{Decimal -192, written in base 8} \\
\>\#3r-21010~~~~~\';\textrm{Same thing in base 3} \\
\>\#25R-7H~~~~~\';\textrm{Same thing in base 25} \\
\>\#xACCEDED~~~~~\';\textrm{181202413, in hexadecimal radix}
\end{lisp}

\subsection{Ratios}
\indexterm{ratio}
\indexterm{rational}

A \emph{ratio} is a number representing the mathematical ratio
of two integers.  Integers and ratios collectively constitute
the type \cdf{rational}.
The canonical representation of a rational number is as an
integer if its value is integral, and otherwise as the ratio of two
integers, the \emph{numerator} and \emph{denominator}, whose greatest
common divisor is 1, and of which the denominator is positive (and in
fact greater than 1, or else the value would be integral).
A ratio is notated with
\cdf{/} as a separator, thus: \cd{3/5}.  It is possible to notate
ratios in non-canonical (unreduced) forms, such as \cd{4/6}, but the
Lisp function \cd{prin1} always prints the canonical form for a
ratio.

If any computation produces a result that is a ratio of
two integers such that the denominator evenly divides the
numerator, then the result is immediately converted to the equivalent
integer.  This is called the rule of \emph{rational canonicalization}.
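
For example, under this rule the following results are required:
\begin{lisp}
(/ 6 4)~~~~~~~~~~~~~~~~~~~~~;\textrm{The ratio \cd{3/2}} \\
(/ 10 5)~~~~~~~~~~~~~~~~~~~~;\textrm{The integer \cd{2}, not a ratio} \\
(+ 1/2 1/2)~~~~~~~~~~~~~~~~~;\textrm{The integer \cd{1}} \\
(typep (/ 10 5) 'integer)~~~;\textrm{True}
\end{lisp}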

Rational numbers may be written as the possibly signed quotient of
decimal numerals: an optional sign followed by two non-empty sequences of
digits separated by a \cd{/}.  This syntax may be described as
follows:

\begin{tabbing}
\emph{ratio} ::= \Mopt{\emph{sign}} \Mplus{\emph{digit}} \cd{/} \Mplus{\emph{digit}}
\end{tabbing}

The second sequence may not consist
entirely of zeros.
For example:
\begin{lisp}
2/3~~~~~~~~~~~~~~~~~~~~;\textrm{This is in canonical form} \\
4/6~~~~~~~~~~~~~~~~~~~~;\textrm{A non-canonical form for the same number} \\
-17/23~~~~~~~~~~~~~~~~~;\textrm{A not very interesting ratio} \\
-30517578125/32768~~~~~;\textrm{This is $(-5/2)^{15}$} \\
10/5~~~~~~~~~~~~~~~~~~~;\textrm{The canonical form for this is \cd{2}}
\end{lisp}

To notate rational numbers in radices other than ten,
one uses the same radix specifiers
(one of \cd{\#\emph{nn}R}, \cd{\#O}, \cd{\#B}, or \cd{\#X}) as for integers.
For example:

\begin{lisp}
\#o-101/75~~~~~~~~~~;\textrm{Octal notation for \cd{-65/61}} \\
\#3r120/21~~~~~~~~~~;\textrm{Ternary notation for \cd{15/7}} \\
\#Xbc/ad~~~~~~~~~~~~;\textrm{Hexadecimal notation for \cd{188/173}} \\
\#xFADED/FACADE~~~~~;\textrm{Hexadecimal notation for \cd{1027565/16435934}}
\end{lisp}

\subsection{Floating-Point Numbers}

Common Lisp allows an implementation to provide one or more kinds of
floating-point number, which collectively make up the type \cdf{float}.
Now a floating-point number is a (mathematical)
rational number of the form
$\emph{s} \cdot \emph{f} \cdot \emph{b}^{e-p}$,
where \emph{s} is $+1$ or $-1$, the \emph{sign};
\emph{b} is an integer greater than 1,
the \emph{base} or \emph{radix} of the representation;
\emph{p} is a positive integer,
the \emph{precision} (in base-\emph{b} digits) of the floating-point number;
\emph{f} is a positive integer between
$\emph{b}^{p-1}$ and $\emph{b}^{p}-1$ (inclusive),
the \emph{significand};
and \emph{e} is an integer, the \emph{exponent}.
The value of \emph{p} and the range of \emph{e}
depend on the implementation and on the type of floating-point number
within that implementation.
In addition, there is a floating-point zero;
depending on the implementation, there may also be a ``minus zero.''
If there is no minus zero, then \cd{0.0} and \cd{-0.0} are
both interpreted as simply a floating-point zero.
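
As a worked illustration, assume a binary format with $b=2$ and $p=24$ (these
values are chosen only for the example; an implementation may use others).
Then $f$ must lie between $2^{23}$ and $2^{24}-1$, and
\begin{displaymath}
1.0 = (+1) \cdot 2^{23} \cdot 2^{1-24}, \qquad
-0.75 = (-1) \cdot 12582912 \cdot 2^{0-24},
\end{displaymath}
so that $1.0$ has $s=+1$, $f=2^{23}=8388608$, and $e=1$, while $-0.75$ has
$s=-1$, $f=12582912$, and $e=0$.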

\beforenoterule
\begin{implementation}
The form of the above description should not be construed
to require the internal representation to be in sign-magnitude form.
Two's-complement and other representations are also acceptable.  Note
that the radix of the internal representation may be other than 2, as on
the IBM 360 and 370, which use radix 16; see
\cdf{float-radix}.
\end{implementation}
\afternoterule

Floating-point numbers may be provided in a variety of precisions and sizes,
depending on the implementation.  High-quality floating-point
software tends to depend critically on the precise nature of the
floating-point arithmetic and so may not always be completely portable.
As an aid in writing programs that are
moderately portable, however, certain definitions are made here:
\begin{itemize}
\item
A \emph{short} floating-point number (type \cdf{short-float})
is of the representation of smallest
fixed precision provided by an implementation.

\item
A \emph{long} floating-point number (type \cdf{long-float})
is of the representation of the largest fixed 
precision provided by an implementation.

\item
Intermediate between short and long formats are two others, arbitrarily
called \emph{single} and \emph{double} (types \cdf{single-float} and \cdf{double-float}).
\end{itemize}

The precise definition of these categories is implementation-dependent.
However, the rough intent is that short floating-point numbers be
precise to at least four decimal places (but also have
a space-efficient representation);
single floating-point numbers, to at least seven decimal places;
and double floating-point numbers, to at least fourteen decimal places.
It is suggested that
the precision (measured in bits, computed as $p \log_2 b$)
and the exponent size (also measured in bits, computed as the base-2
logarithm of 1 plus the maximum exponent value) be at least as great
as the values in table~\ref{Floating-Format-Requirements-Table}.

\begin{table}[t]
\caption{Recommended Minimum Floating-Point Precision and Exponent Size}
\label{Floating-Format-Requirements-Table}
\begin{tabular}{@{}lll@{}}
{Format\quad\quad}&{Minimum Precision\quad\quad}&{Minimum Exponent Size} \\ \hlinesp
Short&13 bits&5 bits \\
Single&24 bits&8 bits \\
Double&50 bits&8 bits \\
Long&50 bits&8 bits
\end{tabular}
\end{table}
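
To relate the two measures of precision, note that $n$ bits correspond to
roughly $n \log_{10} 2 \approx 0.301\,n$ decimal digits.  As a rough check of
the recommended minimum precisions:
\begin{displaymath}
13 \log_{10} 2 \approx 3.9, \qquad
24 \log_{10} 2 \approx 7.2, \qquad
50 \log_{10} 2 \approx 15.1 .
\end{displaymath}
Similarly, a radix-16 representation with $p=6$ digits (values assumed here
only for illustration) has a precision of $p \log_2 b = 6 \cdot 4 = 24$ bits.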

Floating-point numbers are written in either decimal fraction
or computerized scientific notation: an optional sign,
then a non-empty sequence of digits with an embedded decimal point,
then an optional decimal exponent specification.
If there is no exponent specifier, then
the decimal point is required, and there must be digits
after it.
The exponent specifier consists of an exponent marker,
an optional sign, and a non-empty sequence of digits.
To be precise, here is a modified-BNF description of floating-point
notation.
\begin{tabbing}
\emph{floating-point-number} ::= \=\Mopt{\emph{sign}} \Mstar{\emph{digit}} {\it
decimal-point} \Mplus{\emph{digit}} \Mopt{\emph{exponent}} \\*
\>\hbox to 0pt{\hss\Mor~}\Mopt{{\it
sign}} \Mplus{\emph{digit}} \Mopt{\emph{decimal-point} \Mstar{\emph{digit}}} {\it
exponent} \\
\emph{sign} ::= \cdf{+} {\Mor} \cdf{-} \\
\emph{decimal-point} ::= \cd{.} \\
\emph{digit} ::= \cd{0} {\Mor} \cd{1} {\Mor} \cd{2} {\Mor} \cd{3} {\Mor} \cd{4}
         {\Mor} \cd{5} {\Mor} \cd{6} {\Mor} \cd{7} {\Mor} \cd{8} {\Mor} \cd{9}\\
\emph{exponent} ::= \emph{exponent-marker} \Mopt{\emph{sign}} \Mplus{\emph{digit}}\\*
\emph{exponent-marker} ::= \cd{e} {\Mor} \cd{s} {\Mor} \cd{f}
{\Mor} \cd{d} {\Mor} \cd{l} {\Mor} \cd{E} {\Mor} \cd{S} {\Mor} \cd{F} {\Mor}
\cd{D} {\Mor} \cd{L}
\end{tabbing}
If no exponent specifier is present, or if the exponent marker \cdf{e}
(or \cdf{E}) is used, then the precise format to be used is not
specified.  When such a representation is read and
converted to an internal floating-point data object, the format specified
by the variable \cdf{*read-default-float-format*} is used; the initial
value of this variable is \cdf{single-float}.

The letters \cd{s}, \cd{f}, \cd{d}, and \cd{l} (or their
respective uppercase equivalents) explicitly specify the
use of \emph{short}, \emph{single}, \emph{double}, and \emph{long} format, respectively.

Examples of floating-point numbers:
\begin{lisp}
0.0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Floating-point zero in default format} \\
0E0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Also floating-point zero in default format} \\
-.0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{This may be a zero or a minus zero,} \\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~; \textrm{depending on the implementation} \\
0.~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{The \emph{integer} zero, not a floating-point zero!} \\
0.0s0~~~~~~~~~~~~~~~~~~~~~~~;\textrm{A floating-point zero in \emph{short} format} \\
0s0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Also a floating-point zero in \emph{short} format} \\
3.1415926535897932384d0~~~~~;\textrm{A \emph{double}-format approximation to $\pi$} \\
6.02E+23~~~~~~~~~~~~~~~~~~~~;\textrm{Avogadro's number, in default format} \\
602E+21~~~~~~~~~~~~~~~~~~~~~;\textrm{Also Avogadro's number, in default format} \\
3.010299957f-1~~~~~~~~~~~~~~;\textrm{$\log_{10} 2$, in \emph{single} format} \\
-0.000000001s9~~~~~~~~~~~~~~;\textrm{$e^{\pi i}$ in \emph{short} format, the hard way}
\end{lisp}

The internal format used for an external representation depends only
on the exponent marker and not on the number of decimal digits
in the external representation.
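
The effect of the exponent marker can be sketched as follows.  The first two
results hold in every implementation; the third assumes that
\cdf{*read-default-float-format*} has its initial value when the form is read:
\begin{lisp}
(typep 1.0s0 'short-float)~~~~;\textrm{True} \\
(typep 1.0d0 'double-float)~~~;\textrm{True, however few digits are written} \\
(typep 1.0 'single-float)~~~~~;\textrm{True if the default format is \cdf{single-float}} \\
(let ((*read-default-float-format* 'double-float)) \\
~~(typep (read-from-string "1.0e0") 'double-float))~~;\textrm{True}
\end{lisp}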

While Common Lisp provides terminology and notation sufficient
to accommodate four distinct floating-point formats,
not all implementations will have the means to support
that many distinct formats.
An implementation is therefore permitted to provide
fewer than four distinct internal floating-point formats,
in which case at least one of them will be ``shared''
by more than one of the external format names \emph{short}, \emph{single},
\emph{double}, and \emph{long} according to the following rules:
\begin{itemize}
\item
If one internal format is provided, then it is considered to be
\emph{single}, but serves also as \emph{short}, \emph{double}, and \emph{long}.
The data types \cdf{short-float},
\cdf{single-float}, \cdf{double-float}, and \cdf{long-float} are
considered to be identical.  An expression such as \cd{(eql 1.0s0 1.0d0)}
will be true in such an implementation
because the two numbers \cd{1.0s0} and \cd{1.0d0} will
be converted into the same internal format and therefore be considered
to have the same data type, despite the differing external syntax.
Similarly, \cd{(typep 1.0L0 'short-float)} will be true in such
an implementation.
For output purposes all floating-point numbers are assumed to be
of \emph{single} format and thus will print using the
exponent letter \cdf{E} or \cdf{F}.

\item
If two internal formats are provided, then either of two correspondences
may be used, depending on which is the more appropriate:
\begin{itemize}
\item
One format is \emph{short}; the other is \emph{single} and serves also
as \emph{double} and \emph{long}.
The data types
\cdf{single-float}, \cdf{double-float}, and \cdf{long-float} are
considered to be identical, but \cdf{short-float} is distinct.
An expression such as \cd{(eql 1.0s0 1.0d0)}
will be false, but \cd{(eql 1.0f0 1.0d0)} will be true.
Similarly, \cd{(typep 1.0L0 'short-float)} will be false,
but \cd{(typep 1.0L0 'single-float)} will be true.
For output purposes all floating-point numbers are assumed to be
of \emph{short} or \emph{single} format.

\item
One format is \emph{single} and serves also as \emph{short};
the other is \emph{double} and serves also as \emph{long}.
The data types \cdf{short-float} and \cdf{single-float} are considered to be
identical, and the data types \cdf{double-float} and \cdf{long-float} are
considered to be identical.
An expression such as \cd{(eql 1.0s0 1.0d0)}
will be false, as will \cd{(eql 1.0f0 1.0d0)};
but \cd{(eql 1.0d0 1.0L0)} will be true.
Similarly, \cd{(typep 1.0L0 'short-float)} will be false,
but \cd{(typep 1.0L0 'double-float)} will be true.
For output purposes all floating-point numbers are assumed to be
of \emph{single} or \emph{double} format.
\end{itemize}

\item
If three internal formats are provided, then either of two correspondences
may be used, depending on which is the more appropriate:
\begin{itemize}
\item
One format is \emph{short}; another format is \emph{single}; and the third format is
\emph{double} and serves also as \emph{long}.  Similar constraints apply.

\item
One format is \emph{single} and serves also as \emph{short};
another is \emph{double}; and the third format is \emph{long}.
\end{itemize}
\end{itemize}
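
A program can discover which correspondence is in effect by comparing numbers
written with different exponent markers.  In the following sketch, each form
is true exactly when the two formats named share one internal format:
\begin{lisp}
(eql 1.0s0 1.0f0)~~~~~;\textrm{\emph{short} and \emph{single}} \\
(eql 1.0f0 1.0d0)~~~~~;\textrm{\emph{single} and \emph{double}} \\
(eql 1.0d0 1.0L0)~~~~~;\textrm{\emph{double} and \emph{long}}
\end{lisp}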

\beforenoterule
\begin{implementation}
It is recommended that an implementation
provide as many distinct floating-point formats as feasible,
using table~\ref{Floating-Format-Requirements-Table} as a guideline.
Ideally, short-format floating-point numbers should have an
``immediate'' representation that does not require heap allocation;
single-format
floating-point numbers should approximate IEEE proposed standard
single-format floating-point numbers; and double-format floating-point
numbers should approximate IEEE proposed standard double-format
floating-point numbers
\cite{IEEE-PROPOSED-FLOATING-POINT-STANDARD,IEEE-FLOATING-POINT-IMPL-GUIDE,IEEE-FLOATING-POINT-IMPL-GUIDE-ERRATA}.
\end{implementation}
\afternoterule


\subsection{Complex Numbers}

Complex numbers (type \cdf{complex})
are represented in Cartesian form, with a real part and an imaginary
part, each of which is a non-complex number (integer, ratio, or floating-point
number).  It should be emphasized that the parts of a complex
number are not necessarily floating-point numbers; in this, Common Lisp
is like PL/I and differs from Fortran.  However, both parts must
be of the same type: either both are rational, or both are of the
same floating-point format. 

Complex numbers may be notated by writing the characters \cd{\#C}
followed by a list of the real and imaginary parts.
If the two parts as notated are not of the same type, then
they are converted according to the rules of floating-point contagion
as described in chapter~\ref{NUMBER}.
(Indeed, \cd{\#C(\emph{a} \emph{b})} is equivalent to \cd{\#,(complex \emph{a} \emph{b})};
see the description of the function \cdf{complex}.)
For example:
\begin{lisp}
\#C(3.0s1 2.0s-1)~~~~~;\textrm{Real and imaginary parts are short format}\\
\#C(5 -3)~~~~~~~~~~~~~;\textrm{A Gaussian integer} \\
\#C(5/3 7.0)~~~~~~~~~~;\textrm{Will be converted internally to \cd{\#C(1.66666 7.0)}} \\
\#C(0 1)~~~~~~~~~~~~~~;\textrm{The imaginary unit, that is, \emph{i}}
\end{lisp}

The type of a specific complex number is indicated by a list
of the word \cdf{complex} and the type of the components; for example,
a specialized representation for complex numbers with short floating-point
parts would be of type \cd{(complex short-float)}.  The type \cdf{complex}
encompasses all complex representations.

A complex number of type \cd{(complex rational)}, that is, one whose
components are rational, can never have a zero imaginary part.
If the result of a computation would be a complex rational
with a zero imaginary part, the result is immediately
converted to a non-complex rational number by taking the
real part.  This is called the rule of \emph{complex canonicalization}.
This rule does not apply to floating-point complex numbers;
\cd{\#C(5.0 0.0)} and \cd{5.0} are different.
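
For example, the following results illustrate the rule:
\begin{lisp}
(complex 5 0)~~~~~~~~~~~~~~~;\textrm{The integer \cd{5}} \\
(complex 5.0 0.0)~~~~~~~~~~~;\textrm{The complex number \cd{\#C(5.0 0.0)}} \\
(* \#C(0 1) \#C(0 1))~~~~~~~;\textrm{The integer \cd{-1}} \\
(+ \#C(1/2 3) \#C(1/2 -3))~~;\textrm{The integer \cd{1}} \\
(eql \#C(5.0 0.0) 5.0)~~~~~~;\textrm{False}
\end{lisp}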

\section{Characters}

Characters are represented as data objects of type \cdf{character}.

A character object can be notated by writing \cd{\#{\Xbackslash}} followed
by the character itself.  For example, \cd{\#{\Xbackslash}g} means the character
object for a lowercase g.  This works well enough for printing
characters.  Non-printing characters have names, and can be notated
by writing \cd{\#{\Xbackslash}} and then the name; for example, \cd{\#{\Xbackslash}Space}
(or \cd{\#{\Xbackslash}SPACE} or \cd{\#{\Xbackslash}space} or \cd{\#{\Xbackslash}sPaCE})
means the space character.  The syntax for character names after \cd{\#{\Xbackslash}}
is the same as that for symbols.  However, only character names
that are known to the particular implementation may be used.
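
For example (a brief sketch using the standard functions \cdf{characterp},
\cdf{char=}, and \cdf{char}):
\begin{lisp}
(characterp \#{\Xbackslash}g)~~~~~~~~~~~~~~;\textrm{True} \\
(char= \#{\Xbackslash}Space \#{\Xbackslash}SPACE)~~~~~;\textrm{True: both notate the space character} \\
(char= \#{\Xbackslash}g \#{\Xbackslash}G)~~~~~~~~~~~~~;\textrm{False: lowercase g and uppercase G differ} \\
(char "a b" 1)~~~~~~~~~~~~~~~;\textrm{The character \cd{\#{\Xbackslash}Space}}
\end{lisp}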

\subsection{Standard Characters}

Common Lisp defines a standard character set (subtype \cdf{standard-char})
for two purposes.
Common Lisp programs that are \emph{written} in the standard character set
can be read by any Common Lisp implementation; and Common Lisp programs
that \emph{use} only standard characters as data objects are most likely
to be portable.  The Common Lisp character set consists of a space character
\cd{\#{\Xbackslash}Space}, a newline character \cd{\#{\Xbackslash}Newline}, and the
following ninety-four
non-blank printing characters or their equivalents:
\begin{lisp}
! " \# \$ \% \& ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? \\
{\Xatsign} A B C D E F G H I J K L M N O P Q R S T U V W X Y Z {\Xlbracket} {\Xbackslash} {\Xrbracket} {\Xcircumflex} {\Xunderscore} \\
{\Xbq} a b c d e f g h i j k l m n o p q r s t u v w x y z {\Xlbrace} | {\Xrbrace} {\Xtilde}
\end{lisp}
The Common Lisp standard character set is apparently equivalent to
the ninety-five standard ASCII printing characters plus a newline character.
Nevertheless, Common Lisp is designed to be relatively independent of
the ASCII character encoding.  For example, the collating sequence
is not specified except to say that digits must be properly ordered,
the uppercase letters must be properly ordered, and
the lowercase letters must be properly ordered
(see \cdf{char<} for a precise specification).
Other character encodings, particularly EBCDIC, should be easily accommodated
(with a suitable mapping of printing characters).

Of the ninety-four non-blank printing characters, the following are
used in only limited ways in the syntax of Common Lisp programs:
\begin{lisp}
{\Xlbracket}~~{\Xrbracket}~~{\Xlbrace}~~{\Xrbrace}~~?~~!~~{\Xcircumflex}~~{\Xunderscore}~~{\Xtilde}~~\$~~\% 
\end{lisp}

The following characters are called \emph{semi-standard}:
\begin{lisp}
\#{\Xbackslash}Backspace~~\#{\Xbackslash}Tab~~\#{\Xbackslash}Linefeed~~\#{\Xbackslash}Page~~\#{\Xbackslash}Return~~\#{\Xbackslash}Rubout
\end{lisp}
Not all implementations of Common Lisp need to support them; but those
implementations that
use the standard ASCII character set should support them, treating them as
corresponding respectively to the ASCII characters BS (octal code 010),
HT (011), LF (012), FF (014), CR (015), and DEL
(177). These characters are not
members of the subtype \cdf{standard-char} unless synonymous with
one of the standard characters specified above.
For example, in a given implementation it might
be sensible for the implementor to define
\cd{\#{\Xbackslash}Linefeed} or \cd{\#{\Xbackslash}Return} to be synonymous with \cd{\#{\Xbackslash}Newline},
or \cd{\#{\Xbackslash}Tab} to be synonymous with \cd{\#{\Xbackslash}Space}.
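
As an illustration, membership in the subtype \cdf{standard-char} can be
tested with \cdf{typep}.  The last form assumes an implementation that
supports \cd{\#{\Xbackslash}Tab} at all; elsewhere it cannot even be read.
\begin{lisp}
(typep \#{\Xbackslash}a 'standard-char)~~~~~~~~~;\textrm{True} \\
(typep \#{\Xbackslash}Newline 'standard-char)~~~;\textrm{True} \\
(typep \#{\Xbackslash}Tab 'standard-char)~~~~~~~;\textrm{False unless synonymous with a standard character}
\end{lisp}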

\subsection{Line Divisions}

The treatment of line divisions is one of the most difficult issues
in designing portable software, simply because there is so little agreement
among operating systems.  Some use a single character to delimit lines;
the recommended ASCII character for this purpose is the line feed character
LF (also called the new line character, NL),
but some systems use the carriage
return character CR.  Much more common is the two-character sequence
CR followed by LF.  Frequently line divisions have no representation
as a character but are implicit in the structuring of a file into records,
each record containing a line of text.  A deck of punched cards has this
structure, for example.

Common Lisp provides an abstract interface by requiring that there be a single
character, \cd{\#{\Xbackslash}Newline}, that within the language serves as a line
delimiter.  (The language C has a similar requirement.)
An implementation of Common Lisp must translate between this internal
single-character representation and whatever external representation(s)
may be used.

\beforenoterule
\begin{implementation}
How the character called \cd{\#{\Xbackslash}Newline} is represented
internally is not specified here, but it is strongly suggested that
the ASCII LF character be used in Common Lisp implementations that use the
ASCII character encoding.  The ASCII CR character is a workable,
but in most cases inferior, alternative.
\end{implementation}
\afternoterule

The requirement that a line division be represented as a single character
has certain consequences.  A character string
written in the middle of a program in such a way as to span more than
one line must contain exactly one character to represent each line division.
Consider this code fragment:
\begin{lisp}
(setq a-string "This string \\
contains \\
forty-two characters.")
\end{lisp}
Between \cdf{g} and \cdf{c} there must be exactly one character,
\cd{\#{\Xbackslash}Newline}; a two-character sequence, such as \cd{\#{\Xbackslash}Return} and then
\cd{\#{\Xbackslash}Newline}, is not acceptable, nor is the absence of a character.
The same is true between \cdf{s} and \cdf{f}.
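
As a sketch of what this requirement implies (the three expressions below
are added here purely for illustration and assume that \cdf{a-string}
has been set as in the fragment above), one would expect:
\begin{lisp}
(length a-string)~~~~~~~~~~~~~~~~;\textrm{42} \\
(char a-string 11)~~~~~~~~~~~~~~~;\textrm{The character \cd{\#{\Xbackslash}Newline}} \\
(count \#{\Xbackslash}Newline a-string)~~~~~;\textrm{2, one for each line division}
\end{lisp}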

When the character \cd{\#{\Xbackslash}Newline} is written to an output file,
the Common Lisp implementation must take the appropriate action
to produce a line division.  This might involve writing out a
record or translating \cd{\#{\Xbackslash}Newline} to a CR/LF sequence.

\beforenoterule
\begin{implementation}
If an implementation uses the ASCII character encoding,
uses the CR/LF sequence externally to delimit lines,
uses LF to represent \cd{\#{\Xbackslash}Newline} internally, and supports \cd{\#{\Xbackslash}Return}
as a data object corresponding to the ASCII character CR, the
question arises as to what action to take when the program
writes out \cd{\#{\Xbackslash}Return} followed by \cd{\#{\Xbackslash}Newline}.
It should first be noted that \cd{\#{\Xbackslash}Return} is not a standard Common Lisp
character, and the action to be taken when \cd{\#{\Xbackslash}Return} is written out
is therefore not defined by the Common Lisp language.  A plausible approach
is to buffer the \cd{\#{\Xbackslash}Return} character and suppress it if and only if the
next character is \cd{\#{\Xbackslash}Newline} (the net effect is to generate a CR/LF
sequence).
Another plausible
approach is simply to ignore
the difficulty and declare that writing \cd{\#{\Xbackslash}Return} and then
\cd{\#{\Xbackslash}Newline} results in the sequence CR/CR/LF in the output.
\end{implementation}
\afternoterule

\subsection{Non-standard Characters}

Any implementation may provide additional characters, whether printing
characters or named characters.  Some plausible examples:

\begin{lisp}
\#{\Xbackslash}$\pi$~~\#{\Xbackslash}$\alpha$~~\#{\Xbackslash}Break~~\#{\Xbackslash}Home-Up~~\#{\Xbackslash}Escape
\end{lisp}
The use of such characters may render Common Lisp programs non-portable.

\section{Symbols}

Symbols are Lisp data objects that serve several purposes
and have several interesting characteristics.  Every object of
type \cdf{symbol} has a name,
called its \emph{print name}.  Given a symbol, one can
obtain its name in the form of a string.  Conversely,
given the name of a symbol as a string, one can obtain the
symbol itself.  (More precisely, symbols are organized into
\emph{packages}, and all the symbols in a package are uniquely
identified by name.  See chapter~\ref{XPACK}.)
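
For illustration only, and assuming the default case conversion
performed by the reader (described later in this section), the two
directions can be sketched with \cdf{symbol-name} and \cdf{intern}:
\begin{lisp}
(symbol-name 'frobboz)~~~~~~~~~~~~~~;\textrm{The string \cd{{\Xdquote}FROBBOZ{\Xdquote}}} \\
(intern {\Xdquote}FROBBOZ{\Xdquote})~~~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{frobboz} in the current package} \\
(eq (intern {\Xdquote}FROBBOZ{\Xdquote}) 'frobboz)~~~~;\textrm{True when read and evaluated in the same package}
\end{lisp}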

Symbols have a component called the \emph{property list}, or \emph{plist}.
By convention this is always a list whose even-numbered
components (calling the first component zero) are symbols,
here functioning as property names, and whose odd-numbered components
are associated property values.  Functions are provided for manipulating
this property list; in effect, these allow a symbol to be treated as an
extensible record structure.
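
A minimal sketch of this use follows.  The symbol \cdf{clyde} and its
properties are hypothetical names chosen for the example, and the symbol
is assumed to have had no properties beforehand.
\begin{lisp}
(setf (get 'clyde 'species) 'elephant) \\
(setf (get 'clyde 'color) 'gray) \\
(get 'clyde 'species)~~~~~~~~~;\textrm{The symbol \cdf{elephant}} \\
(get 'clyde 'weight)~~~~~~~~~~;\textrm{{\nil}, because there is no such property} \\
(symbol-plist 'clyde)~~~~~~~~~;\textrm{A list alternating names and values;} \\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;~\textrm{the order of the pairs is not guaranteed}
\end{lisp}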

Symbols are also used to represent certain kinds of variables in Lisp
programs, and there are functions for dealing with the values associated
with symbols in this role. 

A symbol can be notated simply by writing its name.
If its name is not empty, and if the name consists only of
uppercase alphabetic, numeric, or certain pseudo-alphabetic
special characters (but not
delimiter characters such as parentheses or space), and if
the name of the symbol cannot be mistaken for a number, then
the symbol can be notated by the sequence of characters in its name.
Any uppercase letters that appear in the (internal) name may
be written in either case in the external notation (more on this below).
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~\=\kill
FROBBOZ\>;\textrm{The symbol whose name is \cdf{FROBBOZ}} \\
frobboz\>;\textrm{Another way to notate the same symbol} \\
fRObBoz\>;\textrm{Yet another way to notate it} \\
unwind-protect\>;\textrm{A symbol with a \cdf{-} in its name} \\
+\$\>;\textrm{The symbol named \cd{+\$}} \\
1+\>;\textrm{The symbol named \cdf{1+}} \\
+1\>;\textrm{This is the integer 1, not a symbol} \\
pascal{\Xunderscore}style\>;\textrm{This symbol has an underscore in its name} \\
b{\Xcircumflex}2-4*a*c\>;\textrm{This is a single symbol!} \\
\>;~\textrm{It has several special characters in its name} \\
file.rel.43\>;\textrm{This symbol has periods in its name} \\
/usr/games/zork\>;\textrm{This symbol has slashes in its name}
\end{lisp}

In addition to letters and numbers, the following characters are normally
considered to be alphabetic for the purposes of notating
symbols:
\begin{lisp}
+~~-~~*~~/~~{\Xatsign}~~\$~~\%~~{\Xcircumflex}~~\&~~{\Xunderscore}~~=~~<~~>~~{\Xtilde}~~.
\end{lisp}
Some of these characters have conventional purposes for naming things;
for example, symbols that name special variables
generally have names beginning and ending with
\cdf{*}.  The last character listed above, the period, is considered alphabetic
\emph{provided} that a token does not consist entirely of periods.
A single period standing by itself is used in the notation
of conses and dotted lists; a token consisting of two or more periods
is syntactically illegal.  (The period also serves as the decimal point
in the notation of numbers.)

The following characters are also alphabetic by default but are explicitly
reserved to the user for definition as reader macro characters
(see section~\ref{MACRO-CHARACTERS-SECTION}) or any other desired purpose
and therefore should not be used routinely in names of symbols:
\begin{lisp}
?~~!~~{\Xlbracket}~~{\Xrbracket}~~{\Xlbrace}~~{\Xrbrace}
\end{lisp}

A symbol may have uppercase letters, lowercase letters, or both
in its print name.
However, the Lisp reader normally converts lowercase letters to
the corresponding uppercase letters when reading symbols.
The net effect is that most of the time case makes no
difference when \emph{notating} symbols.  Case \emph{does} make
a difference internally and when printing a symbol.
Internally the symbols that name all standard Common Lisp functions,
variables, and keywords have uppercase names; their names appear
in lowercase in this book for readability.  Typing such names
with lowercase letters works because the function \cdf{read} will convert
lowercase letters to the equivalent uppercase letters.

See \cdf{readtable-case}, which controls whether \cdf{read} will alter the case
of letters read as part of the name of a symbol.
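
The following sketch illustrates how case is treated, assuming the standard
readtable with its default case conversion (the expressions are examples
added here for illustration):
\begin{lisp}
(eq 'frobboz 'FROBBOZ)~~~~~~~~~~~~;\textrm{True: both notations are read as \cdf{FROBBOZ}} \\
(symbol-name 'car)~~~~~~~~~~~~~~~~;\textrm{The string \cd{{\Xdquote}CAR{\Xdquote}}} \\
(eq (intern {\Xdquote}car{\Xdquote}) 'car)~~~~~~~~~~;\textrm{False: \cdf{intern} does not convert case,} \\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;~\textrm{so the name given here is lowercase}
\end{lisp}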

If a symbol cannot be simply notated by the characters of its name
because the (internal) name contains special characters or lowercase letters,
then there are two ``escape'' conventions for notating them.
Writing a \cd{{\Xbackslash}} character before any character causes the character
to be treated itself as an ordinary character for use in a symbol name;
in particular, it suppresses internal conversion of lowercase letters
to their uppercase equivalents.
If any character in a notation is preceded by \cd{{\Xbackslash}}, then that
notation can never be interpreted as a number.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~~~~~\=\kill
{\Xbackslash}(\>;\textrm{The symbol whose name is \cd{(}} \\
{\Xbackslash}+1\>;\textrm{The symbol whose name is \cd{+1}} \\
+{\Xbackslash}1\>;\textrm{Also the symbol whose name is \cd{+1}} \\
{\Xbackslash}frobboz\>;\textrm{The symbol whose name is \cd{fROBBOZ}} \\
3.14159265{\Xbackslash}s0\>;\textrm{The symbol whose name is \cd{3.14159265s0}} \\
3.14159265{\Xbackslash}S0\>;\textrm{A different symbol, whose name is \cd{3.14159265S0}} \\
3.14159265s0\>;\textrm{A short-format floating-point approximation to $\pi$} \\
APL{\Xbackslash}{\Xbackslash}360\>;\textrm{The symbol whose name is \cd{APL{\Xbackslash}360}} \\
apl{\Xbackslash}{\Xbackslash}360\>;\textrm{Also the symbol whose name is \cd{APL{\Xbackslash}360}} \\
{\Xbackslash}(b{\Xcircumflex}2{\Xbackslash}){\Xbackslash} -{\Xbackslash} 4*a*c\>;\textrm{The name is \cd{(B{\Xcircumflex}2) - 4*A*C};} \\
\>;~\textrm{it has parentheses and two spaces in it} \\
{\Xbackslash}({\Xbackslash}b{\Xcircumflex}2{\Xbackslash}){\Xbackslash} -{\Xbackslash} 4*{\Xbackslash}a*{\Xbackslash}c\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c};} \\
\>;~\textrm{the letters are explicitly lowercase}
\end{lisp}

It may be tedious to insert a \cd{{\Xbackslash}} before \emph{every} delimiter
character in the name of a symbol if there are many of them.
An alternative convention is to surround the name of a symbol
with vertical bars; these cause every character between them to
be taken as part of the symbol's name, as if \cd{{\Xbackslash}} had been written
before each one, excepting only
\cd{|} itself and \cd{{\Xbackslash}}, which must nevertheless be preceded by \cd{{\Xbackslash}}.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~\=\kill
|"|\>;\textrm{The same as writing \cd{{\Xbackslash}"}} \\
|(b{\Xcircumflex}2) - 4*a*c|\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c}} \\
|frobboz|\>;\textrm{The name is \cd{frobboz}, not \cd{FROBBOZ}} \\
|APL{\Xbackslash}360|\>;\textrm{The name is \cd{APL360}, because the \cd{{\Xbackslash}} quotes the \cd{3}} \\
|APL{\Xbackslash}{\Xbackslash}360|\>;\textrm{The name is \cd{APL{\Xbackslash}360}} \\
|apl{\Xbackslash}{\Xbackslash}360|\>;\textrm{The name is \cd{apl{\Xbackslash}360}} \\
|{\Xbackslash}|{\Xbackslash}||\>;\textrm{Same as \cd{{\Xbackslash}|{\Xbackslash}|}: the name is \cd{||}} \\
|(B{\Xcircumflex}2) - 4*A*C|\>;\textrm{The name is \cd{(B{\Xcircumflex}2) - 4*A*C};} \\
\>;~\textrm{it has parentheses and two spaces in it} \\
|(b{\Xcircumflex}2) - 4*a*c|\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c}}
\end{lisp}
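
As an illustrative check of these conventions (again assuming the
standard readtable), \cdf{symbol-name} can be applied to symbols written
with escapes:
\begin{lisp}
(symbol-name '{\Xbackslash}frobboz)~~~~~~~~~~~~~~~;\textrm{The string \cd{{\Xdquote}fROBBOZ{\Xdquote}}} \\
(symbol-name '|frobboz|)~~~~~~~~~~~~~~;\textrm{The string \cd{{\Xdquote}frobboz{\Xdquote}}} \\
(eq '|FROBBOZ| 'frobboz)~~~~~~~~~~~~~~;\textrm{True: the same symbol} \\
(symbol-name '|APL{\Xbackslash}360|)~~~~~~~~~~~~~~;\textrm{The string \cd{{\Xdquote}APL360{\Xdquote}}} \\
(length (symbol-name '|APL{\Xbackslash}{\Xbackslash}360|))~~~~;\textrm{7}
\end{lisp}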

\section{Lists and Conses}
\indexterm{cons}

A \cdf{cons} is a record structure containing two components
called the \emph{car} and the \emph{cdr}.  Conses are used primarily
to represent lists.

A \emph{list} is recursively defined to be either the empty list
or a cons whose \emph{cdr} component is a list.
A list is therefore a chain of conses linked by their \emph{cdr} components
and terminated by {\nil}, the empty list.  The \emph{car} components of the conses
are called the \emph{elements} of the list.  For each element of the list
there is a cons.  The empty list has no elements at all.

A list is notated by writing the elements of the list in order,
separated by blank space (space, tab, or return characters)
and surrounded by parentheses.
\begin{lisp}
(a b c)~~~~~~~~~~~~~~~;\textrm{A list of three symbols} \\
(2.0s0 (a 1) \#{\Xbackslash}*)~~~~~;\textrm{A list of three things: a short floating-point} \\
~~~~~~~~~~~~~~~~~~~~~~;~\textrm{number, another list, and a character object}
\end{lisp}
The empty list {\nil} therefore can be written as {\emptylist}, because it is a list
with no elements.

A \emph{dotted list} is one whose last cons does not have {\nil} for
its \emph{cdr}, rather some other data object (which is also not a cons,
or the first-mentioned cons would not be the last cons of the list).
Such a list is called ``dotted'' because of the special notation
used for it: the elements of the list are written between
parentheses as before, but after the last element and before
the right parenthesis are written a dot (surrounded by blank space)
and then the \emph{cdr} of the last cons.  As a special case,
a single cons is notated by writing the \emph{car} and the \emph{cdr} between
parentheses and separated by a space-surrounded dot.
For example:
\begin{lisp}
(a . 4)~~~~~~~~~;\textrm{A cons whose \emph{car} is a symbol} \\
~~~~~~~~~~~~~~~~;~\textrm{and whose \emph{cdr} is an integer} \\
(a b c . d)~~~~~;\textrm{A dotted list with three elements whose last cons} \\
~~~~~~~~~~~~~~~~;~\textrm{has the symbol \cdf{d} in its \emph{cdr}}
\end{lisp}

It is legitimate to write something like \cd{(a b . (c d))};
this means the same as \cd{(a b c d)}.  The standard Lisp
output routines will never print a list in the first form, however;
they will avoid dot notation wherever possible.
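
A brief sketch of these notations at work (the particular expressions are
illustrative, and results are shown in lowercase as elsewhere in this book):
\begin{lisp}
(cons 'a 4)~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Prints as \cd{(a . 4)}} \\
(cons 'a (cons 'b nil))~~~~~~~~~~~~~~;\textrm{Prints as \cd{(a b)}} \\
(cdr '(a b c . d))~~~~~~~~~~~~~~~~~~~;\textrm{Prints as \cd{(b c . d)}} \\
(equal '(a b . (c d)) '(a b c d))~~~~;\textrm{True} \\
(prin1 '(a b . (c d)))~~~~~~~~~~~~~~~;\textrm{Prints \cd{(a b c d)}}
\end{lisp}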

Often the term \emph{list} is used to refer either to true lists or to
dotted lists.  When the distinction is important,
the term ``true list'' will be used to refer to a list
terminated by {\nil}.  Most functions
advertised to operate on lists expect to be given true lists. Throughout
this book, unless otherwise specified, it is an error to pass a dotted
list to a function that is specified to require a list as an argument.

\beforenoterule
\begin{implementation}
Implementors are encouraged to use the equivalent
of the predicate \cdf{endp} wherever it is necessary to test
for the end of a list.  Whenever feasible, this test should explicitly
signal an error if a list is found to be terminated by a non-{\nil} atom.
However, such an explicit error signal is not required, because
some such tests occur in inner loops where efficiency is critical.
In such cases, the predicate \cdf{atom} may be used to test
for the end of the list, quietly treating any non-{\nil} list-terminating
atom as if it were {\nil}.
\end{implementation}
\afternoterule
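
A minimal sketch of a loop that tests for the end of a list with
\cdf{endp} follows.  The function name \cdf{count-elements} is hypothetical
and is used only for this example.
\begin{lisp}
(defun count-elements (list) \\
~~(do ((tail list (cdr tail)) \\
~~~~~~~(n 0 (+ n 1))) \\
~~~~~~((endp tail) n)))
\end{lisp}
With this definition one would expect:
\begin{lisp}
(count-elements '(a b c))~~~~~~~~;\textrm{3} \\
(count-elements '())~~~~~~~~~~~~~;\textrm{0} \\
(count-elements '(a b . c))~~~~~~;\textrm{Is an error; \cdf{endp} may signal it}
\end{lisp}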

Sometimes the term \emph{tree} is used to refer to some cons
and all the other conses transitively accessible to it
through \emph{car} and \emph{cdr} links until non-conses are reached;
these non-conses are called the \emph{leaves} of the tree.

Lists, dotted lists, and trees are not mutually exclusive data types;
they are simply useful points of view about structures of conses.
There are yet other terms, such as \emph{association list}.
None of these are true Lisp data types.  Conses are a data type,
and {\nil} is the sole object of type \cdf{null}.
The Lisp data type \cdf{list} is taken to mean the union of the
\cdf{cons} and \cdf{null} data types, and therefore encompasses both
true lists and dotted lists.
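
These relationships can be checked with \cdf{typep}; the following
expressions are illustrative:
\begin{lisp}
(typep '(a b c) 'list)~~~~~~;\textrm{True: a true list} \\
(typep '(a . b) 'list)~~~~~~;\textrm{True: a dotted list is still a cons} \\
(typep '() 'list)~~~~~~~~~~~;\textrm{True: {\nil} is of type \cdf{null}} \\
(typep '() 'cons)~~~~~~~~~~~;\textrm{False}
\end{lisp}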

\section{Arrays}
\label{ARRAY-TYPE-SECTION}
\indexterm{array}

An \cdf{array} is an object with components arranged according
to a Cartesian coordinate system.
In general, these components may be any Lisp data objects.

The number of dimensions of an array is called its \emph{rank}
(this terminology is borrowed from APL);
the rank is a non-negative integer.
Likewise, each dimension is itself a non-negative integer.
The total number of elements in the array is the product of all the
dimensions.

An implementation of Common Lisp may impose a limit on the rank of an array,
but this limit may not be smaller than 7.  Therefore, any Common Lisp
program may assume the use of arrays of rank 7 or less.
(A program may determine the actual limit on array ranks for
a given implementation by examining the constant \cdf{array-rank-limit}.)

It is permissible for a dimension to be zero.  In this case,
the array has no elements, and any attempt to access an element
is in error.  However, other properties of the array, such as the
dimensions themselves, may be used.
If the rank is zero, then there are no dimensions, and the
product of the dimensions is then by definition 1.
A zero-rank array therefore has a single element.

An array element is specified by a sequence of indices.
The length of the sequence must equal the rank of the array.
Each index must be a non-negative integer strictly less than
the corresponding array dimension.  Array indexing is
therefore zero-origin, not one-origin as in (the default case of)
Fortran.

As an example, suppose that the variable \cdf{foo} names a 3-by-5 array.
Then the first index may be 0, 1, or 2, and the second index
may be 0, 1, 2, 3, or 4.  One may refer to array elements using
the function \cdf{aref}; for example, \cd{(aref foo 2 1)}
refers to element (2, 1) of the array.  Note that \cdf{aref} takes
a variable number of arguments: an array, and as many indices
as the array has dimensions.
A zero-rank array has no dimensions, and therefore
\cdf{aref} would take such an array and no indices, and return the sole
element of the array.
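
The example can be sketched as follows.  The use of \cdf{setq} at top
level, the initial contents, and the stored symbols are assumptions made
for illustration.
\begin{lisp}
(setq foo (make-array '(3 5) :initial-element 0)) \\
(array-rank foo)~~~~~~~~~~~~~~~~;\textrm{2} \\
(array-dimensions foo)~~~~~~~~~~;\textrm{The list \cd{(3 5)}} \\
(array-total-size foo)~~~~~~~~~~;\textrm{15} \\
(setf (aref foo 2 1) 'x) \\
(aref foo 2 1)~~~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{x}} \\
(aref (make-array '() :initial-element 'sole))~~;\textrm{The symbol \cdf{sole}}
\end{lisp}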

In general, arrays can be multidimensional,
can share their contents with other array objects, and can have their
size altered dynamically (either enlarging or shrinking) after creation.
A one-dimensional array may also have a \emph{fill pointer}.

Multidimensional arrays store their components in row-major order;
that is, internally a multidimensional array is stored as a one-dimensional
array, with the multidimensional index sets ordered lexicographically,
last index varying fastest.  This is important in two situations:
(1) when arrays with different dimensions share their contents, and
(2) when accessing very large arrays in a virtual-memory implementation.
(The first situation is a matter of semantics; the second, a matter
of efficiency.)
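
As a sketch of the first situation, the functions
\cdf{array-row-major-index} and \cdf{row-major-aref} expose the row-major
ordering, and a displaced vector sees the same contents.  The variable
names \cdf{foo} and \cdf{flat} are chosen here for illustration.
\begin{lisp}
(setq foo (make-array '(3 5) :initial-element 0)) \\
(setf (aref foo 2 1) 'x) \\
(array-row-major-index foo 2 1)~~~~~~~~;\textrm{11, that is, $2 \times 5 + 1$} \\
(row-major-aref foo 11)~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{x}} \\
(setq flat (make-array 15 :displaced-to foo)) \\
(aref flat 11)~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Also \cdf{x}: the contents are shared}
\end{lisp}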

An array that is not displaced to another array, has no fill pointer, and
is not to have its size adjusted dynamically after creation is called a
\emph{simple} array.  The user may provide declarations that certain arrays
will be simple.  Some implementations can handle simple arrays in an
especially efficient manner; for example, simple arrays may have a more
compact representation than non-simple arrays.

If one or more of the \cd{:adjustable}, \cd{:fill-pointer},
and \cd{:displaced-to} arguments is true when \cdf{make-array}
is called, then whether the resulting
array is simple is unspecified; but if all three arguments are false,
then the resulting array is guaranteed to be simple.
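
The following sketch contrasts the guaranteed case with a vector that has
a fill pointer; the variable name \cdf{v} is illustrative.
\begin{lisp}
(typep (make-array 5) 'simple-array)~~~~;\textrm{True} \\
(setq v (make-array 5 :fill-pointer 0)) \\
(vector-push 'a v)~~~~~~~~~~~~~~~~~~~~~~;\textrm{0, the index of the new element} \\
(vector-push 'b v)~~~~~~~~~~~~~~~~~~~~~~;\textrm{1} \\
(length v)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{2, the value of the fill pointer} \\
(array-dimension v 0)~~~~~~~~~~~~~~~~~~~;\textrm{5, the allocated size}
\end{lisp}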

\subsection{Vectors}

One-dimensional arrays are called \emph{vectors} in Common Lisp
and constitute the type \cdf{vector} (which is therefore a subtype of \cdf{array}).
Vectors and lists are collectively considered to be
\emph{sequences}.  They differ in that any component of a one-dimensional array
can be accessed in constant time,
whereas the average component access time for a
list is linear in the length of the list; on the other hand, adding a new
element to the front of a list takes constant time, whereas the same
operation on an array takes time linear in the length of the array.

A general vector (a one-dimensional array
that can have any data object as an element but that has
no additional paraphernalia) can be notated by notating the
components in order, separated by whitespace and surrounded by \cd{\#(}
and \cd{)}.
For example:
\begin{lisp}
\#(a b c)~~~~~~~~~~~~~~~~~~~~;\textrm{A vector of length 3} \\*
\#()~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{An empty vector} \\
\#(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47) \\*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{A vector containing the primes below 50}
\end{lisp}
Note that when the function \cdf{read} parses this syntax, it always constructs
a \emph{simple} general vector.
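
For illustration, vectors read with this syntax can be examined as follows
(the vectors are quoted here as a conservative choice):
\begin{lisp}
(aref '\#(a b c) 1)~~~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{b}} \\
(length '\#())~~~~~~~~~~~~~~~~~~~~~~~;\textrm{0} \\
(typep '\#(a b c) 'simple-vector)~~~~;\textrm{True}
\end{lisp}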

\beforenoterule
\begin{rationale}
Many people have suggested that brackets be used
to notate vectors, as \cd{{\Xlbracket}a b c{\Xrbracket}}
instead of \cd{\#(a b c)}.  This notation
would be shorter, perhaps more readable, and certainly in accord with
cultural conventions in other parts of computer science and mathematics.
However, to preserve the usefulness of the user-definable macro-character
feature of the function \cdf{read}, it is necessary to leave some
characters to the user for this purpose.  Experience in MacLisp has
shown that users, especially implementors of languages for use
in artificial intelligence research, often want
to define special kinds of brackets.  Therefore Common Lisp avoids using
brackets and braces for any syntactic purpose.
\end{rationale}
\afternoterule

Implementations may provide certain specialized representations of
arrays for efficiency in the case where all the components are of
the same specialized (typically numeric) type.  All implementations
provide specialized arrays for the cases when the components
are characters (or rather, a special subset of the characters);
the one-dimensional instances of
this specialization are called \emph{strings}.
All implementations are also required to provide specialized arrays
of bits, that is, arrays of type \cd{(array bit)};
the one-dimensional instances of
this specialization are called \emph{bit-vectors}.

\subsection{Strings}
\label{STRING-TYPE-SECTION}

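A string is a specialized vector whose elements are characters.
Subtypes of \cdf{string} include \cdf{simple-string}, \cdf{base-string},
and \cdf{simple-base-string}:
\begin{lisp}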
base-string \EQ\ (vector base-char) \\*
simple-base-string \EQ\ (simple-array base-char (*))
\end{lisp}
An implementation may support
other string subtypes as well.  All Common Lisp functions that operate
on strings treat all strings uniformly; note, however,
that it is an error to attempt to insert
an extended character into a base string.

The type \cdf{string} is therefore a subtype of the type \cdf{vector}.

A string can be written as the sequence of characters contained in the
string, preceded and followed by a \cd{{\Xdquote}} (double quote) character.
Any \cd{{\Xdquote}} or \cd{{\Xbackslash}} character in the sequence must additionally
have a \cd{{\Xbackslash}} character before it.

For example:
\begin{lisp}
{\Xdquote}Foo{\Xdquote}~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{A string with three characters in it} \\*
{\Xdquote}{\Xdquote}~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{An empty string} \\
{\Xdquote}{\Xbackslash}{\Xdquote}APL{\Xbackslash}{\Xbackslash}360?{\Xbackslash}{\Xdquote} he cried.{\Xdquote}~~~~~;\textrm{A string with twenty characters} \\*
{\Xdquote}|x| = |-x|{\Xdquote}~~~~~~~~~~~~~~~~~~;\textrm{A ten-character string}
\end{lisp}
Notice that any vertical bar \cd{|} in a string need not be
preceded by a \cd{{\Xbackslash}}.  Similarly, any double quote in the name
of a symbol written using vertical-bar notation need not be
preceded by a \cd{{\Xbackslash}}.  The double-quote and vertical-bar notations
are similar but distinct: double quotes indicate a character string
containing the sequence of characters,
whereas vertical bars indicate a symbol whose name is the contained
sequence of characters.

The characters contained by the double quotes, taken from left to right,
occupy locations within the string with increasing indices.
The leftmost character is string element number 0, the next one
is element number 1, the next one is element number 2, and so on.

Note that the function \cd{prin1} will print any character vector
(not just a simple one)
using this syntax, but the function \cdf{read} will always construct
a simple string when it reads this syntax.
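
The character counts and indices described above can be checked with a
few illustrative expressions:
\begin{lisp}
(length {\Xdquote}Foo{\Xdquote})~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{3} \\
(char {\Xdquote}Foo{\Xdquote} 0)~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{The character \cd{\#{\Xbackslash}F}} \\
(length {\Xdquote}{\Xbackslash}{\Xdquote}APL{\Xbackslash}{\Xbackslash}360?{\Xbackslash}{\Xdquote} he cried.{\Xdquote})~~;\textrm{20} \\
(char {\Xdquote}|x| = |-x|{\Xdquote} 4)~~~~~~~~~~~~~~~~~;\textrm{The character \cd{\#{\Xbackslash}=}} \\
(typep {\Xdquote}Foo{\Xdquote} 'simple-string)~~~~~~~~~~;\textrm{True}
\end{lisp}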

\subsection{Bit-Vectors}

A bit-vector can be written as the sequence of bits contained in the
bit-vector, preceded by \cd{\#*}; any delimiter character, such as whitespace,
will terminate the bit-vector syntax.
For example:
\begin{lisp}
\#*10110~~~~~;\textrm{A five-bit bit-vector; bit 0 is a 1} \\
\#*~~~~~~~~~~;\textrm{An empty bit-vector}
\end{lisp}

The bits notated following the \cd{\#*}, taken from left to right,
occupy locations within the bit-vector with increasing indices.
The leftmost notated bit is bit-vector element number 0, the next one
is element number 1, and so on.

The function \cd{prin1} will print any bit-vector (not just a simple one)
using this syntax, but the function \cdf{read} will always construct
a simple bit-vector when it reads this syntax.
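
A few illustrative expressions show the indexing and the type of the
object that \cdf{read} constructs:
\begin{lisp}
(bit \#*10110 0)~~~~~~~~~~~~~~~~~~~~~~~;\textrm{1} \\
(bit \#*10110 1)~~~~~~~~~~~~~~~~~~~~~~~;\textrm{0} \\
(length \#*10110)~~~~~~~~~~~~~~~~~~~~~~;\textrm{5} \\
(bit-and \#*1100 \#*1010)~~~~~~~~~~~~~~;\textrm{The bit-vector \cd{\#*1000}} \\
(typep \#*10110 'simple-bit-vector)~~~~;\textrm{True}
\end{lisp}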

\section{Hash Tables}
Hash tables provide an efficient way of mapping any
Lisp object (a \emph{key}) to an associated object.
They are provided as primitives of Common Lisp because
some implementations may need to use internal storage
management strategies that would make it very difficult
for the user to implement hash tables in a portable fashion.
Hash tables are described in chapter~\ref{HASH}.

\section{Readtables}

A readtable is a data structure that maps characters into syntax
types for the Lisp expression parser.
In particular, a readtable indicates for
each character with syntax \emph{macro character} what its macro
definition is.  This is a mechanism by which the user may reprogram
the parser to a limited but useful extent.
See section~\ref{READTABLE-SECTION}.

\section{Packages}

Packages are collections of symbols that serve as name spaces.
The parser recognizes symbols by looking up character sequences
in the current package.  Packages can be used to hide
names internal to a module from other code.  Mechanisms are provided
for exporting symbols from a given package to the primary ``user'' package.
See chapter~\ref{XPACK}.

\section{Pathnames}
Pathnames are the means by which a Common Lisp program can
interface to an external file system in a reasonably implementation-independent
manner.  See section~\ref{PATHNAME}.

\section{Streams}

A stream is a source or sink of data, typically characters or bytes.
Nearly all functions that perform I/O do so with respect to a specified
stream.  The function \cdf{open} takes a pathname and returns a stream
connected to the file specified by the pathname.
There are a number of standard streams that are used by default for
various purposes.  See chapter~\ref{STREAM}.

The types \cdf{broadcast-stream}, \cdf{concatenated-stream},
\cdf{echo-stream}, \cdf{synonym-stream}, \cdf{string-stream}, \cdf{file-stream},
and \cdf{two-way-stream} are pairwise disjoint subtypes of \cdf{stream}.
Note particularly that a synonym stream is always and only of type
\cdf{synonym-stream}, regardless of the type of the stream for which it is a synonym.

\section{Random-States}

An object of type \cdf{random-state} is used to encapsulate
state information used by the pseudo-random number generator.
For more information about \cdf{random-state} objects,
see section~\ref{RANDOM}.

\section{Structures}

Structures are instances of user-defined data types that have
a fixed number of named components.  They are analogous to
records in Pascal.
Structures are declared using the \cdf{defstruct} construct;
\cdf{defstruct} automatically defines access and constructor functions for
the new data type.

Different structures may print out in different ways;
the definition of a structure type may specify a print procedure
to use for objects of that type (see the
\cd{:print-function} option to \cdf{defstruct}).
The default notation for structures is
\begin{lisp}
\#S(\emph{structure-name} \\
~~~~~~~~\emph{slot-name-1} \emph{slot-value-1} \\
~~~~~~~~\emph{slot-name-2} \emph{slot-value-2} \\
~~~~~~~~~~~~~~~~~~~~~~...)
\end{lisp}
where \cd{\#S} indicates structure syntax, \emph{structure-name} is
the name (a symbol) of the structure type, each \emph{slot-name} is the name
(also a symbol) of a component, and each corresponding \emph{slot-value}
is the representation of the Lisp object in that slot.
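
A small sketch follows; the structure type \cdf{ship}, its slots, and the
variable \cdf{s} are hypothetical names chosen for illustration.
\begin{lisp}
(defstruct ship \\
~~name \\
~~(x-position 0) \\
~~(y-position 0)) \\
(setq s (make-ship :name 'enterprise :x-position 3)) \\
(ship-name s)~~~~~~~~~~~~;\textrm{The symbol \cdf{enterprise}} \\
(ship-x-position s)~~~~~~;\textrm{3} \\
(ship-y-position s)~~~~~~;\textrm{0, the default} \\
(ship-p s)~~~~~~~~~~~~~~~;\textrm{True} \\
(prin1 s)~~~~~~~~~~~~~~~~;\textrm{Prints \cdf{s} in the \cd{\#S} notation}
\end{lisp}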

\section{Functions}
\label{FUNCTION-TYPE-SECTION}

The type \cdf{function} is disjoint
from \cdf{cons} and \cdf{symbol}, and so a list whose \emph{car} is \cdf{lambda}
is not, properly speaking, of type \cdf{function}, nor is any symbol.
However,
standard Common Lisp functions that accept functional arguments
will accept a symbol
and automatically coerce it to be a function; such standard
functions include \cdf{funcall}, \cdf{apply}, and \cdf{mapcar}.
Such functions do not, however, accept a lambda-expression as a functional
argument; therefore one may not write

\vskip 3pt
\begin{lisp}
(mapcar '(lambda (x y) (sqrt (* x y))) p q)
\end{lisp}
but instead one must write something like
\begin{lisp}
(mapcar \#'(lambda (x y) (sqrt (* x y))) p q)
\end{lisp}

This disjointness makes it impermissible to represent a lexical closure
as a list whose \emph{car} is some special marker.

The value of a \cdf{function} special operator
will always be of type \cdf{function}.
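
Under the rules just stated, one would expect the following; the
expressions are illustrative sketches rather than part of the specification.
\begin{lisp}
(funcall 'car '(a b))~~~~~~~~~~~~~~~~;\textrm{\cdf{a}: the symbol \cdf{car} is coerced to a function} \\
(funcall \#'car '(a b))~~~~~~~~~~~~~~;\textrm{Also \cdf{a}} \\
(typep \#'car 'function)~~~~~~~~~~~~~;\textrm{True} \\
(typep 'car 'function)~~~~~~~~~~~~~~~;\textrm{False: a symbol is not a function} \\
(typep '(lambda (x) x) 'function)~~~~;\textrm{False: this is merely a list} \\
(typep \#'(lambda (x) x) 'function)~~;\textrm{True}
\end{lisp}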

\section{Unreadable Data Objects}

Some objects may print in implementation-dependent ways.
Such objects cannot necessarily be reliably reconstructed from
a printed representation, and so they are usually printed in
a format informative to the user but not acceptable to the \cdf{read} function:
\cd{\#<\emph{useful information}>}.
The Lisp reader will signal an error on encountering \cd{\#<}.

As a hypothetical example, an implementation might print
\begin{lisp}
\#<stack-pointer si:rename-within-new-definition-maybe \#o311037552>
\end{lisp}
for an implementation-specific ``internal stack pointer'' data type
whose printed representation includes the name of the type,
some information about the stack slot pointed to, and the machine address
(in octal) of the stack slot.

See \cdf{print-unreadable-object}, a macro that prints an object using \cd{\#<}
syntax.

\section{Overlap, Inclusion, and Disjointness of Types}
\label{DATA-TYPE-RELATIONSHIPS}

The Common Lisp data type hierarchy is tangled and purposely left somewhat
open-ended so that implementors may experiment with new data types
as extensions to the language.  This section explicitly states all
the defined relationships between types, including subtype/supertype
relationships,
disjointness, and exhaustive partitioning.  The user of Common Lisp
should not depend on any relationships not explicitly stated here.
For example, it is not valid to assume that a number that is neither
complex nor rational must be a \cdf{float}, because
implementations are permitted to provide yet other kinds of numbers.

First we need some terminology.
If \emph{x} is a supertype of \emph{y}, then any object of type \emph{y} is also
of type \emph{x}, and \emph{y} is said to be a subtype of \emph{x}.  If types
\emph{x} and \emph{y} are disjoint, then no object (in any implementation) may
be both of type \emph{x} and of type \emph{y}.  Types $\emph{a}_1$ through
$\emph{a}_{n}$ are an \emph{exhaustive union}
of type \emph{x} if each $\emph{a}_j$
is a subtype of \emph{x}, and any object of type \emph{x} is
necessarily of at least one of the types $\emph{a}_{j}$;
$\emph{a}_1$ through $\emph{a}_{n}$ are furthermore an \emph{exhaustive partition}
if they are also pairwise disjoint.

\begin{itemize}
\item
The type \cdf{t} is a supertype of every type whatsoever.
Every object is of type \cdf{t}.

\item
The type {\nil} is a subtype of every type whatsoever.
No object is of type {\nil}.
\end{itemize}

\begin{itemize}
\item
The types \cdf{cons}, \cdf{symbol}, \cdf{array}, \cdf{number}, \cdf{character},
\cdf{hash-table}, \cdf{readtable}, \cdf{package}, \cdf{pathname},
\cdf{stream}, \cdf{random-state}, and any single other type created by
\cdf{defstruct} or \cdf{defclass}
are pairwise disjoint.

Type \cdf{function}
is disjoint from the types \cdf{cons}, \cdf{symbol}, \cdf{array}, \cdf{number},
and \cdf{character}.

The type \cdf{compiled-function} is a subtype of \cdf{function};
implementations are free to define other subtypes of \cdf{function}.
\end{itemize}

\begin{itemize}
\item
The types \cdf{real} and \cdf{complex} are pairwise disjoint
subtypes of \cdf{number}.
\end{itemize}

\beforenoterule
\begin{rationale}
It might be thought that \cdf{real} and \cdf{complex} should
form an exhaustive partition of the type \cdf{number}.  This is purposely
avoided here in order to permit compatible experimentation with extensions
to the Common Lisp number system.
\end{rationale}
\afternoterule

\begin{itemize}
\item
The types \cdf{rational} and \cdf{float} are pairwise disjoint
subtypes of \cdf{real}.
\end{itemize}

\beforenoterule
\begin{rationale}
It might be thought that \cdf{rational} and \cdf{float} should
form an exhaustive partition of the type \cdf{real}.  This is purposely
avoided here in order to permit compatible experimentation with extensions
to the Common Lisp number system.
\end{rationale}
\afternoterule

\begin{itemize}
\item
The types \cdf{integer} and \cdf{ratio} are disjoint subtypes of \cdf{rational}.
\end{itemize}

\beforenoterule
\begin{rationale}
It might be thought that \cdf{integer} and \cdf{ratio} should
form an exhaustive partition of the type \cdf{rational}.  This is purposely
avoided here in order to permit compatible experimentation with extensions
to the Common Lisp rational number system.
\end{rationale}
\afternoterule

Types \cdf{fixnum} and \cdf{bignum}
do in fact form an exhaustive partition of the type \cdf{integer}; more precisely,
the type \cdf{bignum} is by definition equivalent
to \cd{(and~integer (not~fixnum))}.  This is consistent with the
first edition text in section~\ref{INTEGERS-SECTION}.

I interpret this to mean that implementors could still experiment with
such extensions as adding explicit representations of infinity, but such infinities
would necessarily be of type \cdf{bignum}.
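
A brief sketch of this partition, using only standard names.  The value of
\cdf{most-positive-fixnum} is implementation-dependent, so no particular
magnitude is assumed here.
\begin{lisp}
(typep most-positive-fixnum 'fixnum)~~~~~~~~;\textrm{true} \\
(typep (+ most-positive-fixnum 1) 'fixnum)~~;\textrm{false} \\
(typep (+ most-positive-fixnum 1) 'bignum)~~;\textrm{true: an integer that is not a fixnum} \\
(typep 1/2 'bignum)~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{false: not an integer at all}
\end{lisp}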

\begin{itemize}
\item
The types \cdf{short-float}, \cdf{single-float}, \cdf{double-float}, and
\cdf{long-float} are subtypes of \cdf{float}.  Any two of them must be
either disjoint or identical; if identical, then any other types between
them in the above ordering must also be identical to them
(for example, if \cdf{single-float} and \cdf{long-float} are identical types,
then \cdf{double-float} must be identical to them also).

\item
The type \cdf{null} is a subtype of \cdf{symbol}; the only object of type
\cdf{null} is {\nil}.

\item
The types \cdf{cons} and \cdf{null} form an exhaustive partition of the type
\cdf{list}.
\end{itemize}
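
The relationships among \cdf{null}, \cdf{symbol}, \cdf{cons}, and \cdf{list}
can be illustrated as follows; this is a sketch with arbitrarily chosen objects.
\begin{lisp}
(typep '() 'null)~~~~~~~~;\textrm{true: the only object of type \cdf{null}} \\
(typep '() 'symbol)~~~~~~;\textrm{true: \cdf{null} is a subtype of \cdf{symbol}} \\
(typep '() 'list)~~~~~~~~;\textrm{true} \\
(typep '() 'cons)~~~~~~~~;\textrm{false: \cdf{cons} and \cdf{null} are disjoint} \\
(typep '(a b) 'cons)~~~~~;\textrm{true} \\
(typep 'a 'list)~~~~~~~~~;\textrm{false: a symbol other than {\nil} is not a list}
\end{lisp}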

\begin{itemize}
\item
The type \cdf{standard-char} is a subtype of \cdf{base-char}.
The types \cdf{base-char} and \cdf{extended-char}
form an exhaustive partition of \cdf{character}.
\end{itemize}

\begin{itemize}
\item
The type \cdf{string} is a subtype of \cdf{vector}; it is the union of
all types \cd{(vector~\emph{c})} such that \emph{c} is a subtype of \cdf{character}.
\end{itemize}

\begin{itemize}
\item
The type \cdf{bit-vector} is a subtype of \cdf{vector}, for \cdf{bit-vector}
means \cd{(vector bit)}.

\item
The types \cd{(vector t)}, \cdf{string}, and \cdf{bit-vector} are disjoint.

\item
The type \cdf{vector} is a subtype of \cdf{array}; for all types \emph{x},
the type \cd{(vector \emph{x})} is the same as the type \cd{(array \emph{x} (*))}.

\item
The type \cdf{simple-array} is a subtype of \cdf{array}.
\end{itemize}

\begin{itemize}
\item
The types \cdf{simple-vector}, \cdf{simple-string}, and
\cdf{simple-bit-vector} are disjoint subtypes of \cdf{simple-array}, for they
mean \cd{(simple-array t (*))}, the union of all types
\cd{(simple-array \emph{c} (*))} such that \emph{c} is a subtype of \cdf{character},
and \cd{(simple-array bit (*))}, respectively.
\end{itemize}

\begin{itemize}
\item
The type \cdf{simple-vector} is a subtype of \cdf{vector} and indeed
is a subtype of \cd{(vector t)}.

\item
The type \cdf{simple-string} is a subtype of \cdf{string}.
(Note that although \cdf{string} is a subtype of \cdf{vector},
\cdf{simple-string} is not a subtype of \cdf{simple-vector}.)
\end{itemize}

\beforenoterule
\begin{rationale}
The hypothetical name \cdf{simple-general-vector} would have been more accurate than
\cdf{simple-vector}, but in this instance euphony and
user convenience were deemed more important to the design
of Common Lisp than a rigid symmetry.
\end{rationale}
\afternoterule

\begin{itemize}
\item
The type \cdf{simple-bit-vector} is a subtype of \cdf{bit-vector}.
(Note that although \cdf{bit-vector} is a subtype of \cdf{vector},
\cdf{simple-bit-vector} is not a subtype of \cdf{simple-vector}.)

\item
The types \cdf{vector} and \cdf{list} are disjoint subtypes of \cdf{sequence}.

\item
The types \cdf{random-state}, \cdf{readtable}, \cdf{package}, \cdf{pathname},
\cdf{stream}, and \cdf{hash-table} are pairwise disjoint.
\end{itemize}
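
As an illustrative sketch of the vector relationships listed above, the
following expressions use the standard functions \cdf{make-string},
\cdf{vector}, \cdf{make-array}, and \cdf{list} to create fresh objects;
the sizes and contents are arbitrary.
\begin{lisp}
(typep (make-string 3) 'simple-string)~~~~~~~~;\textrm{true} \\
(typep (make-string 3) 'vector)~~~~~~~~~~~~~~~;\textrm{true: \cdf{string} is a subtype of \cdf{vector}} \\
(typep (make-string 3) 'simple-vector)~~~~~~~~;\textrm{false} \\
(typep (vector 1 2 3) 'simple-vector)~~~~~~~~~;\textrm{true} \\
(typep (make-array 3 :element-type 'bit) 'simple-bit-vector)~~;\textrm{true} \\
(typep (make-array 3 :element-type 'bit) 'simple-vector)~~~~~~;\textrm{false} \\
(typep (list 1 2 3) 'sequence)~~~~~~~~~~~~~~~~;\textrm{true} \\
(typep (list 1 2 3) 'vector)~~~~~~~~~~~~~~~~~~;\textrm{false: \cdf{vector} and \cdf{list} are disjoint}
\end{lisp}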

The types \cdf{random-state}, \cdf{readtable}, \cdf{package}, \cdf{pathname},
\cdf{stream}, and \cdf{hash-table} are
pairwise disjoint from a number of other types as well;
see the first list of pairwise disjoint types earlier in this section.

\begin{itemize}
\item
The types \cdf{two-way-stream}, \cdf{echo-stream},
\cdf{broadcast-stream}, \cdf{file-stream}, \cdf{synonym-stream}, \cdf{string-stream}, and
\cdf{concatenated-stream} are disjoint subtypes of \cdf{stream}.
\end{itemize}

\begin{itemize}
\item
Any two types created by \cdf{defstruct} are disjoint unless
one is a supertype of the other by virtue of
the \cd{:include} option.
\end{itemize}
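
The effect of the \cd{:include} option can be sketched as follows.  The
structure names \cdf{vehicle}, \cdf{truck}, and \cdf{boat} and their slots
are hypothetical, chosen only for illustration.
\begin{lisp}
(defstruct vehicle wheels) \\
(defstruct (truck (:include vehicle)) payload) \\
(defstruct boat draft) \\
(typep (make-truck) 'vehicle)~~~~~;\textrm{true: \cdf{truck} includes \cdf{vehicle}} \\
(typep (make-vehicle) 'truck)~~~~~;\textrm{false} \\
(typep (make-truck) 'boat)~~~~~~~~;\textrm{false: unrelated structure types are disjoint}
\end{lisp}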

%RUSSIAN
\else

\chapter{Data Types}
\label{DTYPES}

Common Lisp provides a variety of types of data objects.  It is important to
note that in Lisp it is data objects that are typed, not variables.
Any variable can hold data of any type.
(It is possible to declare explicitly that a variable will
in fact hold only one type of object, or one of a limited set of types.
However, such a declaration may always be omitted, and the program will
still run correctly.  Such a declaration merely constitutes advice from
the user that may be useful for optimization.  See \cdf{declare}.)

In Common Lisp, a data type is a (possibly infinite) set of
Lisp objects.  Many Lisp objects belong to more than one
such set, and so it does not always make sense to ask what is \emph{the} type
of an object; instead, one usually asks only whether an object belongs
to a given type.  The predicate \cdf{typep} may be used to ask
whether an object belongs to a given type,
and the function \cdf{type-of} returns \emph{a} type
to which a given object belongs.

The data types in Common Lisp are arranged into a hierarchy (actually
a partial order) defined by the subset relationship.
Certain sets of objects, such as the set of numbers and the
set of strings, are interesting enough to deserve labels.
Symbols are used for most such labels (here and below, the word ``symbol''
refers to the symbol kind of Lisp object, also known as a literal
atom).  See chapter~\ref{DTSPEC} for a complete description of type
specifiers.

The set of all objects is specified by the symbol {\true}.  The empty
data type, which contains no objects, is denoted by
{\nil}.

The following categories of Common Lisp objects are of particular
interest: numbers, characters, symbols,
lists, arrays, structures, and functions.
There are others as well.  Some of
these categories have many subdivisions.  There are also standard
types defined to be the union of two or more of these
categories.  The categories listed above, while they are data types, are
neither more nor less ``real'' than other data types; they simply
constitute a useful slice across the type hierarchy for expository purposes.

Here are brief descriptions of various Common Lisp data
types.  The remaining sections of this chapter go into more
detail and also describe notations for objects of each
type.  Descriptions of Lisp functions that operate on data objects
of each type appear in later chapters.

\begin{itemize}
\item
\emph{Numbers} are provided in various forms and representations.  Common Lisp
provides a true integer data type: any integer,
positive or negative, can be represented, subject only to total memory
limitations (rather than machine word width).  A true
rational data type is also provided:
the quotient of two integers, if not an integer, is a ratio.
Floating-point numbers of various ranges and
precisions are provided as well, and finally so are complex numbers.

\item
\emph{Characters} represent printed glyphs, such as
letters, or text formatting operations.  Strings are
one-dimensional arrays of characters.  Common Lisp provides a rich
character set, including ways to represent characters of various
type styles.

\item
\emph{Symbols} (sometimes called \emph{atomic
  symbols} for clarity)
are named data objects.  Lisp provides
machinery for locating a symbol object given its
name (in the form of a string).  Symbols have \emph{property lists},
which in effect allow symbols to be treated as record
structures with an extensible set of named components, each of which
may be any Lisp object.  Symbols also serve to
name functions and variables within programs.

\item
\emph{Lists} are sequences represented in the form of linked
cells called \emph{conses}.  A special object (the symbol
{\nil}) denotes the empty
list.  All other lists are built recursively by
adding a new element to the front of an existing list.  This
is done by creating a new cons, which is
an object having two components called the \emph{car} and the \emph{cdr}.  The \emph{car}
may hold anything, and the \emph{cdr} is made to
point to the previously existing list.  (Conses may also
be used as two-element record
structures, but this is not their principal purpose.)

\item
\emph{Arrays} are dimensioned collections of objects.  An array can
have any non-negative number of dimensions and is indexed
by a sequence of integers.  A general array can
have any Lisp object as a component.  Other types of arrays are specialized
for efficiency and can hold only certain types of Lisp
objects.  It is also possible for two arrays,
possibly with differing numbers of dimensions, to share the same
set of elements (so that modifying one array modifies
the other also).  This is achieved by causing one array
to be \emph{displaced} to the other.  One-dimensional arrays
of any kind are called \emph{vectors}.  One-dimensional
arrays of characters are called \emph{strings}.  One-dimensional
arrays of bits (that is, of integers whose values are 0 or 1)
are called \emph{bit-vectors}.

\item
\emph{Hash tables} provide an efficient way of mapping
any Lisp object (a \emph{key}) to an associated object (a \emph{value}).

\item
\emph{Readtables} are used to control
the expression parser \cdf{read}.  This facility is intended for defining
reader macros that modify the syntax of the language in a limited way.

\item
\emph{Packages} are collections of symbols that serve as
name spaces.  The parser recognizes symbols by looking up
their names in the current package.

\item
\emph{Pathnames} represent names of files in a fairly
implementation-independent manner.  They are used to interface to the
external file system.

\item
\emph{Streams} represent sources or sinks of data, typically
characters or bytes.  They are used to perform I/O, as well as for
internal purposes such as parsing strings.

\item
\emph{Random-states} are data structures
used to encapsulate the state of the built-in random-number
generator.

\item
\emph{Structures} are user-defined objects that have
named components.  The \cdf{defstruct} facility is used to define
new structure types.  Some Common Lisp implementations may
choose to implement certain system-supplied types, such as \emph{bignums},
\emph{readtables}, \emph{streams},
\emph{hash tables}, and \emph{pathnames}, as
structures, but this fact is invisible to the user.

\item
\emph{Conditions} are objects used to
affect control flow by means of signals and
handlers that intercept those signals.  In particular, errors
are signaled by raising conditions, and errors may
be trapped by establishing handlers for those conditions.

\item
\emph{Classes} determine the structure and behavior of other objects,
their \emph{instances}.  Every data object
belongs to some class.

\item
\emph{Methods} are chunks of code that operate on arguments
satisfying a particular pattern.  Methods are not functions; they
are not invoked directly but instead are bundled into generic
functions.

\item
\emph{Generic functions} are functions that contain, among other
information, a set of methods.  When invoked, a generic function executes
a subset of its methods.  The subset chosen for execution is determined
by the classes of the arguments and the methods applicable to
them.
\end{itemize}

These categories are not always mutually exclusive.  The required relationships
among the various data types are explained in more detail in
section~\ref{DATA-TYPE-RELATIONSHIPS}.

\section{Numbers}

Several kinds of numbers are defined in Common Lisp.  They are
divided into \emph{integers} (type \cdf{integer}); \emph{ratios} (type \cdf{ratio});
\emph{floating-point numbers}, with
four kinds of representation; \emph{real numbers} (type \cdf{real}); and
\emph{complex numbers} (type \cdf{complex}).

The \cdf{number} data type encompasses all kinds of numbers.  For convenience,
there are names for some subclasses of numbers as
well.  Integers and ratios are of type
\cdf{rational}.  Rational numbers and floating-point numbers are of type
\cdf{real}.  Real numbers and complex numbers are of type
\cdf{number}.

Although the names of these types were chosen with the terminology of
mathematics in mind, the correspondences are not always exact.  Integers
and ratios model the corresponding
mathematical concepts directly.  Numbers of type \cdf{float} may
be used to approximate real numbers,
both rational and
irrational.  The type \cdf{real} includes all Common Lisp
numbers that represent mathematical real numbers,
although mathematical irrational numbers have no exact Common Lisp
representation.  Only real numbers may be ordered using
the functions \cdf{<}, \cdf{>}, \cdf{<=}, and \cdf{>=}.

\subsection{Integers}
\label{INTEGERS-SECTION}
\indexterm{integer}

The \emph{integer} data type is intended to represent
mathematical integers.  Unlike most programming
languages, Common Lisp in principle imposes no
limit on the magnitude of an integer;
storage is automatically allocated as
necessary to represent large integers.

In every Common Lisp implementation there is a range of integers
that are represented more efficiently than others.  Each such integer
is called a \emph{fixnum}, and an integer that is not a fixnum is called a
\emph{bignum}.  Common Lisp is designed to hide this distinction as much
as possible.  The distinction between fixnums and bignums is visible
to the user only in those places where efficiency
is important.  Exactly which integers are fixnums is implementation-dependent;
typically they are the integers in the range $-2^{n}$ to
$2^{n}-1$, inclusive, for some \emph{n} not less than
15.  See \cdf{most-positive-fixnum} and \cdf{most-negative-fixnum}.

The type \cdf{fixnum} must be a supertype of the type \cd{(signed-byte 16)},
and additionally the value of \cdf{array-dimension-limit} must
be a fixnum (implementors should choose the range of fixnums
to be large enough to accommodate the largest size of
array to be supported).

\beforenoterule
\begin{rationale}
This specification allows programmers to declare
variables in portable code to be of type \cdf{fixnum} for
efficiency.  Fixnums are guaranteed to encompass at least the set of
16-bit signed integers (compare this to the
data type \cd{short int} in the C programming language).  In addition,
any valid array index must be a fixnum, and
therefore variables used to hold array indices (such as
a \cdf{dotimes} variable) may be declared \cdf{fixnum} in
portable code.
\end{rationale}
\afternoterule
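
The guarantees stated above can be checked directly.  This is a minimal sketch
that relies only on the requirements given in this section.
\begin{lisp}
(typep 32767 'fixnum)~~~~~~~~~~~~~~~~~~~~;\textrm{true: largest 16-bit signed integer} \\
(typep -32768 'fixnum)~~~~~~~~~~~~~~~~~~~;\textrm{true: smallest 16-bit signed integer} \\
(typep array-dimension-limit 'fixnum)~~~~;\textrm{true} \\
(subtypep '(signed-byte 16) 'fixnum)~~~~~;\textrm{true}
\end{lisp}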

Integers are ordinarily written in decimal notation, as a sequence
of decimal digits, optionally preceded by a sign and optionally followed by a
decimal point.  For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\=\kill
\>0~~~~~\';\textrm{Zero} \\*
\>-0~~~~~\';\textrm{This \emph{always} means the same as \cd{0}} \\*
\>+6~~~~~\';\textrm{The first perfect number} \\
\>28~~~~~\';\textrm{The second perfect number} \\
\>1024.~~~~~\';\textrm{Two to the tenth power} \\*
\>-1~~~~~\';\textrm{$e^{\pi i}$} \\*
\>15511210043330985984000000.~~~~~\';\textrm{25 factorial (25!),
probably a bignum}
\end{lisp}

Integers may be notated in radices other than
ten.  The notation
\begin{lisp}
\#\emph{nn}r\emph{ddddd}     \textrm{or}     \#\emph{nn}R\emph{ddddd}
\end{lisp}
means the integer in radix-\emph{nn} notation denoted by
the digits and letters \emph{ddddd}.  More precisely, one may write
\cd{\#}, a non-empty sequence of decimal digits
representing an unsigned decimal integer \emph{n}, \cdf{r} (or \cdf{R}), an optional
sign \cd{+} or \cd{-}, and a sequence of radix-\emph{n} digits (the
radix must be between 2 and 36, inclusive).  Only
digits that are legal for the specified radix may be
used.  For example, an
octal number may contain only the digits 0 through 7,
inclusive.  For radices larger than ten,
letters of the alphabet of either case may be used, in alphabetical order.  Binary, octal, and
hexadecimal radices may be written with the following abbreviations:
\cd{\#b} for \cd{\#2r}, \cd{\#o} for \cd{\#8r}, and \cd{\#x} for
\cd{\#16r}.  For example:
\begin{lisp}
~~~~~~~~~~~~~~~~\=\kill
\>\#2r11010101~~~~~\';\textrm{Another way of writing
the number \cd{213}} \\
\>\#b11010101~~~~~\';\textrm{The same} \\
\>\#b+11010101~~~~~\';\textrm{The same} \\
\>\#o325~~~~~\';\textrm{The same, in octal radix} \\
\>\#xD5~~~~~\';\textrm{The same, in hexadecimal radix} \\
\>\#16r+D5~~~~~\';\textrm{The same} \\
\>\#o-300~~~~~\';\textrm{Decimal \cd{-192}, written in octal} \\
\>\#3r-21010~~~~~\';\textrm{The same, in ternary radix} \\
\>\#25R-7H~~~~~\';\textrm{The same, in base 25} \\
\>\#xACCEDED~~~~~\';\textrm{\cd{181202413}, in hexadecimal radix}
\end{lisp}
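
The equivalence of these notations can be confirmed with ordinary arithmetic
comparison.  The sketch below also uses the standard function
\cdf{digit-char-p} to show which characters are legal digits in a given
radix, and the printer variables \cdf{*print-base*} and \cdf{*print-radix*}
to produce radix notation on output.
\begin{lisp}
(= \#b11010101 \#o325 \#xD5 213)~~~~~~~;\textrm{true: four notations, one integer} \\
(= \#o-300 \#3r-21010 \#25R-7H -192)~~~;\textrm{true} \\
(digit-char-p \#{\Xbackslash}H 25)~~~~~~~~~~~~~~~~~~;\textrm{17: \cd{H} is a legal digit in radix 25} \\
(digit-char-p \#{\Xbackslash}8 8)~~~~~~~~~~~~~~~~~~~;\textrm{false: \cd{8} is not an octal digit} \\
(let ((*print-base* 8) (*print-radix* t)) \\
~~(prin1 213))~~~~~~~~~~~~~~~~~~~~~~~;\textrm{prints \cd{\#o325}}
\end{lisp}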

\subsection{Ratios}
\indexterm{ratio}
\indexterm{rational}

A \emph{ratio} is a number representing the mathematical ratio
of two integers.  Integers and ratios collectively
constitute the type \cdf{rational}.  The canonical
representation of a rational number is as an integer if its value is integral,
and otherwise as the ratio of two integers, the \emph{numerator} and
\emph{denominator}, whose greatest common divisor is 1, and of which
the denominator is positive (and in fact greater than 1, or else
the value would be integral).  A ratio is written with the
separator \cdf{/}, thus: \cd{3/5}.  It is possible
to notate ratios in non-canonical form, such as \cd{4/6}, but the Lisp
function \cd{prin1} always prints the canonical form for a ratio.

If any computation produces a result that is a
ratio of two integers such that the denominator evenly divides the numerator,
then the result is immediately converted to the equivalent
integer.  This is called the rule of \emph{rational canonicalization}.

Rational numbers may be written as follows: an optional sign \cd{+} or \cd{-},
followed by two non-empty sequences of digits separated by
a \cd{/}.  This syntax may be described as follows:

\begin{tabbing}
\emph{ratio} ::= \Mopt{\emph{sign}} \Mplus{\emph{digit}} \cd{/} \Mplus{\emph{digit}}
\end{tabbing}

The second sequence may not consist entirely of
zeros.  For example:
\begin{lisp}
2/3~~~~~~~~~~~~~~~~~;\textrm{This is in canonical form} \\
4/6~~~~~~~~~~~~~~~~~;\textrm{A non-canonical form for the same number} \\
-17/23~~~~~~~~~~~~~~;\textrm{A not very interesting ratio} \\
-30517578125/32768~~;\textrm{This is $(-5/2)^{15}$} \\
10/5~~~~~~~~~~~~~~~~;\textrm{The canonical form for this is \cd{2}}
\end{lisp}
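
The canonicalization rule can be observed with the standard arithmetic
functions.  The following sketch uses arbitrary operands.
\begin{lisp}
(/ 4 6)~~~~~~~~~~~~~~~~~~~~~;\textrm{the ratio \cd{2/3}} \\
(/ 10 5)~~~~~~~~~~~~~~~~~~~~;\textrm{the integer \cd{2}, not a ratio} \\
(typep (/ 10 5) 'integer)~~~;\textrm{true} \\
(typep 4/6 'ratio)~~~~~~~~~~;\textrm{true} \\
(numerator 4/6)~~~~~~~~~~~~~;\textrm{\cd{2}: the reader has already reduced the ratio} \\
(denominator 4/6)~~~~~~~~~~~;\textrm{\cd{3}} \\
(= (expt -5/2 15) -30517578125/32768)~~;\textrm{true}
\end{lisp}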

To notate rational numbers in radices other than ten, one uses
the same radix specifiers (one of \cd{\#\emph{nn}R}, \cd{\#O}, \cd{\#B}, or \cd{\#X}) as for
integers.  For example:

\begin{lisp}
\#o-101/75~~~~~~~~~;\textrm{Octal notation for \cd{-65/61}} \\
\#3r120/21~~~~~~~~~;\textrm{Ternary notation for \cd{15/7}} \\
\#Xbc/ad~~~~~~~~~~~;\textrm{Hexadecimal notation for \cd{188/173}} \\
\#xFADED/FACADE~~~~;\textrm{Hexadecimal notation for \cd{1027565/16435934}}
\end{lisp}

\subsection{Floating-Point Numbers}

Common Lisp allows an implementation to provide one or more kinds of
floating-point number, which collectively make up the type \cdf{float}.
A floating-point number is a (mathematical) rational number of the form
$\emph{s} \cdot \emph{f} \cdot \emph{b}^{e-p}$,
where \emph{s} is $+1$ or $-1$, the \emph{sign};
\emph{b} is an integer greater than 1,
the \emph{base} or \emph{radix} of the representation;
\emph{p} is a positive integer, the \emph{precision} (in
base-\emph{b} digits) of the floating-point number;
\emph{f} is a positive integer between $\emph{b}^{p-1}$ and
$\emph{b}^{p}-1$ (inclusive), the \emph{significand};
and \emph{e} is an integer, the \emph{exponent}.
The value of \emph{p} and the range of \emph{e} depend on the implementation.
Depending on the implementation, there may also be a
``minus zero.''  If there is no minus zero, then \cd{0.0} and \cd{-0.0} are both
interpreted as simply a floating-point zero.

\beforenoterule
\begin{implementation}
The form of the above description should not be construed
to require the internal representation to be in sign-magnitude form.
Two's-complement and other representations are also acceptable.  Note
that the radix of the internal representation may be other than 2, as on
the IBM 360 and 370, which use radix 16; see
\cdf{float-radix}.
\end{implementation}
\afternoterule
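
The relation between a floating-point number and the quantities in the formula
above can be explored with the standard functions \cdf{integer-decode-float}
and \cdf{float-radix}.  The sketch below is illustrative only: the helper name
\cdf{decoded-value} is hypothetical, and the exponent returned by
\cdf{integer-decode-float} already plays the role of $e-p$.
\begin{lisp}
(defun decoded-value (x) \\
~~(multiple-value-bind (f e s) (integer-decode-float x) \\
~~~~(* s f (expt (float-radix x) e))))~~~~;\textrm{an exact rational} \\
(= (decoded-value 0.15625) 0.15625)~~~~~~~;\textrm{true} \\
(= (decoded-value -2.5d0) -2.5d0)~~~~~~~~~;\textrm{true}
\end{lisp}
Because the significand, radix, and exponent are all integers, the product is
computed in exact rational arithmetic and equals the value of the
original floating-point number.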

Floating-point numbers may be provided in a variety of
precisions and sizes, depending on the implementation.  High-quality
floating-point software tends to depend critically on the precision provided
and so may not always be completely portable.  As an aid in writing programs that
are moderately portable, the following definitions are made:
\begin{itemize}
\item
A \emph{short} floating-point number (type \cdf{short-float}) is
of the representation of smallest fixed precision provided by an implementation.

\item
A \emph{long} floating-point number (type \cdf{long-float}) is
of the representation of the largest fixed precision provided by an implementation.

\item
Intermediate between short and long formats are two other
formats, called \emph{single} and \emph{double} (types \cdf{single-float} and
\cdf{double-float}).
\end{itemize}

The precise definition of these categories is implementation-dependent.  However, the rough
intent is that short floating-point numbers be precise to at least
four decimal places (and also have a space-efficient
representation);
single floating-point numbers, to at least seven decimal places;
and double floating-point numbers, to at least fourteen decimal places.
It is suggested that the precision (measured in bits, computed as
$p\log_2 b$) and the exponent size (also measured in bits, computed as the base-2
logarithm of 1 plus the maximum exponent value) be at least as great
as the values in table~\ref{Floating-Format-Requirements-Table}.

\begin{table}[t]
\caption{Recommended Minimum Floating-Point Precision and Exponent Size}
\label{Floating-Format-Requirements-Table}
\begin{tabular}{@{}lll@{}}
{Format\quad\quad}&{Minimum Precision\quad\quad}&{Minimum Exponent Size} \\ \hlinesp
Short&13 bits&5 bits \\
Single&24 bits&8 bits \\
Double&50 bits&8 bits \\
Long&50 bits&8 bits
\end{tabular}
\end{table}
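
The precision of each format in a given implementation can be inspected with
the standard functions \cdf{float-digits} and \cdf{float-radix}.  The sketch
below computes $p\log_2 b$ as described above; the helper name
\cdf{precision-in-bits} is hypothetical, and the results are
implementation-dependent, so none are shown.
\begin{lisp}
(defun precision-in-bits (x) \\
~~(* (float-digits x) (log (float-radix x) 2))) \\
(precision-in-bits 1.0s0)~~~~~;\textrm{suggested to be at least 13} \\
(precision-in-bits 1.0f0)~~~~~;\textrm{suggested to be at least 24} \\
(precision-in-bits 1.0d0)~~~~~;\textrm{suggested to be at least 50} \\
(precision-in-bits 1.0L0)~~~~~;\textrm{suggested to be at least 50}
\end{lisp}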

Floating-point numbers are written in either decimal fraction or
computerized scientific notation: an optional sign, then a non-empty
sequence of digits with an embedded decimal point, then an optional
decimal exponent specification.
If there is no exponent specifier, then the decimal point is required, and there
must be digits after it.
The exponent specifier consists of an exponent marker, an optional sign,
and a non-empty sequence of digits.
For preciseness, here is a BNF description of floating-point notation.

\begin{tabbing}
\emph{floating-point-number} ::= \=\Mopt{\emph{sign}} \Mstar{\emph{digit}} {\it
decimal-point} \Mplus{\emph{digit}} \Mopt{\emph{exponent}} \\*
\>\hbox to 0pt{\hss\Mor~}\Mopt{{\it
sign}} \Mplus{\emph{digit}} \Mopt{\emph{decimal-point} \Mstar{\emph{digit}}} {\it
exponent} \\
\emph{sign} ::= \cd{+} {\Mor} \cd{-} \\
\emph{decimal-point} ::= \cd{.} \\
\emph{digit} ::= \cd{0} {\Mor} \cd{1} {\Mor} \cd{2} {\Mor} \cd{3} {\Mor} \cd{4}
         {\Mor} \cd{5} {\Mor} \cd{6} {\Mor} \cd{7} {\Mor} \cd{8} {\Mor} \cd{9}\\
\emph{exponent} ::= \emph{exponent-marker} \Mopt{\emph{sign}} \Mplus{\emph{digit}}\\*
\emph{exponent-marker} ::= \cd{e} {\Mor} \cd{s} {\Mor} \cd{f}
{\Mor} \cd{d} {\Mor} \cd{l} {\Mor} \cd{E} {\Mor} \cd{S} {\Mor} \cd{F} {\Mor}
\cd{D} {\Mor} \cd{L}
\end{tabbing}

If no exponent specifier is present, or if the exponent marker
\cdf{e} (or \cdf{E}) is used, then the precise format to be used is not
specified.  When such a representation is read and converted to an internal
floating-point data object, the format is specified by
the variable \cdf{*read-default-float-format*}; the initial value of this
variable is \cdf{single-float}.

The letters \cd{s}, \cd{f}, \cd{d}, and \cd{l} (or their respective uppercase equivalents)
explicitly specify the use of \emph{short}, \emph{single}, \emph{double}, and
\emph{long} format, respectively.

Examples of floating-point numbers:
\begin{lisp}
0.0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Floating-point zero in default format} \\
0E0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Also floating-point zero in default format} \\
-.0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{This may be a zero or a minus zero,} \\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~; \textrm{depending on the implementation} \\
0.~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{The \emph{integer} zero, not a floating-point zero!} \\
0.0s0~~~~~~~~~~~~~~~~~~~~~~~;\textrm{A floating-point zero in \emph{short} format} \\
0s0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Also a floating-point zero in \emph{short} format} \\
3.1415926535897932384d0~~~~~;\textrm{A \emph{double}-format approximation to $\pi$} \\
6.02E+23~~~~~~~~~~~~~~~~~~~~;\textrm{Avogadro's number, in default format} \\
602E+21~~~~~~~~~~~~~~~~~~~~~;\textrm{Also Avogadro's number, in default format} \\
3.010299957f-1~~~~~~~~~~~~~~;\textrm{$\log_{10} 2$, in \emph{single} format} \\
-0.000000001s9~~~~~~~~~~~~~~;\textrm{$e^{\pi i}$ in \emph{short} format}
\end{lisp}
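
How the reader classifies such tokens can be checked with standard type
predicates.  This is a minimal sketch; the last line assumes that
\cdf{*read-default-float-format*} has the same value when the expression is
read and when it is evaluated.
\begin{lisp}
(integerp 0.)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{true: a trailing point does not make a float} \\
(floatp 0.0)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{true} \\
(typep 0s0 'short-float)~~~~~~~~~~~~~~~~~~;\textrm{true} \\
(typep 3.010299957f-1 'single-float)~~~~~~;\textrm{true} \\
(typep 3.1415926535897932384d0 'double-float)~~;\textrm{true} \\
(typep 1.5 *read-default-float-format*)~~~;\textrm{true under the stated assumption}
\end{lisp}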

The internal format used for an external representation depends only on
the exponent marker and not on the number of decimal digits in the external
representation.

While Common Lisp provides terminology and notation sufficient to
accommodate four distinct floating-point formats, not all implementations
will wish to support that many distinct formats.
An implementation is therefore permitted to provide fewer than four distinct internal
floating-point formats, in which case at least one of them
will be ``shared'' by more than one of the external format names \emph{short}, \emph{single},
\emph{double}, and \emph{long} according to the following rules:
\begin{itemize}
\item
If one internal format is provided, then it is considered to be
\emph{single}, but serves also as \emph{short}, \emph{double}, and
\emph{long}.
The data types \cdf{short-float}, \cdf{single-float}, \cdf{double-float}, and
\cdf{long-float} are considered to be identical.  In such an implementation an expression such as
\cd{(eql 1.0s0 1.0d0)} will be true, because the two numbers \cd{1.0s0} and
\cd{1.0d0} will be converted into the same internal format and therefore
be considered to have the same data type, despite the differing
external syntax.
Similarly, \cd{(typep 1.0L0 'short-float)} will be true in such an implementation.
For output purposes all floating-point numbers are printed in
\emph{single} format, and thus the exponent is printed with the letter \cd{E}
or \cd{F}.
\item
If two internal formats are provided, then either of two
correspondences may be used, depending on which is the more appropriate:
\begin{itemize}
\item
One format is \emph{short}; the other is \emph{single} and serves
also as \emph{double} and \emph{long}.
The data types \cdf{single-float}, \cdf{double-float}, and \cdf{long-float} are considered
to be identical, but \cdf{short-float} is distinct.
An expression such as \cd{(eql 1.0s0 1.0d0)} will be false, but
\cd{(eql 1.0f0 1.0d0)} will be true.  Similarly, \cd{(typep 1.0L0 'short-float)} will be false,
but \cd{(typep 1.0L0 'single-float)} will be true.
For output purposes all floating-point numbers are considered to be \emph{short} or
\emph{single}.

\item
One format is \emph{single} and serves also as \emph{short}.
The other format is \emph{double} and serves also as \emph{long}.
The data types \cdf{short-float} and \cdf{single-float} are considered to be
identical, and \cdf{double-float} and \cdf{long-float} are considered to be identical.
An expression such as \cd{(eql 1.0s0 1.0d0)} will be false, as will
\cd{(eql 1.0f0 1.0d0)}, but \cd{(eql 1.0d0 1.0L0)} will be true.
Similarly, \cd{(typep 1.0L0 'short-float)} will be false, but
\cd{(typep 1.0L0 'double-float)} will be true.
For output purposes all floating-point numbers are considered to be \emph{single} or
\emph{double}.
\end{itemize}

\item
If three internal formats are provided, then either of two
correspondences may be used, depending on which is the more appropriate:
\begin{itemize}
\item
One format is \emph{short}; another is \emph{single}; and the third is \emph{double} and
serves also as \emph{long}.  Constraints similar to those described above
apply.

\item
One format is \emph{single} and serves also as \emph{short};
another is \emph{double}; and the third is \emph{long}.
\end{itemize}

\end{itemize}
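
Which of these correspondences a particular implementation uses can be probed
from within Lisp.  The following sketch relies on the fact that \cdf{eql}
distinguishes floating-point numbers of different internal formats; the
results are implementation-dependent, so none are shown.
\begin{lisp}
(length (remove-duplicates (list 1.0s0 1.0f0 1.0d0 1.0L0))) \\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{the number of distinct internal formats, from 1 to 4} \\
(eql 1.0s0 1.0f0)~~~~~~~~~~~~~~;\textrm{true if \emph{short} and \emph{single} share a format} \\
(eql 1.0d0 1.0L0)~~~~~~~~~~~~~~;\textrm{true if \emph{double} and \emph{long} share a format}
\end{lisp}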

\beforenoterule
\begin{implementation}
Implementors are encouraged to provide as many distinct floating-point formats
as feasible, using
table~\ref{Floating-Format-Requirements-Table} as a guideline.
Ideally, \emph{short}-format floating-point numbers should
have a ``fast'' representation, in particular one that requires no heap allocation;
\emph{single}-format numbers should approximate the IEEE standard single format;
and \emph{double}-format numbers should approximate the IEEE standard double format
\cite{IEEE-PROPOSED-FLOATING-POINT-STANDARD,IEEE-FLOATING-POINT-IMPL-GUIDE,IEEE-FLOATING-POINT-IMPL-GUIDE-ERRATA}.
\end{implementation}
\afternoterule

\subsection{Complex Numbers}

Complex numbers (type \cdf{complex})
are represented in Cartesian form, with a real part and an imaginary part, each
of which is a non-complex number (integer, ratio, or floating-point
number).  It should be emphasized that the parts of a complex number are not
necessarily floating-point numbers; in this, Common Lisp is like PL/I and
differs from Fortran.  However, both parts must be of the same type: either both
are rational, or both are of the same floating-point format.

Complex numbers may be notated by writing the characters \cd{\#C}
followed by a list of the real and imaginary parts.
If the two parts as notated are not of the same type, then they are
converted according to the rules of floating-point contagion
described in chapter~\ref{NUMBER}.

\begin{lisp}
\#C(3.0s1 2.0s-1)~~~~~;\textrm{Real and imaginary parts are short format}\\
\#C(5 -3)~~~~~~~~~~~~~;\textrm{A Gaussian integer} \\
\#C(5/3 7.0)~~~~~~~~~~;\textrm{Will be converted to \cd{\#C(1.66666 7.0)}} \\
\#C(0 1)~~~~~~~~~~~~~~;\textrm{The imaginary unit, \emph{i}}
\end{lisp}

The type of a specific complex number is indicated by a list of the word
\cdf{complex} and the type of the components; for example, a specialized
representation for complex numbers with short floating-point parts
would be of type \cd{(complex short-float)}. The type \cdf{complex} encompasses all
complex representations.

A complex number of type \cd{(complex rational)}, that is, one whose parts are
rational, can never have a zero imaginary part. If a computation
produces a complex rational with a zero imaginary part, the
result is automatically converted to a non-complex rational number
equal to the real part. This is called the rule of
\emph{complex canonicalization}. The rule does not apply to
floating-point complex numbers; \cd{\#C(5.0 0.0)} and \cd{5.0} are different
numbers.
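
The canonicalization rule can be observed directly. The following is a minimal
sketch that uses only the standard functions \cdf{complex}, \cdf{complexp},
\cdf{realpart}, and \cdf{floatp}; the results given in the comments are what a
conforming implementation is expected to produce.

\begin{lisp}
(complex 5 0)~~~~~~~~~~~~~~~~~~~~;\textrm{The integer \cd{5}: rational parts, zero imaginary part} \\
(complex 5.0 0.0)~~~~~~~~~~~~~~~~;\textrm{Remains the complex number \cd{\#C(5.0 0.0)}} \\
(complexp (complex 5.0 0.0))~~~~~;\textrm{True} \\
(complexp (complex 5 0))~~~~~~~~~;\textrm{False} \\
(* \#C(0 1) \#C(0 1))~~~~~~~~~~~~~~;\textrm{The integer \cd{-1}, not a complex number} \\
(floatp (realpart \#C(5/3 7.0)))~~;\textrm{True: \cd{5/3} was converted to floating point}
\end{lisp}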

\section{Characters}

Characters are data objects of type \cdf{character}.

A character object can be written as \cd{\#{\Xbackslash}} followed by
the character itself. For example, \cd{\#{\Xbackslash}g}
denotes the lowercase letter g. This works well enough
for printing characters. Non-printing characters have names and can be
written as \cd{\#{\Xbackslash}} followed by the name; for example,
\cd{\#{\Xbackslash}Space} (or \cd{\#{\Xbackslash}SPACE} or
\cd{\#{\Xbackslash}space} or \cd{\#{\Xbackslash}sPaCE}) denotes the space character.
The syntax for a character name after \cd{\#{\Xbackslash}} is the
same as that for symbols. However, only names known to the
particular implementation may be used.
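
These reading rules can be checked with a few standard predicates. The sketch
below uses \cdf{characterp}, \cdf{char=}, and \cdf{char-name}; it relies only on
the names \cd{Space} and the letter \cd{g} mentioned above.

\begin{lisp}
(characterp \#{\Xbackslash}g)~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True} \\
(char= \#{\Xbackslash}Space \#{\Xbackslash}SPACE \#{\Xbackslash}sPaCE)~~~;\textrm{True: case in a character \emph{name} is ignored} \\
(char= \#{\Xbackslash}g \#{\Xbackslash}G)~~~~~~~~~~~~~~~~~~~~~~;\textrm{False: the character itself is case-sensitive} \\
(char-name \#{\Xbackslash}Space)~~~~~~~~~~~~~~~~~;\textrm{The string \cd{{\Xdquote}Space{\Xdquote}}}
\end{lisp}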

\subsection{Standard Characters}

Common Lisp defines a standard character set (subtype
\cdf{standard-char}) for two purposes. Common Lisp programs that are
\emph{written} in the standard character set can be read by any
Common Lisp implementation; and Common Lisp programs that
\emph{use} only standard characters as data objects
are most likely to be portable.
The standard character set consists of the space
character \cd{\#{\Xbackslash}Space}, the newline character
\cd{\#{\Xbackslash}Newline}, and the following ninety-four printing
characters or their equivalents:
\begin{lisp}
! " \# \$ \% \& ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? \\
{\Xatsign} A B C D E F G H I J K L M N O P Q R S T U V W X Y Z {\Xlbracket} {\Xbackslash} {\Xrbracket} {\Xcircumflex} {\Xunderscore} \\
{\Xbq} a b c d e f g h i j k l m n o p q r s t u v w x y z {\Xlbrace} | {\Xrbrace} {\Xtilde}
\end{lisp}

The Common Lisp standard character set evidently corresponds to
the ninety-five standard ASCII printing characters plus a newline
character.

Of the ninety-four non-blank printing characters, the following are used in only
limited ways in the syntax of Common Lisp programs:
\begin{lisp}
{\Xlbracket}~~{\Xrbracket}~~{\Xlbrace}~~{\Xrbrace}~~?~~!~~{\Xcircumflex}~~{\Xunderscore}~~{\Xtilde}~~\$~~\% 
\end{lisp}

The following characters are called \emph{semi-standard}:
\begin{lisp}
\#{\Xbackslash}Backspace~~\#{\Xbackslash}Tab~~\#{\Xbackslash}Linefeed~~\#{\Xbackslash}Page~~\#{\Xbackslash}Return~~\#{\Xbackslash}Rubout
\end{lisp}

Not all implementations of Common Lisp need to support these characters; but
those implementations that use the ASCII character set should support them,
corresponding respectively to BS (octal code 010), HT (011), LF (012), FF (014), CR
(015), and DEL (177). These characters are not members of the subtype
\cdf{standard-char} unless synonyms are defined for them.
For example, an implementor may
define \cd{\#{\Xbackslash}Linefeed} or \cd{\#{\Xbackslash}Return} to be a
synonym for \cd{\#{\Xbackslash}Newline},
or \cd{\#{\Xbackslash}Tab} to be a synonym for \cd{\#{\Xbackslash}Space}.

\subsection{Line Divisions}

The treatment of line divisions is one of the most difficult issues in
designing portable software, simply because there is so little agreement
among operating systems. Some
use a single character to delimit lines; the recommended ASCII character for this
purpose is the line feed character LF (also called the new line character, NL),
but some systems use the carriage return character CR. Much more
common is the two-character sequence CR followed by
LF. Frequently line divisions have no representation as a character but are
implicit in the structuring of a file into records, each record containing a line
of text. A deck of punched cards has this structure, for example.

Common Lisp provides an abstract interface by requiring that there be a single
character, \cd{\#{\Xbackslash}Newline}, that serves as the line delimiter. (The
language C has a similar requirement.)
An implementation of Common Lisp must translate between this single-character
representation and whatever external representation the operating system requires.

The requirement that a line division be represented as a single character
has certain consequences. A string object written in the middle of a program and
spanning several lines must contain exactly one character for each
line division. Consider this code fragment:
\begin{lisp}
(setq a-string "This string \\
contains \\
forty-two characters.")
\end{lisp}

Between \cdf{g} and \cdf{c} there must be exactly one character,
\cd{\#{\Xbackslash}Newline}; a two-character sequence
such as \cd{\#{\Xbackslash}Return} followed by \cd{\#{\Xbackslash}Newline}
is not acceptable. The same is true between \cdf{s} and \cdf{f}.

When the character \cd{\#{\Xbackslash}Newline} is written to an output file,
the Common Lisp implementation must take the appropriate action to
produce a line division. This might involve translating
\cd{\#{\Xbackslash}Newline} to a CR/LF sequence.
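
The single-character rule is visible from within the language regardless of how
the operating system stores line divisions. A minimal sketch, using the
standard functions \cdf{string}, \cdf{length}, \cdf{char}, and \cdf{format} (whose
\cd{{\Xtilde}\%} directive outputs a \cd{\#{\Xbackslash}Newline}):

\begin{lisp}
(length (string \#{\Xbackslash}Newline))~~~~~~~~~~~;\textrm{1} \\
(length (format nil {\Xdquote}a{\Xtilde}\%b{\Xdquote}))~~~~~~~~~~;\textrm{3: two letters and one line division} \\
(char= (char (format nil {\Xdquote}a{\Xtilde}\%b{\Xdquote}) 1) \\
~~~~~~~\#{\Xbackslash}Newline)~~~~~~~~~~~~~~~~~~~;\textrm{True}
\end{lisp}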

\subsection{Non-standard Characters}

Any implementation may provide additional characters, whether
printing characters or named characters. Some plausible examples:

\begin{lisp}
\#{\Xbackslash}$\pi$~~\#{\Xbackslash}$\alpha$~~\#{\Xbackslash}Break~~\#{\Xbackslash}Home-Up~~\#{\Xbackslash}Escape
\end{lisp}

The use of such characters may render Common Lisp programs
non-portable.

\section{Symbols}

Symbols are Lisp data objects that serve several purposes and have several
interesting characteristics. Every object of type \cdf{symbol} has a name,
called its \emph{print name}. Given a symbol, one can obtain its name
in the form of a string. Conversely, given the name of a symbol as a
string, one can obtain the symbol itself. (More precisely,
symbols are organized into \emph{packages}, and all the symbols in a
package are uniquely identified by name. See chapter~\ref{XPACK}.)

Symbols have a component called the \emph{property list}, or \emph{plist}.
A property list is always a list whose even-numbered components (counting the
first component as zero) are symbols, here serving as property names, and whose
odd-numbered components are the associated property values. Functions are
provided for manipulating this property list; in effect, these allow a symbol to
be treated as an extensible record structure.

Symbols are also used to represent certain kinds of variables in
Lisp programs, and there are functions for dealing with the values associated
with symbols in this role.

A symbol can be notated simply by writing its name.
If its name is not empty, and if the name consists only of uppercase alphabetic,
numeric, or certain pseudo-alphabetic special characters (but not
delimiter characters such as parentheses or space), and if the name of the symbol
cannot be mistaken for a number, then the symbol can be notated by the sequence
of characters in its name.
Any lowercase letters written in the name of a symbol are converted to
uppercase internally.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~\=\kill
FROBBOZ\>;\textrm{The symbol whose name is \cdf{FROBBOZ}} \\
frobboz\>;\textrm{Another way to notate the same symbol} \\
fRObBoz\>;\textrm{Yet another way to notate it} \\
unwind-protect\>;\textrm{A symbol with a hyphen in its name} \\
+\$\>;\textrm{The symbol named \cd{+\$}} \\
1+\>;\textrm{The symbol named \cdf{1+}} \\
+1\>;\textrm{This is the integer 1, not a symbol} \\
pascal{\Xunderscore}style\>;\textrm{This symbol has an underscore in its name} \\
b{\Xcircumflex}2-4*a*c\>;\textrm{This is a single symbol} \\
\>;~\textrm{It has several special characters in its name} \\
file.rel.43\>;\textrm{This symbol has periods in its name} \\
/usr/games/zork\>;\textrm{This symbol has slashes in its name}
\end{lisp}

In addition to letters and digits, the following characters may be used
in writing the name of a symbol:
\begin{lisp}
+~~-~~*~~/~~{\Xatsign}~~\$~~\%~~{\Xcircumflex}~~\&~~{\Xunderscore}~~=~~<~~>~~{\Xtilde}~~.
\end{lisp}

Some of these characters have conventional purposes
in names.
For example, symbols that name special variables generally have names
beginning and ending with an asterisk \cd{*}.
A single period is used in the notation for conses and dotted lists. The period
is also the decimal point.

The following characters are reserved for use as
macro characters, for changing and extending the syntax of the language:
\begin{lisp}
?~~!~~{\Xlbracket}~~{\Xrbracket}~~{\Xlbrace}~~{\Xrbrace}
\end{lisp}

The print name of a symbol may contain both uppercase and lowercase letters.
However, on input the Lisp reader normally converts lowercase letters to
uppercase.
Internally, the symbols that name all standard Common Lisp variables
and functions have uppercase names. For convenience, this book shows
them in lowercase. Writing symbol names in lowercase
in a program works because \cdf{read} converts the
letters it reads to uppercase.

The function \cdf{readtable-case} controls how
\cdf{read} treats the case of letters in symbol names.

If a symbol cannot be notated simply by the characters of its name, because the
name contains characters that are not otherwise permitted, there are two ``escape''
conventions for notating it. The first is to write a backslash before each
such character. A name written this way is never
mistaken for a number.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~~~~~\=\kill
{\Xbackslash}(\>;\textrm{The symbol whose name is \cd{(}} \\
{\Xbackslash}+1\>;\textrm{The symbol whose name is \cd{+1}} \\
+{\Xbackslash}1\>;\textrm{Also the symbol whose name is \cd{+1}} \\
{\Xbackslash}frobboz\>;\textrm{The symbol whose name is \cdf{fROBBOZ}} \\
3.14159265{\Xbackslash}s0\>;\textrm{The symbol whose name is \cd{3.14159265s0}} \\
3.14159265{\Xbackslash}S0\>;\textrm{A different symbol, whose name is \cd{3.14159265S0}} \\
3.14159265s0\>;\textrm{A short-format floating-point approximation to $\pi$} \\
APL{\Xbackslash}{\Xbackslash}360\>;\textrm{The symbol whose name is \cd{APL{\Xbackslash}360}} \\
apl{\Xbackslash}{\Xbackslash}360\>;\textrm{Also the symbol whose name is \cd{APL{\Xbackslash}360}} \\
{\Xbackslash}(b{\Xcircumflex}2{\Xbackslash}){\Xbackslash} -{\Xbackslash} 4*a*c\>;\textrm{The name is \cd{(B{\Xcircumflex}2) - 4*A*C};} \\
\>;~\textrm{it has parentheses and two spaces in it} \\
{\Xbackslash}({\Xbackslash}b{\Xcircumflex}2{\Xbackslash}){\Xbackslash} -{\Xbackslash} 4*{\Xbackslash}a*{\Xbackslash}c\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c};} \\
\>;~\textrm{the letters are explicitly lowercase}
\end{lisp}

If many such ``forbidden'' characters occur in a name, writing \cd{{\Xbackslash}}
before \emph{every} one of them is tedious. An alternative convention is to
surround the whole name, or only part of it, with
vertical bars. This is equivalent to escaping every character so enclosed
with a backslash.
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~\=\kill
|"|\>;\textrm{The same as writing \cd{{\Xbackslash}"}} \\
|(b{\Xcircumflex}2) - 4*a*c|\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c}} \\
|frobboz|\>;\textrm{The name is \cdf{frobboz}, not \cdf{FROBBOZ}} \\
|APL{\Xbackslash}360|\>;\textrm{The name is \cd{APL360}, because the \cd{{\Xbackslash}} escapes the \cd{3}} \\
|APL{\Xbackslash}{\Xbackslash}360|\>;\textrm{The name is \cd{APL{\Xbackslash}360}} \\
|apl{\Xbackslash}{\Xbackslash}360|\>;\textrm{The name is \cd{apl{\Xbackslash}360}} \\
|{\Xbackslash}|{\Xbackslash}||\>;\textrm{The same as \cd{{\Xbackslash}|{\Xbackslash}|}: the name is \cd{||}} \\
|(B{\Xcircumflex}2) - 4*A*C|\>;\textrm{The name is \cd{(B{\Xcircumflex}2) - 4*A*C};} \\
\>;~\textrm{it has parentheses and two spaces in it} \\
|(b{\Xcircumflex}2) - 4*a*c|\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c}}
\end{lisp}
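
The case conversion and the two escape conventions can be checked with
\cdf{eq} and \cdf{symbol-name}. This is a minimal sketch; it assumes the reader
is in its usual mode of converting unescaped letters to uppercase.

\begin{lisp}
(eq 'frobboz 'FROBBOZ)~~~~~~~~~~~;\textrm{True: both notate the same symbol} \\
(symbol-name 'fRObBoz)~~~~~~~~~~~;\textrm{The string \cd{{\Xdquote}FROBBOZ{\Xdquote}}} \\
(symbol-name '|frobboz|)~~~~~~~~~;\textrm{The string \cd{{\Xdquote}frobboz{\Xdquote}}} \\
(symbol-name '{\Xbackslash}frobboz)~~~~~~~~~~;\textrm{The string \cd{{\Xdquote}fROBBOZ{\Xdquote}}} \\
(eq '{\Xbackslash}+1 '|+1|)~~~~~~~~~~~~~~~~~~;\textrm{True: two notations for the symbol named \cd{+1}} \\
(symbolp '{\Xbackslash}+1)~~~~~~~~~~~~~~~~~~~;\textrm{True} \\
(integerp '+1)~~~~~~~~~~~~~~~~~~~;\textrm{True: without the escape it is a number}
\end{lisp}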

\section{Lists and Conses}
\indexterm{cons}

A \emph{cons} is a record structure containing two components, called the
\emph{car} and the \emph{cdr}. Conses are used primarily to represent
lists.

A \emph{list} is recursively defined to be either the empty list or a cons
whose \emph{cdr} component is a list.
A list is therefore a chain of conses linked by their
\emph{cdr} components and terminated by {\nil}, the empty list.
The \emph{car} components of the conses are called the \emph{elements} of the
list. For each element of the list there is a cons. The empty list has no
elements at all.

A list is notated by writing its elements in order, separated by
blank space (space, tab, or return) and surrounded by parentheses.
\begin{lisp}
(a b c)~~~~~~~~~~~~~~~;\textrm{A list of three elements} \\
(2.0s0 (a 1) \#{\Xbackslash}*)~~~~~;\textrm{A list of three elements: a short floating-point} \\
~~~~~~~~~~~~~~~~~~~~~~;~\textrm{number, another list, and a character object}
\end{lisp}
The empty list {\nil} can therefore be written as {\emptylist}, because
it is a list with no elements.

A \emph{dotted list} is one whose last cons has in its
\emph{cdr} not {\nil} but
some other data object (which is itself not a
cons, or the cons in question would not be the last cons of the list).
Such a list is called ``dotted'' because of the special notation
used for it: the elements of the list are written between parentheses as
before, but after the last element and before the closing parenthesis are written
a dot (surrounded by blank space) and then the \emph{cdr} of the
last cons. As a special case, a single cons is
notated by writing its \emph{car} and \emph{cdr} between parentheses,
separated by a dot surrounded by blank space.
For example:
\begin{lisp}
(a . 4)~~~~~~~~~;\textrm{A cons whose \emph{car} is a symbol} \\
~~~~~~~~~~~~~~~~;~\textrm{and whose \emph{cdr} is an integer} \\
(a b c . d)~~~~~;\textrm{A dotted list with three elements whose last} \\
~~~~~~~~~~~~~~~~;~\textrm{cons has the symbol \cdf{d} in its \emph{cdr}}
\end{lisp}

It is also legitimate to write something like \cd{(a b . (c d))};
this means the same as \cd{(a b c d)}. The standard Lisp printer never
prints a list in the first form, however; it avoids dot notation
whenever possible.
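
The relationship between conses, lists, and dot notation can be made concrete
with the standard functions \cdf{cons}, \cdf{cdr}, \cdf{last}, and \cdf{equal}.
The following is a minimal sketch:

\begin{lisp}
(cons 'a (cons 'b nil))~~~~~~~~~~~~~~;\textrm{The list \cd{(a b)}: a chain of two conses} \\
(cdr '(a . 4))~~~~~~~~~~~~~~~~~~~~~~~;\textrm{The integer 4} \\
(equal '(a b . (c d)) '(a b c d))~~~~;\textrm{True: two notations for one structure} \\
(cdr (last '(a b c . d)))~~~~~~~~~~~~;\textrm{The symbol \cdf{d}, not {\nil}}
\end{lisp}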

Often the term \emph{list} is used to refer either to true lists or to dotted
lists. When the distinction is important, the term ``true list'' is used
for a list terminated by {\nil}. Most functions advertised to
operate on lists expect to be given true lists. Throughout this book, unless
otherwise specified, it is an error to pass a dotted list to such a function.

\beforenoterule
\begin{implementation}
Implementors are encouraged to use the equivalent
of the predicate \cdf{endp} wherever it is necessary to test
for the end of a list.  Whenever feasible, this test should explicitly
signal an error if a list is found to be terminated by a non-{\nil} atom.
However, such an explicit error signal is not required, because
some such tests occur in important loops where efficiency is important.
In such cases, the predicate \cdf{atom} may be used to test
for the end of the list, quietly treating any non-{\nil} list-terminating
atom as if it were {\nil}.
\end{implementation}
\afternoterule

Sometimes the term \emph{tree} is used to refer to some cons
together with the other conses in its \emph{car} and \emph{cdr} components,
the conses in their components in turn, and so on, until
non-conses are reached.
These non-conses are called the \emph{leaves} of the tree.

Lists, dotted lists, and trees are not mutually exclusive data types;
they are simply useful points of view about structures of
conses.
There are yet other terms, such as \emph{association
  list}. None of these is a true Lisp data
type. Conses are a data type, and {\nil} is the sole object of type
\cdf{null}. The Lisp data type \cdf{list} is taken to mean the union of the
\cdf{cons} and \cdf{null} types, and therefore encompasses both true
lists and dotted lists.
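
Because \cdf{list} is just the union of \cdf{cons} and \cdf{null}, the predicate
\cdf{typep} cannot distinguish true lists from dotted lists. A short sketch:

\begin{lisp}
(typep '() 'null)~~~~~~~~~~~~;\textrm{True} \\
(typep '() 'list)~~~~~~~~~~~~;\textrm{True} \\
(typep '(a b c) 'list)~~~~~~~;\textrm{True: a true list} \\
(typep '(a . b) 'list)~~~~~~~;\textrm{True: a dotted list is still of type \cdf{list}} \\
(typep 'a 'list)~~~~~~~~~~~~~;\textrm{False: neither a cons nor {\nil}}
\end{lisp}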

\section{Arrays}
\label{ARRAY-TYPE-SECTION}
\indexterm{array}

An \emph{array} (type \cdf{array}) is an object with components arranged according to a
Cartesian coordinate system.

The number of dimensions of an array is called its \emph{rank} (this terminology is borrowed from
APL). The rank is a non-negative integer.
Likewise, each dimension is itself a non-negative integer.
The total number of elements in the array is the product of all the
dimensions.

An implementation of Common Lisp may impose a limit on the rank of an array, but this
limit may not be smaller than 7. Therefore, any Common Lisp program
may use arrays of rank 7 or less.
(A program may determine the actual limit on array rank for a given implementation
by examining the constant \cdf{array-rank-limit}.)

It is permissible for a dimension to be zero. In this case the array has no
elements, and any attempt to access an element is an error. However, other
properties of the array may still be used. If the rank is zero, then the array
has no dimensions, and their product is by definition 1.
A zero-rank array therefore has a single element.

An array element is specified by a sequence of indices.
The length of the sequence must equal the rank of the array.
Each index must be a non-negative integer strictly less than the
corresponding dimension. Array indexing is therefore zero-origin, not
one-origin as in (the default case of) Fortran.

As an example, suppose that the variable \cdf{foo} names a two-dimensional
array with dimensions 3 and 5. The first index may be 0, 1, or 2, and the second
index may be 0, 1, 2, 3, or 4. One may refer to array elements
using the function \cdf{aref}; for example, \cd{(aref foo 2 1)}
refers to element (2, 1) of the array. Note that \cdf{aref} takes
a variable number of arguments: an array, and as many indices as the array has
dimensions.
A zero-rank array has no dimensions, and therefore \cdf{aref} takes
only one argument, the array itself, with no indices, and returns the sole
element of the array.
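
The following sketch works through the example above with the standard
functions \cdf{make-array}, \cdf{array-rank}, \cdf{array-dimensions},
\cdf{array-total-size}, and \cdf{aref}. The variable \cdf{foo} is the one named
in the text; the variable \cdf{z} is a hypothetical name introduced here for a
zero-rank array.

\begin{lisp}
(defparameter foo (make-array '(3 5) :initial-element 0)) \\
(array-rank foo)~~~~~~~~~~~~~~~~;\textrm{2} \\
(array-dimensions foo)~~~~~~~~~~;\textrm{The list \cd{(3 5)}} \\
(array-total-size foo)~~~~~~~~~~;\textrm{15, the product of the dimensions} \\
(setf (aref foo 2 1) 'x)~~~~~~~~;\textrm{Store into element (2, 1)} \\
(aref foo 2 1)~~~~~~~~~~~~~~~~~~;\textrm{The symbol just stored} \\
(defparameter z (make-array '() :initial-element 7)) \\
(array-rank z)~~~~~~~~~~~~~~~~~~;\textrm{0} \\
(array-total-size z)~~~~~~~~~~~~;\textrm{1} \\
(aref z)~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{7, the sole element}
\end{lisp}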

In general, arrays can be multidimensional, can share their contents with
other arrays, and can have their size altered dynamically after creation (either
enlarging or shrinking).
A one-dimensional array may also have a \emph{fill pointer}.

Multidimensional arrays store their components in row-major order.
That is, internally a multidimensional array is stored as a one-dimensional array,
with the elements ordered according to the lexicographic order of their indices. This is
important in two situations:
(1) when arrays with different dimensions share their contents, and
(2) when accessing very large arrays in a virtual-memory environment.
(The first situation is a matter of semantics; the second, a matter of efficiency.)
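
Row-major order can be observed with the standard functions
\cdf{array-row-major-index} and \cdf{row-major-aref}, and through an array
displaced to another array. In this sketch the names \cdf{grid} and \cdf{flat}
are hypothetical; for a 3 by 5 array, element (2, 1) has row-major index
$2 \cdot 5 + 1 = 11$.

\begin{lisp}
(defparameter grid (make-array '(3 5) :initial-element 0)) \\
(setf (aref grid 2 1) 'x) \\
(array-row-major-index grid 2 1)~~~~;\textrm{11} \\
(row-major-aref grid 11)~~~~~~~~~~~~;\textrm{The symbol stored at (2, 1)} \\
(defparameter flat (make-array 15 :displaced-to grid)) \\
(aref flat 11)~~~~~~~~~~~~~~~~~~~~~~;\textrm{The same element, seen through shared contents}
\end{lisp}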

An array that is not displaced to another array, has no fill pointer, and
is not to have its size adjusted dynamically after creation is called a \emph{simple}
array. Users may declare that particular arrays will be
simple. Some implementations can handle simple arrays in an especially
efficient manner; for example, simple arrays may have a more compact
representation than non-simple arrays.

When \cdf{make-array} is called, if one or more of the \cd{:adjustable},
\cd{:fill-pointer}, and \cd{:displaced-to} arguments is true,
then whether the result is a simple array is unspecified. However, if all three
arguments are false, the result is guaranteed to be a simple array.
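
A brief sketch of the distinction, using the standard functions
\cdf{array-has-fill-pointer-p}, \cdf{adjustable-array-p}, and
\cdf{vector-push-extend}. The variable name \cdf{buf} is hypothetical. Note that
the sketch deliberately does not test whether \cdf{buf} is simple, since that is
unspecified.

\begin{lisp}
(typep (make-array 5) 'simple-array)~~;\textrm{True: guaranteed when all three arguments are false} \\
(defparameter buf \\
~~(make-array 5 :fill-pointer 0 :adjustable t)) \\
(array-has-fill-pointer-p buf)~~~~~~~~;\textrm{True} \\
(adjustable-array-p buf)~~~~~~~~~~~~~~;\textrm{True} \\
(vector-push-extend 'a buf)~~~~~~~~~~~;\textrm{Returns 0, the index of the new element} \\
(length buf)~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{1: the fill pointer, not the allocated size}
\end{lisp}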

\subsection{Vectors}

In Common Lisp, one-dimensional arrays are called \emph{vectors} and constitute the type
\cd{vector} (which is in turn a subtype of \cd{array}).
Vectors and lists are collectively considered to be \emph{sequences}. They differ in
that any element of a one-dimensional array can be accessed in constant time,
whereas the average access time for an element of a list is linear in the length of
the list; on the other hand, adding a new element to the front of a list takes
constant time, whereas the same operation on an array takes time linear
in the length of the array.

A general vector (a one-dimensional array that can have any data object as a
component but has no additional attributes) can be notated by
writing its components in order, separated by blank space and surrounded by \cd{\#(} and
\cd{)}.
For example:
\begin{lisp}
\#(a b c)~~~~~~~~~~~~~~~~~~~~;\textrm{A vector of three elements} \\*
\#()~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{An empty vector} \\
\#(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47) \\*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{A vector containing the primes below 50}
\end{lisp}

Note that when the function \cdf{read} parses this syntax, it
always constructs a \emph{simple} vector.
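
The vector notation and the sequence functions can be tried together. This
sketch uses the standard functions \cdf{vectorp}, \cdf{length}, \cdf{aref},
and \cdf{elt} (which accepts any sequence):

\begin{lisp}
(vectorp \#(a b c))~~~~~~~~~~~~~~~;\textrm{True} \\
(length \#(a b c))~~~~~~~~~~~~~~~~;\textrm{3} \\
(length \#())~~~~~~~~~~~~~~~~~~~~~;\textrm{0} \\
(aref \#(a b c) 1)~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{b}} \\
(typep \#(a b c) 'simple-vector)~~;\textrm{True: \cdf{read} constructed a simple vector} \\
(elt \#(a b c) 1)~~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{b}, found in constant time} \\
(elt '(a b c) 1)~~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{b}, found by walking the list}
\end{lisp}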

\beforenoterule
\begin{rationale}
Many people have recommended that square brackets be used to notate vectors,
as \cd{{\Xlbracket}a b c{\Xrbracket}} instead of \cd{\#(a b c)}. This notation would be
shorter, perhaps more readable, and certainly in accord with cultural conventions
in other parts of computer science and mathematics. However, to preserve
the usefulness of the user-definable macro-character feature of
the function \cdf{read}, it was necessary to leave some
characters to the user for this purpose. Experience in MacLisp
has shown that users, especially implementors of languages for use in
artificial intelligence research, often want to define special
meanings for square brackets. Therefore Common Lisp avoids using
square brackets and braces in its own syntax.
\end{rationale}
\afternoterule

Implementations may provide specialized representations of arrays for
efficiency in the case where all the components are of the same
specialized (for example, numeric) type. All implementations provide
specialized arrays for the case where all the components are characters
(or rather, a specialized subset of the characters).
The one-dimensional instances of this specialization are called \emph{strings}.
All implementations are also required to provide specialized arrays of bits,
that is, arrays of type \cd{(array bit)}.
The one-dimensional instances of this specialization are called \emph{bit-vectors}.

\subsection{Strings}
\label{STRING-TYPE-SECTION}

\begin{lisp}
base-string \EQ\ (vector base-char) \\*
simple-base-string \EQ\ (simple-array base-char (*))
\end{lisp}

An implementation may support other string types. All Common Lisp functions
treat strings uniformly. Note, however, that it is an error to store an
extended character into a base string.

The type \cdf{string} is a subtype of the type \cdf{vector}.

A string can be written as a sequence of characters preceded and
followed by a double quote character \cd{{\Xdquote}}.
Any \cd{{\Xdquote}} or \cd{{\Xbackslash}} character in the sequence must
additionally be preceded by a \cd{{\Xbackslash}} character.

For example:
\begin{lisp}
{\Xdquote}Foo{\Xdquote}~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{A string of three characters} \\*
{\Xdquote}{\Xdquote}~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{An empty string} \\
{\Xdquote}{\Xbackslash}{\Xdquote}APL{\Xbackslash}{\Xbackslash}360?{\Xbackslash}{\Xdquote} he cried.{\Xdquote}~~~~~;\textrm{A string of twenty characters} \\*
{\Xdquote}|x| = |-x|{\Xdquote}~~~~~~~~~~~~~~~~~~;\textrm{A string of ten characters}
\end{lisp}

Note that a vertical bar \cd{|} in a string need not be
preceded by a \cd{{\Xbackslash}}. Similarly, a double quote in the name of a
symbol written using vertical-bar notation need not be
escaped. The double-quote and vertical-bar notations are similar but
are used for different purposes: the double quote indicates a string containing
the characters, whereas the vertical bar indicates a symbol whose name
contains the sequence of characters.

The characters enclosed by the double quotes, taken from left to
right, occupy locations with increasing indices. The leftmost
character is string element number 0, the next one is element number 1, the
next one is element number 2, and so on.

Note that the function \cd{prin1} will
print any character vector (not just
a simple one) using this syntax, but the function \cdf{read} will
always construct a simple string when it reads this syntax.
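
The character counts given above can be confirmed with \cdf{length}, and the
indexing rule with \cdf{char}. A minimal sketch using only standard functions:

\begin{lisp}
(length {\Xdquote}Foo{\Xdquote})~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{3} \\
(length {\Xdquote}{\Xdquote})~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{0} \\
(length {\Xdquote}{\Xbackslash}{\Xdquote}APL{\Xbackslash}{\Xbackslash}360?{\Xbackslash}{\Xdquote} he cried.{\Xdquote})~~~~~~;\textrm{20: each escape counts as one character} \\
(length {\Xdquote}|x| = |-x|{\Xdquote})~~~~~~~~~~~~~~~~~~~;\textrm{10} \\
(char {\Xdquote}Foo{\Xdquote} 0)~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{The character \cd{\#{\Xbackslash}F}} \\
(typep {\Xdquote}Foo{\Xdquote} 'simple-string)~~~~~~~~~~~~;\textrm{True: \cdf{read} constructed a simple string}
\end{lisp}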

\subsection{Bit-Vectors}

A bit-vector can be written as the sequence of bits contained in the
vector, preceded by \cd{\#*}; any delimiter character, such as whitespace,
terminates the bit-vector syntax.
For example:
\begin{lisp}
\#*10110~~~~~;\textrm{A five-bit bit-vector; bit 0 is a 1} \\
\#*~~~~~~~~~~;\textrm{An empty bit-vector}
\end{lisp}

The bits written after the \cd{\#*}, taken from left to right, occupy
locations with increasing indices. The leftmost bit is element number 0, the
next one is element number 1, and so on.

The function \cd{prin1} will print any bit-vector (not just a simple one)
using this syntax, but the function \cdf{read} will
always construct a simple bit-vector when it reads this syntax.
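
The indexing of bits can be checked with the standard accessor \cdf{bit}. A
minimal sketch:

\begin{lisp}
(length \#*10110)~~~~~~~~~~~~~~~~~~~~~;\textrm{5} \\
(bit \#*10110 0)~~~~~~~~~~~~~~~~~~~~~~;\textrm{1: the leftmost bit has index 0} \\
(bit \#*10110 1)~~~~~~~~~~~~~~~~~~~~~~;\textrm{0} \\
(length \#*)~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{0} \\
(typep \#*10110 'simple-bit-vector)~~~;\textrm{True}
\end{lisp}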

\section{Hash Tables}

Hash tables provide an efficient way of mapping any Lisp object
(a \emph{key}) to an associated object. They are provided as primitives of Common
Lisp because some implementations may need to use
internal storage management strategies that would make it difficult for
the user to implement hash tables in a portable fashion.
Hash tables are described in chapter~\ref{HASH}.

\section{Readtables}

A readtable is a data structure that maps
characters into syntax types for the Lisp expression parser.
In particular, a readtable indicates for each character with
\emph{macro-character} syntax what its macro definition is. This is a mechanism by
which the user may reprogram the parser to a
limited but useful extent.
See section~\ref{READTABLE-SECTION}.

\section{Packages}

Packages are collections of symbols that serve as
name spaces. The parser recognizes symbols by looking up character sequences in the current
package. Packages can be used to hide names internal to a module from other
code. Mechanisms are provided for exporting symbols from a given
package to the primary ``user'' package.
See chapter~\ref{XPACK}.

\section{Pathnames}

Pathnames are the means by which a Common Lisp program can
interface to an external file system in a reasonably platform-independent
manner. See section~\ref{PATHNAME}.

\section{Streams}

A stream is a source or sink of data, typically characters or
bytes. Nearly all functions that perform I/O do so with respect to
a specified stream. The function \cdf{open} takes a pathname and returns a stream
connected to the file specified by the pathname.
There are a number of standard streams that are used by default for
various purposes. See chapter~\ref{STREAM}.

The types
\cdf{broadcast-stream}, \cdf{concatenated-stream},
\cdf{echo-stream}, \cdf{synonym-stream}, \cdf{string-stream}, \cdf{file-stream},
and \cdf{two-way-stream} are disjoint subtypes of \cdf{stream}.
Note that a synonym stream is always of type \cdf{synonym-stream}
regardless of the type of the stream to which it refers.

\section{Random-States}

An object of type \cdf{random-state} is used to hold the
state information used by the pseudo-random number generator. For more
information about \cdf{random-state} objects, see chapter~\ref{RANDOM}.

\section{Structures}

Structures are instances of user-defined data types that
have a fixed number of named components. They are
analogous to records in Pascal.
Structures are declared using the \cdf{defstruct} construct; \cdf{defstruct}
automatically defines a constructor and access functions for the new
data type.

Different structures may print out in different ways.
The definition of a structure type may specify a print procedure to use for
objects of that type (see the \cd{:print-function} option to \cdf{defstruct}).
The default notation for structures is:
\begin{lisp}
\#S(\emph{structure-name} \\
~~~~~~~~\emph{slot-name-1} \emph{slot-value-1} \\
~~~~~~~~\emph{slot-name-2} \emph{slot-value-2} \\
~~~~~~~~~~~~~~~~~~~~~~...)
\end{lisp}
where \cd{\#S} indicates structure syntax, \emph{structure-name} is the
name (a symbol) of the structure type, each \emph{slot-name} is the name
(also a symbol) of a slot, and each corresponding \emph{slot-value} is the
representation of the Lisp object in that slot.
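
As an illustration, the following sketch declares a structure type and uses the
constructor and access functions that \cdf{defstruct} defines automatically. The
structure name \cdf{ship}, its slot names, and the variable \cdf{s} are
hypothetical names chosen for this example.

\begin{lisp}
(defstruct ship x-position y-position) \\
(defparameter s (make-ship :x-position 0 :y-position 10)) \\
(ship-p s)~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True} \\
(ship-y-position s)~~~~~~~~~~~~~~;\textrm{10} \\
(setf (ship-y-position s) 20)~~~~;\textrm{Access functions work with \cdf{setf}} \\
(ship-y-position s)~~~~~~~~~~~~~~;\textrm{20}
\end{lisp}

With no print function specified, an implementation would typically print
\cdf{s} in the \cd{\#S} notation shown above, with the slot names written as
keywords.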

\section{Functions}
The type \cdf{function} is disjoint from
\cdf{cons} and \cdf{symbol}, and so a list whose \emph{car} is
\cdf{lambda} is not, strictly speaking, of type \cdf{function}, nor
is any symbol.

However, standard Common Lisp functions that accept functional
arguments will accept a symbol or a list whose \emph{car}
is \cdf{lambda} and automatically coerce it to a function. Such functions
include \cdf{funcall}, \cdf{apply}, and \cdf{mapcar}.
Such functions do not, however, accept a lambda-expression as a functional
argument. Thus one may not write
\vskip 3pt
\begin{lisp}
(mapcar '(lambda (x y) (sqrt (* x y))) p q)
\end{lisp}
but must instead write something like
\begin{lisp}
(mapcar \#'(lambda (x y) (sqrt (* x y))) p q)
\end{lisp}

This change makes it impermissible to represent a lexical closure as a
list whose \emph{car} is some special marker.

The value of the \cdf{function} operator is always of type \cdf{function}.
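
The disjointness of \cdf{function} from \cdf{symbol} and \cdf{cons} can be
checked with \cdf{typep}. The sketch below also calls a function through a
symbol and through a \cd{\#'} lambda-expression; floating-point arguments are
used so that the result of \cdf{sqrt} is unambiguous.

\begin{lisp}
(typep \#'car 'function)~~~~~~~~~~~~~~~;\textrm{True} \\
(typep 'car 'function)~~~~~~~~~~~~~~~~;\textrm{False: a symbol is not a function} \\
(typep '(lambda (x) x) 'function)~~~~~;\textrm{False: a list is not a function} \\
(funcall 'car '(a b))~~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{a}: the symbol was coerced} \\
(mapcar \#'(lambda (x y) (sqrt (* x y))) \\
~~~~~~~~'(1.0 4.0) '(4.0 9.0))~~~~~~~~;\textrm{The list \cd{(2.0 6.0)}}
\end{lisp}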

\section{Unreadable Data Objects}

Some objects may print in implementation-dependent
ways.
Such objects cannot be reliably reconstructed from their printed form,
and so they are usually printed in a format informative to the user but not
acceptable to the function \cdf{read}:
\cd{\#<\emph{useful information}>}.

As a hypothetical example, an implementation might print
\begin{lisp}
\#<stack-pointer si:rename-within-new-definition-maybe \#o311037552>
\end{lisp}
для некоторого специфичного типа данных <<внутреннего указатель на стек>>, у
которого выводимое отображение включает имя типа, некоторую информацию о слоте
стека, и машинный адрес (в восьмеричной система) данного слота.

Смотрите \cdf{print-unreadable-object}, макрос, которые выводит объект используя
\cd{\#<} синтаксис.
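
As a minimal sketch of how a program might produce such a representation, the
following defines a printing method that uses \cdf{print-unreadable-object}.
The class \cdf{stack-slot} and its reader \cdf{stack-slot-name} are
hypothetical names introduced only for illustration. The exact text printed,
in particular the identity information, is implementation-dependent.
\begin{lisp}
;; A hypothetical class standing in for an internal data type.
(defclass stack-slot ()
  ((name :initarg :name :reader stack-slot-name)))

;; Print instances in unreadable form: the type name, then the slot
;; name, then an implementation-dependent identity such as an address.
(defmethod print-object ((slot stack-slot) stream)
  (print-unreadable-object (slot stream :type t :identity t)
    (princ (stack-slot-name slot) stream)))

(print (make-instance 'stack-slot :name 'frame-base))
\end{lisp}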

\section{Overlap, Inclusion, and Disjointness of Types}
\label{DATA-TYPE-RELATIONSHIPS}

\begin{figure}
\caption{Common Lisp Type Hierarchy by Greg Pfeil}
\label{TYPES-HEIRARCHY-GRAPHIC}
\igraphics{CL-type-hierarchy}
\small\noindent
\end{figure}


The Common Lisp data type hierarchy is tangled and purposely left somewhat
open-ended so that implementors may experiment with new data types as
extensions to the language. This section explicitly states all the
defined relationships between types, including subtype/supertype relationships,
disjointness, and exhaustive partitioning. The user of Common Lisp
should not depend on any relationship not explicitly stated here.
For example, it is not valid to assume that because a number
is neither complex nor rational it must be a \cdf{float}. An implementation
is permitted to provide yet other kinds of numbers.

First we need some terminology.
If \emph{x} is a supertype of \emph{y}, then any object of type \emph{y} is
also of type \emph{x}, and \emph{y} is said to be a subtype of \emph{x}. If types
\emph{x} and \emph{y} are disjoint, then no object (in any implementation)
may be both of type \emph{x} and of type \emph{y}. Types
$\emph{a}_1$ through $\emph{a}_{n}$ are an \emph{exhaustive union} of type
\emph{x} if each $\emph{a}_j$ is a subtype of \emph{x} and any object of type
\emph{x} is necessarily of at least one of the types $\emph{a}_{j}$.
Types $\emph{a}_1$ through $\emph{a}_{n}$ are furthermore an \emph{exhaustive
partition} if they are also pairwise disjoint.
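
These three notions can be phrased as queries to \cdf{subtypep}. This is an
illustrative sketch, not a definition. Note that \cdf{subtypep} returns two
values, and for compound type specifiers such as those using \cdf{and} or
\cdf{or} an implementation is allowed to report that it could not decide.
\begin{lisp}
;; Subtype/supertype: every integer is a rational.
(subtypep 'integer 'rational)

;; Disjointness: the intersection of cons and symbol is empty,
;; so it should be a subtype of the empty type nil.
(subtypep '(and cons symbol) 'nil)

;; Exhaustive union: every list is either a cons or of type null.
(subtypep 'list '(or cons null))
\end{lisp}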

\begin{itemize}
\item
The type \cdf{t} is a supertype of every other type. Every object is of
type \cdf{t}.

\item 
The type {\nil} is a subtype of every type.
No object is of type {\nil}.

\item
The types \cdf{cons}, \cdf{symbol}, \cdf{array}, \cdf{number}, \cdf{character},
\cdf{hash-table}, \cdf{readtable}, \cdf{package}, \cdf{pathname},
\cdf{stream}, \cdf{random-state}, and any single other type created by
\cdf{defstruct} or \cdf{defclass} are pairwise disjoint.

The type \cdf{function} is disjoint from the types \cdf{cons}, \cdf{symbol},
\cdf{array}, \cdf{number}, and \cdf{character}.

The type \cdf{compiled-function} is a subtype of \cdf{function}. An implementation
is free to define other subtypes of \cdf{function}.
\end{itemize}

\begin{itemize}
\item
The types \cdf{real} and \cdf{complex} are disjoint subtypes of \cdf{number}.
\end{itemize}

\beforenoterule
\begin{rationale}
It might be thought that \cdf{real} and \cdf{complex} should form
an exhaustive partition of the type \cdf{number}. This is purposely avoided
in order to permit extensions of Common Lisp to experiment with the number system.
\end{rationale}
\afternoterule

\begin{itemize}
\item
The types \cdf{rational} and \cdf{float} are disjoint subtypes of \cdf{real}.
\end{itemize}

\beforenoterule
\begin{rationale}
It might be thought that \cdf{rational} and \cdf{float} should form
an exhaustive partition of the type \cdf{real}. This is purposely avoided
in order to permit extensions of Common Lisp to experiment with the number system.
\end{rationale}
\afternoterule

\begin{itemize}
\item
The types \cdf{integer} and \cdf{ratio} are disjoint subtypes of \cdf{rational}.
\end{itemize}

\beforenoterule
\begin{rationale}
It might be thought that \cdf{integer} and \cdf{ratio} should form
an exhaustive partition of the type \cdf{rational}. This is purposely avoided
in order to permit extensions of Common Lisp to experiment with the number system.
\end{rationale}
\afternoterule

The types \cdf{fixnum} and \cdf{bignum}
do in fact form an exhaustive partition of the type \cdf{integer}; more precisely,
the type \cdf{bignum} is by definition equivalent
to \cd{(and~integer (not~fixnum))}.  This is consistent with the
first edition text in section~\ref{INTEGERS-SECTION}.

I interpret this to mean that implementors could still experiment with
such extensions as adding explicit representations of infinity, but such infinities
would necessarily be of type \cdf{bignum}.
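
The equivalence can be observed at the boundary of the \cdf{fixnum} range.
This is a small sketch that relies only on the standard constant
\cdf{most-positive-fixnum}, whose value differs between implementations.
\begin{lisp}
;; One past the largest fixnum is an integer that is not a fixnum,
;; so both tests are expected to be true.
(typep (1+ most-positive-fixnum) 'bignum)
(typep (1+ most-positive-fixnum) '(and integer (not fixnum)))

;; The largest fixnum itself is not a bignum.
(typep most-positive-fixnum 'bignum)
\end{lisp}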

\begin{itemize}
\item
The types \cdf{short-float}, \cdf{single-float}, \cdf{double-float}, and 
\cdf{long-float} are subtypes of \cdf{float}.  Any two of them must be either
disjoint or identical. If two are identical, then any other types between
them in the order listed must also be identical to them (for example, if
\cdf{single-float} and \cdf{long-float} are identical types,
then \cdf{double-float} must be identical to them also).

\item
The type \cdf{null} is a subtype of \cdf{symbol}; the only object of
type \cdf{null} is {\nil}.

\item
The types \cdf{cons} and \cdf{null} form an exhaustive partition of the type \cdf{list}.
\end{itemize}
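
The relationships among \cdf{null}, \cdf{symbol}, \cdf{cons}, and \cdf{list}
can be illustrated with a few \cdf{typep} tests. The comments give the
results that follow from the rules above.
\begin{lisp}
(typep '() 'null)        ; true: nil is the only object of type null
(typep '() 'symbol)      ; true: null is a subtype of symbol
(typep '() 'list)        ; true: null is one half of the partition of list
(typep '(a . b) 'list)   ; true: every cons is a list
(typep '() 'cons)        ; false: cons and null are disjoint
(typep 'a 'list)         ; false: a symbol other than nil is not a list
\end{lisp}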

\begin{itemize}
\item
The type \cdf{standard-char} is a subtype of \cdf{base-char}.
The types \cdf{base-char} and \cdf{extended-char}
form an exhaustive partition of \cdf{character}.
\end{itemize}

\begin{itemize}
\item
The type \cdf{string} is a subtype of \cdf{vector}, for \cdf{string} means
the union of all types \cd{(vector~\emph{c})} such that
\emph{c} is a subtype of \cdf{character}.
\end{itemize}

\begin{itemize}
\item
The type \cdf{bit-vector} is a subtype of \cdf{vector}, for \cdf{bit-vector}
means \cd{(vector bit)}.

\item
The types \cd{(vector t)}, \cdf{string}, and \cdf{bit-vector} are disjoint.

\item
The type \cdf{vector} is a subtype of \cdf{array}; for all types \emph{x},
the type \cd{(vector \emph{x})} is the same as the type \cd{(array \emph{x} (*))}.

\item
The type \cdf{simple-array} is a subtype of \cdf{array}.
\end{itemize}
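
The vector and array relationships listed above can be probed as follows.
This is a sketch only; the comments state what the rules above imply.
\begin{lisp}
;; (vector x) and (array x (*)) name the same type, so each
;; should be a subtype of the other.
(subtypep '(vector bit) '(array bit (*)))
(subtypep '(array bit (*)) '(vector bit))

;; bit-vector means (vector bit) and is therefore a subtype of vector.
(subtypep 'bit-vector 'vector)

;; A string is a vector, but it is not of type (vector t).
(typep "abc" 'vector)        ; true
(typep "abc" '(vector t))    ; false: string and (vector t) are disjoint
\end{lisp}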

\begin{itemize}
\item
The types \cdf{simple-vector}, \cdf{simple-string}, and
\cdf{simple-bit-vector} are disjoint subtypes of \cdf{simple-array},
for they mean
\cd{(simple-array t (*))}, the union of all types
\cd{(simple-array \emph{c} (*))} such that \emph{c} is a subtype of \cdf{character},
and \cd{(simple-array bit (*))}, respectively.
\end{itemize}


\begin{itemize}
\item
The type \cdf{simple-vector} is a subtype of \cdf{vector} and indeed is a subtype of
\cd{(vector t)}. 

\item
The type \cdf{simple-string} is a subtype of \cdf{string}.
(Note that although \cdf{string} is a subtype of \cdf{vector},
\cdf{simple-string} is not a subtype of \cdf{simple-vector}.)
\end{itemize}

\beforenoterule
\begin{rationale}
The hypothetical name \cdf{simple-general-vector} would have been more accurate than
\cdf{simple-vector}, but in this instance euphony and
user convenience were deemed more important to the design
of Common Lisp than a rigid symmetry.
\end{rationale}
\afternoterule

\begin{itemize}
\item
The type \cdf{simple-bit-vector} is a subtype of \cdf{bit-vector}.
(Note that although \cdf{bit-vector} is a subtype of \cdf{vector},
\cdf{simple-bit-vector} is not a subtype of \cdf{simple-vector}.)

\item
The types \cdf{vector} and \cdf{list} are disjoint subtypes of \cdf{sequence}.

\item
The types \cdf{random-state}, \cdf{readtable}, \cdf{package}, \cdf{pathname},
\cdf{stream}, and \cdf{hash-table} are pairwise disjoint.
\end{itemize}

The types \cdf{random-state}, \cdf{readtable}, \cdf{package}, \cdf{pathname},
\cdf{stream}, and \cdf{hash-table} 
are also disjoint from the other types. See the note above.

\begin{itemize}
\item
The types \cdf{two-way-stream}, \cdf{echo-stream},
\cdf{broadcast-stream}, \cdf{file-stream}, \cdf{synonym-stream},
\cdf{string-stream}, and
\cdf{concatenated-stream} are pairwise disjoint subtypes of \cdf{stream}.
\end{itemize}

\begin{itemize}
\item
Any two types created by \cdf{defstruct} are disjoint
unless one is a supertype of the other by virtue of
the \cd{:include} option naming that supertype.
\end{itemize}
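
The effect of \cd{:include} on type relationships can be sketched with three
structure types. The names \cdf{ship}, \cdf{tanker}, and \cdf{lighthouse} are
hypothetical and chosen only for illustration.
\begin{lisp}
(defstruct ship name)
(defstruct (tanker (:include ship)) capacity)   ; tanker is a subtype of ship
(defstruct lighthouse height)                   ; unrelated to ship

(typep (make-tanker) 'ship)       ; true: ship is a supertype via :include
(subtypep 'tanker 'ship)          ; expected to be true
(typep (make-lighthouse) 'ship)   ; false: the two types are disjoint
(typep (make-ship) 'tanker)       ; false: inclusion runs one way only
\end{lisp}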

\fi