%Part{Dtypes, Root = "CLM.MSS"}
% Chapter of Common Lisp Manual. Copyright 1984, 1988, 1989 Guy L. Steele Jr.
\clearpage\def\pagestatus{FINAL PROOF}
\ifx \rulang\Undef
\chapter{Data Types}
\label{DTYPES}
Common Lisp provides a variety of types of data objects. It is important to
note that in Lisp it is data objects that are typed, not variables.
Any variable can have any Lisp object as its value.
(It is possible to make an explicit declaration that a variable will
in fact take on one of only a limited set of values. However, such
a declaration may always be omitted, and the program will still run correctly.
Such a declaration merely constitutes advice from the user
that may be useful in gaining efficiency. See \cdf{declare}.)
In Common Lisp, a data type is a (possibly infinite) set of
Lisp objects. Many Lisp objects belong to more than one
such set, and so it doesn't always make sense to ask what is \emph{the} type
of an object; instead, one usually asks only whether an object belongs
to a given type. The predicate \cdf{typep} may be used to ask
whether an object belongs to a given type,
and the function \cdf{type-of} returns \emph{a} type
to which a given object belongs.
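For example, the integer \cd{3} belongs to several types at once.
The following sketch uses only \cdf{typep} and \cdf{type-of}; the exact value
returned by \cdf{type-of} is implementation-dependent, so none is shown.
\begin{lisp}
(typep 3 'integer)~~~~~~~~;\textrm{True} \\
(typep 3 'rational)~~~~~~~;\textrm{Also true: every integer is a rational} \\
(typep 3 'number)~~~~~~~~~;\textrm{Also true} \\
(typep 3 'string)~~~~~~~~~;\textrm{False} \\
(typep 3 (type-of 3))~~~~~;\textrm{True, whatever type \cdf{type-of} returns}
\end{lisp}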
The data types defined in Common Lisp are arranged into a hierarchy (actually
a partial order) defined by the subset relationship.
Certain sets of objects, such as the set of numbers or the
set of strings, are interesting enough to deserve labels.
Symbols are used for most
such labels (here, and throughout this book, the word ``symbol''
refers to atomic symbols, one kind of Lisp object,
elsewhere known as literal atoms). See chapter~\ref{DTSPEC}
for a complete description of type specifiers.
The set of all objects is specified
by the symbol {\true}. The empty data type, which contains no objects, is
denoted by {\nil}.
The following categories of Common Lisp objects are of particular interest:
numbers, characters, symbols, lists, arrays, structures, and functions.
There are others as well.
Some of these categories
have many subdivisions. There are also standard types defined to
be the union
of two or more of these categories. The categories listed above, while they
are data types, are neither more nor less ``real'' than other data types;
they simply constitute a particularly useful slice across
the type hierarchy for expository purposes.
Here are brief descriptions of various Common Lisp data types.
The remaining sections of this chapter go into more detail
and also describe notations for objects
of each type. Descriptions of Lisp functions that operate
on data objects of each type appear in later chapters.
\begin{itemize}
\item
\emph{Numbers} are provided in various forms and representations.
Common Lisp provides a true integer data type: any integer,
positive or negative, has in principle a representation as a
Common Lisp data object, subject only to total memory limitations (rather than
machine word width).
A true rational data type is provided: the quotient of two integers,
if not an integer, is a ratio.
Floating-point numbers of various ranges and precisions are also
provided, as well as
Cartesian complex numbers.
\item
\emph{Characters} represent printed glyphs such as letters
or text formatting operations. Strings are one-dimensional
arrays of characters.
Common Lisp provides for a rich character set, including ways to
represent characters of various type styles.
\item
\emph{Symbols} (sometimes called \emph{atomic symbols} for emphasis
or clarity) are named data objects. Lisp provides machinery
for locating a symbol object, given its name (in the form
of a string). Symbols have \emph{property lists}, which in effect
allow symbols to be treated as record structures with an extensible
set of named components, each of which may be any Lisp object.
Symbols also serve to name functions and variables within programs.
\item
\emph{Lists} are sequences represented in the form of linked cells
called \emph{conses}. There is a special object (the symbol {\nil})
that is the empty list. All other lists are built recursively by adding a new
element to the front of an existing list. This is done by
creating a new \emph{cons}, which is an object having two components
called the \emph{car} and the \emph{cdr}. The \emph{car} may hold anything,
and the \emph{cdr} is made to point to the previously existing list.
(Conses may actually be used completely generally as two-element
record structures, but their most important use is to represent
lists.)
\item
\emph{Arrays} are dimensioned collections of objects.
An array can have any non-negative number of dimensions and is indexed
by a sequence of integers. A general array can have any Lisp object as
a component; other types of arrays are specialized for efficiency
and can hold only certain types of Lisp objects.
It is possible for two arrays, possibly with differing dimension information,
to share the same set of elements (such that modifying one array modifies
the other also) by causing one to be \emph{displaced} to the other.
One-dimensional arrays of any kind are called \emph{vectors}.
One-dimensional arrays of characters are called \emph{strings}.
One-dimensional arrays of bits (that is, of integers whose values are 0 or 1)
are called \emph{bit-vectors}.
\item
\emph{Hash tables} provide an efficient way of mapping any
Lisp object (a \emph{key}) to an associated object.
\item
\emph{Readtables} are used to control the built-in expression parser
\cdf{read}.
\item
\emph{Packages} are collections of symbols that serve as name spaces.
The parser recognizes symbols by looking up character sequences
in the current package.
\item
\emph{Pathnames} represent names of files in a fairly implementation-independent
manner. They are used to interface to the external file system.
\item
\emph{Streams} represent sources or sinks of data, typically characters
or bytes. They are used to perform I/O, as well as for internal
purposes such as parsing strings.
\item
\emph{Random-states} are data structures used to encapsulate the state
of the built-in random-number generator.
\item
\emph{Structures} are user-defined record structures, objects that
have named components. The \cdf{defstruct} facility is used
to define new structure types. Some Common Lisp implementations may
choose to implement certain system-supplied data types,
such as \emph{bignums}, \emph{readtables}, \emph{streams},
\emph{hash tables}, and \emph{pathnames}, as structures,
but this fact will be invisible to the user.
\item
\emph{Conditions} are objects used to affect control flow in certain
conventional ways by means of signals and handlers that intercept those signals.
In particular, errors are signaled by raising particular conditions,
and errors may be trapped by establishing handlers for those conditions.
\item
\emph{Classes} determine the structure and behavior of other
objects, their \emph{instances}. Every Common Lisp data object
belongs to some class. (In some ways the CLOS class system is
a generalization of the system of type specifiers of the first edition of this book,
but the class system augments the type system rather than supplanting it.)
\item
\emph{Methods} are chunks of code that operate on arguments
satisfying a particular pattern of classes. Methods are
not functions; they are not invoked directly on arguments
but instead are bundled into generic functions.
\item
\emph{Generic functions} are functions that contain, among other
information, a set of methods. When invoked, a generic function
executes a subset of its methods. The subset chosen for execution
depends in a specific way on the classes or identities of the arguments
to which it is applied.
\end{itemize}
These categories are not always mutually exclusive.
The required relationships among the various data types are
explained in more detail in section~\ref{DATA-TYPE-RELATIONSHIPS}.
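As a brief illustration of two of the categories above, here is a sketch of
building a list from conses and of sharing elements between two arrays
by displacement. The variable names \cdf{a} and \cdf{v} are chosen only for
this example; the functions used are described in later chapters.
\begin{lisp}
(cons 1 '(2 3))~~~~~~~~~~~~~~;\textrm{The list \cd{(1 2 3)}} \\
(car '(1 2 3))~~~~~~~~~~~~~~~;\textrm{\cd{1}} \\
(cdr '(1 2 3))~~~~~~~~~~~~~~~;\textrm{The list \cd{(2 3)}} \\
(let* ((a (make-array '(2 3) :initial-element 0)) \\
~~~~~~~(v (make-array 6 :displaced-to a))) \\
~~(setf (aref v 4) 7)~~~~~~~~;\textrm{Modifies the element shared with \cdf{a}} \\
~~(aref a 1 1))~~~~~~~~~~~~~~;\textrm{\cd{7}: row-major index 4 is row 1, column 1}
\end{lisp}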
\section{Numbers}
Several kinds of numbers are defined in Common Lisp.
They are divided into \emph{integers}; \emph{ratios};
\emph{floating-point numbers}, with names provided for
up to four different floating-point representations; \emph{reals}; and
\emph{complex numbers}.
The \cdf{number} data type encompasses all kinds of
numbers. For convenience, there are names for some
subclasses of numbers as well. Integers and ratios are of
type \cdf{rational}. Rational numbers and floating-point
numbers are of type \cdf{real}. Real numbers and complex
numbers are of type \cdf{number}.
Although the names of these types were chosen with the
terminology of mathematics in mind, the correspondences
are not always exact. Integers and ratios model the
corresponding mathematical concepts directly. Numbers
of type \cdf{float} may be used to approximate real
numbers, both rational and irrational. The \cdf{real} type
includes all Common Lisp numbers that represent
mathematical real numbers, though there are
mathematical real numbers (irrational numbers)
that do not have an exact Common Lisp representation.
Only \cdf{real} numbers may be ordered using the \cdf{<}, \cdf{>}, \cdf{<=},
and \cdf{>=} functions.
\subsection{Integers}
\label{INTEGERS-SECTION}
\indexterm{integer}
The \cdf{integer} data type is intended to represent mathematical integers.
Unlike most programming languages, Common Lisp in principle imposes no limit on
the magnitude of an integer; storage
is automatically allocated as necessary to represent large integers.
In every Common Lisp implementation there is a range of integers that are
represented more efficiently than others; each such integer is called a
\emph{fixnum}, and an integer that is not a fixnum is called a
\emph{bignum}.
Common Lisp is designed to hide this distinction as much as possible;
the distinction between fixnums and bignums is visible to
the user in only a few places where the efficiency of representation is
important. Exactly which integers are
fixnums is implementation-dependent; typically they will be those
integers in the range $-2^{n}$ to $2^{n}-1$,
inclusive, for some \emph{n} not less than 15.
See \cdf{most-positive-fixnum} and \cdf{most-negative-fixnum}.
The type \cdf{fixnum} must be a supertype
of the type \cd{(signed-byte 16)}; additionally, the value
of \cdf{array-dimension-limit} must be a fixnum (implying that the implementor
should choose the range of fixnums to be large enough to accommodate the
largest size of array to be supported).
\beforenoterule
\begin{rationale}
This specification allows programmers to declare variables in portable code
to be of type \cdf{fixnum} for efficiency. Fixnums are guaranteed to
encompass at least the set of 16-bit signed integers
(compare this to the data type \cd{short int} in the C programming language).
In addition, any valid array index must be a fixnum, and therefore variables
used to hold array indices (such as a \cdf{dotimes} variable)
may be declared \cdf{fixnum} in portable code.
\end{rationale}
\afternoterule
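The guarantees above can be checked directly, and the declaration style
mentioned in the rationale can be sketched as follows. The function name
\cdf{count-zeros} is hypothetical and is not part of Common Lisp; it is
assumed to be given a vector.
\begin{lisp}
(typep 32767 'fixnum)~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True in every implementation} \\
(typep -32768 'fixnum)~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True in every implementation} \\
(typep array-dimension-limit 'fixnum)~~~~~~~~;\textrm{True} \\
(typep (+ most-positive-fixnum 1) 'bignum)~~~;\textrm{True} \\
(defun count-zeros (v) \\
~~(let ((zeros 0)) \\
~~~~(dotimes (i (length v) zeros) \\
~~~~~~(declare (fixnum i))~~~~~~~~~~~~~~~~~~~;\textrm{A valid array index is always a fixnum} \\
~~~~~~(when (eql (aref v i) 0) (incf zeros)))))
\end{lisp}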
Integers are ordinarily written in decimal notation, as a sequence
of decimal digits, optionally preceded by a sign and optionally followed
by a decimal point.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\=\kill
\>0~~~~~\';\textrm{Zero} \\*
\>-0~~~~~\';\textrm{This \emph{always} means the same as \cd{0}} \\*
\>+6~~~~~\';\textrm{The first perfect number} \\
\>28~~~~~\';\textrm{The second perfect number} \\
\>1024.~~~~~\';\textrm{Two to the tenth power} \\*
\>-1~~~~~\';\textrm{$e^{\pi i}$} \\*
\>15511210043330985984000000.~~~~~\';\textrm{25 factorial (25!), probably a bignum}
\end{lisp}
Integers may be notated in radices other than ten.
The notation
\begin{lisp}
\#\emph{nn}r\emph{ddddd} \textrm{or} \#\emph{nn}R\emph{ddddd}
\end{lisp}
means the integer in radix-\emph{nn} notation denoted by the digits
\emph{ddddd}. More precisely, one may write \cd{\#}, a non-empty sequence
of decimal digits representing an unsigned decimal integer \emph{n},
\cdf{r} (or \cdf{R}), an optional sign, and a sequence of radix-\emph{n}
digits, to indicate an integer written in radix \emph{n} (which must be
between 2 and 36, inclusive). Only legal digits
for the specified radix may be used; for example, an octal number may
contain only the digits 0 through 7. For digits above 9,
letters of the alphabet of either
case may be used in order. Binary, octal, and
hexadecimal radices are useful enough to warrant the special
abbreviations \cd{\#b} for \cd{\#2r}, \cd{\#o} for \cd{\#8r}, and
\cd{\#x} for \cd{\#16r}.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~\=\kill
\>\#2r11010101~~~~~\';\textrm{Another way of writing \cd{213} decimal} \\
\>\#b11010101~~~~~\';\textrm{Ditto} \\
\>\#b+11010101~~~~~\';\textrm{Ditto} \\
\>\#o325~~~~~\';\textrm{Ditto, in octal radix} \\
\>\#xD5~~~~~\';\textrm{Ditto, in hexadecimal radix} \\
\>\#16r+D5~~~~~\';\textrm{Ditto} \\
\>\#o-300~~~~~\';\textrm{Decimal -192, written in base 8} \\
\>\#3r-21010~~~~~\';\textrm{Same thing in base 3} \\
\>\#25R-7H~~~~~\';\textrm{Same thing in base 25} \\
\>\#xACCEDED~~~~~\';\textrm{181202413, in hexadecimal radix}
\end{lisp}
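Because all of these notations denote ordinary integers, they can be compared
and converted freely. A small sketch, using only standard functions and variables:
\begin{lisp}
(= 213 \#b11010101 \#o325 \#xD5 \#16r+D5)~~~~~;\textrm{True} \\
(= -192 \#o-300 \#3r-21010 \#25R-7H)~~~~~~~~~;\textrm{True} \\
(parse-integer "D5" :radix 16)~~~~~~~~~~~~~;\textrm{\cd{213} as the first value} \\
(let ((*print-base* 2)) (prin1 213))~~~~~~~;\textrm{Prints \cd{11010101}}
\end{lisp}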
\subsection{Ratios}
\indexterm{ratio}
\indexterm{rational}
A \emph{ratio} is a number representing the mathematical ratio
of two integers. Integers and ratios collectively constitute
the type \cdf{rational}.
The canonical representation of a rational number is as an
integer if its value is integral, and otherwise as the ratio of two
integers, the \emph{numerator} and \emph{denominator}, whose greatest
common divisor is 1, and of which the denominator is positive (and in
fact greater than 1, or else the value would be integral).
A ratio is notated with
\cdf{/} as a separator, thus: \cd{3/5}. It is possible to notate
ratios in non-canonical (unreduced) forms, such as \cd{4/6}, but the
Lisp function \cd{prin1} always prints the canonical form for a
ratio.
If any computation produces a result that is a ratio of
two integers such that the denominator evenly divides the
numerator, then the result is immediately converted to the equivalent
integer. This is called the rule of \emph{rational canonicalization}.
Rational numbers may be written as the possibly signed quotient of
decimal numerals: an optional sign followed by two non-empty sequences of
digits separated by a \cd{/}. This syntax may be described as
follows:
\begin{tabbing}
\emph{ratio} ::= \Mopt{\emph{sign}} \Mplus{\emph{digit}} \cd{/} \Mplus{\emph{digit}}
\end{tabbing}
The second sequence may not consist
entirely of zeros.
For example:
\begin{lisp}
2/3~~~~~~~~~~~~~~~~~~~~;\textrm{This is in canonical form} \\
4/6~~~~~~~~~~~~~~~~~~~~;\textrm{A non-canonical form for the same number} \\
-17/23~~~~~~~~~~~~~~~~~;\textrm{A not very interesting ratio} \\
-30517578125/32768~~~~~;\textrm{This is $(-5/2)^{15}$} \\
10/5~~~~~~~~~~~~~~~~~~~;\textrm{The canonical form for this is \cd{2}}
\end{lisp}
To notate rational numbers in radices other than ten,
one uses the same radix specifiers
(one of \cd{\#\emph{nn}R}, \cd{\#O}, \cd{\#B}, or \cd{\#X}) as for integers.
For example:
\begin{lisp}
\#o-101/75~~~~~~~~~~;\textrm{Octal notation for \cd{-65/61}} \\
\#3r120/21~~~~~~~~~~;\textrm{Ternary notation for \cd{15/7}} \\
\#Xbc/ad~~~~~~~~~~~~;\textrm{Hexadecimal notation for \cd{188/173}} \\
\#xFADED/FACADE~~~~~;\textrm{Hexadecimal notation for \cd{1027565/16435934}}
\end{lisp}
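The rule of rational canonicalization can be observed with a few
standard arithmetic functions, as in this sketch:
\begin{lisp}
(/ 4 6)~~~~~~~~~~~~~~~~~~~;\textrm{\cd{2/3}, reduced to canonical form} \\
(/ 10 5)~~~~~~~~~~~~~~~~~~;\textrm{\cd{2}, an integer and not a ratio} \\
(+ 1/3 2/3)~~~~~~~~~~~~~~~;\textrm{\cd{1}, again an integer} \\
(numerator 4/6)~~~~~~~~~~~;\textrm{\cd{2}} \\
(denominator 4/6)~~~~~~~~~;\textrm{\cd{3}} \\
(numerator -17/23)~~~~~~~~;\textrm{\cd{-17}; the sign is carried by the numerator} \\
(denominator -17/23)~~~~~~;\textrm{\cd{23}; the denominator is always positive} \\
(typep 10/5 'integer)~~~~~;\textrm{True}
\end{lisp}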
\subsection{Floating-Point Numbers}
Common Lisp allows an implementation to provide one or more kinds of
floating-point number, which collectively make up the type \cdf{float}.
Now a floating-point number is a (mathematical)
rational number of the form
$\emph{s} \cdot \emph{f} \cdot \emph{b}^{e-p}$,
where \emph{s} is $+1$ or $-1$, the \emph{sign};
\emph{b} is an integer greater than 1,
the \emph{base} or \emph{radix} of the representation;
\emph{p} is a positive integer,
the \emph{precision} (in base-\emph{b} digits) of the floating-point number;
\emph{f} is a positive integer between
$\emph{b}^{p-1}$ and $\emph{b}^{p}-1$ (inclusive),
the \emph{significand};
and \emph{e} is an integer, the \emph{exponent}.
The value of \emph{p} and the range of \emph{e}
depend on the implementation and on the type of floating-point number
within that implementation.
In addition, there is a floating-point zero;
depending on the implementation, there may also be a ``minus zero.''
If there is no minus zero, then \cd{0.0} and \cd{-0.0} are
both interpreted as simply a floating-point zero.
\beforenoterule
\begin{implementation}
The form of the above description should not be construed
to require the internal representation to be in sign-magnitude form.
Two's-complement and other representations are also acceptable. Note
that the radix of the internal representation may be other than 2, as on
the IBM 360 and 370, which use radix 16; see
\cdf{float-radix}.
\end{implementation}
\afternoterule
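As a worked example, assume a hypothetical binary format with $b=2$ and $p=24$
(these parameter values are an assumption made only for illustration; actual
values are implementation-dependent). Then the significand $f$ must lie between
$2^{23}=8388608$ and $2^{24}-1=16777215$.
The number 1 is represented by $s=+1$, $f=2^{23}$, and $e=1$, because
$2^{23} \cdot 2^{1-24} = 2^{23} \cdot 2^{-23} = 1$.
Similarly, $5/32 = 0.15625$ is represented by $s=+1$, $f=5 \cdot 2^{21}=10485760$,
and $e=-2$, because
$5 \cdot 2^{21} \cdot 2^{-2-24} = 5 \cdot 2^{-5} = 5/32$.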
Floating-point numbers may be provided in a variety of precisions and sizes,
depending on the implementation. High-quality floating-point
software tends to depend critically on the precise nature of the
floating-point arithmetic and so may not always be completely portable.
As an aid in writing programs that are
moderately portable, however, certain definitions are made here:
\begin{itemize}
\item
A \emph{short} floating-point number (type \cdf{short-float})
is of the representation of smallest
fixed precision provided by an implementation.
\item
A \emph{long} floating-point number (type \cdf{long-float})
is of the representation of the largest fixed
precision provided by an implementation.
\item
Intermediate between short and long formats are two others, arbitrarily
called \emph{single} and \emph{double} (types \cdf{single-float} and \cdf{double-float}).
\end{itemize}
The precise definition of these categories is implementation-dependent.
However, the rough intent is that short floating-point numbers be
precise to at least four decimal places (but also have
a space-efficient representation);
single floating-point numbers, to at least seven decimal places;
and double floating-point numbers, to at least fourteen decimal places.
It is suggested that
the precision (measured in bits, computed as $p \log_2 b$)
and the exponent size (also measured in bits, computed as the base-2
logarithm of 1 plus the maximum exponent value) be at least as great
as the values in table~\ref{Floating-Format-Requirements-Table}.
\begin{table}[t]
\caption{Recommended Minimum Floating-Point Precision and Exponent Size}
\label{Floating-Format-Requirements-Table}
\begin{tabular}{@{}lll@{}}
{Format\quad\quad}&{Minimum Precision\quad\quad}&{Minimum Exponent Size} \\ \hlinesp
Short&13 bits&5 bits \\
Single&24 bits&8 bits \\
Double&50 bits&8 bits \\
Long&50 bits&8 bits
\end{tabular}
\end{table}
Floating-point numbers are written in either decimal fraction
or computerized scientific notation: an optional sign,
then a non-empty sequence of digits with an embedded decimal point,
then an optional decimal exponent specification.
If there is no exponent specifier, then
the decimal point is required, and there must be digits
after it.
The exponent specifier consists of an exponent marker,
an optional sign, and a non-empty sequence of digits.
For preciseness, here is a modified-BNF description of floating-point
notation.
\begin{tabbing}
\emph{floating-point-number} ::= \=\Mopt{\emph{sign}} \Mstar{\emph{digit}} {\it
decimal-point} \Mplus{\emph{digit}} \Mopt{\emph{exponent}} \\*
\>\hbox to 0pt{\hss\Mor~}\Mopt{{\it
sign}} \Mplus{\emph{digit}} \Mopt{\emph{decimal-point} \Mstar{\emph{digit}}} {\it
exponent} \\
\emph{sign} ::= \cdf{+} {\Mor} \cdf{-} \\
\emph{decimal-point} ::= \cd{.} \\
\emph{digit} ::= \cd{0} {\Mor} \cd{1} {\Mor} \cd{2} {\Mor} \cd{3} {\Mor} \cd{4}
{\Mor} \cd{5} {\Mor} \cd{6} {\Mor} \cd{7} {\Mor} \cd{8} {\Mor} \cd{9}\\
\emph{exponent} ::= \emph{exponent-marker} \Mopt{\emph{sign}} \Mplus{\emph{digit}}\\*
\emph{exponent-marker} ::= \cd{e} {\Mor} \cd{s} {\Mor} \cd{f}
{\Mor} \cd{d} {\Mor} \cd{l} {\Mor} \cd{E} {\Mor} \cd{S} {\Mor} \cd{F} {\Mor}
\cd{D} {\Mor} \cd{L}
\end{tabbing}
If no exponent specifier is present, or if the exponent marker \cdf{e}
(or \cdf{E}) is used, then the precise format to be used is not
specified. When such a representation is read and
converted to an internal floating-point data object, the format specified
by the variable \cdf{*read-default-float-format*} is used; the initial
value of this variable is \cdf{single-float}.
The letters \cd{s}, \cd{f}, \cd{d}, and \cd{l} (or their
respective uppercase equivalents) explicitly specify the
use of \emph{short}, \emph{single}, \emph{double}, and \emph{long} format, respectively.
Examples of floating-point numbers:
\begin{lisp}
0.0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Floating-point zero in default format} \\
0E0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Also floating-point zero in default format} \\
-.0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{This may be a zero or a minus zero,} \\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~; \textrm{depending on the implementation} \\
0.~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{The \emph{integer} zero, not a floating-point zero!} \\
0.0s0~~~~~~~~~~~~~~~~~~~~~~~;\textrm{A floating-point zero in \emph{short} format} \\
0s0~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{Also a floating-point zero in \emph{short} format} \\
3.1415926535897932384d0~~~~~;\textrm{A \emph{double}-format approximation to $\pi$} \\
6.02E+23~~~~~~~~~~~~~~~~~~~~;\textrm{Avogadro's number, in default format} \\
602E+21~~~~~~~~~~~~~~~~~~~~~;\textrm{Also Avogadro's number, in default format} \\
3.010299957f-1~~~~~~~~~~~~~~;\textrm{$\log_{10} 2$, in \emph{single} format} \\
-0.000000001s9~~~~~~~~~~~~~~;\textrm{$e^{\pi i}$ in \emph{short} format, the hard way}
\end{lisp}
The internal format used for an external representation depends only
on the exponent marker and not on the number of decimal digits
in the external representation.
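The effect of the exponent marker and of \cdf{*read-default-float-format*}
can be sketched as follows. Only standard functions are used; the string
passed to \cdf{read-from-string} is read while the binding is in effect.
\begin{lisp}
(typep 1.0s0 'short-float)~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True} \\
(typep 1.0f0 'single-float)~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True} \\
(typep 1.0d0 'double-float)~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True} \\
(typep 1.0L0 'long-float)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True} \\
(typep 0. 'integer)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True: a trailing decimal point denotes an integer} \\
(let ((*read-default-float-format* 'double-float)) \\
~~(typep (read-from-string "1.0") 'double-float))~~~~~;\textrm{True: no exponent marker, so the default applies}
\end{lisp}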
While Common Lisp provides terminology and notation sufficient
to accommodate four distinct floating-point formats,
not all implementations will have the means to support
that many distinct formats.
An implementation is therefore permitted to provide
fewer than four distinct internal floating-point formats,
in which case at least one of them will be ``shared''
by more than one of the external format names \emph{short}, \emph{single},
\emph{double}, and \emph{long} according to the following rules:
\begin{itemize}
\item
If one internal format is provided, then it is considered to be
\emph{single}, but serves also as \emph{short}, \emph{double}, and \emph{long}.
The data types \cdf{short-float},
\cdf{single-float}, \cdf{double-float}, and \cdf{long-float} are
considered to be identical. An expression such as \cd{(eql 1.0s0 1.0d0)}
will be true in such an implementation
because the two numbers \cd{1.0s0} and \cd{1.0d0} will
be converted into the same internal format and therefore be considered
to have the same data type, despite the differing external syntax.
Similarly, \cd{(typep 1.0L0 'short-float)} will be true in such
an implementation.
For output purposes all floating-point numbers are assumed to be
of \emph{single} format and thus will print using the
exponent letter \cdf{E} or \cdf{F}.
\item
If two internal formats are provided, then either of two correspondences
may be used, depending on which is the more appropriate:
\begin{itemize}
\item
One format is \emph{short}; the other is \emph{single} and serves also
as \emph{double} and \emph{long}.
The data types
\cdf{single-float}, \cdf{double-float}, and \cdf{long-float} are
considered to be identical, but \cdf{short-float} is distinct.
An expression such as \cd{(eql 1.0s0 1.0d0)}
will be false, but \cd{(eql 1.0f0 1.0d0)} will be true.
Similarly, \cd{(typep 1.0L0 'short-float)} will be false,
but \cd{(typep 1.0L0 'single-float)} will be true.
For output purposes all floating-point numbers are assumed to be
of \emph{short} or \emph{single} format.
\item
One format is \emph{single} and serves also as \emph{short};
the other is \emph{double} and serves also as \emph{long}.
The data types \cdf{short-float} and \cdf{single-float} are considered to be
identical, and the data types \cdf{double-float} and \cdf{long-float} are
considered to be identical.
An expression such as \cd{(eql 1.0s0 1.0d0)}
will be false, as will \cd{(eql 1.0f0 1.0d0)};
but \cd{(eql 1.0d0 1.0L0)} will be true.
Similarly, \cd{(typep 1.0L0 'short-float)} will be false,
but \cd{(typep 1.0L0 'double-float)} will be true.
For output purposes all floating-point numbers are assumed to be
of \emph{single} or \emph{double} format.
\end{itemize}
\item
If three internal formats are provided, then either of two correspondences
may be used, depending on which is the more appropriate:
\begin{itemize}
\item
One format is \emph{short}; another format is \emph{single}; and the third format is
\emph{double} and serves also as \emph{long}. Similar constraints apply.
\item
One format is \emph{single} and serves also as \emph{short};
another is \emph{double}; and the third format is \emph{long}.
\end{itemize}
\end{itemize}
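Because \cdf{eql} is true of two floating-point numbers only if they have the same
internal format and value, the rules above suggest a way to discover how many
distinct internal formats an implementation provides. The following is a
sketch under that assumption; the function name \cdf{count-float-formats}
is hypothetical and is not part of Common Lisp.
\begin{lisp}
(defun count-float-formats () \\
~~(length (remove-duplicates (list 1.0s0 1.0f0 1.0d0 1.0L0) \\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~:test \#'eql)))
\end{lisp}
The result is an integer from 1 to 4. If it is 1, then
\cd{(eql 1.0s0 1.0d0)} is true in that implementation; if it is 4, then
no two of the four external format names share an internal format.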
\beforenoterule
\begin{implementation}
It is recommended that an implementation
provide as many distinct floating-point formats as feasible,
using table~\ref{Floating-Format-Requirements-Table} as a guideline.
Ideally, short-format floating-point numbers should have an
``immediate'' representation that does not require heap allocation;
single-format
floating-point numbers should approximate IEEE proposed standard
single-format floating-point numbers; and double-format floating-point
numbers should approximate IEEE proposed standard double-format
floating-point numbers
\cite{IEEE-PROPOSED-FLOATING-POINT-STANDARD,IEEE-FLOATING-POINT-IMPL-GUIDE,IEEE-FLOATING-POINT-IMPL-GUIDE-ERRATA}.
\end{implementation}
\afternoterule
\subsection{Complex Numbers}
Complex numbers (type \cdf{complex})
are represented in Cartesian form, with a real part and an imaginary
part, each of which is a non-complex number (integer, ratio, or floating-point
number). It should be emphasized that the parts of a complex
number are not necessarily floating-point numbers; in this, Common Lisp
is like PL/I and differs from Fortran. However, both parts must
be of the same type: either both are rational, or both are of the
same floating-point format.
Complex numbers may be notated by writing the characters \cd{\#C}
followed by a list of the real and imaginary parts.
If the two parts as notated are not of the same type, then
they are converted according to the rules of floating-point contagion
as described in chapter~\ref{NUMBER}.
(Indeed, \cd{\#C(\emph{a} \emph{b})} is equivalent to \cd{\#,(complex \emph{a} \emph{b})};
see the description of the function \cdf{complex}.)
For example:
\begin{lisp}
\#C(3.0s1 2.0s-1)~~~~~;\textrm{Real and imaginary parts are short format}\\
\#C(5 -3)~~~~~~~~~~~~~;\textrm{A Gaussian integer} \\
\#C(5/3 7.0)~~~~~~~~~~;\textrm{Will be converted internally to \cd{\#C(1.66666 7.0)}} \\
\#C(0 1)~~~~~~~~~~~~~~;\textrm{The imaginary unit, that is, \emph{i}}
\end{lisp}
The type of a specific complex number is indicated by a list
of the word \cdf{complex} and the type of the components; for example,
a specialized representation for complex numbers with short floating-point
parts would be of type \cd{(complex short-float)}. The type \cdf{complex}
encompasses all complex representations.
A complex number of type \cd{(complex rational)}, that is, one whose
components are rational, can never have a zero imaginary part.
If the result of a computation would be a complex rational
with a zero imaginary part, the result is immediately
converted to a non-complex rational number by taking the
real part. This is called the rule of \emph{complex canonicalization}.
This rule does not apply to floating-point complex numbers;
\cd{\#C(5.0 0.0)} and \cd{5.0} are different.
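A few evaluations illustrate complex canonicalization and the contrast
between rational and floating-point components. This is a sketch using only
standard functions.
\begin{lisp}
(complex 5 0)~~~~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{\cd{5}, by complex canonicalization} \\
(complex 5.0 0.0)~~~~~~~~~~~~~~~~~~~~~~~;\textrm{\cd{\#C(5.0 0.0)}, not canonicalized} \\
(eql \#C(5.0 0.0) 5.0)~~~~~~~~~~~~~~~~~~~;\textrm{False} \\
(+ \#C(1 2) \#C(1 -2))~~~~~~~~~~~~~~~~~~~~;\textrm{\cd{2}, a non-complex rational} \\
(* \#C(0 1) \#C(0 1))~~~~~~~~~~~~~~~~~~~~~;\textrm{\cd{-1}} \\
(realpart \#C(5 -3))~~~~~~~~~~~~~~~~~~~~~;\textrm{\cd{5}} \\
(imagpart \#C(5 -3))~~~~~~~~~~~~~~~~~~~~~;\textrm{\cd{-3}} \\
(typep \#C(5 -3) '(complex rational))~~~~;\textrm{True}
\end{lisp}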
\section{Characters}
Characters are represented as data objects of type \cdf{character}.
A character object can be notated by writing \cd{\#{\Xbackslash}} followed
by the character itself. For example, \cd{\#{\Xbackslash}g} means the character
object for a lowercase g. This works well enough for printing
characters. Non-printing characters have names, and can be notated
by writing \cd{\#{\Xbackslash}} and then the name; for example, \cd{\#{\Xbackslash}Space}
(or \cd{\#{\Xbackslash}SPACE} or \cd{\#{\Xbackslash}space} or \cd{\#{\Xbackslash}sPaCE})
means the space character. The syntax for character names after \cd{\#{\Xbackslash}}
is the same as that for symbols. However, only character names
that are known to the particular implementation may be used.
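As a brief illustration (a sketch only), the following forms show that
the case of a character \emph{name} is immaterial, whereas the case of a
character written literally is significant. The precise string returned
by \cdf{char-name} may vary among implementations.
\begin{lisp}
(char= \#{\Xbackslash}Space \#{\Xbackslash}sPaCE)~~~;\textrm{True: both notate the space character} \\
(char= \#{\Xbackslash}g \#{\Xbackslash}G)~~~~~~~~~~~;\textrm{False: these are distinct characters} \\
(char-name \#{\Xbackslash}Space)~~~~~~~;\textrm{A string such as \cd{"Space"}} \\
(name-char "space")~~~~~~~;\textrm{The character \cd{\#{\Xbackslash}Space}}
\end{lisp}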
\subsection{Standard Characters}
Common Lisp defines a standard character set (subtype \cdf{standard-char})
for two purposes.
Common Lisp programs that are \emph{written} in the standard character set
can be read by any Common Lisp implementation; and Common Lisp programs
that \emph{use} only standard characters as data objects are most likely
to be portable. The Common Lisp character set consists of a space character
\cd{\#{\Xbackslash}Space}, a newline character \cd{\#{\Xbackslash}Newline}, and the
following ninety-four
non-blank printing characters or their equivalents:
\begin{lisp}
! " \# \$ \% \& ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? \\
{\Xatsign} A B C D E F G H I J K L M N O P Q R S T U V W X Y Z {\Xlbracket} {\Xbackslash} {\Xrbracket} {\Xcircumflex} {\Xunderscore} \\
{\Xbq} a b c d e f g h i j k l m n o p q r s t u v w x y z {\Xlbrace} | {\Xrbrace} {\Xtilde}
\end{lisp}
The Common Lisp standard character set is apparently equivalent to
the ninety-five standard ASCII printing characters plus a newline character.
Nevertheless, Common Lisp is designed to be relatively independent of
the ASCII character encoding. For example, the collating sequence
is not specified except to say that digits must be properly ordered,
the uppercase letters must be properly ordered, and
the lowercase letters must be properly ordered
(see \cdf{char<} for a precise specification).
Other character encodings, particularly EBCDIC, should be easily accommodated
(with a suitable mapping of printing characters).
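The ordering constraints can be illustrated with a few comparisons.
This is a sketch; only the first four results are guaranteed by the
constraints just described, and the last depends on the character
encoding in use.
\begin{lisp}
(standard-char-p \#{\Xbackslash}a)~~~;\textrm{True} \\
(char< \#{\Xbackslash}0 \#{\Xbackslash}5 \#{\Xbackslash}9)~~~~~;\textrm{True: digits are properly ordered} \\
(char< \#{\Xbackslash}A \#{\Xbackslash}M \#{\Xbackslash}Z)~~~~~;\textrm{True: uppercase letters are properly ordered} \\
(char< \#{\Xbackslash}a \#{\Xbackslash}m \#{\Xbackslash}z)~~~~~;\textrm{True: lowercase letters are properly ordered} \\
(char< \#{\Xbackslash}Z \#{\Xbackslash}a)~~~~~~~~~;\textrm{True under ASCII, but false under EBCDIC}
\end{lisp}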
Of the ninety-four non-blank printing characters, the following are
used in only limited ways in the syntax of Common Lisp programs:
\begin{lisp}
{\Xlbracket}~~{\Xrbracket}~~{\Xlbrace}~~{\Xrbrace}~~?~~!~~{\Xcircumflex}~~{\Xunderscore}~~{\Xtilde}~~\$~~\%
\end{lisp}
The following characters are called \emph{semi-standard}:
\begin{lisp}
\#{\Xbackslash}Backspace~~\#{\Xbackslash}Tab~~\#{\Xbackslash}Linefeed~~\#{\Xbackslash}Page~~\#{\Xbackslash}Return~~\#{\Xbackslash}Rubout
\end{lisp}
Not all implementations of Common Lisp need to support them; but those
implementations that
use the standard ASCII character set should support them, treating them as
corresponding respectively to the ASCII characters BS (octal code 010),
HT (011), LF (012), FF (014), CR (015), and DEL
(177). These characters are not
members of the subtype \cdf{standard-char} unless synonymous with
one of the standard characters specified above.
For example, in a given implementation it might
be sensible for the implementor to define
\cd{\#{\Xbackslash}Linefeed} or \cd{\#{\Xbackslash}Return} to be synonymous with \cd{\#{\Xbackslash}Newline},
or \cd{\#{\Xbackslash}Tab} to be synonymous with \cd{\#{\Xbackslash}Space}.
\subsection{Line Divisions}
The treatment of line divisions is one of the most difficult issues
in designing portable software, simply because there is so little agreement
among operating systems. Some use a single character to delimit lines;
the recommended ASCII character for this purpose is the line feed character
LF (also called the new line character, NL),
but some systems use the carriage
return character CR. Much more common is the two-character sequence
CR followed by LF. Frequently line divisions have no representation
as a character but are implicit in the structuring of a file into records,
each record containing a line of text. A deck of punched cards has this
structure, for example.
Common Lisp provides an abstract interface by requiring that there be a single
character, \cd{\#{\Xbackslash}Newline}, that within the language serves as a line
delimiter. (The language C has a similar requirement.)
An implementation of Common Lisp must translate between this internal
single-character representation and whatever external representation(s)
may be used.
\beforenoterule
\begin{implementation}
How the character called \cd{\#{\Xbackslash}Newline} is represented
internally is not specified here, but it is strongly suggested that
the ASCII LF character be used in Common Lisp implementations that use the
ASCII character encoding. The ASCII CR character is a workable,
but in most cases inferior, alternative.
\end{implementation}
\afternoterule
The requirement that a line division be represented as a single character
has certain consequences. A character string
written in the middle of a program in such a way as to span more than
one line must contain exactly one character to represent each line division.
Consider this code fragment:
\begin{lisp}
(setq a-string "This string \\
contains \\
forty-two characters.")
\end{lisp}
Between \cdf{g} and \cdf{c} there must be exactly one character,
\cd{\#{\Xbackslash}Newline}; a two-character sequence, such as \cd{\#{\Xbackslash}Return} and then
\cd{\#{\Xbackslash}Newline}, is not acceptable, nor is the absence of a character.
The same is true between \cdf{s} and \cdf{f}.
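Assuming that the assignment to \cdf{a-string} shown above has been
performed, the single-character requirement can be checked as in the
following sketch. The string consists of eleven characters, a newline,
eight characters, a newline, and twenty-one characters.
\begin{lisp}
(length a-string)~~~~~~~~~~~~~~~~~~~~~;\textrm{42} \\
(count \#{\Xbackslash}Newline a-string)~~~~~~~~~~~~;\textrm{2} \\
(char= (char a-string 11) \#{\Xbackslash}Newline)~~;\textrm{True: the character after \cdf{g}}
\end{lisp}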
When the character \cd{\#{\Xbackslash}Newline} is written to an output file,
the Common Lisp implementation must take the appropriate action
to produce a line division. This might involve writing out a
record or translating \cd{\#{\Xbackslash}Newline} to a CR/LF sequence.
\beforenoterule
\begin{implementation}
If an implementation uses the ASCII character encoding,
uses the CR/LF sequence externally to delimit lines,
uses LF to represent \cd{\#{\Xbackslash}Newline} internally, and supports \cd{\#{\Xbackslash}Return}
as a data object corresponding to the ASCII character CR, the
question arises as to what action to take when the program
writes out \cd{\#{\Xbackslash}Return} followed by \cd{\#{\Xbackslash}Newline}.
It should first be noted that \cd{\#{\Xbackslash}Return} is not a standard Common Lisp
character, and the action to be taken when \cd{\#{\Xbackslash}Return} is written out
is therefore not defined by the Common Lisp language. A plausible approach
is to buffer the \cd{\#{\Xbackslash}Return} character and suppress it if and only if the
next character is \cd{\#{\Xbackslash}Newline} (the net effect is to generate a CR/LF
sequence).
Another plausible
approach is simply to ignore
the difficulty and declare that writing \cd{\#{\Xbackslash}Return} and then
\cd{\#{\Xbackslash}Newline} results in the sequence CR/CR/LF in the output.
\end{implementation}
\afternoterule
\subsection{Non-standard Characters}
Any implementation may provide additional characters, whether printing
characters or named characters. Some plausible examples:
\begin{lisp}
\#{\Xbackslash}$\pi$~~\#{\Xbackslash}$\alpha$~~\#{\Xbackslash}Break~~\#{\Xbackslash}Home-Up~~\#{\Xbackslash}Escape
\end{lisp}
The use of such characters may render Common Lisp programs non-portable.
\section{Symbols}
Symbols are Lisp data objects that serve several purposes
and have several interesting characteristics. Every object of
type \cdf{symbol} has a name,
called its \emph{print name}. Given a symbol, one can
obtain its name in the form of a string. Conversely,
given the name of a symbol as a string, one can obtain the
symbol itself. (More precisely, symbols are organized into
\emph{packages}, and all the symbols in a package are uniquely
identified by name. See chapter~\ref{XPACK}.)
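The correspondence between symbols and names can be sketched as
follows. The example assumes that the same package is current when the
forms are read and when they are evaluated, and that the reader performs
its usual conversion to uppercase (explained below); the symbol
\cdf{frobboz} is merely illustrative.
\begin{lisp}
(symbol-name 'frobboz)~~~~~~~~~~~~;\textrm{The string \cd{"FROBBOZ"}} \\
(intern "FROBBOZ")~~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{frobboz} in the current package} \\
(eq (intern "FROBBOZ") 'frobboz)~~;\textrm{True: the name identifies the symbol uniquely}
\end{lisp}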
Symbols have a component called the \emph{property list}, or \emph{plist}.
By convention this is always a list whose even-numbered
components (calling the first component zero) are symbols,
here functioning as property names, and whose odd-numbered components
are associated property values. Functions are provided for manipulating
this property list; in effect, these allow a symbol to be treated as an
extensible record structure.
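For instance, a symbol may be given properties and then interrogated, as
in the following sketch. The symbol \cdf{frob} and the property names
are hypothetical; the order of entries in the property list, and the
presence of other entries, may vary among implementations.
\begin{lisp}
(setf (get 'frob 'color) 'red)~~;\textrm{Add the property \cdf{color} with value \cdf{red}} \\
(setf (get 'frob 'size) 12)~~~~~;\textrm{Add the property \cdf{size} with value \cd{12}} \\
(get 'frob 'color)~~~~~~~~~~~~~~;\textrm{The symbol \cdf{red}} \\
(get 'frob 'weight)~~~~~~~~~~~~~;\textrm{\cdf{nil}, because there is no such property} \\
(symbol-plist 'frob)~~~~~~~~~~~~;\textrm{A list such as \cd{(size 12 color red)}}
\end{lisp}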
Symbols are also used to represent certain kinds of variables in Lisp
programs, and there are functions for dealing with the values associated
with symbols in this role.
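A minimal sketch of this role, using the hypothetical symbol \cdf{frob}
and assuming it has not been declared a constant:
\begin{lisp}
(setf (symbol-value 'frob) 3)~~;\textrm{Give the symbol a value} \\
(symbol-value 'frob)~~~~~~~~~~~;\textrm{3} \\
(boundp 'frob)~~~~~~~~~~~~~~~~~;\textrm{True} \\
(makunbound 'frob)~~~~~~~~~~~~~;\textrm{Remove the value} \\
(boundp 'frob)~~~~~~~~~~~~~~~~~;\textrm{False}
\end{lisp}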
A symbol can be notated simply by writing its name.
If its name is not empty, and if the name consists only of
uppercase alphabetic, numeric, or certain pseudo-alphabetic
special characters (but not
delimiter characters such as parentheses or space), and if
the name of the symbol cannot be mistaken for a number, then
the symbol can be notated by the sequence of characters in its name.
Any uppercase letters that appear in the (internal) name may
be written in either case in the external notation (more on this below).
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~\=\kill
FROBBOZ\>;\textrm{The symbol whose name is \cdf{FROBBOZ}} \\
frobboz\>;\textrm{Another way to notate the same symbol} \\
fRObBoz\>;\textrm{Yet another way to notate it} \\
unwind-protect\>;\textrm{A symbol with a \cdf{-} in its name} \\
+\$\>;\textrm{The symbol named \cd{+\$}} \\
1+\>;\textrm{The symbol named \cdf{1+}} \\
+1\>;\textrm{This is the integer 1, not a symbol} \\
pascal{\Xunderscore}style\>;\textrm{This symbol has an underscore in its name} \\
b{\Xcircumflex}2-4*a*c\>;\textrm{This is a single symbol!} \\
\>;~\textrm{It has several special characters in its name} \\
file.rel.43\>;\textrm{This symbol has periods in its name} \\
/usr/games/zork\>;\textrm{This symbol has slashes in its name}
\end{lisp}
In addition to letters and numbers, the following characters are normally
considered to be alphabetic for the purposes of notating
symbols:
\begin{lisp}
+~~-~~*~~/~~{\Xatsign}~~\$~~\%~~{\Xcircumflex}~~\&~~{\Xunderscore}~~=~~<~~>~~{\Xtilde}~~.
\end{lisp}
Some of these characters have conventional purposes for naming things;
for example, symbols that name special variables
generally have names beginning and ending with
\cdf{*}. The last character listed above, the period, is considered alphabetic
\emph{provided} that a token does not consist entirely of periods.
A single period standing by itself is used in the notation
of conses and dotted lists; a token consisting of two or more periods
is syntactically illegal. (The period also serves as the decimal point
in the notation of numbers.)
The following characters are also alphabetic by default but are explicitly
reserved to the user for definition as reader macro characters
(see section~\ref{MACRO-CHARACTERS-SECTION}) or any other desired purpose
and therefore should not be used routinely in names of symbols:
\begin{lisp}
?~~!~~{\Xlbracket}~~{\Xrbracket}~~{\Xlbrace}~~{\Xrbrace}
\end{lisp}
A symbol may have uppercase letters, lowercase letters, or both
in its print name.
However, the Lisp reader normally converts lowercase letters to
the corresponding uppercase letters when reading symbols.
The net effect is that most of the time case makes no
difference when \emph{notating} symbols. Case \emph{does} make
a difference internally and when printing a symbol.
Internally the symbols that name all standard Common Lisp functions,
variables, and keywords have uppercase names; their names appear
in lowercase in this book for readability. Typing such names
with lowercase letters works because the function \cdf{read} will convert
lowercase letters to the equivalent uppercase letters.
The readtable attribute accessed by \cdf{readtable-case} controls whether \cdf{read} will alter the case
of letters read as part of the name of a symbol.
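The following sketch shows the effect of case conversion. It assumes
that the current readtable has the same case setting as the standard
readtable.
\begin{lisp}
(readtable-case *readtable*)~~;\textrm{\cd{:upcase} in the standard readtable} \\
(eq 'frobboz 'FrobBoz)~~~~~~~~;\textrm{True: both notate the symbol \cdf{FROBBOZ}} \\
(symbol-name 'car)~~~~~~~~~~~~;\textrm{The string \cd{"CAR"}}
\end{lisp}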
If a symbol cannot be simply notated by the characters of its name
because the (internal) name contains special characters or lowercase letters,
then there are two ``escape'' conventions for notating it.
Writing a \cd{{\Xbackslash}} character before any character causes that character
to be treated as an ordinary character for use in a symbol name;
in particular, it suppresses internal conversion of lowercase letters
to their uppercase equivalents.
If any character in a notation is preceded by \cd{{\Xbackslash}}, then that
notation can never be interpreted as a number.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~~~~~\=\kill
{\Xbackslash}(\>;\textrm{The symbol whose name is \cd{(}} \\
{\Xbackslash}+1\>;\textrm{The symbol whose name is \cd{+1}} \\
+{\Xbackslash}1\>;\textrm{Also the symbol whose name is \cd{+1}} \\
{\Xbackslash}frobboz\>;\textrm{The symbol whose name is \cd{fROBBOZ}} \\
3.14159265{\Xbackslash}s0\>;\textrm{The symbol whose name is \cd{3.14159265s0}} \\
3.14159265{\Xbackslash}S0\>;\textrm{A different symbol, whose name is \cd{3.14159265S0}} \\
3.14159265s0\>;\textrm{A short-format floating-point approximation to $\pi$} \\
APL{\Xbackslash}{\Xbackslash}360\>;\textrm{The symbol whose name is \cd{APL{\Xbackslash}360}} \\
apl{\Xbackslash}{\Xbackslash}360\>;\textrm{Also the symbol whose name is \cd{APL{\Xbackslash}360}} \\
{\Xbackslash}(b{\Xcircumflex}2{\Xbackslash}){\Xbackslash} -{\Xbackslash} 4*a*c\>;\textrm{The name is \cd{(B{\Xcircumflex}2) - 4*A*C};} \\
\>;~\textrm{it has parentheses and two spaces in it} \\
{\Xbackslash}({\Xbackslash}b{\Xcircumflex}2{\Xbackslash}){\Xbackslash} -{\Xbackslash} 4*{\Xbackslash}a*{\Xbackslash}c\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c};} \\
\>;~\textrm{the letters are explicitly lowercase}
\end{lisp}
It may be tedious to insert a \cd{{\Xbackslash}} before \emph{every} delimiter
character in the name of a symbol if there are many of them.
An alternative convention is to surround the name of a symbol
with vertical bars; these cause every character between them to
be taken as part of the symbol's name, as if \cd{{\Xbackslash}} had been written
before each one, excepting only
\cd{|} itself and \cd{{\Xbackslash}}, which must nevertheless be preceded by \cd{{\Xbackslash}}.
For example:
\begin{lisp}
~~~~~~~~~~~~~~~~~~~~\=\kill
|"|\>;\textrm{The same as writing \cd{{\Xbackslash}"}} \\
|(b{\Xcircumflex}2) - 4*a*c|\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c}} \\
|frobboz|\>;\textrm{The name is \cd{frobboz}, not \cd{FROBBOZ}} \\
|APL{\Xbackslash}360|\>;\textrm{The name is \cd{APL360}, because the \cd{{\Xbackslash}} quotes the \cd{3}} \\
|APL{\Xbackslash}{\Xbackslash}360|\>;\textrm{The name is \cd{APL{\Xbackslash}360}} \\
|apl{\Xbackslash}{\Xbackslash}360|\>;\textrm{The name is \cd{apl{\Xbackslash}360}} \\
|{\Xbackslash}|{\Xbackslash}||\>;\textrm{Same as \cd{{\Xbackslash}|{\Xbackslash}|}: the name is \cd{||}} \\
|(B{\Xcircumflex}2) - 4*A*C|\>;\textrm{The name is \cd{(B{\Xcircumflex}2) - 4*A*C};} \\
\>;~\textrm{it has parentheses and two spaces in it} \\
|(b{\Xcircumflex}2) - 4*a*c|\>;\textrm{The name is \cd{(b{\Xcircumflex}2) - 4*a*c}}
\end{lisp}
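The two escape conventions can be compared by examining the resulting
print names. The following is an illustrative sketch that assumes the
reader's usual conversion of unescaped letters to uppercase.
\begin{lisp}
(symbol-name '|frobboz|)~~~~~~~~;\textrm{The string \cd{"frobboz"}} \\
(symbol-name '{\Xbackslash}frobboz)~~~~~~~~~;\textrm{The string \cd{"fROBBOZ"}} \\
(eq 'apl{\Xbackslash}{\Xbackslash}360 '|APL{\Xbackslash}{\Xbackslash}360|)~~~~~~;\textrm{True: both names are \cd{APL{\Xbackslash}360}} \\
(eq 'frobboz '|frobboz|)~~~~~~~~;\textrm{False: the names differ in case} \\
(symbolp '+{\Xbackslash}1)~~~~~~~~~~~~~~~~~;\textrm{True: the escape prevents reading a number}
\end{lisp}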
\section{Lists and Conses}
\indexterm{cons}
A \cdf{cons} is a record structure containing two components
called the \emph{car} and the \emph{cdr}. Conses are used primarily
to represent lists.
A \emph{list} is recursively defined to be either the empty list
or a cons whose \emph{cdr} component is a list.
A list is therefore a chain of conses linked by their \emph{cdr} components
and terminated by {\nil}, the empty list. The \emph{car} components of the conses
are called the \emph{elements} of the list. For each element of the list
there is a cons. The empty list has no elements at all.
A list is notated by writing the elements of the list in order,
separated by blank space (space, tab, or return characters)
and surrounded by parentheses.
\begin{lisp}
(a b c)~~~~~~~~~~~~~~~;\textrm{A list of three symbols} \\
(2.0s0 (a 1) \#{\Xbackslash}*)~~~~~;\textrm{A list of three things: a short floating-point} \\
~~~~~~~~~~~~~~~~~~~~~~;~\textrm{number, another list, and a character object}
\end{lisp}
The empty list {\nil} therefore can be written as {\emptylist}, because it is a list
with no elements.
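The relationship between conses and lists can be sketched with a few
forms; the symbols \cdf{a}, \cdf{b}, and \cdf{c} are merely
illustrative.
\begin{lisp}
(cons 'a (cons 'b (cons 'c nil)))~~;\textrm{The list \cd{(a b c)}, built from three conses} \\
(car '(a b c))~~~~~~~~~~~~~~~~~~~~~;\textrm{The first element, \cdf{a}} \\
(cdr '(a b c))~~~~~~~~~~~~~~~~~~~~~;\textrm{The list \cd{(b c)}} \\
(length '())~~~~~~~~~~~~~~~~~~~~~~~;\textrm{0: the empty list has no elements} \\
(eq '() nil)~~~~~~~~~~~~~~~~~~~~~~~;\textrm{True}
\end{lisp}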
A \emph{dotted list} is one whose last cons does not have {\nil} for
its \emph{cdr}, but rather some other data object (which is also not a cons,
or the first-mentioned cons would not be the last cons of the list).
Such a list is called ``dotted'' because of the special notation
used for it: the elements of the list are written between
parentheses as before, but after the last element and before
the right parenthesis are written a dot (surrounded by blank space)
and then the \emph{cdr} of the last cons. As a special case,
a single cons is notated by writing the \emph{car} and the \emph{cdr} between
parentheses and separated by a space-surrounded dot.
For example:
\begin{lisp}
(a . 4)~~~~~~~~~;\textrm{A cons whose \emph{car} is a symbol} \\
~~~~~~~~~~~~~~~~;~\textrm{and whose \emph{cdr} is an integer} \\
(a b c . d)~~~~~;\textrm{A dotted list with three elements whose last cons} \\
~~~~~~~~~~~~~~~~;~\textrm{has the symbol \cdf{d} in its \emph{cdr}}
\end{lisp}
It is legitimate to write something like \cd{(a b . (c d))};
this means the same as \cd{(a b c d)}. The standard Lisp
output routines will never print a list in the first form, however;
they will avoid dot notation wherever possible.
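A few forms illustrate dotted notation (a sketch; the symbols are
merely illustrative):
\begin{lisp}
(cons 'a 4)~~~~~~~~~~~~~~~~~~~~~~~~;\textrm{The cons \cd{(a . 4)}} \\
(equal '(a b . (c d)) '(a b c d))~~;\textrm{True: two notations for the same structure} \\
(cdddr '(a b c . d))~~~~~~~~~~~~~~~;\textrm{The symbol \cdf{d}} \\
(cdddr '(a b c))~~~~~~~~~~~~~~~~~~~;\textrm{\cdf{nil}, because this is a true list}
\end{lisp}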
Often the term \emph{list} is used to refer either to true lists or to
dotted lists. When the distinction is important,
the term ``true list'' will be used to refer to a list
terminated by {\nil}. Most functions
advertised to operate on lists expect to be given true lists. Throughout
this book, unless otherwise specified, it is an error to pass a dotted
list to a function that is specified to require a list as an argument.
\beforenoterule
\begin{implementation}
Implementors are encouraged to use the equivalent
of the predicate \cdf{endp} wherever it is necessary to test
for the end of a list. Whenever feasible, this test should explicitly
signal an error if a list is found to be terminated by a non-{\nil} atom.
However, such an explicit error signal is not required, because
some such tests occur in heavily used loops where efficiency is important.
In such cases, the predicate \cdf{atom} may be used to test
for the end of the list, quietly treating any non-{\nil} list-terminating
atom as if it were {\nil}.
\end{implementation}
\afternoterule
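The difference between the two tests can be sketched with a pair of
element-counting functions. The names \cdf{strict-count} and
\cdf{lenient-count} are hypothetical and serve only as illustration;
whether \cdf{endp} actually signals an error for a non-{\nil} atom may
depend on the implementation and on compilation settings.
\begin{lisp}
(defun strict-count (list) \\
~~(do ((tail list (cdr tail)) \\
~~~~~~~(n 0 (+ n 1))) \\
~~~~~~((endp tail) n)))~~~~~~~~~~;\textrm{\cdf{endp} expects a cons or \cdf{nil}} \\
(defun lenient-count (list) \\
~~(do ((tail list (cdr tail)) \\
~~~~~~~(n 0 (+ n 1))) \\
~~~~~~((atom tail) n)))~~~~~~~~~~;\textrm{Any atom quietly ends the list} \\
(strict-count '(a b c))~~~~~~~~~~;\textrm{3} \\
(lenient-count '(a b c . d))~~~~~;\textrm{3: the terminating \cdf{d} is treated like \cdf{nil}} \\
(strict-count '(a b c . d))~~~~~~;\textrm{Should signal an error where feasible}
\end{lisp}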
Sometimes the term \emph{tree} is used to refer to some cons
and all the other conses transitively accessible to it
through \emph{car} and \emph{cdr} links until non-conses are reached;
these non-conses are called the \emph{leaves} of the tree.
Lists, dotted lists, and trees are not mutually exclusive data types;
they are simply useful points of view about structures of conses.
There are yet other terms, such as \emph{association list}.
None of these are true Lisp data types. Conses are a data type,
and {\nil} is the sole object of type \cdf{null}.
The Lisp data type \cdf{list} is taken to mean the union of the
\cdf{cons} and \cdf{null} data types, and therefore encompasses both
true lists and dotted lists.
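These type relationships can be checked with \cdf{typep}, as in the
following sketch:
\begin{lisp}
(typep '() 'list)~~~~~~~~;\textrm{True} \\
(typep '(a b c) 'list)~~~;\textrm{True} \\
(typep '(a . b) 'list)~~~;\textrm{True: a dotted list is still a cons} \\
(typep '() 'null)~~~~~~~~;\textrm{True} \\
(typep '() 'cons)~~~~~~~~;\textrm{False: the empty list is not a cons} \\
(typep 'a 'list)~~~~~~~~~;\textrm{False}
\end{lisp}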
\section{Arrays}
\label{ARRAY-TYPE-SECTION}
\indexterm{array}
An \cdf{array} is an object with components arranged according
to a Cartesian coordinate system.
In general, these components may be any Lisp data objects.
The number of dimensions of an array is called its \emph{rank}
(this terminology is borrowed from APL);
the rank is a non-negative integer.
Likewise, each dimension is itself a non-negative integer.