A data type is a collection of related values. These collections need not be disjoint, and they are often hierarchical. Scheme has a rich set of data types: some are simple (indivisible) data types and others are compound data types made by combining other data types.
The simple data types of Scheme include booleans, numbers, characters, and symbols.
Scheme’s booleans are #t for true and #f for false. Scheme has a predicate procedure called boolean? that checks if its argument is boolean.
(boolean? #t) => #t
(boolean? "Hello, World!") => #f
The procedure not negates its argument, considered as a boolean.
(not #f) => #t
(not #t) => #f
(not "Hello, World!") => #f
The last expression illustrates a Scheme convenience: In a context that requires a boolean, Scheme will treat any value that is not #f as a true value.
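The rule that every value other than #f counts as true can be seen with the conditional form if, which this section has not introduced; it is used here only as an assumed illustration of a context that requires a boolean.

```scheme
; Any value other than #f selects the first branch.
(if 0 'yes 'no)                  ; => yes
(if "" 'yes 'no)                 ; => yes
(if "Hello, World!" 'yes 'no)    ; => yes
(if #f 'yes 'no)                 ; => no
```

Note that, unlike in some other languages, neither the number 0 nor the empty string is treated as false.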
Scheme numbers can be integers (e.g., 42), rationals (22/7), reals (3.1416), or complex (2+3i). An integer is a rational is a real is a complex number is a number. Predicates exist for testing the various kinds of numberness:
(number? 42) => #t
(number? #t) => #f
(complex? 2+3i) => #t
(real? 2+3i) => #f
(real? 3.1416) => #t
(real? 22/7) => #t
(real? 42) => #t
(rational? 2+3i) => #f
(rational? 3.1416) => #t
(rational? 22/7) => #t
(integer? 22/7) => #f
(integer? 42) => #t
Scheme integers need not be specified in decimal (base 10) format. They can be specified in binary by prefixing the numeral with #b. Thus #b1100 is the number twelve. The octal prefix is #o and the hex prefix is #x. (The optional decimal prefix is #d.)
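As a small sketch of the radix prefixes, the following expressions all denote the number twelve; the listener always prints the result in decimal.

```scheme
#b1100                       ; => 12
#o14                         ; => 12
#xc                          ; => 12
#d12                         ; => 12
(= #b1100 #o14 #xc #d12)     ; => #t
(+ #xff 1)                   ; => 256
```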
Numbers can be tested for equality using the general-purpose equality predicate eqv?.
(eqv? 42 42) => #t
(eqv? 42 #f) => #f
(eqv? 42 42.0) => #f
However, if you know that the arguments to be compared are numbers, the special number-equality predicate = is more apt.
(= 42 42) => #t
(= 42 #f) -->ERROR!!!
(= 42 42.0) => #t
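The different answers that eqv? and = give for 42 and 42.0 come down to exactness: 42 is an exact number and 42.0 is an inexact one. The sketch below uses the standard predicates exact? and inexact?, which the text above does not mention, to make this visible.

```scheme
(exact? 42)        ; => #t
(inexact? 42.0)    ; => #t
(eqv? 42 42.0)     ; => #f, same numeric value but different exactness
(= 42 42.0)        ; => #t, = compares numeric value only
(= 1/2 0.5)        ; => #t
```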
Other number comparisons allowed are <, <=, >, >=.
(< 3 2) => #f
(>= 4.5 3) => #t
Arithmetic procedures +, -, *, /, expt have the expected behavior:
(+ 1 2 3) => 6
(- 5.3 2) => 3.3
(- 5 2 1) => 2
(* 1 2 3) => 6
(/ 6 3) => 2
(/ 22 7) => 22/7
(expt 2 3) => 8
(expt 4 1/2) => 2.0
For a single argument, - and / return the negation and the reciprocal respectively:
(- 4) => -4
(/ 4) => 1/4
The procedures max and min return the maximum and minimum respectively of the number arguments supplied to them. Any number of arguments can be so supplied.
(max 1 3 4 2 3) => 4
(min 1 3 4 2 3) => 1
The procedure abs returns the absolute value of its argument.
(abs 3) => 3
(abs -4) => 4
The procedure round returns the closest integer to the argument. The procedures ceiling and floor return the nearest integer above and below respectively. When the argument is inexact, as in 2.718, R5RS specifies that the result is an inexact integer:
(round 2.718) => 3.0
(ceiling 2.718) => 3.0
(floor 2.718) => 2.0
This is just the tip of the iceberg. Scheme provides a large and comprehensive suite of arithmetic and trigonometric procedures. For instance, atan, exp, and sqrt respectively return the arctangent, natural antilogarithm, and square root of their argument. Consult R5RS [23] for more details.
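A brief sketch of these three procedures follows. Because implementations may return either an exact or an inexact result for such calls, the examples compare with = rather than showing a printed value.

```scheme
(= (sqrt 16) 4)   ; => #t, the square root of 16 is 4
(= (exp 0) 1)     ; => #t, e raised to 0 is 1
(= (atan 0) 0)    ; => #t, the arctangent of 0 is 0
```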
Scheme character data are represented by prefixing the character with #\. Thus, #\c is the character c. Some non-graphic characters have more descriptive names, e.g., #\newline, #\tab. The character for space can be written as #\ followed by a literal space, or more readably, #\space.
The character predicate is char?:
(char? #\c) => #t
(char? 1) => #f
(char? #\;) => #t
Note that a semicolon character datum does not trigger a comment.
The character data type has its set of comparison predicates: char=?, char<?, char<=?, char>?, char>=?.
(char=? #\a #\a) => #t
(char<? #\a #\b) => #t
(char>=? #\a #\b) => #f
To make the comparisons case-insensitive, use char-ci instead of char in the procedure name:
(char-ci=? #\a #\A) => #t
(char-ci<? #\a #\B) => #t
The case conversion procedures are char-downcase and char-upcase:
(char-downcase #\A) => #\a
(char-upcase #\a) => #\A
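The case-conversion and comparison procedures combine naturally. The following sketch uses only the procedures named above; the choice of characters is arbitrary.

```scheme
(char=? (char-upcase #\q) #\Q)      ; => #t
(char=? (char-downcase #\Q) #\q)    ; => #t
(char-ci=? #\q #\Q)                 ; => #t, same test without converting first
(char=? (char-upcase #\7) #\7)      ; => #t, characters without case are unchanged
```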
The simple data types we saw above are self-evaluating. That is, if you type any object from these data types to the listener, the result returned by the listener is the same as what you typed in.
#t => #t
42 => 42
#\c => #\c
Symbols don’t behave the same way. This is because symbols are used by Scheme programs as identifiers for variables, and thus will evaluate to the value that the variable holds. Nevertheless, symbols are a simple data type, and symbols are legitimate values that Scheme can traffic in, along with characters, numbers, and the rest.
To specify a symbol without making Scheme think it is a variable, you should quote the symbol:
(quote xyz) => xyz
Since this type of quoting is very common in Scheme, a convenient abbreviation is provided. The expression
'E
will be treated by Scheme as equivalent to
(quote E)
Scheme symbols are named by a sequence of characters. About the only limitation on a symbol’s name is that it shouldn’t be mistakable for some other data, e.g., characters or booleans or numbers or compound data. Thus, this-is-a-symbol, i18n, <=>, and $!#* are all symbols; 16, -i (a complex number!), #t, "this-is-a-string", and (barf) (a list) are not. The predicate for checking symbolness is called symbol?:
(symbol? 'xyz) => #t
(symbol? 42) => #f
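Scheme symbols are normally case-insensitive. (This is the R5RS behavior; the later R6RS and R7RS standards make symbols case-sensitive.) Thus, in a case-insensitive Scheme, the symbols Calorie and calorie are identical:
(eqv? 'Calorie 'calorie) => #t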
We can use the symbol xyz as a global variable by using the form define:
(define xyz 9)
This says the variable xyz holds the value 9. If we feed xyz to the listener, the result will be the value held by xyz:
xyz => 9
We can use the form set! to change the value held by a variable:
(set! xyz #\c)
Now
xyz => #\c
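The distinction between a symbol as a value and a symbol as a variable name can be sketched as follows. The symbol abc is an arbitrary name chosen for illustration.

```scheme
(define xyz 9)
xyz              ; => 9, the value held by the variable
'xyz             ; => xyz, the symbol itself
(symbol? 'xyz)   ; => #t
(symbol? xyz)    ; => #f, because xyz evaluates to the number 9
(set! xyz 'abc)
(symbol? xyz)    ; => #t, the variable now holds the symbol abc
```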
Compound data types are built by combining values from other data types in structured ways.
Strings are sequences of characters (not to be confused with symbols, which are simple data that have a sequence of characters as their name). You can specify strings by enclosing the constituent characters in double-quotes. Strings evaluate to themselves.
"Hello, World!" => "Hello, World!"
The procedure string takes a bunch of characters and returns the string made from them:
(string #\h #\e #\l #\l #\o) => "hello"
Let us now define a global variable greeting.
(define greeting "Hello; Hello!")
Note that a semicolon inside a string datum does not trigger a comment.
The characters in a given string can be individually accessed and modified. The procedure string-ref takes a string and a (0-based) index, and returns the character at that index:
(string-ref greeting 0) => #\H
New strings can be created by appending other strings:
(string-append "E " "Pluribus " "Unum") => "E Pluribus Unum"
You can make a string of a specified length, and fill it with the desired characters later.
(define a-3-char-long-string (make-string 3))
The predicate for checking stringness is string?.
Strings obtained as a result of calls to string, make-string, and string-append are mutable. The procedure string-set! replaces the character at a given index:
(define hello (string #\H #\e #\l #\l #\o))
hello => "Hello"
(string-set! hello 1 #\a)
hello => "Hallo"
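As a sketch of making a string first and filling it later, the example below passes make-string an optional second argument, the fill character, so that the initial contents are well defined. The variable name s and the standard procedure string-length are not used in the text above.

```scheme
(define s (make-string 3 #\-))   ; three characters, each initially #\-
s                                 ; => "---"
(string-set! s 0 #\a)
(string-set! s 2 #\z)
s                                 ; => "a-z"
(string-length s)                 ; => 3
(string? s)                       ; => #t
```

Without the fill argument, the initial contents of the string are unspecified, so it is safest to set every position before reading it.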
Vectors are sequences like strings, but their elements can be anything, not just characters. Indeed, the elements can be vectors themselves, which is a good way to generate multidimensional vectors.
Here’s a way to create a vector of the first five integers:
(vector 0 1 2 3 4) => #(0 1 2 3 4)
Note Scheme’s representation of a vector value: a # character followed by the vector’s contents enclosed in parentheses.
In analogy with make-string, the procedure make-vector makes a vector of a specific length:
(define v (make-vector 5))
The procedures vector-ref and vector-set! access and modify vector elements.
The predicate for checking if something is a vector is vector?.
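The vector procedures just named can be sketched as follows. The optional second argument to make-vector is the initial fill value; the names grid and middle, and the standard procedure vector-length, are introduced here only for illustration.

```scheme
(define v (make-vector 3 0))     ; three elements, each initially 0
(vector-set! v 1 'middle)
(vector-ref v 1)                 ; => middle
v                                ; => #(0 middle 0)
(vector-length v)                ; => 3

; A vector of vectors acts as a two-dimensional structure.
(define grid (vector (vector 1 2) (vector 3 4)))
(vector-ref (vector-ref grid 1) 0)   ; => 3
(vector? grid)                   ; => #t
```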
A dotted pair is a compound value made by combining any two arbitrary values into an ordered couple. The first element is called the car, the second element is called the cdr, and the combining procedure is cons.
(cons 1 #t) => (1 . #t)
Dotted pairs are not self-evaluating, and so to specify them directly as data (i.e., without producing them via a cons-call), one must explicitly quote them:
'(1 . #t) => (1 . #t)
(1 . #t) -->ERROR!!!
The accessor procedures are car and cdr:
(define x (cons 1 #t))
(car x) => 1
(cdr x) => #t
The elements of a dotted pair can be replaced by the mutator procedures set-car! and set-cdr!:
(set-car! x 2)
(set-cdr! x #f)
x => (2 . #f)
Dotted pairs can contain other dotted pairs.
(define y (cons (cons 1 2) 3))
y => ((1 . 2) . 3)
The car of the car of this pair is 1. The cdr of the car of this pair is 2. I.e.,
(car (car y)) => 1
(cdr (car y)) => 2
Scheme provides procedure abbreviations for cascaded compositions of the car and cdr procedures. Thus, caar stands for “car of car of”, and cdar stands for “cdr of car of”, etc.
(caar y) => 1
(cdar y) => 2
c...r-style abbreviations for up to four cascades are guaranteed to exist. Thus, cadr, cdadr, and cdaddr are all valid. cdadadr might be pushing it.
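To see how the abbreviations read, note that the letters between c and r are applied from right to left. A small sketch, using a pair z made up for this example:

```scheme
(define z (cons (cons 1 2) (cons 3 4)))
(caar z)         ; => 1, same as (car (car z))
(cdar z)         ; => 2, same as (cdr (car z))
(cadr z)         ; => 3, same as (car (cdr z))
(cddr z)         ; => 4, same as (cdr (cdr z))
```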
When nested dotting occurs along the second element, Scheme uses a special notation to represent the resulting expression:
(cons 1 (cons 2 (cons 3 (cons 4 5)))) => (1 2 3 4 . 5)
I.e., (1 2 3 4 . 5) is an abbreviation for (1 . (2 . (3 . (4 . 5)))). The last cdr of this expression is 5.
Scheme provides a further abbreviation if the last cdr is a special object called the empty list, which is represented by the expression (). The empty list is not considered self-evaluating, and so one should quote it when supplying it as a value in a program:
'() => ()
The abbreviation for a dotted pair of the form (1 . (2 . (3 . (4 . ())))) is
(1 2 3 4)
This special kind of nested dotted pair is called a list. This particular list is four elements long. It could have been created by saying
(cons 1 (cons 2 (cons 3 (cons 4 '()))))
but Scheme provides a procedure called list that makes list creation more convenient. list takes any number of arguments and returns the list containing them:
(list 1 2 3 4) => (1 2 3 4)
Indeed, if we know all the elements of a list, we can use quote to specify the list:
'(1 2 3 4) => (1 2 3 4)
List elements can be accessed by index.
(define y (list 1 2 3 4))
(list-ref y 0) => 1
(list-ref y 3) => 4
(list-tail y 1) => (2 3 4)
(list-tail y 3) => (4)
list-tail returns the tail of the list starting from the given index.
The predicates pair?, list?, and null? check if their argument is a dotted pair, list, or the empty list, respectively:
(pair? '(1 . 2)) => #t
(pair? '(1 2)) => #t
(pair? '()) => #f
(list? '()) => #t
(null? '()) => #t
(list? '(1 2)) => #t
(list? '(1 . 2)) => #f
(null? '(1 2)) => #f
(null? '(1 . 2)) => #f
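Because a list is a chain of dotted pairs ending in the empty list, it can be walked with cdr until null? succeeds. The sketch below defines a hypothetical procedure count-elements that does this; it relies on the conditional form if and on the procedure-defining variant of define, neither of which this section has introduced. Scheme already provides the standard procedure length for the same purpose.

```scheme
; Count the elements of a list by following cdrs to the empty list.
(define (count-elements lst)
  (if (null? lst)
      0
      (+ 1 (count-elements (cdr lst)))))

(count-elements '())                     ; => 0
(count-elements '(1 2 3 4))              ; => 4
(count-elements (cons 1 (cons 2 '())))   ; => 2
```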
Scheme offers many procedures for converting among the data types. We already know how to convert between the character cases using char-downcase and char-upcase. Characters can be converted into integers using char->integer, and integers can be converted into characters using integer->char. (The integer corresponding to a character is usually its ASCII code.)
(char->integer #\d) => 100
(integer->char 50) => #\2
Strings can be converted into the corresponding list of characters.
(string->list "hello") => (#\h #\e #\l #\l #\o)
Other conversion procedures in the same vein are list->string, vector->list, and list->vector.
Numbers can be converted to strings:
(number->string 16) => "16"
Strings can be converted to numbers. If the string corresponds to no number, #f is returned.
(string->number "16") => 16
(string->number "Am I a hot number?") => #f
string->number takes an optional second argument, the radix.
(string->number "16" 8) => 14
because 16 in base 8 is the number fourteen.
Symbols can be converted to strings, and vice versa:
(symbol->string 'symbol) => "symbol"
(string->symbol "string") => string
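The conversion procedures compose well. As a sketch, the hypothetical procedure shout below upcases a string by converting it to a list of characters, transforming each character, and converting back. It assumes the standard procedure map, which this section has not introduced, and also uses the optional radix argument of number->string.

```scheme
; Upcase every character of a string by going through a list.
(define (shout str)
  (list->string (map char-upcase (string->list str))))

(shout "hello")                                ; => "HELLO"
(vector->list (list->vector '(1 2 3)))         ; => (1 2 3)
(number->string 255 16)                        ; => "ff"
(string->number (number->string 255 16) 16)    ; => 255
```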
Scheme contains some other data types. One is the procedure. We have already seen many procedures, e.g., display, +, cons. In reality, these are variables holding the procedure values, which are themselves not visible as are numbers or characters:
cons => <procedure>
The procedures we have seen thus far are primitive procedures, with standard global variables holding them. Users can create additional procedure values.
Yet another data type is the port. A port is the conduit through which input and output are performed. Ports are usually associated with files and consoles.
In our “Hello, World!” program, we used the procedure display to write a string to the console. display can take two arguments, one the value to be displayed, and the other the output port it should be displayed on.
In our program, display’s second argument was implicit. The default output port used is the standard output port. We can get the current standard output port via the procedure-call (current-output-port). We could have been more explicit and written
(display "Hello, World!" (current-output-port))
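Since a port is an ordinary value, it can be held in a variable and passed around. The following sketch stores the current output port in a variable named out (a name chosen for this example) and checks it with the standard predicate output-port?, which the text above does not mention.

```scheme
(define out (current-output-port))
(output-port? out)          ; => #t
(display "Hello, " out)     ; writes Hello, to the console
(display "World!" out)      ; writes World! right after it
(newline out)               ; ends the line
```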
All the data types discussed here can be lumped together into a single all-encompassing data type called the s-expression (s for symbolic). Thus 42, #\c, (1 . 2), #(a b c), "Hello", (quote xyz), (string->number "16"), and (begin (display "Hello, World!") (newline)) are all s-expressions.