Skip to content

Tutorial 1: An Introduction

Mark Cox edited this page Sep 15, 2017 · 4 revisions

This tutorial shows how to use the template function system to improve the performance of a function which performs element wise addition of two arrays. The original function general-a-plus-b, defined for arrays containing numbers, is shown below.

(defun general-a-plus-b (c a b)
  (declare (optimize safety))
  (assert (= (array-total-size c)
             (array-total-size a)
             (array-total-size b)))
  (dotimes (i (array-total-size c))
    (setf (row-major-aref c i) (+ (row-major-aref a i)
                                  (row-major-aref b i))))
  (values))

The template function requires a lambda form function which generates a lambda form according to an argument specification. The syntax of an argument specification is the same as the arg-typespec syntax used by compound function type specifiers except that &optional arguments are not allowed.

The template function system provides the macro destructuring-argument-specification to aid implementation.

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun make-a-plus-b-lambda-form (argument-specification)
    (destructuring-argument-specification (<c> <a> <b>) argument-specification
      `(lambda (c a b)
         (declare (optimize safety)
                  (type ,<c> c)
                  (type ,<a> a)
                  (type ,<b> b))
         (print '(,<c> ,<a> ,<b>)) ;; For debugging
         (assert (= (array-total-size c)
                    (array-total-size a)
                    (array-total-size b)))
         (dotimes (i (array-total-size c))
           (setf (row-major-aref c i) (+ (row-major-aref a i)
                                         (row-major-aref b i))))
         (values)))))

A template function also requires a function-type-function which generates a function type specifier according to an argument specification.

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun make-a-plus-b-function-type (argument-specification)
    `(function ,argument-specification (values))))

Inserting a template function in the global environment with the name a-plus-b is achieved using the define-template macro.

(define-template a-plus-b (c a b)
  (:lambda-form-function #'make-a-plus-b-lambda-form)
  (:function-type-function #'make-a-plus-b-function-type))

There are no specializations associated with the template function yet. Specializations are instantiated by the user using macro require-instantation (or require-instantiations).

The following requests the most general a-plus-b specialization.

(require-instantiation a-plus-b (array array array))

The following introduces more specializations

(require-instantiations (a-plus-b ((simple-array double-float)
                                     (simple-array double-float)
                                     (simple-array double-float))
                                  ((simple-array single-float)
                                     (simple-array single-float)
                                     (simple-array single-float))))

The example below shows the most specific specialization being selected at runtime.

(defun runtime-example (size element-type initial-a initial-b)
  (let* ((c (make-array size :element-type element-type))
         (a (make-array size :element-type element-type
                             :initial-element initial-a))
         (b (make-array size :element-type element-type
                             :initial-element initial-b)))
    (a-plus-b c a b)))

(runtime-example 2 't 1 2)
(runtime-example 3 'double-float 1d0 2d0)
(runtime-example 4 'single-float 3s0 4s0)

;; If the "For debugging" line is enabled in make-a-plus-b-lambda-form
;; then evaluating the runtime-example forms should print the following
;;
;;   (array array array)
;;   ((simple-array double-float) (simple-array double-float) (simple-array double-float))
;;   ((simple-array single-float) (simple-array single-float) (simple-array single-float))

Specializations can also be selected at compile time using THE forms or declarations (on supported implementations).

(let* ((fn (compiler-macro-function 'a-plus-b))
       (type '(simple-array double-float)))
  (funcall fn `(a-plus-b (the ,type c) (the ,type a) (the ,type b))
              nil))
;; (the (values)
;;      (a-plus-b/v\d_v\d_v\d
;;        (the (simple-array double-float) c)
;;        (the (simple-array double-float) a)
;;        (the (simple-array double-float) b)))

Timing the different implementations is done with the following code.

(defun compute-timing ()
  (let* ((size 8000)
         (element-type 'double-float)
         (initial-a 1d0)
         (initial-b 2d0)
         (count 10000)
         (a (make-array size :element-type element-type
                             :initial-element initial-a))
         (b (make-array size :element-type element-type
                             :initial-element initial-b))
         (c (make-array size :element-type element-type)))
    (flet ((run-general ()
             (format t "~&General timing.~%")
             (time
               (dotimes (i count)
                 (general-a-plus-b c a b))))
           (run-runtime ()
             (format t "~&Runtime timing.~%")
             (time
               (dotimes (i count)
                 (a-plus-b c a b))))
           (run-compile-time ()
             (let* ((array-type `(simple-array ,element-type))
                    (fn (compile nil `(lambda (a b c)
                                        (dotimes (i ,count)
                                          (a-plus-b (the ,array-type a)
                                                    (the ,array-type b)
                                                    (the ,array-type c)))))))
               (format t "~&Compile time timing.~%")
               (time
                 (funcall fn a b c)))))
      (run-general)
      (run-runtime)
      (run-compile-time))))

;; The function (compute-timing) prints the following on SBCL 1.3.20.
;;
;; General timing.
;; Evaluation took:
;;   4.556 seconds of real time
;;   4.552687 seconds of total run time (4.500860 user, 0.051827 system)
;;   [ Run times consist of 0.055 seconds GC time, and 4.498 seconds non-GC time. ]
;;   99.93% CPU
;;   9,114,130,822 processor cycles
;;   3,840,046,624 bytes consed
;;
;; Runtime timing.
;; Evaluation took:
;;   1.951 seconds of real time
;;   1.948507 seconds of total run time (1.943698 user, 0.004809 system)
;;   99.90% CPU
;;   3,901,077,786 processor cycles
;;   0 bytes consed
;;
;; Compile time timing.
;; Evaluation took:
;;   1.934 seconds of real time
;;   1.932634 seconds of total run time (1.927686 user, 0.004948 system)
;;   99.95% CPU
;;   3,869,181,645 processor cycles
;;   0 bytes consed