LFE Programming Rules and Conventions

3 SW Engineering Principles

3.1 Export as few functions as possible from a module

Modules are the basic code structuring entity in LFE. A module can contain a large number of functions but only functions which are included in the export list of the module can be called from outside the module.

Seen from the outside, the complexity of a module depends upon the number of functions which are exported from the module. A module which exports one or two functions is usually easier to understand than a module which exports dozens of functions.

Modules where the ratio of exported/non-exported functions is low are desirable in that a user of the module only needs to understand the functionality of the functions which are exported from the module.

In addition, the writer or maintainer of the code in the module can change the internal structure of the module in any appropriate manner provided the external interface remains unchanged.

3.2 Try to reduce intermodule dependencies

A module which calls functions in many different modules will be more difficult to maintain than a module which only calls functions in a few different modules.

This is because each time we make a change to a module interface, we have to check all places in the code where this module is called. Reducing the interdependencies between modules simplifies the problem of maintaining these modules.

We can simplify the system structure by reducing the number of different modules which are called from a given module.

Note also that it is desirable that the inter-module calling dependencies form a tree and not a cyclic graph. Example:

But not

3.3 Put commonly used code into libraries

Commonly used code should be placed into libraries. The libraries should be collections of related functions. Great effort should be made in ensuring that libraries contain functions of the same type. Thus a library such as lists containing only functions for manipulating lists is a good choice, whereas a library, lists-and-maths containing a combination of functions for manipulating lists and for mathematics is a very bad choice.

The best library functions have no side effects. Libraries with functions with side effects limit the re-usability.

3.4 Isolate "tricky" or "dirty" code into separate modules

Often a problem can be solved by using a mixture of clean and dirty code. Separate the clean and dirty code into separate modules.

Dirty code is code that does dirty things. Example:

Uses the process dictionary.
Uses (erlang:process_info arg) for strange purposes.
Does anything that you are not supposed to do (but have to do).

Concentrate on trying to maximize the amount of clean code and minimize the amount of dirty code. Isolate the dirty code and clearly comment or otherwise document all side effects and problems associated with this part of the code.

3.5 Don't make assumptions about what the caller will do with the results of a function

Don't make assumptions about why a function has been called or about what the caller of a function wishes to do with the results.

For example, suppose we call a routine with certain arguments which may be invalid. The implementer of the routine should not make any assumptions about what the caller of the function wishes to happen when the arguments are invalid.

Thus we should not write code like

(defun do-something (args)
  (case (check-args args)
    ('ok
      (tuple 'ok (do-it args)))
    ((tuple error what)
      (let ((string (format-the-error what)))
        ;; don't do the following!
        (io:format '"* error: ~s~n" (list string))
        error))))

Instead write something like:

(defun do-something (args)
  (case (check-args args)
    ('ok
      (tuple 'ok (do-it args)))
    ((tuple error what)
      (tuple 'error what))))

(defun error-report
  (((tuple error what))
    (format-the-error what)))

In the former case, the error string is always printed on standard output; in the latter case, an error descriptor is returned to the application. The application can now decide what to do with this error descriptor.

By calling error_report/1 the application can convert the error descriptor to a printable string and print it if so required. But this may not be the desired behavior - in any case the decision as to what to do with the result is left to the caller.

3.6 Abstract out common patterns of code or behavior

Whenever you have the same pattern of code in two or more places in the code try to isolate this in a common function and call this function instead of having the code in two different places. Copied code requires much effort to maintain.

If you see similar patterns of code (i.e. almost identical) in two or more places in the code it is worth taking some time to see if one cannot change the problem slightly to make the different cases the same and then write a small amount of additional code to describe the differences between the two.

Avoid "copy" and "paste" programming, use functions!

3.7 Top-down

Write your program using the top-down fashion, not bottom-up (starting with details). Top-down is a nice way of successively approaching details of the implementation, ending up with defining primitive functions. The code will be independent of representation since the representation is not known when the higher levels of code are designed.

3.8 Don't prematurely optimize code

Don't optimize your code at the first stage. First make it right, then (if necessary) make it fast (while keeping it right).

3.9 Use the principle of "least astonishment"

The system should always respond in a manner which causes the "least astonishment" to the user - i.e., a user should be able to predict what will happen when they do something and not be astonished by the result.

This has to do with consistency. A consistent system where different modules do things in a similar manner will be much easier to understand than a system where each module does things in a different manner.

If you get astonished by what a function does, either your function solves the wrong problem or it has a wrong name.

3.10 Try to eliminate side effects

LFE has several primitives which have side effects. One may also call functions from Erlang/OTP in LFE which have side effects. Functions which use these cannot be easily re-used since they cause permanent changes to their environment and you have to know the exact state of the process before calling such routines.

Write as much as possible of the code with side-effect free code.

Maximize the number of pure functions.

Collect together the functions which have side effect and clearly document all the side effects.

With a little care most code can be written in a side-effect free manner - this will make the system a lot easier to maintain, test and understand.

3.11 Don't allow private data structure to "leak" out of a module

This is best illustrated by a simple example. We define a simple module called queue, to implement queues:

(defmodule queues1
  (export
    (add 2)
    (fetch 1)))

(defun add (item queue)
  (lists:append queue (list item)))

(defun fetch
  (((cons head tail))
    (tuple 'ok head tail))
  (('())
    'empty))

This implements a queue as a list, unfortunately to use this the user must know that the queue is represented as a list. A typical interactive usage in the LFE REPL might look like the following:

> (c '"src/queues1.lfe")
#(module queues1)
> (set q1 '())
()
> (set q2 (queues1:add 'joe q1))
(joe)
> (set q3 (queues1:add 'mike q2))
(joe mike)

This is bad since the user:

needs to know that the queue is represented as a list, and
the implementer cannot change the internal representation of the queue (this they might want to do later to provide a better version of the ̇ module).

Better is:

(defmodule queues2
  (export
    (add 2)
    (fetch 1)
    (new 0)))

(defun new ()
  '())

(defun add (item queue)
  (lists:append queue (list item)))

(defun fetch
  (((cons head tail))
    (tuple 'ok head tail))
  (('())
    'empty))

Now we can write:

> (c '"src/queues2.lfe")
#(module queues2)
> (set q1 (queues2:new))
()
> (set q2 (queues2:add 'joe q1))
(joe)
> (set q3 (queues2:add 'mike q2))
(joe mike)

Which is much better and corrects this problem.

Now suppose the user needs to know the length of the queue, they might be tempted to write:

> (set len (length queue-2)) ; don't do this!

since they know that the queue is represented as a list. Again this is bad programming practice and leads to code which is very difficult to maintain and understand. If they need to know the length of the queue then a len/1 function must be added to the module, thus:

(defmodule queues3
  (export
    (add 2)
    (fetch 1)
    (len/1)
    (new 0)))

(defun new ()
  '())

(defun add (item queue)
  (lists:append queue (list item)))

(defun fetch
  (((cons head tail))
    (tuple 'ok head tail))
  (('())
    'empty))

(defun len (queue)
  (length queue))

Now the user can call (queues3:len queue) instead.

Here we say that we have "abstracted out" all the details of the queue (the queue is in fact what is called an "abstract data type").

Why do we go to all this trouble? - the practice of abstracting out internal details of the implementation allows us to change the implementation without changing the code of the modules which call the functions in the module we have changed. So, for example, a better implementation of the queue is as follows:

(defmodule queues4
  (export
    (add 2)
    (fetch 1)
    (len 1)
    (new 0)))

(defun new ()
  #(() ()))

(defun add
  "Faster addition of elements that using lists:append."
  ((item (tuple x y))
    (tuple (cons item x) y)))

(defun fetch
  (((tuple x (cons head tail)))
    (tuple 'ok head (tuple x tail)))
  (((tuple '() '()))
    'empty)
  (((tuple x '()))
    ;; perform this heavy computation only sometimes
    (fetch (tuple '() (lists:reverse x)))))

(defun len
  (((tuple x y))
    (+ (length x) (length y))))

Which would then be used like so:

> (c '"src/queues4.lfe")
#(module queues4)
> (set q1 (queues4:new))
#(() ())
> (set q2 (queues4:add 'joe q1))
#((joe) ())
> (set q3 (queues4:add 'mike q2))
#((mike joe) ())
> (set q4 (queues4:add 'robert q3))
#((robert mike joe) ())
> (queues4:len q1)
0
> (queues4:len q2)
1
> (queues4:len q3)
2
> (queues4:len q4)
3
> (queues4:fetch q1)
empty
> (queues4:fetch q2)
#(ok joe #(() ()))
> (queues4:fetch q3)
#(ok joe #(() (mike)))
> (queues4:fetch q4)
#(ok joe #(() (mike robert)))
>

3.12 Make code as deterministic as possible

A deterministic program is one which will always run in the same manner no matter how many times the program is run. A non-deterministic program may deliver different results each time it is run. For debugging purposes it is a good idea to make things as deterministic as possible. This helps make errors reproducible.

For example, suppose one process has to start five parallel processes and then check that they have started correctly, suppose further that the order in which these five are started does not matter.

We could then choose to either start all five in parallel and then check that they have all started correctly but it would be better to start them one at a time and check that each one has started correctly before starting the next one.

3.13 Do not program "defensively"

A defensive program is one where the programmer does not "trust" the input data to the part of the system they are programming. In general one should not test input data to functions for correctness. Most of the code in the system should be written with the assumption that the input data to the function in question is correct. Only a small part of the code should actually perform any checking of the data. This is usually done when data "enters" the system for the first time, once data has been checked as it enters the system it should thereafter be assumed correct.

Example:

;; args: option is either 'all or 'normal
(defun get-server-usage-info (option ascii-pid)
  (let ((pid (list_to_pid ascii-pid)))
    (case option
      'all (get-all-info pid)
      'normal (get-normal-info pid))))

The function will crash if option is neither 'normal nor 'all, and it should do that. The caller is responsible for supplying correct input.

3.14 Isolate hardware interfaces with a device driver

Hardware should be isolated from the system through the use of device drivers. The device drivers should implement hardware interfaces which make the hardware appear as if they were Erlang processes. Hardware should be made to look and behave like normal Erlang processes. Hardware should appear to receive and send normal Erlang messages and should respond in the conventional manner when errors occur.

3.15 Do and undo things in the same function

Suppose we have a program which opens a file, does something with it and closes it later. This should be coded as:

;; the followingis good code
(defun do-something-with (file)
  (case (file:open file 'read)
    ((tuple 'ok stream)
      (do-it stream)
      (file:close stream))
    (error error)))

Note the symmetry in opening the file with (file:open ...) and closing it with (file:close ...) in the same routine.

The solution below is much harder to follow and it is not obvious which file that is closed. Don't program it like this:

(defun do-something-with (file)
  (case (file:open file 'read)
    ((tuple 'ok stream)
      (do-it stream))
    (error error)))

(defun do-it (stream)
  ...
  (func-234 ... stream ...)
  ...)

(defun func-234 (... stream ...)
  ...
  (file:close stream)) ; don't do this!