2 Diving In

2.3 Binary and Bitstrings

2.3.1 Lists and binary

A full discussion of the binary type is a huge topic that probably deserves one or more dedicated tutorials, especially given the close connection with pattern matching and the efficient parsing of binary data. However, for now, we're just going to look at one particular area: working with strings as binary data.

In the previous section, we had mentioned using (binary ...) to more efficiently represent large strings. Here's an example (pretending, for now, that our example is using a very large string ;-)):

> (binary "There's a frood who really knows where his towel is.")
#B(84 104 101 114 101 39 115 32 97 32 102 114 111 111 100 32 119 104 111 ...)

Or you could use the Erlang function, if you wanted:

> (list_to_binary "There's a frood who really knows...")
#B(84 104 101 114 101 39 115 32 97 32 102 114 111 111 100 32 119 104 111 ...)
101 97 108 108 121 32 107 110 111 ...)

Let's set a variable with this value in the shell, so we can work with it more easily:

(set data (binary "There's a frood who really knows where his towel is."))
#B(84 104 101 114 101 39 115 32 97 32 102 114 111 111 100 32 119 104 111 ...)

2.3.2 Binary Functions in OTP

Let's convert it back to a list using a function from the Erlang stdlib binary module:

> (unicode:characters_to_list data)
"There's a frood who really knows where his towel is."

Note that the LFE binary function is quite different than the call to the binary module in the Erlang stdlib! The binary module has all sorts of nifty functions we can use (check out the docs). Here's an example of splitting our data:

> (binary:split data (binary " who really knows "))
(#B(84 104 101 114 101 39 115 32 97 32 102 114 111 111 100)
 #B(119 104 101 114 101 32 104 105 115 32 116 111 119 101 108 32 105 115 46))

The split gives us two pieces; here's how we can get the new string from that split:

> (unicode"characters_to_list
    (binary:split data (binary "who really knows ")))
"There's a frood where his towel is."
>

binary split creates a list of binaries, but since this is an IoList and unicode characters_to_list can handle those without us having to flatten them, our work is done! We get our result: the new string that we created by splitting on "who really knows ".

2.3.3 Bit-Packing (and Unpacking)

For this section, let's use the 16-bit color example that is given in Joe Armstrong's Erlang book where 5 bits are allocated for the red channel, 6 for the green and 5 for the blue. In LFE, we can create a 16-bit memory area like so:

> (set red 2)
2
> (set green 61)
61
> (set blue 20)
20
> (binary
    (red (size 5))
    (green (size 6))
    (blue (size 5)))
#B(23 180)
>

All packed and ready!

We can use patterns to unpack binary data in a let expression into the variables r, g, and b, printing out the results within the let:

> (let (((binary (r (size 5)) (g (size 6)) (b (size 5)))
         #b(23 180)))
       (io:format "~p ~p ~p~n" (list r g b)))
2 61 20
ok
>

We're getting a little ahead of ourselves here, by throwing a pattern in the mix, but it's a good enough example to risk it :-)

2.3.4 So What's a Bitstring?

We've been looking at binaries in LFE, but what's a bitstring? The Erlang docs say it well: A bitstring is a sequence of zero or more bits, where the number of bits doesn't need to be divisible by 8. If the number of bits is divisible by 8, the bitstring is also a binary.

2.3.5 LFE's Exact Definition of Binary

Here's the full definition for the binary from in LFE:

(binary seg ... )

Where seg is:

byte
string
(val integer|float|binary|bitstring|bytes|bits
     (size n) (unit n)
     big-endian|little-endian|native-endian|little|native|big
     signed|unsigned)

This should help you puzzle through some of the more complex binary constructions you come accross ;-)