Hi, I’m Erika Rowland (a.k.a. erikareads). I’m an Ops-shaped Software Engineer, Toolmaker, and Resilience Engineering fan. I like Elixir and Gleam, Reading, and Design. She/Her. If you're looking for experienced talent, I would love to chat.

Using Regex in Erlang

Recently, I was helping someone port some code to Erlang that involved a Regex. I hadn’t worked with Regex in Erlang before, so here are some notes on what I learned.

re - Perl-like regular expressions for Erlang

The Erlang standard library has the re module, which supports regular expression matching for Erlang strings and binaries. (Both Gleam and Elixir use String to mean an Erlang binary. The default string type in Erlang is sugar over a linked list of code points.)
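To make the string versus binary distinction concrete, here is a small sketch that can be pasted into the erl shell. The literal "my" is only an illustration; each line is a pattern match that fails loudly if the claim in the comment is wrong.

```erlang
%% A double-quoted literal is a list of code points.
[109, 121] = "my".
true = is_list("my").
%% The <<"...">> syntax builds a binary, which is what Elixir and Gleam call a String.
true = is_binary(<<"my">>).
%% unicode:characters_to_binary/1 converts a list into a UTF-8 binary.
<<"my">> = unicode:characters_to_binary("my").
```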

The re module provides three main functions: replace, run, and split, which all operate on string-like input and a regex pattern. (Erlang uses the iodata type to represent a generalization of binaries that can be worked with more efficiently. All three of these functions take an iodata or charlist.)
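The rest of these notes focus on run, so here is a quick sketch of the other two functions. The patterns and inputs are made up for illustration. Both functions return iodata or binaries by default, so I pass {return, list} to get plain strings back.

```erlang
%% replace/4 with global swaps every match; only the "i" is a vowel here.
Replaced = re:replace("my string", "[aeiou]", "_", [global, {return, list}]).
%% Replaced is "my str_ng"

%% split/3 cuts the subject at every match of the pattern.
Parts = re:split("a, b,c", ",\\s*", [{return, list}]).
%% Parts is ["a","b","c"]
```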

Regex matching can be done with run like this:

> re:run("my string", "[myexpression]").
{match,[{0,1}]}

By default, run returns the whole match followed by any captured subpatterns, as a list of {Offset, Length} pairs. It also returns only the first match, not all matches.
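An {Offset, Length} pair can be turned back into text by slicing the subject. This is a minimal sketch assuming a binary subject and an illustrative pattern, "str". The offsets are byte positions, so binary:part/3 is a natural fit.

```erlang
Subject = <<"my string">>.
%% The match starts at byte 3 and is 3 bytes long.
{match, [{Offset, Length}]} = re:run(Subject, "str").
%% Slice the matched bytes back out of the subject.
Matched = binary:part(Subject, Offset, Length).
%% Matched is <<"str">>
```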

To return the captures as strings, I can use the {capture, all, list} option:

> re:run("my string", 
    "[myexpression]",
    [{capture, all, list}]
  ).
{match,["m"]}

To get all of the matches, I can use the global option:

> re:run("my string", 
    "[myexpression]",
    [global,{capture, all, list}]
  ).
{match,[["m"],["y"],["s"],["r"],["i"],["n"]]}

Named Captures

re supports Perl-style named captures, which look like this:

Pattern1 = "(?<myname>capture)".

There isn’t a built-in way to associate names with captures; Elixir provides a named_captures function to easily do this. From reading the source code of the Elixir implementation, it should be possible to combine re:inspect with lists:zip to get a list of {Name, Capture} pairs.

A named capture can be used by changing the ValueSpec component of the capture option:

> re:run(
    "my capture of my capture", 
    Pattern1, 
    [
      global,
      {capture, ["myname"], list}
    ]
  ).
{match,[["capture"],["capture"]]}
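Here is a sketch of the re:inspect plus lists:zip idea for pairing names with captures. It is my own approximation, not a port of Elixir's named_captures, and the pattern and variable names are invented for the example. It relies on re:inspect/2 and the all_names value spec both using alphabetical name order, and it needs a compiled pattern, since re:inspect only accepts one.

```erlang
{ok, NamedPattern} = re:compile("(?<first>\\w+) (?<second>\\w+)").
%% re:inspect/2 returns the capture names as binaries, in alphabetical order.
{namelist, Names} = re:inspect(NamedPattern, namelist).
%% all_names returns the captured values in that same alphabetical order.
{match, Values} = re:run("my capture", NamedPattern, [{capture, all_names, binary}]).
%% Zip the two lists into {Name, Capture} pairs.
Pairs = lists:zip(Names, Values).
%% Pairs is [{<<"first">>,<<"my">>},{<<"second">>,<<"capture">>}]
```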

Compiling An Expression

Erlang provides a compile function to compile a regular expression for re-use throughout the lifetime of a program:

Compiling the regular expression before matching is useful if the same expression is to be used in matching against multiple subjects during the lifetime of the program. Compiling once and executing many times is far more efficient than compiling each time one wants to match.

A regex can be compiled into a reusable pattern with compile:

{ok, Pattern2} = re:compile("[myexpression]").

The compiled pattern can then be used like any other regex pattern:

> re:run(
    "my string",
    Pattern2,
    [{capture, all, list}]
  ).
{match,["m"]}
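The payoff of compiling shows up when one pattern is run against several subjects. A small sketch, with an illustrative pattern and subject list; it also shows that run returns the atom nomatch when nothing matches.

```erlang
%% Compile once...
{ok, VowelPattern} = re:compile("[aeiou]").
Subjects = ["my string", "rhythm", "erlang"].
%% ...then reuse the compiled pattern for every subject.
Results = [re:run(S, VowelPattern, [{capture, all, list}]) || S <- Subjects].
%% Results is [{match,["i"]},nomatch,{match,["e"]}]
```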

Takeaways

The Erlang regular expression module is a bit difficult to use, but it will be nice if I have need of a zero-dependency regex module when using Erlang.