Named Capture Groups

Last updated on Monday, March 11th, 2019 @ 6:12PM.

Did you know that Google Chrome and other environments such as Node.js now support named capture groups in regular expressions? That's right, now we can use regular expressions such as the following:

/(?<year>\d{4})(?<delim>[\-\/\.])(?<month>\d\d)\k<delim>(?<day>\d\d)/

What Are Named Capture Groups?

That might be your first question. First of all, it is important to remember that capture groups (or capturing groups) are basically a way of keeping track of a sequence of characters that were matched by your regular expression. Let's say that we have a variable which contains the string "2019-12-31". We can actually use a regular expression to pull out the year, the month and the day of the month:

var result = /^(\d{4})-(\d\d)-(\d\d)$/.exec("2019-12-31");

Running the above code will assign an augmented array object to result where the first item (result[0]) will be the entire match. The second item (result[1]) will be the year, which in this case is "2019". The third item (result[2]) will be the month, which in this case is "12". The fourth item (result[3]) will be the day, which in this case is "31". Each capture group is represented in our regular expression by simply wrapping the desired pattern in parentheses.
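Here is a minimal sketch of reading those numbered groups with array destructuring. The variable names (numbered, whole, yearPart and so on) are only for illustration, and the null check reflects the fact that exec returns null when nothing matches:

```javascript
// Index 0 is the whole match; indexes 1 to 3 are the capture groups in order.
const numbered = /^(\d{4})-(\d\d)-(\d\d)$/.exec("2019-12-31");
const [whole, yearPart, monthPart, dayPart] = numbered;

console.log(whole);     // "2019-12-31"
console.log(yearPart);  // "2019"
console.log(monthPart); // "12"
console.log(dayPart);   // "31"

// exec returns null when the string does not match, so check before indexing.
const noMatch = /^(\d{4})-(\d\d)-(\d\d)$/.exec("not a date");
console.log(noMatch === null); // true
```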

Even though this works just fine, keeping track of which number belongs to which group gets harder as a regular expression grows, so we want to be able to reference our capture groups by name. That is why named capture groups (or named capturing groups) were added to ECMAScript (AKA JavaScript). They are capture groups which can be referenced by name (as you most likely guessed).

Can I See An Example?

Of course! Let's say that we want to pull the year, the month and the day of the month from the string "2019-12-31". We can do this with the following regular expression which also contains named capture groups:

var result = /^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)$/.exec("2019-12-31");

Running the above code will assign an augmented array object to result. This would be more or less equivalent to the following array:

[
  "2019-12-31",
  "2019",
  "12",
  "31",
  index: 0,
  length: 4,
  input: "2019-12-31",
  groups: {
    day: "31",
    month: "12",
    year: "2019"
  }
]

Of course, if you try to paste the above representation into the console it will fail because it is not valid JavaScript, but that is more or less the structure of result. As you can see, result[0] to result[3] are the normal values you would get with regular capture groups. We also still get access to result.index and result.input. What is new is result.groups. This object contains a key for each named capture group that we defined in our regular expression.
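As a quick sketch of how result.groups might be used in practice, we can destructure it directly. The variable name namedResult is just for this example:

```javascript
const namedResult = /^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)$/.exec("2019-12-31");

// Pull the named groups straight out of the groups object.
const { year, month, day } = namedResult.groups;
console.log(year);  // "2019"
console.log(month); // "12"
console.log(day);   // "31"

// The same values are still available by number.
console.log(namedResult[1] === namedResult.groups.year); // true
```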

Syntax

As you may have noticed in the example above, the syntax is pretty simple. A normal capture group is simply surrounded by parentheses, whereas a named capture group is surrounded by parentheses and preceded by ?<name_of_group> (of course replacing name_of_group with the desired name of the capture group).
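To make the difference concrete, here is a small sketch that mixes both kinds of group in one regular expression. The "color=blue" string and the group name key are made up for the example. Only the named group shows up in groups, while both groups are still numbered in order:

```javascript
// One named group (key) followed by one normal, unnamed group.
const mixed = /(?<key>\w+)=(\w+)/.exec("color=blue");

console.log(mixed.groups.key);          // "color"
console.log(mixed[1]);                  // "color" (the named group is also group 1)
console.log(mixed[2]);                  // "blue"  (the unnamed group is group 2)
console.log(Object.keys(mixed.groups)); // [ "key" ]
```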

Backreferences

What is nice is that just as you can use backreferences to reference a previously matched capture group in the regular expression, you can do something similar with named capture groups. Here is an example of a backreference to a normal capture group:

var result = /(.).*?\1/.exec("Where in the world is Carmen Sandiego?");

The value of result will be something like the following:

[
  "here in th",
  "h",
  index: 1,
  length: 2,
  input: "Where in the world is Carmen Sandiego?"
]

This representation is once again simply to describe the structure and is not properly formed JavaScript. The first item is the substring that was matched by the regular expression. The second item is the value of the capture group. The purpose of the regular expression is to find the first character in the given string that is repeated again later on. We indicate that we want a character that repeats by using the \1 backreference which essentially references the first capture group found in the regular expression. The regular expression /(.).*?\1/ also allows for other characters that are not the same as those found in the capture group to be in the general match. In this case the first character that is repeated later on is the "h". For that reason the actual match is "here in th".

How do we backreference a named capture group? Here is an example similar to what we used before but using a named capture group:

var result = /(?<repeat>.).*?\k<repeat>/.exec("Where in the world is Carmen Sandiego?");

The value of result will be the following:

[
  "here in th",
  "h",
  index: 1,
  input: "Where in the world is Carmen Sandiego?",
  length: 2,
  groups: {
    repeat: "h"
  }
]

Again the above is a representation of the result array. The main difference that you will notice is that the groups property is now an object with repeat as its only key-value pair (when a regular expression has no named capture groups, groups is undefined). As far as the structure of the regular expression is concerned, the main difference is that we are now using a named capture group and we are using the named capture group backreference syntax to reference that first named capture group. The one thing that we want to take away from this example is that in order to use a named backreference we need to use the syntax \k<name_of_group> (of course replacing name_of_group with the actual name of the desired capture group).

One thing that I do want to mention is that you can still reference named capture groups by number. For example, /(?<repeat>.).*?\1/ will work the same as /(?<repeat>.).*?\k<repeat>/.
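A short sketch comparing the three forms side by side, using the same sentence as before. The variable names (sentence, byName, byNumber, unnamed) are just for illustration:

```javascript
const sentence = "Where in the world is Carmen Sandiego?";

// Named group, referenced by name.
const byName = /(?<repeat>.).*?\k<repeat>/.exec(sentence);
// Named group, referenced by number.
const byNumber = /(?<repeat>.).*?\1/.exec(sentence);
// Normal group, referenced by number.
const unnamed = /(.).*?\1/.exec(sentence);

console.log(byName[0]);             // "here in th"
console.log(byNumber[0]);           // "here in th"
console.log(byName.groups.repeat);  // "h"
console.log(unnamed.groups);        // undefined (no named groups in this one)
```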

Can I Try An Example?

When I published this article Google Chrome was the only major web browser that supported named capture groups. The good news though is that thanks to RunKit, we can play around with this new feature. Try out the example below and modify it as you like:

var today = new Date();
var delim = '/-.'.charAt(Math.floor(Math.random() * 3));
var strToday = [today.getFullYear(), today.getMonth() + 1, today.getDate()].join(delim);
strToday = strToday.replace(/\b\d\b/g, '0$&');
var rgx = /(?<year>\d{4})(?<delim>[\-\/\.])(?<month>\d\d)\k<delim>(?<day>\d\d)/;
rgx.exec(strToday);
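If you want to see what the \k<delim> backreference is doing for us here, the following sketch runs the same regular expression against a few fixed strings instead of today's date. The sample strings and variable names are made up for the example. Because the second delimiter has to be the same character as the first, a date with mixed delimiters should not match:

```javascript
const dateRgx = /(?<year>\d{4})(?<delim>[\-\/\.])(?<month>\d\d)\k<delim>(?<day>\d\d)/;

// Same delimiter in both places: this matches.
const dotted = dateRgx.exec("2019.12.31");
console.log(dotted.groups.delim); // "."
console.log(dotted.groups.day);   // "31"

// Mixed delimiters: the backreference fails, so exec returns null.
const mixedDelims = dateRgx.exec("2019-12/31");
console.log(mixedDelims); // null
```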

Is There Anything Else I Should Know?

There is always more that we can learn, especially when it comes to JavaScript these days. I do have to admit that as I started diving into this new feature the following pages helped a lot:

A great resource for all JavaScripters is MDN. I'm sure there will be more information out there as well as time goes on. One thing that I hope to write about in the future is using String.prototype.replace(…) in conjunction with a regular expression with named capture groups.
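Until then, here is a very small taste of it, offered as a sketch rather than a full treatment. In a replacement string, a named capture group can be referenced with $<name_of_group>. The variable names isoDate and usDate are just for this example:

```javascript
const isoDate = "2019-12-31";

// Reorder the pieces of the date using the $<name> replacement syntax.
const usDate = isoDate.replace(
  /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/,
  "$<month>/$<day>/$<year>"
);

console.log(usDate); // "12/31/2019"
```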

In conclusion I have to say the best way to get to know more is to keep writing code and try to keep up with the advances in ECMAScript (AKA JavaScript). Happy coding!!! 😎