How regular expressions work in Bash: a simple explanation for beginners

Using quantifiers to repeat patterns in a script

Quantifiers control how many times the preceding character or group may repeat, so a short pattern can describe strings of varying length.

+ (plus): Matches the previous character one or more times. ab+c matches “abc”, “abbc”, “abbbc”, etc., but not “ac”.
? (question mark): Matches the previous character zero or one times (i.e. makes the previous character optional). ab?c matches “ac” and “abc”, but not “abbc”.
* (asterisk): Matches the previous character zero or more times. We’ve seen this before.
{n}: Matches the previous character exactly n times. a{3} matches “aaa”.
{n,}: Matches the previous character n or more times. a{2,} matches “aa”, “aaa”, “aaaa”, etc.
{n,m}: Matches the previous character n to m times (inclusive). a{1,3} matches “a”, “aa”, or “aaa”.
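
The quantifiers listed above can be checked one by one with a small sketch like the following. The helper name matches is hypothetical and not part of the tutorial's scripts. It wraps each pattern in ^ and $ so that the whole string has to match, which makes results such as a{3} against “aa” easy to read.

```shell
#!/bin/bash

# Hypothetical helper: prints "yes" if the WHOLE string matches the pattern,
# "no" otherwise. The pattern is stored in a variable and left unquoted
# after =~ so that Bash treats it as a regular expression.
matches() {
  local string="$1" pattern="^$2\$"
  if [[ "$string" =~ $pattern ]]; then
    echo "yes"
  else
    echo "no"
  fi
}

echo "ab+c    against abbc: $(matches 'abbc' 'ab+c')"      # yes
echo "ab+c    against ac:   $(matches 'ac' 'ab+c')"        # no
echo "ab?c    against ac:   $(matches 'ac' 'ab?c')"        # yes
echo "ab?c    against abbc: $(matches 'abbc' 'ab?c')"      # no
echo "ab*c    against ac:   $(matches 'ac' 'ab*c')"        # yes
echo "a{3}    against aaa:  $(matches 'aaa' 'a{3}')"       # yes
echo "a{3}    against aa:   $(matches 'aa' 'a{3}')"        # no
echo "a{2,}   against a:    $(matches 'a' 'a{2,}')"        # no
echo "a{1,3}  against aaaa: $(matches 'aaaa' 'a{1,3}')"    # no
```

Keeping the pattern in a variable avoids quoting problems with characters such as { and } inside [[ ... ]].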

Let’s modify our regex_test.sh script to use quantifiers. Open regex_test.sh with a text editor and replace its contents with the following:

#!/bin/bash

string="abbbc"
if [[ "$string" =~ ab+c ]]; then
  echo "Match found!"
else
  echo "No match."
fi

Save the file and run it:

./regex_test.sh

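The output should say “Match found!” This is because ab+c matches an ‘a’, followed by one or more ‘b’s, followed by a ‘c’. The pattern has no ^ or $ anchors, so this sequence may appear anywhere in the string; here it happens to be the whole string.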
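
Because =~ looks for the pattern anywhere in the string, it can help to compare an unanchored pattern with an anchored one. The sketch below uses a made-up input string, “xxabbbcxx”, which is not part of the tutorial's script.

```shell
#!/bin/bash

# Hypothetical input: the a, b's and c are surrounded by other characters.
string="xxabbbcxx"

unanchored_pattern='ab+c'
anchored_pattern='^ab+c$'

# Without anchors, the pattern may match any part of the string.
if [[ "$string" =~ $unanchored_pattern ]]; then
  unanchored="match"
else
  unanchored="no match"
fi

# With ^ and $, the pattern has to cover the string from start to end.
if [[ "$string" =~ $anchored_pattern ]]; then
  anchored="match"
else
  anchored="no match"
fi

echo "Unanchored: $unanchored"   # Unanchored: match
echo "Anchored:   $anchored"     # Anchored:   no match
```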

Extracting data using capture groups in a script

Parentheses () are used to group parts of a regular expression. This is useful for applying a quantifier to several characters at once and for capturing the text that a part of the pattern matches.
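
As a minimal sketch of the first use, the example below compares (ab)+, where the quantifier applies to the whole group, with ab+, where it applies only to the ‘b’. The helper name check and the sample strings are illustrative assumptions.

```shell
#!/bin/bash

# Hypothetical helper: prints "match" or "no match" for a string and a pattern.
check() {
  local string="$1" pattern="$2"
  if [[ "$string" =~ $pattern ]]; then
    echo "match"
  else
    echo "no match"
  fi
}

grouped='^(ab)+$'   # one or more repetitions of the pair "ab"
single='^ab+$'      # one "a" followed by one or more "b"

echo "(ab)+ against ababab: $(check 'ababab' "$grouped")"   # match
echo "(ab)+ against abbb:   $(check 'abbb' "$grouped")"     # no match
echo "ab+   against ababab: $(check 'ababab' "$single")"    # no match
echo "ab+   against abbb:   $(check 'abbb' "$single")"      # match
```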

When you use parentheses, Bash stores the text that matches that part of the regular expression in a special array called BASH_REMATCH. BASH_REMATCH[0] contains the part of the string that matched the entire regular expression, BASH_REMATCH[1] contains the text matched by the first group, BASH_REMATCH[2] the text matched by the second group, and so on.

Let’s modify our regex_test.sh script to extract data using capture groups. Open regex_test.sh with a text editor and replace its contents with the following code:

#!/bin/bash

string="apple123"
if [[ "$string" =~ ^([a-z]+)([0-9]+)$ ]]; then
  fruit="${BASH_REMATCH[1]}"
  number="${BASH_REMATCH[2]}"
  echo "Fruit: $fruit"
else
  echo "No match."
fi

Save the file and run it:

./regex_test.sh

The output should be “Fruit: apple”. The script uses the first capture group to extract the fruit name from the string. The second group stores the digits in the number variable, which the script does not print.
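
To see what each element of BASH_REMATCH holds, the following variation prints all three of them. It is only a sketch: the input string with extra text around “apple123” and the variable names whole, first and second are assumptions made for illustration, and the pattern is left unanchored on purpose.

```shell
#!/bin/bash

# Hypothetical input with extra text before and after the part of interest.
string="order: apple123!"
pattern='([a-z]+)([0-9]+)'

if [[ "$string" =~ $pattern ]]; then
  whole="${BASH_REMATCH[0]}"    # text matched by the entire pattern
  first="${BASH_REMATCH[1]}"    # text matched by the first group
  second="${BASH_REMATCH[2]}"   # text matched by the second group
  echo "Matched text: $whole"   # Matched text: apple123
  echo "Group 1: $first"        # Group 1: apple
  echo "Group 2: $second"       # Group 2: 123
else
  echo "No match."
fi
```

Note that BASH_REMATCH[0] is “apple123”, not the full input string: “order: ” and “!” are outside the match.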

Replacing text with sed in a script

Let’s create a new script called sed_test.sh to practice using sed.

cd ~/project
touch sed_test.sh
chmod +x sed_test.sh

Open sed_test.sh with a text editor and add the following:

#!/bin/bash

string="apple123"
echo "$string" | sed 's/[0-9]/X/g'

Save the file and run it:

./sed_test.sh

The output should be: appleXXX. The s command replaces text that matches [0-9] with the letter “X”, and the g flag applies the replacement to every digit on the line rather than only the first one.
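
A few variations on the same substitution may make the role of each part clearer. This is a sketch rather than part of sed_test.sh, and the variable names are illustrative. It assumes a sed that supports the -E option for extended regular expressions, as GNU sed and BSD sed do; by default sed uses basic regular expressions, where + is not a quantifier.

```shell
#!/bin/bash

string="apple123"

# Without the g flag, only the first match on the line is replaced.
first_only=$(echo "$string" | sed 's/[0-9]/X/')

# With -E, + is a quantifier, so the whole run of digits is one match.
run_as_one=$(echo "$string" | sed -E 's/[0-9]+/X/')

# \1 and \2 in the replacement refer to the first and second capture groups.
swapped=$(echo "$string" | sed -E 's/^([a-z]+)([0-9]+)$/\2-\1/')

echo "$first_only"   # appleX23
echo "$run_as_one"   # appleX
echo "$swapped"      # 123-apple
```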

Conclusion

Regular expressions in Bash are a convenient and powerful way to work with text directly on the command line. They let you check strings, extract the parts you need, and automate repetitive actions in scripts. With a few simple constructs, from literal characters and quantifiers to capture groups, you can quickly build patterns for a wide range of text data. Combining them with sed adds search-and-replace text processing without the need for more complex tools.
