How to use regular expressions as data generation patterns?

Regular expressions are an industry standard for search patterns and a well-known way to describe the format and constraints of a value. They are also used as patterns in XSD schemas and in many other solutions as a data description mechanism. As a result, it is very convenient to use a regular expression directly as a source for data generation.

Of course, a regular expression is not the best fit for cases like "random integer" or "date in range". Let's review more interesting examples.

1. The customer code contains a department ID (a letter between A and M), a dot, a store code (2 to 7), a '-' sign and three random digits, the first of which is greater than 2. The regular expression is:

  [Image: How to use regular expression in DTM Data Generator]

2. The product version has three groups of digits separated by dots. The first group is a number between 1 and 12, and the other two groups each have two digits with a leading zero. A suitable expression is:

  [Image: How to use regular expression in DTM Flat File Generator]

3. The reference code has three groups. The first and third groups are the same digit between 1 and 5, while the second group contains five hexadecimal digits. The regular expression uses a block (or sub-expression) definition and refers back to it as \1:

  [Image: How to use regular expression in DTM Test XML Generator]
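
The three descriptions above can be checked with a short Python sketch. The expressions below are reconstructed from the prose only and are written in the syntax of Python's `re` module; the syntax accepted by the DTM tools may differ. Two points are assumptions: the store code is treated as a single digit from 2 to 7, and the three groups of the reference code are written back to back, with uppercase hexadecimal digits and no separator.

```python
import re

# Candidate expressions reconstructed from the prose descriptions.
# They are assumptions, not the expressions shown in the original illustrations.

# Letter A-M, dot, one digit 2-7 (assumed), '-', three digits with the first > 2.
CUSTOMER_CODE = re.compile(r"[A-M]\.[2-7]-[3-9][0-9]{2}")

# First group 1..12, then two groups of exactly two digits, separated by dots.
PRODUCT_VERSION = re.compile(r"([1-9]|1[0-2])\.[0-9]{2}\.[0-9]{2}")

# Digit 1-5, five hexadecimal digits (uppercase assumed), then the same digit
# again via the back-reference \1. No separator between groups is assumed.
REFERENCE_CODE = re.compile(r"([1-5])[0-9A-F]{5}\1")


def matches(pattern, value):
    """Return True when the whole value conforms to the pattern."""
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    for pattern, sample in [
        (CUSTOMER_CODE, "B.5-734"),
        (PRODUCT_VERSION, "12.03.40"),
        (REFERENCE_CODE, "3A0F9C3"),
    ]:
        print(sample, matches(pattern, sample))
```

Checking candidate values this way is a quick method to confirm that an expression describes the intended format before using it as a generation pattern.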

Feature Availability

All modern versions of DTM Data Generator (including the editions for JSON and for Excel), DTM Flat File Generator, DTM File Factory and DTM Test XML Generator support regular expressions as a pattern for data generation. For the first tool, the "by pattern" fill method should be used. For the second and third tools, the $Regexp function is available; it takes the regular expression as its only parameter. The "Custom Generator" option should be selected to use this function.

DTM Data Generation SDK and DTM Data Generation Script Compiler also support regular expressions via the $Regexp function call.

A popular question is "Can I combine a regular expression with other data population methods?" Yes, this is possible in all the tools mentioned, using the data generation pattern engine. The user can combine a $Regexp function call with other functions or pattern items. Let's analyze two examples:

1. A person's name with an identification signature of 4 letters between A and F separated by ':'. This value requires a Value Library ($Lib function) call for the realistic full name and a regular expression ($Regexp function) for the signature. The pattern is:


2. The product name has two substrings separated by '-'. The first should be loaded from the [Name] column of the [Product Groups] database table, and the second has two groups: 2 digits and 2 letters between 'f' and 'q'. We'll use a $Table function call for the first substring and a regular expression for the second:

$Table(Product Groups,Name)-$Regexp([0-9]{2}[f-q]{2})
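
To illustrate what the combined pattern for the product name produces, here is a minimal Python sketch that mimics it outside the DTM tools. It is not how the pattern engine is implemented. The function `make_product_name` and the sample group name "Laptops" are hypothetical; in the real pattern the first substring comes from the [Product Groups] table.

```python
import random
import re
import string

# Format of the second substring: 2 digits followed by 2 letters from 'f' to 'q'.
SUFFIX = re.compile(r"[0-9]{2}[f-q]{2}")


def make_product_name(group_name, rng):
    """Join a group name with a random suffix, imitating $Table(...)-$Regexp(...)."""
    digits = "".join(rng.choice(string.digits) for _ in range(2))
    letters = "".join(rng.choice("fghijklmnopq") for _ in range(2))
    return f"{group_name}-{digits}{letters}"


# "Laptops" stands in for a value that would be read from the database table.
rng = random.Random(0)
name = make_product_name("Laptops", rng)
prefix, suffix = name.rsplit("-", 1)

if __name__ == "__main__":
    print(name)
```

Every generated suffix conforms to the regular expression, which can be verified with `SUFFIX.fullmatch(suffix)`.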