different modifiers on the individual expressions. The order of
* sub-matches is preserved as well. Numbered back-references are adapted to
* the new overall sub-match count. This means that it's safe to use numbered
* back-refences in the individual expressions!
* If {@link $names} is given, the individual expressions are captured in
* named sub-matches using the contents of that array as names.
* Matching pair-delimiters (e.g. "{…}") are currently
* not supported.
*
* The function assumes that all regular expressions are well-formed.
* Behaviour is undefined if they aren't.
*
* This function was created after a
* {@link http://stackoverflow.com/questions/244959/ StackOverflow discussion}.
* Much of it was written or thought of by “porneL” and “eyelidlessness”. Many
* thanks to both of them.
*
* @param string $glue A string to insert between the individual expressions.
* This should usually be either the empty string, indicating
* concatenation, or the pipe ("|"), indicating alternation.
* Notice that this string might have to be escaped since it is treated
* as a normal character in a regular expression (i.e. "/" will
* end the expression and result in an invalid output).
* @param array $expressions The expressions to merge. The expressions may
* have arbitrary different delimiters and modifiers.
* @param array $names Optional. This is either an empty array or an array of
* strings of the same length as {@link $expressions}. In that case,
* the strings of this array are used to create named sub-matches for the
* expressions.
* @return string An string representing a regular expression equivalent to the
* merged expressions. Returns FALSE if an error occurred.
*/
function preg_merge($glue, array $expressions, array $names = array()) {
// … then, a miracle occurs.
// Sanity check …
$use_names = ($names !== null and count($names) !== 0);
if (
$use_names and count($names) !== count($expressions) or
!is_string($glue)
)
return false;
$result = array();
// For keeping track of the names for sub-matches.
$names_count = 0;
// For keeping track of *all* captures to re-adjust backreferences.
$capture_count = 0;
foreach ($expressions as $expression) {
if ($use_names)
$name = str_replace(' ', '_', $names[$names_count++]);
// Get delimiters and modifiers:
$stripped = preg_strip($expression);
if ($stripped === false)
return false;
list($sub_expr, $modifiers) = $stripped;
// Re-adjust backreferences:
// TODO What about \R backreferences (\0 isn't allowed, though)?
// We assume that the expression is correct and therefore don't check
// for matching parentheses.
$number_of_captures = preg_match_all('/\([^?]|\(\?[^:]/', $sub_expr, $_);
if ($number_of_captures === false)
return false;
if ($number_of_captures > 0) {
$backref_expr = '/
(?" : '?:';
$new_expr = "($sub_name$sub_modifiers$sub_expr)";
$result[] = $new_expr;
}
return '/' . implode($glue, $result) . '/';
}
/**
* Strips a regular expression string off its delimiters and modifiers.
* Additionally, normalizes the delimiters (i.e. reformats the pattern so that
* it could have used "/" as delimiter).
*
* @param string $expression The regular expression string to strip.
* @return array An array whose first entry is the expression itself, the
* second an array of delimiters. If the argument is not a valid regular
* expression, returns FALSE.
*
*/
function preg_strip($expression) {
if (preg_match('/^(.)(.*)\\1([imsxeADSUXJu]*)$/s', $expression, $matches) !== 1)
return false;
$delim = $matches[1];
$sub_expr = $matches[2];
if ($delim !== '/') {
// Replace occurrences by the escaped delimiter by its unescaped
// version and escape new delimiter.
$sub_expr = str_replace("\\$delim", $delim, $sub_expr);
$sub_expr = str_replace('/', '\\/', $sub_expr);
}
$modifiers = $matches[3] === '' ? array() : str_split(trim($matches[3]));
return array($sub_expr, $modifiers);
}
?>