Feedback on some Regex needed.

Tyrsson

So, the back story.

I am currently building a password validation class for general use in Laminas applications via laminas-validator implementations. I believe that any time you're working with regex it's good to get as many people's thoughts as possible, because there are so many approaches, ways to look at it, and ways to accomplish the task at hand. I am trying to figure out if there is a simpler way to accomplish this:

$options['special'] is one of the configuration options passed to the validator. In other words, if you set the option to require 2 special characters, then that is how many we have to verify are present in $value. The currently supported special characters are:
Code:
[].+=\[\\\\@_!\#$%^&*()<>?|}{~:-]

PHP:
        if (isset($options['special']) && (int) $options['special'] > 0) {
            preg_match_all(self::POSSIBLE_SPECIAL_CHARS, $value, $specialMatches);
            if (! count($specialMatches[0]) >= $options['special']) {
                $this->error(self::INVALID_SPECIAL_COUNT);
                $isValid = false;
            }
        }
 
Um... ! has a higher precedence than >=, so what you end up testing here is (! count(...)) >= $options['special'], i.e. true/false >= $options['special'], which isn't what you want. Compare the results with an extra () in https://3v4l.org/qM9OJ for info.
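To make the precedence point concrete, here is a minimal sketch. The function names and the sample counts are made up for illustration and are not part of the validator; the two functions only differ in the extra parentheses.

```php
<?php
// Hypothetical helpers: each returns true when an error should be raised.
function buggy_fails(int $found, int $required): bool
{
    // ! binds tighter than >=, so this is (! $found) >= $required.
    // The boolean is then compared with $required converted to a boolean.
    return ! $found >= $required;
}

function fixed_fails(int $found, int $required): bool
{
    // The comparison runs first, then the result is negated.
    return ! ($found >= $required);
}

// One special character found, two required: only the fixed version fails it.
var_dump(buggy_fails(1, 2)); // bool(false)
var_dump(fixed_fails(1, 2)); // bool(true)
```

Writing the condition as $found < $required avoids the negation altogether.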

The way I'd sort this works out something like this:

Code:
$isValid = true;
$counts = [
  'loweralpha' => 0,
  'higheralpha' => 0,
  'numeric' => 0,
  'special' => 0,
  'accented' => 0,
  'other' => 0,
];
$characters = mb_str_split($password);
foreach ($characters as $character) {
  $counts[classify_character($character)]++;
}

if ($counts['loweralpha'] < ($options['loweralpha'] ?? 0)) {
  $this->error(self::INVALID_LOWERALPHA_COUNT);
  $isValid = false;
}

// rinse repeat for other classes of character

function classify_character(string $character): string {
  if (preg_match('/[a-z]/', $character)) {
    return 'loweralpha';
  }
  if (preg_match('/[A-Z]/', $character)) {
    return 'higheralpha';
  }
  if (preg_match('/[0-9]/', $character)) {
    return 'numeric';
  }
  if (preg_match('/[...special characters here...]/', $character)) {
    return 'special';
  }
  if (preg_match('/\p{L}/u', $character)) {
    return 'accented';
  }

  return 'other';
}

Is it more efficient? No, not at all, but this is not a process you'll be running that often in practice, because you only need to validate a password when it is changed, not on every entry. Reasoning about which characters fall into which class is also much easier, especially for the special characters: you no longer need to write a regex that matches multiple sequences of any-of-these-things, you simply list out which characters you consider special and you're done. You get to skip worrying about anchors or + or greediness or any of that stuff, because as long as you escape the metacharacters in that list, it will just work to either match or not match the single character under consideration.
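As a rough sketch of the "list the characters and escape them" idea: the constant and both function names below are hypothetical, and the character list is just an example, so swap in whatever you consider special. preg_quote() does the escaping, so the list can be kept as a plain string.

```php
<?php
// Hypothetical list of special characters, written as a plain string.
const SPECIAL_CHARS = '[].+=\\@_!#$%^&*()<>?|}{~:-\'"';

function is_special(string $character): bool
{
    // preg_quote() escapes the regex metacharacters (and the "/" delimiter),
    // so every listed character is matched literally inside the class.
    $pattern = '/[' . preg_quote(SPECIAL_CHARS, '/') . ']/';

    return preg_match($pattern, $character) === 1;
}

function count_special(string $password): int
{
    // Check one character at a time, as in classify_character() above.
    return count(array_filter(mb_str_split($password), 'is_special'));
}
```

count_special($password) can then be compared directly against the configured minimum.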

Also, for the record, any system that doesn't accept " in a password is one that I usually have question marks about. If you have proper sanitisation going in, there should be no reason to restrict which characters count as special, as long as they're printable. So, for example, I might question the validity of something like the zero width joiner in a password because it's non-printable (though, hey, it's 2023, why not use colour-variant emojis or flags as password characters?), but any system that doesn't allow " or ' makes me immediately suspicious of its ability to tolerate XSS attempts.
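If you did want to reject non-printable characters and allow everything else, one possible sketch uses the Unicode "Other" category. The function name is hypothetical, and treating \p{C} as "non-printable" is an assumption about where to draw the line.

```php
<?php
// Hypothetical helper, not part of the class in this thread.
function has_non_printable(string $password): bool
{
    // \p{C} covers control, format (e.g. U+200D zero width joiner),
    // private-use, surrogate and unassigned code points.
    // The /u modifier makes PCRE treat the subject as UTF-8.
    return preg_match('/\p{C}/u', $password) === 1;
}
```

Note that this would also reject emoji sequences that are glued together with a zero width joiner, so it is a trade-off rather than a clear win.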
 
Well, this is just the first iteration. So let me just post the entire class here. It's very much a work in progress.

PHP:
<?php
declare(strict_types=1);
namespace Webinertia\ThemeManager\Validator;
use Laminas\Validator\AbstractValidator;
use Traversable;
use function array_shift;
use function count;
use function func_get_args;
use function is_array;
use function iterator_to_array;
use function preg_match_all;
use function strlen;
final class Password extends AbstractValidator
{
    /** Supported special characters */
    public const POSSIBLE_SPECIAL_CHARS = '/[].\'"+=\[\\\\@_!\#$%^&*()<>?|}{~:-]/';
    public const INVALID_LENGTH_COUNT   = 'invalidLengthCount';
    public const INVALID_UPPER_COUNT    = 'invalidUpperCount';
    public const INVALID_LOWER_COUNT    = 'invalidLowerCount';
    public const INVALID_DIGIT_COUNT    = 'invalidDigitCount';
    public const INVALID_SPECIAL_COUNT  = 'invalidSpecialCount';
    protected $messageTemplates = [
        self::INVALID_LENGTH_COUNT  => "Password must be at least %length% characters in length.",
        self::INVALID_UPPER_COUNT   => "Password must contain at least %upper% uppercase letter(s).",
        self::INVALID_LOWER_COUNT   => "Password must contain at least %lower% lowercase letter(s).",
        self::INVALID_DIGIT_COUNT   => "Password must contain at least %digit% numeric character(s).",
        self::INVALID_SPECIAL_COUNT => "Password must contain at least %special% special character(s)."
    ];
    protected $messageVariables = [
        'length'  => ['options' => 'length'],
        'upper'   => ['options' => 'upper'],
        'lower'   => ['options' => 'lower'],
        'digit'   => ['options' => 'digit'],
        'special' => ['options' => 'special'],
    ];
    protected $options = [
        'length'  => 0, // overall length of password
        'upper'   => 0, // uppercase count
        'lower'   => 0, // lowercase count
        'digit'   => 0, // digit count
        'special' => 0, // special char count
    ];
    /**
     * @param array<string, mixed>|Traversable<string, mixed> $options
     * @return void
     */
    public function __construct($options = [])
    {
        if ($options instanceof Traversable) {
            $options = iterator_to_array($options);
        } elseif (! is_array($options)) {
            $options        = func_get_args();
            $temp['length'] = array_shift($options);
            $options        = $temp;
        }
        parent::__construct($options);
    }
    public function isValid($value)
    {
        $this->setValue($value);
        $isValid = true;
        $options = $this->getOptions();
        if (isset($options['length']) && strlen($value) < (int) $options['length']) {
            $this->error(self::INVALID_LENGTH_COUNT);
            $isValid = false;
        }
        if (isset($options['upper']) && (int) $options['upper'] > 0) {
            preg_match_all('/[A-Z]/', $value, $upperMatches);
            if (! (count($upperMatches[0]) >= $options['upper'])) {
                $this->error(self::INVALID_UPPER_COUNT);
                $isValid = false;
            }
        }
        if (isset($options['lower']) && (int) $options['lower'] > 0) {
            preg_match_all('/[a-z]/', $value, $lowerMatches);
            if (! (count($lowerMatches[0]) >= $options['lower'])) {
                $this->error(self::INVALID_LOWER_COUNT);
                $isValid = false;
            }
        }
        if (isset($options['digit']) && (int) $options['digit'] > 0) {
            preg_match_all('/[0-9]/', $value, $digitMatches);
            if (! (count($digitMatches[0]) >= $options['digit'])) {
                $this->error(self::INVALID_DIGIT_COUNT);
                $isValid = false;
            }
        }
        if (isset($options['special']) && (int) $options['special'] > 0) {
            preg_match_all(self::POSSIBLE_SPECIAL_CHARS, $value, $specialMatches);
            if (! (count($specialMatches[0]) >= $options['special'])) {
                $this->error(self::INVALID_SPECIAL_COUNT);
                $isValid = false;
            }
        }
        return $isValid;
    }
    public function setValue($value)
    {
        $this->value = (string) $value;
    }
}
 
Well, as for supporting " and ': I'm not sure about the XSS vector, since ideally a user's password would never be displayed, but I could see it being a problem as an injection vector. But I see your point.
 
! has a higher precedence
Yeah, I'm aware. Not the first time it's been overlooked, and I'm sure it won't be the last either. Good catch though. The little testing that I had done only went up to 2 characters of any type. It would've shown up early, and for sure the unit tests would've caught it.
 