Thursday November 20th 2008

Regular Expressions And Forms

Introduction

When it comes to filling out forms, many people are lazy, others simply don't pay attention to the instructions you give them, and some try to get it right but end up making a typo. Most web-based forms therefore need quite a lot of error checking. The most common problem is people leaving fields blank, which we've already covered in the error checking in forms tutorial. But what about when you need to check the format of the information they enter, other than an email address? This is where regular expressions become very useful.

Note: This tutorial does not cover regular expressions in depth; there's far more to them than the basic examples below. Hopefully it will still be useful for some basic form checking and give you an understanding of the basics if you've never come across regular expressions before.

The Basics

Regular expressions are one of the easiest ways to check the pattern and length of a string. There are several functions that support regular expressions; probably the most popular, which we'll use here, is ereg().

First we need to start out with the beginning and end anchors, which are ^ (start of the line) and $ (end of the line) and the conditions we want to check for will go between these.

Regular expressions allow you to use ranges for checking a value. For example, if your form includes an input for a phone number, you don't want people entering letters, only digits in the range 0-9. We put this range inside a set of square brackets between our anchors. On its own [0-9] matches exactly one character, so we follow it with + (one or more of the previous item) to check the whole string, for example:

$phonenumber = '9552977';
if (ereg("^[0-9]+$", $phonenumber)) {
    # the string contains only numbers
}

In this case the string matches and you can continue with your code. You can add an else condition at that point to return an error if the string contains characters other than 0-9 (ereg() returns a true value when the string matches and false when it does not).
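As a rough sketch of the if/else check described above, the snippet below uses preg_match() rather than ereg(), because ereg() was removed in PHP 7 and will not run on current installs. The function name check_phone_number() and the error text are made up for illustration.

```php
<?php
// Hypothetical helper: returns an error message, or null when the input is fine.
function check_phone_number($phonenumber)
{
    // ^ and $ anchor the pattern; [0-9]+ means one or more digits.
    // The D modifier stops $ from matching just before a trailing newline.
    if (preg_match('/^[0-9]+$/D', $phonenumber)) {
        return null; // only digits, carry on with the rest of the script
    } else {
        return 'Please enter numbers only.';
    }
}

$error = check_phone_number('9552977');
echo $error === null ? "ok\n" : $error . "\n";
```

Returning the message instead of printing it inside the function leaves it up to the calling code to decide where the error appears on the form.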

Probably the most common condition to check in a string is a combination of numbers and letters, for example a username that must not contain any special characters. For this we can combine different ranges.

We already know that [0-9] allows numbers in the string. For letters the format is similar: [a-z] allows only characters between a and z. This is case-sensitive, so it would only allow lowercase letters.

As you would probably want to allow uppercase letters in a username as well, we can add another range for those, [A-Z], then combine all three inside one set of brackets, followed by + (one or more of the previous item), to check against our string:

$username = 'Dave2004';
if (ereg("^[A-Za-z0-9]+$", $username)) {
    # the string contains no character other than letters + numbers
}

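As an alternative to [A-Za-z0-9] you could also use the character class [:alnum:], which looks for the same match (characters 0-9 and upper or lowercase a-z). A class name only works inside a set of square brackets, so it is written with two pairs, eg: ^[[:alnum:]]+$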

A pattern such as ^[A-Za-z0-9]+$ accepts a string of any length from one character upwards, but you may want to set a minimum and maximum length, for example only allowing usernames between 3 and 12 characters. For this we replace the + with another condition: {3,12}

You can follow any item in a regular expression with curly brackets containing one or two numbers, which define how many times the previous item has to match: {3} means exactly 3 times and {3,12} means between 3 and 12 times. So now our code looks like this:

$username = "Dave2004";
if (ereg("^[A-Za-z0-9]{3,12}$", $username)) {
    # the string contains no character other than letters + numbers and is between 3 and 12 characters long
}
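To see the length limit in action, here is a small sketch of the same check written with preg_match(), since ereg() is no longer available from PHP 7 onwards. The helper name is_valid_username() is hypothetical; the 3 to 12 limit is the one from the example above.

```php
<?php
// Hypothetical helper: true when $username is 3 to 12 letters or numbers.
function is_valid_username($username)
{
    // {3,12} applies to the bracket expression directly before it.
    // The D modifier stops $ from matching just before a trailing newline.
    return preg_match('/^[A-Za-z0-9]{3,12}$/D', $username) === 1;
}

var_dump(is_valid_username('Dave2004'));  // letters and numbers, 8 characters
var_dump(is_valid_username('Al'));        // only 2 characters
var_dump(is_valid_username('Dave_2004')); // contains an underscore
```

Comparing the result with 1 gives a plain true or false, because preg_match() returns 1 for a match, 0 for no match and false if the pattern itself is broken.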

As an alternative to using [A-Za-z] you could also use [:alpha:], which looks for any alphabetic character in either upper or lower case, eg: ^[[:alpha:]0-9]{3,15}$

Also note that ereg() is case-sensitive. Another option in some cases is eregi(), which is not, so you could match both lowercase and uppercase letters in a string by just looking for [a-z].
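With preg_match() the same effect comes from the i modifier rather than a separate function. The sketch below assumes the username pattern used earlier in this tutorial; the variable names are only for illustration.

```php
<?php
// Modifiers go after the closing delimiter: i makes the match
// case-insensitive, D stops $ from matching before a trailing newline.
$insensitive = '/^[a-z0-9]{3,12}$/iD';
$sensitive   = '/^[a-z0-9]{3,12}$/D';

var_dump(preg_match($insensitive, 'Dave2004') === 1); // true, D is accepted by [a-z]
var_dump(preg_match($sensitive, 'Dave2004') === 1);   // false, uppercase D is rejected
```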

preg_match() is another function for regular expressions, and it is the better choice for scripts that search for certain content within a string, but for checking the format of a basic string ereg() or eregi() is fine.

As well as full ranges such as [a-z] and [0-9], you can also check for individual characters, or for one character from a selection, in the same way. For example, ^[24]+$ checks that the string contains only the numbers 2 and 4. Note that | only works as an 'or' character outside square brackets, as in ^(2|4)+$; inside brackets it is treated as a literal | character.
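Inside square brackets every character stands for itself, which a few quick checks make visible. This is only a sketch using preg_match() (ereg() is gone from PHP 7 onwards), and the test strings are arbitrary.

```php
<?php
// [24] is a set: exactly one character, which must be 2 or 4.
var_dump(preg_match('/^[24]$/D', '4') === 1);      // true
var_dump(preg_match('/^[24]$/D', '3') === 1);      // false

// Inside brackets | is an ordinary character, so [2|4] also accepts "|".
var_dump(preg_match('/^[2|4]$/D', '|') === 1);     // true

// Outside brackets | means "or" and can join longer alternatives.
var_dump(preg_match('/^(yes|no)$/D', 'no') === 1); // true
```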

You can also combine different matches and ranges to check that a string follows a particular format. For example, let's say your form requires the user to enter a reference number or code that is always 3 numbers followed by 4 lowercase letters: '467gyhe' would be a correct string and '42geyh8' would be incorrect. We could check the string like so:

$string = '467gyhe';
if (ereg("^[0-9]{3}[a-z]{4}$", $string)) {
    # the string is formed of 3 numbers then 4 lowercase letters
}

A couple of additional characters and conditions that may be useful for checking inputs in a form:

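The period (.) says "any single character in this position". It must be written outside square brackets, because [.] matches only a literal period. For example, if you changed the last example we used to:

if (ereg("^[0-9]{3}.[a-z]{4}$", $string))

It would look for any one single character between the 3 numbers and 4 letters, so '467-gyhe' and '467xgyhe' would both match.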
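The difference between a bare period and a bracketed one is easy to miss, so here is a short comparison. It is a sketch written with preg_match(), since ereg() does not exist in PHP 7 and later; the variable names and test strings are made up.

```php
<?php
// . outside brackets: any one character (other than a newline in preg_match()).
$any = '/^[0-9]{3}.[a-z]{4}$/D';
// [.] inside brackets: only a real period.
$literal = '/^[0-9]{3}[.][a-z]{4}$/D';

var_dump(preg_match($any, '467-gyhe') === 1);     // true
var_dump(preg_match($any, '467gyhe') === 1);      // false, the extra character is missing
var_dump(preg_match($literal, '467-gyhe') === 1); // false
var_dump(preg_match($literal, '467.gyhe') === 1); // true
```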

[:space:] looks for a whitespace character (a space, tab or line break) in the string and, like the other classes, goes inside a set of square brackets. So, back to the same example, if you wanted to only allow the string with a space between the numbers and letters you could use:

$string = "467 gyhe";
if (ereg("^[0-9]{3}[[:space:]][a-z]{4}$", $string)) {
    # the string is formed of 3 numbers then 4 lowercase letters with a space between the numbers and letters
}
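Because [[:space:]] accepts any whitespace character, a tab would pass this check too. If only an ordinary space should be allowed, one option is to type the space directly into the pattern. The comparison below is a sketch using preg_match() (ereg() is unavailable from PHP 7 on), with illustrative variable names.

```php
<?php
$whitespace = '/^[0-9]{3}[[:space:]][a-z]{4}$/D'; // any one whitespace character
$spaceOnly  = '/^[0-9]{3} [a-z]{4}$/D';           // exactly one ordinary space

var_dump(preg_match($whitespace, '467 gyhe') === 1);  // true
var_dump(preg_match($whitespace, "467\tgyhe") === 1); // true, a tab is whitespace
var_dump(preg_match($spaceOnly, "467\tgyhe") === 1);  // false
var_dump(preg_match($whitespace, '467gyhe') === 1);   // false, separator missing
```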

Again, these are only some very basic uses for regular expressions, but they may help with your forms. I'll try to write a more in-depth regular expressions tutorial at some point.