Regular Expressions methods and usage
Reference: http://www.javascriptkit.com/javatutors/redev.shtml
Credits: This tutorial was written by David Andersson (Liorean).
Declaration
There are two ways to declare a regular expression in JavaScript: as a RegExp literal or through the RegExp object constructor. While other languages such as PHP or VBScript use other delimiters, in JavaScript you use the forward slash (/) when you declare RegExp literals.
Syntax | Example |
---|---|
RegExp Literal | |
/pattern/flags | var re = /mac/i; |
RegExp Object Constructor | |
new RegExp("pattern", "flags") | var re = new RegExp(window.prompt("Please input a regex.", "yes\|yeah"), "g"); |
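
The two declaration forms can be compared side by side. This is a minimal sketch: userInput is a hypothetical stand-in for text that would come from a prompt or form field, and the other variable names are chosen only for illustration.

```javascript
// Literal form: the pattern is fixed when the script is parsed.
const literal = /mac/i;

// Constructor form: the pattern is built from a string at run time.
// userInput is a hypothetical value standing in for text typed by a user.
const userInput = "yes|yeah";
const built = new RegExp(userInput, "g");

// Backslashes must be doubled inside a string passed to the constructor,
// because the string literal consumes one level of escaping.
const digitsLiteral = /\d+/;
const digitsBuilt = new RegExp("\\d+");

console.log(literal.test("Macintosh"));                    // true
console.log("yeah, yes".match(built));                     // [ 'yeah', 'yes' ]
console.log(digitsLiteral.source === digitsBuilt.source);  // true
```

The literal form is usually the simpler choice when the pattern is known in advance; the constructor is mainly useful when the pattern has to be assembled from other strings.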
Flags
This reference covers three flags that you may use on a RegExp (later versions of the language add more, such as s, u and y). The multiline flag is supported only in JavaScript 1.5+, but the other two are supported in pretty much every browser that can handle RegExp (JavaScript 1.2+). These flags can be used in any order or combination, and are an integral part of the RegExp.
Flag | Description |
---|---|
Global Search | |
g | The global search flag makes the RegExp look for every match in the string instead of stopping at the first one. With String.match, for example, this yields an array of all occurrences matching the given pattern. |
Ignore Case | |
i | The ignore case flag makes a regular expression case insensitive. For international coders, note that this might not work on extended characters such as å, ü, ñ, æ. |
Multiline Input | |
m | This flag makes the beginning of input (^) and end of input ($) codes also match at the beginning and end of each line, respectively. JavaScript 1.5+ only. |
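
The effect of each flag is easiest to see on one small input. The following is a sketch only; the sample text and variable names are made up for illustration.

```javascript
// Three lines, each spelling "cat" with different casing.
const text = "Cat\ncat\nCAT";

// No flags: only the first, case-sensitive match is returned.
const first = text.match(/cat/);

// g collects every match; i ignores case.
const all = text.match(/cat/gi);

// m lets ^ and $ match at line boundaries, not just at the ends of the input.
const perLine = text.match(/^cat$/gim);
const wholeInput = text.match(/^cat$/gi);

console.log(first[0]);    // "cat" (the lowercase one on the second line)
console.log(all);         // [ 'Cat', 'cat', 'CAT' ]
console.log(perLine);     // [ 'Cat', 'cat', 'CAT' ]
console.log(wholeInput);  // null, because the input as a whole is not just "cat"
```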
Pattern | Description |
---|---|
Escaping | |
\ | Escapes special characters to literal and literal characters to special. E.g: /\(s\)/ matches '(s)' while /(\s)/ matches any whitespace character and captures the match. |
Quantifiers | |
{n}, {n,}, {n,m}, *, +, ? | Quantifiers match the preceding subpattern a certain number of times. The subpattern can be a single character, an escape sequence, a pattern enclosed by parentheses or a character set. {n} matches exactly n times. {n,} matches n or more times. {n,m} matches n to m times. * is short for {0,} and matches zero or more times. + is short for {1,} and matches one or more times. ? is short for {0,1} and matches zero or one time. E.g: /o{1,3}/ matches 'oo' in "tooth" and 'o' in "nose". |
Pattern delimiters | |
(pattern), (?:pattern) | Matches the entire contained pattern. (pattern) captures the match. (?:pattern) doesn't capture the match. E.g: /(d).\1/ matches 'dad' in "abcdadef" and captures 'd', while /(?:.d){2}/ matches 'cdad' but doesn't capture anything. Note: (?:pattern) is a JavaScript 1.5 feature. |
Lookaheads | |
(?=pattern), (?!pattern) | A lookahead matches only if the preceding subexpression is followed by the pattern, but the pattern is not part of the match. The subexpression is the part of the regular expression which will be matched. (?=pattern) matches only if the pattern follows in the input. (?!pattern) matches only if the pattern does not follow in the input. E.g: /Win(?=98)/ matches 'Win' only if 'Win' is followed by '98'. Note: Lookahead is a JavaScript 1.5 feature. |
Alternation | |
\| | Alternation matches the content on either side of the alternation character. E.g: /(a\|b)a/ matches 'aa' in "dseaas" and 'ba' in "acbab". |
Character sets | |
[characters], [^characters] | A range of characters may be defined by using a hyphen. [characters] matches any of the contained characters. [^characters] negates the character set and matches all but the contained characters. E.g: /[abcd]/ matches any of the characters 'a', 'b', 'c', 'd' and may be abbreviated to /[a-d]/. Ranges must be in ascending order, otherwise they will throw an error (e.g: /[d-a]/ will throw an error). /[^0-9]/ matches all characters but digits. Note: Most special characters are automatically escaped to their literal meaning in character sets. |
Special characters | |
^, $, ., ? and the other special characters listed in this table | Special characters are characters that match something other than what they appear as. ^ matches the beginning of input (or the beginning of a line with the m flag). $ matches the end of input (or the end of a line with the m flag). . matches any character except a newline. ? directly following a quantifier makes the quantifier non-greedy (it matches the minimum instead of the maximum of the interval defined). E.g: /(.)*?/ matches the empty string '' in all strings. Note: Non-greedy matches are not supported in older browsers such as Netscape Navigator 4 or Microsoft Internet Explorer 5.0. |
Literal characters | |
All characters except those with special meaning. | Mapped directly to the corresponding character. E.g: /a/ matches 'a' in "Any ancestor". |
Backreferences | |
\n | Backreferences are references to the same thing as a previously captured match. n is a positive nonzero integer telling the browser which captured match to reference. /(\S)\1(\1)+/g matches all occurrences of three or more equal non-whitespace characters following each other. E.g: /<(\S+).*>(.*)<\/\1>/ matches '<div id="me">text</div>' in "text<div id=\"me\">text</div>text", because \1 requires the closing tag name to equal the captured opening tag name. |
Character Escapes | |
\f, \r, \n, \t, \v, \0, [\b], \s, \S, \w, \W, \d, \D, \b, \B, \cX, \xhh, \uhhhh | \f matches form-feed. \r matches carriage return. \n matches linefeed. \t matches horizontal tab. \v matches vertical tab. \0 matches the NUL character. [\b] matches backspace. \s matches whitespace (roughly short for [ \f\n\r\t\v\u00A0\u2028\u2029]). \S matches anything but whitespace (roughly short for [^ \f\n\r\t\v\u00A0\u2028\u2029]). \w matches any alphanumerical character (word characters) including underscore (short for [a-zA-Z0-9_]). \W matches any non-word character (short for [^a-zA-Z0-9_]). \d matches any digit (short for [0-9]). \D matches any non-digit (short for [^0-9]). \b matches a word boundary (the position between a word character and a non-word character, or the start or end of the input next to a word character). \B matches any position that is not a word boundary. \cX matches a control character. E.g: \cM matches control-M. \xhh matches the character with the two-digit hexadecimal code hh. \uhhhh matches the Unicode character with the four-digit hexadecimal code hhhh. |
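
Several of the patterns in the table can be tried out directly. The sketch below reuses the table's own examples where possible; the remaining sample strings ("Win95 Win98", "a1b2", "<b>x</b>", "cat concat cats") and all variable names are invented for illustration.

```javascript
// Quantifier: one to three consecutive 'o' characters.
const quantified = "tooth".match(/o{1,3}/)[0];

// Capturing group with a backreference vs. a non-capturing group.
const capturing = /(d).\1/.exec("abcdadef");
const nonCapturing = /(?:.d){2}/.exec("abcdadef");

// Lookaheads test what follows without making it part of the match.
const positive = /Win(?=98)/.exec("Win95 Win98");
const negative = /Win(?!98)/.exec("Win95 Win98");

// Alternation and a negated character set.
const alternated = "acbab".match(/(a|b)a/)[0];
const digitsOnly = "a1b2".replace(/[^0-9]/g, "");

// A ? after a quantifier makes it non-greedy.
const greedy = "<b>x</b>".match(/<.+>/)[0];
const lazy = "<b>x</b>".match(/<.+?>/)[0];

// \b only matches where a word starts or ends.
const wholeWords = "cat concat cats".match(/\bcat\b/g);

console.log(quantified);                  // "oo"
console.log(capturing[0], capturing[1]);  // "dad" "d"
console.log(nonCapturing.length);         // 1 (the match 'cdad', no captures)
console.log(positive.index);              // 6
console.log(negative.index);              // 0
console.log(alternated);                  // "ba"
console.log(digitsOnly);                  // "12"
console.log(greedy, lazy);                // "<b>x</b>" "<b>"
console.log(wholeWords);                  // [ 'cat' ]
```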
Description | Example |
---|---|
RegExp.exec(string) | |
Applies the RegExp to the given string and returns the match information, or null if there is no match. | var match = /s(amp)le/i.exec("Sample text"); match then contains ["Sample","amp"] |
RegExp.test(string) | |
Tests whether the given string matches the RegExp and returns true if it does, false if not. | var match = /sample/.test("Sample text"); match then contains false, because the pattern is case sensitive |
String.match(pattern) | |
Matches the string against the RegExp. With the g flag, returns an array containing all matches; without the g flag, returns an array holding the first match and its captured groups. Returns null if no match is found. | var str = "Watch out for the rock!".match(/r?or?/g); str then contains ["o","or","ro"] |
String.search(pattern) | |
Matches the RegExp against the string and returns the index of the beginning of the match if found, -1 if not. | var ndx = "Watch out for the rock!".search(/for/); ndx then contains 10 |
String.replace(pattern,string) | |
Replaces matches with the given string and returns the edited string. Without the g flag, only the first match is replaced. | var str = "Liorean said: My name is Liorean!".replace(/Liorean/g,'Big Fat Dork'); str then contains "Big Fat Dork said: My name is Big Fat Dork!" |
String.split(pattern) | |
Cuts a string into an array, making cuts at matches. | var str = "I am confused".split(/\s/g); str then contains ["I","am","confused"] |
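
The methods above can be combined in a short script. This is a sketch under a few assumptions: the "John Smith" string, the $1/$2 replacement references and the exec loop are additions that do not appear in the table, and the variable names are arbitrary.

```javascript
const sentence = "Watch out for the rock!";

// exec returns the match, its captured groups and the match position.
const info = /s(amp)le/i.exec("Sample text");

// test and search answer "is there a match?" and "where does it start?".
const found = /for/.test(sentence);
const position = sentence.search(/for/);

// With the g flag, exec can be called repeatedly to walk through all matches;
// the RegExp remembers where to continue in its lastIndex property.
const everyO = /o/g;
const indexes = [];
let hit;
while ((hit = everyO.exec(sentence)) !== null) {
  indexes.push(hit.index);
}

// In a replacement string, $1 and $2 refer to the captured groups.
const swapped = "John Smith".replace(/(\w+)\s(\w+)/, "$2, $1");

// Splitting on \s+ treats a run of whitespace as a single separator.
const parts = "I am  confused".split(/\s+/);

console.log(info[0], info[1], info.index);  // "Sample" "amp" 0
console.log(found, position);               // true 10
console.log(indexes);                       // [ 6, 11, 19 ]
console.log(swapped);                       // "Smith, John"
console.log(parts);                         // [ 'I', 'am', 'confused' ]
```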