RFC-compliant email address validator


Published in: PHP

A PHP function that validates all parts of a given email address according to RFCs 5322, 5321, 1123, 2396, 3696, 4291, 4343, 2821 and 2822. I’ve released it under a license that allows you to use it royalty-free in commercial or non-commercial work.

The test cases and the latest version of the code will always be here: http://code.google.com/p/isemail/source/browse/#svn/trunk


<?php
/**
 * To validate an email address according to RFCs 5321, 5322 and others
 *
 * Copyright (c) 2008-2010, Dominic Sayers
 * Test schema documentation Copyright (c) 2010, Daniel Marschall
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without modification,
 * are permitted provided that the following conditions are met:
 *
 * - Redistributions of source code must retain the above copyright notice,
 * this list of conditions and the following disclaimer.
 * - Redistributions in binary form must reproduce the above copyright notice,
 * this list of conditions and the following disclaimer in the documentation
 * and/or other materials provided with the distribution.
 * - Neither the name of Dominic Sayers nor the names of its contributors may be
 * used to endorse or promote products derived from this software without
 * specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
 * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
 * ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * @package is_email
 * @author Dominic Sayers <dominic@sayers.cc>
 * @copyright 2008-2010 Dominic Sayers
 * @license http://www.opensource.org/licenses/bsd-license.php BSD License
 * @link http://www.dominicsayers.com/isemail
 * @version 2.8.3 - Clarified text for ISEMAIL_IPV6BADCHAR and new test #276 added (too many IPv6 groups with an elision)
 */

// The quality of this code has been improved greatly by using PHPLint
// Copyright (c) 2010 Umberto Salsi
// This is free software; see the license for copying conditions.
// More info: http://www.icosaedro.it/phplint/
/*.
    require_module 'standard';
    require_module 'pcre';
.*/
/**
 * Check that an email address conforms to RFCs 5321, 5322 and others
 *
 * @param string $email The email address to check
 * @param boolean $checkDNS If true then a DNS check for A and MX records will be made
 * @param mixed $errorlevel If true then return an integer error or warning number rather than true or false
 */
/*.mixed.*/ function is_email ($email, $checkDNS = false, $errorlevel = false) {
    // Check that $email is a valid address. Read the following RFCs to understand the constraints:
    // (http://tools.ietf.org/html/rfc5321)
    // (http://tools.ietf.org/html/rfc5322)
    // (http://tools.ietf.org/html/rfc4291#section-2.2)
    // (http://tools.ietf.org/html/rfc1123#section-2.1)
    // (http://tools.ietf.org/html/rfc3696) (guidance only)

    // $errorlevel    Behaviour
    // -------------  ---------------------------------------------------------------------------
    // E_ERROR        Return validation failures only. For technically valid addresses return
    //                ISEMAIL_VALID
    // E_WARNING      Return warnings for unlikely but technically valid addresses. This includes
    //                addresses at TLDs (e.g. johndoe@com), addresses with FWS and comments,
    //                addresses that are quoted and addresses that contain no alphabetic or
    //                numeric characters.
    // true           Same as E_ERROR
    // false          Return true for valid addresses, false for invalid ones. No warnings.
    //
    // Errors can be distinguished from warnings if ($return_value > ISEMAIL_ERROR)
    // version 2.0: Enhance $diagnose parameter to $errorlevel
    // revision 2.5: some syntax changes to make it more PHPLint-friendly. Should be functionally identical.

    if (!defined('ISEMAIL_VALID')) {
        // No errors
        define('ISEMAIL_VALID'                  , 0);
        // Warnings (valid address but unlikely in the real world)
        define('ISEMAIL_WARNING'                , 64);
        define('ISEMAIL_TLD'                    , 65);
        define('ISEMAIL_TLDNUMERIC'             , 66);
        define('ISEMAIL_QUOTEDSTRING'           , 67);
        define('ISEMAIL_COMMENTS'               , 68);
        define('ISEMAIL_FWS'                    , 69);
        define('ISEMAIL_ADDRESSLITERAL'         , 70);
        define('ISEMAIL_UNLIKELYINITIAL'        , 71);
        define('ISEMAIL_SINGLEGROUPELISION'     , 72);
        define('ISEMAIL_DOMAINNOTFOUND'         , 73);
        define('ISEMAIL_MXNOTFOUND'             , 74);
        // Errors (invalid address)
        define('ISEMAIL_ERROR'                  , 128);
        define('ISEMAIL_TOOLONG'                , 129);
        define('ISEMAIL_NOAT'                   , 130);
        define('ISEMAIL_NOLOCALPART'            , 131);
        define('ISEMAIL_NODOMAIN'               , 132);
        define('ISEMAIL_ZEROLENGTHELEMENT'      , 133);
        define('ISEMAIL_BADCOMMENT_START'       , 134);
        define('ISEMAIL_BADCOMMENT_END'         , 135);
        define('ISEMAIL_UNESCAPEDDELIM'         , 136);
        define('ISEMAIL_EMPTYELEMENT'           , 137);
        define('ISEMAIL_UNESCAPEDSPECIAL'       , 138);
        define('ISEMAIL_LOCALTOOLONG'           , 139);
        // define('ISEMAIL_IPV4BADPREFIX'       , 140);
        define('ISEMAIL_IPV6BADPREFIXMIXED'     , 141);
        define('ISEMAIL_IPV6BADPREFIX'          , 142);
        define('ISEMAIL_IPV6GROUPCOUNT'         , 143);
        define('ISEMAIL_IPV6DOUBLEDOUBLECOLON'  , 144);
        define('ISEMAIL_IPV6BADCHAR'            , 145);
        define('ISEMAIL_IPV6TOOMANYGROUPS'      , 146);
        define('ISEMAIL_DOMAINEMPTYELEMENT'     , 147);
        define('ISEMAIL_DOMAINELEMENTTOOLONG'   , 148);
        define('ISEMAIL_DOMAINBADCHAR'          , 149);
        define('ISEMAIL_DOMAINTOOLONG'          , 150);
        define('ISEMAIL_IPV6SINGLECOLONSTART'   , 151);
        define('ISEMAIL_IPV6SINGLECOLONEND'     , 152);
        // Unexpected errors
        // define('ISEMAIL_BADPARAMETER'        , 190);
        // define('ISEMAIL_NOTDEFINED'          , 191);
        // revision 2.1: Redefined unexpected error constants so they don't clash with the ISEMAIL_WARNING bit
        // revision 2.5: Undefined unused constants
    }

    if (is_bool($errorlevel)) {
        if ((bool) $errorlevel) {
            $diagnose = true;
            $warn = false;
        } else {
            $diagnose = false;
            $warn = false;
        }
    } else {
        switch ((int) $errorlevel) {
            case E_WARNING:
                $diagnose = true;
                $warn = true;
                break;
            case E_ERROR:
                $diagnose = true;
                $warn = false;
                break;
            default:
                $diagnose = false;
                $warn = false;
        }
    }

    if ($diagnose) /*.mixed.*/ $return_status = ISEMAIL_VALID; else $return_status = true;
    // version 2.0: Enhance $diagnose parameter to $errorlevel

    // the upper limit on address lengths should normally be considered to be 254
    // (http://www.rfc-editor.org/errata_search.php?rfc=3696)
    // NB My erratum has now been verified by the IETF so the correct answer is 254
    //
    // The maximum total length of a reverse-path or forward-path is 256
    // characters (including the punctuation and element separators)
    // (http://tools.ietf.org/html/rfc5321#section-4.5.3.1.3)
    // NB There is a mandatory 2-character wrapper round the actual address
    $emailLength = strlen($email);
    // revision 1.17: Max length reduced to 254 (see above)
    if ($emailLength > 254) if ($diagnose) return ISEMAIL_TOOLONG; else return false; // Too long

    // Contemporary email addresses consist of a "local part" separated from
    // a "domain part" (a fully-qualified domain name) by an at-sign ("@").
    // (http://tools.ietf.org/html/rfc3696#section-3)
    $atIndex = strrpos($email,'@');

    if ($atIndex === false) if ($diagnose) return ISEMAIL_NOAT; else return false; // No at-sign
    if ($atIndex === 0) if ($diagnose) return ISEMAIL_NOLOCALPART; else return false; // No local part
    if ($atIndex === $emailLength - 1) if ($diagnose) return ISEMAIL_NODOMAIN; else return false; // No domain part
    // revision 1.14: Length test bug suggested by Andrew Campbell of Gloucester, MA

    // Sanitize comments
    // - remove nested comments, quotes and dots in comments
    // - remove parentheses and dots from quoted strings
    $braceDepth = 0;
    $inQuote = false;
    $escapeThisChar = false;

    for ($i = 0; $i < $emailLength; ++$i) {
        $char = $email[$i];
        $replaceChar = false;

        if ($char === '\\') $escapeThisChar = !$escapeThisChar; // Escape the next character?
        else {
            switch ($char) {
                case '(':
                    if ($escapeThisChar) $replaceChar = true;
                    else if ($inQuote) $replaceChar = true;
                    else if ($braceDepth++ > 0) $replaceChar = true; // Increment brace depth

                    break;
                case ')':
                    if ($escapeThisChar) $replaceChar = true;
                    else if ($inQuote) $replaceChar = true;
                    else {
                        if (--$braceDepth > 0) $replaceChar = true; // Decrement brace depth
                        if ($braceDepth < 0) $braceDepth = 0;
                    }

                    break;
                case '"':
                    if ($escapeThisChar) $replaceChar = true;
                    else if ($braceDepth === 0) $inQuote = !$inQuote; // Are we inside a quoted string?
                    else $replaceChar = true;

                    break;
                case '.':
                    if ($escapeThisChar) $replaceChar = true; // Dots don't help us either
                    else if ($braceDepth > 0) $replaceChar = true;

                    break;
                default:
            }

            $escapeThisChar = false;
            // if ($replaceChar) $email[$i] = 'x'; // Replace the offending character with something harmless
            // revision 1.12: Line above replaced because PHPLint doesn't like that syntax
            if ($replaceChar) $email = (string) substr_replace($email, 'x', $i, 1); // Replace the offending character with something harmless
        }
    }

    $localPart = substr($email, 0, $atIndex);
    $domain = substr($email, $atIndex + 1);
    $FWS = "(?:(?:(?:[ \\t]*(?:\\r\\n))?[ \\t]+)|(?:[ \\t]+(?:(?:\\r\\n)[ \\t]+)*))"; // Folding white space
    $dotArray = /*. (array[]) .*/ array();

    // Let's check the local part for RFC compliance...
    //
    // local-part = dot-atom / quoted-string / obs-local-part
    // obs-local-part = word *("." word)
    // (http://tools.ietf.org/html/rfc5322#section-3.4.1)
    //
    // Problem: need to distinguish between "first.last" and "first"."last"
    // (i.e. one element or two). And I suck at regexes.
    $dotArray = preg_split('/\\.(?=(?:[^\\"]*\\"[^\\"]*\\")*(?![^\\"]*\\"))/m', $localPart);
    $partLength = 0;

    foreach ($dotArray as $arrayMember) {
        $element = (string) $arrayMember;
        // Remove any leading or trailing FWS
        $new_element = preg_replace("/^$FWS|$FWS\$/", '', $element);
        if ($warn && ($element !== $new_element)) $return_status = ISEMAIL_FWS; // FWS is unlikely in the real world
        $element = $new_element;
        // version 2.3: Warning condition added
        $elementLength = strlen($element);

        if ($elementLength === 0) if ($diagnose) return ISEMAIL_ZEROLENGTHELEMENT; else return false; // Can't have empty element (consecutive dots or dots at the start or end)
        // revision 1.15: Speed up the test and get rid of "uninitialized string offset" notices from PHP

        // We need to remove any valid comments (i.e. those at the start or end of the element)
        if ($element[0] === '(') {
            if ($warn) $return_status = ISEMAIL_COMMENTS; // Comments are unlikely in the real world
            // version 2.0: Warning condition added
            $indexBrace = strpos($element, ')');
            if ($indexBrace !== false) {
                if (preg_match('/(?<!\\\\)[\\(\\)]/', substr($element, 1, $indexBrace - 1)) > 0)
                    if ($diagnose) return ISEMAIL_BADCOMMENT_START; else return false; // Illegal characters in comment
                $element = substr($element, $indexBrace + 1, $elementLength - $indexBrace - 1);
                $elementLength = strlen($element);
            }
        }

        // The length guard avoids a string-offset notice when the whole element was a comment
        if ($elementLength > 0 && $element[$elementLength - 1] === ')') {
            if ($warn) $return_status = ISEMAIL_COMMENTS; // Comments are unlikely in the real world
            // version 2.0: Warning condition added
            $indexBrace = strrpos($element, '(');
            if ($indexBrace !== false) {
                if (preg_match('/(?<!\\\\)(?:[\\(\\)])/', substr($element, $indexBrace + 1, $elementLength - $indexBrace - 2)) > 0)
                    if ($diagnose) return ISEMAIL_BADCOMMENT_END; else return false; // Illegal characters in comment
                $element = substr($element, 0, $indexBrace);
                $elementLength = strlen($element);
            }
        }

        // Remove any remaining leading or trailing FWS around the element (having removed any comments)
        $new_element = preg_replace("/^$FWS|$FWS\$/", '', $element);
        if ($warn && ($element !== $new_element)) $return_status = ISEMAIL_FWS; // FWS is unlikely in the real world
        $element = $new_element;
        // version 2.0: Warning condition added

        // What's left counts towards the maximum length for this part
        if ($partLength > 0) $partLength++; // for the dot
        $partLength += strlen($element);

        // Each dot-delimited component can be an atom or a quoted string
        // (because of the obs-local-part provision)
        if (preg_match('/^"(?:.)*"$/s', $element) > 0) {
            // Quoted-string tests:
            if ($warn) $return_status = ISEMAIL_QUOTEDSTRING; // Quoted string is unlikely in the real world
            // version 2.0: Warning condition added
            // Remove any FWS
            $element = preg_replace("/(?<!\\\\)$FWS/", '', $element); // A warning condition, but we've already raised ISEMAIL_QUOTEDSTRING
            // My regex skillz aren't up to distinguishing between \" \\" \\\" \\\\" etc.
            // So remove all \\ from the string first...
            $element = preg_replace('/\\\\\\\\/', ' ', $element);
            if (preg_match('/(?<!\\\\|^)["\\r\\n\\x00](?!$)|\\\\"$|""/', $element) > 0) if ($diagnose) return ISEMAIL_UNESCAPEDDELIM; else return false; // ", CR, LF and NUL must be escaped
            // version 2.0: allow ""@example.com because it's technically valid
        } else {
            // Unquoted string tests:
            //
            // Period (".") may...appear, but may not be used to start or end the
            // local part, nor may two or more consecutive periods appear.
            // (http://tools.ietf.org/html/rfc3696#section-3)
            //
            // A zero-length element implies a period at the beginning or end of the
            // local part, or two periods together. Either way it's not allowed.
            if ($element === '') if ($diagnose) return ISEMAIL_EMPTYELEMENT; else return false; // Dots in wrong place

            // Any ASCII graphic (printing) character other than the
            // at-sign ("@"), backslash, double quote, comma, or square brackets may
            // appear without quoting. If any of that list of excluded characters
            // are to appear, they must be quoted
            // (http://tools.ietf.org/html/rfc3696#section-3)
            //
            // Any excluded characters? i.e. 0x00-0x20, (, ), <, >, [, ], :, ;, @, \, comma, period, "
            if (preg_match('/[\\x00-\\x20\\(\\)<>\\[\\]:;@\\\\,\\."]/', $element) > 0) if ($diagnose) return ISEMAIL_UNESCAPEDSPECIAL; else return false; // These characters must be in a quoted string
            if ($warn && (preg_match('/^\\w+/', $element) === 0)) $return_status = ISEMAIL_UNLIKELYINITIAL; // First character is an odd one
        }
    }

    if ($partLength > 64) if ($diagnose) return ISEMAIL_LOCALTOOLONG; else return false; // Local part must be 64 characters or less

    // Now let's check the domain part...

    // The domain name can also be replaced by an IP address in square brackets
    // (http://tools.ietf.org/html/rfc3696#section-3)
    // (http://tools.ietf.org/html/rfc5321#section-4.1.3)
    // (http://tools.ietf.org/html/rfc4291#section-2.2)
    if (preg_match('/^\\[(.)+]$/', $domain) === 1) {
        // It's an address-literal
        if ($warn) $return_status = ISEMAIL_ADDRESSLITERAL; // Address literal is unlikely in the real world
        // version 2.0: Warning condition added
        $addressLiteral = substr($domain, 1, strlen($domain) - 2);
        $groupMax = 8;
        // revision 2.1: new IPv6 testing strategy
        $matchesIP = array();
        $colon = ':'; // Revision 2.7: Daniel Marschall's new IPv6 testing strategy
        $double_colon = '::';

        // Extract IPv4 part from the end of the address-literal (if there is one)
        if (preg_match('/\\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/', $addressLiteral, $matchesIP) > 0) {
            $index = strrpos($addressLiteral, $matchesIP[0]);

            if ($index === 0) {
                // Nothing there except a valid IPv4 address, so...
                if ($diagnose) return $return_status; else return true;
                // version 2.0: return warning if one is set
            } else {
                //- // Assume it's an attempt at a mixed address (IPv6 + IPv4)
                //- if ($addressLiteral[$index - 1] !== $colon) if ($diagnose) return ISEMAIL_IPV4BADPREFIX; else return false; // Character preceding IPv4 address must be ':'
                // revision 2.1: new IPv6 testing strategy
                if (substr($addressLiteral, 0, 5) !== 'IPv6:') if ($diagnose) return ISEMAIL_IPV6BADPREFIXMIXED; else return false; // RFC5321 section 4.1.3
                //-
                //- $IPv6 = substr($addressLiteral, 5, ($index === 7) ? 2 : $index - 6);
                //- $groupMax = 6;
                // revision 2.1: new IPv6 testing strategy
                $IPv6 = substr($addressLiteral, 5, $index - 5) . '0000:0000'; // Convert IPv4 part to IPv6 format
            }
        } else {
            // It must be an attempt at pure IPv6
            if (substr($addressLiteral, 0, 5) !== 'IPv6:') if ($diagnose) return ISEMAIL_IPV6BADPREFIX; else return false; // RFC5321 section 4.1.3
            $IPv6 = substr($addressLiteral, 5);
            //- $groupMax = 8;
            // revision 2.1: new IPv6 testing strategy
        }

        $matchesIP = explode($colon, $IPv6); // Revision 2.7: Daniel Marschall's new IPv6 testing strategy
        $groupCount = count($matchesIP);
        $index = strpos($IPv6,$double_colon);

        if ($index === false) {
            // We need exactly the right number of groups
            if ($groupCount !== $groupMax) if ($diagnose) return ISEMAIL_IPV6GROUPCOUNT; else return false; // RFC5321 section 4.1.3
        } else {
            if ($index !== strrpos($IPv6,$double_colon)) if ($diagnose) return ISEMAIL_IPV6DOUBLEDOUBLECOLON; else return false; // More than one '::'
            if ($index === 0 || $index === (strlen($IPv6) - 2)) $groupMax++; // RFC 4291 allows :: at the start or end of an address with 7 other groups in addition
            if ($groupCount > $groupMax) if ($diagnose) return ISEMAIL_IPV6TOOMANYGROUPS; else return false; // Too many IPv6 groups in address
            if ($warn && ($groupCount === $groupMax)) $return_status = ISEMAIL_SINGLEGROUPELISION; // Eliding a single group with :: is deprecated by RFCs 5321 & 5952
        }

        // Check for single : at start and end of address
        // Revision 2.7: Daniel Marschall's new IPv6 testing strategy
        if ((substr($IPv6, 0, 1) === $colon) && (substr($IPv6, 1, 1) !== $colon)) if ($diagnose) return ISEMAIL_IPV6SINGLECOLONSTART; else return false; // Address starts with a single colon
        if ((substr($IPv6, -1) === $colon) && (substr($IPv6, -2, 1) !== $colon)) if ($diagnose) return ISEMAIL_IPV6SINGLECOLONEND; else return false; // Address ends with a single colon

        // Check for unmatched characters
        if (count(preg_grep('/^[0-9A-Fa-f]{0,4}$/', $matchesIP, PREG_GREP_INVERT)) !== 0) if ($diagnose) return ISEMAIL_IPV6BADCHAR; else return false; // Illegal characters in address
        // It's a valid IPv6 address, so...
        if ($diagnose) return $return_status; else return true;
        // revision 2.1: bug fix: now correctly return warning status
    } else {
        // It's a domain name...

        // The syntax of a legal Internet host name was specified in RFC-952
        // One aspect of host name syntax is hereby changed: the
        // restriction on the first character is relaxed to allow either a
        // letter or a digit.
        // (http://tools.ietf.org/html/rfc1123#section-2.1)
        //
        // NB RFC 1123 updates RFC 1035, but this is not currently apparent from reading RFC 1035.
        //
        // Most common applications, including email and the Web, will generally not
        // permit...escaped strings
        // (http://tools.ietf.org/html/rfc3696#section-2)
        //
        // the better strategy has now become to make the "at least one period" test,
        // to verify LDH conformance (including verification that the apparent TLD name
        // is not all-numeric)
        // (http://tools.ietf.org/html/rfc3696#section-2)
        //
        // Characters outside the set of alphabetic characters, digits, and hyphen MUST NOT appear in domain name
        // labels for SMTP clients or servers
        // (http://tools.ietf.org/html/rfc5321#section-4.1.2)
        //
        // RFC5321 precludes the use of a trailing dot in a domain name for SMTP purposes
        // (http://tools.ietf.org/html/rfc5321#section-4.1.2)
        $dotArray = preg_split('/\\.(?=(?:[^\\"]*\\"[^\\"]*\\")*(?![^\\"]*\\"))/m', $domain);
        $partLength = 0;
        $element = ''; // Since we use $element after the foreach loop let's make sure it has a value
        // revision 1.13: Line above added because PHPLint now checks for Definitely Assigned Variables

        if ($warn && (count($dotArray) === 1)) $return_status = ISEMAIL_TLD; // The mail host probably isn't a TLD
        // version 2.0: downgraded to a warning

        foreach ($dotArray as $arrayMember) {
            $element = (string) $arrayMember;
            // Remove any leading or trailing FWS
            $new_element = preg_replace("/^$FWS|$FWS\$/", '', $element);
            if ($warn && ($element !== $new_element)) $return_status = ISEMAIL_FWS; // FWS is unlikely in the real world
            $element = $new_element;
            // version 2.0: Warning condition added
            $elementLength = strlen($element);

            // Each dot-delimited component must be of type atext
            // A zero-length element implies a period at the beginning or end of the
            // domain, or two periods together. Either way it's not allowed.
            if ($elementLength === 0) if ($diagnose) return ISEMAIL_DOMAINEMPTYELEMENT; else return false; // Dots in wrong place
            // revision 1.15: Speed up the test and get rid of "uninitialized string offset" notices from PHP

            // Then we need to remove all valid comments (i.e. those at the start or end of the element)
            if ($element[0] === '(') {
                if ($warn) $return_status = ISEMAIL_COMMENTS; // Comments are unlikely in the real world
                // version 2.0: Warning condition added
                $indexBrace = strpos($element, ')');
                if ($indexBrace !== false) {
                    if (preg_match('/(?<!\\\\)[\\(\\)]/', substr($element, 1, $indexBrace - 1)) > 0)
                        if ($diagnose) return ISEMAIL_BADCOMMENT_START; else return false; // Illegal characters in comment
                    // revision 1.17: Fixed name of constant (also spotted by turboflash - thanks!)
                    $element = substr($element, $indexBrace + 1, $elementLength - $indexBrace - 1);
                    $elementLength = strlen($element);
                }
            }

            // The length guard avoids a string-offset notice when the whole element was a comment
            if ($elementLength > 0 && $element[$elementLength - 1] === ')') {
                if ($warn) $return_status = ISEMAIL_COMMENTS; // Comments are unlikely in the real world
                // version 2.0: Warning condition added
                $indexBrace = strrpos($element, '(');
                if ($indexBrace !== false) {
                    if (preg_match('/(?<!\\\\)(?:[\\(\\)])/', substr($element, $indexBrace + 1, $elementLength - $indexBrace - 2)) > 0)
                        if ($diagnose) return ISEMAIL_BADCOMMENT_END; else return false; // Illegal characters in comment
                    // revision 1.17: Fixed name of constant (also spotted by turboflash - thanks!)
                    $element = substr($element, 0, $indexBrace);
                    $elementLength = strlen($element);
                }
            }

            // Remove any leading or trailing FWS around the element (having removed any comments)
            $new_element = preg_replace("/^$FWS|$FWS\$/", '', $element);
            if ($warn && ($element !== $new_element)) $return_status = ISEMAIL_FWS; // FWS is unlikely in the real world
            $element = $new_element;
            // version 2.0: Warning condition added

            // What's left counts towards the maximum length for this part
            if ($partLength > 0) $partLength++; // for the dot
            $partLength += strlen($element);

            // The DNS defines domain name syntax very generally -- a
            // string of labels each containing up to 63 8-bit octets,
            // separated by dots, and with a maximum total of 255
            // octets.
            // (http://tools.ietf.org/html/rfc1123#section-6.1.3.5)
            if ($elementLength > 63) if ($diagnose) return ISEMAIL_DOMAINELEMENTTOOLONG; else return false; // Label must be 63 characters or less

            // Any ASCII graphic (printing) character other than the
            // at-sign ("@"), backslash, double quote, comma, or square brackets may
            // appear without quoting. If any of that list of excluded characters
            // are to appear, they must be quoted
            // (http://tools.ietf.org/html/rfc3696#section-3)
            //
            // If the hyphen is used, it is not permitted to appear at
            // either the beginning or end of a label.
            // (http://tools.ietf.org/html/rfc3696#section-2)
            //
            // Any excluded characters? i.e. 0x00-0x20, (, ), <, >, [, ], :, ;, @, \, comma, period, "
            if (preg_match('/[\\x00-\\x20\\(\\)<>\\[\\]:;@\\\\,\\."]|^-|-$/', $element) > 0) if ($diagnose) return ISEMAIL_DOMAINBADCHAR; else return false; // Illegal character in domain name
        }

        if ($partLength > 255) if ($diagnose) return ISEMAIL_DOMAINTOOLONG; else return false; // Domain part must be 255 characters or less (http://tools.ietf.org/html/rfc1123#section-6.1.3.5)

        if ($warn && (preg_match('/^[0-9]+$/', $element) > 0)) $return_status = ISEMAIL_TLDNUMERIC; // TLD probably isn't all-numeric (http://www.apps.ietf.org/rfc/rfc3696.html#sec-2)
        // version 2.0: Downgraded to a warning

        // Check DNS?
        if ($diagnose && ($return_status === ISEMAIL_VALID) && $checkDNS && function_exists('checkdnsrr')) {
            if (!(checkdnsrr($domain, 'A'))) $return_status = ISEMAIL_DOMAINNOTFOUND; // 'A' record for domain can't be found
            if (!(checkdnsrr($domain, 'MX'))) $return_status = ISEMAIL_MXNOTFOUND; // 'MX' record for domain can't be found
        }
    }

    // Eliminate all other factors, and the one which remains must be the truth.
    // (Sherlock Holmes, The Sign of Four)
    if ($diagnose) return $return_status; else return true;
    // version 2.0: return warning if one is set
}

$email = 'dominic@sayers.cc';

echo "Testing $email<br/>";
echo "$email is " . ((is_email($email)) ? '' : 'not ') . 'a valid email address';
?>
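
The comments in the listing say that errors can be told apart from warnings by comparing the return value with ISEMAIL_ERROR. A minimal sketch of that comparison follows. The helper name `isemail_classify` is hypothetical and not part of is_email. It hard-codes the two threshold values from the listing (64 and 128) so that it runs on its own and does not interfere with the listing's `defined('ISEMAIL_VALID')` guard.

```php
<?php
/**
 * Hypothetical helper (not part of is_email): classify a return value
 * using the thresholds shown in the listing.
 */
function isemail_classify($status) {
    $warningBase = 64;  // value of ISEMAIL_WARNING in the listing
    $errorBase   = 128; // value of ISEMAIL_ERROR in the listing

    // With $errorlevel = false, is_email() returns a plain boolean
    if (is_bool($status)) return $status ? 'valid' : 'invalid';

    if ($status > $errorBase)   return 'error';   // 129 and up: invalid address
    if ($status > $warningBase) return 'warning'; // 65 to 127: valid but unlikely
    return 'valid';                               // 0 is ISEMAIL_VALID
}

echo isemail_classify(0), "\n";   // valid   (ISEMAIL_VALID)
echo isemail_classify(65), "\n";  // warning (ISEMAIL_TLD)
echo isemail_classify(130), "\n"; // error   (ISEMAIL_NOAT)
```

In practice the value passed in would come from a call such as `is_email($email, false, E_WARNING)`.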

URL: http://www.dominicsayers.com/isemail/
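
The listing has to distinguish "first.last" (one quoted element) from "first"."last" (two elements). It does this by splitting only on dots that are followed by an even number of double quotes. The sketch below isolates that one step, reusing the regular expression from the listing. The wrapper name `split_on_unquoted_dots` is an assumption made for illustration.

```php
<?php
/**
 * Hypothetical wrapper around the preg_split() call used in the listing:
 * split on dots that are not inside a double-quoted string.
 */
function split_on_unquoted_dots($part) {
    // A dot qualifies only if the rest of the string contains an even
    // number of double quotes, i.e. the dot sits outside any quoted run.
    return preg_split('/\\.(?=(?:[^\\"]*\\"[^\\"]*\\")*(?![^\\"]*\\"))/m', $part);
}

foreach (array('first.last', '"first.last"', '"first"."last"') as $localPart) {
    echo $localPart, ' => ', count(split_on_unquoted_dots($localPart)), " element(s)\n";
}
// first.last => 2 element(s)
// "first.last" => 1 element(s)
// "first"."last" => 2 element(s)
```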

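For IPv6 address literals the listing counts colon-separated groups and allows one extra group when "::" sits at the very start or end of the address. The sketch below reproduces only that counting rule, under the assumption that the "IPv6:" prefix has already been stripped. The function name `ipv6_group_count_ok` is hypothetical. It does not check hex digits or stray single colons, which the listing tests separately.

```php
<?php
/**
 * Hypothetical helper mirroring the group-count logic in the listing.
 * Returns true when the number of colon-separated groups is acceptable.
 */
function ipv6_group_count_ok($ipv6) {
    $groupMax   = 8;
    $groupCount = count(explode(':', $ipv6));
    $index      = strpos($ipv6, '::');

    // No elision: exactly 8 groups are required
    if ($index === false) return $groupCount === $groupMax;

    // More than one '::' is never allowed
    if ($index !== strrpos($ipv6, '::')) return false;

    // '::' at the start or end produces one extra empty group when exploded
    if ($index === 0 || $index === strlen($ipv6) - 2) $groupMax++;

    return $groupCount <= $groupMax;
}

echo ipv6_group_count_ok('1:2:3:4:5:6:7:8') ? 'ok' : 'bad', "\n"; // ok
echo ipv6_group_count_ok('::1') ? 'ok' : 'bad', "\n";             // ok
echo ipv6_group_count_ok('1::2::3') ? 'ok' : 'bad', "\n";         // bad
echo ipv6_group_count_ok('1:2:3') ? 'ok' : 'bad', "\n";           // bad
```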