I have no interest in parsing a list of URLs from a given string of text (even though some of the regexes on this page are capable of doing that). /(((http|ftp|https):\/) (([0-9a-z_-] \.) (aero|asia|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|tel|travel|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cu|cv|cx|cy|cz|cz|de|dj|dk|dm|do|dz|ec|ee|eg|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mn|mn|mo|mp|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|nom|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ra|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|sj|sk|sl|sm|sn|so|sr|st|su|sv|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw|arpa)(:[0-9] )? I also don’t want to allow every possible technically valid URL — quite the opposite.
Also, single weird leading and/or trailing characters aren’t tested for. Michael's regex considers [email protected] a valid address * which conflicts with section 2.3.5 of RFC 5321 which states that: * * Only resolvable, fully-qualified domain names (FQDNs) are permitted * when domain names are used in SMTP.In other words, names that can * be resolved to MX RRs or address (i.e., A or AAAA) RRs (as discussed * in Section 5) are permitted, as are CNAME RRs whose targets can be * resolved, in turn, to MX or address [email protected](16) "[email protected]" This feature is only available for PHP Versions (PHP 5 Rejection of so-called partial domains because of "missing" dot is not following section 2.3.5 of RFC 5321.It says FQDNs are permitted, and com, org, or va are (well, may be) valids FQDNs. Some TDLs (although few of them) have MX RRs, the for example "[email protected]" is correct.