
Revision: 40772
at February 20, 2011 05:33 by jm1248


Updated Code
<html>
<head>
 <title>Htmlify</title>

<script type="text/javascript">

function htmlview($text)
{
///////////// Show and tell //////////////
 var htmlversion = htmlify($text); // Convert the Test Text to HTML
 var raw = document.getElementById('raw'); // Look up the Raw HTML box explicitly
 raw.value = htmlversion; // Show the raw HTML from this pass
 document.getElementById('Browser_View').innerHTML = htmlversion; // Show as a browser would see it.
}


function htmlify($text)
{
  var tlnk = new Array; //Create an array to hold the potential links
  var hlnk = new Array; //Create an array to hold the HTML translation

 // First, translate special characters to HTML
  $text = spchrs2html($text);

 // Loop through the clear text 
  var i = 0;
  var m;
  for (i=0;i<20;i++) // Set ;i<20; to a reasonable limit here
  {
  // Get a potential link and mark where it came from
   m = $text.match(/(\S+\.\S+)/); // look for dots that are surrounded by non-whitespace characters
   if (!m) break; // No more potential links - stop looping
   tlnk[i] = m[1];
   $text = $text.replace(m[1],"<"+i+">");
  } // EOLoop
  var ac = i; // Number of potential links found
 // Loop through the array of potential links and make replacements
  for (i=0;i<ac;i++)
  {
   // If this is a number, (e.g. 6.4sec; $5.00 etc.) OR too short; restore original and skip it
   // (function replacers keep "$" sequences in the text from being read as replacement patterns)
   if (tlnk[i].search(/\d\.\d/)>-1 || tlnk[i].length <5) // Search for digit.digit OR len < 5 in this potential link
   {
    $text = $text.replace("<"+i+">", function () { return tlnk[i]; });
   }
   else
   {
    // Make this URL into a real link - move brackets and punctuation outside of the anchor tag
    var htm = makelink(tlnk[i]);
    $text = $text.replace("<"+i+">", function () { return htm; });
   }
  }

 // Now put the breaks on
  $text = $text.replace(/\n/g,"<br/>");
 // And deal with multiple spaces
  $text = $text.replace(/\ \ /g," &nbsp;");
 // And any other specials
  $text = $text.replace(/"/g,"&quot;");
  $text = $text.replace(/\$/g,"&#36;");

  return $text;
}

function makelink(txt) // Make a real link from this potential link
{
   txt = html2spchrs(txt); // Undo any html special characters in this link
   var i = 0;
   var ch, pN, prea, posta, turl; // Keep the working variables local

 // Clean the front end
   pN = txt.length-1;
   for (i=0;i<pN;i++)
   {
    ch = txt.substr(i,1); // Look at each character
    if (ch.search(/\w/)>-1) break; // Stop looping when a word char is found
   }
   prea = txt.substring(0,i); // Copy the pre anchor stuff
   prea = spchrs2html(prea); // Redo any html special characters in the pre anchor stuff
   txt = txt.substr(i); // Trim the preamble from the link

 // Clean the trailing end
   for (i=txt.length-1;i>0;i--) // Start from the end of the trimmed link
   {
    ch = txt.substr(i,1); // Look back at each character
    if (ch.search(/[\w\-\/]/)>-1) break; // Loop until a legal trailing char is found
   }
   posta = txt.substring(i+1); // Copy the post anchor stuff
   posta = spchrs2html(posta); // Redo any html special characters in the post anchor stuff

   turl = txt.substring(0,i+1); // and detach it from the rest - this is the legit URL

 // Re-escape the URL so special characters cannot break out of the anchor tag
  var safe = spchrs2html(turl).replace(/'/g,"&#39;");
  var tlnk;

 // If the URL is an email address, link as a mailto:
  if (turl.search(/@/)>0)
  {
   tlnk = "<a href='mailto:"+safe+"'>"+safe+"</a>";
   return prea+tlnk+posta;
  }
 // Not a mailto, treat as a document URL
  var hurl = "";
  if (turl.search(/\w+:\/\//)<0) hurl = "http://"; // Add http:// if no xxxx:// already there
  tlnk = "<a href='"+hurl+safe+"'>"+safe+"</a>";
  return prea+tlnk+posta;
}

function spchrs2html(str)
{
  str = str.replace (/&/g, "&amp;");
  str = str.replace (/</g, "&lt;"); // Convert angle brackets to HTML codes in string
  str = str.replace (/>/g, "&gt;");
  return str;
}

function html2spchrs(str)
{
  str = str.replace (/&lt;/g, "<"); // Undo any angle bracket codes in this link
  str = str.replace (/&gt;/g, ">");
  str = str.replace (/&amp;/g, "&");
  return str;
}

</script>
</head>

<body onload='htmlview(test.value)'>

///////////// Show and tell //////////////<br/>
Test Text
<br/>
<textarea id='test' rows='5' cols='60'>
Try this: (http://yoursite.com.) if not, then 
ask Jimmy B <[email protected]?subject=MyProblem> 
the FTP link is ftp://mysite.com/dofuss/.  Did you
see this technews.org/news/wires.html?ref=wett&skin=rat%20fur?

</textarea>
<br/><br/>

Raw HTML
<br/>
<textarea id='raw' rows='5' cols='90'>
</textarea>
<br/><br/>

<b>Browser View</b>
<br/>
<div id='Browser_View'>
</div>

<br/>
<h2 style= 'margin-left: 2%; width: 50%; font-family: Verdana; font-size: 1.2em'>Notes:</h2>
<p style='margin-left: 4%; width: 50%; font-family: Verdana; font-size: 0.9em'>
Use your browser's View Source function, copy all and paste into a text editor.
Save it as <b>htmlify_reference.html</b> and then as htmlify-1.html for your working copy.
JavaScript can be maddening - a slight mistake anywhere can result in nothing working.
Make sure you have a working version to use as a return point.
<br/><br/>
By all means improve on this project but please post your improvements in this thread.
<br/><br/>
There is no restriction on any legitimate use of this material.  Feel free to spread it
around.
<br/><br/>
John<br/>
</p>

</body>
</html>

Revision: 40771
at February 8, 2011 01:09 by jm1248


Initial Code
<html>
<head>

<script>

function clickify_links($text)
{
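 return $text.replace(/((https?:\/\/)?([-\w]+\.[-\w\.]+)+\w(:\d+)?(\/([-\w\_\.]*(\?\S+)?)?)*)/gim, function (url, whole, scheme) {
  // Add http:// only when the match does not already start with http:// or https://
  return "<a href='"+(scheme ? "" : "http://")+url+"'>"+url+"</a>";
 });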
}

function test()
{
 var $text = "your-site,net http://yoursite.com. bb dd mysite.org! ...<br/>";
 $text = $text+"yoursite.com .yoursite.com! https://yoursite.com: yoursite.com? <br/>";
 $text = $text+"yoursite.subsite.com/dir/blurb.html?name=billy-jean. <= captures last punctuation mark <br/>";
 var res = clickify_links($text);
 document.write(res);
}

</script>
</head>

<body onload='test()'>

</body>
</html>

Initial URL


Initial Description
Hopefully this covers just about everything. This code has far more lines than htmlify.js but seems to work OK.

(I just remembered that " // Loop through the clear text " has no loop breaker. I will post a fix, but it's not a show-stopper for most applications.)
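
One possible way to avoid needing a loop limit at all is to let a single global replace collect the candidates. This is only a sketch: the name markCandidates and its return shape are made up for illustration and are not part of the posted code.

```javascript
// Hypothetical rework of the candidate loop; markCandidates is an invented name.
function markCandidates(text)
{
  var found = []; // Candidate links in order of appearance
  // The g flag visits every match once, so no counter limit or break is needed
  var marked = text.replace(/\S+\.\S+/g, function (candidate) {
    found.push(candidate);
    return "<" + (found.length - 1) + ">"; // Placeholder such as <0>, <1>, ...
  });
  return { text: marked, links: found };
}

var result = markCandidates("see yoursite.com and mysite.org now");
// result.text is "see <0> and <1> now"
// result.links is ["yoursite.com", "mysite.org"]
```

The placeholders stay unambiguous only if angle brackets in the input have already been converted to HTML codes, as spchrs2html does.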

As far as I can tell:

brackets are dealt with properly
punctuation is moved away from the link.
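
To make the two points above concrete, here is a rough sketch of the splitting idea on its own. splitLink is a hypothetical helper written for this note, not a function from the script, and it assumes that a link may end only in a word character, a hyphen, or a slash.

```javascript
// Hypothetical helper, not part of the snippet: split a candidate token
// into leading punctuation, the link itself, and trailing punctuation.
function splitLink(token)
{
  // Leading run of non-word characters, e.g. "(" or "<"
  var start = token.search(/\w/);
  if (start < 0) return { pre: token, url: "", post: "" }; // No word characters at all

  // Walk back from the end to the last character that may legally end a URL
  var end = token.length - 1;
  while (end > start && !/[\w\-\/]/.test(token.charAt(end))) end--;

  return {
    pre: token.substring(0, start),
    url: token.substring(start, end + 1),
    post: token.substring(end + 1)
  };
}

var parts = splitLink("(http://yoursite.com.)");
// parts.pre is "(", parts.url is "http://yoursite.com", parts.post is ".)"
```

Only parts.url would go inside the anchor tag; parts.pre and parts.post are written back around it, which is how the bracket and the full stop end up outside the link.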

This script is actually an HTML document that contains test
patterns with raw and browser-view outputs.

Comments are welcome.

Initial Title
Text to HTML in JavaScript

Initial Tags
url, links

Initial Language
JavaScript