Re: [NTLK] [OT]: HTML to text conversion guide

From: Puckdropper (puckdropper_at_yahoo.com)
Date: Wed Aug 24 2005 - 18:52:53 PDT


PHP will do that with a call to just a single
function. I'd have to look it up in the manual (I
haven't used it), but I know it can.

What I think might be useful is an HTML processor that
converts HTML-formatted data to its USENET
representations, such as <i>text</i> to /text/ or
<u>text</u> to _text_. Someone has probably already
written this program, so if you want it, search for it.
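
The conversion described above can be sketched in a few lines of Python.
This is only a rough illustration, not an existing tool: the names
MARKERS, UsenetMarkup and html_to_usenet are made up for this example,
and the <b> to *text* mapping is an assumption added alongside the two
mappings mentioned above. Any tag not in the table is simply dropped.

```python
from html.parser import HTMLParser

# Tag-to-marker table. <i> and <u> are the mappings from the message;
# <b> -> *text* is an assumed extra for illustration.
MARKERS = {"i": "/", "u": "_", "b": "*"}


class UsenetMarkup(HTMLParser):
    """Collects text, replacing known tags with plain-text markers."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Unknown tags contribute nothing to the output.
        self.parts.append(MARKERS.get(tag, ""))

    def handle_endtag(self, tag):
        # The same marker opens and closes a span, e.g. /text/.
        self.parts.append(MARKERS.get(tag, ""))

    def handle_data(self, data):
        self.parts.append(data)


def html_to_usenet(html):
    parser = UsenetMarkup()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


print(html_to_usenet("<i>text</i> and <u>text</u>"))
```

Using the standard-library parser rather than regular expressions means
entities such as &lt; are decoded along the way, but nested or malformed
markup would still need more care than this sketch gives it.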
 

Puckdropper

> These %xx characters are simply the ASCII code of the character
> written in hexadecimal. They get encoded *usually* because
> they're reserved for markup (especially '<' and '>'). Therefore
> they're unique and you couldn't have different characters represented
> by the same number (i.e. you seem to list %20 a couple of times
> whereas it should represent <space>; <tilde> should be %7E. '%?20'
> seems a bit weird though...)
>
> The accented characters aren't represented by ASCII though and
> don't/can't have a corresponding number in this fashion.
>
> Toby.
>
> --
> This is the NewtonTalk list - http://www.newtontalk.net/ for all inquiries
> Official Newton FAQ: http://www.chuma.org/newton/faq/
> WikiWikiNewt for all kinds of articles: http://tools.unna.org/wikiwikinewt/
>
>
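
The %xx scheme described in the quoted message is easy to check. The
following is a small Python sketch, not part of any tool mentioned in
this thread; percent_code is a hypothetical helper written for this
example, while unquote is the standard-library decoder.

```python
from urllib.parse import unquote


def percent_code(ch):
    """Return the %XX escape for one ASCII character."""
    code = ord(ch)
    if code > 0x7F:
        # This helper only covers the ASCII range discussed above.
        raise ValueError("not an ASCII character: %r" % ch)
    return "%{:02X}".format(code)


print(percent_code(" "))      # %20
print(percent_code("~"))      # %7E
print(percent_code("<"))      # %3C
print(unquote("a%20b%7Ec"))   # a b~c
```

Each escape is just the character's code in two hexadecimal digits, so
a given %xx value can only ever stand for one ASCII character.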

                
 



This archive was generated by hypermail 2.1.5 : Thu Aug 25 2005 - 11:00:02 PDT