Re: [NTLK] [OT]: HTML to text conversion guide

From: Toby Hutton (tobyhutton_at_mac.com)
Date: Wed Aug 24 2005 - 15:44:28 PDT


On 25/08/2005, at 5:21 AM, Ian Iverson wrote:

> A while back the list was discussing how ecartis
> converts HTML mail to text and what some of the
> symbols meant. As I have plenty of free time on my
> hands at the moment, I have decided to make a handy
> guide for interested parties which may be used for
> conversion and/or programming purposes. This guide is
> not perfect or complete. Some people may not be able
> to see the characters and they MAY change with
> different fonts.
>
> (space) = %20
> < = %3c

[etc]

These %xx sequences are simply the ASCII code of the character,
written in hexadecimal. Characters *usually* get encoded this way
because they're reserved for markup (especially '<' and '>').
Each code is therefore unique, and you can't have different characters
represented by the same number (i.e. you seem to list %20 a couple of
times, whereas it should only represent <space>; <tilde> should be
%7E. '%?20' seems a bit weird though...)
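
To illustrate the point, here is a minimal Python sketch (not part of
the original guide; the helper name percent_encode_ascii is made up
for this example). It builds the %xx form straight from a character's
ASCII code and uses the standard library's urllib.parse functions as a
cross-check.

```python
from urllib.parse import quote, unquote


def percent_encode_ascii(ch):
    """Return the %XX escape for one ASCII character (hypothetical helper)."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not an ASCII character: %r" % ch)
    # Two upper-case hex digits, e.g. 0x20 -> "%20"
    return "%{:02X}".format(code)


# Each character maps to exactly one code, so no two share a number.
for ch in " <>~":
    print(repr(ch), "=", percent_encode_ascii(ch))

# Decoding accepts lower-case hex digits too, so %3c and %3C are the same.
print(unquote("%3c") == unquote("%3C"))

# The standard library agrees on the code for a space.
print(quote(" ") == percent_encode_ascii(" "))
```

Since the code is just the character's number in hex, the mapping is
one-to-one by construction.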

The accented characters aren't part of ASCII, though, so they don't
have a single corresponding number in this fashion; any code for them
depends on the character encoding in use.
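
A small Python sketch of why an accented character has no single
number of this kind: it falls outside ASCII's 0-127 range, so any %xx
form depends on which character encoding is assumed. The character and
the two encodings below are only examples, not taken from the original
guide.

```python
from urllib.parse import quote

ch = "\u00e9"  # e with acute accent, chosen purely as an example

# Outside the 7-bit ASCII range, so there is no ASCII code to write in hex.
print(ord(ch) > 0x7F)

# The escape differs depending on the encoding assumed for the bytes.
latin1_form = quote(ch, encoding="latin-1")
utf8_form = quote(ch, encoding="utf-8")
print(latin1_form)
print(utf8_form)
```

In Latin-1 the character is one byte and so one %xx group; in UTF-8 it
is two bytes and so two groups.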

Toby.

-- 
This is the NewtonTalk list - http://www.newtontalk.net/ for all inquiries
Official Newton FAQ: http://www.chuma.org/newton/faq/
WikiWikiNewt for all kinds of articles: http://tools.unna.org/wikiwikinewt/


This archive was generated by hypermail 2.1.5 : Wed Aug 24 2005 - 19:00:02 PDT