Summary
=======

This document is primarily for people who will be using Lynx on a remote UNIX or VMS system via an MS-DOS based terminal program.

General Information
===================

Lynx comes with built-in translation tables that map the 8-bit character codes or character entities coming in from an HTML document to their equivalent codes, where possible, in various character sets.

IMPORTANT: choose the display character set in the Lynx Options Menu according to the font installed locally. It will probably be one of the cpXXX entries. Please contact the lynx-dev mailing list if you need a codepage that is not listed there.

Note that all points of the connection between the display at your end and Lynx at the remote end must be 8-bit clean. If the high bit is being stripped at any point in between, the only character set you can (effectively) use in Lynx will be "7 bit approximations". More on that later.

MS-DOS character set weirdness
==============================

MS-DOS uses a bass-ackwards character set in which half the normal characters have been replaced by pseudo-graphic line and box-drawing characters, and in which almost all of the international characters are mapped to nonstandard numbers. It also contains Greek letters.

Further confusing matters, there is more than one MS-DOS character set. The character sets are referred to as "codepages," each of which has a unique number. IBM PCs and compatibles come with one hardware-based default codepage and a keyboard to match. In the US market the hardware codepage is 437. PCs destined for other regions of the world often have a different default codepage which contains characters for other languages and keyboards. Under MS-DOS, one can load a different codepage into memory and use it instead of the hardware default.

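To make the "nonstandard numbers" concrete, here is a minimal sketch that emulates the kind of mapping described above using Python's built-in cp437 and cp850 codecs. It is not Lynx's own translation code, and the helper name to_codepage is hypothetical, introduced only for illustration.

```python
# Hypothetical helper, not part of Lynx: emulate an ISO-8859-1 -> DOS codepage
# translation with Python's standard codecs.
def to_codepage(text: str, codepage: str) -> bytes:
    """Encode text for a DOS codepage, substituting '?' where no code exists."""
    return text.encode(codepage, errors="replace")


if __name__ == "__main__":
    for ch in ("é", "ñ", "Ã"):
        latin1 = ch.encode("latin-1")[0]
        for cp in ("cp437", "cp850"):
            dos = to_codepage(ch, cp)[0]
            print(f"{ch!r}: ISO-8859-1 0x{latin1:02X} -> {cp} 0x{dos:02X}")

    # What an untranslated ISO-8859-1 byte looks like with a cp437 font.
    print(repr(bytes([0xE9]).decode("cp437")))
```

With these codecs, "é" is 0xE9 in ISO-8859-1 but 0x82 in both codepages, "Ã" has a code in cp850 (0xC7) but none in cp437, and the raw byte 0xE9 shown with a cp437 font comes out as a Greek capital theta. This is why the display character set chosen in Lynx has to match the codepage actually loaded on the PC.
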
If you are using Lynx through an MS-DOS based terminal program or telnet client, select an appropriate DOS codepage in Lynx; you do not need any translation within the terminal program. (This differs from the old-style behavior and works better because Lynx's translation support is superior.)

Check your display by accessing Martin Ramsch's ISO-8859-1 table (iso8859-1.html in the Lynx distribution's test subdirectory). Ramsch's table describes each entity and shows an example of each. It should be immediately obvious whether or not you are seeing what you are supposed to see.

If you see box and line-drawing characters and mismatched letters and so on, you are likely displaying 7 bit data, not 8. Ensure that all points of your connection are 8-bit clean:

On any remote UNIX systems you must pass through, do 'stty cs8 -istrip' or 'stty pass8'. 'stty -a' should list your settings.

On any remote VMS systems, do 'set terminal /eightbit'.

Make sure your terminal program or telnet client is not filtering 8-bit data. You may find a choice between "VT-100 strict" and "VT-100 relaxed" emulation modes; use relaxed.

Note: Procomm for DOS has a confusing "Use 7 bit or 8 bit ANSI" setting -- this has to do with ANSI sequences. If it is set to 8 bit, some 8-bit character sequences, including those passed by Lynx as well as those intended for your terminal type (vt100, etc.), will be processed by Procomm as ANSI screen control codes and will most likely result in a garbled display. Set it to 7 bit.

If you are going through a dialup terminal server, you may have to set the terminal server itself to pass 8 bit data. How to do this varies with the make of the server, and in some cases only a system admin in charge of the box will have the authorization to do it.

SLIP or PPP connections should already be 8-bit clean.
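
To see why a single hop that is not 8-bit clean is enough to spoil the display, the following minimal sketch simulates a link that clears the high bit of every byte. The function names strip_high_bit and is_seven_bit are hypothetical and are not part of Lynx or of any terminal program.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Simulate a link that is not 8-bit clean by clearing bit 7 of each byte."""
    return bytes(b & 0x7F for b in data)


def is_seven_bit(data: bytes) -> bool:
    """Return True if no byte in data has its high bit set."""
    return all(b < 0x80 for b in data)


if __name__ == "__main__":
    sent = "café".encode("cp437")      # what Lynx would send for a cp437 display
    received = strip_high_bit(sent)    # what arrives after a 7-bit hop
    print(sent, is_seven_bit(sent))
    print(received, is_seven_bit(received))
```

In this example the cp437 code for "é" (0x82) arrives as the control character 0x02, so the letter is lost entirely. If received data that should contain accented letters always passes the is_seven_bit check, some point on the connection is probably stripping the high bit.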