Wireshark mailing list archives

Re: UTF8 vs. locale in error messages (bug 5715)


From: Guy Harris <guy () alum mit edu>
Date: Tue, 28 Jun 2011 11:36:44 -0700


On Jun 28, 2011, at 10:27 AM, Guy Harris wrote:

We have an issue regarding strings in packets in general.  Strings might be in a number of encodings, including ASCII
(meaning that any byte with the 8th bit set is something that shouldn't be there), other national variants of ISO 646,
UTF-8, UTF-16, UCS-2 (meaning "only the Basic Multilingual Plane, with no surrogate pairs"), ISO 8859/x for various
values of x, various ISO 2022-based encodings (e.g., the EUC encodings), various national standards, various DOS and
Windows code pages, various Mac OS encodings, EBCDIC, whatever encodings are used for SMS, etc., etc., etc., etc.:

      http://en.wikipedia.org/wiki/Template:Character_encoding
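
To make the point concrete, here is a rough Python sketch (not Wireshark code) of what happens when the same packet bytes are interpreted under several of the encodings listed above. The helper names and the sample bytes are made up for illustration; the codec names are the ones Python's standard library happens to use.

```python
# Illustrative sketch only: helper names and sample bytes are hypothetical.

def is_strict_ascii(data: bytes) -> bool:
    """ASCII is 7-bit, so any byte with the 8th bit set should not be there."""
    return all(b < 0x80 for b in data)


def is_valid_ucs2(text: str) -> bool:
    """UCS-2 covers only the Basic Multilingual Plane: nothing above U+FFFF,
    and no surrogate code points (U+D800..U+DFFF)."""
    return all(
        ord(ch) <= 0xFFFF and not (0xD800 <= ord(ch) <= 0xDFFF)
        for ch in text
    )


def decode_candidates(data: bytes, encodings):
    """Try each encoding; map its name to the decoded text, or None on failure."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


if __name__ == "__main__":
    raw = b"caf\xe9"  # "cafe" with an accented e, as ISO 8859-1 would send it
    print("strict ASCII?", is_strict_ascii(raw))
    for name, text in decode_candidates(
        raw, ["ascii", "utf-8", "latin-1", "cp437", "mac_roman", "cp037"]
    ).items():
        print(f"{name:10} -> {text!r}")
```

The same four bytes are rejected outright as ASCII and as UTF-8, decode as intended under ISO 8859-1, and decode "successfully" but to different text under a DOS code page, a Mac OS encoding, or EBCDIC. Trial decoding alone cannot tell you which encoding the sender meant, which is why the dissector has to know it from the protocol.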

As long as I'm piling up a ton of information about humanity's twisty little maze of character encodings, all different:

SMS:

        https://secure.wikimedia.org/wikipedia/en/wiki/GSM_03.38

