Wireshark mailing list archives

Re: tvb_get_string_enc() doesn't always return valid UTF-8


From: Evan Huus <eapache () gmail com>
Date: Sun, 26 Jan 2014 17:32:39 -0500

On Sun, Jan 26, 2014 at 4:53 PM, Guy Harris <guy () alum mit edu> wrote:

On Jan 21, 2014, at 5:01 AM, Evan Huus <eapache () gmail com> wrote:

On Tue, Jan 21, 2014 at 2:40 AM, Guy Harris <guy () alum mit edu> wrote:

On Jan 20, 2014, at 5:59 PM, Evan Huus <eapache () gmail com> wrote:

In which case is dumb search-and-replace of tvb_get_string with
tvb_get_string_enc and ENC_ASCII an easy way to make (part of) the API
transition?

Did somebody say that had been done or suggest that it be done?

I thought it was kind of implied when you wrote "We should also
probably audit all calls to tvb_get_string() and tvb_get_stringz() in
dissectors and change them to tvb_get_string_enc() with the
appropriate encoding."

No, it isn't implied; the last four words of what you quote are "with the appropriate encoding".  A dumb 
search-and-replace won't do that, as ENC_ASCII isn't the appropriate encoding for all those calls; that's why I said 
"audit", as in "read the code and the spec and see what the right encoding is".

OK. I just meant that since tvb_get_string() is currently ASCII, a
dumb search and replace will let us make the API change now without
any regressions. We can then audit calls that could be incorrect.

Admittedly, it's easier to track which calls have been audited if we
do it gradually, so that's probably a better choice anyways.

If tvb_get_string() behaves identically to tvb_get_string_enc(...
ENC_ASCII) then there doesn't seem much point in having both.

There *isn't* much point in having both.  If we "probably audit all calls to tvb_get_string() and tvb_get_stringz() 
in dissectors and change them to tvb_get_string_enc() with the appropriate encoding.", the next step would be to 
eliminate tvb_get_string() and tvb_get_stringz() and then rename tvb_get_string_enc() and tvb_get_stringz_enc() to 
just tvb_get_string() and tvb_get_stringz() (as "_enc" is then redundant).

Given that we already added a wmem scope argument to tvb_get_string() and tvb_get_stringz(), there's no particular 
reason not to make further API changes to those routines.

+1
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev () wireshark org>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request () wireshark org?subject=unsubscribe


Current thread: