[tor-bugs] #22410 [Core Tor/Tor]: ensure that uint8_t is unsigned char

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun May 28 16:14:54 UTC 2017


#22410: ensure that uint8_t is unsigned char
--------------------------+------------------------------------
 Reporter:  catalyst      |          Owner:  catalyst
     Type:  defect        |         Status:  needs_review
 Priority:  Medium        |      Milestone:  Tor: 0.3.2.x-final
Component:  Core Tor/Tor  |        Version:
 Severity:  Normal        |     Resolution:
 Keywords:                |  Actual Points:
Parent ID:  #6877         |         Points:
 Reviewer:                |        Sponsor:
--------------------------+------------------------------------

Comment (by catalyst):

 Replying to [comment:4 cypherpunks]:
 > Comparing the length of the keywords is a bad way of choosing data
 > types.
 In the abstract, I agree.  I also think the length of a type name does
 affect readability by humans and we should consider that in our
 decisions.
 > Replying to [comment:3 catalyst]:
 > > If uint8_t is an extended integer type rather than unsigned char
 > > (which is admittedly unlikely), it won't have the privileged aliasing
 > > properties of unsigned char so code that casts pointers to other types
 > > to pointers to uint8_t might violate the strict aliasing rules and
 > > produce undefined behavior.
 > I agree with the possibility of violating strict aliasing. However, i
 > assume (yes, i know i should never) these pointers are dereferenced at
 > some point which is always undefined behavior when the old and new type
 > differs so the point is moot.
 C99 §6.5 paragraph 7 explicitly says that it's always valid to use a
 character lvalue to access the stored value of any object.  This means
 it's always valid to dereference a pointer to a character type as long as
 it points into an object.  If we detect that uint8_t is a character type,
 then we know that it will also have these privileged aliasing properties.
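
 As a minimal sketch of that rule (the helper name sum_object_bytes is
 hypothetical and not something in the tree), the following reads the
 stored bytes of an arbitrary object through an unsigned char lvalue,
 which C99 §6.5 paragraph 7 permits regardless of the object's own type:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: add up the bytes that make
 * up the object representation of *obj.  Reading through an unsigned char
 * lvalue is allowed for an object of any type. */
static unsigned
sum_object_bytes(const void *obj, size_t len)
{
  const unsigned char *p = obj;
  unsigned sum = 0;
  size_t i;

  for (i = 0; i < len; ++i)
    sum += p[i];
  return sum;
}
```

 If uint8_t were a distinct extended integer type, the same loop written
 with a uint8_t pointer would not be covered by that exception.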
 > I don't care about the data type names (any renaming can easily be done
 > using `typedef` if preferred). IMO it's more important that the data
 > type matches the type of data it holds and the code handling these data
 > types is built around these data types in order to keep casting to a
 > minimum (preferably none).
 I think the best data type for handling arbitrary byte data on a platform
 with CHAR_BIT==8 is unsigned char.  This also has an advantage when
 handling encoding or decoding a larger type, because of the privileged
 aliasing properties of character types in C.

 A lot of existing code in the tree uses uint8_t.  It's easier to check at
 configure time whether uint8_t is a character type than to check each use
 of uint8_t for strict aliasing violations that could occur on (presumably
 rare) platforms where uint8_t is not a character type.
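
 One way such a check could be expressed, as a sketch and not necessarily
 what the proposed patch does: a test program that declares the same
 object twice, once as uint8_t and once as unsigned char.  Two
 declarations of one object must have compatible types, so a compiler
 should reject the file unless uint8_t is the same type as unsigned char.
 The name uint8_probe is hypothetical.

```c
#include <stdint.h>

/* This translation unit is meant to compile only when uint8_t and
 * unsigned char are the same type; otherwise the second declaration
 * conflicts with the first. */
extern uint8_t uint8_probe;      /* hypothetical probe object */
unsigned char uint8_probe = 0;
```

 A configure script could try to compile a file like this and treat a
 failure as "uint8_t is not unsigned char".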

 Casts will often still be needed even when using unsigned char or
 uint8_t, because both promote to signed int on most platforms.  This
 can cause problems with bitwise shifts if the appropriate casts aren't
 done.  (Left-shifting a 1 bit into the sign bit is undefined behavior.)
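
 To illustrate the promotion issue, here is a sketch of decoding a
 big-endian 32-bit value from a byte buffer.  The function name
 example_get_be32 is hypothetical, and the comment assumes a platform
 where int is 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: decode a big-endian
 * 32-bit value from four bytes. */
static uint32_t
example_get_be32(const unsigned char *p)
{
  /* Without the casts, p[0] would be promoted to signed int before the
   * shift.  With a 32-bit int, p[0] << 24 for p[0] >= 0x80 would shift a
   * 1 bit into the sign bit, which is undefined behavior.  Converting to
   * uint32_t first keeps the whole computation unsigned. */
  return ((uint32_t)p[0] << 24) |
         ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) |
         (uint32_t)p[3];
}
```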

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22410#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

