charset and encoding #2

lotaki pointed out that RFC 2278 explains what "charset" means, so I took a look. From section 2.3, "Charset":

The term "charset" (see historical note below) is used here to refer to a method of converting a sequence of octets into a sequence of characters. This conversion may also optionally produce additional control information such as directionality indicators.

Hmm. My first reaction was that this is just a description of "encoding", plain and simple. But the HISTORICAL NOTE says this:

HISTORICAL NOTE: The term "character set" was originally used in MIME to describe such straightforward schemes as US-ASCII and ISO-8859-1 which consist of a small set of characters and a simple one-to-one mapping from single octets to single characters. Multi-octet character encoding schemes and switching techniques make the situation much more complex. As such, the definition of this term was revised to emphasize both the conversion aspect of the process, and the term itself has been changed to "charset" to emphasize that it is not, after all, just a set of characters. A discussion of these issues as well as specification of standard terminology for use in the IETF appears in RFC 2130.

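I see. Back when one octet equaled one character (= one code point), "character set" was the term used to specify which set of characters was in play.
Then "charset" was adopted as a word that covers both the encoding and the character set. In other words, I was badly mistaken in assuming that "character set" = "charset".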
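
To make the RFC's definition concrete for myself, here is a minimal sketch in Python. It is only an illustration, not anything taken from the RFC: the helper name `octets_for` is my own, and the charset labels are the codec names Python's standard library happens to accept. It shows one octet mapping to one character under ISO-8859-1, and then the same character turning into different octet sequences depending on the charset.

```python
def octets_for(text, charset_names):
    """Return {charset name: octets} for the given text (hypothetical helper)."""
    return {name: text.encode(name) for name in charset_names}


# The "straightforward" case from the historical note:
# a single octet maps to a single character.
single = b"\xe9".decode("iso-8859-1")
print(repr(single))

# The multi-octet case: one character, several octets, and the octets
# differ per charset. ISO-2022-JP additionally uses escape sequences to
# switch between character sets, which is the "switching technique" case.
results = octets_for("あ", ["utf-8", "shift_jis", "euc_jp", "iso2022_jp"])
for name, octets in results.items():
    # Decoding with the same charset must give the character back.
    assert octets.decode(name) == "あ"
    print(name, octets.hex())
```

In every case the label names a way of converting octets into characters, not merely a set of characters, which is the point the historical note makes.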