Internet Architecture Board (IAB)                              D. Thaler
Request for Comments: 6055                                     Microsoft
Updates: 2130                                                 J. Klensin
Category: Informational
ISSN: 2070-1721                                              S. Cheshire
                                                                   Apple
                                                           February 2011
        

IAB Thoughts on Encodings for Internationalized Domain Names

Abstract

This document explores issues with Internationalized Domain Names (IDNs) that result from the use of various encoding schemes such as UTF-8 and the ASCII-Compatible Encoding produced by the Punycode algorithm. It focuses on the importance of agreeing on a single encoding and how complicated the state of affairs ends up being as a result of using different encodings today.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Architecture Board (IAB) and represents information that the IAB has deemed valuable to provide for permanent record. Documents approved for publication by the IAB are not a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6055.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Table of Contents

   1.  Introduction
     1.1.  APIs
   2.  Use of Non-DNS Protocols
   3.  Use of Non-ASCII in DNS
     3.1.  Examples
   4.  Recommendations
   5.  Security Considerations
   6.  Acknowledgements
   7.  IAB Members at the Time of Approval
   8.  References
     8.1.  Normative References
     8.2.  Informative References
        
1. Introduction

The goal of this document is to explore what can be learned from some current difficulties in implementing Internationalized Domain Names (IDNs).

A domain name consists of a sequence of labels, conventionally written separated by dots. An IDN is a domain name that contains one or more labels that, in turn, contain one or more non-ASCII characters. Just as with plain ASCII domain names, each IDN label must be encoded using some mechanism before it can be transmitted in network packets, stored in memory, stored on disk, etc. These encodings need to be reversible, but they need not store domain names the same way humans conventionally write them on paper. For example, when transmitted over the network in DNS packets, domain name labels are *not* separated with dots.
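
As a rough illustration of the last point, the sketch below encodes an ASCII name the way it travels in a DNS packet: each label is preceded by a one-octet length, and a zero-length root label ends the name. The function name `encode_name_wire` is made up for this example, and the 63-octet label limit comes from the DNS specification rather than from this document.

```python
def encode_name_wire(name: str) -> bytes:
    """Encode an ASCII domain name as length-prefixed labels (DNS wire form)."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        out.append(len(raw))  # a length octet takes the place of the dot
        out += raw
    out.append(0)             # the zero-length root label terminates the name
    return bytes(out)


wire = encode_name_wire("www.example.com")
print(wire)
```

No dot octet appears in the result; the dots exist only in the textual presentation of the name.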

Internationalized Domain Names for Applications (IDNA), discussed later in this document, is the standard that defines the use and coding of internationalized domain names for use on the public Internet [RFC5890]. An earlier version of IDNA [RFC3490] is now being phased out. Except where noted, the two versions are approximately the same with regard to the issues discussed in this document. However, some explanations appeared in the earlier documents that were no longer considered useful when the later revision was created; they are quoted here from the documents in which they appear. In addition, the terminology of the two versions differs somewhat; this document reflects the terminology of the current version.

Unicode [Unicode] is a list of characters (including non-spacing marks that are used to form some other characters), where each character is assigned an integer value, called a code point. In simple terms, a Unicode string is a string of integer code point values in the range 0 to 1,114,111 (10FFFF in base 16). These integer code points must be encoded using some mechanism before they can be transmitted in network packets, stored in memory, stored on disk, etc. Some common ways of encoding these integer code point values in computer systems include UTF-8, UTF-16, and UTF-32. In addition to the material below, those forms and the tradeoffs among them are discussed in Chapter 2 of The Unicode Standard [Unicode].
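
The following minimal sketch shows a string viewed as a sequence of integer code points, independent of any byte encoding. It relies on Python's `str` type being a sequence of code points; the helper name `code_points` is illustrative only.

```python
MAX_CODE_POINT = 0x10FFFF  # 1,114,111, the upper end of the code point range


def code_points(text: str) -> list:
    """Return the integer code point of each character in the string."""
    return [ord(ch) for ch in text]


sample = "caf\u00e9"  # "cafe" with a precomposed e-acute
print(code_points(sample))
print([f"U+{cp:04X}" for cp in code_points(sample)])
```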

UTF-8 is a mechanism for encoding a Unicode code point in a variable number of 8-bit octets, where an ASCII code point is preserved as-is. Those octets encode a string of integer code point values, which represent a string of Unicode characters. The authoritative definition of UTF-8 is in Sections 3.9 and 3.10 of The Unicode Standard [Unicode], but a description of UTF-8 encoding can also be found in RFC 3629 [RFC3629]. Descriptions and formulae can also be found in Annex D of ISO/IEC 10646-1 [10646].
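
To make the variable-length property concrete, here is a hand-written sketch of encoding a single code point into one to four octets, following the bit layout given in RFC 3629. The function name is made up for this example; real code would simply use the built-in `str.encode("utf-8")`, which the last lines use as a cross-check.

```python
def utf8_encode_code_point(cp: int) -> bytes:
    """Encode one Unicode scalar value as 1 to 4 UTF-8 octets."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:                      # ASCII is preserved as-is
        return bytes([cp])
    if cp < 0x800:                     # 2 octets: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:                   # 3 octets: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),   # 4 octets: 11110xxx then three 10xxxxxx
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


for ch in ("A", "\u00e9", "\u65e5", "\U0001F600"):
    assert utf8_encode_code_point(ord(ch)) == ch.encode("utf-8")
```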

UTF-16 is a mechanism for encoding a Unicode code point in one or two 16-bit integers, described in detail in Sections 3.9 and 3.10 of The Unicode Standard [Unicode]. A UTF-16 string encodes a string of integer code point values that represent a string of Unicode characters.
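
The "one or two 16-bit integers" rule can be sketched as follows: code points below 0x10000 are stored directly, and the rest are split into a surrogate pair. The helper name `utf16_code_units` is an assumption of this example, not something defined by the standards cited here.

```python
def utf16_code_units(cp: int) -> list:
    """Return the one or two 16-bit code units that encode a code point."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x10000:
        return [cp]                      # a single 16-bit unit
    v = cp - 0x10000                     # 20 bits remain to be split
    return [0xD800 | (v >> 10),          # high surrogate: top 10 bits
            0xDC00 | (v & 0x3FF)]        # low surrogate: bottom 10 bits


print([hex(u) for u in utf16_code_units(0x00E9)])
print([hex(u) for u in utf16_code_units(0x1F600)])
```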

UTF-32 (formerly UCS-4), also described in Sections 3.9 and 3.10 of The Unicode Standard [Unicode], is a mechanism for encoding a Unicode code point in a single 32-bit integer. A UTF-32 string is thus a string of 32-bit integer code point values, which represent a string of Unicode characters.

Note that UTF-16 results in some all-zero octets when code points occur early in the Unicode sequence, and UTF-32 always has all-zero octets.

IDNA specifies validity of a label, such as what characters it can contain, relationships among them, and so on, in Unicode terms. Valid labels can be in either "U-label" or "A-label" form, with the appropriate one determined by particular protocols or by context. U-label form is a direct representation of the Unicode characters using one of the encoding forms discussed above. This document discusses UTF-8 strings in many places. While all U-labels can be represented by UTF-8 strings, not all UTF-8 strings are valid U-labels (see Section 2.3.2 of the IDNA Definitions document [RFC5890] for a discussion of these distinctions). A-label form uses a compressed, ASCII-compatible encoding (an "ACE" in IDNA and other terminology) produced by an algorithm called Punycode. U-labels and A-labels are duals of each other: transformations from one to the other do not lose information. The transformation mechanisms are specified in the IDNA Protocol document [RFC5891].

Punycode [RFC3492] is thus a mechanism for encoding a Unicode string in an ASCII-compatible encoding, i.e., using only letters, digits, and hyphens from the ASCII character set. When a Unicode label that is valid under the IDNA rules (a U-label) is encoded with Punycode for IDNA purposes, it is prefixed with "xn--"; the result is called an A-label. The prefix convention assumes that no other DNS labels (at least no other DNS labels in IDNA-aware applications) are allowed to start with these four characters. Consequently, when A-label encoding is assumed, any DNS labels beginning with "xn--" now have a different meaning (the Punycode encoding of a label containing one or more non-ASCII characters) or no defined meaning at all (in the case of labels that are not IDNA-compliant, i.e., are not well-formed A-labels).
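
A minimal sketch of the prefix convention, using the `punycode` codec from the Python standard library. The helper names are hypothetical, and the code deliberately skips every IDNA validity check, so it should be read as an illustration of the encoding step only, not as an IDNA implementation.

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Punycode-encode a non-ASCII label and add the ACE prefix (no validation)."""
    if u_label.isascii():
        return u_label  # plain ASCII labels are passed through unchanged
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Reverse to_a_label() for labels that carry the ACE prefix."""
    if not a_label.lower().startswith(ACE_PREFIX):
        return a_label
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


label = "b\u00fccher"
ace = to_a_label(label)
print(ace)
print(to_u_label(ace) == label)
```

The round trip loses no information, which is the sense in which the two forms are duals of each other.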

ISO-2022-JP [RFC1468] is a mechanism for encoding a string of ASCII and Japanese characters, where an ASCII character is preserved as-is. ISO-2022-JP is stateful: special sequences are used to switch between character coding tables. As a result, if there are lost or mangled characters in a character stream, it is extremely difficult to recover the original stream after such a lost character encoding shift.
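
The statefulness can be observed with the `iso2022_jp` codec in the Python standard library. This sketch encodes a mixed ASCII and Japanese string, then simulates the loss of the escape sequence that switches into the Japanese table; the variable names are illustrative.

```python
text = "abc\u65e5\u672cdef"          # ASCII, two kanji, ASCII
encoded = text.encode("iso2022_jp")
print(encoded)                       # escape sequences bracket the kanji bytes

# Simulate a mangled stream: drop the first "switch to JIS X 0208" escape.
damaged = encoded.replace(b"\x1b$B", b"", 1)
recovered = damaged.decode("iso2022_jp", errors="replace")
print(recovered == text)             # the kanji bytes are now read as ASCII
```

The ASCII portions survive, but the bytes that followed the lost escape sequence are interpreted in the wrong table, so the original characters cannot be recovered from the damaged stream alone.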

Comparison of Unicode strings is not as easy as comparing ASCII strings. First, there are a multitude of ways to represent a string of Unicode characters. Second, in many languages and scripts, the actual definition of "same" is very context-dependent. Because of this, comparison of two Unicode strings must take into account how the Unicode strings are encoded. Regardless of the encoding, however, comparison cannot simply be done by comparing the encoded Unicode strings byte by byte. The only time that is possible is when the strings are both mapped into some canonical form and encoded the same way.
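
As a small illustration of why byte-by-byte comparison is not enough, the sketch below compares two representations of the same visible text. It uses Normalization Form C from the standard `unicodedata` module as the canonical form; that choice is an assumption of this example, and IDNA applies further rules that are not shown here.

```python
import unicodedata


def equal_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after mapping both to Normalization Form C."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "caf\u00e9"      # e-acute as a single code point
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent

print(composed.encode("utf-8") == decomposed.encode("utf-8"))
print(equal_after_nfc(composed, decomposed))
```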

In 1996 the IAB sponsored a workshop on character sets and encodings [RFC2130]. This document adds to that discussion and focuses on the importance of agreeing on a single encoding and how complicated the state of affairs ends up being as a result of using different encodings today.

Different applications, APIs, and protocols use different encoding schemes today. Many of them were originally defined to use only ASCII. Internationalizing Domain Names in Applications (IDNA) [RFC5890] defines a mechanism that requires changes to applications, but in an attempt not to change APIs or servers, specifies that the A-label format is to be used in many contexts. In some ways this could be seen as not changing the existing APIs, in the sense that the strings being passed to and from the APIs are still apparently ASCII strings. In other ways it is a very profound change to the existing APIs, because while those strings are still syntactically valid ASCII strings, they no longer mean the same thing that they used to. What looks like a plain ASCII string to one piece of software or library could be seen by another piece of software or library (with the application of out-of-band information) to be in fact an encoding of a Unicode string.

Section 1.3 of the original IDNA specification [RFC3490] states:

      The IDNA protocol is contained completely within applications. It is not a client-server or peer-to-peer protocol: everything is done inside the application itself. When used with a DNS resolver library, IDNA is inserted as a "shim" between the application and the resolver library. When used for writing names into a DNS zone, IDNA is used just before the name is committed to the zone.

Figure 1 depicts a simplistic architecture that a naive reader might assume from the paragraph quoted above. (A variant of this same picture appears in Section 6 of the original IDNA specification [RFC3490], further strengthening this assumption.)

                +-----------------------------------------+
                |Host                                     |
                |             +-------------+             |
                |             | Application |             |
                |             +------+------+             |
                |                    |                    |
                |               +----+----+               |
                |               |   DNS   |               |
                |               | Resolver|               |
                |               | Library |               |
                |               +----+----+               |
                |                    |                    |
                +-----------------------------------------+
                                     |
                            _________|_________
                           /                   \
                          /                     \
                         /                       \
                        |         Internet        |
                         \                       /
                          \                     /
                           \___________________/
        

Simplistic Architecture

Figure 1

There are, however, two problems with this simplistic architecture that cause it to differ from reality.

   First, resolver APIs on Operating Systems (OSs) today (Mac OS,
   Windows, Linux, etc.) are not DNS-specific.  They typically provide a
   layer of indirection so that the application can work independent of
   the name resolution mechanism, which could be DNS, mDNS
   [DNS-MULTICAST], LLMNR [RFC4795], NetBIOS-over-TCP
   [RFC1001][RFC1002], hosts table [RFC0952], NIS [NIS], or anything
   else.  For example, "Basic Socket Interface Extensions for IPv6"
   [RFC3493] specifies the getaddrinfo() API and contains many phrases
   like "For example, when using the DNS" and "any type of name
   resolution service (for example, the DNS)".  Importantly, DNS is
   mentioned only as an example, and the application has no knowledge as
   to whether DNS or some other protocol will be used.
        

Second, even with the DNS protocol, private namespaces (sometimes including private uses of the DNS) do not necessarily use the same character set encoding scheme as the public Internet namespace.

   We will discuss each of the above issues in subsequent sections.  For
   reference, Figure 2 depicts a more realistic architecture on typical
   hosts today (which don't have IDNA inserted as a shim immediately
   above the DNS resolver library).  More generally, the host may be
   attached to one or more local networks, each of which may or may not
   be connected to the public Internet and may or may not have a private
   namespace.

                +-----------------------------------------+
                |Host                                     |
                |             +-------------+             |
                |             | Application |             |
                |             +------+------+             |
                |                    |                    |
                |             +------+------+             |
                |             |   Generic   |             |
                |             |    Name     |             |
                |             |  Resolution |             |
                |             |     API     |             |
                |             +------+------+             |
                |                    |                    |
                |   +-----+------+---+--+-------+-----+   |
                |   |     |      |      |       |     |   |
                | +-+-++--+--++--+-++---+---++--+--++-+-+ |
                | |DNS||LLMNR||mDNS||NetBIOS||hosts||...| |
                | +---++-----++----++-------++-----++---+ |
                |                                         |
                +-----------------------------------------+
                                     |
                               ______|______
                              /             \
                             /               \
                            /      local      \
                            \     network     /
                             \               /
                              \_____________/
                                     |
                            _________|_________
                           /                   \
                          /                     \
                         /                       \
                        |         Internet        |
                         \                       /
                          \                     /
                           \___________________/
        

Realistic Architecture

Figure 2

1.1. APIs

Section 6.2 of the original IDNA specification [RFC3490] states (where ToASCII and ToUnicode below refer to conversions using the Punycode algorithm):

      It is expected that new versions of the resolver libraries in the future will be able to accept domain names in other charsets than ASCII, and application developers might one day pass not only domain names in Unicode, but also in local script to a new API for the resolver libraries in the operating system. Thus the ToASCII and ToUnicode operations might be performed inside these new versions of the resolver libraries.

Resolver APIs such as getaddrinfo() and its predecessor gethostbyname() were defined to accept C-Language "char *" arguments, meaning they accept a string of bytes, terminated with a NULL (0) byte. Because of the use of a NULL octet as a string terminator, this is sufficient for ASCII strings (including A-labels) and even ISO-2022-JP [RFC1468] and UTF-8 strings (unless an implementation artificially precludes them), but not UTF-16 or UTF-32 strings because a NULL octet could appear in the middle of strings using these encodings. Several operating systems historically used in Japan will accept (and expect) ISO-2022-JP strings in such APIs. Some platforms used worldwide also have new versions of the APIs (e.g., GetAddrInfoW() on Windows) that accept other encoding schemes such as UTF-16.
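
The effect of the NULL terminator can be simulated without calling any C API. In this sketch, the hypothetical helper `as_c_string_sees_it` returns the bytes that a NUL-terminated "char *" consumer would read from a buffer, and the loop applies it to the same name in several encodings.

```python
def as_c_string_sees_it(raw: bytes) -> bytes:
    """Return the bytes a NUL-terminated 'char *' reader would consume."""
    end = raw.find(b"\x00")
    return raw if end < 0 else raw[:end]


name = "b\u00fccher"
for encoding in ("utf-8", "iso2022_jp", "utf-16-le", "utf-32-le"):
    # ISO-2022-JP cannot represent u-umlaut, so use a Japanese sample there.
    sample = "\u65e5\u672c" if encoding == "iso2022_jp" else name
    raw = sample.encode(encoding)
    seen = as_c_string_sees_it(raw)
    print(encoding, len(raw), len(seen), seen == raw)
```

The UTF-8 and ISO-2022-JP buffers pass through intact, while the UTF-16 and UTF-32 buffers are cut off at the first all-zero octet.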

It is worth noting that an API using C-Language "char *" arguments can distinguish between conventional ASCII "hostname" labels, A-labels, ISO-2022-JP, and UTF-8 labels in names if the coding is known to be one of those four, and the label is intact (no lost or mangled characters). If a stateful encoding like ISO-2022-JP is used, applications extracting labels from text must take special precautions to be sure that the appropriate state-setting characters are included in the string passed to the API.

An example method for distinguishing among such codings is as follows:

o if the label contains an ESC (0x1B) byte, the label is ISO-2022-JP; otherwise,

o if any byte in the label has the high bit set, the label is UTF-8; otherwise,

o if the label starts with "xn--", then it is presumed to be an A-label; otherwise,

o the label is ASCII (and therefore, by definition, the label is also UTF-8, since ASCII is a subset of UTF-8).

Again, this assumes that ASCII labels never start with "xn--", and also that UTF-8 strings never contain an ESC character. Also, the above is merely an illustration; UTF-8 can be detected and distinguished from other 8-bit encodings with good accuracy [MJD].
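
The example method above can be sketched directly. The function name and the returned strings are made up for this illustration, and matching the "xn--" prefix case-insensitively is an assumption added here; the steps themselves follow the list in the order given and inherit all of its caveats.

```python
def guess_label_encoding(label: bytes) -> str:
    """Apply the four-step heuristic to the raw bytes of one label."""
    if b"\x1b" in label:                   # an ESC byte signals ISO-2022-JP
        return "ISO-2022-JP"
    if any(octet & 0x80 for octet in label):
        return "UTF-8"                     # some byte has the high bit set
    if label[:4].lower() == b"xn--":       # assumption: prefix case is ignored
        return "A-label"
    return "ASCII"                         # and therefore also valid UTF-8


for sample in ("\u65e5\u672c".encode("iso2022_jp"),
               "b\u00fccher".encode("utf-8"),
               b"xn--bcher-kva",
               b"example"):
    print(sample, guess_label_encoding(sample))
```

The heuristic only works when the coding is already known to be one of these four; it cannot, for instance, tell UTF-8 apart from an ISO 8859 character set.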

It is more difficult or impossible to distinguish the ISO 8859 character sets [ISO8859] from each other, because they differ in up to about 90 characters that have exactly the same encodings, and a short string is very unlikely to contain enough characters to allow a receiver to deduce the character set. Similarly, it is not possible in general to distinguish between ISO-2022-JP and any other encoding based on ISO 2022 code table switching.

Although it is possible (as in the example above) to distinguish some encodings when not explicitly specified, it is cleaner to have the encodings specified explicitly, such as specifying UTF-16 for GetAddrInfoW(), or specifying explicitly which APIs expect UTF-8 strings.

2. Use of Non-DNS Protocols

As noted earlier, typical name resolution libraries are not DNS-specific. Furthermore, some protocols are defined to use encoding forms other than IDNA A-labels. For example, mDNS [DNS-MULTICAST] specifies that UTF-8 be used. Indeed, the IETF policy on character sets and languages [RFC2277] (which followed the 1996 IAB-sponsored workshop [RFC2130]) states:

      Protocols MUST be able to use the UTF-8 charset, which consists of the ISO 10646 coded character set combined with the UTF-8 character encoding scheme, as defined in [10646] Annex R (published in Amendment 2), for all text.

      Protocols MAY specify, in addition, how to use other charsets or other character encoding schemes for ISO 10646, such as UTF-16, but lack of an ability to use UTF-8 is a violation of this policy; such a violation would need a variance procedure ([BCP9] section 9) with clear and solid justification in the protocol specification document before being entered into or advanced upon the standards track.

      For existing protocols or protocols that move data from existing datastores, support of other charsets, or even using a default other than UTF-8, may be a requirement. This is acceptable, but UTF-8 support MUST be possible.

Applications that convert an IDN to A-label form before calling getaddrinfo() will experience name resolution failures if the Punycode name is then used directly in such protocols. Having libraries or protocols convert from A-labels to the encoding scheme defined by the protocol (e.g., UTF-8) would require changes to APIs and/or servers, which IDNA was intended to avoid.

As a result, applications have problems if they assume that non-ASCII names are resolved using the public DNS and blindly convert them to A-labels without knowing which protocol the name resolution library will select. Furthermore, name resolution libraries often try multiple protocols until one succeeds, because they are defined to use a common namespace. For example, the hosts file [RFC0952], NetBIOS-over-TCP [RFC1001], and DNS [RFC1034] are all defined to be able to share a common syntax. This means that when an application passes a name to be resolved, resolution may in fact be attempted using multiple protocols, each with a potentially different encoding scheme. For this to work successfully, the name must be converted to the appropriate encoding scheme only after the choice is made to use that protocol. In general, this cannot be done by the application since the choice of protocol is not made by the application.
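
One way to picture "convert only after the protocol is chosen" is the sketch below. Everything in it is hypothetical: the protocol keys, the `ENCODERS` table, and the `lookup` callback stand in for whatever a real name resolution library does internally, and the A-label conversion skips all IDNA validity checks.

```python
ACE_PREFIX = "xn--"


def to_ace(name: str) -> bytes:
    """Convert each non-ASCII label of a name to ACE form (no validation)."""
    labels = []
    for label in name.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append(ACE_PREFIX + label.encode("punycode").decode("ascii"))
    return ".".join(labels).encode("ascii")


# Hypothetical table: each protocol gets the encoding it is defined to use.
ENCODERS = {
    "dns": to_ace,                              # public DNS: A-labels
    "mdns": lambda name: name.encode("utf-8"),  # mDNS: UTF-8
}


def resolve(name, protocols, lookup):
    """Try protocols in order, encoding the name separately for each one."""
    for protocol in protocols:
        encoded = ENCODERS[protocol](name)   # encode only after choosing
        answer = lookup(protocol, encoded)   # 'lookup' is a caller-supplied stub
        if answer is not None:
            return answer
    return None


seen = []


def fake_lookup(protocol, encoded):
    """Record what each protocol would have been asked; never find anything."""
    seen.append((protocol, encoded))
    return None


resolve("b\u00fccher.example", ["dns", "mdns"], fake_lookup)
print(seen)
```

The application hands over the Unicode name once, and each protocol receives it in its own encoding, which is only possible because the conversion happens below the generic API.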

3. Use of Non-ASCII in DNS

A common misconception is that DNS only supports names that can be expressed using letters, digits, and hyphens.

This misconception originally stems from the 1985 definition of an "Internet hostname" (and net, gateway, and domain name) for use in the "hosts" file [RFC0952]. An Internet hostname was defined therein as including only letters, digits, and hyphens, where uppercase and lowercase letters were to be treated as identical. The DNS specification [RFC1034], Section 3.5 entitled "Preferred name syntax" then repeated this definition in 1987, saying that this "syntax will result in fewer problems with many applications that use domain names (e.g., mail, TELNET)".

This left confusion as to whether the "preferred" name syntax was a mandatory restriction in DNS or merely "preferred".

The definition of an Internet hostname was updated in 1989 ([RFC1123], Section 2.1) to allow names starting with a digit. However, it did not address the increasing confusion as to whether all names in DNS are "hostnames", or whether a "hostname" is merely a special case of a DNS name.

By 1997, things had progressed to a state where it was necessary to clarify these areas of confusion. "Clarifications to the DNS Specification" [RFC2181], Section 11 states:

The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. The length of any one label is limited to between 1 and 63 octets. A full domain name is limited to 255 octets (including the separators). The zero length full name is defined as representing the root of the DNS tree, and is typically written and displayed as ".". Those restrictions aside, any binary string whatever can be used as the label of any resource record. Similarly, any binary string can serve as the value of any record that includes a domain name as some or all of its value (SOA, NS, MX, PTR, CNAME, and any others that may be added). Implementations of the DNS protocols must not place any restrictions on the labels that can be used.

[RFC2181] thus clarified that the restriction to letters, digits, and hyphens does not apply to DNS names in general, nor to records that include "domain names". The "preferred" name syntax described in the original DNS specification [RFC1034] is therefore indeed merely "preferred", not mandatory.
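
As an illustration of how little the DNS itself constrains labels, the following Python sketch encodes a list of labels into the length-prefixed form used on the wire and enforces only the length limits quoted above. The function name encode_wire_name is hypothetical, and the sketch assumes that the 255-octet limit counts the length octets and the terminating root label.

```python
MAX_LABEL_OCTETS = 63
MAX_NAME_OCTETS = 255


def encode_wire_name(labels):
    """Encode labels (arbitrary octet strings) as a length-prefixed DNS name.

    Only the length limits are enforced; label contents are not inspected,
    so spaces, UTF-8 sequences, or any other octets are accepted.
    """
    wire = bytearray()
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL_OCTETS:
            raise ValueError("label must be 1 to 63 octets")
        wire.append(len(label))  # one length octet precedes each label
        wire += label
    wire.append(0)  # the zero-length root label terminates the name
    if len(wire) > MAX_NAME_OCTETS:
        raise ValueError("name exceeds 255 octets")
    return bytes(wire)
```

A label holding UTF-8 octets or an embedded space passes through unchanged; only a label of 64 or more octets, or an over-long full name, is rejected.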

Since there is no restriction even to ASCII, let alone letter-digit-hyphen use, DNS does not violate the subsequent IETF requirement to allow UTF-8 [RFC2277].

Using UTF-16 or UTF-32 encoding, however, would not be ideal for use in DNS packets or C-Language "char *" APIs because existing software already uses ASCII, and UTF-16 and UTF-32 strings can contain all-zero octets that existing software will interpret as the end of the string. To use UTF-16 or UTF-32, one would need some way of knowing whether the string was encoded using ASCII, UTF-16, or UTF-32, and indeed for UTF-16 or UTF-32 whether it was big-endian or little-endian encoding. In contrast, UTF-8 works well because any 7-bit ASCII string is also a UTF-8 string representing the same characters.
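
The difference can be seen directly by encoding a short ASCII name in each scheme. The following Python sketch is illustrative only; the helper c_string_view is a hypothetical stand-in for software that treats the first all-zero octet as the end of a C string.

```python
name = "example"

ascii_bytes = name.encode("ascii")
utf8_bytes = name.encode("utf-8")
utf16_le = name.encode("utf-16-le")
utf16_be = name.encode("utf-16-be")
utf32_le = name.encode("utf-32-le")


def c_string_view(data: bytes) -> bytes:
    """Return the octets a NUL-terminated "char *" consumer would see."""
    return data.split(b"\x00", 1)[0]
```

Here utf8_bytes is identical to ascii_bytes, while c_string_view(utf16_le) is truncated to the single octet "e" and c_string_view(utf16_be) is empty. The two UTF-16 byte orders also differ from each other, which is the endianness question noted above.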

If a private namespace is defined to use UTF-8 (and not other encodings such as UTF-16 or UTF-32), there's no need for a mechanism to know whether a string was encoded using ASCII or UTF-8, because (for any string that can be represented using ASCII) the representations are exactly the same. In other words, for any string that can be represented using ASCII, it doesn't matter whether it is interpreted as ASCII or UTF-8 because both encodings are the same, and for any string that can't be represented using ASCII, it's obviously UTF-8. In addition, unlike UTF-16 and UTF-32, ASCII and UTF-8 are both byte-oriented encodings so the question of big-endian or little-endian encoding doesn't apply.

While implementations of the DNS protocol must not place any restrictions on the labels that can be used, applications that use the DNS are free to impose whatever restrictions they like, and many have. The above rules permit a domain name label that contains unusual characters, such as embedded spaces, which many applications consider a bad idea. For example, the original specification [RFC0821] of the SMTP protocol [RFC5321] constrains the character set usable in email addresses. There is now an effort underway to define an extension to SMTP to support internationalized email addresses and headers. See the EAI framework [RFC4952] for more discussion on this topic.

Shortly after the DNS Clarifications [RFC2181] and IETF character sets and languages policy [RFC2277] were published, the need for internationalized names within private namespaces (i.e., within enterprises) arose. The current (and past, predating IDNA and the prefixed ACE conventions) practice within enterprises that support other languages is to put UTF-8 names in their internal DNS servers in a private namespace. For example, "Using the UTF-8 Character Set in the Domain Name System" [UTF8-DNS] was first written in 1997, and was then widely deployed in Windows. The use of UTF-8 names in DNS was similarly implemented and deployed in Mac OS, simply by virtue of the fact that applications blindly passed UTF-8 strings to the name resolution APIs, the name resolution APIs blindly passed those UTF-8 strings to the DNS servers, and the DNS servers correctly answered those queries. From the user's point of view, everything worked properly without any special new code being written, except that ASCII is matched case-insensitively whereas UTF-8 is not (although some enterprise DNS servers reportedly attempt to do case-insensitive matching on UTF-8 within private namespaces, an action that causes other problems and violates a subsequent prohibition [RFC4343]). Within a private namespace, and especially in light of the IETF UTF-8 policy [RFC2277], it was reasonable to assume that binary strings were encoded in UTF-8.

As implied earlier, there are also issues with mapping strings to some canonical form, independent of the encoding. Such issues are not discussed in detail in this document. They are discussed to some extent in, for example, Section 3 of "Unicode Format for Network Interchange" [RFC5198], and are left as opportunities for elaboration in other documents.

A few years after UTF-8 was already in use in private namespaces in DNS, the strategy of using a reserved prefix and an ASCII-compatible encoding (ACE) was developed for IDNA. That strategy included the Punycode algorithm, which began to be developed (during the period from 2002 [IDN-PUNYCODE] to 2003 [RFC3492]) for use in the public DNS namespace. There were a number of reasons for this. One such reason the prefixed ACE strategy was selected for the public DNS namespace had to do with the fact that other encodings such as ISO 8859-1 were also in use in DNS and the various encodings were not necessarily distinguishable from each other. Another reason had to do with concerns about whether the details of IDNA, including the use of the Punycode algorithm, were an adequate solution to the problems that were posed. If either the Punycode algorithm or fundamental aspects of character handling were wrong, and had to be changed to something incompatible, it would be possible to switch to a new prefix or adopt another model entirely. Only the part of the public DNS namespace that starts a label with "xn--" would be polluted.

Today the algorithm is seen as being about as good as it can realistically be, so moving to a different encoding (UTF-8 as suggested in this document) that can be viewed as "native" would not be as risky as it would have been in 2002.

In any case, the publication of Punycode [RFC3492] and the dependencies on it in the IDNA Protocol document [RFC5891] and the earlier IDNA specification [RFC3490] thus resulted in having to use different encodings for different namespaces (where UTF-8 for private namespaces was already deployed). Hence, referring back to Figure 2, a different encoding scheme may be in use on the Internet vs. a local network.

In general, a host may be connected to zero or more networks using private namespaces, plus potentially the public namespace. Applications that convert a U-label form IDN to an A-label before calling getaddrinfo() will incur name resolution failures if the name is actually registered in a private namespace in some other encoding (e.g., UTF-8). Having libraries or protocols convert from A-labels to the encoding used by a private namespace (e.g., UTF-8) would require changes to APIs and/or servers, which IDNA was intended to avoid.
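
To make the mismatch concrete, the following Python sketch compares the octets of one label in the two forms. It uses Python's built-in "idna" and "punycode" codecs; the label, the private_zone table, and the address are hypothetical examples chosen to match the A-label used elsewhere in this document.

```python
u_label = "пример"

utf8_form = u_label.encode("utf-8")    # octets stored by a UTF-8 private namespace
a_label_form = u_label.encode("idna")  # octets sent by an IDNA-converting application

# Hypothetical private namespace that registered the name in UTF-8 only.
private_zone = {utf8_form: "192.0.2.10"}

found_as_utf8 = private_zone.get(utf8_form)
found_as_a_label = private_zone.get(a_label_form)
```

Here a_label_form is b"xn--e1afmkfd" (the "xn--" prefix followed by the Punycode output b"e1afmkfd"), so found_as_a_label is None even though both octet strings represent the same Unicode characters.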

Also, a fully-qualified domain name (FQDN) to be resolved may be obtained directly from an application, or it may be composed by the DNS resolver itself from a single label obtained from an application by using a configured suffix search list, and the resulting FQDN may use multiple encodings in different labels. For more information on the suffix search list, see Section 6 of "Common DNS Implementation Errors and Suggested Fixes" [RFC1536], the DHCP Domain Search Option [RFC3397], and Section 4 of "DNS Configuration options for DHCPv6" [RFC3646].

As noted in Section 6 of "Common DNS Implementation Errors and Suggested Fixes" [RFC1536], the community has had bad experiences (e.g., security problems [RFC1535]) with "searching" for domain names by trying multiple variations or appending different suffixes. Such searching can yield inconsistent results depending on the order in which alternatives are tried. Nonetheless, the practice is widespread and must be considered.

The practice of searching for names, whether by the use of a suffix search list or by searching in different namespaces, can yield inconsistent results. For example, even when a suffix search list is only used when an application provides a name containing no dots, two clients with different configured suffix search lists can get different answers, and the same client could get different answers at different times if it changes its configuration (e.g., when moving to another network). A deeper discussion of this topic is outside the scope of this document.

3.1. Examples

Some examples of cases that can happen in existing implementations today (where {non-ASCII} below represents some user-entered non-ASCII string) are:

o User types in {non-ASCII}.{non-ASCII}.com, and the application passes it, in the form of a UTF-8 string, to getaddrinfo() or gethostbyname() or equivalent.

1. The DNS resolver passes the (UTF-8) string unmodified to a DNS server.

o User types in {non-ASCII}.{non-ASCII}.com, and the application passes it to a name resolution API that accepts strings in some other encoding such as UTF-16, e.g., GetAddrInfoW() on Windows.

1. The name resolution API decides to pass the string to DNS (and possibly other protocols).

2. The DNS resolver converts the name from UTF-16 to UTF-8 and passes the query to a DNS server.

o User types in {non-ASCII}.{non-ASCII}.com, but the application first converts it to A-label form such that the name that is passed to name resolution APIs is (say) xn--e1afmkfd.xn--80akhbyknj4f.com.

1. The name resolution API decides to pass the string to DNS (and possibly other protocols).

2. The DNS resolver passes the string unmodified to a DNS server.

3. If the name is not found in DNS, the name resolution API decides to try another protocol, say mDNS.

4. The query goes out in mDNS, but since mDNS specified that names are to be registered in UTF-8, the name isn't found since it was encoded as an A-label in the query.

o User types in {non-ASCII}, and the application passes it, in the form of a UTF-8 string, to getaddrinfo() or equivalent.

1. The name resolution API decides to pass the string to DNS (and possibly other protocols).

2. The DNS resolver will append suffixes in the suffix search list, which may contain UTF-8 characters if the local network uses a private namespace.

3. Each FQDN in turn will then be sent in a query to a DNS server, until one succeeds.

o User types in {non-ASCII}, but the application first converts it to an A-label, such that the name that is passed to getaddrinfo() or equivalent is (say) xn--e1afmkfd.

1. The name resolution API decides to pass the string to DNS (and possibly other protocols).

2. The DNS stub resolver will append suffixes in the suffix search list, which may contain UTF-8 characters if the local network uses a private namespace, resulting in (say) xn--e1afmkfd.{non-ASCII}.com

3. Each FQDN in turn will then be sent in a query to a DNS server, until one succeeds.

4. Since the private namespace in this case uses UTF-8, the above queries fail, since the A-label version of the name was not registered in that namespace.

o User types in {non-ASCII1}.{non-ASCII2}.{non-ASCII3}.com, where {non-ASCII3}.com is a public namespace using IDNA and A-labels, but {non-ASCII2}.{non-ASCII3}.com is a private namespace using UTF-8, which is accessible to the user. The application passes the name, in the form of a UTF-8 string, to getaddrinfo() or equivalent.

1. The name resolution API decides to pass the string to DNS (and possibly other protocols).

2. The DNS resolver tries to locate the authoritative server, but fails the lookup because it cannot find a server for the UTF-8 encoding of {non-ASCII3}.com, even though it would have access to the private namespace. (To make this work, the private namespace would need to include the UTF-8 encoding of {non-ASCII3}.com.)
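
The suffix search list cases above can be simulated with a short Python sketch. Everything in it is hypothetical: the search list, the PRIVATE_ZONE table, and the address are invented for illustration, and a dictionary lookup stands in for a query to a DNS server serving a UTF-8 private namespace.

```python
from typing import List, Optional

# Hypothetical configured suffix search list; the first suffix is non-ASCII.
SEARCH_LIST = ["корп.example", "example"]

# Hypothetical private namespace whose records are registered in UTF-8.
PRIVATE_ZONE = {"пример.корп.example".encode("utf-8"): "192.0.2.20"}


def candidates(single_label: str) -> List[bytes]:
    """Append each configured suffix and return the octets that would be queried."""
    return [f"{single_label}.{suffix}".encode("utf-8") for suffix in SEARCH_LIST]


def lookup(single_label: str) -> Optional[str]:
    """Query each candidate FQDN in turn until one is found."""
    for fqdn in candidates(single_label):
        if fqdn in PRIVATE_ZONE:
            return PRIVATE_ZONE[fqdn]
    return None
```

Passing the label as entered, lookup("пример"), finds the record. Passing the pre-converted form, lookup("xn--e1afmkfd"), produces a first candidate that mixes an A-label with a UTF-8 suffix, and nothing is found.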

When users use multiple applications, some of which do A-label conversion prior to passing a name to name resolution APIs and some of which do not, odd behavior can result. At best this violates the Principle of Least Surprise, and at worst it can result in security vulnerabilities.

First consider two competing applications, such as web browsers, that are designed to achieve the same task. If the user types the same name into each browser, one may successfully resolve the name (and hence access the desired content) because the encoding scheme is correct, while the other may fail name resolution because the encoding scheme is incorrect. Hence the issue can incent users to switch to another application (which in some cases means switching to an IDNA application, and in other cases means switching away from an IDNA application).

Next consider two separate applications where one is designed to be launched from the other, for example a web browser launching a media player application when the link to a media file is clicked. If both types of content (web pages and media files in this example) are hosted at the same IDN in a private namespace, but one application converts to A-labels before calling name resolution APIs and the other does not, the user may be able to access a web page, click on the media file causing the media player to launch and attempt to retrieve the media file, which will then fail because the IDN encoding scheme was incorrect. Or even worse, if an attacker is able to register the same name in the other encoding scheme, the user may get the content from the attacker's machine. This is similar to a normal phishing attack, except that the two names represent exactly the same Unicode characters.

4. Recommendations

On many platforms, the name resolution library will automatically use a variety of protocols to search a variety of namespaces, which might be using UTF-8 or other encodings. In addition, even when only the DNS protocol is used, in many operational environments, a private DNS namespace using UTF-8 is also deployed and is automatically searched by the name resolution library.

As explained earlier, using multiple canonical formats and multiple encodings, whether in different protocols or even in different places in the same namespace, creates problems. Because of this, and the fact that both IDNA A-labels and UTF-8 are in use as encoding mechanisms for domain names today, we make the recommendations described below.

It is inappropriate for an application that calls a general-purpose name resolution library to convert a name to an A-label unless the application is absolutely certain that, in all environments where the application might be used, only the global DNS that uses IDNA A-labels actually will be used to resolve the name.

Instead, conversion to A-label form, or any other special encoding required by a particular name-lookup protocol, should be done only by an entity that knows which protocol will be used (e.g., the DNS resolver, or getaddrinfo() upon deciding to pass the name to DNS), rather than by general applications that call protocol-independent name resolution APIs. (Of course, applications that store strings internally in a different format than that required by those APIs, need to convert strings from their own internal format to the format required by the API.) Similarly, even if an application can know that DNS is to be used, the conversion to A-labels should be done only by an entity that knows which part of the DNS namespace will be used.

That is, a more intelligent DNS resolver would be more liberal in what it would accept from an application and be able to query for both a name in A-label form (e.g., over the Internet) and a UTF-8 name (e.g., over a corporate network with a private namespace) in case the server only recognizes one. However, we might also take into account that the various resolution behaviors discussed earlier could also occur with record updates (e.g., with Dynamic Update [RFC2136]), resulting in some names being registered in a local network's private namespace by applications doing conversion to A-labels, and other names being registered using UTF-8. Hence, a name might have to be queried with both encodings to be sure to succeed without changes to DNS servers.
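
One way such a liberal resolver might derive both forms of a name is sketched below in Python. This is an assumption-laden illustration rather than a description of any deployed resolver: the function names are hypothetical, Python's "idna" codec stands in for A-label conversion, and the choice of which form to send to which server is left out.

```python
from typing import List


def to_u_label(label: str) -> str:
    """Return the Unicode form of a label, decoding it if it is an A-label."""
    if label.lower().startswith("xn--"):
        try:
            return label.encode("ascii").decode("idna")
        except UnicodeError:
            return label  # not a valid A-label; leave it untouched
    return label


def query_encodings(name: str) -> List[bytes]:
    """Return the distinct octet forms worth querying for a name.

    The name may arrive with U-labels, A-labels, or a mix of both.
    """
    unicode_name = ".".join(to_u_label(label) for label in name.split("."))
    forms = [unicode_name.encode("utf-8")]
    try:
        a_form = unicode_name.encode("idna")
    except UnicodeError:
        a_form = None  # name has no valid A-label form
    if a_form is not None and a_form not in forms:
        forms.append(a_form)
    return forms
```

With this sketch, "пример.example" and "xn--e1afmkfd.example" yield the same two candidate encodings, while an all-ASCII name such as "www.example" yields only one.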

Similarly, a more intelligent stub resolver would also be more liberal in what it would accept from a response as the value of a record (e.g., PTR) in that it would accept either UTF-8 (U-labels in the case of IDNA) or A-labels and convert them to whatever encoding is used by the application APIs to return strings to applications.
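
The response side can be sketched in the same spirit. The Python function below, whose name is hypothetical, accepts a dot-separated name from a record value in which each label may be either UTF-8 or an A-label, and returns one Unicode string. It assumes Python's "idna" codec for decoding A-labels and raises UnicodeError for octets that are valid in neither form.

```python
def response_name_to_text(raw: bytes) -> str:
    """Convert a dot-separated name from a response into a Unicode string.

    Labels carrying the "xn--" prefix are decoded as A-labels; all other
    labels are decoded as UTF-8 (which also covers plain ASCII).
    """
    labels = []
    for label in raw.split(b"."):  # 0x2E never occurs inside a multi-octet UTF-8 sequence
        if label[:4].lower() == b"xn--":
            labels.append(label.lower().decode("idna"))
        else:
            labels.append(label.decode("utf-8"))
    return ".".join(labels)
```

A name that mixes the two forms, such as an A-label followed by a UTF-8 label, is returned to the application in a single encoding.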

Indeed the choice of conversion within the resolver libraries is consistent with the quote from Section 6.2 of the original IDNA specification [RFC3490] stating that conversion using the Punycode algorithm (i.e., to A-labels) "might be performed inside these new versions of the resolver libraries".

That said, some application-layer protocols (e.g., EPP Domain Name Mapping [RFC5731]) are defined to use A-labels rather than simply using UTF-8 as recommended by the IETF character sets and languages policy [RFC2277]. In this case, an application may receive a string containing A-labels and want to pass it to name resolution APIs. Again the recommendation that a resolver library be more liberal in what it would accept from an application would mean that such a name would be accepted and re-encoded as needed, rather than requiring the application to do so.

It is important that any APIs used by applications to pass names specify what encoding(s) the API uses. For example, GetAddrInfoW() on Windows specifies that it accepts UTF-16 and only UTF-16. In contrast, the original specification of getaddrinfo() [RFC3493] does not, and hence platforms vary in what they use (e.g., Mac OS uses UTF-8 whereas Windows uses Windows code pages).

Finally, the question remains about what, if anything, a DNS server should do to handle cases where some existing applications or hosts do IDNA queries using A-labels within the local network using a private namespace, and other existing applications or hosts send UTF-8 queries. It is undesirable to store different records for different encodings of the same name, since this introduces the possibility for inconsistency between them. Instead, a new DNS server serving a private namespace using UTF-8 could potentially treat encoding-conversion in the same way as case-insensitive comparison which a DNS server is already required to do, as long as the DNS server has some way to know what the encoding is. Two encodings are, in this sense, two representations of the same name, just as two case-different strings are. However, whereas case comparison of non-ASCII characters is complicated by ambiguities (as explained in the IAB's Review and Recommendations for Internationalized Domain Names [RFC4690]), encoding conversion between A-labels and U-labels is unambiguous.
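
A server-side treatment along these lines might be sketched as follows in Python. The Zone class and canonical_key function are hypothetical and greatly simplified: records are stored once under a single key, ASCII letters are lowercased as DNS comparison already requires, A-labels are decoded with Python's "idna" codec, and non-ASCII characters are deliberately not case-folded.

```python
from typing import Dict, Optional


def canonical_key(raw: bytes) -> str:
    """Map either encoding of a name to one lookup key."""
    out = []
    for label in raw.split(b"."):
        label = label.lower()  # bytes.lower() changes ASCII letters only
        if label.startswith(b"xn--"):
            out.append(label.decode("idna"))   # A-label to Unicode
        else:
            out.append(label.decode("utf-8"))  # UTF-8 (or ASCII) to Unicode
    return ".".join(out)


class Zone:
    """Toy record store holding one record per name, whatever its encoding."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}

    def add(self, raw_name: bytes, value: str) -> None:
        self._records[canonical_key(raw_name)] = value

    def find(self, raw_name: bytes) -> Optional[str]:
        return self._records.get(canonical_key(raw_name))
```

A record added under the UTF-8 form of a name is then found by a query that uses the A-label form, and the reverse, without storing two copies.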

5. Security Considerations

Having applications convert names to prefixed ACE format (A-labels) before calling name resolution can result in security vulnerabilities. If the name is resolved by protocols or in zones for which records are registered using other encoding schemes, an attacker can claim the A-label version of the same name and hence trick the victim into accessing a different destination. This can be done for any non-ASCII name, even when there is no possible confusion due to case, language, or other issues. Other types of confusion beyond those resulting simply from the choice of encoding scheme are discussed in "Review and Recommendations for IDNs" [RFC4690].

Designers and users of encodings that represent Unicode strings in terms of ASCII should also consider whether trademark protection or phishing are issues, e.g., if one name would be encoded in a way that would be naturally associated with another organization or product.

6. Acknowledgements

The authors wish to thank Patrik Faltstrom, Martin Duerst, JFC Morfin, Ran Atkinson, S. Moonesamy, Paul Hoffman, and Stephane Bortzmeyer for their careful review and helpful suggestions. It is also interesting to note that none of the first three individuals' names above can be spelled out and written correctly in ASCII text. Furthermore, one of the IAB members' names below (Andrei Robachevsky) cannot be written in the script as it appears on his birth certificate.

7. IAB Members at the Time of Approval

Bernard Aboba
Marcelo Bagnulo
Ross Callon
Spencer Dawkins
Vijay Gill
Russ Housley
John Klensin
Olaf Kolkman
Danny McPherson
Jon Peterson
Andrei Robachevsky
Dave Thaler
Hannes Tschofenig

8. References

8.1. Normative References

[10646] International Organization for Standardization, "Information Technology - Universal Multiple-octet coded Character Set (UCS)".

ISO/IEC Standard 10646, comprised of ISO/IEC 10646-1:2000, "Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane", ISO/IEC 10646-2:2001, "Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 2: Supplementary Planes" and ISO/IEC 10646-1:2000/Amd 1:2002, "Mathematical symbols and other characters".

[Unicode] The Unicode Consortium. The Unicode Standard, Version 5.1.0, defined by: "The Unicode Standard, Version 5.0", Boston, MA, Addison-Wesley, 2007, ISBN 0-321-48091-0, as amended by Unicode 5.1.0 (http://www.unicode.org/versions/Unicode5.1.0/).

8.2. Informative References

[DNS-MULTICAST] Cheshire, S. and M. Krochmal, "Multicast DNS", Work in Progress, February 2011.

[IDN-PUNYCODE] Costello, A., "Punycode version 0.3.3", Work in Progress, January 2002.

[ISO8859] International Organization for Standardization, "Information technology -- 8-bit single-byte coded graphic character sets".

ISO/IEC Standard 8859, comprised of ISO/IEC 8859-1:1998, Part 1: Latin alphabet No. 1 - ISO/IEC 8859-2:1999, Part 2: Latin alphabet No. 2 - ISO/IEC 8859-3:1999, Part 3: Latin alphabet No. 3 - ISO/IEC 8859-4:1998, Part 4: Latin alphabet No. 4 - ISO/IEC 8859-5:1999, Part 5: Latin/Cyrillic alphabet - ISO/IEC 8859-6:1999, Part 6: Latin/Arabic alphabet - ISO/IEC 8859-7:2003, Part 7: Latin/Greek alphabet - ISO/IEC 8859-8:1999, Part 8: Latin/Hebrew alphabet - ISO/IEC 8859-9:1999, Part 9: Latin alphabet No. 5 - ISO/IEC 8859-10:1998, Part 10: Latin alphabet No. 6 - ISO/IEC 8859-11:2001, Part 11: Latin/Thai alphabet - ISO/IEC 8859-13:1998, Part 13: Latin alphabet No. 7 - ISO/IEC 8859-14:1998, Part 14: Latin alphabet No. 8 (Celtic) - ISO/IEC 8859-15:1999, Part 15: Latin alphabet No. 9 - ISO/IEC 8859-16:2001, Part 16: Latin alphabet No. 10.

[MJD] Duerst, M., "The Properties and Promises of UTF-8", 11th International Unicode Conference, San Jose, September 1997, <http://www.ifi.unizh.ch/mml/mduerst/papers/PDF/IUC11-UTF-8.pdf>.

[NIS] Sun Microsystems, "System and Network Administration", March 1990.

[RFC0821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, August 1982.

[RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet host table specification", RFC 952, October 1985.

[RFC1001] NetBIOS Working Group, "Protocol standard for a NetBIOS service on a TCP/UDP transport: Concepts and methods", STD 19, RFC 1001, March 1987.

[RFC1002] NetBIOS Working Group, "Protocol standard for a NetBIOS service on a TCP/UDP transport: Detailed specifications", STD 19, RFC 1002, March 1987.

[RFC1034] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.

[RFC1123] Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, October 1989.

[RFC1468] Murai, J., Crispin, M., and E. van der Poel, "Japanese Character Encoding for Internet Messages", RFC 1468, June 1993.

[RFC1535] Gavron, E., "A Security Problem and Proposed Correction With Widely Deployed DNS Software", RFC 1535, October 1993.

[RFC1536] Kumar, A., Postel, J., Neuman, C., Danzig, P., and S. Miller, "Common DNS Implementation Errors and Suggested Fixes", RFC 1536, October 1993.

[RFC2130] Weider, C., Preston, C., Simonsen, K., Alvestrand, H., Atkinson, R., Crispin, M., and P. Svanberg, "The Report of the IAB Character Set Workshop held 29 February - 1 March, 1996", RFC 2130, April 1997.

[RFC2136] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound, "Dynamic Updates in the Domain Name System (DNS UPDATE)", RFC 2136, April 1997.

[RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS Specification", RFC 2181, July 1997.

[RFC2277] Alvestrand, H., "IETF Policy on Character Sets and Languages", BCP 18, RFC 2277, January 1998.

[RFC3397] Aboba, B. and S. Cheshire, "Dynamic Host Configuration Protocol (DHCP) Domain Search Option", RFC 3397, November 2002.

[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, "Internationalizing Domain Names in Applications (IDNA)", RFC 3490, March 2003.

[RFC3492] Costello, A., "Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)", RFC 3492, March 2003.

[RFC3493] Gilligan, R., Thomson, S., Bound, J., McCann, J., and W. Stevens, "Basic Socket Interface Extensions for IPv6", RFC 3493, February 2003.

[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, November 2003.

[RFC3646] Droms, R., "DNS Configuration options for Dynamic Host Configuration Protocol for IPv6 (DHCPv6)", RFC 3646, December 2003.

[RFC4343] Eastlake, D., "Domain Name System (DNS) Case Insensitivity Clarification", RFC 4343, January 2006.

[RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and Recommendations for Internationalized Domain Names (IDNs)", RFC 4690, September 2006.

[RFC4795] Aboba, B., Thaler, D., and L. Esibov, "Link-local Multicast Name Resolution (LLMNR)", RFC 4795, January 2007.

[RFC4952] Klensin, J. and Y. Ko, "Overview and Framework for Internationalized Email", RFC 4952, July 2007.

[RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network Interchange", RFC 5198, March 2008.

[RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, October 2008.

[RFC5731] Hollenbeck, S., "Extensible Provisioning Protocol (EPP) Domain Name Mapping", STD 69, RFC 5731, August 2009.

[RFC5890] Klensin, J., "Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework", RFC 5890, August 2010.

[RFC5891] Klensin, J., "Internationalized Domain Names in Applications (IDNA): Protocol", RFC 5891, August 2010.

[UTF8-DNS] Kwan, S. and J. Gilroy, "Using the UTF-8 Character Set in the Domain Name System", Work in Progress, November 1997.

Authors' Addresses

   Dave Thaler
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052
   USA

   Phone: +1 425 703 8835
   EMail: dthaler@microsoft.com
        

   John C Klensin
   1770 Massachusetts Ave, Ste 322
   Cambridge, MA 02140

   Phone: +1 617 245 1457
   EMail: john+ietf@jck.com
        

   Stuart Cheshire
   Apple Inc.
   1 Infinite Loop
   Cupertino, CA 95014

   Phone: +1 408 974 3207
   EMail: cheshire@apple.com
        