Aleksandar Kostadinov wrote:
Jonathan Halliday wrote, On 10/11/2007 05:39 PM (EEST):
> Jason T. Greene wrote:
>> In reality there is nothing wrong with [underscores],
>> they just aren't allowed in DNS.
> That was correct in ARPANET times, but it's been untrue for quite a
> while, even before internationalized domain names became widely
> supported. DNS itself (as in the RFCs) does not make such a restriction.
> Certain overly conservative client programs may.
Try to register one...
lol. Actually this corner case is a great way of confusing
the customer support staff at most domain registrars. The
reasons this registration is more or less impossible at
present are somewhat convoluted. Warning: trivia ahead...
Internationalized domain names (IDNs) are basically any names
that contain characters other than those allowed under the LDH
rule, which is as follows (RFC 3696, section 2):
"The LDH rule, as updated, provides that the labels (words
or strings separated by periods) that make up a domain name
must consist of only the ASCII [ASCII] alphabetic and
numeric characters, plus the hyphen. No other symbols or
punctuation characters are permitted, nor is blank space.
If the hyphen is used, it is not permitted to appear at
either the beginning or end of a label. There is an
additional rule that essentially requires that top-level
domain names not be all-numeric."
So LDH is basically a proper subset of ASCII. This becomes
important later.
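
For anyone who wants to poke at this, the quoted rule can be sketched as a small Python check. This is only my reading of the RFC 3696 text above, not code from any registry; the names `is_ldh_label` and `is_ldh_name` are made up for illustration, and length limits are deliberately ignored because the quoted passage does not mention them.

```python
import string

# Characters the LDH rule allows: ASCII letters, digits and the hyphen.
LDH_CHARS = set(string.ascii_letters + string.digits + "-")


def is_ldh_label(label: str) -> bool:
    """Return True if one label satisfies the LDH rule as quoted above."""
    if not label:
        return False
    # A hyphen may not appear at either end of a label.
    if label.startswith("-") or label.endswith("-"):
        return False
    return all(ch in LDH_CHARS for ch in label)


def is_ldh_name(name: str) -> bool:
    """Check every label, plus the 'top-level label not all-numeric' rule."""
    labels = name.split(".")
    if not all(is_ldh_label(label) for label in labels):
        return False
    return not labels[-1].isdigit()


print(is_ldh_name("www.example.com"))      # True
print(is_ldh_name("my_host.example.com"))  # False
```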
Internationalized domains are not stuffed into the DNS as raw
Unicode, even though the DNS allows arbitrary octets, because
that would be way too straightforward and might break really
ancient clients that can only deal with LDH values.
Instead, the Unicode value is transformed into an ASCII
Compatible Encoding (ACE) string, which is what actually gets
written into DNS. This approach was cooked up by the IETF and
Verisign (home of .com and .net). Most other registries
followed their lead. A browser plugin deals with taking what
the user types and converting it to an ACE value before
handing it off to the resolver library, which therefore does
not need to be IDN aware.
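
To make the conversion concrete, here is a rough sketch using Python's built-in "idna" codec, which implements the IDNA 2003 flavour of this scheme (nameprep plus Punycode, per label). I am assuming that is close enough to what the browser side does; the helper names `to_ace` and `from_ace` are mine.

```python
def to_ace(name: str) -> str:
    """Convert a Unicode domain name to its ACE form, label by label."""
    return name.encode("idna").decode("ascii")


def from_ace(name: str) -> str:
    """Convert an ACE domain name back to Unicode."""
    return name.encode("ascii").decode("idna")


# "\u00fc" is u-umlaut, written as an escape to keep this file pure ASCII.
print(to_ace("b\u00fccher.example"))      # xn--bcher-kva.example
print(from_ace("xn--bcher-kva.example"))
print(to_ace("www.example.com"))          # plain LDH names pass through as-is
```

The resolver library only ever sees the "xn--" form, which is why it can stay ignorant of IDNs.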
Here is the catch: The specific ACE algorithm they chose
does not translate the '_' character. It's valid ASCII, so
it goes through unaltered. Indeed, the output string is
valid ASCII so arguably the ACE is doing its job.
However, the output string is not valid under the LDH
rule. So it can't go into the DNS as long as the registry's
policy remains to keep using only LDH.
Arggggg. So you can register domains with weird Unicode
characters nobody in their right mind would use, but only if
the chosen ACE can encode them not only to ASCII but
specifically to LDH. But the ACE is not LDH aware, so it
doesn't always do that. Basically inputs that have any
character that is ASCII but not LDH may have problems,
whilst Unicode values outside the ASCII range are fine. Oops.
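
A quick way to see the gap, again leaning on Python's built-in "idna" codec as a stand-in for the ACE (an assumption on my part; a given registry's tooling may be stricter). The helpers `ace_label` and `is_ldh_label` are hypothetical names for this sketch only.

```python
import string

# Characters the LDH rule allows: ASCII letters, digits and the hyphen.
LDH_CHARS = set(string.ascii_letters + string.digits + "-")


def is_ldh_label(label: str) -> bool:
    """True if the label is non-empty, LDH-only, with no hyphen at either end."""
    return (
        bool(label)
        and not label.startswith("-")
        and not label.endswith("-")
        and all(ch in LDH_CHARS for ch in label)
    )


def ace_label(label: str) -> str:
    """ACE-encode a single label with the built-in IDNA 2003 codec."""
    return label.encode("idna").decode("ascii")


# "\u00fc" is u-umlaut, written as an escape to keep this file pure ASCII.
for label in ["b\u00fccher", "foo_bar", "b\u00fccher_shop"]:
    ace = ace_label(label)
    print(f"{label!r} -> {ace!r}  LDH-valid: {is_ldh_label(ace)}")
```

The non-ASCII character comes out as clean LDH, while the underscore is carried through untouched in both the pure-ASCII and the mixed case, so those two outputs fail the LDH check.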
Things get really interesting where the second level domain
is an IDN using LDH but the host part at the third level is
a genuine Unicode value that has not been through an ACE.
That's possible in cases where an organization is running
its own DNS servers for the third level, whilst the second
level is on .com registry public servers.
The resulting interaction between the browser plugin (which
is ACE encoding the entire name, host+domain), the client
resolver library (which may or may not be ACE and Unicode
aware) and the various DNS servers, which may or may not be
operating recursively and may or may not each variously be
ACE and/or Unicode aware, has the potential to be rather
messy to say the least. Interesting times ahead as this
becomes more prevalent I think.
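
One way to picture the mismatch: on the wire a name is just a series of length-prefixed octet strings, so a raw label and its ACE form are simply different names as far as a server is concerned. The sketch below assumes the raw third-level label is stored as UTF-8, which is my guess rather than anything the protocol mandates, and `wire_name` is a hypothetical helper.

```python
from typing import Iterable


def wire_name(labels: Iterable[bytes]) -> bytes:
    """Encode labels in DNS wire format: length octet, label octets, root."""
    out = bytearray()
    for label in labels:
        if not 0 < len(label) <= 63:
            raise ValueError("each label must be 1..63 octets")
        out.append(len(label))
        out += label
    out.append(0)  # zero-length root label terminates the name
    return bytes(out)


# Host label as raw UTF-8 (assumed encoding) under an ACE second level...
raw = wire_name(["b\u00fccher".encode("utf-8"), b"xn--bcher-kva", b"com"])
# ...versus what a client sends after ACE-encoding the entire name.
ace = wire_name([b"xn--bcher-kva", b"xn--bcher-kva", b"com"])

print(raw)
print(ace)
print(raw == ace)  # False: the query does not match the stored name
```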
Not that DNS was exactly straightforward to start with:
http://www.ccc.de/congress/2004/fahrplan/files/297-black-ops-of-dns-slide...
Jonathan.
--
------------------------------------------------------------
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111
Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom.
Registered in UK and Wales under Company Registration No.
3798903 Directors: Michael Cunningham (USA), Charlie Peters
(USA) and David Owens (Ireland)