Domain hacks with unusual Unicode characters

(Original text by @edent)

Unicode contains a range of symbols which don’t get much use. For example, there are separate symbols for TradeMark — ™, Service Mark — ℠, and Prescriptions — ℞.

Nestling among the «Letterlike Symbols» are two curious entries, ℡ (TELEPHONE SIGN) and № (NUMERO SIGN). Both of these are single characters.

What’s interesting is that both .tel and .no are Top-Level Domains (TLDs) in the Domain Name System (DNS).

So my contact site — https://edent.tel/ — can be written as — https://edent.℡/

And the Norwegian domain name registry NORID can be accessed at https://www.norid.№/

Copy and paste those links — they work in any browser!

Is this limited to TLDs?

No! This works ANYWHERE in a domain name. Copy and paste these examples:

  • Script https://ℰ𝓍𝒶ℳ𝓅ℒℰ.𝒸ℴℳ/
  • Math bold https://𝐞𝐱𝐚𝐦𝐩𝐥𝐞.𝐜𝐨𝐦/
  • Fraktur https://𝔢𝔵𝔞𝔪𝔭𝔩𝔢.𝔠𝔬𝔪/
  • Math bold italic https://𝒆𝒙𝒂𝒎𝒑𝒍𝒆.𝒄𝒐𝒎/
  • Math bold script https://𝓮𝔁𝓪𝓶𝓹𝓵𝓮.𝓬𝓸𝓶/
  • Double struck https://𝕖𝕩𝕒𝕞𝕡𝕝𝕖.𝕔𝕠𝕞/
  • Monospace https://𝚎𝚡𝚊𝚖𝚙𝚕𝚎.𝚌𝚘𝚖/
  • Superscript https://ᵉˣᵃᵐᵖˡᵉ.ᶜᵒᵐ/
  • Subscript https://ₑₓₐₘₚₗₑ.cₒₘ/ NB not all characters supported
  • Math sans bold https://𝗲𝘅𝗮𝗺𝗽𝗹𝗲.𝗰𝗼𝗺/
  • Math sans bold italic https://𝙚𝙭𝙖𝙢𝙥𝙡𝙚.𝙘𝙤𝙢/
  • Math sans italic https://𝘦𝘹𝘢𝘮𝘱𝘭𝘦.𝘤𝘰𝘮/
  • Math squared https://🄴🅇🄰🄼🄿🄻🄴.🄲🄾🄼/ NB the dot must not be squared
  • Circled https://ⓔⓧⓐⓜⓟⓛⓔ.ⓒⓞⓜ/ NB the dot must not be circled
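As a rough check that names like these collapse to the same plain hostname, here is a minimal Python sketch. It assumes the browser's mapping behaves roughly like Unicode compatibility normalisation (NFKC) followed by lowercasing, which is a simplification of what browsers really do. The helper names `stylise` and `to_plain` and the `STYLE_BASES` table are made up for this illustration.

```python
import unicodedata

# Code point of the lowercase "a" in a few contiguous Unicode letter runs.
# Only styles whose a-z range has no gaps are listed (assumption of this sketch).
STYLE_BASES = {
    "math bold": 0x1D41A,
    "fraktur": 0x1D51E,
    "double struck": 0x1D552,
    "monospace": 0x1D68A,
    "circled": 0x24D0,
}


def stylise(hostname, base):
    """Swap each ASCII a-z for its styled twin; leave dots and the rest alone."""
    return "".join(
        chr(base + ord(c) - ord("a")) if "a" <= c <= "z" else c
        for c in hostname
    )


def to_plain(hostname):
    """Approximate the browser mapping: NFKC normalisation, then lowercase."""
    return unicodedata.normalize("NFKC", hostname).lower()


for style, base in STYLE_BASES.items():
    fancy = stylise("example.com", base)
    print(f"{style:>13}: {fancy} -> {to_plain(fancy)}")
```

The dot is deliberately left as a plain full stop, matching the note above that it must not be squared or circled.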

There are a whole bunch more miscellaneous characters you can use.

How does this work?

Magic! Which is to say, I think it is the browser doing the conversion. DNS Servers don’t successfully reply to queries about .℡ domains.

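The browser sees the .℡ and, as far as I can tell, follows the IDNA2008 mapping process described in RFC 5895 to normalise it, replacing the symbol with its decomposition mapping from the Unicode Character Database.

The ℡ entry in UnicodeData.txt is:

2121;TELEPHONE SIGN;So;0;ON;<compat> 0054 0045 004C;;;;N;T E L SYMBOL;;;;

U+0054 is T, U+0045 is E, U+004C is L. Lowercasing TEL then gives tel.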

You can test this in Python 3 using:


python3 -c 'import sys; print(sys.argv[1].encode("idna").decode("ascii"))' "℡"
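To see the individual steps rather than just the end result, the standard `unicodedata` module can show a character's name, its raw decomposition field, and its normalised form. This is only a sketch: the `explain` helper is a name invented here, and NFKC plus lowercasing is used as a stand-in for the browser's mapping rather than an exact copy of it.

```python
import unicodedata


def explain(char):
    """Return (Unicode name, raw decomposition field, NFKC form lowercased)."""
    return (
        unicodedata.name(char),
        unicodedata.decomposition(char),
        unicodedata.normalize("NFKC", char).lower(),
    )


print(explain("\u2121"))  # the TELEPHONE SIGN discussed above
print(explain("\u2116"))  # the NUMERO SIGN used for .no
```

The decomposition field printed for ℡ is the same `<compat> 0054 0045 004C` value that appears in the database entry quoted earlier.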

Does this work?

Yes! I asked people on Twitter whether they could access my website using a .℡ — and it appeared to work on every modern browser and operating system.

It even works on command line tools like wget and curl.

It does fail in some circumstances:

What are the limitations?

Two main ones:

  • Sites like Twitter and Facebook don’t recognise it as a valid URL and refuse to auto-link it.
  • Some command line tools like dig and host don’t understand it:


dig edent.℡

; <<>> DiG 9.10.6 <<>> edent.℡
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 55282

Is this useful?

Obviously yes. This may be the most important discovery of the decade. You get cool-looking URLs and get to save a couple of characters on specific domains, at the minor expense of working inconsistently.

It could also be used for evading URL filters.
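To make the filter problem concrete, here is a small Python sketch of a blocklist check that compares hostnames as typed, next to one that normalises first. The blocklist contents and both function names are hypothetical, and NFKC plus lowercasing only approximates what a browser does, so treat it as an illustration rather than a complete defence.

```python
import unicodedata
from urllib.parse import urlsplit

# Hypothetical blocklist, used purely for illustration.
BLOCKED_HOSTS = {"edent.tel"}


def naive_is_blocked(url):
    """Compare the hostname exactly as written in the URL."""
    host = urlsplit(url).hostname or ""
    return host in BLOCKED_HOSTS


def normalised_is_blocked(url):
    """Normalise the hostname (NFKC, then lowercase) before comparing."""
    host = urlsplit(url).hostname or ""
    plain = unicodedata.normalize("NFKC", host).lower()
    return plain in BLOCKED_HOSTS


for url in ("https://edent.tel/", "https://edent.\u2121/"):
    print(url, naive_is_blocked(url), normalised_is_blocked(url))
```

The naive check misses the ℡ form because the string never literally contains "tel", while the normalising check treats both spellings as the same host.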

Every modern browser supports these «fancy» domain names — but most websites won’t automatically link to them. So sharing on Facebook doesn’t work.

Where can it be used?

Here are the single characters which can be normalised down to a valid TLD. They’re mostly country codes, but there are a few interesting exceptions:

  • ㏕ — US Military
  • ℡ — .tel registry
  • № — Norway
  • ㍳ — Australia
  • ㍷ — Dominica
  • ㎀ — Panama
  • ㎁ — Namibia
  • ㎃ — Morocco
  • ㎊ — French Polynesia
  • ㎋ — Norfolk Island
  • ㎏ — Kyrgyzstan
  • ㎖ — Mali
  • ㎙ — Federated States of Micronesia
  • ﬁ — Finland
  • ㎜ — Myanmar
  • ㎝ — Cameroon
  • ㎞ & ㏎ — Comoros
  • ㎰ — Palestine
  • ㎳ — Montserrat
  • ㎷ & ㎹ — Republic of Maldives
  • ㎺ — Palau
  • ㎽ & ㎿ — Malawi
  • ㏄ — Cocos (Keeling) Islands
  • ㏅ — Democratic Republic of Congo
  • ㏉ — Guyana
  • ㏗ — Philippines
  • ㏘ — Saint Pierre and Miquelon
  • ㏚ — Puerto Rico
  • ㏛ — Suriname
  • ㏜ — El Salvador
  • ℠ — San Marino
  • ™ — Turkmenistan
  • ﬅ & ﬆ — São Tomé and Príncipe
  • ㎇ — Great Britain (Obsolete)
  • ß — South Sudan (Not available)
  • ㏌ — India and Indiana (subdomain of .us)
  • ⅵ & Ⅵ — Virgin Islands and Virginia (subdomain of .us)
  • ﬂ — Florida (subdomain of .us)
  • ㎚ — New Mexico (subdomain of .us)
  • ㎵ — Nevada (subdomain of .us)
  • ㍵ — As part of .ovh
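A rough way to hunt for more candidates is to walk every Unicode code point and keep the ones whose normalised form equals a known TLD. The sketch below does that in Python with a tiny sample of TLDs; the function name `single_char_tlds` and the sample set are choices made for this illustration, and NFKC plus lowercasing is again only an approximation of the mapping browsers apply.

```python
import unicodedata


def single_char_tlds(tlds, max_codepoint=0x110000):
    """Map each TLD to the single non-ASCII characters that normalise to it."""
    wanted = {t.lower() for t in tlds}
    found = {t: [] for t in sorted(wanted)}
    for cp in range(0x80, max_codepoint):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogates, which are not real characters
        char = chr(cp)
        plain = unicodedata.normalize("NFKC", char).lower()
        if plain in wanted:
            found[plain].append(char)
    return found


# A tiny sample; a real run would load the full list of TLDs.
candidates = single_char_tlds({"tel", "no", "tm", "sm", "fi"})
for tld, chars in candidates.items():
    print(tld, " ".join(f"{c} (U+{ord(c):04X})" for c in chars))
```

Because `lower()` leaves ß untouched, this scan would not report it for .ss; swapping in `casefold()` would, since that maps ß to ss. A hit here only means the character normalises to the right letters, so each one still needs testing in a real browser.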

If you can find any more, please stick a comment in the box below.

You can always reach this blog post at:

https://????????????ₛᵖ????.ⓜ????????????/????????????/

