HOW TO USE UNICODE NCRs IN A WEB PAGE (HTML File)
NCR stands for Numeric Character Reference. Any character in
Unicode can be included as text in an HTML file, and such a file can be read
by all modern web browsers, including Opera, Netscape, and Internet Explorer.
Microsoft® offers many free Unicode-based fonts covering many
different scripts, including Chinese.
Once you've made sure that the latest version of your favorite browser is
installed on your system, you can visit:
http://www.microsoft.com/windows/ie/downloads/recommended/ime/default.asp
... This is Microsoft's IME page, which offers language packs, including fonts
and input methods, for various writing systems.
There are several fine Unicode-based commercial fonts available on the World
Wide Web.
Mr. Ronald Ogawa has a very nice font, currently available as beta-test freeware,
which includes Latin, Cyrillic, Greek, and the UCAS (Unified Canadian Aboriginal
Syllabics). The UCAS are used for writing languages such as Cree, Naskapi, Ojibwe,
and Inuktitut. The font is called "Ballymun RO" and is available at:
http://nexus.brocku.ca/rogawa/ucas
How To Add Special Characters (to HTML):
The web browser substitutes special characters from fonts whenever it finds
this sequence:
the ampersand symbol, the number sign, 99999, the semicolon (that is, &#99999;)
(where 99999 can be any decimal number up to 1114111, the highest Unicode code point)
So, &#65; will produce the capital letter "A" because
the number 65 is the decimal code point assigned in Unicode for "A".
Of course, most of us would simply type the letter "A".
If the trademark symbol "™" is needed, however, it doesn't
appear on most keyboards.
&#8482; will produce the trademark symbol.
&#1044;&#1086;&#1083;&#1080;&#1085;&#1072; &#1050;&#1091;&#1082;&#1086;&#1083;
will produce "Dolina Kukol" in the Cyrillic script. This is the book
title "Valley of the Dolls" in Russian:
Долина Кукол.
(Many people know that
Долина Кукол
was written by Жаклин Сьюзан.)
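The substitution described above can be tried out without a browser. The following is only a minimal sketch in Python, assuming nothing beyond the standard library; the helper name to_ncr is made up for this illustration and is not part of any tool mentioned here.

```python
import html


def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a decimal NCR such as &#8482;.

    ASCII characters (including & and <) are passed through unchanged,
    so this is not a general-purpose HTML escaper.
    """
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


title = "Долина Кукол"
encoded = to_ncr(title)
print(encoded)

# html.unescape resolves NCRs back into characters, much as a browser does.
print(html.unescape("&#65;"))
print(html.unescape("&#8482;"))
print(html.unescape(encoded) == title)
```

Going in both directions is a handy check that the numbers typed into the HTML file really are the code points intended.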
Although Unicode is comprehensive, it isn't yet complete. The Unicode
Consortium welcomes input from users of the world's various scripts.
It is possible to represent any of the letter-plus-diacritic combinations
found in Vietnamese with a single Unicode NCR (at least, as far as I can
tell). But some languages have combinations which aren't included in
Unicode as "precomposed" characters. One example is the Guarani language
which uses the letter g combined with the tilde. When a precomposed form
isn't directly encoded in Unicode it is necessary to use one of the characters
found in the combining diacritic range. So, to get the Latin letter 'g with
tilde', use the letter g followed by the NCR for the tilde as a combining
diacritic (U+0303, decimal 771). Thus, "g&#771;" should produce the symbol
"g̃".
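One way to check whether a letter-plus-diacritic combination has a precomposed form is to ask a Unicode library to compose it. The sketch below uses Python's standard unicodedata module; the variable names are my own, and the Vietnamese letter chosen (e with circumflex and acute) is just one sample, not a survey of the whole alphabet.

```python
import unicodedata

COMBINING_TILDE = "\u0303"  # decimal 771, from the combining diacritic range

# Guarani g with tilde: NFC composition leaves it as two code points,
# because Unicode has no precomposed character for it.
g_tilde = unicodedata.normalize("NFC", "g" + COMBINING_TILDE)
print(len(g_tilde))
print("".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in g_tilde))

# Vietnamese e + circumflex + acute: NFC composes it into one code point,
# so a single NCR is enough.
e_circ_acute = unicodedata.normalize("NFC", "e\u0302\u0301")
print(len(e_circ_acute))
print(f"&#{ord(e_circ_acute)};")
```

If the composed result is still longer than one code point, the base letter followed by the combining NCR is the form to put in the HTML file.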
Many of the newer e-mail programs can be set to handle HTML, so multilingual
e-mail is possible.
Several word processors allow "global search and replace" which
means that the word processor could substitute the NCR-macro any time it
finds a certain letter or combination of letters. For example, if the
HTML sheet needs a lot of the trademark symbols, the author could use
any keyboard symbol which isn't needed in the document and then "Find
and Replace" every appearance of that symbol with the desired NCR macro.
(I use the ` character, and I could replace all the ` signs with &#8482;)
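The same placeholder trick can be done outside a word processor. This is a small sketch in Python, assuming the backtick never appears in the document for any other purpose; the sample sentence and the variable names are invented for the illustration.

```python
PLACEHOLDER = "`"          # a keyboard symbol not otherwise needed in the document
TRADEMARK_NCR = "&#8482;"  # decimal NCR for the trademark symbol

draft = "<p>Widget` and Gadget` are sold separately.</p>"

# Global search and replace: every placeholder becomes the NCR.
page = draft.replace(PLACEHOLDER, TRADEMARK_NCR)
print(page)
```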
The more sophisticated word processors allow for a series of "global
search and replace" operations to be "programmed".
So, someone wishing to set type in the Cherokee script would be able to
type phonetically using the Latin script and then use the pre-programmed
series to convert the Latin script file into Cherokee Unicode NCRs:
For example:
FIND AND REPLACE ALL te WITH &#5078;
FIND AND REPLACE ALL di WITH &#5079;
FIND AND REPLACE ALL ti WITH &#5080;
FIND AND REPLACE ALL do WITH &#5081;
FIND AND REPLACE ALL du WITH &#5082;
et cetera.
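A programmed series like the one above can also be sketched as a short script. The version below covers only the five syllables listed (Cherokee code points 5078 through 5082) and, instead of running five separate global replacements, makes a single pass with a regular expression so that a later rule can never re-match text produced by an earlier one. That single-pass approach is my substitution, not a feature of any particular word processor, and the function name is hypothetical.

```python
import html
import re

# Latin-script syllable -> decimal NCR for the Cherokee syllable.
# Only the five examples from the list above; a real table would be longer.
SYLLABLE_TO_NCR = {
    "te": "&#5078;",
    "di": "&#5079;",
    "ti": "&#5080;",
    "do": "&#5081;",
    "du": "&#5082;",
}

# Longest keys first, so longer syllables would win over shorter prefixes
# if the table were extended.
_PATTERN = re.compile(
    "|".join(sorted(map(re.escape, SYLLABLE_TO_NCR), key=len, reverse=True))
)


def latin_to_cherokee_ncr(text: str) -> str:
    """Replace each known syllable with its NCR in one left-to-right pass."""
    return _PATTERN.sub(lambda match: SYLLABLE_TO_NCR[match.group(0)], text)


converted = latin_to_cherokee_ncr("di do du")
print(converted)
print(html.unescape(converted))  # what a browser with a Cherokee font would show
```

Anything not in the table, such as spaces or unlisted syllables, passes through unchanged, which makes gaps in the table easy to spot in the output.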
Folks used to have to make gifs or bmps (picture files) of any special symbol
or script and then insert the gifs into the document. Picture files take up
a lot of room and sometimes take forever to load. Using the font(s) that are
already installed on the reader's computer saves time and storage
space.
Sometimes, it is necessary to make a picture file of an unusual script.
The HTML author may wish to display a specific type face of a script (as one
example) because the author believes the reader's computer lacks the proper
font(s).
I do this by creating my HTML document, calling it up on my web browser
(off-line, because it is on my hard drive), using the "Screen Capture"
feature in my registered copy of IrfanView32, saving the "capture"
as a Windows BMP (bitmap) file, modifying the bitmap in Windows Paint (if
any modification is necessary, like trimming off the explorer bar),
opening the modified bitmap in IrfanView32,
then finally saving the bitmap as a gif. (Gifs are much smaller than bmps,
and thus take up less space and load much faster.)