The columns in this table are as follows.  The first column contains the MARC-8 code (in hex) for the character as coming from the G0 graphic set.  The second column contains the MARC-8 code (in hex) for the character as coming from the G1 graphic set.  The third column contains the UCS/Unicode 16-bit code (in hex).  The fourth column contains the UTF-8 code (in hex) for the UCS character.  The fifth column contains a representation of the character (where possible).  The sixth column contains the MARC character name, followed by the UCS name; if the MARC name is the same as or very similar to the UCS name, only the UCS name is given.  For some tables, alternate encodings in Unicode and UTF-8 are given; when that occurs, the alternate Unicode and alternate UTF-8 columns follow the character name.  In this table, a further column headed C? sits between the character and its name; it contains a C for each of the combining characters (E0 through FE).
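The relationship between the code columns can be checked mechanically.  The following is a minimal sketch in Python, assuming that the G1 code is the G0 code with the high bit set (as the paired values in the table, such as 21 and A1, suggest) and that the UTF-8 column is the standard UTF-8 encoding of the UCS code point.  The function names `g0_to_g1` and `utf8_hex` are illustrative only and are not part of any MARC library.

```python
def g0_to_g1(g0_code: int) -> int:
    """Return the G1 code for a G0 code by setting the high bit (assumed relationship)."""
    return g0_code | 0x80


def utf8_hex(ucs_code: int) -> str:
    """Return the UTF-8 bytes of a UCS code point as uppercase hex, as in the UTF-8 column."""
    return chr(ucs_code).encode("utf-8").hex().upper()


# Row 21 / A1: UPPERCASE POLISH L, UCS 0141, UTF-8 C581
print(f"{g0_to_g1(0x21):02X}")   # A1
print(utf8_hex(0x0141))          # C581
# Row 48 / C8: EURO SIGN, UCS 20AC, UTF-8 E282AC
print(utf8_hex(0x20AC))          # E282AC
```

A check of this kind is useful mainly for spotting transcription errors when the table is copied into a conversion program.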

Revised June 2004 to add the Eszett (M+C7) and the Euro Sign (M+C8) to the MARC-8 set.

Revised September 2004 to change the mapping from MARC-8 to Unicode for the Ligature (M+EB and M+EC) from U+FE20 and U+FE21 to U+0361.

Revised September 2004 to change the mapping from MARC-8 to Unicode for the Double Tilde (M+FA and M+FB) from U+FE22 and U+FE23 to U+0360.

Revised March 2005 to change the mapping from MARC-8 to Unicode for the Alif (M+2E) from U+02BE to U+02BC.

Not all characters display in all browsers.  You must have one of the font families that shows each character set on your computer.  See the W3C site for a discussion of fonts.

| MARC-8 | MARC-8 as C1 | UCS | UTF-8 | CHAR | C? | NAME | ALT | ALT UTF-8 |
|---|---|---|---|---|---|---|---|---|
|  | 7F-87 |  |  |  |  | [RESERVED] |  |  |
|  | 88 | 0098 | C298 |  |  | NON-SORT BEGIN / START OF STRING |  |  |
|  | 89 | 009C | C29C |  |  | NON-SORT END / STRING TERMINATOR |  |  |
|  | 8A-8C |  |  |  |  | [RESERVED] |  |  |
|  | 8D | 200D | E2808D |  |  | JOINER / ZERO WIDTH JOINER |  |  |
|  | 8E | 200C | E2808C |  |  | NON-JOINER / ZERO WIDTH NON-JOINER |  |  |
|  | 8F-A0 |  |  |  |  | [RESERVED] |  |  |
| 21 | A1 | 0141 | C581 | Ł |  | UPPERCASE POLISH L / LATIN CAPITAL LETTER L WITH STROKE |  |  |
| 22 | A2 | 00D8 | C398 | Ø |  | UPPERCASE SCANDINAVIAN O / LATIN CAPITAL LETTER O WITH STROKE |  |  |
| 23 | A3 | 0110 | C490 | Đ |  | UPPERCASE D WITH CROSSBAR / LATIN CAPITAL LETTER D WITH STROKE |  |  |
| 24 | A4 | 00DE | C39E | Þ |  | UPPERCASE ICELANDIC THORN / LATIN CAPITAL LETTER THORN (Icelandic) |  |  |
| 25 | A5 | 00C6 | C386 | Æ |  | UPPERCASE DIGRAPH AE / LATIN CAPITAL LIGATURE AE |  |  |
| 26 | A6 | 0152 | C592 | Œ |  | UPPERCASE DIGRAPH OE / LATIN CAPITAL LIGATURE OE |  |  |
| 27 | A7 | 02B9 | CAB9 | ʹ |  | SOFT SIGN, PRIME / MODIFIER LETTER PRIME |  |  |
| 28 | A8 | 00B7 | C2B7 | · |  | MIDDLE DOT |  |  |
| 29 | A9 | 266D | E299AD | ♭ |  | MUSIC FLAT SIGN |  |  |
| 2A | AA | 00AE | C2AE | ® |  | PATENT MARK / REGISTERED SIGN |  |  |
| 2B | AB | 00B1 | C2B1 | ± |  | PLUS OR MINUS / PLUS-MINUS SIGN |  |  |
| 2C | AC | 01A0 | C6A0 | Ơ |  | UPPERCASE O-HOOK / LATIN CAPITAL LETTER O WITH HORN |  |  |
| 2D | AD | 01AF | C6AF | Ư |  | UPPERCASE U-HOOK / LATIN CAPITAL LETTER U WITH HORN |  |  |
| 2E | AE | 02BC | CABC | ʼ |  | ALIF / MODIFIER LETTER APOSTROPHE |  |  |
|  | AF |  |  |  |  | [RESERVED] |  |  |
| 30 | B0 | 02BB | CABB | ʻ |  | AYN / MODIFIER LETTER TURNED COMMA |  |  |
| 31 | B1 | 0142 | C582 | ł |  | LOWERCASE POLISH L / LATIN SMALL LETTER L WITH STROKE |  |  |
| 32 | B2 | 00F8 | C3B8 | ø |  | LOWERCASE SCANDINAVIAN O / LATIN SMALL LETTER O WITH STROKE |  |  |
| 33 | B3 | 0111 | C491 | đ |  | LOWERCASE D WITH CROSSBAR / LATIN SMALL LETTER D WITH STROKE |  |  |
| 34 | B4 | 00FE | C3BE | þ |  | LOWERCASE ICELANDIC THORN / LATIN SMALL LETTER THORN (Icelandic) |  |  |
| 35 | B5 | 00E6 | C3A6 | æ |  | LOWERCASE DIGRAPH AE / LATIN SMALL LIGATURE AE |  |  |
| 36 | B6 | 0153 | C593 | œ |  | LOWERCASE DIGRAPH OE / LATIN SMALL LIGATURE OE |  |  |
| 37 | B7 | 02BA | CABA | ʺ |  | HARD SIGN, DOUBLE PRIME / MODIFIER LETTER DOUBLE PRIME |  |  |
| 38 | B8 | 0131 | C4B1 | ı |  | LOWERCASE TURKISH I / LATIN SMALL LETTER DOTLESS I |  |  |
| 39 | B9 | 00A3 | C2A3 | £ |  | BRITISH POUND / POUND SIGN |  |  |
| 3A | BA | 00F0 | C3B0 | ð |  | LOWERCASE ETH / LATIN SMALL LETTER ETH (Icelandic) |  |  |
|  | BB |  |  |  |  | [RESERVED] |  |  |
| 3C | BC | 01A1 | C6A1 | ơ |  | LOWERCASE O-HOOK / LATIN SMALL LETTER O WITH HORN |  |  |
| 3D | BD | 01B0 | C6B0 | ư |  | LOWERCASE U-HOOK / LATIN SMALL LETTER U WITH HORN |  |  |
|  | BE |  |  |  |  | [RESERVED] |  |  |
|  | BF |  |  |  |  | [RESERVED] |  |  |
| 40 | C0 | 00B0 | C2B0 | ° |  | DEGREE SIGN |  |  |
| 41 | C1 | 2113 | E28493 | ℓ |  | SCRIPT SMALL L |  |  |
| 42 | C2 | 2117 | E28497 | ℗ |  | SOUND RECORDING COPYRIGHT |  |  |
| 43 | C3 | 00A9 | C2A9 | © |  | COPYRIGHT SIGN |  |  |
| 44 | C4 | 266F | E299AF | ♯ |  | MUSIC SHARP SIGN |  |  |
| 45 | C5 | 00BF | C2BF | ¿ |  | INVERTED QUESTION MARK |  |  |
| 46 | C6 | 00A1 | C2A1 | ¡ |  | INVERTED EXCLAMATION MARK |  |  |
| 47 | C7 | 00DF | C39F | ß |  | ESZETT SYMBOL |  |  |
| 48 | C8 | 20AC | E282AC | € |  | EURO SIGN |  |  |
|  | C9-DF |  |  |  |  | [RESERVED] |  |  |
| 60 | E0 | 0309 | CC89 | ̉ | C | PSEUDO QUESTION MARK / COMBINING HOOK ABOVE |  |  |
| 61 | E1 | 0300 | CC80 | ̀ | C | GRAVE / COMBINING GRAVE ACCENT (Varia) |  |  |
| 62 | E2 | 0301 | CC81 | ́ | C | ACUTE / COMBINING ACUTE ACCENT (Oxia) |  |  |
| 63 | E3 | 0302 | CC82 | ̂ | C | CIRCUMFLEX / COMBINING CIRCUMFLEX ACCENT |  |  |
| 64 | E4 | 0303 | CC83 | ̃ | C | TILDE / COMBINING TILDE |  |  |
| 65 | E5 | 0304 | CC84 | ̄ | C | MACRON / COMBINING MACRON |  |  |
| 66 | E6 | 0306 | CC86 | ̆ | C | BREVE / COMBINING BREVE (Vrachy) |  |  |
| 67 | E7 | 0307 | CC87 | ̇ | C | SUPERIOR DOT / COMBINING DOT ABOVE |  |  |
| 68 | E8 | 0308 | CC88 | ̈ | C | UMLAUT, DIAERESIS / COMBINING DIAERESIS (Dialytika) |  |  |
| 69 | E9 | 030C | CC8C | ̌ | C | HACEK / COMBINING CARON |  |  |
| 6A | EA | 030A | CC8A | ̊ | C | CIRCLE ABOVE, ANGSTROM / COMBINING RING ABOVE |  |  |
| 6B | EB | 0361 | CDA1 | ͡ | C | LIGATURE, FIRST HALF / COMBINING DOUBLE INVERTED BREVE | FE20 | EFB8A0 |
| 6C | EC | Note 1 |  |  | C | LIGATURE, SECOND HALF / COMBINING LIGATURE RIGHT HALF | FE21 | EFB8A1 |
| 6D | ED | 0315 | CC95 | ̕ | C | HIGH COMMA, OFF CENTER / COMBINING COMMA ABOVE RIGHT |  |  |
| 6E | EE | 030B | CC8B | ̋ | C | DOUBLE ACUTE / COMBINING DOUBLE ACUTE ACCENT |  |  |
| 6F | EF | 0310 | CC90 | ̐ | C | CANDRABINDU / COMBINING CANDRABINDU |  |  |
| 70 | F0 | 0327 | CCA7 | ̧ | C | CEDILLA / COMBINING CEDILLA |  |  |
| 71 | F1 | 0328 | CCA8 | ̨ | C | RIGHT HOOK, OGONEK / COMBINING OGONEK |  |  |
| 72 | F2 | 0323 | CCA3 | ̣ | C | DOT BELOW / COMBINING DOT BELOW |  |  |
| 73 | F3 | 0324 | CCA4 | ̤ | C | DOUBLE DOT BELOW / COMBINING DIAERESIS BELOW |  |  |
| 74 | F4 | 0325 | CCA5 | ̥ | C | CIRCLE BELOW / COMBINING RING BELOW |  |  |
| 75 | F5 | 0333 | CCB3 | ̳ | C | DOUBLE UNDERSCORE / COMBINING DOUBLE LOW LINE |  |  |
| 76 | F6 | 0332 | CCB2 | ̲ | C | UNDERSCORE / COMBINING LOW LINE |  |  |
| 77 | F7 | 0326 | CCA6 | ̦ | C | LEFT HOOK (COMMA BELOW) / COMBINING COMMA BELOW |  |  |
| 78 | F8 | 031C | CC9C | ̜ | C | RIGHT CEDILLA / COMBINING LEFT HALF RING BELOW |  |  |
| 79 | F9 | 032E | CCAE | ̮ | C | UPADHMANIYA / COMBINING BREVE BELOW |  |  |
| 7A | FA | 0360 | CDA0 | ͠ | C | DOUBLE TILDE, FIRST HALF / COMBINING DOUBLE TILDE | FE22 | EFB8A2 |
| 7B | FB | Note 2 |  |  | C | DOUBLE TILDE, SECOND HALF / COMBINING DOUBLE TILDE RIGHT HALF | FE23 | EFB8A3 |
|  | FC |  |  |  |  | [RESERVED] |  |  |
|  | FD |  |  |  |  | [RESERVED] |  |  |
| 7E | FE | 0313 | CC93 | ̓ | C | HIGH COMMA, CENTERED / COMBINING COMMA ABOVE (Psili) |  |  |
|  | FF |  |  |  |  | [RESERVED] |  |  |

Note 1:  The Ligature that spans two characters is constructed of two halves in MARC-8:  EB (Ligature, first half) and EC (Ligature, second half).  The preferred Unicode/UTF-8 mapping is to the single character Ligature that spans two characters, U+0361.  The single character Ligature is encoded between the two characters to be spanned.  The two half Ligatures in Unicode, to which the Ligature has been mapped since 1996, are indicated in the mapping as alternatives, but their use is not recommended.  It is expected that font support for the single character Ligature mark will be more easily obtained than for the two halves.

Note 2:  The Double Tilde that spans two characters is constructed of two halves in MARC-8:  FA (Double Tilde, first half) and FB (Double Tilde, second half).  The preferred Unicode/UTF-8 mapping is to the single character Double Tilde that spans two characters, U+0360.  The single character Double Tilde is encoded between the two characters to be spanned.  The two half Double Tildes in Unicode, to which the MARC-8 Double Tilde has been mapped since 1996, are indicated in the mapping as alternatives, but their use is not recommended.  It is expected that font support for the single character Double Tilde mark will be more easily obtained than for the two halves.
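Notes 1 and 2 describe the same conversion pattern, which can be sketched as follows.  This is a minimal illustration in Python, not a complete MARC-8 decoder.  It assumes that a MARC-8 combining character precedes the character it modifies, that the second-half codes (EC, FB) are simply dropped because the single Unicode mark covers both halves, and that every other byte is plain ASCII.  The names `FIRST_HALF`, `SECOND_HALF`, and `convert_double_marks` are hypothetical.

```python
# First-half codes mapped to the preferred single Unicode character (Notes 1 and 2).
FIRST_HALF = {0xEB: "\u0361", 0xFA: "\u0360"}
# Second-half codes have no preferred mapping of their own; this sketch drops them.
SECOND_HALF = {0xEC, 0xFB}


def convert_double_marks(marc8: bytes) -> str:
    """Convert ASCII text containing MARC-8 ligature or double-tilde halves to Unicode.

    Assumes each first-half code immediately precedes the first spanned
    character, and that every other byte is ASCII.
    """
    out = []
    pending = None  # Unicode mark waiting to be placed after the next character
    for byte in marc8:
        if byte in FIRST_HALF:
            pending = FIRST_HALF[byte]
        elif byte in SECOND_HALF:
            continue  # already covered by the single spanning mark
        else:
            out.append(chr(byte))
            if pending is not None:
                # Place the mark between the two characters to be spanned.
                out.append(pending)
                pending = None
    return "".join(out)


# MARC-8 bytes EB 't' EC 's' become 't', U+0361, 's'.
result = convert_double_marks(b"\xebt\xecs")
print([f"U+{ord(ch):04X}" for ch in result])
```

A converter that prefers the alternate mapping would instead emit U+FE20 and U+FE21 (or U+FE22 and U+FE23) for the two halves, which the notes above do not recommend.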
