Make it work, i.e. really DO match all possible digits. Otherwise "won't
work the way you expect, regardless of what you expect" could be
construed as a bug.
It's not our fault.
% unichars '\pN' '\D' 'NAME =~ /DIGIT/' | wc -l
91
So should \d without the /a modifier match only decimal digits we can get
a value for? IIRC there are some numeric characters with very strange
values.
\d exactly maps to \p{Nd}. That's all. Whether something has a
numeric value or not doesn't matter.
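
To make that distinction concrete, here is a minimal sketch in Python
rather than Perl, so treat it as an analogy and not as a description of
Perl's regex engine. It relies on Python's documented behavior that \d
on str patterns matches characters in category Nd, and on the standard
unicodedata module. The helper name describe() is made up for this
illustration, and the results reflect whatever Unicode version your
Python ships with (see unicodedata.unidata_version).

```python
import re
import unicodedata

def describe(ch):
    """Return (is_Nd, matches_re_d, numeric_value_or_None) for one character.

    describe() is a hypothetical helper for this illustration only.
    """
    is_nd = unicodedata.category(ch) == "Nd"          # General_Category test
    matches_d = re.fullmatch(r"\d", ch) is not None   # Python's \d on str
    value = unicodedata.numeric(ch, None)             # None if no numeric value
    return is_nd, matches_d, value

if __name__ == "__main__":
    # DIGIT ONE, ARABIC-INDIC DIGIT ONE, SUPERSCRIPT ONE,
    # ETHIOPIC DIGIT ONE, ROMAN NUMERAL ONE
    for ch in ["1", "\u0661", "\u00B9", "\u1369", "\u2160"]:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {describe(ch)}")
```

All five characters carry the numeric value 1, but only the first two
are in Nd, and only those two match \d.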
Please play around with the programs I sent; they will help you
get a feel for these things.
--tom
% uniprops -a 1
U+0031 ‹1› \N{ DIGIT ONE }:
\w \d \pN \p{Nd}
AHex ASCII_Hex_Digit All Any Alnum ASCII Assigned Common Zyyy
Decimal_Number Digit Nd N Gr_Base Grapheme_Base Graph GrBase Hex
XDigit Hex_Digit ID_Continue IDC Number PerlWord PosixAlnum
PosixDigit PosixGraph PosixPrint Print Word XID_Continue XIDC
Age:1.1 Block=Basic_Latin Bidi_Class:EN Bidi_Class=European_Number
Bidi_Class:European_Number Bc=EN Block:ASCII Block:Basic_Latin
Blk=ASCII Canonical_Combining_Class:0
Canonical_Combining_Class=Not_Reordered
Canonical_Combining_Class:Not_Reordered Ccc=NR
Canonical_Combining_Class:NR Script=Common
Decomposition_Type:None Dt=None East_Asian_Width:Na
East_Asian_Width=Narrow East_Asian_Width:Narrow Ea=Na
Grapheme_Cluster_Break:Other GCB=XX Grapheme_Cluster_Break:XX
Grapheme_Cluster_Break=Other Hangul_Syllable_Type:NA
Hangul_Syllable_Type=Not_Applicable
Hangul_Syllable_Type:Not_Applicable Hst=NA
Joining_Group:No_Joining_Group Jg=NoJoiningGroup
Joining_Type:Non_Joining Jt=U Joining_Type:U
Joining_Type=Non_Joining Line_Break:NU Line_Break=Numeric
Line_Break:Numeric Lb=NU Numeric_Type:De Numeric_Type=Decimal
Numeric_Type:Decimal Nt=De Numeric_Value:1 Nv=1 Present_In:1.1
Age=1.1 In=1.1 Present_In:2.0 In=2.0 Present_In:2.1 In=2.1
Present_In:3.0 In=3.0 Present_In:3.1 In=3.1 Present_In:3.2 In=3.2
Present_In:4.0 In=4.0 Present_In:4.1 In=4.1 Present_In:5.0 In=5.0
Present_In:5.1 In=5.1 Present_In:5.2 In=5.2 Script:Common Sc=Zyyy
Script:Zyyy Sentence_Break:NU Sentence_Break=Numeric
Sentence_Break:Numeric SB=NU Word_Break:NU Word_Break=Numeric
Word_Break:Numeric WB=NU
% unichars -a '\p{Numeric_Value=1}' '\D'
¹ 185 0000B9 SUPERSCRIPT ONE
౹ 3193 000C79 TELUGU FRACTION DIGIT ONE FOR ODD POWERS OF FOUR
౼ 3196 000C7C TELUGU FRACTION DIGIT ONE FOR EVEN POWERS OF FOUR
፩ 4969 001369 ETHIOPIC DIGIT ONE
៱ 6129 0017F1 KHMER SYMBOL LEK ATTAK MUOY
₁ 8321 002081 SUBSCRIPT ONE
⅟ 8543 00215F FRACTION NUMERATOR ONE
Ⅰ 8544 002160 ROMAN NUMERAL ONE
ⅰ 8560 002170 SMALL ROMAN NUMERAL ONE
① 9312 002460 CIRCLED DIGIT ONE
⑴ 9332 002474 PARENTHESIZED DIGIT ONE
⒈ 9352 002488 DIGIT ONE FULL STOP
⓵ 9461 0024F5 DOUBLE CIRCLED DIGIT ONE
❶ 10102 002776 DINGBAT NEGATIVE CIRCLED DIGIT ONE
➀ 10112 002780 DINGBAT CIRCLED SANS-SERIF DIGIT ONE
➊ 10122 00278A DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ONE
㆒ 12690 003192 IDEOGRAPHIC ANNOTATION ONE MARK
㈠ 12832 003220 PARENTHESIZED IDEOGRAPH ONE
㊀ 12928 003280 CIRCLED IDEOGRAPH ONE
ꛦ 42726 00A6E6 BAMUM LETTER MO
𐄇 65799 010107 AEGEAN NUMBER ONE
𐅂 65858 010142 GREEK ACROPHONIC ATTIC ONE DRACHMA
𐅘 65880 010158 GREEK ACROPHONIC HERAEUM ONE PLETHRON
𐅙 65881 010159 GREEK ACROPHONIC THESPIAN ONE
𐅚 65882 01015A GREEK ACROPHONIC HERMIONIAN ONE
𐌠 66336 010320 OLD ITALIC NUMERAL ONE
𐏑 66513 0103D1 OLD PERSIAN NUMBER ONE
𐡘 67672 010858 IMPERIAL ARAMAIC NUMBER ONE
𐤖 67862 010916 PHOENICIAN NUMBER ONE
𐩀 68160 010A40 KHAROSHTHI DIGIT ONE
𐩽 68221 010A7D OLD SOUTH ARABIAN NUMBER ONE
𐭘 68440 010B58 INSCRIPTIONAL PARTHIAN NUMBER ONE
𐭸 68472 010B78 INSCRIPTIONAL PAHLAVI NUMBER ONE
𐹠 69216 010E60 RUMI DIGIT ONE
𒐕 74773 012415 CUNEIFORM NUMERIC SIGN ONE GESH2
𒐞 74782 01241E CUNEIFORM NUMERIC SIGN ONE GESHU
𒐬 74796 01242C CUNEIFORM NUMERIC SIGN ONE SHARU
𒐴 74804 012434 CUNEIFORM NUMERIC SIGN ONE BURU
𒑏 74831 01244F CUNEIFORM NUMERIC SIGN ONE BAN2
𒑘 74840 012458 CUNEIFORM NUMERIC SIGN ONE ESHE3
𝍠 119648 01D360 COUNTING ROD UNIT DIGIT ONE
🄂 127234 01F102 DIGIT ONE COMMA
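
For readers without unichars installed, a listing along these lines can
be approximated with a short Python sketch. This is not how unichars
works internally; it is a brute-force scan using the standard
unicodedata module, and the function name value_one_non_decimal() is
made up for this illustration. The exact set of hits depends on the
Unicode version and data sources behind your Python build, so it will
probably not agree line for line with the output above.

```python
import sys
import unicodedata

def value_one_non_decimal():
    """Collect characters whose numeric value is 1 but which are not in Nd.

    Hypothetical helper: a rough analogue of
    unichars '\\p{Numeric_Value=1}' '\\D', not a reimplementation.
    """
    hits = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        # numeric() returns the supplied default (None) when no value is defined
        if unicodedata.numeric(ch, None) == 1 and unicodedata.category(ch) != "Nd":
            hits.append(ch)
    return hits

if __name__ == "__main__":
    for ch in value_one_non_decimal():
        # Print code point and name only, to stay safe on ASCII-only terminals
        print(f"{ord(ch):7d} {ord(ch):06X} {unicodedata.name(ch, '<unnamed>')}")
```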