From d3c6de7a382b5f263dd1614357480e603fc008b8 Mon Sep 17 00:00:00 2001
From: Ben McIlwain
Note: version numbers start at 1. RFC 7940 recommends using simple integers. The version comment is optional,
please replace or delete the default comment. Version comments may be used by some tools as part of the page header.
Note: the scope element may be repeated, so that the same document can serve for multiple domains.
Registry Contact Information: Please fill in the Registry Contact Details.
Change History
If you made technical modifications to the LGR, please summarize them in the Change History (and also note the details in the appropriate section of the description).
PLEASE DELETE THESE INSTRUCTIONS BEFORE DEPOSITING THE DOCUMENT
- The repertoire contains the 197 letters needed to write hundreds of languages in the Latin script.
- An additional 7 combining diacritical marks are available as part of 21 explicitly defined combining sequences.
+ The repertoire contains the 164 letters needed to write hundreds of languages in the Latin script.
The repertoire is a subset of [Unicode 11.0.0]. For details, see Section 5, “Repertoire” in [Proposal-Latin].
(The proposal cited has been adopted for the Latin script portion of the Root Zone LGR.)
- Compared to that source, an additional language is supported by adding the code point for the Middle Dot used
- in the Catalan Ela Geminada: U+006C U+00B7 U+006C. Context rules limit U+00B7 MIDDLE DOT to being bracketed
- by the letter “l”. (See also [280])
- For the second level, the repertoire has been augmented with the ASCII digits, U+0030 to U+0039, plus U+002D HYPHEN-MINUS, for a total of 231 repertoire elements.
+ For the second level, the repertoire has been augmented with the ASCII digits, U+0030 to U+0039, plus U+002D HYPHEN-MINUS, for a total of 175 repertoire elements.
Any code points outside the Latin Script repertoire that are targets for
out-of-repertoire variants would be included here only if the variant is listed
@@ -142,28 +135,6 @@
U+00B7 MIDDLE DOT and U+002D HYPHEN-MINUS —
- the use of the hyphen as fallback for the middle dot in the Catalan Ela Geminada follows registry practice, see [281].
- The variant is limited to an Ela Geminada context.
- Some second level LGRs provide ASCII fallback variants for some or all accented Latin characters.
- Likewise the U+0153 Small OE Ligature and U+00E6 Small AE ligature have ASCII fallbacks consisting of the
- non-ligated “oe” and “ae” sequences. None of these fallbacks have been added to the current version of the LGR.
- Overlapped Variant Sequence: Both “ss” and “s” coexist in the repertoire and “s” has variant
- relationships on its own. These variants thus overlap: making the variant set well-behaved for
- index variant calculation requires that the sequence “ss” also be given variants to all permutations of
- variants for the letter s followed by itself, as well as all transitive variants due to other variants
- for U+00DF.
In each of the fallback variant pairs defined above, the mapping type from the first element to the second is of type
“fallback”, while the variant type for the other direction is “blocked”. In addition, the first element of each pair uses the
@@ -194,17 +165,6 @@
The following context rule applies to U+00B7 MIDDLE DOT and its variants.
- It ensures that the middle dot is part of an Ela Geminada sequence and variants between it and HYPHEN-MINUS are only defined in that context.
- The following WLE rule invalidates labels in which two Ela Geminada sequences overlap.
INSTRUCTIONS
@@ -35,22 +35,21 @@
<version comment="[Please replace (or delete) the optional comment]">[Please fill in version number, starting at 1]</version>
- <date>[Please fill in with publication date, in YYYY-MM-DD format]</date>
- <validity-start>[Please fill in effective date, in YYYY-MM-DD format]</validity-start>
+ <date>2025-10-01</date>
+ <validity-start>2025-10-01</validity-start>
<scope type="domain">[Please provide, in ".domain" format]</scope>
Registry Contact Details
-
Repertoire
-
-
-
- In-script Variant Mapping Types
Latin-specific Rules
-
-
-
-
-
-
Actions
Default Actions
@@ -213,27 +173,6 @@
invalidate labels with misplaced combining marks. They are marked with ⍟.
For a description see [RFC 7940].
Because this LGR defines allocatable fallback variants the following default actions are applicable.
- These actions resolve as “allocatable” any label where all variants are of type “fallback”, and as “valid” any label
- where all variants are of type “r-original”. Labels with a mix of variant types are resolved as “blocked”.
- To account for original code points in a permuted variant, reflexive variant
- mappings with an “r-” prefix are used. (See [RFC 7940]).
- In particular, the mapping type “r-original” is given to any code point that has a fallback mapping,
- but that appears in its non-fallback form in the original label, and thus “maps to itself”.
-Default actions that are
triggered by the LGR-specific variant types described above limit the “allocatable” variant
labels to those containing only “ss”, dotted “i” or hyphen variants, while
@@ -258,7 +197,7 @@
- Adopted from the Second Level Reference LGR for the Latin Script [Ref-LGR-und-Latn] without normative changes.
+ Adopted from the Second Level Reference LGR for the Latin Script [Ref-LGR-und-Latn] with security improvements implemented by removing confusable variants.
Changes from Version Dated 25 October 2024
-