Module:Unicode chart/subsets

--[[------------------------------

Notes:
* Subset names are normalized to lowercase and '\s+' → '_' for lookup.
* Subset content must be a string because, for now, it is implemented as canned shorthand for the "range" parameter.
* All numbers are hex, whether or not they look like it.
* Ranges are separated by one or more commas or whitespace characters, and each must match:
	* '^[0-9A-F]+[-–][0-9A-F]+$' (inclusive range), or
	* '^[0-9A-F]+$' (single code point).
* This means a spaced hyphen or dash ('62 - 7A') or a range with more than two endpoints ('62-70-7A') will fail.
* The code point prefixes 'U+' and '0x' are also OK and will be stripped.

Please replace the example entries below with something more useful, keeping the same format.

--]]------------------------------
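
-- Illustrative sketch only; nothing below is used by this data table.
-- It shows one way the rules in the notes above could be applied. The names
-- normalize_subset_name, parse_code_point and parse_ranges are hypothetical,
-- and the real lookup and parsing live in the code that consumes this table
-- and may differ. Assumptions: plain Lua string functions, UTF-8 source, and
-- hex digits accepted exactly as in the documented patterns (uppercase A-F).

local function normalize_subset_name(name)
	-- Lowercase, then turn each run of whitespace into a single underscore.
	return (name:lower():gsub('%s+', '_'))
end

local function parse_code_point(text)
	-- Strip an optional 'U+' or '0x' prefix, then require bare hex digits.
	local hex = text:gsub('^[Uu]%+', ''):gsub('^0[xX]', '')
	if hex:match('^[0-9A-F]+$') then
		return tonumber(hex, 16)
	end
	return nil
end

local function parse_ranges(spec)
	local ranges = {}
	-- Tokens are separated by one or more commas or whitespace characters.
	for token in spec:gmatch('[^,%s]+') do
		-- Treat an en dash (bytes 226 128 147 in UTF-8) like a hyphen.
		local item = token:gsub('\226\128\147', '-')
		-- Exactly two endpoints form an inclusive range; otherwise the
		-- token is tried as a single code point.
		local first, last = item:match('^([^%-]+)%-([^%-]+)$')
		if not first then
			first, last = item, item
		end
		local lo, hi = parse_code_point(first), parse_code_point(last)
		if not (lo and hi) then
			return nil, 'invalid range: ' .. token
		end
		ranges[#ranges + 1] = { lo, hi }
	end
	return ranges
end

-- Usage, under the same assumptions:
--   normalize_subset_name('Basic  Latin Digits') gives 'basic_latin_digits'
--   parse_ranges('30-39') gives { { 0x30, 0x39 } }
--   parse_ranges('62-70-7A') and parse_ranges('62 - 7A') both give nil plus a message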

return {
	basic_latin_digits = "30-39\n\n\n      \n\n",
	basic_latin_vowels = "41,45,49,4F,55,59          61 65 69 6F 75 79",
	basic_latin_consonants = "42-44,46-48,4A-4E,50-54,56-58,5A,62-64,66-68,6A-6E,70-74,76-78,7A",

	cjk_letters_months_hangul = "3200-321F, 3260-327F",
	cjk_letters_months_katakana = "32D0-32FF",

}