public static class Characters.Filter extends Character.Subset
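Subsets of Unicode characters identified by their general category.
The categories are identified by constants defined in the Character class, like LOWERCASE_LETTER, UPPERCASE_LETTER, DECIMAL_DIGIT_NUMBER and SPACE_SEPARATOR.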
An instance of this class can be obtained from an enumeration of character types using the forTypes(byte[]) method, or using one of the constants predefined in this class. Then, Unicode characters can be tested for inclusion in the subset by calling the contains(int) method.
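The following stand-alone sketch illustrates the idea of a subset built from an enumeration of character types and queried by code point. It is not the Apache SIS implementation: the class name TypeMaskSketch and the bit-mask storage are assumptions made for illustration, and only java.lang.Character is used.

```java
/**
 * Hypothetical stand-in for a type-based character subset.
 * Assumption: the set of general categories is stored as a bit mask.
 */
final class TypeMaskSketch {
    /** Bit i is set when general category i (a Character constant) is included. */
    private final long types;

    private TypeMaskSketch(long types) {
        this.types = types;
    }

    /** Builds the union of the given general categories. */
    static TypeMaskSketch forTypes(byte... types) {
        long mask = 0;
        for (byte type : types) {
            mask |= 1L << type;
        }
        return new TypeMaskSketch(mask);
    }

    /** Tests whether the given general category is part of this subset. */
    boolean containsType(int type) {
        return type >= 0 && type < Long.SIZE && (types & (1L << type)) != 0;
    }

    /** Tests whether the general category of the given code point is part of this subset. */
    boolean contains(int codePoint) {
        return containsType(Character.getType(codePoint));
    }
}
```

With this sketch, TypeMaskSketch.forTypes(Character.LOWERCASE_LETTER, Character.DECIMAL_DIGIT_NUMBER) would accept 'a' and '7' but reject 'A' and the space character.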
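Relationship with international standards
The WKT 2 specification (§B.5) recommends ignoring the following characters when comparing two names: “_” (underscore), “-” (minus sign), “/” (solidus), “(” (left parenthesis) and “)” (right parenthesis).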
The same specification also limits the set of valid characters in a name to the following (§6.3.1):
A-Z a-z 0-9 _ [ ] ( ) { } < = > . , : ; + - (space) % & ' " * ^ / \ ? | °
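As a rough illustration of comparing names while ignoring punctuation and spaces, the sketch below keeps only the code points accepted by Character.isLetterOrDigit(int) before comparing. The class and method names are hypothetical, and the case-insensitive comparison is an assumption of this example, not something this page specifies.

```java
/** Hypothetical helper; not part of Apache SIS. */
final class NameComparisonSketch {
    private NameComparisonSketch() {
    }

    /** Keeps only the code points for which Character.isLetterOrDigit(int) returns true. */
    static String lettersAndDigits(String name) {
        StringBuilder buffer = new StringBuilder(name.length());
        name.codePoints()
            .filter(Character::isLetterOrDigit)
            .forEach(buffer::appendCodePoint);
        return buffer.toString();
    }

    /** Compares two names on their letters and digits only (case-insensitive by assumption). */
    static boolean sameName(String a, String b) {
        return lettersAndDigits(a).equalsIgnoreCase(lettersAndDigits(b));
    }
}
```

Under these assumptions, "WGS 84" and "WGS_84" compare as equal, while "WGS 84" and "WGS 72" do not.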
See Also: Character.Subset, Character.getType(int), WKT 2 specification §B.5
Defined in the sis-utility module
Modifier and Type | Field and Description |
---|---|
static Characters.Filter | LETTERS_AND_DIGITS: The subset of all characters for which Character.isLetterOrDigit(int) returns true. |
static Characters.Filter | UNICODE_IDENTIFIER: The subset of all characters for which Character.isUnicodeIdentifierPart(int) returns true, excluding ignorable characters. |
Modifier and Type | Method and Description |
---|---|
boolean | contains(int codePoint): Returns true if this subset contains the given Unicode character. |
boolean | containsType(int type): Returns true if this subset contains the characters of the given type. |
static Characters.Filter | forTypes(byte... types): Returns a subset representing the union of all Unicode characters of the given types. |
Methods inherited from class Character.Subset: equals, hashCode, toString
public static final Characters.Filter LETTERS_AND_DIGITS
The subset of all characters for which Character.isLetterOrDigit(int) returns true. This subset includes the following general categories: Character.LOWERCASE_LETTER, UPPERCASE_LETTER, TITLECASE_LETTER, MODIFIER_LETTER, OTHER_LETTER and DECIMAL_DIGIT_NUMBER.
SIS uses this filter when comparing two identified object names. See the Relationship with international standards section in this class javadoc for more information.
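The correspondence between Character.isLetterOrDigit(int) and the six categories listed above can be checked with the standard library alone. The sketch below scans every code point and counts disagreements; the class and method names are hypothetical and no SIS class is involved.

```java
/** Hypothetical check; uses only java.lang.Character. */
final class LettersAndDigitsCheck {
    private LettersAndDigitsCheck() {
    }

    /** Returns true if the general category of the code point is one of the six listed categories. */
    static boolean inListedCategories(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.LOWERCASE_LETTER:
            case Character.UPPERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
            case Character.DECIMAL_DIGIT_NUMBER:
                return true;
            default:
                return false;
        }
    }

    /** Counts the code points where the category test and isLetterOrDigit(int) disagree. */
    static int countMismatches() {
        int mismatches = 0;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (Character.isLetterOrDigit(cp) != inListedCategories(cp)) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```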
public static final Characters.Filter UNICODE_IDENTIFIER
The subset of all characters for which Character.isUnicodeIdentifierPart(int) returns true, excluding ignorable characters. This subset includes all the LETTERS_AND_DIGITS categories with the addition of the following ones: Character.LETTER_NUMBER, CONNECTOR_PUNCTUATION, NON_SPACING_MARK and COMBINING_SPACING_MARK.
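The wording "identifier part, excluding ignorable characters" can be approximated with two standard Character methods, as in the sketch below. It mirrors the description rather than the category list, so it may differ from the SIS filter for some code points; the class and method names are hypothetical.

```java
/** Hypothetical predicate; uses only java.lang.Character. */
final class UnicodeIdentifierSketch {
    private UnicodeIdentifierSketch() {
    }

    /** Accepts identifier parts that are not ignorable characters. */
    static boolean isIdentifierPartNotIgnorable(int codePoint) {
        return Character.isUnicodeIdentifierPart(codePoint)
                && !Character.isIdentifierIgnorable(codePoint);
    }
}
```

For instance, the underscore (CONNECTOR_PUNCTUATION) and U+0301 (NON_SPACING_MARK) pass this test, whereas the space and the soft hyphen U+00AD, an ignorable format character, do not.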
public boolean contains(int codePoint)
Returns true if this subset contains the given Unicode character.
Parameters: codePoint - The Unicode character, as a code point value.
Returns: true if this subset contains the given character.
public final boolean containsType(int type)
Returns true if this subset contains the characters of the given type. The given type shall be one of the Character constants like LOWERCASE_LETTER, UPPERCASE_LETTER, DECIMAL_DIGIT_NUMBER or SPACE_SEPARATOR.
Parameters: type - One of the Character constants.
Returns: true if this subset contains the characters of the given type.
See Also: Character.getType(int)
public static Characters.Filter forTypes(byte... types)
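Returns a subset representing the union of all Unicode characters of the given types.
Parameters: types - The character types, as Character constants.
See Also: Character.LOWERCASE_LETTER, Character.UPPERCASE_LETTER, Character.DECIMAL_DIGIT_NUMBER, Character.SPACE_SEPARATOR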
Copyright © 2010–2015 The Apache Software Foundation. All rights reserved.