java.lang
Class Character
java.lang.Object
|
+--java.lang.Character
All Implemented Interfaces:
Serializable, Comparable
Wrapper class for the primitive char data type. In addition, this class
allows one to retrieve property information and perform transformations
on the 57,707 defined characters in the Unicode Standard, Version 3.0.0.
java.lang.Character is designed to be very dynamic, and as such, it
retrieves information on the Unicode character set from a separate
database, gnu.java.lang.CharData, which can be easily upgraded.
For predicates, boundaries are used to describe
the set of characters for which the method will return true.
This syntax uses fairly normal regular expression notation.
See 5.13 of the Unicode Standard, Version 3.0, for the
boundary specification.
See http://www.unicode.org
for more information on the Unicode Standard.
Since:
Authors:
- Tom Tromey <tromey@cygnus.com>
- Paul N. Fisher
- Jochen Hoenicke
- Eric Blake <ebb9@email.byu.edu>
See Also:
COMBINING_SPACING_MARK
public static final byte COMBINING_SPACING_MARK
Mc = Mark, Spacing Combining (Normative).
Since:
CONNECTOR_PUNCTUATION
public static final byte CONNECTOR_PUNCTUATION
Pc = Punctuation, Connector (Informative).
Since:
CONTROL
public static final byte CONTROL
Cc = Other, Control (Normative).
Since:
CURRENCY_SYMBOL
public static final byte CURRENCY_SYMBOL
Sc = Symbol, Currency (Informative).
Since:
DASH_PUNCTUATION
public static final byte DASH_PUNCTUATION
Pd = Punctuation, Dash (Informative).
Since:
DECIMAL_DIGIT_NUMBER
public static final byte DECIMAL_DIGIT_NUMBER
Nd = Number, Decimal Digit (Normative).
Since:
DIRECTIONALITY_ARABIC_NUMBER
public static final byte DIRECTIONALITY_ARABIC_NUMBER
Weak bidirectional character type "AN".
Since:
DIRECTIONALITY_BOUNDARY_NEUTRAL
public static final byte DIRECTIONALITY_BOUNDARY_NEUTRAL
Weak bidirectional character type "BN".
Since:
DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
public static final byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
Weak bidirectional character type "CS".
Since:
DIRECTIONALITY_EUROPEAN_NUMBER
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER
Weak bidirectional character type "EN".
Since:
DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
Weak bidirectional character type "ES".
Since:
DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
Weak bidirectional character type "ET".
Since:
DIRECTIONALITY_LEFT_TO_RIGHT
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT
Strong bidirectional character type "L".
Since:
DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
Strong bidirectional character type "LRE".
Since:
DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
Strong bidirectional character type "LRO".
Since:
DIRECTIONALITY_NONSPACING_MARK
public static final byte DIRECTIONALITY_NONSPACING_MARK
Weak bidirectional character type "NSM".
Since:
DIRECTIONALITY_OTHER_NEUTRALS
public static final byte DIRECTIONALITY_OTHER_NEUTRALS
Neutral bidirectional character type "ON".
Since:
DIRECTIONALITY_PARAGRAPH_SEPARATOR
public static final byte DIRECTIONALITY_PARAGRAPH_SEPARATOR
Neutral bidirectional character type "B".
Since:
DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
public static final byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
Weak bidirectional character type "PDF".
Since:
DIRECTIONALITY_RIGHT_TO_LEFT
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT
Strong bidirectional character type "R".
Since:
DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
Strong bidirectional character type "AL".
Since:
DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
Strong bidirectional character type "RLE".
Since:
DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
Strong bidirectional character type "RLO".
Since:
DIRECTIONALITY_SEGMENT_SEPARATOR
public static final byte DIRECTIONALITY_SEGMENT_SEPARATOR
Neutral bidirectional character type "S".
Since:
DIRECTIONALITY_UNDEFINED
public static final byte DIRECTIONALITY_UNDEFINED
Undefined bidirectional character type. Undefined char values have
undefined directionality in the Unicode specification.
Since:
DIRECTIONALITY_WHITESPACE
public static final byte DIRECTIONALITY_WHITESPACE
Neutral bidirectional character type "WS".
Since:
ENCLOSING_MARK
public static final byte ENCLOSING_MARK
Me = Mark, Enclosing (Normative).
Since:
END_PUNCTUATION
public static final byte END_PUNCTUATION
Pe = Punctuation, Close (Informative).
Since:
FINAL_QUOTE_PUNCTUATION
public static final byte FINAL_QUOTE_PUNCTUATION
Pf = Punctuation, Final Quote (Informative).
Since:
FORMAT
public static final byte FORMAT
Cf = Other, Format (Normative).
Since:
INITIAL_QUOTE_PUNCTUATION
public static final byte INITIAL_QUOTE_PUNCTUATION
Pi = Punctuation, Initial Quote (Informative).
Since:
LETTER_NUMBER
public static final byte LETTER_NUMBER
Nl = Number, Letter (Normative).
Since:
LINE_SEPARATOR
public static final byte LINE_SEPARATOR
Zl = Separator, Line (Normative).
Since:
LOWERCASE_LETTER
public static final byte LOWERCASE_LETTER
Ll = Letter, Lowercase (Informative).
Since:
MATH_SYMBOL
public static final byte MATH_SYMBOL
Sm = Symbol, Math (Informative).
Since:
MAX_RADIX
public static final int MAX_RADIX
Largest value allowed for radix arguments in Java. This value is 36.
See Also:
MAX_VALUE
public static final char MAX_VALUE
The maximum value the char data type can hold.
This value is '\uFFFF'.
MIN_RADIX
public static final int MIN_RADIX
Smallest value allowed for radix arguments in Java. This value is 2.
See Also:
MIN_VALUE
public static final char MIN_VALUE
The minimum value the char data type can hold.
This value is '\u0000'.
MODIFIER_LETTER
public static final byte MODIFIER_LETTER
Lm = Letter, Modifier (Informative).
Since:
MODIFIER_SYMBOL
public static final byte MODIFIER_SYMBOL
Sk = Symbol, Modifier (Informative).
Since:
NON_SPACING_MARK
public static final byte NON_SPACING_MARK
Mn = Mark, Non-Spacing (Normative).
Since:
OTHER_LETTER
public static final byte OTHER_LETTER
Lo = Letter, Other (Informative).
Since:
OTHER_NUMBER
public static final byte OTHER_NUMBER
No = Number, Other (Normative).
Since:
OTHER_PUNCTUATION
public static final byte OTHER_PUNCTUATION
Po = Punctuation, Other (Informative).
Since:
OTHER_SYMBOL
public static final byte OTHER_SYMBOL
So = Symbol, Other (Informative).
Since:
PARAGRAPH_SEPARATOR
public static final byte PARAGRAPH_SEPARATOR
Zp = Separator, Paragraph (Normative).
Since:
PRIVATE_USE
public static final byte PRIVATE_USE
Co = Other, Private Use (Normative).
Since:
SPACE_SEPARATOR
public static final byte SPACE_SEPARATOR
Zs = Separator, Space (Normative).
Since:
START_PUNCTUATION
public static final byte START_PUNCTUATION
Ps = Punctuation, Open (Informative).
Since:
SURROGATE
public static final byte SURROGATE
Cs = Other, Surrogate (Normative).
Since:
TITLECASE_LETTER
public static final byte TITLECASE_LETTER
Lt = Letter, Titlecase (Informative).
Since:
TYPE
public static final Class TYPE
Class object representing the primitive char data type.
Since:
UNASSIGNED
public static final byte UNASSIGNED
Cn = Other, Not Assigned (Normative).
Since:
UPPERCASE_LETTER
public static final byte UPPERCASE_LETTER
Lu = Letter, Uppercase (Informative).
Since:
Character
public Character(char value)
Wraps up a character.
Parameters:
charValue
public char charValue()
Returns the character which has been wrapped by this class.
Returns:
compareTo
public int compareTo(java.lang.Character anotherCharacter)
Compares another Character to this Character, numerically.
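As an illustration of the numeric ordering, here is a minimal sketch that sorts wrapped characters through their natural ordering. The class and method names (CompareSketch, sortedChars) are hypothetical, and the sketch assumes a Java version with autoboxing.

```java
final class CompareSketch {
    // Hypothetical helper: returns the characters of s as a sorted Character[].
    static Character[] sortedChars(String s) {
        Character[] boxed = new Character[s.length()];
        for (int i = 0; i < s.length(); i++) {
            boxed[i] = s.charAt(i); // autoboxing wraps each char
        }
        // Arrays.sort on an Object[] uses the natural ordering,
        // which for Character is compareTo.
        java.util.Arrays.sort(boxed);
        return boxed;
    }

    public static void main(String[] args) {
        // 'A' (65) sorts before 'a' (97), which sorts before 'b' (98).
        System.out.println(java.util.Arrays.toString(sortedChars("bAa"))); // prints [A, a, b]
        Character x = 'x';
        Character y = 'y';
        System.out.println(x.compareTo(y) < 0); // prints true
    }
}
```

Only the sign of the result should be relied on; the ordering is by char value, so it is not locale-aware.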
Since:
Parameters:
Returns:
- a negative integer if this Character is less than
anotherCharacter, zero if this Character is equal, and
a positive integer if this Character is greater
Throws:
compareTo
public int compareTo(java.lang.Object o)
Compares an object to this Character. Assuming the object is a
Character object, this method performs the same comparison as
compareTo(Character).
Since:
Parameters:
Returns:
Throws:
See Also:
digit
public static int digit(char ch, int radix)
Converts a character into a digit of the specified radix. If the radix
is less than MIN_RADIX or greater than MAX_RADIX, or if the result of
getNumericValue(ch) is not less than the radix, or if ch is neither a
decimal digit nor a letter in the case-insensitive range 'a'-'z', the
result is -1.
character argument boundary = [Nd]|U+0041-U+005A|U+0061-U+007A
|U+FF21-U+FF3A|U+FF41-U+FF5A
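The -1 return value makes digit convenient for validating input while parsing. The following is a minimal sketch, not a replacement for Long.parseLong: the names DigitSketch and parseUnsigned are hypothetical, and overflow is deliberately ignored.

```java
final class DigitSketch {
    // Hypothetical helper: parses an unsigned number in the given radix.
    // Returns -1 for empty input, an invalid radix, or any invalid character.
    static long parseUnsigned(String text, int radix) {
        if (text.isEmpty()) {
            return -1;
        }
        long value = 0;
        for (int i = 0; i < text.length(); i++) {
            int d = Character.digit(text.charAt(i), radix);
            if (d < 0) {
                return -1; // not a digit in this radix (or radix out of range)
            }
            value = value * radix + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseUnsigned("ff", 16)); // prints 255
        System.out.println(parseUnsigned("FF", 16)); // prints 255
        System.out.println(parseUnsigned("19", 8));  // prints -1, '9' is not an octal digit
    }
}
```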
Parameters:
Returns:
- digit which ch represents in radix, or -1 if ch is not a valid digit
See Also:
equals
public boolean equals(java.lang.Object o)
Determines if an object is equal to this object. This is only true for
another Character object wrapping the same value.
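Because equals and hashCode agree on the wrapped value, Character objects behave as expected in hash-based collections. A minimal sketch, with hypothetical names (EqualsSketch, distinctCount) and assuming a Java version with generics and autoboxing:

```java
final class EqualsSketch {
    // Hypothetical helper: counts the distinct characters in s.
    // The set relies on Character.equals and Character.hashCode.
    static int distinctCount(String s) {
        java.util.Set<Character> seen = new java.util.HashSet<Character>();
        for (int i = 0; i < s.length(); i++) {
            seen.add(s.charAt(i));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount("banana")); // prints 3
        Character x = 'x';
        System.out.println(x.equals("x"));           // prints false, a String is not a Character
    }
}
```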
Parameters:
Returns:
- true if o is a Character with the same value
forDigit
public static char forDigit(int digit, int radix)
Converts a digit into a character which represents that digit
in a specified radix. If the radix is less than MIN_RADIX or greater
than MAX_RADIX, or the digit is negative or not less than the radix,
then the null character '\0' is returned. Otherwise the return value
is in '0'-'9' and 'a'-'z'.
return value boundary = U+0030-U+0039|U+0061-U+007A
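forDigit is the inverse of digit, so it can drive a simple number-to-text conversion. The sketch below is illustrative only; ForDigitSketch and toRadixString are hypothetical names, and negative values are rejected rather than handled.

```java
final class ForDigitSketch {
    // Hypothetical helper: renders a nonnegative value in the given radix.
    static String toRadixString(int value, int radix) {
        if (value < 0 || radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            // value % radix is always in [0, radix), so forDigit never returns '\0' here.
            sb.append(Character.forDigit(value % radix, radix));
            value /= radix;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toRadixString(255, 16)); // prints ff
        System.out.println(toRadixString(5, 2));    // prints 101
        // A digit equal to the radix is out of range, so the null character comes back.
        System.out.println(Character.forDigit(16, 16) == '\0'); // prints true
    }
}
```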
Parameters:
Returns:
- character representing digit in radix, or '\0'
See Also:
getDirectionality
public static byte getDirectionality(char ch)
Returns the Unicode directionality property of the character. This
is used in the visual ordering of text.
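One plausible use is to detect whether a string needs bidirectional layout at all. The sketch below is an assumption about how a caller might do that, with hypothetical names (DirectionSketch, containsRightToLeft); it only looks for the strong types "R" and "AL" and is not a bidirectional algorithm.

```java
final class DirectionSketch {
    // Hypothetical helper: true if any char has a strong right-to-left type.
    static boolean containsRightToLeft(String text) {
        for (int i = 0; i < text.length(); i++) {
            byte dir = Character.getDirectionality(text.charAt(i));
            if (dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsRightToLeft("abc"));     // prints false
        // U+05D0 is HEBREW LETTER ALEF, a strong right-to-left character.
        System.out.println(containsRightToLeft("a\u05D0")); // prints true
    }
}
```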
Since:
Parameters:
Returns:
- the directionality constant, or DIRECTIONALITY_UNDEFINED
See Also:
getNumericValue
public static int getNumericValue(char ch)
Returns the Unicode numeric value property of a character. For example,
'\u216C' (the Roman numeral fifty) returns 50.
This method also returns values for the letters A through Z (not
specified by Unicode) in these ranges: '\u0041' through '\u005A'
(uppercase); '\u0061' through '\u007A' (lowercase); and '\uFF21'
through '\uFF3A' and '\uFF41' through '\uFF5A' (full width variants).
If the character lacks a numeric value property, -1 is returned.
If the character has a numeric value property which is not representable
as a nonnegative integer, such as a fraction, -2 is returned.
character argument boundary = [Nd]|[Nl]|[No]|U+0041-U+005A|U+0061-U+007A
|U+FF21-U+FF3A|U+FF41-U+FF5A
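The two sentinel values are easy to confuse, so a caller will usually want to separate them explicitly. A minimal sketch, with hypothetical names (NumericValueSketch, describe):

```java
final class NumericValueSketch {
    // Hypothetical helper: turns the three possible outcomes into text.
    static String describe(char ch) {
        int v = Character.getNumericValue(ch);
        if (v == -1) {
            return "no numeric value";
        }
        if (v == -2) {
            return "not a nonnegative integer";
        }
        return Integer.toString(v);
    }

    public static void main(String[] args) {
        System.out.println(describe('\u216C')); // U+216C ROMAN NUMERAL FIFTY, prints 50
        System.out.println(describe('a'));      // letters count upward from 10, prints 10
        System.out.println(describe('\u00BD')); // U+00BD is the fraction one half
        System.out.println(describe('$'));      // prints no numeric value
    }
}
```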
Since:
Parameters:
Returns:
- the numeric value property of ch, or -1 if it does not exist, or
-2 if it is not representable as a nonnegative integer
See Also:
getType
public static int getType(char ch)
Returns the Unicode general category property of a character.
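The returned integer is meant to be compared against the category constants of this class. The sketch below maps a few of them back to their two-letter abbreviations; CategorySketch and category are hypothetical names, and only a handful of categories are covered.

```java
final class CategorySketch {
    // Hypothetical helper: two-letter abbreviation for a few general categories.
    static String category(char ch) {
        switch (Character.getType(ch)) {
            case Character.UPPERCASE_LETTER:      return "Lu";
            case Character.LOWERCASE_LETTER:      return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER:  return "Nd";
            case Character.SPACE_SEPARATOR:       return "Zs";
            case Character.CURRENCY_SYMBOL:       return "Sc";
            case Character.CONNECTOR_PUNCTUATION: return "Pc";
            default:                              return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(category('A')); // prints Lu
        System.out.println(category('5')); // prints Nd
        System.out.println(category('_')); // prints Pc
        System.out.println(category('+')); // prints other, '+' is a math symbol
    }
}
```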
Since:
Parameters:
Returns:
- the character category property of ch as an integer
See Also:
hashCode
public int hashCode()
Returns the numerical value (unsigned) of the wrapped character.
Range of returned values: 0x0000-0xFFFF.
Returns:
- the value of the wrapped character
isDefined
public static boolean isDefined(char ch)
Determines if a character is part of the Unicode Standard. The standard
is still evolving, but this method covers every character in the current
data file.
defined = not [Cn]
Parameters:
Returns:
- true if ch is a Unicode character, else false
See Also:
isDigit
public static boolean isDigit(char ch)
Determines if a character is a Unicode decimal digit. For example,
'0' is a digit.
Unicode decimal digit = [Nd]
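Since the test is on category [Nd], digits outside ASCII also qualify. A minimal sketch that sums the decimal digits in a string; DigitSumSketch and digitSum are hypothetical names.

```java
final class DigitSumSketch {
    // Hypothetical helper: adds up the value of every decimal digit in text.
    static int digitSum(String text) {
        int sum = 0;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (Character.isDigit(ch)) {
                sum += Character.digit(ch, 10); // value 0-9 for any [Nd] character
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(digitSum("a1b2"));   // prints 3
        // U+0663 is ARABIC-INDIC DIGIT THREE, which is also in [Nd].
        System.out.println(digitSum("\u0663")); // prints 3
    }
}
```

Code that must accept only '0'-'9' should compare against that range directly rather than call isDigit.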
Parameters:
Returns:
- true if ch is a Unicode decimal digit, else false
See Also:
isISOControl
public static boolean isISOControl(char ch)
Determines if a character has the ISO Control property.
ISO Control = [Cc]
Since:
Parameters:
Returns:
- true if ch is an ISO Control character, else false
See Also:
isIdentifierIgnorable
public static boolean isIdentifierIgnorable(char ch)
Determines if a character is ignorable in a Unicode identifier. This
includes the non-whitespace ISO control characters ('\u0000' through
'\u0008', '\u000E' through '\u001B', and '\u007F' through '\u009F'),
and FORMAT characters.
Unicode identifier ignorable = [Cf]|U+0000-U+0008|U+000E-U+001B
|U+007F-U+009F
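A caller comparing identifiers might strip the ignorable characters first. This is a sketch of that idea under that assumption, not a statement of how any compiler does it; IgnorableSketch and stripIgnorable are hypothetical names.

```java
final class IgnorableSketch {
    // Hypothetical helper: removes identifier-ignorable characters from text.
    static String stripIgnorable(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (!Character.isIdentifierIgnorable(ch)) {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+200E LEFT-TO-RIGHT MARK is a FORMAT character; U+0001 is a control character.
        System.out.println(stripIgnorable("a\u200Eb\u0001c")); // prints abc
        // A tab is a whitespace control character, so it is kept.
        System.out.println(stripIgnorable("a\tb").length());    // prints 3
    }
}
```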
Since:
Parameters:
Returns:
- true if ch is ignorable in a Unicode or Java identifier
See Also:
isJavaIdentifierPart
public static boolean isJavaIdentifierPart(char ch)
Determines if a character can follow the first letter in
a Java identifier. This covers everything accepted by isJavaLetter
(isLetter, type LETTER_NUMBER, currency symbols, connecting punctuation)
plus digits, numeric letters (like Roman numerals), combining marks,
non-spacing marks, and characters for which isIdentifierIgnorable is true.
Java identifier extender =
[Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Sc]|[Pc]|[Mn]|[Mc]|[Nd]|[Cf]
|U+0000-U+0008|U+000E-U+001B|U+007F-U+009F
Since:
Parameters:
Returns:
- true if ch can follow the first letter in a Java identifier
See Also:
isJavaIdentifierStart
public static boolean isJavaIdentifierStart(char ch)
Determines if a character can start a Java identifier. This is the
combination of isLetter, any character where getType returns
LETTER_NUMBER, currency symbols (like '$'), and connecting punctuation
(like '_').
Java identifier start = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Sc]|[Pc]
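Together with isJavaIdentifierPart, this predicate is enough to check the lexical shape of an identifier. A minimal sketch, with hypothetical names (IdentifierSketch, isIdentifier); it does not reject reserved words such as "class".

```java
final class IdentifierSketch {
    // Hypothetical helper: true if s has the lexical shape of a Java identifier.
    static boolean isIdentifier(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("_count1")); // prints true
        System.out.println(isIdentifier("$x"));      // prints true
        System.out.println(isIdentifier("1abc"));    // prints false, a digit cannot start
        System.out.println(isIdentifier("a-b"));     // prints false
    }
}
```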
Since:
Parameters:
Returns:
- true if ch can start a Java identifier, else false
See Also:
isJavaLetter
public static boolean isJavaLetter(char ch)
Determines if a character can start a Java identifier. This is the
combination of isLetter, any character where getType returns
LETTER_NUMBER, currency symbols (like '$'), and connecting punctuation
(like '_').
Parameters:
Returns:
- true if ch can start a Java identifier, else false
See Also:
isJavaLetterOrDigit
public static boolean isJavaLetterOrDigit(char ch)
Determines if a character can follow the first letter in
a Java identifier. This covers everything accepted by isJavaLetter
(isLetter, type LETTER_NUMBER, currency symbols, connecting punctuation)
plus digits, numeric letters (like Roman numerals), combining marks,
non-spacing marks, and characters for which isIdentifierIgnorable is true.
Parameters:
Returns:
- true if ch can follow the first letter in a Java identifier
See Also:
isLetter
public static boolean isLetter(char ch)
Determines if a character is a Unicode letter. Not all letters have case,
so this may return true when isLowerCase and isUpperCase return false.
letter = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]
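The point about caseless letters can be checked directly. In the sketch below, LetterSketch and isUncasedLetter are hypothetical names; note that titlecase letters [Lt] would also pass this test, since they are neither [Lu] nor [Ll].

```java
final class LetterSketch {
    // Hypothetical helper: a letter that is neither uppercase nor lowercase.
    static boolean isUncasedLetter(char ch) {
        return Character.isLetter(ch)
                && !Character.isUpperCase(ch)
                && !Character.isLowerCase(ch);
    }

    public static void main(String[] args) {
        // U+3042 is HIRAGANA LETTER A, category [Lo], which has no case.
        System.out.println(isUncasedLetter('\u3042')); // prints true
        System.out.println(isUncasedLetter('a'));      // prints false
        System.out.println(isUncasedLetter('1'));      // prints false, not a letter
    }
}
```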
Parameters:
Returns:
- true if ch is a Unicode letter, else false
See Also:
isLetterOrDigit
public static boolean isLetterOrDigit(char ch)
Determines if a character is a Unicode letter or a Unicode digit. This
is the combination of isLetter and isDigit.
letter or digit = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nd]
Parameters:
Returns:
- true if ch is a Unicode letter or a Unicode digit, else false
See Also:
isLowerCase
public static boolean isLowerCase(char ch)
Determines if a character is a Unicode lowercase letter. For example,
'a' is lowercase.
lowercase = [Ll]
Parameters:
Returns:
- true if ch is a Unicode lowercase letter, else false
See Also:
isMirrored
public static boolean isMirrored(char ch)
Determines whether the character is mirrored according to Unicode. For
example, '\u0028' (LEFT PARENTHESIS) appears as '(' in
left-to-right text, but ')' in right-to-left text.
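As a small illustration, the sketch below counts the mirrored characters in a string. MirrorSketch and countMirrored are hypothetical names.

```java
final class MirrorSketch {
    // Hypothetical helper: number of chars in text with the mirrored property.
    static int countMirrored(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isMirrored(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Only the two parentheses are mirrored; letters and '+' are not.
        System.out.println(countMirrored("(a+b)")); // prints 2
    }
}
```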
Since:
Parameters:
Returns:
- true if the character is mirrored
isSpace
public static boolean isSpace(char ch)
Determines if a character is an ISO-LATIN-1 space. This is only the five
characters '\t', '\n', '\f', '\r', and ' '.
Java space = U+0020|U+0009|U+000A|U+000C|U+000D
Parameters:
Returns:
- true if ch is a space, else false
See Also:
isSpaceChar
public static boolean isSpaceChar(char ch)
Determines if a character is a Unicode space character. This includes
SPACE_SEPARATOR, LINE_SEPARATOR, and PARAGRAPH_SEPARATOR.
Unicode space = [Zs]|[Zp]|[Zl]
Since:
Parameters:
Returns:
- true if ch is a Unicode space, else false
See Also:
isTitleCase
public static boolean isTitleCase(char ch)
Determines if a character is a Unicode titlecase letter. For example,
the character "Lj" (Latin capital L with small letter j) is titlecase.
titlecase = [Lt]
Parameters:
Returns:
- true if ch is a Unicode titlecase letter, else false
See Also:
isUnicodeIdentifierPart
public static boolean isUnicodeIdentifierPart(char ch)
Determines if a character can follow the first letter in
a Unicode identifier. This includes letters, connecting punctuation,
digits, numeric letters, combining marks, non-spacing marks, and
isIdentifierIgnorable.
Unicode identifier extender =
[Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Mn]|[Mc]|[Nd]|[Pc]|[Cf]
|U+0000-U+0008|U+000E-U+001B|U+007F-U+009F
Since:
Parameters:
Returns:
- true if ch can follow the first letter in a Unicode identifier
See Also:
isUnicodeIdentifierStart
public static boolean isUnicodeIdentifierStart(char ch)
Determines if a character can start a Unicode identifier. Only
letters can start a Unicode identifier, but this includes characters
in LETTER_NUMBER.
Unicode identifier start = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]
Since:
Parameters:
Returns:
- true if ch can start a Unicode identifier, else false
See Also:
isUpperCase
public static boolean isUpperCase(char ch)
Determines if a character is a Unicode uppercase letter. For example,
'A' is uppercase.
uppercase = [Lu]
Parameters:
Returns:
- true if ch is a Unicode uppercase letter, else false
See Also:
isWhitespace
public static boolean isWhitespace(char ch)
Determines if a character is Java whitespace. This includes Unicode
space characters (SPACE_SEPARATOR, LINE_SEPARATOR, and
PARAGRAPH_SEPARATOR) except the non-breaking spaces
('\u00A0', '\u2007', and '\u202F'); and these characters:
'\u0009', '\u000A', '\u000B', '\u000C', '\u000D', '\u001C', '\u001D',
'\u001E', and '\u001F'.
Java whitespace = ([Zs] not Nb)|[Zl]|[Zp]|U+0009-U+000D|U+001C-U+001F
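isWhitespace and isSpaceChar overlap but are not the same set, which the following minimal sketch makes visible. WhitespaceSketch and classify are hypothetical names.

```java
final class WhitespaceSketch {
    // Hypothetical helper: reports which of the two predicates accept ch.
    static String classify(char ch) {
        boolean javaWhitespace = Character.isWhitespace(ch);
        boolean unicodeSpace = Character.isSpaceChar(ch);
        if (javaWhitespace && unicodeSpace) {
            return "both";
        }
        if (javaWhitespace) {
            return "Java whitespace only";
        }
        if (unicodeSpace) {
            return "Unicode space only";
        }
        return "neither";
    }

    public static void main(String[] args) {
        System.out.println(classify(' '));      // prints both
        System.out.println(classify('\t'));     // prints Java whitespace only
        // U+00A0 NO-BREAK SPACE is in [Zs] but excluded as a non-breaking space.
        System.out.println(classify('\u00A0')); // prints Unicode space only
        System.out.println(classify('a'));      // prints neither
    }
}
```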
Since:
Parameters:
Returns:
- true if ch is Java whitespace, else false
See Also:
toLowerCase
public static char toLowerCase(char ch)
Converts a Unicode character into its lowercase equivalent mapping.
If a mapping does not exist, then the character passed is returned.
Note that isLowerCase(toLowerCase(ch)) does not always return true.
Parameters:
Returns:
- lowercase mapping of ch, or ch if lowercase mapping does
not exist
See Also:
toString
public String toString()
Converts the wrapped character into a String.
Returns:
- a String containing one character -- the wrapped character
of this instance
toString
public static String toString(char ch)
Returns a String of length 1 representing the specified character.
Since:
Parameters:
Returns:
- a String containing the character
toTitleCase
public static char toTitleCase(char ch)
Converts a Unicode character into its titlecase equivalent mapping.
If a mapping does not exist, then the character passed is returned.
Note that isTitleCase(toTitleCase(ch)) does not always return true.
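The note above is easy to see with ordinary letters, whose titlecase mapping is simply their uppercase form. A minimal sketch, with hypothetical names (TitleCaseSketch, capitalize):

```java
final class TitleCaseSketch {
    // Hypothetical helper: titlecases the first char and leaves the rest alone.
    static String capitalize(String word) {
        if (word.isEmpty()) {
            return word;
        }
        return Character.toTitleCase(word.charAt(0)) + word.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(capitalize("hello")); // prints Hello
        // 'H' is an uppercase letter [Lu], not a titlecase letter [Lt].
        System.out.println(Character.isTitleCase(Character.toTitleCase('h'))); // prints false
        // U+01C9 (small lj) maps to U+01C8 (capital L with small j), which is [Lt].
        System.out.println(Character.isTitleCase(Character.toTitleCase('\u01C9'))); // prints true
    }
}
```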
Parameters:
Returns:
- titlecase mapping of ch, or ch if titlecase mapping does
not exist
See Also:
toUpperCase
public static char toUpperCase(char ch)
Converts a Unicode character into its uppercase equivalent mapping.
If a mapping does not exist, then the character passed is returned.
Note that isUpperCase(toUpperCase(ch)) does not always return true.
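Because the mapping is one char to one char, it can differ from String.toUpperCase, which may expand a character into several. The sketch below is a per-character conversion under that limitation; UpperCaseSketch and upperCaseEachChar are hypothetical names.

```java
final class UpperCaseSketch {
    // Hypothetical helper: applies Character.toUpperCase to each char independently.
    static String upperCaseEachChar(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            sb.append(Character.toUpperCase(text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseEachChar("abc1")); // prints ABC1
        // '1' has no uppercase mapping, so it comes back unchanged and is not uppercase.
        System.out.println(Character.isUpperCase(Character.toUpperCase('1'))); // prints false
        // U+00DF (sharp s) has no single-char uppercase mapping and is returned as is.
        System.out.println(Character.toUpperCase('\u00DF') == '\u00DF');       // prints true
    }
}
```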
Parameters:
Returns:
- uppercase mapping of ch, or ch if uppercase mapping does
not exist
See Also: