Java Coding: How To Find Number of Characters in a String (!= String.length())
By Angsuman Chakraborty, Gaea News Network
Monday, November 7, 2005
The length of a string can be interpreted in several ways:
- number of chars (UTF-16 code units) in the string
- number of characters (Unicode code points) in the string
- number of bytes in the string
String.length() gives you the number of chars in the string accurately.
However, a char is not necessarily a complete character. Why?
Supplementary characters exist in the Unicode character set. These are characters whose code points lie above the Basic Multilingual Plane (BMP), that is, they have values greater than 0xFFFF. They extend all the way up to 0x10FFFF.
In Java, these supplementary characters are represented as surrogate pairs, pairs of char units that fall in a specific range. The leading or high surrogate value is in the 0xD800 through 0xDBFF range. The trailing or low surrogate value is in the 0xDC00 through 0xDFFF range.
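To make the surrogate ranges concrete, here is a minimal sketch that builds one supplementary character and inspects its two char units. The class name SurrogatePairDemo, the helper fromCodePoint, and the choice of U+1D11E (MUSICAL SYMBOL G CLEF) are illustrative assumptions, not part of the original article.

```java
class SurrogatePairDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF: an arbitrary supplementary character
    // picked for illustration (any code point above 0xFFFF would do).
    static final int G_CLEF = 0x1D11E;

    // Hypothetical helper: builds a String from a single code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String clef = fromCodePoint(G_CLEF);
        char high = clef.charAt(0);
        char low = clef.charAt(1);

        // One character, but two char units.
        System.out.println(clef.length());                   // 2
        System.out.println(Character.isHighSurrogate(high)); // true
        System.out.println(Character.isLowSurrogate(low));   // true

        // The two units fall in the ranges described above.
        System.out.printf("%04X %04X%n", (int) high, (int) low); // D834 DD1E

        // Recombining the pair yields the original code point.
        System.out.println(Character.toCodePoint(high, low) == G_CLEF); // true
    }
}
```

Character.toChars, Character.isHighSurrogate, Character.isLowSurrogate and Character.toCodePoint were all added in J2SE 5.0 alongside the supplementary character support.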
J2SE 5.0 API has a new String method: codePointCount(int beginIndex, int endIndex) which tells you how many Unicode code points are between the two indices. The index values refer to code unit or char locations, so endIndex - beginIndex for the entire String is equivalent to the String’s length.
As before, the number of chars is:
int charLength = myString.length();
So the number of characters is:
int characterLength = myString.codePointCount(0, charLength);
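As a rough, self-contained illustration of the difference between the two counts, the sketch below compares length() with codePointCount() on a string containing one supplementary character. The class name CodePointCountDemo, the helper countCodePoints, and the sample text are assumptions made for the example.

```java
class CodePointCountDemo {
    // Hypothetical helper: counts Unicode code points in the whole string.
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "a", then U+1D11E (a supplementary character), then "b".
        String text = "a" + new String(Character.toChars(0x1D11E)) + "b";

        int charLength = text.length();              // counts char units
        int characterLength = countCodePoints(text); // counts code points

        System.out.println(charLength);      // 4
        System.out.println(characterLength); // 3

        // For text without supplementary characters the two counts agree.
        System.out.println(countCodePoints("hello") == "hello".length()); // true
    }
}
```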
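charLength and characterLength differ only when the string contains supplementary characters. Unless you plan to sell your software to China or Japan (read: internationalize), where rare ideographs fall outside the BMP, you are unlikely to encounter any difference between the two.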
So how many bytes are in a String?
int byteCount = myString.getBytes().length;
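getBytes() with no arguments encodes the string's Unicode characters using the platform's default charset and returns the result in a byte array, so the byte count depends on that charset. The default is often a legacy charset; UTF-8 is the exception, being a multibyte encoding of Unicode rather than a legacy charset. For a predictable count, name the charset explicitly, for example myString.getBytes("UTF-8").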
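Because the byte count depends on the encoding, it can help to see the same string measured under several charsets. The following is a minimal sketch, assuming Java 7 or later for java.nio.charset.StandardCharsets; the class name ByteCountDemo, the helper byteCount, and the sample string are illustrative only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteCountDemo {
    // Hypothetical helper: number of bytes the string occupies in the given charset.
    static int byteCount(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "h", e with acute accent (U+00E9), "llo": five chars, five characters.
        String text = "h\u00E9llo";

        System.out.println(text.length());                                // 5
        System.out.println(byteCount(text, StandardCharsets.ISO_8859_1)); // 5
        System.out.println(byteCount(text, StandardCharsets.UTF_8));      // 6
        System.out.println(byteCount(text, StandardCharsets.UTF_16BE));   // 10

        // The no-argument form uses the platform default charset,
        // so this result may vary from machine to machine.
        System.out.println(text.getBytes().length);
    }
}
```

UTF_16BE is used here rather than UTF_16 because the latter prepends a two-byte byte-order mark when encoding, which would inflate the count.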
March 10, 2010: 9:57 pm
can i have a program that question goes like this:
Navya
April 10, 2009: 2:19 pm
Hi, can i have the code for finding the length of the string without using any string fuctions in java.
Ramesh
April 1, 2008: 11:13 pm
how can i find a length of string with out using any function in java.. plz help me..
Phil Speroff
October 1, 2007: 3:41 pm
Is there a way to filter out the punctuation similar to setting delimiters to where the program only counts the letters?
August 30, 2007: 3:13 am
I gave the example code in the article. What other examples are you looking for?
Aayush
Jason
November 11, 2005: 11:12 am
So you saying that you should always use “int byteCount = myString.getBytes().length;” instead of “myString.length()”, just in case you Internationalize later?
baljidnorov