Class Utf8StringBuilder
- All Implemented Interfaces:
CharsetStringBuilder
- Direct Known Subclasses:
CharsetStringBuilder.ReportingUtf8StringBuilder, NullAppendable
UTF-8 StringBuilder.
This class wraps a standard StringBuilder and provides methods to append UTF-8 encoded bytes, which are converted into characters.
This class is stateful: up to 4 calls to append(byte) may be needed before a character is appended to the string buffer.
The UTF-8 decoding is done by this class itself; no additional buffers or Readers are used. The algorithm is
fail-fast, in that errors are detected as the bytes are appended. However, no exceptions are thrown when an
error is detected; only the hasCodingErrors() method indicates the failure. Coding errors are otherwise
replaced and may appear in the returned string, unless the build() method is used, which may throw
CharacterCodingException. Already decoded characters may also be appended (e.g. with append(char)),
making this class suitable for decoding %-encoded strings of already decoded characters.
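The stateful, fail-fast behaviour described above can be illustrated with a small sketch built only on the JDK's CharsetDecoder. This is not how Utf8StringBuilder is implemented (the class itself uses no additional buffers); the class name IncrementalUtf8Sketch and its fields are hypothetical, and the sketch only mimics the observable behaviour: bytes accumulate until a sequence is complete, and a bad byte sets an error flag and adds a replacement character instead of throwing.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not the Jetty implementation.
final class IncrementalUtf8Sketch
{
    private static final char REPLACEMENT = '\uFFFD';

    // A new decoder reports malformed input as an error result by default.
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    private final ByteBuffer pending = ByteBuffer.allocate(8);  // bytes of an unfinished sequence
    private final CharBuffer decoded = CharBuffer.allocate(8);
    private final StringBuilder buffer = new StringBuilder();
    private boolean codingErrors;

    void append(byte b)
    {
        pending.put(b);
        pending.flip();
        while (pending.hasRemaining())
        {
            decoded.clear();
            CoderResult result = decoder.decode(pending, decoded, false);
            decoded.flip();
            buffer.append(decoded);
            if (result.isError())
            {
                // Record the error and substitute a character, but do not throw.
                codingErrors = true;
                buffer.append(REPLACEMENT);
                pending.position(pending.position() + result.length());
            }
            else if (result.isUnderflow())
            {
                break; // an incomplete sequence is waiting for more bytes
            }
        }
        pending.compact(); // keep any unfinished sequence for the next call
    }

    boolean hasCodingErrors()
    {
        return codingErrors;
    }

    int length()
    {
        return buffer.length();
    }

    @Override
    public String toString()
    {
        return buffer.toString();
    }

    public static void main(String[] args)
    {
        IncrementalUtf8Sketch sketch = new IncrementalUtf8Sketch();
        byte[] euro = {(byte)0xE2, (byte)0x82, (byte)0xAC}; // the euro sign, U+20AC
        for (byte b : euro)
        {
            sketch.append(b);
            System.out.println("length=" + sketch.length());
        }
        sketch.append((byte)0xFF); // 0xFF is never valid in UTF-8
        System.out.println("errors=" + sketch.hasCodingErrors());
    }
}
```

In this sketch the length stays at 0 for the first two bytes of the three-byte sequence and becomes 1 only once the third byte arrives.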
-
Nested Class Summary
Modifier and Type  Class  Description
static class
static class
Nested classes/interfaces inherited from interface org.eclipse.jetty.util.CharsetStringBuilder
CharsetStringBuilder.DecoderStringBuilder, CharsetStringBuilder.Iso88591StringBuilder, CharsetStringBuilder.ReportingUtf8StringBuilder, CharsetStringBuilder.UsAsciiStringBuilder
-
Field Summary
-
Constructor Summary
Modifier  Constructor  Description
public Utf8StringBuilder()
public Utf8StringBuilder(int capacity)
protected Utf8StringBuilder(StringBuilder buffer)
-
Method Summary
Modifier and Type  Method  Description
void append(byte b)
void append(byte[] b)
void append(byte[] b, int offset, int length)
boolean append(byte[] b, int offset, int length, int maxChars)
void append(char c)
void append
void append
void append(ByteBuffer buf)
void appendByte(byte b)
protected void bufferAppend(char c)
protected void bufferReset()
String build() - Build the completed string and reset the buffer.
protected void checkCharAppend()
void complete() - Complete the appendable, adding a replacement character and coding error if the sequence is not currently complete.
boolean hasCodingErrors()
boolean isComplete()
int length()
void partialReset() - Partially reset the appendable: clear the buffer and clear any errors, but retain the decoding state of any partially decoded sequences.
void reset() - Reset the appendable, clearing the buffer, resetting decoding state and clearing any errors.
<X extends Throwable> String takeCompleteString(Supplier<X> onCodingError) - Take the completely decoded string.
<X extends Throwable> String takePartialString(Supplier<X> onCodingError) - Take the partially decoded string.
String toCompleteString() - Get the completely decoded string, which is equivalent to calling complete() then toString().
String toPartialString()
String toString()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.eclipse.jetty.util.CharsetStringBuilder
append
-
Field Details
-
REPLACEMENT
public static final char REPLACEMENT
-
_state
protected int _state
-
-
Constructor Details
-
Utf8StringBuilder
public Utf8StringBuilder()
-
Utf8StringBuilder
public Utf8StringBuilder(int capacity)
-
Utf8StringBuilder
protected Utf8StringBuilder(StringBuilder buffer)
-
-
Method Details
-
length
public int length()
- Specified by:
length in interface CharsetStringBuilder
- Returns:
- the length in characters
-
hasCodingErrors
public boolean hasCodingErrors()
- Returns:
true if the decoded characters contained UTF-8 coding errors.
-
reset
public void reset()
Reset the appendable, clearing the buffer, resetting decoding state and clearing any errors.
- Specified by:
reset in interface CharsetStringBuilder
-
partialReset
public void partialReset()
Partially reset the appendable: clear the buffer and clear any errors, but retain the decoding state of any partially decoded sequences.
-
checkCharAppend
protected void checkCharAppend()
-
append
public void append(char c)
- Specified by:
append in interface CharsetStringBuilder
- Parameters:
c - A decoded character to append
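Mixing already decoded characters with encoded bytes is what percent-decoding requires: literal characters are appended as-is, while each %XX escape contributes one encoded byte. The following is a minimal JDK-only sketch of that idea, not Jetty's own decoder; PercentDecodeSketch and its methods are hypothetical names, and it collects escaped bytes in a temporary buffer rather than decoding them incrementally.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not the Jetty implementation.
final class PercentDecodeSketch
{
    static String decode(String encoded)
    {
        StringBuilder out = new StringBuilder();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++)
        {
            char c = encoded.charAt(i);
            if (c == '%' && i + 2 < encoded.length())
            {
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0)
                {
                    bytes.write((hi << 4) | lo); // an encoded byte
                    i += 2;
                    continue;
                }
            }
            flush(bytes, out);
            out.append(c); // an already decoded character
        }
        flush(bytes, out);
        return out.toString();
    }

    // Decode any collected bytes as UTF-8; malformed input is replaced, not rejected.
    private static void flush(ByteArrayOutputStream bytes, StringBuilder out)
    {
        if (bytes.size() > 0)
        {
            out.append(new String(bytes.toByteArray(), StandardCharsets.UTF_8));
            bytes.reset();
        }
    }

    public static void main(String[] args)
    {
        System.out.println(decode("caf%C3%A9%20au%20lait"));
    }
}
```

A '%' that is not followed by two hexadecimal digits is kept as a literal character in this sketch; a stricter decoder might treat that as an error instead.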
-
append
-
append
-
append
public void append(byte b)
- Specified by:
append in interface CharsetStringBuilder
- Parameters:
b - An encoded byte to append
-
append
public void append(ByteBuffer buf)
- Specified by:
append in interface CharsetStringBuilder
- Parameters:
buf - Buffer of encoded bytes to append. The bytes are consumed from the buffer.
-
append
public void append(byte[] b)
- Specified by:
append in interface CharsetStringBuilder
- Parameters:
b - Array of encoded bytes to append
-
append
public void append(byte[] b, int offset, int length)
- Specified by:
append in interface CharsetStringBuilder
- Parameters:
b - Array of encoded bytes
offset - offset into the array
length - the number of bytes to append from the array.
-
append
public boolean append(byte[] b, int offset, int length, int maxChars)
-
bufferAppend
protected void bufferAppend(char c)
-
bufferReset
protected void bufferReset()
-
appendByte
- Throws:
IOException
-
isComplete
public boolean isComplete()
- Returns:
true if the appended sequences are complete UTF-8 sequences.
-
complete
public void complete()
Complete the appendable, adding a replacement character and coding error if the sequence is not currently complete.
-
toString
-
toPartialString
- Returns:
- The currently decoded string, excluding any partial sequences appended.
-
toCompleteString
Get the completely decoded string, which is equivalent to calling complete() then toString().
- Returns:
- The completely decoded string.
-
takeCompleteString
Take the completely decoded string.
- Type Parameters:
X - The type of the exception thrown
- Parameters:
onCodingError - A supplier of a Throwable to use if hasCodingErrors() returns true, or null for no error action
- Returns:
- The complete string.
- Throws:
X - if hasCodingErrors() is true after complete().
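The Supplier-based error handling lets the caller choose which exception type, if any, is raised for coding errors. A minimal sketch of that pattern follows, using only the JDK. TakeStringSketch, take and markCodingError are hypothetical names that stand in for the real decoder, and whether the real method clears its state before throwing is not stated here, so the sketch leaves the state untouched in that case.

```java
import java.util.function.Supplier;

// Hypothetical illustration only; this is not the Jetty implementation.
final class TakeStringSketch
{
    private final StringBuilder buffer = new StringBuilder();
    private boolean codingErrors;

    void append(char c)
    {
        buffer.append(c);
    }

    // Stand-in for a decoder detecting bad input: flag the error and substitute.
    void markCodingError()
    {
        codingErrors = true;
        buffer.append('\uFFFD');
    }

    // The caller supplies the exception type; a null supplier means "no error action".
    <X extends Throwable> String take(Supplier<X> onCodingError) throws X
    {
        if (onCodingError != null && codingErrors)
        {
            X failure = onCodingError.get();
            if (failure != null)
                throw failure;
        }
        String result = buffer.toString();
        buffer.setLength(0);
        codingErrors = false;
        return result;
    }

    public static void main(String[] args)
    {
        TakeStringSketch sketch = new TakeStringSketch();
        sketch.append('o');
        sketch.append('k');
        System.out.println(sketch.take(null));

        sketch.markCodingError();
        try
        {
            sketch.take(() -> new IllegalStateException("bad UTF-8"));
        }
        catch (IllegalStateException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the exception type is a type parameter, a caller can pass a supplier of a checked exception and the compiler will require it to be handled, while a supplier of an unchecked exception needs no extra handling.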
-
takePartialString
Take the partially decoded string.
- Type Parameters:
X - The type of the exception thrown
- Parameters:
onCodingError - A supplier of a Throwable to use if hasCodingErrors() returns true, or null for no error action
- Returns:
- The partially decoded string.
- Throws:
X - if hasCodingErrors() is true after complete().
-
build
Description copied from interface: CharsetStringBuilder
Build the completed string and reset the buffer.
- Specified by:
build in interface CharsetStringBuilder
- Returns:
- The decoded built string which must be complete in regard to any multibyte sequences.
- Throws:
CharacterCodingException
- If the bytes cannot be correctly decoded or a multibyte sequence is incomplete.
-
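The difference between a lenient result (coding errors replaced) and a strict build that throws CharacterCodingException can be sketched with the JDK alone. This is only an analogy for the two behaviours, not the Jetty implementation; StrictVersusLenientSketch and its method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not the Jetty implementation.
final class StrictVersusLenientSketch
{
    // Lenient: malformed or truncated input is replaced with U+FFFD.
    static String lenient(byte[] utf8)
    {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Strict: malformed or truncated input raises CharacterCodingException.
    static String strict(byte[] utf8) throws CharacterCodingException
    {
        return StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .decode(ByteBuffer.wrap(utf8))
            .toString();
    }

    public static void main(String[] args)
    {
        byte[] truncated = {'a', (byte)0xE2, (byte)0x82}; // three-byte sequence missing its last byte
        System.out.println("lenient: " + lenient(truncated));
        try
        {
            System.out.println("strict: " + strict(truncated));
        }
        catch (CharacterCodingException e)
        {
            System.out.println("strict decoding failed: " + e);
        }
    }
}
```

The strict form suits input that must be rejected when invalid, such as protocol fields; the lenient form suits logging or display, where a best-effort string is preferable to a failure.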