Class SearchPattern
Fast search for patterns within strings, arrays of bytes and ByteBuffers.
Uses an implementation of the Boyer–Moore–Horspool algorithm with a 256 character alphabet.
The algorithm has an average-case complexity of O(n) on random text and O(nm) in the worst case, where m = pattern length and n = length of data to search.
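For orientation, the general Boyer–Moore–Horspool technique over a 256-entry byte alphabet can be sketched as follows. This is a minimal illustration only, not the library's own code: the class name HorspoolSketch and the methods buildShiftTable and indexOf are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not the library's implementation.
final class HorspoolSketch {
    private static final int ALPHABET_SIZE = 256;

    // For each possible byte value, how far the search window may shift
    // when that byte is aligned with the last position of the pattern.
    static int[] buildShiftTable(byte[] pattern) {
        int[] table = new int[ALPHABET_SIZE];
        Arrays.fill(table, pattern.length);
        for (int i = 0; i < pattern.length - 1; i++) {
            table[pattern[i] & 0xFF] = pattern.length - 1 - i;
        }
        return table;
    }

    // Returns the index in data of the first full match inside
    // [offset, offset + length), or -1 if there is none.
    static int indexOf(byte[] data, int offset, int length, byte[] pattern) {
        int m = pattern.length;
        if (m == 0) {
            throw new IllegalArgumentException("empty pattern");
        }
        int[] table = buildShiftTable(pattern);
        int end = offset + length;
        int skip = offset;
        while (skip <= end - m) {
            // Compare the window against the pattern from right to left.
            int i = m - 1;
            while (i >= 0 && data[skip + i] == pattern[i]) {
                i--;
            }
            if (i < 0) {
                return skip;
            }
            // Shift by the table entry of the byte under the window's last position.
            skip += table[data[skip + m - 1] & 0xFF];
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = "GET /index.html HTTP/1.1\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] pattern = "\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexOf(data, 0, data.length, pattern)); // prints 24
    }
}
```

The shift table is what gives the O(n) average case: on a mismatch the window usually jumps several bytes at once. In the worst case every window is compared almost in full, which gives O(nm).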
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| static SearchPattern | compile(byte[] pattern) | Creates a SearchPattern instance which can be used to find matches of the pattern in data. |
| static SearchPattern | compile(String pattern) | Creates a SearchPattern instance which can be used to find matches of the pattern in data. |
| int | endsWith(byte[] data, int offset, int length) | Search for a partial match of the pattern at the end of the data. |
| int | endsWith(ByteBuffer buffer) | Searches for a partial match of the pattern at the end of the ByteBuffer. |
| int | getLength() | |
| byte[] | getPattern() | |
| int | match(byte[] data, int offset, int length) | Search for a complete match of the pattern within the data. |
| int | match(ByteBuffer buffer) | Searches for a full match of the pattern in the ByteBuffer. |
| int | startsWith(byte[] data, int offset, int length, int matched) | Search for a possibly partial match of the pattern at the start of the data. |
| int | startsWith(ByteBuffer buffer, int matched) | Searches for a partial match of the pattern at the beginning of the ByteBuffer. |
-
Method Details
-
compile
public static SearchPattern compile(byte[] pattern)
Creates a SearchPattern instance which can be used to find matches of the pattern in data.
Parameters:
  pattern - byte array containing the pattern to search
Returns:
  a new SearchPattern instance using the given pattern
-
compile
public static SearchPattern compile(String pattern)
Creates a SearchPattern instance which can be used to find matches of the pattern in data.
The pattern string must only contain ASCII characters.
Parameters:
  pattern - string containing the pattern to search
Returns:
  a new SearchPattern instance using the given pattern
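A caller that wants to enforce the ASCII-only requirement before compiling a string pattern could check it up front. The sketch below is an assumption-laden illustration: the class AsciiPatternSketch and the method toAsciiBytes are hypothetical, and this page does not state how the library itself reacts to non-ASCII input, so rejecting with an exception is a choice made for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the documented API.
final class AsciiPatternSketch {
    // Encodes the pattern as US-ASCII bytes, rejecting any character above 0x7F.
    static byte[] toAsciiBytes(String pattern) {
        for (int i = 0; i < pattern.length(); i++) {
            if (pattern.charAt(i) > 0x7F) {
                throw new IllegalArgumentException("non-ASCII character at index " + i);
            }
        }
        return pattern.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(toAsciiBytes("--boundary").length); // prints 10
    }
}
```

The resulting byte array is the kind of input the byte[] overload of compile accepts.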
-
getPattern
public byte[] getPattern()
Returns:
  the pattern to search.
-
match
public int match(byte[] data, int offset, int length)
Search for a complete match of the pattern within the data.
Parameters:
  data - The data in which to search. The data may be arbitrary binary data, but the pattern will always be StandardCharsets.US_ASCII encoded.
  offset - The offset within the data at which to start the search
  length - The length of the data to search
Returns:
  The index within the data array at which the first instance of the pattern was found, or -1 if not found
-
match
public int match(ByteBuffer buffer)
Searches for a full match of the pattern in the ByteBuffer.
The ByteBuffer may contain arbitrary binary data, but the pattern will always be StandardCharsets.US_ASCII encoded.
The position and limit of the ByteBuffer are not changed.
Parameters:
  buffer - the ByteBuffer to search
Returns:
  the number of bytes after the buffer's position at which the full pattern was found, or -1 if the full pattern was not found
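The two properties described here, a result relative to the buffer's position and an unchanged position and limit, can be illustrated with absolute ByteBuffer reads. This is only a sketch under assumptions: BufferMatchSketch and its match method are hypothetical, and it uses a naive scan rather than the Boyer–Moore–Horspool search to keep the example short.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; a naive scan, not the library's search.
final class BufferMatchSketch {
    // Returns how many bytes after buffer.position() the full pattern starts,
    // or -1 if it does not occur between position and limit.
    static int match(ByteBuffer buffer, byte[] pattern) {
        int position = buffer.position();
        int remaining = buffer.remaining();
        for (int start = 0; start + pattern.length <= remaining; start++) {
            int i = 0;
            // Absolute get(index) reads do not move the buffer's position.
            while (i < pattern.length && buffer.get(position + start + i) == pattern[i]) {
                i++;
            }
            if (i == pattern.length) {
                return start;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap("..abc..".getBytes(StandardCharsets.US_ASCII));
        buffer.position(1);
        byte[] pattern = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(match(buffer, pattern)); // prints 1
        System.out.println(buffer.position());      // prints 1
    }
}
```

In the usage above the pattern sits at absolute index 2, but the result is 1 because the buffer's position is 1.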
-
endsWith
public int endsWith(byte[] data, int offset, int length)
Search for a partial match of the pattern at the end of the data.
Parameters:
  data - The data in which to search. The data may be arbitrary binary data, but the pattern will always be StandardCharsets.US_ASCII encoded.
  offset - The offset within the data at which to start the search
  length - The length of the data to search
Returns:
  the length of the partial pattern matched, or 0 for no match.
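One way to read "partial match at the end of the data" is: the longest run of bytes at the end of the searched region that equals the beginning of the pattern. The sketch below implements that reading; EndsWithSketch is a hypothetical class, and the interpretation itself is an assumption based on the description above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a suffix/prefix overlap check.
final class EndsWithSketch {
    // Returns the largest k such that the last k bytes of
    // data[offset, offset + length) equal the first k bytes of pattern, or 0.
    static int endsWith(byte[] data, int offset, int length, byte[] pattern) {
        int end = offset + length;
        int max = Math.min(length, pattern.length);
        for (int k = max; k > 0; k--) {
            int i = 0;
            while (i < k && data[end - k + i] == pattern[i]) {
                i++;
            }
            if (i == k) {
                return k;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        byte[] pattern = "\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] data = "abc\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(endsWith(data, 0, data.length, pattern)); // prints 2
    }
}
```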
-
endsWith
public int endsWith(ByteBuffer buffer)
Searches for a partial match of the pattern at the end of the ByteBuffer.
The ByteBuffer may contain arbitrary binary data, but the pattern will always be StandardCharsets.US_ASCII encoded.
The position and limit of the ByteBuffer are not changed.
Parameters:
  buffer - the ByteBuffer to search
Returns:
  how many bytes of the pattern were matched at the end of the ByteBuffer, or 0 for no match
-
startsWith
public int startsWith(byte[] data, int offset, int length, int matched)
Search for a possibly partial match of the pattern at the start of the data.
Parameters:
  data - The data in which to search. The data may be arbitrary binary data, but the pattern will always be StandardCharsets.US_ASCII encoded.
  offset - The offset within the data at which to start the search
  length - The length of the data to search
  matched - The length of the partial pattern already matched
Returns:
  the length of the partial pattern matched, or 0 for no match.
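The role of the matched parameter can be illustrated with a small sketch that resumes comparison at pattern index matched. StartsWithSketch is a hypothetical class, and returning the running total (already matched plus newly matched bytes) is an assumption made for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of resuming a partially matched pattern.
final class StartsWithSketch {
    // Compares the start of data[offset, offset + length) with the part of
    // the pattern that follows the first `matched` bytes. Returns the total
    // number of pattern bytes matched so far, or 0 on any mismatch.
    static int startsWith(byte[] data, int offset, int length, int matched, byte[] pattern) {
        int toCheck = Math.min(length, pattern.length - matched);
        for (int i = 0; i < toCheck; i++) {
            if (data[offset + i] != pattern[matched + i]) {
                return 0;
            }
        }
        return matched + toCheck;
    }

    public static void main(String[] args) {
        byte[] pattern = "\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] data = "\r\nbody".getBytes(StandardCharsets.US_ASCII);
        // Two pattern bytes were matched earlier; the next two complete it.
        System.out.println(startsWith(data, 0, data.length, 2, pattern)); // prints 4
    }
}
```

A result equal to the pattern length means the pattern is now complete; a smaller non-zero result means the data ran out while the pattern was still matching.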
-
startsWith
public int startsWith(ByteBuffer buffer, int matched)
Searches for a partial match of the pattern at the beginning of the ByteBuffer.
The ByteBuffer may contain arbitrary binary data, but the pattern will always be StandardCharsets.US_ASCII encoded.
The position and limit of the ByteBuffer are not changed.
Parameters:
  buffer - the ByteBuffer to search
  matched - how many bytes of the pattern were already matched
Returns:
  how many bytes of the pattern were matched (including those already matched) at the beginning of the ByteBuffer, or 0 for no match
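A plausible use of the partial-match methods together is detecting a pattern that is split across two consecutive buffers: the count matched at the end of the first buffer is carried into the check at the beginning of the second. The sketch below re-implements both checks with absolute reads so that it is self-contained; ChunkBoundarySketch and its methods are hypothetical and only mirror the behavior described on this page.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of matching a pattern across two buffers.
final class ChunkBoundarySketch {
    // How many bytes at the end of the buffer equal the start of the pattern.
    static int endsWith(ByteBuffer buffer, byte[] pattern) {
        int limit = buffer.limit();
        int max = Math.min(buffer.remaining(), pattern.length);
        for (int k = max; k > 0; k--) {
            int i = 0;
            while (i < k && buffer.get(limit - k + i) == pattern[i]) {
                i++;
            }
            if (i == k) {
                return k;
            }
        }
        return 0;
    }

    // Total pattern bytes matched once the start of the buffer is taken into
    // account, given that `matched` bytes were matched before; 0 on mismatch.
    static int startsWith(ByteBuffer buffer, int matched, byte[] pattern) {
        int position = buffer.position();
        int toCheck = Math.min(buffer.remaining(), pattern.length - matched);
        for (int i = 0; i < toCheck; i++) {
            if (buffer.get(position + i) != pattern[matched + i]) {
                return 0;
            }
        }
        return matched + toCheck;
    }

    public static void main(String[] args) {
        byte[] pattern = "\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer first = ByteBuffer.wrap("header\r\n\r".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer second = ByteBuffer.wrap("\nbody".getBytes(StandardCharsets.US_ASCII));
        int partial = endsWith(first, pattern);            // 3 bytes at the end of first
        int total = startsWith(second, partial, pattern);  // 4: pattern completed in second
        System.out.println(partial + " " + total);         // prints 3 4
    }
}
```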
-
getLength
public int getLength()
Returns:
  The length of the pattern in bytes.
-