Interface HttpURI

All Known Implementing Classes:
HttpURI.Immutable, HttpURI.Mutable, HttpURI.Unsafe

public interface HttpURI

Representation of HTTP URIs.

HttpURI.Mutable instances are obtained from the static build() methods and HttpURI.Immutable instances from the static from(...) methods, such as from(String). HttpURI.Unsafe can be used for tests or for invalid URIs.

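An HTTP URI such as http://user@host:port/path;param1/%2e/f%6fo%2fbar%20bob;param2?query#fragment is split into optional elements, each exposed through its own getter: getScheme(), getUser(), getHost(), getPort(), getPath(), getParam(), getQuery() and getFragment().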
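
As a rough illustration of this kind of decomposition, the sketch below splits a similar URI with the JDK's java.net.URI. This is not HttpURI: java.net.URI has no notion of path parameters or compliance violations, and the class name UriComponentsDemo and the port 8080 are assumptions made only for the example.

```java
import java.net.URI;

public class UriComponentsDemo {
    // Returns scheme, user info, host, port, raw path, query and fragment,
    // in that order, as reported by java.net.URI (not by HttpURI).
    static String[] split(String text) {
        URI uri = URI.create(text);
        return new String[] {
            uri.getScheme(),
            uri.getUserInfo(),
            uri.getHost(),
            String.valueOf(uri.getPort()),
            uri.getRawPath(),   // raw form: percent-encodings are left untouched
            uri.getQuery(),
            uri.getFragment()
        };
    }

    public static void main(String[] args) {
        String text = "http://user@host:8080/path;param1/%2e/f%6fo%2fbar%20bob;param2?query#fragment";
        String[] names = {"scheme", "user", "host", "port", "path", "query", "fragment"};
        String[] parts = split(text);
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " = " + parts[i]);
        }
    }
}
```

Any of these elements may be absent from a given URI, which is why the HttpURI getters documented below return null (or -1 for the port) when an element is not set.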

The path part of the URI is provided in both raw form (getPath()) and decoded form (getCanonicalPath()). The decoded form has path parameters removed, percent-encoded characters expanded and relative segments resolved. This approach is somewhat contrary to RFC 3986, which no longer defines path parameters (they were removed after RFC 2396) and which specifies that relative segment normalization should take place before percent-encoded characters are expanded. A literal interpretation of the RFC can produce URI paths that are ambiguous when viewed as strings. For example, the URI /foo%2f..%2fbar is technically a single segment that decodes to "/foo/../bar", but a file system could easily misinterpret it as three segments resolving to "/bar".
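
The effect of the two orderings can be sketched with the JDK's java.net.URI. This is only an illustration of the ambiguity described above, not HttpURI's implementation; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class AmbiguousPathDemo {
    // RFC 3986 order: resolve dot segments on the raw path, then decode.
    // "%2f" is not a separator in the raw path, so nothing is resolved.
    static String normalizeThenDecode(String rawPath) {
        return URI.create(rawPath).normalize().getPath();
    }

    // Decode first, then resolve dot segments on the decoded string.
    // The decoded "/" characters are now treated as separators.
    static String decodeThenNormalize(String rawPath) {
        String decoded = URI.create(rawPath).getPath();
        try {
            return new URI(null, null, decoded, null).normalize().getPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "/foo%2f..%2fbar";
        System.out.println(normalizeThenDecode(raw)); // prints /foo/../bar
        System.out.println(decodeThenNormalize(raw)); // prints /bar
    }
}
```

The first result is a string that a later consumer may resolve again to "/bar", which is the ambiguity in question.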

Thus this class avoids and/or detects such ambiguities. Furthermore, by decoding characters and removing parameters before relative path normalization, ambiguous paths are resolved in a way that is non-standard but unambiguous to downstream interpretation of the decoded path string.

This class collates any violations of the specification and/or best practices, and makes them available from getViolations(). Users of this class should check against a configured UriCompliance mode whether the HttpURI is suitable for use (see ComplianceUtils.verify(UriCompliance, HttpURI, ComplianceViolation.Listener, Function)).

For example, implementations that wish to process ambiguous URI paths must configure the compliance modes to accept them and then perform their own decoding of getPath().

If there are multiple path parameters, only the last one is returned by getParam().
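
The "last parameter" rule can be sketched as a small standalone helper. This is a hypothetical illustration, not HttpURI's parser; in particular, it assumes that a parameter only counts when it trails the final path segment, which the text above does not state explicitly.

```java
public class PathParamDemo {
    // Hypothetical helper: returns the text after the last ';' in a raw path,
    // or null if there is no such parameter.
    static String lastPathParam(String rawPath) {
        if (rawPath == null) {
            return null;
        }
        int semi = rawPath.lastIndexOf(';');
        // Assumption: a ';' followed by a later '/' belongs to an earlier
        // segment and is ignored.
        if (semi < 0 || rawPath.indexOf('/', semi) >= 0) {
            return null;
        }
        return rawPath.substring(semi + 1);
    }

    public static void main(String[] args) {
        System.out.println(lastPathParam("/path;param1/%2e/f%6fo%2fbar%20bob;param2")); // prints param2
        System.out.println(lastPathParam("/path"));                                     // prints null
    }
}
```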

  • Method Details

    • build

      static HttpURI.Mutable build()
    • build

      static HttpURI.Mutable build(HttpURI uri)
    • build

      static HttpURI.Mutable build(HttpURI uri, String pathQuery)
    • build

      static HttpURI.Mutable build(HttpURI uri, String path, String param, String query)
    • build

      static HttpURI.Mutable build(URI uri)
    • build

      static HttpURI.Mutable build(String uri)
    • build

      static HttpURI.Mutable build(String method, String uri)
    • from

      static HttpURI.Immutable from(URI uri)
    • from

      static HttpURI.Immutable from(String uri)
    • from

      static HttpURI.Immutable from(String method, String uri)
    • from

      static HttpURI.Immutable from(String scheme, HostPort hostPort, String pathQuery)
    • from

      static HttpURI.Immutable from(String scheme, String host, int port, String pathQuery)
    • from

      static HttpURI.Immutable from(String scheme, String host, int port, String path, String query, String fragment)

      Creates a new HttpURI with the given arguments.

      Parameters:
      scheme - the URI scheme (normalized to lower-case)
      host - the URI host
      port - the URI port, or -1 for no port
      path - the URI path
      query - the URI query
      fragment - the URI fragment
      Returns:
      a new HttpURI
    • asImmutable

      HttpURI.Immutable asImmutable()
      Returns:
      An immutable copy of this HttpURI.
    • asString

      String asString()
      Returns:
      The URI as a string.
    • getAuthority

      String getAuthority()
      Returns:
      The authority component of the URI in the form host:port, or just host if the port is not set, or null if no host is set.
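
The three cases in this return contract can be sketched as follows. The helper is hypothetical and standalone, not HttpURI code; it assumes any negative port means "not set", in line with the -1 convention used by getPort().

```java
public class AuthorityDemo {
    // Hypothetical helper mirroring the documented contract:
    // host:port, or just host when no port is set, or null when no host is set.
    static String authority(String host, int port) {
        if (host == null) {
            return null;
        }
        return port < 0 ? host : host + ":" + port;
    }

    public static void main(String[] args) {
        System.out.println(authority("example.com", 8080)); // prints example.com:8080
        System.out.println(authority("example.com", -1));   // prints example.com
        System.out.println(authority(null, 8080));          // prints null
    }
}
```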
    • getDecodedPath

      String getDecodedPath()
      Returns:
      The decoded path with percent-encoded characters decoded, or null if no path is set.
    • getCanonicalPath

      String getCanonicalPath()
      Returns:
      The canonical path with path parameters removed and percent-encoded characters decoded, or null if no path is set.
    • getFragment

      String getFragment()
      Returns:
      The fragment component of the URI (after the # character), or null if not set.
    • getHost

      String getHost()
      Returns:
      The host component of the URI, or null if not set.
    • getParam

      String getParam()

      Get a URI path parameter.

      Path parameters were defined in RFC 2068 and appear after a semicolon in the path, such as /path;param. This is distinct from query parameters which appear after the ? character.

      Returns:
      The last path parameter, or null if no path parameter is present. If there are multiple path parameters, only the last one is returned.
    • getPath

      String getPath()
      Returns:
      The raw, undecoded, path component of the URI including path parameters, or null if not set.
    • getPathQuery

      String getPathQuery()
      Returns:
      The raw, undecoded, path and query components combined as path?query, or just the path if no query is present, or null if no path is set.
    • getPort

      int getPort()
      Returns:
      The port number of the URI, or -1 if not set.
    • getQuery

      String getQuery()
      Returns:
      The query string component of the URI (after the ? character but before any # fragment), or null if not set.
    • getScheme

      String getScheme()
      Returns:
      The URI scheme such as http or https, or null if not set.
    • getUser

      String getUser()
      Returns:
      The user info component of the URI authority (before the @ character), or null if not set.
    • hasAuthority

      boolean hasAuthority()
      Returns:
      true if the URI has an authority component.
    • isAbsolute

      boolean isAbsolute()
      Returns:
      true if the URI has a scheme component.
    • isAmbiguous

      boolean isAmbiguous()

      Checks if the URI contains any ambiguous path violations that could be interpreted differently by different URI parsers.

      Returns:
      true if the URI has any ambiguous UriCompliance.Violations.
    • hasViolations

      boolean hasViolations()

      Checks if the URI has any compliance violations against the URI specification or best practices.

      Returns:
      true if the URI has any UriCompliance.Violations.
    • hasViolation

      boolean hasViolation(UriCompliance.Violation violation)
      Parameters:
      violation - The violation to check.
      Returns:
      true if the URI has the specified violation.
    • getViolations

      Returns:
      The set of UriCompliance.Violations detected in the URI, or an empty set if none.
    • hasAmbiguousSegment

      default boolean hasAmbiguousSegment()
      Returns:
      true if the URI has a possibly ambiguous segment, such as '..;' or '%2e%2e'.
    • hasAmbiguousEmptySegment

      default boolean hasAmbiguousEmptySegment()
      Returns:
      true if the URI has an empty segment that is ambiguous, such as '//' or '/;param/'.
    • hasAmbiguousSeparator

      default boolean hasAmbiguousSeparator()
      Returns:
      true if the URI has a possibly ambiguous separator, that is, an encoded '/' (%2f).
    • hasAmbiguousParameter

      default boolean hasAmbiguousParameter()
      Returns:
      true if the URI has a possibly ambiguous path parameter, such as '..;'.
    • hasAmbiguousEncoding

      default boolean hasAmbiguousEncoding()
      Returns:
      true if the URI has an encoded '%' character.
    • hasUtf16Encoding

      default boolean hasUtf16Encoding()
      Returns:
      true if the URI has UTF-16 '%u' encodings.
    • toURI

      default URI toURI()