public class URLUtil
extends java.lang.Object
Constructor and Description |
---|
URLUtil() |
Modifier and Type | Method and Description |
---|---|
static java.lang.String | chooseRepr(java.lang.String src, java.lang.String dst, boolean temp): Given two urls, the source and the destination of a redirect, returns the representative url. |
static java.lang.String | getDomainName(java.lang.String url): Returns the domain name of the url. |
static java.lang.String | getDomainName(java.net.URL url): Returns the domain name of the url. |
static DomainSuffix | getDomainSuffix(java.lang.String url): Returns the DomainSuffix corresponding to the last public part of the hostname. |
static DomainSuffix | getDomainSuffix(java.net.URL url): Returns the DomainSuffix corresponding to the last public part of the hostname. |
static java.lang.String | getHost(java.lang.String url): Returns the lowercased hostname for the url, or null if the url is not well formed. |
static java.lang.String[] | getHostBatches(java.lang.String url): Partitions the hostname of the url by ".". |
static java.lang.String[] | getHostBatches(java.net.URL url): Partitions the hostname of the url by ".". |
static java.lang.String | getPage(java.lang.String url): Returns the page for the url. |
static boolean | isSameDomainName(java.lang.String url1, java.lang.String url2): Returns whether the given urls have the same domain name. |
static boolean | isSameDomainName(java.net.URL url1, java.net.URL url2): Returns whether the given urls have the same domain name. |
static void | main(java.lang.String[] args): For testing. |
static java.net.URL | resolveURL(java.net.URL base, java.lang.String target): Resolves relative URLs and works around a java.net.URL error in the handling of URLs with pure query targets. |
static java.lang.String | toASCII(java.lang.String url) |
static java.lang.String | toUNICODE(java.lang.String url) |
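public static java.net.URL resolveURL(java.net.URL base, java.lang.String target) throws java.net.MalformedURLException
Resolves relative URLs and works around a java.net.URL error in the handling of URLs with pure query targets.
Parameters:
base - base url
target - target url (may be relative)
Throws:
java.net.MalformedURLException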
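To illustrate the pure-query case, here is a minimal sketch of one possible workaround. It is not the library's code. It assumes the problem is that java.net.URL drops the last path segment of the base when the target consists only of a query string such as ?y=1. The class name ResolveSketch and the method resolve are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class ResolveSketch {

    /**
     * Hypothetical helper: resolves target against base, treating a
     * pure-query target ("?...") as a query on the base's own last
     * path segment.
     */
    static URL resolve(URL base, String target) throws MalformedURLException {
        target = target.trim();
        if (target.startsWith("?")) {
            String path = base.getPath();
            // Last path segment of the base, e.g. "d" for "/b/c/d".
            String last = path.substring(path.lastIndexOf('/') + 1);
            // Prepend it so the target is no longer query-only.
            target = last + target;
        }
        return new URL(base, target);
    }

    public static void main(String[] args) throws MalformedURLException {
        URL base = new URL("http://a/b/c/d");
        System.out.println(resolve(base, "?y=1")); // pure query target
        System.out.println(resolve(base, "g"));    // ordinary relative target
    }
}
```

Targets that do not start with "?" go straight to the two-argument URL constructor, so ordinary relative resolution is unchanged in this sketch.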
public static java.lang.String getDomainName(java.net.URL url)
Returns the domain name of the url. For example, getDomainName(new URL("http://lucene.apache.org/")) returns apache.org.
public static java.lang.String getDomainName(java.lang.String url) throws java.net.MalformedURLException
Returns the domain name of the url. For example, getDomainName("http://lucene.apache.org/") returns apache.org.
Throws:
java.net.MalformedURLException
public static boolean isSameDomainName(java.net.URL url1, java.net.URL url2)
Returns whether the given urls have the same domain name. For example, isSameDomainName(new URL("http://lucene.apache.org"), new URL("http://people.apache.org/")) returns true.
public static boolean isSameDomainName(java.lang.String url1, java.lang.String url2) throws java.net.MalformedURLException
Returns whether the given urls have the same domain name. For example, isSameDomainName("http://lucene.apache.org", "http://people.apache.org/") returns true.
Throws:
java.net.MalformedURLException
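The examples for getDomainName and isSameDomainName suggest that the domain name is the hostname reduced to its public suffix plus one more label. A rough sketch of that idea follows. It is not how URLUtil is implemented: the real methods presumably rely on a table of public suffixes (see getDomainSuffix), whereas this sketch hard-codes a tiny, hypothetical suffix set and assumes an absolute url with a host. DomainSketch, PUBLIC_SUFFIXES, domainName and sameDomain are illustrative names only.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

public class DomainSketch {

    // Hypothetical, deliberately tiny stand-in for a public suffix table.
    static final Set<String> PUBLIC_SUFFIXES = Set.of("org", "com", "co.uk");

    /** Returns the longest known public suffix plus one more label. */
    static String domainName(String url) {
        String host = URI.create(url).getHost().toLowerCase(Locale.ROOT);
        String candidate = host;
        while (true) {
            int dot = candidate.indexOf('.');
            if (dot < 0) {
                return host; // no known suffix: fall back to the whole host
            }
            String rest = candidate.substring(dot + 1);
            if (PUBLIC_SUFFIXES.contains(rest)) {
                return candidate; // one label in front of the suffix
            }
            candidate = rest; // strip the leftmost label and retry
        }
    }

    /** Two urls share a domain when their domain names are equal. */
    static boolean sameDomain(String url1, String url2) {
        return domainName(url1).equals(domainName(url2));
    }

    public static void main(String[] args) {
        System.out.println(domainName("http://lucene.apache.org/"));
        System.out.println(sameDomain("http://lucene.apache.org",
                                      "http://people.apache.org/"));
    }
}
```

Because labels are stripped from the left, the first suffix that matches is also the longest one, which is why a host under co.uk keeps three labels in this sketch rather than two.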
public static DomainSuffix getDomainSuffix(java.net.URL url)
Returns the DomainSuffix corresponding to the last public part of the hostname.
public static DomainSuffix getDomainSuffix(java.lang.String url) throws java.net.MalformedURLException
Returns the DomainSuffix corresponding to the last public part of the hostname.
Throws:
java.net.MalformedURLException
public static java.lang.String[] getHostBatches(java.net.URL url)
Partitions the hostname of the url by ".".
public static java.lang.String[] getHostBatches(java.lang.String url) throws java.net.MalformedURLException
Partitions the hostname of the url by ".".
Throws:
java.net.MalformedURLException
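The method summary describes getHostBatches as partitioning the hostname by ".". A minimal sketch of that behaviour, using only the JDK, might look as follows. HostPartsSketch and hostParts are hypothetical names, and any special cases the real method may handle (IP addresses, for instance) are not covered here.

```java
import java.net.URI;

public class HostPartsSketch {

    /** Hypothetical helper: splits the host of an absolute url on ".". */
    static String[] hostParts(String url) {
        String host = URI.create(url).getHost();
        // The dot is escaped because split() takes a regular expression.
        return host.split("\\.");
    }

    public static void main(String[] args) {
        for (String part : hostParts("http://lucene.apache.org/index.html")) {
            System.out.println(part);
        }
    }
}
```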
public static java.lang.String chooseRepr(java.lang.String src, java.lang.String dst, boolean temp)
Given two urls, the source and the destination of a redirect, returns the representative url.
This method implements an extended version of the algorithm used by the Yahoo! Slurp crawler, described in "How does the Yahoo! webcrawler handle redirects?".
Parameters:
src - The source url.
dst - The destination url.
temp - Whether the redirect is temporary.
public static java.lang.String getHost(java.lang.String url)
Returns the lowercased hostname for the url, or null if the url is not well formed.
Parameters:
url - The url to check.
public static java.lang.String getPage(java.lang.String url)
Returns the page for the url.
Parameters:
url - The url to check.
public static java.lang.String toASCII(java.lang.String url)
public static java.lang.String toUNICODE(java.lang.String url)
public static void main(java.lang.String[] args)
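As an illustration of the getHost contract documented above (lowercased hostname, or null when the url is not well formed), the following sketch shows how such a method could be written with the JDK alone. HostSketch and host are hypothetical names, and the sketch makes no claim about how URLUtil itself does it.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Locale;

public class HostSketch {

    /** Hypothetical helper: lowercased host, or null for a malformed url. */
    static String host(String url) {
        try {
            return new URL(url).getHost().toLowerCase(Locale.ROOT);
        } catch (MalformedURLException e) {
            // Signal "not well formed" with null instead of an exception.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(host("HTTP://Lucene.Apache.ORG/index.html"));
        System.out.println(host("not a url"));
    }
}
```

Returning null rather than throwing lets callers filter large url lists without a try/catch around every call, at the cost of an explicit null check.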
Copyright © 2019 The Apache Software Foundation