public class JSParseFilter extends java.lang.Object implements ParseFilter, Parser
Fields inherited from interface ParseFilter: X_POINT_ID
Fields inherited from interface Parser: X_POINT_ID
Constructor and Description |
---|
JSParseFilter() |
Modifier and Type | Method and Description |
---|---|
Parse | filter(java.lang.String url, WebPage page, Parse parse, HTMLMetaTags metaTags, org.w3c.dom.DocumentFragment doc): Scan the JavaScript looking for possible Outlinks. |
Configuration | getConf(): Get the Configuration object. |
java.util.Collection<WebPage.Field> | getFields(): Gets all the fields for a given WebPage. Many datastores need to set up the MapReduce job by specifying the fields needed. |
Parse | getParse(java.lang.String url, WebPage page): Parse a JavaScript file and extract outlinks. |
static void | main(java.lang.String[] args): Main method which can be run from command line with the plugin option. |
void | setConf(Configuration conf): Set the Configuration object. |
public Parse filter(java.lang.String url, WebPage page, Parse parse, HTMLMetaTags metaTags, org.w3c.dom.DocumentFragment doc)
Scan the JavaScript looking for possible Outlinks.
Specified by: filter in interface ParseFilter
Parameters:
url - URL of the WebPage to be parsed
page - WebPage object relative to the URL
parse - Parse object holding parse status
metaTags - within the HTMLMetaTags
doc - The DocumentFragment object
Returns:
Parse object with additional outlinks from JavaScript
public Parse getParse(java.lang.String url, WebPage page)
Parse a JavaScript file and extract outlinks.
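
To make the idea of scanning JavaScript for possible outlinks concrete, here is a minimal, self-contained sketch. It is not the plugin's actual implementation: the class name `JsOutlinkSketch`, the method `extractLinks`, and both regular expressions are assumptions chosen for illustration. The sketch pulls quoted string literals out of a script, keeps the ones that look like URLs, and resolves them against a base URL.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: a heuristic link scanner for JavaScript text.
final class JsOutlinkSketch {

    // Single- or double-quoted string literals on one line (escape sequences are ignored; a simplification).
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"([^\"\\r\\n]*)\"|'([^'\\r\\n]*)'");

    // Assumed heuristic: absolute http(s) URLs, or paths ending in a few common page extensions.
    private static final Pattern URL_LIKE = Pattern.compile(
            "^(https?://\\S+|/?[\\w.-]+(/[\\w.-]+)*\\.(html?|php|jsp|js))$",
            Pattern.CASE_INSENSITIVE);

    // Returns the distinct URL-like literals found in the script, resolved against baseUrl,
    // in order of first appearance.
    static List<String> extractLinks(String script, String baseUrl) {
        URI base = URI.create(baseUrl);
        Set<String> found = new LinkedHashSet<>();
        Matcher m = STRING_LITERAL.matcher(script);
        while (m.find()) {
            String literal = m.group(1) != null ? m.group(1) : m.group(2);
            if (!URL_LIKE.matcher(literal).matches()) {
                continue; // ordinary string, not a link candidate
            }
            try {
                found.add(base.resolve(literal).toString());
            } catch (IllegalArgumentException e) {
                // The literal looked like a URL but is not a valid URI; skip it.
            }
        }
        return new ArrayList<>(found);
    }
}
```

A heuristic like this produces false positives and misses URLs assembled at runtime, so a real filter would typically treat its results as candidate outlinks rather than confirmed links.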
public static void main(java.lang.String[] args) throws java.lang.Exception
Main method which can be run from command line with the plugin option.
Parameters:
args
Throws:
java.lang.Exception
public void setConf(Configuration conf)
Set the Configuration object.
Specified by: setConf in interface Configurable
public Configuration getConf()
Get the Configuration object.
Specified by: getConf in interface Configurable
public java.util.Collection<WebPage.Field> getFields()
Gets all the fields for a given WebPage. Many datastores need to set up the MapReduce job by specifying the fields needed. All extensions that work on WebPage are able to specify what fields they need.
Specified by: getFields in interface FieldPluggable
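
The reason each extension declares its fields can be sketched as follows: a job can collect the declarations from every plugin and request only the union from the datastore. This is an illustration under stated assumptions, not Nutch code. The `Field` enum and the `FieldSource` interface below are hypothetical stand-ins for `WebPage.Field` and `FieldPluggable`, and the constant names are invented for the example.

```java
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: combine the fields declared by several plugins into one set.
final class FieldUnionSketch {

    // Stand-in for WebPage.Field; these constant names are assumptions for illustration.
    enum Field { BASE_URL, CONTENT, OUTLINKS, TITLE }

    // Stand-in for the FieldPluggable contract: one method that declares the needed fields.
    interface FieldSource {
        Collection<Field> getFields();
    }

    // Returns the union of all fields requested by the given plugins.
    static Set<Field> fieldsNeeded(List<FieldSource> plugins) {
        Set<Field> all = EnumSet.noneOf(Field.class);
        for (FieldSource plugin : plugins) {
            Collection<Field> fields = plugin.getFields();
            if (fields != null) {
                all.addAll(fields); // duplicates across plugins collapse in the set
            }
        }
        return all;
    }
}
```

Using a set keeps the request minimal: a field named by several plugins is still fetched only once.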
Copyright © 2019 The Apache Software Foundation