References
summary | ||
public |
F runPromise(runnable: function(): Promise<any>|Promise<any>, thener: function(...args: any), catcher: function(...args: any)): void Runs a promise using the supplied thener and catcher functions |
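The signature above accepts either a promise or a function returning one. A minimal sketch of that behavior follows, assuming that errors thrown by the thener are also routed to the catcher; the name `runPromiseSketch` is hypothetical and the library's actual implementation may differ.

```javascript
// Hypothetical sketch, not the library's source.
function runPromiseSketch (runnable, thener, catcher) {
  // Wrapping in `new Promise` also captures synchronous throws from `runnable()`.
  new Promise(resolve => {
    // Accept either a promise or a function that returns one.
    resolve(typeof runnable === 'function' ? runnable() : runnable)
  })
    .then(thener)
    .catch(catcher) // assumption: thener failures reach the catcher too
}

// Usage: nothing is returned, results arrive through the callbacks.
runPromiseSketch(
  async () => 'done',
  value => console.log('finished:', value),
  error => console.error('failed:', error)
)
```

Chaining `.catch` after `.then` rather than passing both callbacks to `.then` is a design choice in this sketch; it means a single catcher handles failures from both the runnable and the thener.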
config
summary | ||
public |
C Config Crawl config loader |
|
public |
T UserScript: function(page: Page): Promise<void> |
|
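The `UserScript` type listed in the config summary is a function that receives a page and returns a promise. A minimal sketch of a function with that shape follows, assuming `page` is a Puppeteer `Page` (the crawler section lists `Page` as an external). The name `userScriptSketch` is hypothetical, and how the config loader locates such a function is not covered here.

```javascript
// Hypothetical user script matching the documented shape:
//   function(page: Page): Promise<void>
async function userScriptSketch (page) {
  // `page.evaluate` is part of Puppeteer's Page API; that `page` is a
  // Puppeteer Page is an assumption of this sketch.
  await page.evaluate(() => {
    // Runs inside the browser page, not in Node.js.
    window.scrollTo(0, document.body.scrollHeight)
  })
}
```

Because the function is `async` and returns nothing, it resolves to `undefined`, which matches the `Promise<void>` return type.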
crawler
summary | ||
public |
Crawler based on cyrus-and/chrome-remote-interface |
|
public |
C Helper A helper class providing utility methods |
|
public |
Monitor navigation and request events for crawling a page. |
|
public |
Monitors the HTTP requests made by a page and emits the 'network-idle' event once the network is determined to be idle. Used by PuppeteerCrawler. |
|
public |
Crawler based on puppeteer |
|
public |
E Page |
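The request monitor described in the crawler summary emits 'network-idle' once the network is considered idle. One way such a monitor could work is sketched below, assuming "idle" means no more than a threshold of in-flight requests for a quiet period. The class name, the method names, and the `idleTime` and `maxInflight` options are all hypothetical and are not taken from the library.

```javascript
const EventEmitter = require('events')

// Hypothetical sketch of a network-idle monitor, not the library's class.
class NetworkIdleSketch extends EventEmitter {
  constructor ({ idleTime = 500, maxInflight = 0 } = {}) {
    super()
    this.idleTime = idleTime // ms of quiet required before emitting
    this.maxInflight = maxInflight // in-flight requests tolerated as "idle"
    this.inflight = new Set()
    this.timer = null
  }

  requestStarted (id) {
    this.inflight.add(id)
    if (this.inflight.size > this.maxInflight) {
      // Activity resumed: cancel any pending idle notification.
      clearTimeout(this.timer)
      this.timer = null
    }
  }

  requestFinished (id) {
    this.inflight.delete(id)
    if (this.inflight.size <= this.maxInflight && this.timer === null) {
      // Start the quiet-period countdown.
      this.timer = setTimeout(() => {
        this.timer = null
        this.emit('network-idle')
      }, this.idleTime)
    }
  }
}
```

A crawler would call `requestStarted` and `requestFinished` from its request event handlers and wait on `monitor.once('network-idle', ...)` before collecting the page.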
frontier
injectManager
summary | ||
public |
Manages the JavaScript that is injected into the page |
|
public |
T OnLoadInject: {scriptSource: string} |
|
public |
T OnNewDocumentInject: {source: string} |
injectManager/pageInjects
summary | ||
public |
Starts the collection of the outlinks. |
|
public |
F initCollectLinks(): void Function that is injected into every frame of the page currently being crawled and sets up outlink collection depending on whether the frame it is injected into is the top frame or a subframe. |
|
public |
Builds the WARC outlink metadata information and finds potential links to go to next from a page and build |
|
public |
F noNaughtyJS() Function that disables the setting of window event handlers onbeforeunload and onunload and disables the usage of window.alert, window.confirm, and window.prompt. |
|
public |
F scrollOnLoad() Function that is injected into every frame of the page being crawled that starts scrolling the page once the |
|
public |
F async scrollPage(): Promise<void> Function that scrolls the page/frame it is injected into a maximum of 20 times or until no more scrolling can be done |
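The bounded scrolling described for `scrollPage` (at most 20 scrolls, stopping early when the page no longer moves) can be sketched as follows. This is an illustration only: the name `scrollPageSketch`, the `win` parameter, the `maxScrolls` and `pauseMs` options, and the returned count are assumptions added to keep the sketch testable outside a browser; the documented function takes no arguments and resolves to nothing.

```javascript
// Hypothetical sketch, not the injected source used by the library.
async function scrollPageSketch (win, { maxScrolls = 20, pauseMs = 100 } = {}) {
  let scrolls = 0
  while (scrolls < maxScrolls) {
    const before = win.scrollY
    // Scroll down by one viewport height.
    win.scrollBy(0, win.innerHeight)
    // Give lazily loaded content a moment to arrive.
    await new Promise(resolve => setTimeout(resolve, pauseMs))
    // Stop early if the scroll position did not change.
    if (win.scrollY === before) break
    scrolls++
  }
  return scrolls // number of scrolls that actually moved the page
}
```

In a browser frame this would be called as `scrollPageSketch(window)`; `scrollY`, `innerHeight`, and `scrollBy` are standard `Window` members.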
launcher
summary | ||
public |
Utility class for launching or connecting to a Chrome/Chromium instance |
|
public |
Utility class that provides functionality for finding a suitable Chrome executable |
|
public |
F async launch(options: ChromeOptions): Promise<!Puppeteer.Browser> Launches and connects to, or just connects to, a Chrome/Chromium instance |
|
public |
E CRI |
runners
summary | ||
public |
F async chromeRunner(conf: CrawlConfig): Promise<void, Error> Launches a crawl using the supplied configuration file path |
|
public |
F async puppeteerRunner(conf: CrawlConfig): Promise<void, Error> Launches a crawl using the supplied configuration file path |
utils
summary | ||
public |
Utility class for displaying colored text in the console |
|
public |
Class that initializes the WARC naming function used when generating the WARCs |
|
public |
F isEmptyPlainObject(object: Object): boolean Test to see if a |
|
public |
Promise wrapper around setTimeout |
|
public |
F makeRunnable(runnable: function(...args: any): Promise): function(...args: any): void Composes the supplied function with runPromise. |
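The composition described for `makeRunnable` can be sketched as a wrapper that forwards its arguments to the supplied function and hands the resulting promise to a `runPromise`-style helper. Everything here is an assumption for illustration: the `Sketch` names are hypothetical, the local `runPromiseSketch` is a stand-in rather than the library's `runPromise`, and the optional `thener` and `catcher` parameters do not appear in the documented signature.

```javascript
// Stand-in for runPromise (hypothetical, see the signature in the top-level summary).
function runPromiseSketch (runnable, thener, catcher) {
  new Promise(resolve => resolve(runnable())).then(thener).catch(catcher)
}

// Hypothetical sketch: turns a promise-returning function into a
// fire-and-forget function that returns nothing.
function makeRunnableSketch (
  runnable,
  thener = () => {}, // assumption: default no-op success handler
  catcher = error => { console.error(error) } // assumption: log failures
) {
  return (...args) => runPromiseSketch(() => runnable(...args), thener, catcher)
}

// Usage: the wrapped function can be called like a plain synchronous entry point.
const add = makeRunnableSketch(
  async (a, b) => a + b,
  sum => console.log('sum:', sum)
)
add(2, 3)
```

Returning a void function suits entry points such as CLI commands, where the caller has no use for the promise and failures should be handled in one place.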