import AutoWARCParser from 'node-warc/lib/parsers/autoWARCParser.js'
public class | source

AutoWARCParser

Extends:

EventEmitter → AutoWARCParser

Parses a WARC file, automatically detecting whether it is gzipped.

Example:

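 // Example 1: supply the path via the constructor, then call start()
 const parser = new AutoWARCParser('<path-to-warcfile>')
 parser.on('record', record => { console.log(record) })
 parser.on('error', error => { console.error(error) })
 parser.start()

 // Example 2: construct without a path and pass it to parseWARC()
 const parser2 = new AutoWARCParser()
 parser2.on('record', record => { console.log(record) })
 parser2.on('error', error => { console.error(error) })
 parser2.parseWARC('<path-to-warcfile>')

 // Example 3: async iteration, requires node >= 10
 // (for await must run inside an async function)
 async function printRecords () {
   for await (const record of new AutoWARCParser('<path-to-warcfile>')) {
     console.log(record)
   }
 }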

Constructor Summary

Public Constructor
public

constructor(wp: string)

Create a new AutoWARCParser

Member Summary

Public Members
public

[Symbol.asyncIterator]: AsyncIterator<WARCRecord>

Private Members
private

_parsing: boolean

private

_wp: string

Method Summary

Public Methods
public

parseWARC(wp: string): boolean

Alias for start, except that you can supply the path to the WARC file to be parsed, either because none was supplied via the constructor or to parse another WARC file.

public

start(): boolean

Begin parsing the WARC file.

Private Methods
private

_getStream(): ReadStream | Gunzip

Returns a ReadStream for the WARC to be parsed.

private

_onEnd()

Listener for a parser's done event

private

_onError(error: Error)

Listener for a parser's error event

private

_onRecord(record: WARCRecord)

Listener for a parser's record event

Public Constructors

public constructor(wp: string) source

Create a new AutoWARCParser

Params:

Name | Type | Attribute | Description
wp | string | optional, nullable: true | path to the WARC file to be parsed

Public Members

public [Symbol.asyncIterator]: AsyncIterator<WARCRecord>: * source
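
Because the parser is async-iterable, any generic async-iteration helper should work with it. The following is a minimal sketch: `takeRecords` is a hypothetical helper, not part of node-warc, and it accepts any async iterable. Whether leaving the loop early also closes the underlying file stream is not stated in this reference, so treat that as an open question.

```javascript
// Hypothetical helper (not part of node-warc): collects at most `limit`
// items from any async iterable, then stops iterating.
async function takeRecords (records, limit) {
  const taken = []
  if (limit <= 0) return taken
  for await (const record of records) {
    taken.push(record)
    // Stop as soon as enough records have been collected.
    if (taken.length >= limit) break
  }
  return taken
}
```

Under these assumptions, `await takeRecords(new AutoWARCParser('<path-to-warcfile>'), 10)` would resolve to an array holding up to ten records.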

Private Members

private _parsing: boolean source

private _wp: string source

Public Methods

public parseWARC(wp: string): boolean source

Alias for start, except that you can supply the path to the WARC file to be parsed, either because none was supplied via the constructor or to parse another WARC file. If a path was supplied via the constructor and you supply a different path to this method, the path supplied here overrides the one supplied via the constructor.

Params:

Name | Type | Attribute | Description
wp | string | optional, nullable: true | path to the WARC file to be parsed

Return:

boolean

indicates whether the parser has begun parsing or is already parsing a WARC file

Throw:

Error

if the path to the WARC file is null or undefined or another error occurred
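
Since parseWARC can be called again to parse another WARC file, one parser instance can work through a list of files. The sketch below assumes only the contract documented on this page (parseWARC(path) plus the record, done and error events). `parseSequentially` is a hypothetical helper, not part of node-warc.

```javascript
// Hypothetical helper (not part of node-warc): parses each path in turn
// with a single parser instance. `parser` is assumed to expose
// parseWARC(path) and to emit 'record', 'done' and 'error'.
function parseSequentially (parser, paths, onRecord) {
  return new Promise((resolve, reject) => {
    const queue = paths.slice()
    const cleanup = () => {
      parser.removeListener('record', onRecord)
      parser.removeListener('done', next)
      parser.removeListener('error', fail)
    }
    function fail (error) {
      cleanup()
      reject(error)
    }
    function next () {
      if (queue.length === 0) {
        cleanup()
        resolve()
        return
      }
      try {
        // parseWARC is documented to throw when the path is null or undefined.
        parser.parseWARC(queue.shift())
      } catch (error) {
        fail(error)
      }
    }
    parser.on('record', onRecord)
    parser.on('done', next)
    parser.on('error', fail)
    next()
  })
}
```

The next file is requested only after done fires, so parseWARC is never called while a parse is still in progress.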

public start(): boolean source

Begin parsing the WARC file. Once the start method has been called, the parser will begin emitting the record, done, and error events.

Return:

boolean

indicates whether the parser has begun parsing or is already parsing a WARC file

  • true: indicates the parser has begun parsing the WARC file
  • false: indicates the parser is currently parsing a WARC file

Emit:

record

emitted when the parser has parsed a full record; the argument supplied to the listener will be the parsed record

done

emitted when the WARC file has been completely parsed

error

emitted if an exception occurs; the argument supplied to the listener will be the error that occurred

Throw:

Error

if the path to the WARC file is null or undefined or another error occurred
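
The event contract above can be wrapped in a Promise for callers that prefer async/await. This is a sketch only: `collectRecords` is a hypothetical helper, and `StubParser` is a stand-in that imitates the documented behavior (start returns true the first time and false while parsing) so the example runs without node-warc. Neither is part of the library.

```javascript
const EventEmitter = require('events')

// Stand-in for illustration only, NOT the node-warc implementation.
// It replays a fixed list of records and follows the documented contract.
class StubParser extends EventEmitter {
  constructor (records) {
    super()
    this._records = records
    this._parsing = false
  }

  start () {
    if (this._parsing) return false // already parsing
    this._parsing = true
    setImmediate(() => {
      for (const record of this._records) this.emit('record', record)
      this._parsing = false
      this.emit('done')
    })
    return true // parsing has begun
  }
}

// Hypothetical helper: resolves with every record once 'done' fires,
// rejects on the first 'error'.
function collectRecords (parser) {
  return new Promise((resolve, reject) => {
    const records = []
    parser.on('record', record => records.push(record))
    parser.once('done', () => resolve(records))
    parser.once('error', reject)
    parser.start()
  })
}
```

Listeners are attached before start is called so that no early event is missed. Collecting every record in memory suits small WARC files only; for large files, handle each record inside the record listener instead.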

Private Methods

private _getStream(): ReadStream | Gunzip source

Returns a ReadStream for the WARC to be parsed. If the WARC file is gzipped, the returned value will be the result of ReadStream.pipe(zlib.createGunzip())

Return:

ReadStream | Gunzip
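
This reference does not say how the gzip check is made. One common approach, shown here purely as an assumption, is to look for the gzip magic bytes (0x1f 0x8b) at the start of the file. `looksGzipped` and `openWarcStream` are hypothetical names, not node-warc functions.

```javascript
const fs = require('fs')
const zlib = require('zlib')

// Hypothetical helper: true when the file starts with the gzip
// magic bytes 0x1f 0x8b. This is an assumed detection strategy.
function looksGzipped (filePath) {
  const fd = fs.openSync(filePath, 'r')
  try {
    const header = Buffer.alloc(2)
    const bytesRead = fs.readSync(fd, header, 0, 2, 0)
    return bytesRead === 2 && header[0] === 0x1f && header[1] === 0x8b
  } finally {
    fs.closeSync(fd)
  }
}

// Hypothetical counterpart to _getStream: returns the raw ReadStream,
// or the stream piped through gunzip when the file looks gzipped.
function openWarcStream (filePath) {
  const raw = fs.createReadStream(filePath)
  return looksGzipped(filePath) ? raw.pipe(zlib.createGunzip()) : raw
}
```

Checking the file contents rather than the extension means a `.warc` file that is actually compressed would still be handled under this approach.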

private _onEnd() source

Listener for a parser's done event

private _onError(error: Error) source

Listener for a parser's error event

Params:

Name | Type | Attribute | Description
error | Error | | the error that occurred

private _onRecord(record: WARCRecord) source

Listener for a parser's record event

Params:

Name | Type | Attribute | Description
record | WARCRecord | | the parsed record