import WARCWriterBase from 'node-warc/lib/writers/warcWriterBase.js'
public class | source

WARCWriterBase

Extends:

EventEmitter → WARCWriterBase

Base class used for writing to the WARC

Constructor Summary

Public Constructor
public

constructor(defaultOpts: WARCFileOpts)

Create a new WARCWriter

Member Summary

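Public Members
public

defaultOpts: WARCFileOpts

public

opts: WARCFileOpts

Private Members
private

_fileName: string

private

_lastError: Error

private

_now: string

private

_version: string

private

_warcInfoId: string

private

_warcOutStream: WriteStream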

Method Summary

Public Methods
public

end()

Close the underlying filestream to the WARC currently being written.

public

initWARC(warcPath: string, options: WARCFileOpts)

Initialize the writer.

public

setDefaultOpts(defaultOpts: WARCFileOpts)

Set the default WARC creation options

public

writeRecordBlock(recordBuffer: Buffer): Promise<void>

Write a record block to the WARC

public

async writeRecordChunks(recordParts: Buffer[]): Promise<void>

Write an arbitrary number of items to the WARC

public

writeRequestRecord(targetURI: string, httpHeaderString: string, requestData: string | Buffer): Promise<void>

Write A Request Record

public

async writeRequestResponseRecords(targetURI: string, reqData: {headers: string, data?: Buffer|string}, resData: {headers: string, data?: Buffer|string}): Promise<void>

public

writeResponseRecord(targetURI: string, httpHeaderString: string, responseData: string | Buffer): Promise<void>

Write A Response Record

public

writeWarcInfoRecord(winfo: Object | Buffer | string): Promise<void>

Write out the WARC-Type: info records.

public

writeWarcMetadata(targetURI: string, metaData: string | Buffer): Promise<void>

Write WARC-Type: metadata record

public

writeWarcMetadataOutlinks(targetURI: string, outlinks: string): Promise<void>

Write WARC-Type: metadata for outlinks

public

writeWarcRawInfoRecord(warcInfoContent: string | Buffer): Promise<void>

Write warc-info record

public

writeWebrecorderBookmarksInfoRecord(pages: string | Array<string>): Promise<void>

Writes a WARC Info record containing Webrecorder/Webrecorder Player bookmark (page list)

Private Methods
private

_onError(err: Error)

Emits an error if one occurs

private

_onFinish()

Called when the WARC generation is finished

private

_writeRequestRecord(targetURI: string, resId: string, httpHeaderString: string, requestData: string | Buffer): Promise<void>

Write A Request Record

private

_writeResponseRecord(targetURI: string, resId: string, httpHeaderString: string, responseData: string | Buffer): Promise<void>

Write A Response Record

Public Constructors

public constructor(defaultOpts: WARCFileOpts) source

Create a new WARCWriter

Params:

Name         Type          Attribute                 Description
defaultOpts  WARCFileOpts  optional, nullable: true  Optional default WARC file options

Public Members

public defaultOpts: WARCFileOpts source

public opts: WARCFileOpts source

Private Members

private _fileName: string source

private _lastError: Error source

private _now: string source

private _version: string source

private _warcInfoId: string source

private _warcOutStream: WriteStream source

Public Methods

public end() source

Close the underlying filestream to the WARC currently being written. The finished event is not emitted until this method has been called.
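The relationship between end() and the finished event can be pictured with a small stand-in. This is a minimal sketch, not node-warc's implementation: `SketchWriter`, its `write` method and the in-memory sink are hypothetical, and only the event names `finished` and `error` come from this page.

```javascript
const { EventEmitter } = require('events')
const { Writable } = require('stream')

// Stand-in class, NOT node-warc's WARCWriterBase: it only mirrors the
// documented event behavior on top of an arbitrary writable stream.
class SketchWriter extends EventEmitter {
  constructor (outStream) {
    super()
    this._out = outStream
    this._out.on('error', err => this.emit('error', err))
    // the stream's 'finish' fires only after end() was called and data is flushed
    this._out.on('finish', () => this.emit('finished'))
  }

  // Resolve once the stream has accepted the buffer, reject on failure
  write (buffer) {
    return new Promise((resolve, reject) => {
      this._out.write(buffer, err => (err ? reject(err) : resolve()))
    })
  }

  end () {
    this._out.end()
  }
}

// In-memory sink so the sketch runs without touching the file system
const sink = new Writable({ write (chunk, encoding, callback) { callback() } })
const writer = new SketchWriter(sink)
writer.on('finished', () => console.log('WARC finished'))
writer.on('error', err => console.error('write failed', err))
writer.write(Buffer.from('record bytes')).then(() => writer.end())
```

With a real writer the same listener pattern would apply: register the finished and error handlers first, await the write calls, then call end().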

public initWARC(warcPath: string, options: WARCFileOpts) source

Initialize the writer. The options object is optional; when it is omitted, appending defaults to false and gzip defaults to process.env.NODEWARC_WRITE_GZIPPED != null, so gzipped output can also be enabled by setting the NODEWARC_WRITE_GZIPPED environment variable. Options supplied to this method override the default options.

Params:

Name      Type          Attribute                 Description
warcPath  string                                  The path for the WARC file to be written
options   WARCFileOpts  optional, nullable: true  Write options controlling how the WARC should be written
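The precedence described for initWARC can be sketched as a small pure function. This is an illustration under stated assumptions, not the library's code: `resolveWarcOpts` is a hypothetical helper, and only the `appending` and `gzip` fields and the `NODEWARC_WRITE_GZIPPED` variable are taken from the description above.

```javascript
// Hypothetical helper illustrating the documented precedence:
// built-in defaults < defaultOpts < options passed to initWARC.
function resolveWarcOpts (defaultOpts, options, env = process.env) {
  const builtIn = {
    appending: false,
    // gzip is on whenever the variable is set, whatever its value
    gzip: env.NODEWARC_WRITE_GZIPPED != null
  }
  // Object.assign skips null and undefined sources, so both
  // defaultOpts and options may be omitted
  return Object.assign({}, builtIn, defaultOpts, options)
}

console.log(resolveWarcOpts(null, null, {}))
console.log(resolveWarcOpts({ gzip: true }, { appending: true }, {}))
```

Passing the environment as a parameter keeps the sketch testable; an explicit `gzip: false` in either options object wins over the environment variable in this reading of the rules.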

public setDefaultOpts(defaultOpts: WARCFileOpts) source

Set the default WARC creation options

Params:

Name         Type          Attribute  Description
defaultOpts  WARCFileOpts             The new default options

public writeRecordBlock(recordBuffer: Buffer): Promise<void> source

Write a record block to the WARC

Params:

Name          Type    Attribute  Description
recordBuffer  Buffer

Return:

Promise<void>

public async writeRecordChunks(recordParts: Buffer[]): Promise<void> source

Write an arbitrary number of items to the WARC

Params:

Name         Type      Attribute  Description
recordParts  Buffer[]             Array of buffers to be written

Return:

Promise<void>
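One way to picture writing several buffers as a single record is to join them first and hand the result to the output stream. The sketch below is an assumption about the general technique, not node-warc's code: `joinRecordParts` and `writeToStream` are hypothetical helpers, and compressing each record as its own gzip member is the usual convention for gzipped WARC files rather than something this page states.

```javascript
const zlib = require('zlib')

// Hypothetical helper: join the parts of one record into a single buffer,
// compressing the record as its own gzip member when requested.
function joinRecordParts (recordParts, gzip = false) {
  const record = Buffer.concat(recordParts)
  return gzip ? zlib.gzipSync(record) : record
}

// Hypothetical helper: write a buffer to a writable stream and resolve
// once the write callback fires, rejecting if the write failed.
function writeToStream (stream, buffer) {
  return new Promise((resolve, reject) => {
    stream.write(buffer, err => (err ? reject(err) : resolve()))
  })
}
```

A caller would then await `writeToStream(out, joinRecordParts(parts, opts.gzip))`, which gives the same Promise<void> shape as the method documented above.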

public writeRequestRecord(targetURI: string, httpHeaderString: string, requestData: string | Buffer): Promise<void> source

Write A Request Record

Params:

Name              Type             Attribute  Description
targetURI         string                      The URL of the request
httpHeaderString  string                      Stringified HTTP headers
requestData       string | Buffer  optional   Body of the request if any

Return:

Promise<void>

public async writeRequestResponseRecords(targetURI: string, reqData: {headers: string, data?: Buffer|string}, resData: {headers: string, data?: Buffer|string}): Promise<void> source

Params:

Name       Type                                     Attribute  Description
targetURI  string                                              The target URI for the request response record pairs
reqData    {headers: string, data?: Buffer|string}             The request data
resData    {headers: string, data?: Buffer|string}             The response data

Return:

Promise<void>
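To make the request/response pairing concrete, here is a minimal sketch that serializes both records by hand using the WARC/1.0 header fields. It is not node-warc's implementation: `toBlock`, `buildRecord` and `buildRequestResponsePair` are hypothetical helpers, the record order is an assumption, and the only detail taken from this page is that the request record is concurrent to its response.

```javascript
const crypto = require('crypto')

const CRLF = '\r\n'

// Hypothetical helper: turn {headers, data?} into the record block bytes.
function toBlock ({ headers, data }) {
  const parts = [Buffer.from(headers)]
  if (data != null) parts.push(Buffer.from(data))
  return Buffer.concat(parts)
}

// Hypothetical helper: serialize one WARC record (header lines, blank line,
// block, then the two CRLFs that terminate every record).
function buildRecord (type, targetURI, recordId, block, concurrentTo) {
  const lines = [
    'WARC/1.0',
    `WARC-Type: ${type}`,
    `WARC-Record-ID: <urn:uuid:${recordId}>`,
    // WARC-Date without fractional seconds
    `WARC-Date: ${new Date().toISOString().replace(/\.\d{3}Z$/, 'Z')}`,
    `WARC-Target-URI: ${targetURI}`
  ]
  if (concurrentTo) lines.push(`WARC-Concurrent-To: <urn:uuid:${concurrentTo}>`)
  lines.push(`Content-Type: application/http; msgtype=${type}`)
  lines.push(`Content-Length: ${block.length}`)
  const header = Buffer.from(lines.join(CRLF) + CRLF + CRLF)
  return Buffer.concat([header, block, Buffer.from(CRLF + CRLF)])
}

// Hypothetical helper: build the request/response pair for one URI.
// The request points at the response via WARC-Concurrent-To.
function buildRequestResponsePair (targetURI, reqData, resData) {
  const resId = crypto.randomUUID() // requires Node.js 14.17 or later
  const reqId = crypto.randomUUID()
  return [
    buildRecord('request', targetURI, reqId, toBlock(reqData), resId),
    buildRecord('response', targetURI, resId, toBlock(resData))
  ]
}
```

The `headers` strings are expected to already end with the blank line that separates HTTP headers from the body, matching the "stringified HTTP headers" wording used elsewhere on this page.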

public writeResponseRecord(targetURI: string, httpHeaderString: string, responseData: string | Buffer): Promise<void> source

Write A Response Record

Params:

Name              Type             Attribute  Description
targetURI         string                      The URL of the response
httpHeaderString  string                      Stringified HTTP headers
responseData      string | Buffer  optional   The response body if it exists

Return:

Promise<void>

public writeWarcInfoRecord(winfo: Object | Buffer | string): Promise<void> source

Write out the WARC-Type: info records. If the content for the info record is an object, its properties are written as property name and property value pairs; otherwise (a Buffer or string) the content is written as is.

Params:

Name   Type                      Attribute  Description
winfo  Object | Buffer | string             The contents for the WARC info record

Return:

Promise<void>
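The two content paths described for writeWarcInfoRecord can be sketched as follows. This is a hedged illustration, not the library's code: `warcInfoContent` is a hypothetical helper, and the `name: value` plus CRLF line layout is an assumption borrowed from the application/warc-fields convention.

```javascript
// Hypothetical helper mirroring the documented behavior: objects become
// "name: value" lines, Buffers and strings pass through unchanged.
function warcInfoContent (winfo) {
  if (Buffer.isBuffer(winfo)) return winfo
  if (typeof winfo === 'string') return Buffer.from(winfo)
  const lines = Object.entries(winfo).map(([name, value]) => `${name}: ${value}\r\n`)
  return Buffer.from(lines.join(''))
}

console.log(warcInfoContent({ software: 'example/1.0', isPartOf: 'demo' }).toString())
```

Returning a Buffer in every case keeps the Content-Length calculation for the record a simple `length` lookup.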

public writeWarcMetadata(targetURI: string, metaData: string | Buffer): Promise<void> source

Write WARC-Type: metadata record

Params:

Name       Type             Attribute  Description
targetURI  string                      The URL of the page this metadata record is for
metaData   string | Buffer             A string or buffer containing metadata information to be used as this record's content

Return:

Promise<void>

public writeWarcMetadataOutlinks(targetURI: string, outlinks: string): Promise<void> source

Write WARC-Type: metadata for outlinks

Params:

Name       Type    Attribute  Description
targetURI  string             The target URI for the metadata record
outlinks   string             A string containing outlink metadata

Return:

Promise<void>

public writeWarcRawInfoRecord(warcInfoContent: string | Buffer): Promise<void> source

Write warc-info record

Params:

Name             Type             Attribute  Description
warcInfoContent  string | Buffer             The contents of the warc-info record

Return:

Promise<void>

public writeWebrecorderBookmarksInfoRecord(pages: string | Array<string>): Promise<void> source

Writes a WARC Info record containing Webrecorder/Webrecorder Player bookmark (page list)

Params:

Name   Type                    Attribute  Description
pages  string | Array<string>             The URL of the page this WARC contains or an Array of URLs for the pages this WARC contains

Return:

Promise<void>

Private Methods

private _onError(err: Error) source

Emits an error if one occurs

Params:

Name  Type   Attribute  Description
err   Error

Emit:

error

The error that occurred

private _onFinish() source

Called when the WARC generation is finished

Emit:

finished

emitted when WARC generation is complete

private _writeRequestRecord(targetURI: string, resId: string, httpHeaderString: string, requestData: string | Buffer): Promise<void> source

Write A Request Record

Params:

Name              Type             Attribute       Description
targetURI         string                           The URL of the request
resId             string           nullable: true  The id of the record this request record is concurrent to, typically its response
httpHeaderString  string                           Stringified HTTP headers
requestData       string | Buffer  optional        Body of the request if any

Return:

Promise<void>

private _writeResponseRecord(targetURI: string, resId: string, httpHeaderString: string, responseData: string | Buffer): Promise<void> source

Write A Response Record

Params:

Name              Type             Attribute  Description
targetURI         string                      The URL of the response
resId             string                      The id to be used for the response record
httpHeaderString  string                      Stringified HTTP headers
responseData      string | Buffer  optional   The response body if it exists

Return:

Promise<void>