
crawlHTML

CrawlHTMLDetailTargetConfig

ts
export interface CrawlHTMLDetailTargetConfig extends CrawlCommonConfig {
  url: string
  headers?: Object | null
  priority?: number
  fingerprint?: DetailTargetFingerprintCommon | null
}
Parameter    Type                                  Default  Description
url          string                                -        Target URL
headers      Object | null                         -        Request headers
priority     number                                -        Priority
fingerprint  DetailTargetFingerprintCommon | null  -        Device fingerprint
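
As a rough illustration of the shape described above, the sketch below builds two target objects. `DetailTarget` is a hypothetical local stand-in for CrawlHTMLDetailTargetConfig: it assumes the fields inherited from CrawlCommonConfig and the fingerprint option can be left out, and it narrows `Object` to a string map for the example. The URLs and header values are placeholders.

```typescript
// Local stand-in for CrawlHTMLDetailTargetConfig (assumption: inherited
// CrawlCommonConfig fields and the fingerprint option are left out).
interface DetailTarget {
  url: string
  headers?: Record<string, string> | null
  priority?: number
}

// Hypothetical target: the URL and header values are placeholders.
const detailTarget: DetailTarget = {
  url: 'https://example.com/articles/1',
  headers: { 'Accept-Language': 'en-US' },
  priority: 2
}

// Only url is required, so a minimal target is also valid.
const minimalTarget: DetailTarget = { url: 'https://example.com/' }

console.log(detailTarget, minimalTarget)
```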

CrawlHTMLAdvancedConfig

ts
export interface CrawlHTMLAdvancedConfig extends CrawlCommonConfig {
  targets: (string | CrawlHTMLDetailTargetConfig)[]
  intervalTime?: IntervalTime
  fingerprints?: DetailTargetFingerprintCommon[]

  headers?: Object

  onCrawlItemComplete?: (crawlDataSingleResult: CrawlHTMLSingleResult) => void
}
Parameter            Type                                                    Default  Description
targets              (string | CrawlHTMLDetailTargetConfig)[]                -        Crawl targets
intervalTime         IntervalTime                                            -        Interval time
fingerprints         DetailTargetFingerprintCommon[]                         -        Device fingerprints
headers              Object                                                  -        Request headers
onCrawlItemComplete  (crawlDataSingleResult: CrawlHTMLSingleResult) => void  -        Lifecycle hook
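
Because targets accepts both plain URL strings and detail objects, code that inspects a config often needs to bring the two forms into one shape first. The following is a minimal sketch of that idea, not the library's own behavior: `AdvancedConfig`, `DetailTarget`, `SingleResult`, and the helper `toDetailTargets` are hypothetical local stand-ins, and intervalTime and fingerprints are omitted because their types are not shown on this page.

```typescript
// Simplified local stand-ins for the documented interfaces (assumptions).
interface DetailTarget {
  url: string
  headers?: Record<string, string> | null
  priority?: number
}

interface SingleResult {
  data: { statusCode: number | undefined; html: string } | null
}

interface AdvancedConfig {
  targets: (string | DetailTarget)[]
  headers?: Record<string, string>
  onCrawlItemComplete?: (result: SingleResult) => void
}

// Hypothetical helper: turn every plain string target into a detail object.
function toDetailTargets(config: AdvancedConfig): DetailTarget[] {
  return config.targets.map((target) =>
    typeof target === 'string' ? { url: target } : target
  )
}

// Results are collected here each time the callback is invoked.
const completed: SingleResult[] = []

const config: AdvancedConfig = {
  targets: [
    'https://example.com/',
    { url: 'https://example.com/docs', priority: 2 }
  ],
  headers: { 'Accept-Language': 'en-US' },
  onCrawlItemComplete: (result) => {
    completed.push(result)
  }
}

const normalized = toDetailTargets(config)
console.log(normalized.map((target) => target.url))
```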

CrawlHTMLSingleResult

ts
export interface CrawlHTMLSingleResult extends CrawlCommonResult {
  data: {
    statusCode: number | undefined
    headers: IncomingHttpHeaders // IncomingHttpHeaders comes from node:http
    html: string
  } | null
}
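
Since data can be null and statusCode can be undefined, consumers need to check both before reading html. The sketch below shows one way to do that under stated assumptions: `SingleResult` is a hypothetical local stand-in (the CrawlCommonResult fields are omitted and headers is loosely typed instead of importing IncomingHttpHeaders), and `extractTitle` is an illustrative helper that is not part of the documented API.

```typescript
// Local stand-in for CrawlHTMLSingleResult (assumption: CrawlCommonResult
// fields are omitted and headers is typed loosely for the example).
interface SingleResult {
  data: {
    statusCode: number | undefined
    headers: Record<string, string | string[] | undefined>
    html: string
  } | null
}

// Hypothetical helper: return the page title, or null when there is no
// data, the status is not 200, or the HTML has no <title> element.
function extractTitle(result: SingleResult): string | null {
  if (result.data === null) return null
  if (result.data.statusCode !== 200) return null
  const match = /<title[^>]*>([^<]*)<\/title>/i.exec(result.data.html)
  return match ? match[1].trim() : null
}

const ok: SingleResult = {
  data: {
    statusCode: 200,
    headers: { 'content-type': 'text/html' },
    html: '<html><head><title> Example Page </title></head><body></body></html>'
  }
}
const failed: SingleResult = { data: null }

console.log(extractTitle(ok), extractTitle(failed))
```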

External types

  • IncomingHttpHeaders: from the Node.js http module

Released under the MIT License