
crawlPage

CrawlPageDetailTargetConfig

ts
export interface CrawlPageDetailTargetConfig extends CrawlCommonConfig {
  url: string
  headers?: Object | null
  cookies?: PageCookies | null
  priority?: number
  viewport?: Viewport | null // Viewport comes from puppeteer
  fingerprint?:
    | (DetailTargetFingerprintCommon & {
        maxWidth?: number
        minWidth?: number
        maxHeight?: number
        minHidth?: number
      })
    | null
}
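| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | - | URL |
| headers | Object \| null | - | Request headers |
| cookies | PageCookies \| null | - | Cookies |
| priority | number | - | Priority |
| viewport | Viewport \| null | - | Sets the viewport size |
| fingerprint | (DetailTargetFingerprintCommon & { maxWidth?: number; minWidth?: number; maxHeight?: number; minHidth?: number }) \| null | - | Device fingerprint |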

External types

  • Viewport: comes from puppeteer; viewport is passed directly to page.setViewport to set the page size
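
As a rough illustration of the shape described above, here is a minimal sketch of a single detail target. The `Viewport`, `FingerprintSize`, and `DetailTarget` types are simplified local stand-ins (assumptions made so the snippet compiles on its own), not the library's exported types; `cookies` and the `CrawlCommonConfig` fields are left out, and the URL and values are placeholders.

```typescript
// Simplified local stand-ins. In a real project, Viewport comes from puppeteer
// and CrawlPageDetailTargetConfig from the crawler library.
interface Viewport {
  width: number
  height: number
}

// Only the size-range fields shown in the interface above (identifiers copied as-is).
type FingerprintSize = {
  maxWidth?: number
  minWidth?: number
  maxHeight?: number
  minHidth?: number
}

// Hypothetical, trimmed-down mirror of CrawlPageDetailTargetConfig.
interface DetailTarget {
  url: string
  headers?: Record<string, string> | null
  priority?: number
  viewport?: Viewport | null
  fingerprint?: FingerprintSize | null
}

// One target that overrides headers, priority, viewport and fingerprint.
const detailTarget: DetailTarget = {
  url: 'https://www.example.com/page/1',
  headers: { 'Accept-Language': 'en-US' },
  priority: 2,
  viewport: { width: 1920, height: 1080 },
  fingerprint: { maxWidth: 1980, minWidth: 1200 }
}
```

Only `url` is required; every other field is optional and may be omitted.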

CrawlPageAdvancedConfig

ts
export interface CrawlPageAdvancedConfig extends CrawlCommonConfig {
  targets: (string | CrawlPageDetailTargetConfig)[]
  intervalTime?: IntervalTime
  fingerprints?: (DetailTargetFingerprintCommon & {
    maxWidth?: number
    minWidth?: number
    maxHeight?: number
    minHidth?: number
  })[]

  headers?: Object
  cookies?: PageCookies
  viewport?: Viewport // Viewport comes from puppeteer

  onCrawlItemComplete?: (crawlPageSingleResult: CrawlPageSingleResult) => void
}
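| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| targets | (string \| CrawlPageDetailTargetConfig)[] | - | Crawl targets |
| intervalTime | IntervalTime | - | Interval time |
| fingerprints | (DetailTargetFingerprintCommon & { maxWidth?: number; minWidth?: number; maxHeight?: number; minHidth?: number })[] | - | Device fingerprints |
| headers | Object | - | Request headers |
| cookies | PageCookies | null | - |
| viewport | Viewport | - | Sets the viewport size |
| onCrawlItemComplete | (crawlPageSingleResult: CrawlPageSingleResult) => void | - | Lifecycle hook |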

External types

  • Viewport: comes from puppeteer; viewport is passed directly to page.setViewport to set the page size
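
Because `targets` accepts both plain URL strings and detail objects, a small sketch may help show how the two forms can sit side by side. Everything below is an assumption for illustration: the types are simplified local stand-ins, the `IntervalTime` shape is assumed, `normalizeTargets` is a hypothetical helper (not part of the library), and the callback argument is typed as `unknown` in place of `CrawlPageSingleResult`.

```typescript
// Simplified local stand-ins for the library and puppeteer types.
interface Viewport {
  width: number
  height: number
}
interface DetailTarget {
  url: string
  priority?: number
  viewport?: Viewport | null
}
// Assumed shape of IntervalTime: a fixed delay or a min/max range.
type IntervalTime = number | { max: number; min?: number }

// Hypothetical, trimmed-down mirror of CrawlPageAdvancedConfig.
interface AdvancedConfig {
  targets: (string | DetailTarget)[]
  intervalTime?: IntervalTime
  viewport?: Viewport
  onCrawlItemComplete?: (result: unknown) => void
}

// Hypothetical helper: turn every string target into a detail object.
function normalizeTargets(targets: (string | DetailTarget)[]): DetailTarget[] {
  return targets.map((t) => (typeof t === 'string' ? { url: t } : t))
}

let completedCount = 0

const config: AdvancedConfig = {
  targets: [
    'https://www.example.com/page/1',
    { url: 'https://www.example.com/page/2', priority: 1 }
  ],
  intervalTime: { max: 3000, min: 1000 },
  viewport: { width: 1280, height: 720 },
  // Count each finished target as it is reported.
  onCrawlItemComplete: () => {
    completedCount += 1
  }
}

const normalized = normalizeTargets(config.targets)
```

Here `headers`, `cookies`, and `viewport` are set once on the advanced config, while each entry in `targets` can still carry its own fields.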

CrawlPageSingleResult

ts
export interface CrawlPageSingleResult extends CrawlCommonResult {
  data: {
    browser: Browser // Browser comes from puppeteer
    response: HTTPResponse | null // HTTPResponse comes from puppeteer
    page: Page // Page comes from puppeteer
  }
}
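
Since `response` is typed as `HTTPResponse | null`, code that consumes a result needs to handle the missing case. The sketch below shows one way to do that and then close the page. `ResponseLike`, `PageLike`, `PageResultData`, `statusOf`, and `finish` are hypothetical names; the two interfaces are minimal structural stand-ins that use only `status()` and `close()`, which exist on puppeteer's `HTTPResponse` and `Page`.

```typescript
// Minimal structural stand-ins for puppeteer's HTTPResponse and Page.
interface ResponseLike {
  status(): number
}
interface PageLike {
  close(): Promise<void>
}

// Hypothetical, trimmed-down mirror of CrawlPageSingleResult['data'].
interface PageResultData {
  response: ResponseLike | null
  page: PageLike
}

// Return the HTTP status, or null when no response is available.
function statusOf(data: PageResultData): number | null {
  return data.response ? data.response.status() : null
}

// Read the status, then close the page once it is no longer needed.
async function finish(data: PageResultData): Promise<number | null> {
  const status = statusOf(data)
  await data.page.close()
  return status
}
```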

External types

Released under the MIT License