
crawlFile

CrawlFileDetailTargetConfig

```ts
export interface CrawlFileDetailTargetConfig extends CrawlCommonConfig {
  url: string
  headers?: Object | null
  priority?: number
  storeDir?: string | null
  fileName?: string | null
  extension?: string | null
  fingerprint?: DetailTargetFingerprintCommon | null
}
```
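| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | `string` | - | URL |
| headers | `Object \| null` | - | Request headers |
| priority | `number` | - | Priority |
| storeDir | `string \| null` | `__dirname` | Storage location |
| fileName | `string \| null` | - | File name |
| extension | `string \| null` | - | Extension |
| fingerprint | `DetailTargetFingerprintCommon \| null` | - | Device fingerprint |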
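
As a rough illustration of the fields above, the sketch below builds a mixed list of targets (bare URL strings and detail objects) and normalizes them to one shape. `DetailTargetSketch`, `TargetSketch`, and `toDetail` are hypothetical names introduced here, the interface is a trimmed local mirror rather than the library's own export, and the URLs and values are placeholders.

```typescript
// Trimmed local mirror of CrawlFileDetailTargetConfig (assumption: only the
// fields used below; CrawlCommonConfig, headers and fingerprint are omitted).
interface DetailTargetSketch {
  url: string
  priority?: number
  storeDir?: string | null
  fileName?: string | null
  extension?: string | null
}

// A target is either a bare URL string or a detail object.
type TargetSketch = string | DetailTargetSketch

const targets: TargetSketch[] = [
  'https://example.com/a.png',
  {
    url: 'https://example.com/b.png',
    storeDir: './downloads', // placeholder directory
    fileName: 'banner',
    extension: '.png', // illustrative value; the expected format is not stated here
    priority: 1
  }
]

// Hypothetical helper: turn both forms into a detail object.
function toDetail(target: TargetSketch): DetailTargetSketch {
  return typeof target === 'string' ? { url: target } : target
}

const normalized = targets.map(toDetail)
```

Only `url` is required, so a plain string carries the same information as a detail object with every optional field left unset.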

CrawlFileAdvancedConfig

```ts
export interface CrawlFileAdvancedConfig extends CrawlCommonConfig {
  targets: (string | CrawlFileDetailTargetConfig)[]
  intervalTime?: IntervalTime
  fingerprints?: DetailTargetFingerprintCommon[]
  storeDirs?: string | (string | null)[]
  extensions?: string | (string | null)[]
  fileNames?: (string | null)[]

  headers?: Object

  onCrawlItemComplete?: (crawlFileSingleResult: CrawlFileSingleResult) => void
  onBeforeSaveItemFile?: (info: {
    id: number
    fileName: string
    filePath: string
    data: Buffer
  }) => Promise<Buffer | void> | Buffer | void
}
```
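| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| targets | `(string \| CrawlFileDetailTargetConfig)[]` | - | Targets |
| intervalTime | `IntervalTime` | - | Interval time |
| fingerprints | `DetailTargetFingerprintCommon[]` | - | Device fingerprints |
| storeDirs | `string \| (string \| null)[]` | `__dirname` | Storage locations |
| extensions | `string \| (string \| null)[]` | - | Extensions |
| fileNames | `(string \| null)[]` | - | File names |
| headers | `Object` | - | Request headers |
| onCrawlItemComplete | `(crawlFileSingleResult: CrawlFileSingleResult) => void` | - | Lifecycle hook |
| onBeforeSaveItemFile | `(info: { id: number; fileName: string; filePath: string; data: Buffer }) => Promise<Buffer \| void> \| Buffer \| void` | - | Lifecycle hook |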
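
The `onBeforeSaveItemFile` signature allows returning either a `Buffer` or nothing. The sketch below shows one function that fits that signature: it strips a leading UTF-8 byte order mark and otherwise returns nothing. `SaveItemInfo` and `stripBomBeforeSave` are hypothetical names, and the idea that returning nothing leaves the data untouched is an assumption, not something this page states.

```typescript
import { Buffer } from 'node:buffer'

// Local copy of the argument shape documented for onBeforeSaveItemFile.
interface SaveItemInfo {
  id: number
  fileName: string
  filePath: string
  data: Buffer
}

// The three bytes of a UTF-8 byte order mark.
const UTF8_BOM = Buffer.from([0xef, 0xbb, 0xbf])

// Hypothetical hook: return the data without its BOM, or nothing at all.
function stripBomBeforeSave(info: SaveItemInfo): Buffer | void {
  const hasBom = info.data.subarray(0, 3).equals(UTF8_BOM)
  if (hasBom) {
    // subarray returns a view on the same memory, without copying.
    return info.data.subarray(3)
  }
  // Nothing is returned here. Assumption: the original data is then kept.
}
```

A function like this would be supplied as the `onBeforeSaveItemFile` property of the advanced config, next to `targets`.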

CrawlFileSingleResult

```ts
export interface CrawlFileSingleResult extends CrawlCommonResult {
  data: {
    statusCode: number | undefined
    headers: IncomingHttpHeaders // IncomingHttpHeaders comes from node:http
    data: {
      isSuccess: boolean
      fileName: string
      fileExtension: string
      mimeType: string
      size: number
      filePath: string
    }
  } | null
}
```
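
Because `data` can be `null` and the nested `data` carries its own `isSuccess` flag, a consumer has two conditions to check before reading file details. The sketch below is one way to handle that, for example inside `onCrawlItemComplete`. `FileResultSketch` and `describeFileResult` are hypothetical names, the interface mirrors only the `data` field (the `CrawlCommonResult` fields are not shown in this section), and the meaning attached to each branch is an assumption.

```typescript
import type { IncomingHttpHeaders } from 'node:http'

// Local mirror of the `data` field of CrawlFileSingleResult.
interface FileResultSketch {
  data: {
    statusCode: number | undefined
    headers: IncomingHttpHeaders
    data: {
      isSuccess: boolean
      fileName: string
      fileExtension: string
      mimeType: string
      size: number
      filePath: string
    }
  } | null
}

// Hypothetical helper: summarize one result as a log line.
function describeFileResult(result: FileResultSketch): string {
  // Outer data is null: nothing to inspect (assumed to mean no response).
  if (result.data === null) return 'no response data'

  const { statusCode, data: file } = result.data
  // A response arrived, but the file is not flagged as successful.
  if (!file.isSuccess) return `not saved (status ${statusCode ?? 'unknown'})`

  return `saved ${file.filePath} (${file.size} bytes, ${file.mimeType})`
}
```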

External types

  • IncomingHttpHeaders: from the Node.js http module

Released under the MIT License