Replies: 4 comments
-
Can you provide a reproduction? I mean some code: what you provided is an invalid state, and we need to understand how you got to that state. @metalwarrior665 any idea how this could happen? We could ignore
-
Maybe the user created an Error object and then set the message property to null, or some library did this? I have never seen this.
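To make that hypothesis concrete, here is a minimal sketch of how a `null` could end up in a stored `errorMessages` array. The helper `recordErrorMessage` is hypothetical and only imitates "push `error.message` into an array"; it is not the SDK's actual code, and whether the SDK takes this path is an assumption.

```javascript
// Hypothetical helper, NOT the Apify SDK's implementation: it only mimics
// "store error.message in an array" to show where a null could come from.
function recordErrorMessage(errorMessages, error) {
    errorMessages.push(error.message);
    return errorMessages;
}

const error = new Error('no header info');
// Some code (user code or a library) overwrites the message afterwards.
error.message = null;

const errorMessages = recordErrorMessage([], error);

// Serializing the array keeps the null entry.
const serialized = JSON.stringify({ retryCount: 1, errorMessages });
console.log(serialized);

// A thrown plain object has no message at all; JSON.stringify turns the
// resulting undefined array entry into null as well.
const fromPlainObject = JSON.stringify(recordErrorMessage([], {}));
console.log(fromPlainObject);
```

Either path would leave a non-string entry in the persisted record, which a later type check on `errorMessages` would then reject.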
-
I removed this SQLite file, and it has not happened again.
// departStore is an lmdb instance
const crawler = new Apify.CheerioCrawler({
    maxConcurrency: 100,
    minConcurrency: 10,
    // Let the crawler fetch URLs from our list.
    requestQueue,
    maxRequestRetries: 10000,
    // This function will be called for each URL to crawl.
    // It accepts a single parameter, which is an object with options as:
    // https://sdk.apify.com/docs/typedefs/cheerio-crawler-options#handlepagefunction
    // We use for demonstration only 2 of them:
    // - request: an instance of the Request class with information such as URL and HTTP method
    // - $: the cheerio object containing parsed HTML
    handlePageFunction: async ({ request, $ }) => {
        log.debug(`Processing ${request.url} `);
        log.info(JSON.stringify(await requestQueue.getInfo()));
        let allLinks = await RingoTools.extract.extractAllUrls($, request.loadedUrl);
        let firstPageUrl = request.url;
        if (request.url.includes("?")) {
            firstPageUrl = request.url.split("?")[0];
        }
        let header = await RingoTools.extract.extractAttrByCss($, ["text"], ".z-header");
        let links = allLinks.filter(link => link.startsWith(firstPageUrl + "?first_id="));
        if (header.length === 0) {
            throw new Error("no header info" + request.url);
        }
        if (request.userData && request.userData.type === "first") {
            for (let i = 0; i < links.length; i++) {
                await requestQueue.addRequest({ url: links[i], userData: { type: "list" } });
            }
        }
        let detailLinks = allLinks.filter(link => link.startsWith("http://3g"));
        let hospitalId = firstPageUrl.replace("http://3g", "").replace("-1.htm", "");
        departStore.putSync(request.url, { hospitalId, detailLinks, html: $.html() });
    },
    // This function is called if the page processing failed more than maxRequestRetries+1 times.
    handleFailedRequestFunction: async ({ request }) => {
        await requestQueue.reclaimRequest(request);
    }
});
-
I'm seeing this issue as well - any suggestions on how to prevent this?
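One possible guard, offered only as a sketch: make sure anything that ends up in `errorMessages` is a string, and clean records that already contain `null`. Both `toErrorMessage` and `sanitizeErrorMessages` are hypothetical helpers, not SDK functions, and where to hook them in (before throwing, or before a stored record is loaded again) is an assumption that depends on your setup.

```javascript
// Hypothetical helper, not part of the Apify SDK: always produce a string
// from whatever was thrown, even if its message was removed or nulled.
function toErrorMessage(thrown) {
    if (thrown instanceof Error && typeof thrown.message === 'string') {
        return thrown.message;
    }
    return String(thrown);
}

// Hypothetical helper, not part of the Apify SDK: returns a copy of a stored
// request record whose errorMessages array contains only strings.
function sanitizeErrorMessages(record) {
    const errorMessages = Array.isArray(record.errorMessages)
        ? record.errorMessages.filter((message) => typeof message === 'string')
        : [];
    return { ...record, errorMessages };
}

// Example with a record shaped like the one in the original report.
const stored = { method: 'GET', retryCount: 1, errorMessages: [null, 'timeout'] };
const cleaned = sanitizeErrorMessages(stored);
console.log(cleaned.errorMessages);
```

The filter drops invalid entries instead of replacing them with a placeholder, so the array may end up shorter than `retryCount` suggests; swap the `filter` for a `map` over `toErrorMessage` if you would rather keep one entry per failure.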
-
Describe the bug
ArgumentError: (array `RequestOptions.errorMessages`) Expected values to be of type `string` but received type `null`
To Reproduce
SQLite JSON field:
{"id":"KBXoZCDAj5e6ZbO","url":"xxxxxxxxx","uniqueKey":"xxxxxxxxx","method":"GET","noRetry":false,"retryCount":1,"errorMessages":[null],"headers":{},"userData":{"page":1}}
Expected behavior
process exit
System information: