enhancementhelp wanted
Description
Hey guys,
I have an issue. In real world of scrappers when using proxies, checking if You have response code 200 or 500 is simply not enough. Plenty of proxies or even website itself can throw an output with code 200 which is invalid from developer point of view and yet its getting cached. Is there any way to apply some kind of custom logic which would determine if response is what we want? For example:
colly.CacheDir("some/dir", func(resp) {
if bytes.Contains(resp.Body, []byte("some magic marker")) {
return false // don't cache
}
})
I'm new to colly and i was digging thru source code for a bit, but didn't find such of feature. It's also useful without proxies.