Saturday, 17 May 2025

crawl

 listing_res = await crawler.arun(url=listing_url)

This is sequential by default.

Change it to concurrent:

        listing_pages = await crawler.arun_many(
            listing_urls,
            config={"concurrency": 5, "delay": 2}
        )
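Under the hood, this kind of bounded concurrency with a per-request delay can be approximated with plain asyncio primitives. The sketch below is only an illustration of the idea, not the crawler library's internal implementation; it assumes the same crawler object and crawler.arun() call used above, and the crawl_concurrently helper is made up for this example.

import asyncio

async def crawl_concurrently(crawler, urls, concurrency=5, delay=2):
    sem = asyncio.Semaphore(concurrency)   # at most `concurrency` pages in flight

    async def fetch(url):
        async with sem:
            result = await crawler.arun(url=url)
            await asyncio.sleep(delay)     # politeness delay before releasing the slot
            return result

    # gather preserves the order of `urls` in the returned list
    return await asyncio.gather(*(fetch(u) for u in urls))

# usage (hypothetical):
# listing_pages = await crawl_concurrently(crawler, listing_urls, concurrency=5, delay=2)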


Setting concurrency: 1 in the arun_many() config effectively serializes all fetches, making it behave just like calling arun() in a simple loop.



# this…
pages = await crawler.arun_many(
  urls,
  config={"concurrency": 1, "delay": 0}
)

# …is equivalent to:
pages = []
for url in urls:
    pages.append(await crawler.arun(url=url))

The only difference is that arun_many still batches everything for you (and can apply a shared retry/backoff strategy), but with concurrency=1 it won't open more than one page at a time. If you also set delay to 0 (or omit it), you get a straight one-after-another crawl, just like repeated arun() calls.
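For a sense of what a shared retry/backoff strategy around each fetch could look like, here is a rough sketch; the fetch_with_retry name and the max_retries/backoff parameters are invented for illustration and are not options of the crawler itself.

import asyncio

async def fetch_with_retry(crawler, url, max_retries=3, backoff=1.0):
    for attempt in range(max_retries):
        try:
            return await crawler.arun(url=url)
        except Exception:
            if attempt == max_retries - 1:
                raise                                    # give up after the last attempt
            await asyncio.sleep(backoff * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s…

# sequential crawl with the retry wrapper applied to each URL:
# pages = [await fetch_with_retry(crawler, u) for u in urls]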
