listing_res = await crawler.arun(url=listing_url)
Calling arun() like this for each URL is sequential by default. To fetch the pages concurrently, switch to arun_many():
listing_pages = await crawler.arun_many(
    listing_urls,
    config={"concurrency": 5, "delay": 2}
)
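Assuming arun_many() returns one result per URL and preserves the input order (worth verifying against your crawler's docs), you can pair each result back up with its URL after the call above:

# Continues from the snippet above; assumes results come back in input order.
for listing_url, listing_page in zip(listing_urls, listing_pages):
    # each listing_page is the same kind of result object a single arun() call would return
    print(listing_url, listing_page)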
Yes: setting concurrency: 1 in your arun_many() config effectively serializes all your fetches, so it behaves just like calling arun() in a simple loop.
# this…
pages = await crawler.arun_many(
    urls,
    config={"concurrency": 1, "delay": 0}
)

# …is equivalent to:
pages = []
for url in urls:
    pages.append(await crawler.arun(url=url))
The only difference is that arun_many() still batches everything for you (and can apply a shared retry/backoff strategy), but with concurrency=1 it won't open more than one page at a time. If you also set delay to 0 (or omit it), you get a straight one-after-another crawl, just like repeated arun() calls.
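If you wanted roughly the same behavior without arun_many(), a hand-rolled equivalent might look like the sketch below. The retry count, backoff factor, and asyncio.sleep delay are illustrative assumptions, not what the library actually does internally:

import asyncio

async def crawl_sequentially(crawler, urls, delay=2, retries=3):
    # Rough stand-in for arun_many(urls, config={"concurrency": 1, "delay": delay})
    pages = []
    for url in urls:
        for attempt in range(retries):
            try:
                pages.append(await crawler.arun(url=url))
                break
            except Exception:
                # simple exponential backoff between attempts (illustrative)
                await asyncio.sleep(2 ** attempt)
        # politeness delay between URLs
        await asyncio.sleep(delay)
    return pages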