Open
Description
cdxt --cc --from 2021 --to 2020 -v -v --limit 1 iter https://www.pbm.com/
INFO:cdx_toolkit.cli:set loglevel to DEBUG
DEBUG:cdx_toolkit.myrequests:getting https://index.commoncrawl.org/collinfo.json None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): index.commoncrawl.org:443
DEBUG:urllib3.connectionpool:https://index.commoncrawl.org:443 "GET /collinfo.json HTTP/1.1" 200 1157
INFO:cdx_toolkit.commoncrawl:Found 87 endpoints in the Common Crawl index
INFO:cdx_toolkit:making a custom cc index list
INFO:cdx_toolkit.commoncrawl:using cc index range from https://index.commoncrawl.org/CC-MAIN-2021-04-index to https://index.commoncrawl.org/CC-MAIN-2020-50-index
INFO:cdx_toolkit:get_more: fetching cdx from https://index.commoncrawl.org/CC-MAIN-2021-04-index
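The above date range should be empty: --from 2021 is later than --to 2020. Instead, cdxt builds an index range from CC-MAIN-2021-04 to CC-MAIN-2020-50 and starts fetching from the first index.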
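One possible guard is to compare the two bounds before building the index list and return nothing when the start is later than the end. The sketch below is only an illustration of that check. The helpers `pad_timestamp` and `is_empty_range` are hypothetical names and are not part of cdx_toolkit, and the sketch assumes both bounds are digit-only partial timestamps of the form YYYY[MM[DD[hh[mm[ss]]]]].

```python
def pad_timestamp(ts: str, upper: bool = False) -> str:
    """Pad a partial timestamp to 14 digits (hypothetical helper).

    A lower bound is padded with the earliest possible values and an
    upper bound with the latest. The upper padding always uses day 31,
    so the result is only meant for string comparison, not date parsing.
    """
    if not ts.isdigit() or len(ts) > 14:
        raise ValueError(f"not a digit-only timestamp of at most 14 digits: {ts!r}")
    template = "99991231235959" if upper else "00000101000000"
    return ts + template[len(ts):]


def is_empty_range(from_ts: str, to_ts: str) -> bool:
    """Return True when the range cannot match any capture (hypothetical helper)."""
    # Fixed-width digit strings compare in the same order as the times they encode.
    return pad_timestamp(from_ts) > pad_timestamp(to_ts, upper=True)


if __name__ == "__main__":
    # The range from this report: --from 2021 --to 2020
    print(is_empty_range("2021", "2020"))
    # A normal range for comparison: --from 2020 --to 2021
    print(is_empty_range("2020", "2021"))
```

With a check like this, the command in this report could exit early, possibly with a warning, instead of querying CC-MAIN-2021-04.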