2020-09-15 11:23:34 [scrapy.extensions.telnet] INFO: Telnet Password: ee22c12439cb5178
2020-09-15 11:23:34 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2020-09-15 11:23:34 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-09-15 11:23:34 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-09-15 11:23:34 [scrapy.middleware] INFO: Enabled item pipelines:
['demo1.pipelines.ziranweiyuanhuiPipline']
2020-09-15 11:23:34 [scrapy.core.engine] INFO: Spider opened
2020-09-15 11:23:34 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2020-09-15 11:23:34 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-09-15 11:23:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://swt.shanxi.gov.cn/Main/list.action?channelId=27> (referer: None)
2020-09-15 11:23:34 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=4cb2c090-e719-41d0-ac0f-1abe541f183e
2020-09-15 11:23:34 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=43efe7bb-0a96-4484-b9f4-9184f35b94e8
2020-09-15 11:23:34 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=20355e00-5299-4693-b784-3ea132f68e12
2020-09-15 11:23:34 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=9daf0688-0f5d-467c-8531-ba1cefc92770
2020-09-15 11:23:34 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=0238484c-8496-4066-8996-3de03378979c
2020-09-15 11:23:34 [scrapy.spidermiddlewares.offsite] DEBUG: Filtered offsite request to 'fgw.shanxi.gov.cn': <GET http://fgw.shanxi.gov.cn/fggz/wngz/kjws/201911/t20191129_121660.shtml>
2020-09-15 11:23:34 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=e8944693-fe8b-4385-be73-4aa7715056f1
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=b913adc3-775d-4c3c-9ef0-ccb66eb6987f
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=318e14b2-ca25-4e91-b6b0-2b54a1f88348
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=923c9f58-34a3-4518-853c-b86f33787ebc
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=dff5d8f1-a830-44f2-ba68-3e2af3c52638
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=d0a6ba2d-952b-4d93-8663-ae9a4008ae0a
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=8be236d6-5365-44ef-990f-a6848a860346
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=f9a6ad01-6902-495a-84e4-6500c5e8f3cc
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=9fbb7bad-1119-4be7-b6df-9ecf2feb34f3
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=50e277e2-9d8f-499e-816f-aea870f89c89
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=4ee60e63-acca-4c86-8d9c-099f7bd3aa4f
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=c40c816b-a596-4f9f-94ac-1fe6154a7cf3
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=efa92a7b-16d3-496c-b07f-5a63525bafe1
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=31eb36b4-f197-4c3b-9162-2f332b050ced
2020-09-15 11:23:35 [root] INFO: This link has already been crawled-----:http://swt.shanxi.gov.cn/Main/cmsContent.action?articleId=33bb2acd-de5d-442a-859f-2e9d95f73504
2020-09-15 11:23:35 [scrapy.core.engine] INFO: Closing spider (finished)
2020-09-15 11:23:35 [root] INFO: The spider has finished running
2020-09-15 11:23:35 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 250,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 8192,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'elapsed_time_seconds': 0.765148,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2020, 9, 15, 3, 23, 35, 245648),
 'log_count/DEBUG': 2,
 'log_count/INFO': 31,
 'offsite/domains': 1,
 'offsite/filtered': 1,
 'request_depth_max': 1,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2020, 9, 15, 3, 23, 34, 480500)}
2020-09-15 11:23:35 [scrapy.core.engine] INFO: Spider closed (finished)
2020-09-16 08:47:16 [scrapy.extensions.telnet] INFO: Telnet Password: 1a617e64c04cecf7
2020-09-16 08:47:16 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2020-09-16 08:47:16 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-09-16 08:47:16 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-09-16 08:47:16 [scrapy.middleware] INFO: Enabled item pipelines:
['demo1.pipelines.ziranweiyuanhuiPipline']
2020-09-16 08:47:16 [scrapy.core.engine] INFO: Spider opened
2020-09-16 08:47:16 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2020-09-16 08:47:16 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6033
2020-09-16 08:47:16 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'demo1',
 'DOWNLOAD_DELAY': 1,
 'LOG_FILE': 'logs/sxgongxinting_2020_9.log',
 'NEWSPIDER_MODULE': 'demo1.spiders',
 'RETRY_HTTP_CODES': [500, 502, 503, 504, 400, 403, 404, 408, 302],
 'RETRY_TIMES': True,
 'SPIDER_MODULES': ['demo1.spiders']}