2020-09-15 11:26:10 [scrapy.extensions.telnet] INFO: Telnet Password: 423034b8342a486e
2020-09-15 11:26:10 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2020-09-15 11:26:11 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-09-15 11:26:11 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-09-15 11:26:11 [scrapy.middleware] INFO: Enabled item pipelines:
['demo1.pipelines.ziranweiyuanhuiPipline']
2020-09-15 11:26:11 [scrapy.core.engine] INFO: Spider opened
2020-09-15 11:26:11 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2020-09-15 11:26:11 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-09-15 11:26:11 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://kjj.taiyuan.gov.cn/zfxxgk/gggs/index.shtml> (referer: None)
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/09/07/1008391.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/09/04/1008199.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/08/21/1004590.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/08/13/1001630.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/08/08/999926.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/07/31/997727.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/07/17/993580.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/06/23/988275.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/06/22/988019.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/06/19/987592.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/06/15/986244.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/06/15/986238.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/06/15/986237.shtml
2020-09-15 11:26:11 [root] INFO: This link has already been crawled-----:http://kjj.taiyuan.gov.cn/doc/2020/06/15/986236.shtml
2020-09-15 11:26:12 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://kjj.taiyuan.gov.cn/doc/2020/09/12/1010113.shtml> (referer: http://kjj.taiyuan.gov.cn/zfxxgk/gggs/index.shtml)
2020-09-15 11:26:12 [scrapy.core.scraper] DEBUG: Scraped from <200 http://kjj.taiyuan.gov.cn/doc/2020/09/12/1010113.shtml>
{'biaoti': '关于征求太原市地方标准《科技成果评价规范(征求意见稿)》意见的通知',
'laiyuan': '太原市科学技术局',
'lianjie': 'http://kjj.taiyuan.gov.cn/doc/2020/09/12/1010113.shtml',
'shijian': '2020-09-12',
'wenjian': [{'file_name': '1.科技成果评价规范(征求意见稿).doc',
'file_url': 'http://kjj.taiyuan.gov.cn/uploadfiles/202009/12/2020091222053429459132.doc',
'new_file': '/2020/09/Yys4ES6z_2020091222053429459132.doc'},
{'file_name': '2.地方标准征求意见反馈表.doc',
'file_url': 'http://kjj.taiyuan.gov.cn/uploadfiles/202009/12/2020091221401014098186.doc',
'new_file': '/2020/09/ucvansUw_2020091221401014098186.doc'}],
'xiangqing': '<div id="Zoom"> \n'
' <!--<$[CONTENT]>start-->\n'
' <!--<p style="text-align:center;"><img src="" '
'/></p>-->\n'
' <p></p><p align="justify" style="text-align: justify; '
'line-height: 200%; text-indent: 0pt; -ms-text-autospace: '
'ideograph-numeric; -ms-text-justify: inter-ideograph;"><span '
'style="font-size: 11pt;"><span style="font-family: '
'宋体;">各相关单位</span></span><span style="font-size: 11pt;"><span '
'style="font-family: 宋体;">和个人</span></span><span '
'style="font-size: 11pt;"><span style="font-family: '
'宋体;">:</span></span></p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 22pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;"><span style="font-size: '
'11pt;"><span style="font-family: '
'宋体;">根据国家《地方标准管理办法》要求,现就太原市科学技术局提出,太原技术转移促进中心、山西产业互联网研究院、山西省大众科技评估中心起草的地方标准《科技成果评价规范(征求意见稿)》,向社会公开征求意见,请各有关单位及个人提出意见,并填写《征求意见反馈表》,于2020年10月11日前反馈至市科技局计划处。</span></span></p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 22pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;"><span style="font-size: '
'11pt;"><span style="font-family: 宋体;">联 系 人:</span></span><span '
'style="font-size: 11pt;"><span style="font-family: '
'宋体;">张晓军</span></span></p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 22pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;"><span style="font-size: '
'11pt;"><span style="font-family: 宋体;">联系电话:</span></span><span '
'style="font-size: 11pt;"><span style="font-family: '
'宋体;">4223750</span></span></p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 22pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;"><span style="font-size: '
'11pt;"><span style="font-family: 宋体;">电子邮箱:</span></span><span '
'style="font-size: 11pt;"><span style="font-family: '
'宋体;">cxfz701</span></span><span style="font-size: 11pt;"><span '
'style="font-family: 宋体;">@1</span></span><span '
'style="font-size: 11pt;"><span style="font-family: '
'宋体;">63</span></span><span style="font-size: 11pt;"><span '
'style="font-family: 宋体;">.com</span></span></p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 22pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;">\xa0</p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 22pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;"><span style="font-size: '
'11pt;"><span style="font-family: '
'宋体;">附</span></span>\xa0\xa0\xa0\xa0<span style="font-size: '
'11pt;"><span style="font-family: 宋体;">件:</span></span></p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 22pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;"><a '
'href="https://www.sxwikionline.com/gateway/enterprise/file/download/know?path=/home/enterprise/staticrec/policy/2020/09/Yys4ES6z_2020091222053429459132.doc" '
'target="_blank" '
'title="1.科技成果评价规范(征求意见稿).doc">1.科技成果评价规范(征求意见稿).doc</a></p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 22pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;"><a '
'href="https://www.sxwikionline.com/gateway/enterprise/file/download/know?path=/home/enterprise/staticrec/policy/2020/09/ucvansUw_2020091221401014098186.doc" '
'target="_blank" '
'title="2.地方标准征求意见反馈表.doc">2.地方标准征求意见反馈表.doc</a></p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 0pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;">\xa0</p>\n'
'\n'
'<p align="justify" style="text-align: justify; line-height: '
'200%; text-indent: 0pt; -ms-text-autospace: ideograph-numeric; '
'-ms-text-justify: inter-ideograph;">\xa0</p>\n'
'\n'
'<p align="right" style="text-align: right; line-height: 200%; '
'text-indent: 0pt; -ms-text-autospace: ideograph-numeric;"><span '
'style="font-size: 11pt;"><span style="font-family: '
'宋体;">太原市科学技术局</span></span></p>\n'
'\n'
'<p align="right" style="text-align: right; line-height: 200%; '
'text-indent: 0pt; -ms-text-autospace: ideograph-numeric;"><span '
'style="font-size: 11pt;"><span style="font-family: '
'宋体;">2020年9月12日</span></span></p>\n'
'\n'
' <!--<$[CONTENT]>end--> \n'
' </div>'}
2020-09-15 11:26:12 [scrapy.core.engine] INFO: Closing spider (finished)
2020-09-15 11:26:12 [root] INFO: The spider has finished running
2020-09-15 11:26:12 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 555,
'downloader/request_count': 2,
'downloader/request_method_count/GET': 2,
'downloader/response_bytes': 33217,
'downloader/response_count': 2,
'downloader/response_status_count/200': 2,
'elapsed_time_seconds': 1.491522,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2020, 9, 15, 3, 26, 12, 594548),
'item_scraped_count': 1,
'log_count/DEBUG': 3,
'log_count/INFO': 25,
'request_depth_max': 1,
'response_received_count': 2,
'scheduler/dequeued': 2,
'scheduler/dequeued/memory': 2,
'scheduler/enqueued': 2,
'scheduler/enqueued/memory': 2,
'start_time': datetime.datetime(2020, 9, 15, 3, 26, 11, 103026)}
2020-09-15 11:26:12 [scrapy.core.engine] INFO: Spider closed (finished)
2020-09-16 08:47:17 [scrapy.extensions.telnet] INFO: Telnet Password: d2a8a3ac7c4697ab
2020-09-16 08:47:17 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2020-09-16 08:47:17 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-09-16 08:47:17 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-09-16 08:47:17 [scrapy.middleware] INFO: Enabled item pipelines:
['demo1.pipelines.ziranweiyuanhuiPipline']
2020-09-16 08:47:17 [scrapy.core.engine] INFO: Spider opened
2020-09-16 08:47:17 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2020-09-16 08:47:17 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6037
2020-09-16 08:47:17 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'demo1',
'DOWNLOAD_DELAY': 1,
'LOG_FILE': 'logs/taiyuangongyehexinxihuaju_2020_9.log',
'NEWSPIDER_MODULE': 'demo1.spiders',
'RETRY_HTTP_CODES': [500, 502, 503, 504, 400, 403, 404, 408, 302],
'RETRY_TIMES': True, 'SPIDER_MODULES': ['demo1.spiders']}