[Python] Custom Pipeline that saves data scraped by Scrapy to a MySQL database

Posted 2019-09-09 in Python, viewed 136 times.

# This pipeline does not create the table; the `websites` table must already exist.

import datetime
import logging

import MySQLdb.cursors
from twisted.enterprise import adbapi

logger = logging.getLogger(__name__)


class SQLStorePipeline(object):

    def __init__(self):
        # adbapi.ConnectionPool runs the blocking MySQLdb calls in a thread pool
        self.dbpool = adbapi.ConnectionPool('MySQLdb', db='mydb',
                user='myuser', passwd='mypass', cursorclass=MySQLdb.cursors.DictCursor,
                charset='utf8', use_unicode=True)

    def process_item(self, item, spider):
        # run the db query in the thread pool; the item is returned right away
        query = self.dbpool.runInteraction(self._conditional_insert, item)
        query.addErrback(self.handle_error)

        return item

    def _conditional_insert(self, tx, item):
        # create the record if it doesn't exist yet.
        # this whole block runs in its own thread, inside one transaction
        tx.execute("select * from websites where link = %s", (item['link'][0], ))
        result = tx.fetchone()
        if result:
            logger.debug("Item already stored in db: %s", item)
        else:
            tx.execute(
                "insert into websites (link, created) "
                "values (%s, %s)",
                (item['link'][0],
                 datetime.datetime.now())
            )
            logger.debug("Item stored in db: %s", item)

    def handle_error(self, e):
        # e is the twisted Failure raised by the interaction
        logger.error(e.getTraceback())
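
Since the pipeline expects the `websites` table to exist already, the sketch below shows one possible table layout and the same select-then-insert logic. It runs against an in-memory SQLite database, swapped in for MySQL so that it works without a server. Only the `link` and `created` columns come from the snippet; the `id` column, the column types, and the `conditional_insert` helper are assumptions made for illustration.

```python
import datetime
import sqlite3

# Assumed schema: only `link` and `created` appear in the snippet above;
# the `id` column and the column types are illustrative guesses.
SCHEMA = """
create table websites (
    id integer primary key autoincrement,
    link text not null,
    created timestamp not null
)
"""


def conditional_insert(cursor, link):
    """Insert `link` only if it is not stored yet; return True when inserted."""
    # sqlite3 uses `?` placeholders where MySQLdb uses `%s`
    cursor.execute("select 1 from websites where link = ?", (link,))
    if cursor.fetchone():
        return False
    cursor.execute(
        "insert into websites (link, created) values (?, ?)",
        (link, datetime.datetime.now().isoformat()),
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
cur = conn.cursor()

first = conditional_insert(cur, "http://example.com/")
second = conditional_insert(cur, "http://example.com/")
conn.commit()

cur.execute("select count(*) from websites")
row_count = cur.fetchone()[0]
print(first, second, row_count)  # prints: True False 1
```

The second call finds the existing row and skips the insert, so only one row is stored. With MySQL, a unique index on `link` would be a further safeguard against duplicates, though the snippet above does not rely on one.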
 
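Scrapy only calls a pipeline that is listed in the project's `ITEM_PIPELINES` setting. A minimal sketch of that entry follows; the module path `myproject.pipelines` is a hypothetical name and should be replaced with the module where `SQLStorePipeline` is actually defined.

```python
# settings.py of the Scrapy project
# "myproject.pipelines" is a hypothetical module path, not taken from the snippet.
ITEM_PIPELINES = {
    # The integer sets the run order: pipelines with lower values run first.
    "myproject.pipelines.SQLStorePipeline": 300,
}
```

The value 300 is arbitrary here; it only matters relative to other pipelines enabled in the same project.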
