Trying Out PycURL – 笑遍世界

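Linux has a widely used (and very handy) command, curl, and behind it sits the well-known libcurl library, which is both powerful and highly efficient. Besides its native C API, libcurl has bindings for some 40 programming languages; PycURL, introduced here, is the Python binding for libcurl.
When you issue GET/POST requests to web pages from Python and performance matters, libcurl is a very good choice. It is generally noticeably faster than urllib and urllib2, and may also be more efficient than Requests, especially when you issue many concurrent requests with PycURL. In my view its only drawback is that, because PycURL calls straight into the libcurl C library, its interface still looks a lot like the C API and is not very Pythonic, so the learning curve is a bit steeper than urllib's.
Let's look at a simple example: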

#!/usr/bin/env python
# -*- coding: utf-8 -*-

'''
Created on Dec 15, 2013

@author: Jay
'''

import sys
import pycurl
import time

class Test:
    def __init__(self):
        # libcurl hands the body over as bytes, so accumulate bytes
        self.contents = b''

    def body_callback(self, buf):
        # called by libcurl once for every chunk of the response body
        self.contents = self.contents + buf

sys.stderr.write("Testing %s\n" % pycurl.version)

start_time = time.time()

url = 'http://www.dianping.com/shanghai'
t = Test()
c = pycurl.Curl()
c.setopt(c.URL, url)
c.setopt(c.WRITEFUNCTION, t.body_callback)
c.perform()
end_time = time.time()
duration = end_time - start_time
# HTTP status code and the last URL that was actually used
print('%s %s' % (c.getinfo(pycurl.HTTP_CODE), c.getinfo(pycurl.EFFECTIVE_URL)))
c.close()

print('pycurl takes %s seconds to get %s ' % (duration, url))

print('length of the content is %d' % len(t.contents))
#print(t.contents)
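
Since the post mentions both GET and POST, here is a minimal sketch of how the same steps might be wrapped in a reusable function. It assumes Python 3 and a PycURL version that supports the WRITEDATA option; the function name `fetch` and its `form` and `timeout` parameters are made up for illustration and are not part of PycURL.

```python
from io import BytesIO
from urllib.parse import urlencode

import pycurl


def fetch(url, form=None, timeout=30):
    """Return (status_code, body_bytes) for url.

    Hypothetical helper: sends a GET by default, or a form-encoded POST
    when `form` (a dict) is given.
    """
    buf = BytesIO()
    c = pycurl.Curl()
    try:
        c.setopt(c.URL, url)
        # Write the response body into the in-memory buffer.
        c.setopt(c.WRITEDATA, buf)
        # Follow HTTP redirects and give up after `timeout` seconds.
        c.setopt(c.FOLLOWLOCATION, True)
        c.setopt(c.TIMEOUT, timeout)
        if form is not None:
            # Setting POSTFIELDS turns the request into a POST.
            c.setopt(c.POSTFIELDS, urlencode(form))
        c.perform()
        status = c.getinfo(pycurl.RESPONSE_CODE)
    finally:
        # Always release the libcurl handle, even if perform() raised.
        c.close()
    return status, buf.getvalue()
```

For example, `status, body = fetch('http://www.dianping.com/shanghai')` would issue the same GET as the script above, while passing `form={'key': 'value'}` would send a POST instead. The body is returned as bytes; decoding it is left to the caller.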

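The concurrent requests mentioned earlier go through PycURL's CurlMulti object. Below is a rough sketch, assuming Python 3; the function name `fetch_all` and the shape of its return value are my own choices for illustration, and the retriever-multi.py example in the references is a more complete take on the same idea.

```python
from io import BytesIO

import pycurl


def fetch_all(urls, timeout=30):
    """Fetch several URLs concurrently (hypothetical helper).

    Returns a list of (url, status, body, error) tuples in input order;
    error is None when the transfer succeeded.
    """
    multi = pycurl.CurlMulti()
    transfers = []
    for url in urls:
        buf = BytesIO()
        c = pycurl.Curl()
        c.setopt(c.URL, url)
        c.setopt(c.WRITEDATA, buf)
        c.setopt(c.FOLLOWLOCATION, True)
        c.setopt(c.TIMEOUT, timeout)
        multi.add_handle(c)
        transfers.append((url, c, buf))

    try:
        active = len(transfers)
        while active:
            # Let libcurl move every transfer forward as far as it can.
            while True:
                ret, active = multi.perform()
                if ret != pycurl.E_CALL_MULTI_PERFORM:
                    break
            if active:
                # Wait (up to 1 s) for socket activity before the next round.
                multi.select(1.0)

        # Collect error messages for the transfers that failed.
        errors = {}
        while True:
            queued, _ok, failed = multi.info_read()
            for handle, _errno, errmsg in failed:
                errors[id(handle)] = errmsg
            if queued == 0:
                break

        return [
            (url, c.getinfo(pycurl.RESPONSE_CODE), buf.getvalue(),
             errors.get(id(c)))
            for url, c, buf in transfers
        ]
    finally:
        # Detach and release every handle, then the multi object itself.
        for _url, c, _buf in transfers:
            multi.remove_handle(c)
            c.close()
        multi.close()
```

All transfers share one loop in a single thread, so no threading is needed. For a large number of URLs it would probably be wise to cap how many handles are active at once, which this sketch does not do.
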
References:
PycURL home page: http://pycurl.sourceforge.net/
PycURL API: http://pycurl.sourceforge.net/doc/pycurl.html
An example of concurrent fetching: https://github.com/pycurl/pycurl/blob/master/examples/retriever-multi.py
libcurl C API: http://curl.haxx.se/libcurl/c/
