#import the library used to query a website
import urllib2
 
#specify the url you want to query
url = "http://www.python.org"
 
#Query the website and store the file-like response object in the variable 'page'
page = urllib2.urlopen(url)
 
#import the Beautiful soup functions to parse the data returned from the website
from BeautifulSoup import BeautifulSoup
 
#Parse the html in the 'page' variable, and store it in Beautiful Soup format
soup = BeautifulSoup(page)
 
#soup.head is the <head> tag and soup.head.title is the <title> tag; print both
print soup.head
print soup.head.title
 
#the response object in 'page' has no length, so measure the parsed document as a string instead
print len(str(soup))
 
#find every <a> (link) tag and store the resulting list in 'tags'
tags = soup.findAll('a')
 
#to print all the links
print tags
 
#find every <span class="titletext"> element and print the contents of each one
titles = soup.findAll('span', attrs = { 'class' : 'titletext' })
for title in titles:
    print title.contents
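
The script above targets Python 2 with BeautifulSoup 3. As a rough sketch only, the same steps might look like this on Python 3, assuming the `bs4` package (Beautiful Soup 4) is installed. The `SAMPLE_HTML` string and the `summarize` helper are hypothetical names introduced for illustration, and the inline markup stands in for a live request so the example runs offline.

```python
# Assumption: Python 3 with the bs4 package (Beautiful Soup 4) installed.
from bs4 import BeautifulSoup

# Hypothetical sample markup, used instead of a live request so the sketch runs offline.
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <a href="/about">About</a>
    <a href="/docs">Docs</a>
    <span class="titletext">First headline</span>
    <span class="titletext">Second headline</span>
  </body>
</html>
"""


def summarize(html):
    """Return the page title, the link targets, and the 'titletext' span texts."""
    # Name the parser explicitly; "html.parser" ships with Python.
    soup = BeautifulSoup(html, "html.parser")
    page_title = soup.head.title.get_text()
    # find_all is the Beautiful Soup 4 spelling of findAll.
    links = [a.get("href") for a in soup.find_all("a")]
    headlines = [
        span.get_text()
        for span in soup.find_all("span", attrs={"class": "titletext"})
    ]
    return page_title, links, headlines


page_title, links, headlines = summarize(SAMPLE_HTML)
print(page_title)
print(links)
print(headlines)
# A plain string has a length, unlike a response object.
print(len(SAMPLE_HTML))
```

For a live page, the markup could instead come from `urllib.request.urlopen(url).read()`, whose bytes can be passed straight to `BeautifulSoup`.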
[Python] Analyzing web page information with BeautifulSoup in Python
