[Python] Parsing web page information with BeautifulSoup in Python

Posted 2020-11-25 in Python, viewed 121 times.
URL http://www.code666.cn/view/0cc6928e
# Import the library used to query a website
import urllib2

# Import the Beautiful Soup class used to parse the data returned from the website
from BeautifulSoup import BeautifulSoup

# Specify the URL you want to query
url = "http://www.python.org"

# Query the website and read the returned HTML into the string 'html'
page = urllib2.urlopen(url)
html = page.read()

# Parse the HTML and store it as a Beautiful Soup object
soup = BeautifulSoup(html)

# soup.head is the <head> tag and soup.head.title is the <title> tag
print soup.head
print soup.head.title

# Print the length of the page; len() works on the HTML string, not on the response object
print len(html)

# Find every <a> tag in the document
tags = soup.findAll('a')

# Print all the <a> tags (the links)
print tags

# Find every <span class="titletext"> and print the contents of each one
titles = soup.findAll('span', attrs={'class': 'titletext'})
for title in titles:
    print title.contents
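
The listing above depends on Python 2 and the BeautifulSoup 3 package. As a rough sketch of the same two lookups (every `a` tag, and every `span` with class `titletext`) using only the Python 3 standard library, something like the following could work. This swaps BeautifulSoup for `html.parser.HTMLParser`. The class name `LinkTitleParser` and the `SAMPLE_HTML` string are made up for illustration, and the parser runs on an inline string rather than a live page so the example needs no network access. It is a simplification that does not handle nested spans.

```python
from html.parser import HTMLParser


class LinkTitleParser(HTMLParser):
    """Hypothetical helper: collects <a href> values and 'titletext' span text."""

    def __init__(self):
        super().__init__()
        self.links = []    # href values of every <a> tag seen
        self.titles = []   # text inside each <span class="titletext">
        self._in_title_span = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "span" and "titletext" in (attrs.get("class") or "").split():
            # Start collecting text for a new title entry
            self._in_title_span = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_title_span = False

    def handle_data(self, data):
        if self._in_title_span:
            self.titles[-1] += data


# Assumed sample input; a real page would be fetched with urllib.request.urlopen
SAMPLE_HTML = """
<html><head><title>Demo</title></head>
<body>
  <a href="/about">About</a>
  <span class="titletext">First title</span>
  <a href="/docs">Docs</a>
  <span class="titletext">Second title</span>
</body></html>
"""

parser = LinkTitleParser()
parser.feed(SAMPLE_HTML)
print(parser.links)
print(parser.titles)
```

For anything beyond simple, well-formed markup, a dedicated parser such as Beautiful Soup is usually the more forgiving choice, since it tolerates malformed HTML and offers richer search methods.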
