bsoup
bsoup defines a Beautiful Soup-like API for working with HTML documents.
Functions
parseHtml
parseHtml(html string) SoupNode
parseHtml parses HTML from a string and returns the root SoupNode.
Types
SoupNode
Methods
find
find(name, attrs, recursive, string, **kwargs)
Retrieves the first element that matches the arguments passed to find. Works similarly to Beautiful Soup's node.find().
find_all
find_all(name, attrs, recursive, string, limit, **kwargs)
Retrieves all descendants that match the arguments passed to find_all. Works similarly to Beautiful Soup's node.find_all().
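The difference between a first-match lookup and an all-matches lookup can be sketched with Python's standard library. This is only an illustration of the semantics described above, not bsoup's implementation: xml.etree.ElementTree stands in for the parser, the markup is assumed to be well formed, and every variable name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Well-formed markup, so the stdlib XML parser can stand in for an HTML parser.
DOC = (
    "<html><body>"
    "<p class='intro'>one</p>"
    "<div><p class='body'>two</p></div>"
    "</body></html>"
)
root = ET.fromstring(DOC)

# find-style lookup: first matching descendant in document order, or None.
first_p = next(root.iter("p"), None)

# find_all-style lookup: every matching descendant.
all_p = list(root.iter("p"))

# A limit caps the number of results returned.
limited = all_p[:1]

# Filtering on attributes narrows the match, as an attrs argument would.
body_ps = [p for p in root.iter("p") if p.attrib.get("class") == "body"]

# A non-recursive search looks at direct children only.
direct_ps = root.find("body").findall("p")

print(first_p.text, [p.text for p in all_p], [p.text for p in direct_ps])
```

The nested paragraph shows up in the recursive search but not in the direct-children search, which is the distinction a recursive flag is meant to control.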
attrs
attrs()
Gets a dictionary of the element's attributes. Works similarly to Beautiful Soup's node.attrs.
contents
contents()
Gets the list of an element's children. Works similarly to Beautiful Soup's soup.contents.
child
child()
Gets a single child element with the given tag name. Works like accessing a child node by its tag name in Beautiful Soup.
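As a rough illustration of what attribute, children, and child-by-name access return, here is a standard-library sketch. ElementTree is used as a stand-in and the names are hypothetical; bsoup's own return values may differ in detail.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<ul id='nav' class='menu'><li>home</li><li>docs</li></ul>")

# attrs-style access: the element's attributes as a plain dictionary.
attributes = dict(root.attrib)

# contents-style access: the element's direct children, in document order.
children = list(root)

# child-style access: the first direct child with the given tag name, or None.
first_li = root.find("li")
missing = root.find("span")

print(attributes, [c.text for c in children], first_li.text, missing)
```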
parent
parent()
Gets the parent node of an element. Works like Beautiful Soup's node.parent.
next_sibling
next_sibling()
Gets the next sibling of an element. Works like Beautiful Soup's node.next_sibling.
prev_sibling
prev_sibling()
Gets the previous sibling of an element. Works like Beautiful Soup's node.previous_sibling.
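Parent and sibling navigation can be sketched as follows, again with ElementTree as a stand-in rather than bsoup itself. The helper sibling and the parent_of map are hypothetical names introduced for this example. Note that ElementTree only models element nodes, whereas Beautiful Soup also treats text between tags as siblings.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<ul><li>a</li><li>b</li><li>c</li></ul>")

# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def sibling(node, offset):
    """Return the sibling `offset` positions away, or None at either end."""
    parent = parent_of.get(node)
    if parent is None:
        return None  # the root has no parent and therefore no siblings
    siblings = list(parent)
    index = siblings.index(node) + offset
    return siblings[index] if 0 <= index < len(siblings) else None


first, middle, last = list(root)

print(parent_of[middle].tag)      # parent-style access
print(sibling(middle, 1).text)    # next_sibling-style access
print(sibling(middle, -1).text)   # prev_sibling-style access
```

Returning None at the ends of the child list mirrors how sibling lookups usually signal that there is nothing further in that direction.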
get_text
get_text()
Gets all the text in a document or beneath a tag as a single Unicode string. Works like Beautiful Soup's soup.get_text().
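Text extraction of this kind amounts to concatenating every text fragment beneath a node. A minimal standard-library sketch, assuming well-formed markup and using ElementTree as a stand-in for bsoup:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<p>Hello, <b>bold</b> world</p>")

# Join every text fragment beneath the node, in document order.
whole_text = "".join(root.itertext())

# The same call on a nested tag returns only the text beneath that tag.
bold_text = "".join(root.find("b").itertext())

print(whole_text)
print(bold_text)
```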