bsoup

bsoup defines a Beautiful Soup-like API for working with HTML documents.

Functions

parseHtml

parseHtml(html string) SoupNode

parseHtml parses HTML from a string and returns the root SoupNode.

Types

SoupNode

Methods

find

find(name, attrs, recursive, string, **kwargs)

Retrieves the first element that matches the arguments passed to find. Works similarly to Beautiful Soup's node.find().

find_all

find_all(name, attrs, recursive, string, limit, **kwargs)

Retrieves all descendants that match the arguments passed to find_all. Works similarly to Beautiful Soup's node.find_all().
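As a rough illustration of how find and find_all relate, the Python sketch below models the matching behaviour using only the standard-library html.parser module. The Node class, the parse_html helper, and the reduced argument list (no string or **kwargs matching) are assumptions made for this example; it is not the bsoup implementation, and it assumes every tag in the input is explicitly closed.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical stand-in for SoupNode; not the bsoup implementation."""

    def __init__(self, name, attrs=None, parent=None):
        self.name = name
        self._attrs = dict(attrs or {})
        self._parent = parent
        self._children = []  # child Node objects and plain str text

    def _walk(self, recursive):
        """Yield element children, and deeper descendants if recursive."""
        for child in self._children:
            if isinstance(child, Node):
                yield child
                if recursive:
                    yield from child._walk(True)

    def find_all(self, name=None, attrs=None, recursive=True, limit=None):
        """Return matching nodes in document order, up to limit."""
        wanted = attrs or {}
        matches = []
        for node in self._walk(recursive):
            if name is not None and node.name != name:
                continue
            if any(node._attrs.get(k) != v for k, v in wanted.items()):
                continue
            matches.append(node)
            if limit is not None and len(matches) >= limit:
                break
        return matches

    def find(self, name=None, attrs=None, recursive=True):
        """Return the first match, or None when nothing matches."""
        hits = self.find_all(name, attrs, recursive, limit=1)
        return hits[0] if hits else None


class _TreeBuilder(HTMLParser):
    """Builds a Node tree; assumes well-formed, explicitly closed tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("[document]")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current._children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        if self.current._parent is not None:
            self.current = self.current._parent

    def handle_data(self, data):
        self.current._children.append(data)


def parse_html(html):
    """Hypothetical analogue of parseHtml: returns the root Node."""
    builder = _TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


root = parse_html('<ul><li class="a">one</li><li class="b">two</li></ul>')

print(root.find("li")._attrs)                              # {'class': 'a'}
print([n._attrs["class"] for n in root.find_all("li")])    # ['a', 'b']
print(len(root.find_all("li", limit=1)))                   # 1
print(root.find_all("li", recursive=False))                # []
```

In this model find is simply find_all with a limit of one, and recursive=False restricts the search to direct children, which is why the last call finds nothing: the li elements sit one level below the ul.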

attrs

attrs()

Gets a dictionary of the element's attributes. Works similarly to Beautiful Soup's node.attrs.

contents

contents()

Gets the list of children of an element. Works similarly to Beautiful Soup's soup.contents.

child

child()

Gets a single child element with the given tag name. Works like accessing a node by its tag name in Beautiful Soup.
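The three accessors above can be pictured with a small hand-built tree. This Python sketch is only a model: the Node class is hypothetical, passing the tag name to child as an argument is an assumption (the signature above lists no parameters), and searching only direct children is likewise an assumption.

```python
class Node:
    """Hypothetical stand-in for SoupNode, built by hand for illustration."""

    def __init__(self, name, attrs=None, children=()):
        self.name = name
        self._attrs = dict(attrs or {})
        self._children = list(children)  # Node objects or plain str text

    def attrs(self):
        """Return a copy of the element's attributes as a dictionary."""
        return dict(self._attrs)

    def contents(self):
        """Return the direct children, in document order."""
        return list(self._children)

    def child(self, name):
        """Return the first direct child element named `name`, or None.

        Assumption: the tag name is passed as an argument and only
        direct children are searched.
        """
        for candidate in self._children:
            if isinstance(candidate, Node) and candidate.name == name:
                return candidate
        return None


# Models <a href="/docs" class="nav"><span>Docs</span> and more</a>
span = Node("span", children=["Docs"])
link = Node("a", {"href": "/docs", "class": "nav"}, [span, " and more"])

print(link.attrs())                 # {'href': '/docs', 'class': 'nav'}
print(len(link.contents()))         # 2
print(link.child("span") is span)   # True
print(link.child("img"))            # None
```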

parent

parent()

Gets the parent node of an element. Works like Beautiful Soup's node.parent.

next_sibling

next_sibling()

Gets the next sibling of an element. Works like Beautiful Soup's node.next_sibling.

prev_sibling

prev_sibling()

Gets the previous sibling of an element. Works like Beautiful Soup's node.previous_sibling.

get_text

get_text()

Returns all the text in a document or beneath a tag as a single Unicode string. Works like Beautiful Soup's soup.get_text().
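To make the navigation methods and get_text concrete, here is a minimal Python sketch over a hand-built tree. The Node class and its internals are hypothetical and only model the behaviour described above; how bsoup treats whitespace and text nodes between siblings is not specified here, so the sketch does not try to reproduce it.

```python
class Node:
    """Hypothetical stand-in for SoupNode, built by hand for illustration."""

    def __init__(self, name, *children):
        self.name = name
        self._parent = None
        self._children = list(children)  # Node objects or plain str text
        for child in self._children:
            if isinstance(child, Node):
                child._parent = self

    def parent(self):
        """Return the enclosing node, or None at the root."""
        return self._parent

    def _sibling(self, offset):
        if self._parent is None:
            return None
        siblings = self._parent._children
        # Locate this node by identity, then step by the offset.
        index = next(i for i, s in enumerate(siblings) if s is self) + offset
        return siblings[index] if 0 <= index < len(siblings) else None

    def next_sibling(self):
        """Return the node that follows this one under the same parent."""
        return self._sibling(+1)

    def prev_sibling(self):
        """Return the node that precedes this one under the same parent."""
        return self._sibling(-1)

    def get_text(self):
        """Concatenate every text fragment beneath this node."""
        parts = []
        for child in self._children:
            parts.append(child.get_text() if isinstance(child, Node) else child)
        return "".join(parts)


# Models <div><h1>Title</h1><p>Hello <b>world</b></p></div>
h1 = Node("h1", "Title")
p = Node("p", "Hello ", Node("b", "world"))
div = Node("div", h1, p)

print(h1.next_sibling().name)   # p
print(p.prev_sibling().name)    # h1
print(p.parent().name)          # div
print(h1.prev_sibling())        # None
print(div.get_text())           # TitleHello world
```

Note that get_text joins fragments with no separator, so text from adjacent elements runs together, as in the last line of output.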