Starlark Standard Library

All qri transforms have access to a standard library of modules. You can use standard library modules to simplify common tasks in transform scripts.

Starlib is a community-driven project to bring a standard library to the starlark programming dialect. Qri needs a standard library, and we thought it might benefit others to structure this library in a reusable way. Starlib is admittedly biased towards Qri's needs. Comments, suggestions & pull requests welcome: https://github.com/qri-io/starlib

Using Standard Library Modules

To load a module within a starlark script, use the load statement. All module loading should happen in the first lines of your script. For example, loading the encoding/json module would look like this:

load("encoding/json", "json")

The first argument ("encoding/json" in this case) is the location of the module. All module locations are listed below. The second argument ("json" in this case) is a symbol the script will access. After loading the json symbol, it's available for use. Here's a simplistic example:

load("encoding/json", "json")
decoded = json.loads("[1,2,3]")
print(decoded)

This script would output [1,2,3].

The Qri runtime doesn't (yet) support user-defined modules. Instead, define any necessary extensions & functions within the transform script itself.

Starlib Packages

| package name | description |
|--------------|-------------|
| bsoup | bsoup defines a beautiful-soup-like API for working with HTML documents |
| encoding/base64 | base64 defines base64 encoding & decoding functions, often used to represent binary as text. |
| encoding/csv | csv reads comma-separated values files |
| encoding/json | json provides functions for working with json data |
| encoding/yaml | yaml provides functions for working with yaml data |
| geo | geo defines geographic operations in two-dimensional space |
| html | html defines jquery-like html selection & iteration functions for HTML documents |
| http | http defines an HTTP client implementation |
| math | math defines mathematical functions, it's intended to be a drop-in subset of python's math module for starlark: https://docs.python.org/3/library/math.html |
| re | re defines regular expression functions, it's intended to be a drop-in subset of python's re module for starlark: https://docs.python.org/3/library/re.html |
| time | time defines time primitives for starlark |
| xlsx | xlsx implements excel file readers in starlark. currently a highly-experimental package that will definitely change at some point in the future |
| zipfile | zipfile reads & parses zip archives |

bsoup

bsoup defines a beautiful-soup-like API for working with HTML documents

Functions

bsoup.parseHtml(html string) SoupNode

parseHtml parses html from a string, returning the root SoupNode

Types

SoupNode

Methods

SoupNode.find(name, attrs, recursive, string, **kwargs)

retrieves the first occurrence of an element that matches arguments passed to find. works similarly to node.find()

SoupNode.find_all(name, attrs, recursive, string, limit, **kwargs)

retrieves all descendants that match arguments passed to find_all. works similarly to node.find_all()

SoupNode.attrs()

gets a dictionary of element attributes. works similarly to node.attrs

SoupNode.contents()

gets the list of children of an element. works similarly to soup.contents

SoupNode.child()

gets a single child element with the given tag name. works like accessing a node using its tag name

SoupNode.parent()

gets the parent node of an element. works like node.parent

SoupNode.next_sibling()

gets the next sibling of an element. works like node.next_sibling

SoupNode.prev_sibling()

gets the previous sibling of an element. works like node.prev_sibling

SoupNode.get_text()

gets all the text in a document or beneath a tag, as a single Unicode string. works like soup.get_text


encoding/base64

base64 defines base64 encoding & decoding functions, often used to represent binary as text.

Functions

base64.decode(src,encoding="standard") string

parse base64 input, giving back the plain string representation

parameters:

| name | type | description |
|------|------|-------------|
| src | string | source string of base64-encoded text |
| encoding | string | optional. string to set decoding dialect. allowed values are: standard,standard_raw,url,url_raw |

base64.encode(src,encoding="standard") string

return the base64 encoding of src

parameters:

| name | type | description |
|------|------|-------------|
| src | string | source string to encode to base64 |
| encoding | string | optional. string to set encoding dialect. allowed values are: standard,standard_raw,url,url_raw |
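
The four dialect names look like the padded and unpadded ("raw") forms of the standard and URL-safe base64 alphabets. That mapping is an assumption, not something this page states. The sketch below illustrates it in Python rather than starlark; the `encode` helper is hypothetical and is not the starlib implementation.

```python
import base64


def encode(src, encoding="standard"):
    """Hypothetical stand-in for base64.encode.

    Assumes "standard"/"url" pick the alphabet and the "_raw"
    variants drop the trailing "=" padding.
    """
    data = src.encode("utf-8")
    if encoding in ("standard", "standard_raw"):
        out = base64.b64encode(data)
    elif encoding in ("url", "url_raw"):
        out = base64.urlsafe_b64encode(data)
    else:
        raise ValueError("unsupported encoding: " + encoding)
    text = out.decode("ascii")
    if encoding.endswith("_raw"):
        # assumed meaning of "raw": no "=" padding
        text = text.rstrip("=")
    return text


padded = encode("hi")                   # standard alphabet, padded
unpadded = encode("hi", "standard_raw") # same, padding removed
url_safe = encode("??>", "url")         # "-" replaces "+"
```

Under these assumptions, the two alphabets only differ for inputs whose encoding contains `+` or `/`.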


encoding/csv

csv reads comma-separated values files

Functions

csv.read_all(source, comma=",", comment="", lazy_quotes=False, trim_leading_space=False, fields_per_record=0, skip=0) [][]string

read all rows from a source string, returning a list of string lists

parameters:

| name | type | description |
|------|------|-------------|
| source | string | input string of csv data |
| comma | string | comma is the field delimiter, defaults to "," (a comma). comma must be a valid character and must not be \r, \n, or the Unicode replacement character (0xFFFD). |
| comment | string | comment, if not "", is the comment character. Lines beginning with the comment character without preceding whitespace are ignored. With leading whitespace the comment character becomes part of the field, even if trim_leading_space is True. comment must be a valid character and must not be \r, \n, or the Unicode replacement character (0xFFFD). It must also not be equal to comma. |
| lazy_quotes | bool | If lazy_quotes is True, a quote may appear in an unquoted field and a non-doubled quote may appear in a quoted field. |
| trim_leading_space | bool | If trim_leading_space is True, leading white space in a field is ignored. This is done even if the field delimiter, comma, is white space. |
| fields_per_record | int | fields_per_record is the number of expected fields per record. If fields_per_record is positive, read_all requires each record to have the given number of fields. If fields_per_record is 0, read_all sets it to the number of fields in the first record, so that future records must have the same field count. If fields_per_record is negative, no check is made and records may have a variable number of fields. |
| skip | int | number of rows to skip, omitting from returned rows |
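
The three cases of fields_per_record are easier to see in code. The sketch below re-creates the rule in Python with a hypothetical `read_all` helper; it is not the starlib implementation, and applying `skip` after the field-count check is an assumption.

```python
import csv
import io


def read_all(source, fields_per_record=0, skip=0):
    """Hypothetical sketch of the fields_per_record and skip rules."""
    rows = list(csv.reader(io.StringIO(source)))
    expected = fields_per_record
    for i, row in enumerate(rows):
        if expected == 0:
            # 0 means: lock to the field count of the first record
            expected = len(row)
        if expected > 0 and len(row) != expected:
            raise ValueError("record %d: wrong number of fields" % (i + 1))
        # a negative value disables the check entirely
    return rows[skip:]


without_header = read_all("a,b\n1,2\n", skip=1)
ragged = read_all("a,b\n1\n", fields_per_record=-1)
```

With the default of 0, the ragged input above would raise an error, because the second record has fewer fields than the first.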

csv.write_all(source,comma=",") string

write all rows from source to a csv-encoded string

parameters:

| name | type | description |
|------|------|-------------|
| source | [][]string | array of arrays of strings to write to csv |
| comma | string | comma is the field delimiter, defaults to "," (a comma). comma must be a valid character and must not be \r, \n, or the Unicode replacement character (0xFFFD). |


encoding/json

json provides functions for working with json data

Functions

json.dumps(obj) string

serialize obj to a JSON string

parameters:

| name | type | description |
|------|------|-------------|
| obj | object | input object |

json.loads(source) object

read a source JSON string to a starlark object

parameters:

| name | type | description |
|------|------|-------------|
| source | string | input string of json data |
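
The function names mirror Python's json module, so a decode, modify, encode round trip can be sketched in Python. This is an illustration only: it assumes the starlark calls behave the same way once the module is loaded, and the sample data is made up.

```python
import json

# decode a JSON string into native values
decoded = json.loads('{"name": "qri", "tags": ["data", "versioning"]}')

# work with the result like any dict or list
decoded["tags"].append("starlark")

# serialize back to a JSON string
encoded = json.dumps(decoded)
```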


encoding/yaml

yaml provides functions for working with yaml data

Functions

yaml.dumps(obj) string

serialize obj to a yaml string

parameters:

| name | type | description |
|------|------|-------------|
| obj | object | input object |

yaml.loads(source) object

read a source yaml string to a starlark object

parameters:

| name | type | description |
|------|------|-------------|
| source | string | input string of yaml data |


geo

geo defines geographic operations in two-dimensional space

Functions

geo.Line(points) Line

Line constructor. Takes either an array of coordinate pairs or an array of point objects and returns the line that connects them. Points do not need to be collinear. Providing a single point returns a line with a length of 0.

parameters:

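| name | type | description |
|------|------|-------------|
| points | [[]float] or [Point] | coordinate pairs or point objects that define the line |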

geo.MultiPolygon(polygons) MultiPolygon

MultiPolygon constructor. MultiPolygon groups a list of polygons to behave like a single polygon

parameters:

| name | type | description |
|------|------|-------------|
| polygons | [Polygon] | |

geo.Point(x,y) Point

Point constructor. Takes an x (longitude) and y (latitude) value and returns a Point object

parameters:

| name | type | description |
|------|------|-------------|
| x | float | x-dimension value (longitude if using geodesic space) |
| y | float | y-dimension value (latitude if using geodesic space) |

geo.Polygon(rings) Polygon

Polygon constructor. Takes a list of lists of coordinate pairs (or point objects) that define the outer boundary and any holes / inner boundaries that represent a polygon. In GIS tradition, lists of coordinates that wind clockwise are filled regions and anti-clockwise represent holes.

parameters:

| name | type | description |
|------|------|-------------|
| rings | [Line] | list of closed lines that constitute the polygon |
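
Winding order can be checked with the shoelace formula. The Python sketch below is a generic illustration and not part of starlib: `signed_area` and `is_clockwise` are hypothetical helpers, rings are assumed to be closed (first point repeated at the end), and the y axis is assumed to point up.

```python
def signed_area(ring):
    """Shoelace formula over a closed ring of (x, y) pairs.

    Positive for anti-clockwise rings, negative for clockwise ones
    (with the y axis pointing up).
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(ring):
    return signed_area(ring) < 0


# up, right, down, left: a clockwise square, i.e. a filled region
outer = [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)]
# the same square traversed in reverse: anti-clockwise, i.e. a hole
hole = list(reversed(outer))
```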

geo.parseGeoJSON(data) (geoms, properties)

Parses string data in IETF-7946 format (https://tools.ietf.org/html/rfc7946) returning a list of geometries and equal-length list of properties for each geometry

parameters:

| name | type | description |
|------|------|-------------|
| data | string | string of GeoJSON data |
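
The shape of the return value, two parallel lists, can be sketched in Python. The `parse_geojson` helper is hypothetical, it only handles a FeatureCollection, and it returns plain dicts where starlib presumably returns geometry objects.

```python
import json


def parse_geojson(data):
    """Hypothetical sketch: split a FeatureCollection into parallel lists."""
    doc = json.loads(data)
    geoms, properties = [], []
    for feature in doc.get("features", []):
        geoms.append(feature["geometry"])
        # a feature's properties member may be null in RFC 7946
        properties.append(feature.get("properties") or {})
    return geoms, properties


example = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [1.0, 2.0]},
   "properties": {"name": "a"}}
]}
"""
geoms, properties = parse_geojson(example)
```

Because the lists have equal length, `properties[i]` always describes `geoms[i]`.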

geo.within(geom,polygon) bool

Returns True if geom is entirely contained by polygon

parameters:

| name | type | description |
|------|------|-------------|
| geom | [point,line,polygon] | maybe-inner geometry |
| polygon | [Polygon,MultiPolygon] | maybe-outer polygon |

Types

Line

an ordered list of points that define a line

Methods

Line.length() float

Euclidean Length

Line.lengthGeodesic() float

Line length on the surface of a sphere with the same radius as Earth

MultiPolygon

MultiPolygon groups a list of polygons to behave like a single polygon

Point

a two-dimensional point in space

Methods

Point.distance(p2) float

Euclidean Distance to the other point

parameters:

| name | type | description |
|------|------|-------------|
| p2 | point | point to measure distance to |

Point.distanceGeodesic(p2) float

Distance on the surface of a sphere with the same radius as Earth

parameters:

| name | type | description |
|------|------|-------------|
| p2 | point | point to measure distance to |
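
One common way to compute a distance on a sphere is the haversine formula. The Python sketch below is an assumption about the general technique: this page does not say which formula, Earth radius, or unit starlib uses. The helper name and the radius constant are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius, in metres


def distance_geodesic(p1, p2):
    """Haversine distance between two (longitude, latitude) pairs in degrees."""
    lng1, lat1 = p1
    lng2, lat2 = p2
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # clamp to guard against floating point drift just above 1
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


# from the equator to the north pole: a quarter of the circumference
quarter = distance_geodesic((0, 0), (0, 90))
```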

Polygon

an ordered list of closed lines (rings) that define a shape. lists of coordinates that wind clockwise are filled regions and anti-clockwise represent holes.

html

html defines jquery-like html selection & iteration functions for HTML documents

Functions

html.html(markup) selection

parse an html document returning a selection at the root of the document

parameters:

| name | type | description |
|------|------|-------------|
| markup | string | html text to build a document from |

Types

selection

an HTML document for querying

Methods

selection.attr(name) string

gets the specified attribute's value for the first element in the Selection. To get the value for each element individually, use a looping construct such as each or map method

parameters:

| name | type | description |
|------|------|-------------|
| name | string | attribute name to get the value of |

selection.children() selection

gets the child elements of each element in the Selection

selection.children_filtered(selector) selection

gets the child elements of each element in the Selection, filtered by the specified selector

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.contents(selector) selection

gets the children of each element in the Selection, including text and comment nodes

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.find(selector) selection

gets the descendants of each element in the current set of matched elements, filtered by a selector

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.filter(selector) selection

filter reduces the set of matched elements to those that match the selector string

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.get(i) selection

retrieves the underlying node at the specified index. alias: eq

parameters:

| name | type | description |
|------|------|-------------|
| i | int | numerical index of node to get |

selection.has(selector) selection

reduces the set of matched elements to those that have a descendant that matches the selector

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.parent(selector) selection

gets the parent of each element in the Selection

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.parents_until(selector) selection

gets the ancestors of each element in the Selection, up to but not including the element matched by the selector

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.siblings() selection

gets the siblings of each element in the Selection

selection.text() string

gets the combined text contents of each element in the set of matched elements, including descendants

selection.first(selector) selection

gets the first element of the selection

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.last() selection

gets the last element of the selection

parameters:

| name | type | description |
|------|------|-------------|
| selector | string | a query selector string to filter the current selection, returning a new selection |

selection.len() int

returns the number of the nodes in the selection

selection.eq(i) selection

gets the element at index i of the selection

parameters:

| name | type | description |
|------|------|-------------|
| i | int | numerical index of node to get |


http

http defines an HTTP client implementation

Functions

http.delete(url,params={},headers={},body="",form_body={},json_body={},auth=()) response

perform an HTTP DELETE request, returning a response

parameters:

| name | type | description |
|------|------|-------------|
| url | string | url to request |
| headers | dict | optional. dictionary of headers to add to request |
| body | string | optional. raw string body to provide to the request |
| form_body | dict | optional. dict of values that will be encoded as form data |
| json_body | any | optional. json data to supply as a request. handy for working with JSON-API's |
| auth | tuple | optional. (username,password) tuple for http basic authorization |

http.get(url,params={},headers={},auth=()) response

perform an HTTP GET request, returning a response

parameters:

| name | type | description |
|------|------|-------------|
| url | string | url to request |
| headers | dict | optional. dictionary of headers to add to request |
| auth | tuple | optional. (username,password) tuple for http basic authorization |
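
To make the params and auth arguments concrete, the Python sketch below shows how a query string and an HTTP basic authorization header are conventionally built. It sends nothing over the network. The `build_get` helper is hypothetical, and how starlib itself encodes params is an assumption.

```python
import base64
from urllib.parse import urlencode


def build_get(url, params=None, headers=None, auth=()):
    """Hypothetical sketch: return (full_url, headers) without sending anything."""
    headers = dict(headers or {})
    if params:
        # params are assumed to become the URL query string
        url = url + "?" + urlencode(params)
    if auth:
        username, password = auth
        credentials = (username + ":" + password).encode("utf-8")
        # basic auth: base64 of "username:password"
        headers["Authorization"] = "Basic " + base64.b64encode(credentials).decode("ascii")
    return url, headers


full_url, request_headers = build_get(
    "https://example.com/api",
    params={"q": "qri"},
    auth=("user", "pass"),
)
```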

http.options(url,params={},headers={},body="",form_body={},json_body={},auth=()) response

perform an HTTP OPTIONS request, returning a response

parameters:

| name | type | description |
|------|------|-------------|
| url | string | url to request |
| headers | dict | optional. dictionary of headers to add to request |
| body | string | optional. raw string body to provide to the request |
| form_body | dict | optional. dict of values that will be encoded as form data |
| json_body | any | optional. json data to supply as a request. handy for working with JSON-API's |
| auth | tuple | optional. (username,password) tuple for http basic authorization |

http.patch(url,params={},headers={},body="",form_body={},json_body={},auth=()) response

perform an HTTP PATCH request, returning a response

parameters:

| name | type | description |
|------|------|-------------|
| url | string | url to request |
| headers | dict | optional. dictionary of headers to add to request |
| body | string | optional. raw string body to provide to the request |
| form_body | dict | optional. dict of values that will be encoded as form data |
| json_body | any | optional. json data to supply as a request. handy for working with JSON-API's |
| auth | tuple | optional. (username,password) tuple for http basic authorization |

http.post(url,params={},headers={},body="",form_body={},json_body={},auth=()) response

perform an HTTP POST request, returning a response

parameters:

| name | type | description |
|------|------|-------------|
| url | string | url to request |
| headers | dict | optional. dictionary of headers to add to request |
| body | string | optional. raw string body to provide to the request |
| form_body | dict | optional. dict of values that will be encoded as form data |
| json_body | any | optional. json data to supply as a request. handy for working with JSON-API's |
| auth | tuple | optional. (username,password) tuple for http basic authorization |

http.put(url,params={},headers={},body="",form_body={},json_body={},auth=()) response

perform an HTTP PUT request, returning a response

parameters:

| name | type | description |
|------|------|-------------|
| url | string | url to request |
| headers | dict | optional. dictionary of headers to add to request |
| body | string | optional. raw string body to provide to the request |
| form_body | dict | optional. dict of values that will be encoded as form data |
| json_body | any | optional. json data to supply as a request. handy for working with JSON-API's |
| auth | tuple | optional. (username,password) tuple for http basic authorization |

Types

response

the result of performing a http request

Fields

| name | type | description |
|------|------|-------------|
| url | string | the url that was ultimately requested (may change after redirects) |
| status_code | int | response status code (for example: 200 == OK) |
| headers | dict | dictionary of response headers |
| encoding | string | transfer encoding. example: "octet-stream" or "application/json" |

Methods

response.body() string

output response body as a string

response.json()

attempt to parse response body as json, returning a JSON-decoded result


math

math defines mathematical functions, it's intended to be a drop-in subset of python's math module for starlark: https://docs.python.org/3/library/math.html

Functions

math.acos(x)

Return the arc cosine of x, in radians.

math.acosh(x)

Return the inverse hyperbolic cosine of x.

math.asin(x)

Return the arc sine of x, in radians.

math.asinh(x)

Return the inverse hyperbolic sine of x.

math.atan(x)

Return the arc tangent of x, in radians.

math.atan2(y, x)

Return atan(y / x), in radians. The result is between -pi and pi. The vector in the plane from the origin to point (x, y) makes this angle with the positive X axis. The point of atan2() is that the signs of both inputs are known to it, so it can compute the correct quadrant for the angle. For example, atan(1) and atan2(1, 1) are both pi/4, but atan2(-1, -1) is -3*pi/4.
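
Since the module is intended as a drop-in subset of Python's math module, the quadrant behaviour can be checked in Python. This sketch assumes the starlark function gives the same results.

```python
import math

# atan only sees the ratio y / x, so it cannot tell quadrants apart
first_quadrant = math.atan2(1, 1)     # pi / 4
same_ratio = math.atan(-1 / -1)       # also pi / 4: the signs cancel out
third_quadrant = math.atan2(-1, -1)   # -3 * pi / 4
```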

math.atanh(x)

Return the inverse hyperbolic tangent of x.

math.ceil(x)

Return the ceiling of x, the smallest integer greater than or equal to x.

math.cos(x)

Return the cosine of x radians.

math.cosh(x)

Return the hyperbolic cosine of x.

math.degrees(x)

Convert angle x from radians to degrees.

math.exp(x)

Return e raised to the power x, where e = 2.718281… is the base of natural logarithms

math.fabs(x)

Return the absolute value of x.

math.floor(x)

Return the floor of x, the largest integer less than or equal to x.

math.hypot(x, y)

Return the Euclidean norm, sqrt(x*x + y*y). This is the length of the vector from the origin to point (x, y).

math.radians(x)

Convert angle x from degrees to radians.

math.round(x)

Returns the nearest integer, rounding half away from zero.
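
Rounding half away from zero differs from Python 3's built-in round, which rounds halves to the nearest even integer. The Python sketch below shows the documented rule with a hypothetical helper; it is not the starlib implementation.

```python
import math


def round_half_away(x):
    """Round to the nearest integer, with halves going away from zero."""
    if x >= 0:
        return int(math.floor(x + 0.5))
    return int(math.ceil(x - 0.5))


up = round_half_away(2.5)      # 3
down = round_half_away(-2.5)   # -3
bankers = round(2.5)           # 2 in Python 3 (round half to even)
```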

math.sin(x)

Return the sine of x radians.

math.sinh(x)

Return the hyperbolic sine of x.

math.sqrt(x)

Return the square root of x.

math.tan(x)

Return the tangent of x radians.

math.tanh(x)

Return the hyperbolic tangent of x.


re

re defines regular expression functions, it's intended to be a drop-in subset of python's re module for starlark: https://docs.python.org/3/library/re.html

Functions

re.findall(pattern, text, flags=0)

Returns all non-overlapping matches of pattern in text, as a list of strings. The text is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.

parameters:

| name | type | description |
|------|------|-------------|
| pattern | string | regular expression pattern string |
| text | string | string to find within |
| flags | int | integer flags to control regex behaviour. reserved for future use |
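
The effect of groups on the return value is shown here with Python's re.findall, which this function is modeled on. The sketch assumes the starlark version follows the same rules.

```python
import re

# no groups: a list of whole matches
words = re.findall(r"\w+", "qri and starlark")

# one group: a list of that group's text
ids = re.findall(r"id=(\d+)", "id=7 id=42")

# several groups: a list of tuples, one tuple per match
pairs = re.findall(r"(\w+)=(\d+)", "a=1, b=22")
```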

re.split(pattern, text, maxsplit=0, flags=0)

Split text by the occurrences of pattern. If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list.

parameters:

| name | type | description |
|------|------|-------------|
| pattern | string | regular expression pattern string |
| text | string | input string to split |
| maxsplit | int | maximum number of splits to make. default 0 splits all matches |
| flags | int | integer flags to control regex behaviour. reserved for future use |
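
Capturing parentheses and maxsplit both change the shape of the result. The sketch uses Python's re.split and assumes the starlark version behaves the same way.

```python
import re

# plain split: separators are dropped
parts = re.split(r",\s*", "a, b,c")

# capturing parentheses: separators are kept in the result
with_separators = re.split(r"(-)", "1-2-3")

# maxsplit=1: one split, the remainder is the final element
limited = re.split(r",", "a,b,c,d", maxsplit=1)
```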

re.sub(pattern, repl, text, count=0, flags=0)

Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in text by the replacement repl. If the pattern isn't found, text is returned unchanged. repl can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, \n is converted to a single newline character, \r is converted to a carriage return, and so forth.

parameters:

| name | type | description |
|------|------|-------------|
| pattern | string | regular expression pattern string |
| repl | string | string to replace matches with |
| text | string | input string to replace |
| count | int | number of replacements to make, default 0 means replace all matches |
| flags | int | integer flags to control regex behaviour. reserved for future use |
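
The count argument is easiest to see side by side. The sketch uses Python's re.sub, which takes its arguments in the same order, and assumes the starlark version behaves the same way.

```python
import re

# count defaults to 0: every match is replaced
all_replaced = re.sub(r"\d+", "#", "a1 b22 c333")

# count=1: only the leftmost match is replaced
first_only = re.sub(r"\d+", "#", "a1 b22 c333", count=1)

# no match: the input comes back unchanged
unchanged = re.sub(r"x", "#", "abc")
```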


time

time defines time primitives for starlark

Functions

time.duration(string) duration

parse a duration

time.location(string) location

parse a location

time.now() time

get the current time. implementations would be able to make this a constant

time.time(string, format=..., location=...) time

parse a time

time.zero() time

a constant

Types

duration

Fields

| name | type | description |
|------|------|-------------|
| hours | float | |
| minutes | float | |
| nanoseconds | int | |
| seconds | float | |

format(string) string

textual representation of time formatted according to the provided layout string

in_location(string) time

get time representing the same instant but in a different location

time

Methods

time.year() int

time.month() int

time.day() int

time.hour() int

time.minute() int

time.second() int

time.nanosecond() int


xlsx

xlsx implements excel file readers in starlark. currently a highly-experimental package that will definitely change at some point in the future

Functions

xlsx.get_url(url string)

fetch an excel file from a url

Types

File

an excel file

Methods

File.get_sheets() dict

return a dict of sheets in this excel file

File.get_rows(sheetname) list

get all populated rows / columns as a list-of-list strings


zipfile

zipfile reads & parses zip archives

Functions

zipfile.ZipFile(data)

opens an archive for reading

Types

ZipFile

a zip archive object

Methods

ZipFile.namelist() list

return a list of files in the archive

ZipFile.open(filename string) ZipInfo

open a file for reading

parameters:

| name | type | description |
|------|------|-------------|
| filename | string | name of the file in the archive to open |

ZipInfo

Methods

ZipInfo.read() string

read the file, returning its string representation
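
The open, list, read flow has the same shape as Python's zipfile module, so it can be sketched in Python. This is an illustration, not starlark: the archive is built in memory just to have data to read, and Python's read returns bytes where this page documents a string.

```python
import io
import zipfile

# build a small archive in memory to stand in for downloaded zip data
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as writer:
    writer.writestr("hello.txt", "hello world")
data = buffer.getvalue()

# open the archive, list its files, then read one of them
archive = zipfile.ZipFile(io.BytesIO(data))
names = archive.namelist()
content = archive.open("hello.txt").read().decode("utf-8")
```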