Naming Datasets in Qri
What's in a name? There are several ways of referring to datasets in Qri. Here are some examples of each.
In Qri, datasets are most commonly referred to by a combination of username and dataset name, separated by a slash, for example b5/comics. This notation refers to the latest version of Qri user b5's dataset comics. All dataset names must be unique for their associated username, and all usernames must be unique, so this becomes a globally unique identifier for the dataset.
What if your username is, say, golden_pear_ginger_pointer? First, why did you pick such a long username? Whatever your answer, it would be irritating to have to type your username every time, so we give you a special way to refer to yourself: me. So if you have a dataset named comics, you can just type me/comics.
In a sentence: "the latest version of my dataset named comics." Under the hood, we'll rewrite me to your username for you.
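The rewrite of me can be pictured as a small string substitution. The sketch below is an illustration only, not Qri's actual implementation; the function name expand_me and its current_username parameter are hypothetical.

```python
def expand_me(ref: str, current_username: str) -> str:
    """Replace a leading 'me' username with the current user's name.

    Hypothetical helper for illustration; not part of Qri.
    """
    # Split on the first slash: "me/comics" -> ("me", "/", "comics").
    username, sep, name = ref.partition("/")
    if sep and username == "me":
        return f"{current_username}/{name}"
    # Anything else is returned untouched.
    return ref


print(expand_me("me/comics", "golden_pear_ginger_pointer"))
# prints: golden_pear_ginger_pointer/comics
```

References that already name a user, such as b5/comics, pass through unchanged.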
The full reference notation for a dataset, also known as a dataset reference, is a combination of username, dataset name, network, and hash. Qri usernames are unique, and all dataset names are unique to their associated username, so each dataset reference is globally unique across the Qri network.
The full dataset reference takes this format:
An example of that looks like this:
In a sentence: b5 is the name of a peer, who has a dataset named comics, and its hash on the ipfs network at a point in time was QmejvEPop4D7YUadeGqYWmZxHhLc4JBUCzJJHWMzdcMe2y.
We need peer names so lots of people can name datasets the same thing, and in this instance that giant hash thing refers to one specific version of the dataset, from an exact point in time.
Now, having to type a full reference ending in /ipfs/QmejvEPop4D7YUadeGqYWmZxHhLc4JBUCzJJHWMzdcMe2y every time you wanted a dataset would be irritating. So we have two defaults. The default network is ipfs, and the default hash is the latest known version of a dataset. We say latest known because sometimes things can fall out of sync. If you're only working with your own local datasets, this won't be an issue.
Anyway, that means we can cut down on the typing. If we just want the latest version of b5's comics dataset, we can just type b5/comics.
In a sentence: "the latest version of peer b5's dataset named comics."
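One way to picture the two defaults is a small record whose network and hash fields are filled in when they are left out. This is a minimal sketch under assumptions, not Qri's data model: the DatasetRef class and parse_short_ref function are hypothetical, only the short username/name form is parsed, and None is used here to stand for "latest known version".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatasetRef:
    """Hypothetical container for the four parts of a dataset reference."""
    username: str
    name: str
    network: str = "ipfs"        # default network, as described in the text
    hash: Optional[str] = None   # None stands for "latest known version"


def parse_short_ref(ref: str) -> DatasetRef:
    """Parse the short 'username/name' form only (illustrative helper)."""
    username, sep, name = ref.partition("/")
    if not sep or not username or not name or "/" in name:
        raise ValueError(f"expected 'username/name', got {ref!r}")
    return DatasetRef(username=username, name=name)


ref = parse_short_ref("b5/comics")
print(ref.network)  # prints: ipfs
print(ref.hash)     # prints: None
```

Keeping the hash optional makes the difference between "a pinned version" and "whatever is latest" explicit in the type rather than in a magic string.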
Finally, it’s also possible to use just the hash. This is a perfectly valid dataset reference:
In this case we're ignoring naming altogether and simply referring to a dataset by its network and hash. Because IPFS hashes are global, we can do this across the entire network. If you're coming from git, this is a fun new trick.
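The reason a hash works without any name is that it is derived from the data itself. The toy sketch below illustrates that idea with a plain SHA-256 hex digest from Python's standard library; this is a stand-in, since IPFS uses its own identifier format, and the content_id function, the store dictionary, and the sample rows are all made up for illustration.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a hex SHA-256 digest as a stand-in for a content hash."""
    return hashlib.sha256(data).hexdigest()


# A toy content-addressed store: the key is derived from the bytes alone.
store = {}
body = b"title,year\nWatchmen,1986\n"  # made-up sample data
store[content_id(body)] = body

# Anyone holding the same bytes computes the same key, whatever username
# or dataset name they filed the data under.
assert content_id(b"title,year\nWatchmen,1986\n") in store

# Changing the data changes the key, so a hash pins one exact version.
assert content_id(body + b"Maus,1991\n") not in store
```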
All of these are ways to refer to a dataset: