School of Data 2020 Workshop

Thank you for joining our workshop at the 2020 School of Data! Follow these steps to get started using Qri Desktop, our graphical interface for versioning and sharing datasets. For the same workflow on the command line, see the Qri CLI Quickstart.

Download and Sign up

First, download Qri Desktop for Mac or Windows. Complete the installer and run Qri.

You will be greeted with a Terms of Service screen.

Clicking 'accept' will bring you to the sign up page.

Choose a good username. It will be used to reference each of your datasets on the Qri network.

Provide an email and password, and you’re ready to start managing datasets in Qri!

Let's get some open data

For today's workshop, we're going to work with the Directory of Business Improvement Districts on the NYC Open Data Portal. Export the data as a CSV file and save it to your computer.

Add the file to Qri

There are two ways to add this dataset to Qri:

First way: Navigate to the Collections Pane. Find the CSV file stored on your computer, and drag-and-drop the CSV into Qri Desktop to kick off the import.

Psst. This is the fun way!

Second way: Click the 'Create a Dataset' link to open a modal where you can specify the location of the CSV file.

The new dataset will be named based on your username and the filename of the CSV you added.

Voila! Qri has created a new dataset!

You’re now looking at Qri Desktop’s Dataset pane, where you can explore versions of this dataset. When Qri created this dataset from your file, it created a version of the dataset for you.

You can see the commit message that was automatically added when the dataset was created, along with indications that the body and structure components were added in this first commit.

Before moving on, let's explore the dataset body a bit. Click on a column header in the body view to see column stats and type information. Qri has guessed the data type of each column (string, number, boolean, etc.) from the original CSV. This gives you a quick overview of what a column contains without having to inspect each row.
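To get a feel for what guessing a column type involves, here is a minimal Python sketch. It is not Qri's implementation; the `guess_type` and `column_types` helpers and the sample rows are made up for illustration, and the heuristic is deliberately simple.

```python
import csv
import io

def guess_type(values):
    """Guess a column type from its string values (a simple heuristic)."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "string"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "boolean"
    try:
        for v in non_empty:
            float(v)  # raises ValueError if the value is not numeric
        return "number"
    except ValueError:
        return "string"

def column_types(csv_text):
    """Return {column name: guessed type} for CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    return {name: guess_type([row[name] or "" for row in rows]) for name in rows[0]}

# Made-up sample rows, not taken from the real dataset.
sample = (
    "name,year_founded,active\n"
    "Example District,1984,true\n"
    "Another District,1992,false\n"
)
print(column_types(sample))
```

A real type detector would also need to handle dates, integers versus decimals, and inconsistent values, which is part of why a quick look at the column stats in Qri Desktop is useful.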

Let's say you spot a mistake - either in content, formatting, or both. Let's show you how to make edits.

Editing the Body

Now that you've added the file to Qri, Qri is 'watching' the local copy of the CSV. Using any app you like (Numbers, Excel, a text editor, etc.), open the local copy of the Business Improvement Districts CSV, make a change (any change), and save.
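If you would rather make the change with a script than in a spreadsheet app, something like the following Python sketch would work. The `update_cell` helper, the demo file, and its column names are all hypothetical; the demo writes to a throwaway temporary file so nothing real is overwritten.

```python
import csv
import os
import tempfile

def update_cell(path, row_number, column, new_value):
    """Rewrite the CSV at `path`, changing one cell.

    row_number counts data rows from 0 (the header row is not counted).
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    rows[row_number][column] = new_value

    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

# Demo on a temporary file with made-up rows.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "w", newline="", encoding="utf-8") as f:
    f.write("name,borough\nExample Distrct,Brooklyn\n")

# Fix the typo in the first data row.
update_cell(demo_path, 0, "name", "Example District")

with open(demo_path, newline="", encoding="utf-8") as f:
    fixed_rows = list(csv.DictReader(f))
print(fixed_rows)
```

To try this on the workshop data, point `update_cell` at the path of your local copy of the CSV and use one of its actual column names.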

Adding context to (a.k.a. Editing) your Qri Dataset

From the “status” tab, you can view and edit components. Let’s add some metadata. Select the status tab, then choose ‘Meta’. You’ll see a form with fields like title, description, and keywords.

Add a clear one-line title. Add a few sentences describing the dataset in the description field. Add some topic keywords. All of these will help you find this dataset later when your collection grows, and will also help collaborators find your dataset in search results. You can fill in as much metadata as you like. Don’t worry, you can always come back later to add more.

Here we are adding a title and description to our dataset

Just like we did with the Meta component, you can use the Readme component to write free-form text that may help future you remember why you needed this dataset, or to let other users know what to keep in mind when they first open it. Qri Readmes support markdown, so you can add rich text, links, lists, images, and more!

As you make changes, Qri Desktop will let you know that your working directory has content that differs from the last version of the dataset. Meta now shows up with a green dot next to it because it was added after the last version.

Your working directory has changes, so you’re ready to commit!
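The idea of comparing a working directory against the last version can be sketched in a few lines of Python. This is only an illustration of the concept, not how Qri Desktop detects changes internally; the `changed_components` helper and the sample component contents are made up.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest used to compare component contents."""
    return hashlib.sha256(data).hexdigest()

def changed_components(last_version, working):
    """Compare two {component name: bytes} mappings.

    Returns {name: status} for every component that was added,
    removed, or modified. Unchanged components are left out.
    """
    status = {}
    for name in sorted(set(last_version) | set(working)):
        if name not in last_version:
            status[name] = "added"
        elif name not in working:
            status[name] = "removed"
        elif fingerprint(last_version[name]) != fingerprint(working[name]):
            status[name] = "modified"
    return status

# Made-up component contents: the working copy gains a meta component.
last_version = {"body": b"name\nExample District\n", "structure": b"{}"}
working = {
    "body": b"name\nExample District\n",
    "structure": b"{}",
    "meta": b'{"title": "Example"}',
}
status = changed_components(last_version, working)
print(status)  # {'meta': 'added'}
```

In this sketch, an "added" status plays the role of the green dot next to Meta in the status tab.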

Make a Commit

Committing creates a new version of your dataset. Click “Commit” at the bottom of the Status tab to show the commit form.

Add a title and message so others know what changes you made.

There are two fields on the commit pane: title and message. Title is intended to be a short description of the changes being committed. Message is where you can go into greater detail on the changes with multiple lines of text. Most of the time a title alone is sufficient, but the message field is always there if you need it.

Fill out a title and click “Save”. Congrats, you’ve just created a new version! Check the history tab and you’ll now see two commits, one created when you first imported the dataset, and the one you just completed. You can continue making changes this way, committing new versions whenever you reach a critical point. All of the older versions are intact in Qri, and you can inspect and export them at any time.

In this dataset, we also made a commit where we edited the body of the dataset before we edited the metadata.

Rename your dataset

Renaming your dataset is simple! Just click on the dataset name Qri has generated for you and type the name you prefer. Names must be lowercase, must start with a letter, and can contain only letters, numbers, and underscores. If your name does not follow these rules, don't worry: Qri Desktop will let you know by highlighting the name in red.
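The naming rules above can be expressed as a single regular expression. The Python sketch below encodes only the rules stated in this guide; `is_valid_name` is a hypothetical helper, and Qri itself may enforce additional constraints (such as a maximum length) that are not covered here.

```python
import re

# A lowercase letter first, then any mix of lowercase letters, digits, and underscores.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")

def is_valid_name(name):
    """Return True if `name` follows the dataset naming rules described above."""
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["nyc_bids", "NYC_bids", "2020_bids", "bids-2020"]:
    print(candidate, is_valid_name(candidate))
```

Only the first candidate passes: the others contain uppercase letters, start with a digit, or contain a hyphen.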

Publish to the Qri Network & Qri Cloud

Publishing your dataset to the Qri Network allows anyone on Qri to add and explore your dataset. But that's not all: the Qri Cloud website allows anyone to view your dataset in their browser. Click the “Publish to Cloud” button on any commit to push it to Qri Cloud.

From the collection page, or from the search modal, click on the dataset you want to publish. You will be sent to the Workbench Page. Clicking "Publish" makes the dataset available to the network and creates a dataset preview page on Qri Cloud.

That’s it! Once the dataset is transferred, it will have a shiny new preview page on Qri Cloud and will be available on the Qri network, where other users will be able to find it. It will also show up on your profile page, which lists all of your published datasets. Other users can now add your datasets to their Qri collection!

The Network Pane has a feed of recently published datasets. Once you have published your dataset, you can view it on the network. This screenshot was taken using a test network, which is why it is so sparsely populated :)

Explore other datasets on the Qri network

Click on the 'Network' icon (the globe in the top left of the screen) to check out other datasets on the Qri network without having to leave the app. These datasets can also be found on the Qri Cloud website. Here's what that looks like:

Recently published datasets appear at the top of the feed on Qri Cloud

You can also use the search bar at the top of each page to search the network or your local datasets. Type a query into the search bar and press 'Enter', or click the search bar and press 'Enter' without typing to open the search modal.

Here we are searching for datasets that have to do with "synths" on the Qri network

Clicking this dataset will allow us to view a Dataset Preview on the Network Pane. This lets you explore a dataset before deciding whether or not it is useful for your purposes. If you decide you want to explore the entire dataset, just click the 'Clone Dataset' button, and Qri will add this dataset from the network to your computer. There, you will have access to the entire body of the dataset.

Qri allows you to explore a dataset from the network.

Let's look at search again. You can also use the search modal to search through your own collection of datasets. You can toggle between searching the network and searching locally by clicking the 'Local Only' switch at the top right of the search modal:

We clicked the "Local Only" switch and are searching for our local dataset about "earthquakes"

Next Steps

Here are some things to try now that you’re up and running: