School of Data 2020 Workshop
Thank you for joining our workshop at the 2020 School of Data! Follow these steps to get started using Qri Desktop, our graphical interface for versioning and sharing datasets. For the same workflow on the command line, see the Qri CLI Quickstart.
First, download Qri Desktop for Mac or Windows. Complete the installer and run Qri.
Clicking 'accept' will bring you to the sign up page.
Provide an email and password, and you’re ready to start managing datasets in Qri!
For today's workshop, we're going to work with the Directory of Business Improvement Districts on the NYC Open Data Portal. Export the data as a CSV file and save it to your computer.
There are two ways to add this dataset to Qri:
First way: Navigate to the Collections Pane. Find the CSV file stored on your computer, and drag-and-drop the CSV into Qri Desktop to kick off the import.
Second way: Click the 'Create a Dataset' link to open a modal where you may specify the CSV file location.
The new dataset will be named based on your username and the filename of the CSV you added.
You’re now looking at Qri Desktop’s Dataset pane, where you can explore versions of this dataset. When Qri created this dataset from your file, it created a version of the dataset for you.
Before moving on, let's explore the dataset body a bit. Click on a column header in the body view to see column stats and type information. Qri has guessed the data type of each column (string, number, boolean, etc.) from the original CSV. This gives you a quick overview of what a column contains without having to inspect each row.
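To build intuition for what guessing column types involves, here is a minimal Python sketch that infers one type per column from CSV text. It is not Qri's actual detection logic. The function names `guess_value_type` and `guess_column_types`, the type labels, and the sample rows are all hypothetical and chosen only for illustration.

```python
import csv
import io


def guess_value_type(value: str) -> str:
    """Classify one CSV cell as 'boolean', 'integer', 'number', or 'string'."""
    text = value.strip()
    if text.lower() in ("true", "false"):
        return "boolean"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "number"
    except ValueError:
        return "string"


def guess_column_types(csv_text: str) -> dict:
    """Return {column name: guessed type} for CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    seen = {name: set() for name in fieldnames}
    for row in reader:
        for name in fieldnames:
            cell = row.get(name)
            if cell:  # skip empty or missing cells
                seen[name].add(guess_value_type(cell))
    result = {}
    for name, types in seen.items():
        if len(types) == 1:
            result[name] = next(iter(types))
        elif types == {"integer", "number"}:
            result[name] = "number"  # mixed integers and decimals
        else:
            result[name] = "string"  # mixed or empty columns fall back to string
    return result


# Made-up rows for illustration; not taken from the real dataset.
SAMPLE = (
    "name,year_founded,active\n"
    "Example BID,1984,true\n"
    "Another BID,1992,false\n"
)
print(guess_column_types(SAMPLE))
```

A column only gets a specific type when every non-empty cell agrees, which is why a single stray text value in a numeric column would make the whole column read as a string in this sketch.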
Let's say you spot a mistake, either in content, formatting, or both. Here's how to make edits.
Now that you've added the file to Qri, Qri is 'watching' the local copy of the CSV. Using any app you prefer (Numbers, Excel, a text editor, etc.), open the local copy of the Business Improvement Districts CSV, make a change (any change), and save.
From the “Status” tab, you can view and edit components. Let’s add some metadata. Select the Status tab, then choose ‘Meta’. You’ll see a form with fields like title, description, and keywords.
Add a clear one-line title. Add a few sentences to describe the dataset in description. Add some topic keywords. All of these will help you find this dataset later when your collection grows, and will also help collaborators find your dataset in search results. You can continue populating as much metadata as you like. Don’t worry, you can always come back later to add more.
Just like we did with the Meta component, you can use the Readme component to write free-form text that may help future you remember why you needed this dataset, or to let other users know what to keep in mind when they first open it. Qri Readmes support Markdown, so you can add rich text, links, lists, images, and more!
As you make changes, Qri Desktop will let you know that your working directory has content that differs from the last version of the dataset. Meta now shows up with a green dot next to it because it was added after the last version.
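One way to picture how a tool can tell that a working directory differs from the last version is to compare content hashes per component. The Python sketch below is only an illustration of that general idea, not a description of how Qri Desktop works internally. The function names `digest` and `component_status`, the status labels, and the sample contents are hypothetical.

```python
import hashlib
from typing import Dict


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest of a component's content."""
    return hashlib.sha256(content).hexdigest()


def component_status(
    last_version: Dict[str, bytes], working: Dict[str, bytes]
) -> Dict[str, str]:
    """Label each component as added, removed, modified, or unchanged."""
    status = {}
    for name in sorted(set(last_version) | set(working)):
        if name not in last_version:
            status[name] = "added"
        elif name not in working:
            status[name] = "removed"
        elif digest(last_version[name]) != digest(working[name]):
            status[name] = "modified"
        else:
            status[name] = "unchanged"
    return status


# Hypothetical contents: the body was edited and a meta component is new.
last = {"body": b"name,borough\nExample BID,Queens\n"}
current = {
    "body": b"name,borough\nExample BID,Brooklyn\n",
    "meta": b'{"title": "Business Improvement Districts"}',
}
print(component_status(last, current))
```

In this sketch the new `meta` entry is reported as added, which corresponds to the green dot described above.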
Your working directory has changes, so you’re ready to commit!
Committing creates a new version of your dataset. Click “Commit” at the bottom of the Status tab to show the commit form.
There are two fields on the commit pane: title and message. The title is intended to be a short description of the changes being committed. The message is where you can go into greater detail on the changes with multiple lines of text. Most of the time a title alone is sufficient, but the message field is always there if you need it.
Fill out a title and click “Save”. Congrats, you’ve just created a new version! Check the history tab and you’ll now see two commits, one created when you first imported the dataset, and the one you just completed. You can continue making changes this way, committing new versions whenever you reach a critical point. All of the older versions are intact in Qri, and you can inspect and export them at any time.
Renaming your dataset is simple! Just click on the dataset name Qri has generated for you and enter the name you prefer. Names must start with a letter and can contain only lowercase letters, numbers, and underscores. If your name does not follow these rules, don't worry: Qri Desktop will let you know by highlighting the name in red.
Publishing your dataset to the Qri Network allows anyone on Qri to add and explore your dataset. That's not all: the Qri Cloud website also lets anyone view your dataset in their browser. Click the “Publish to Cloud” button on any commit to push it to Qri Cloud.
That’s it! Once the dataset is transferred, it will have a shiny new preview page on qri.cloud and will be available on the Qri network, where other users will be able to find it. It will also show up on your profile page, which lists all of your published datasets. Other users can now add your datasets to their Qri collection!
Click on the 'Network' icon (the globe in the top left of the screen) to check out other datasets on the Qri network without having to leave the app. These datasets can also be found on the Qri Cloud website. Here's what that looks like:
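If you want to check a name before typing it in, the naming rules above can be expressed as a short regular expression. This Python sketch covers only the rules stated here (lowercase letters, numbers, underscores, starting with a letter); Qri may enforce other limits, such as a maximum length, that are not captured. The function name `is_valid_dataset_name` is hypothetical.

```python
import re

# A lowercase letter first, then any mix of lowercase letters, digits, and underscores.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if the name follows the rules described in this workshop."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["nyc_bids_2020", "2020_bids", "NYC_bids", "nyc-bids"]:
    print(candidate, is_valid_dataset_name(candidate))
```

Only the first candidate passes: the others start with a digit, contain uppercase letters, or contain a hyphen.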
You can also use the search bar at the top of each page to search the network or your local datasets. Type a query into the search bar and press 'Enter', or click the search bar and press 'Enter' to open the search modal.
Clicking a dataset in the results opens a Dataset Preview in the Network Pane. This lets you explore a dataset before deciding whether it is useful for your purposes. If you decide you want to explore the entire dataset, just click the 'Clone Dataset' button, and Qri will add this dataset from the network to your computer. There, you will have access to the entire body of the dataset.
Let's look at search again. You can also use the search modal to search through your own collection of datasets. You can toggle between searching the network and searching locally by clicking the 'Local Only' switch at the top right of the search modal:
Here are some things to try now that you’re up and running: