Making Open Data Tables even easier: YQL Storage Editor


Making Open Data Tables even easier: YQL Storage Editor

One of the big benefits of YQL is that you can make your own Open Data Tables and use them. However, one of the disadvantages is that you either have to host your tables on a publicly available URL or you have to painstakingly use the yql.storage table to manage your environments and table definitions. There has been an open feature request for the YQL console for a while now to integrate the ability to edit yql.storage content but it isn’t the highest priority thing for the team right now. Since I am using YQL more and more lately I decided to build it myself, leveraging Bespin, Mozilla’s web based editor.

Another project that was done in someone’s free time is Zach Grave’s TableSaw that helps you generate tables from URLs. I am going to combine these two things in this post to show you how to make a new table and an environment to more easily use it. Ultimately I could see TableSaw and the YQL Storage Editor combined, to that end, I’ve released the source for this blog post on github.

I’m going to use a real example that I just recently walked through, creating an open data table for the bit.ly API. The table will be pretty simple, just allowing you to pass in a bit.ly link and it will return the long URL for that link. First thing to do is to go to the bit.ly API documentation for the expand REST API. That gives us a nice example URL that we can use in Table Saw to generate a table:

If you head over to Table Saw and put that in the text field and load URL you get:

[caption id=”attachment_1284" align=”alignnone” width=”605" caption=”Initial generated table for api.bit.ly/expand”]

[/caption]

To complete the form and update, fill in an author name and the Produces should be JSON. One current bug with Table Saw is that it doesn’t seem to properly remember the Produces field so we will have to edit that after we grab the definition. Highlight everything under “Your new YQL Table!” and copy it into the clipboard and then go to the YQL Storage Editor. On that page you will see a set of execute/select/update text fields for the various urls that you need to use a yql.storage entry. Everything is empty right now and we want to create a new one, so click the New button. That should fill in those URL fields with brand new store:// URLs. Put focus on the Bespin editor component and paste in the table we copied. If you then click save it should say that it updated the entry in YQL with this new information. You’ll notice that there is a text area with the URLs listed out, if you want to keep track of yql.storage entries you need to keep these special identifiers around so you can perform the various operations on the content. I generally have a list that I keep locally of the entries that I am using. If you paste the list into the file as a comment you can then also just remember the ‘select’ URL in order to come back later and find the other two. Lastly, when the execute key is changed, the two links on the right get updated. The first link goes to the YQL console with your execute key used as an environment and the second link goes to the YQL console with your execute key used as a table definition. These two links can help you quickly try out changes that you make to your environment or table without a lot of cutting and pasting.

Table Saw isn’t yet perfect, so I make a few edits: change the produces from XML to JSON, add a description, reformat the sample query, set the default version to 2.0.1 and make all the other keys required, this should leave you with a page that looks like:

[caption id=”attachment_1280" align=”alignnone” width=”670" caption=”Edited version of the bit.ly table”]

[/caption]

When you click save it should alert you that it was saved successfully by saying that it was updated. You can now test this table in the YQL console with the following query:

use ‘store://8tLqFvgCm8kdOAp60lPz24’ as bit.ly; select * from bit.ly where version=’2.0.1' and shortUrl=’http://bit.ly/31IqMl' and login=’bitlyapidemo’ and apiKey=’R_0da49e0a9118ff35f52f629d2d71bf07'

I noticed at this point that I forgot the itemPath and then set it to “json.results” in order to cut out the error messages and some other things that I wasn’t interested in. Nagesh Susarla had a great blog post on using YQL storage for secrets. We’ll also do that by using the YQL storage editor to create a new environment for us to use that could store our secret values that bit.ly needs in order to run the API. Create another YQL Storage Editor tab, click New and then we can make an environment that uses the table, sets the login and sets the API key:

[caption id=”attachment_1279" align=”alignnone” width=”517" caption=”Secret bit.ly environment with their API keys”]

[/caption]

Now using that execute URL as your env value in your query URLs you can have very simple queries against this API. For example, if you use this URL for the YQL Console you can use this much simpler query that gets all its authentication implicitly:

select * from bit.ly where shortUrl=’http://bit.ly/31IqMl'

This allows you to not embed those valuable keys directly in your client software or even in your source control system but hidden behind what amounts to a very secure random password.