QueryPic provides a simple visualisation of a set of search results. Instead of just presenting you with a list of matches, it plots the results over time as a familiar line graph. Each point on the graph represents the number of articles in that year matching your search query. You can even combine multiple queries to compare results.
It’s simple, but it’s also surprisingly powerful. QueryPic lets you see patterns and trends that normal search interfaces hide. It helps you frame your research questions – to survey the territory and decide where to dig.
QueryPic searches digitised newspapers from Australia and New Zealand published online by Trove and Papers Past. The data is accessed through APIs (Application Programming Interfaces) provided by Trove and DigitalNZ.
These are wonderfully rich resources, but if you’re going to interpret QueryPic results you really need to think about their limitations. What titles have been digitised? What sort of gaps remain? What is the quality of the text extracted by OCR? How might factors such as these affect your results?
And then there’s the question of how search actually works in the two databases. How ‘fuzzy’ are the matches? How do they handle hyphenated words? I’d suggest you become familiar with the help pages of both systems.
QueryPic lets you pursue your hunches, but what it creates are sketches, not arguments. You have to think critically about the data and how you’re accessing it.
The easiest way to make a QueryPic is to type a word or phrase in the keywords search box. To search for a phrase, enclose it in double quotation marks. If you submit multiple keywords, only articles that contain all the words will be matched.
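To make the keyword rules concrete, here's a rough Python sketch of how a keyword string might be split into terms and tested against an article. It's an illustration only: the function names `parse_keywords` and `matches_all` are made up for this example, and the plain substring test is an assumption that's far cruder than what Trove or DigitalNZ actually do.

```python
import re

def parse_keywords(query):
    """Split a keyword string into terms, keeping quoted phrases whole."""
    terms = []
    # Each match is either a "quoted phrase" or a single unquoted word
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        terms.append((phrase or word).lower())
    return terms

def matches_all(text, query):
    """True only if every term in the query appears somewhere in the text."""
    text = text.lower()
    # Naive substring test (an assumption): real search engines index words
    return all(term in text for term in parse_keywords(query))

print(parse_keywords('"white australia" policy'))
print(matches_all("The White Australia policy was debated.", '"white australia" policy'))
print(matches_all("Australia has a white paper on policy.", '"white australia" policy'))
```

The second article contains all three words, but not the quoted phrase, so it isn't matched.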
As I said, it’s important to try to understand what’s going on behind the scenes when you search, so here’s the blow-by-blow description.
While you can build fairly complex queries using the keywords option, you’ll probably find it easier to build and test your query using the existing search interfaces to Trove and Papers Past. You’ll also be able to apply custom limits, such as date ranges.
The procedure is fairly simple. In Trove head to the ‘Advanced search’ page and start building your query. Alternatively, you can use the basic interface and filter your search by selecting facets. See the help pages for more information.
QueryPic recognises the following facets (or limits) in Trove searches:
Note that the option to search only certain parts of articles (such as the headings) is not currently supported by the Trove API, so QueryPic can’t apply it.
In DigitalNZ you can use the ‘filters’ option to apply a date range.
You can, of course, keep testing and tweaking until it looks like you’re getting the results you want (or none of the results you don’t want). Then it’s just a matter of copying the url in your browser’s address box, selecting the ‘query url’ tab in QueryPic and pasting in the url.
When you click ‘Show’, QueryPic will parse the url into its components and translate them into the form required by the APIs. As with the keyword search, it will gather data for each year and calculate proportions.
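If you’re curious what ‘parsing a url into its components’ and ‘calculating proportions’ might look like, here’s a minimal Python sketch. It isn’t QueryPic’s own code. The example url and its parameter names are hypothetical, the counts are made up, and I’m assuming that a proportion means the number of matching articles divided by the total number of articles for that year.

```python
from urllib.parse import urlparse, parse_qs

def parse_query_url(url):
    """Split a search url into its host and its query parameters."""
    parts = urlparse(url)
    # parse_qs decodes each parameter and returns its values as a list
    return {"host": parts.netloc, "params": parse_qs(parts.query)}

def proportions(matches_by_year, totals_by_year):
    """Return matching articles as a fraction of all articles, per year."""
    result = {}
    for year, matches in sorted(matches_by_year.items()):
        total = totals_by_year.get(year, 0)
        # Leave out years with no articles at all to avoid dividing by zero
        if total:
            result[year] = matches / total
    return result

# Hypothetical url and invented counts, for illustration only
query = parse_query_url(
    "http://example.org/newspaper/result?q=%22white+australia%22&dateFrom=1900-01-01"
)
print(query["host"])
print(query["params"])
print(proportions({1901: 25, 1902: 30, 1903: 5}, {1901: 100, 1902: 120, 1903: 0}))
```

Working with proportions rather than raw counts matters because the number of digitised articles varies a lot from year to year.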
To make using the query url option even easier, you can install a bookmarklet that connects Trove and DigitalNZ directly to QueryPic.
The QueryPic bookmarklet is designed to copy a query url from Trove or DigitalNZ and feed it directly to QueryPic – no copying and pasting required!
To install the bookmarklet simply drag this link – QueryPic – to your browser’s bookmarks bar. Different browsers work slightly differently, so if this doesn’t seem to work see installing the bookmarklet for more detailed instructions.
Once the bookmarklet is installed just construct your search in Trove or DigitalNZ. When you’re happy with it click on the bookmarklet. That’s it! QueryPic will open automatically and start loading data.
If you come across an interesting QueryPic that you’d like to use as a basis for your own comparison, you can use it to generate a new graph.
You can also regenerate a complete QueryPic.
One of the most useful aspects of QueryPic is its ability to compare queries – you can add as many lines as you like to a single QueryPic.
Note that you can’t add queries to a saved QueryPic, so wait until you’ve added all your queries before saving.
You also can’t use the bookmarklet to add further queries, so you’ll have to resort to copying and pasting the urls.
The contents of Trove and Papers Past will change over time. Additional newspapers and articles will be added, and corrections will be made to the text. A query you ran a year ago might produce a different result today. For this reason every saved QueryPic is date-stamped. You should include this date in any citation.
If you want to track changes over time you can easily regenerate a saved QueryPic. Just click on the big blue ‘Regenerate this QP’ button. The ‘Create’ page will open and QueryPic will retrieve a new dataset for all of the queries in the original graph. You can then save the new version.
Once you’re happy with your QueryPic you’ll want to save it. It’s easy – just click the big blue ‘Save’ button. A form will pop up and ask you for a few details. There are only two required fields:
Note that your email will not be displayed or shared.
The optional fields are:
Just fill in the fields and click on ‘Save’. Your details will be added to the database and you’ll be redirected to a freshly-minted, persistent url for your saved QueryPic. You can cite or share this url – tell the world!
Saved QueryPics are stored in a database. Just visit the explore page to browse. Limit the number of results displayed by entering a keyword in the filter box. Click on the arrows in the column headings to change the order of the results.
Each saved QueryPic is assigned a persistent url that can be cited or shared. You can find the url under the graph, or in your browser’s address bar.
A number of standard social network buttons have been included for easy sharing.
QueryPic’s graphs give you a new perspective on your newspaper searches, but eventually you’ll want to go back to the articles themselves – to find out what’s actually lurking under each point on the graph.
To retrieve a list of the first twenty matching articles for each year, just click on that point on the graph. QueryPic will once again fire off a request to the API and return the articles ordered by relevance. Click on an article to open it in Trove or Papers Past. To dig deeper just click on the ‘View more in…’ button at the bottom of the list of articles to view all the matching results.
To view a summary of a query just click on the appropriate tab in the right-hand side bar – they’re labelled ‘Query 1’, ‘Query 2’ etc.
The query summary displays some basic metadata about the query, including:
There are also two buttons: one opens that query in either Trove or DigitalNZ, and the other uses the query to generate a new QueryPic.
By clicking and dragging you can zoom into any section of the graph. Click the ‘Reset zoom’ button to return to the original view.
Click on a label in the legend to temporarily hide a query. This can be useful if you’re comparing multiple queries. Note that when you hide a query the vertical scale is reset to suit the remaining data. This makes it easier to study queries with fewer results.
There are two buttons in the top right-hand corner of every graph. One prints the graph, while the other lets you download the graph as an image in a choice of file formats.
Fuzzy searches expand the number of results returned by truncating or stemming your keywords to match a variety of possible endings. So a search for ‘smile’ would return ‘smiles’, ‘smiling’ and ‘smiley’.
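Here’s a toy Python sketch of the idea. The suffix list and the `crude_stem` function are invented for this example and are much cruder than a real stemmer. It is not the algorithm used by Trove or DigitalNZ, just a way of seeing why different word endings can collapse into a single match.

```python
# Suffixes to strip, longest first (an assumption made for this toy example)
SUFFIXES = ("ing", "ey", "es", "e", "s", "y")

def crude_stem(word):
    """Strip one common ending so that related words share a stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three letters so very short words are left alone
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def fuzzy_matches(keyword, words):
    """Return the words that share a stem with the keyword."""
    target = crude_stem(keyword)
    return [w for w in words if crude_stem(w) == target]

print(fuzzy_matches("smile", ["smiles", "smiling", "smiley", "smite", "small"]))
# prints ['smiles', 'smiling', 'smiley']
```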
This is great if you’re not exactly sure what you’re looking for – it maximises your chances of discovery. But if you want to track the occurrence of a particular word or phrase it can be rather annoying.
Both Trove and DigitalNZ include some degree of fuzziness by default. In Trove, the situation is further complicated by the way the indexing handles hyphenated words.
Fuzzy searching can be switched off in Trove by using the ‘fulltext:’ modifier, but it can take a fair bit of trial and error to find out what actually works. The Trove Forum is a useful source of guidance and tips. See, for example:
Another trap is that, by default, Trove searches include user-contributed tags and comments. So if you search for ‘World War I’ you’ll get some matches from the period 1914–18 (think about it). This is because some diligent user has added the tag ‘World War I’ to a number of articles from the period. There’s no easy way of avoiding these sorts of anachronisms. Once you become aware of such cases you can explicitly exclude matching tags, but it’s not a wholly satisfactory solution.
The only answer to the problems of fuzzy searching is to experiment with your queries and remain critical of the results.
Make sure your browser’s bookmarks toolbar is visible.
Chrome: View -> Always show bookmarks bar
Firefox: View -> Toolbars -> Bookmarks toolbar
Safari: View -> Show bookmarks bar
IE: Tools -> Toolbars -> Favourites
For Chrome, Firefox or Safari, simply click and drag this link – QueryPic – to your bookmarks bar.