In Search of Quotes

Wikiquote

My search continues for a source of interesting quotations that I can incorporate in my mobile application. Wikimedia’s Wikiquote appeared to be an excellent source: it holds thousands of quotes and describes itself as a free compendium of quotations that is being written collaboratively by the readers. The trick is how to access those quotes with an API.

Wikiquote is powered by MediaWiki, the software that runs various Wikimedia sites such as Wikipedia. MediaWiki provides a web service API to access its pages. But while the page content is easily accessed, it is not structured in a way that allows it to be easily parsed into discrete data elements. In other words, extracting quotes from any given Wikiquote page isn’t straightforward. I could write a parser, but I suspected that something like this had already been done. This suspicion led me to DBpedia.
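To make the parsing problem concrete, here is a minimal Python sketch. It builds a request URL for the MediaWiki `action=parse` API and then applies a deliberately naive rule to wikitext: top-level bullets (`*`) are treated as quotes and nested bullets (`**`) as their sources. The page title, the sample wikitext, and the bullet convention are assumptions for illustration only; real Wikiquote pages contain templates, links, and other markup that this does not handle.

```python
from urllib.parse import urlencode

# Wikiquote's MediaWiki API endpoint; the page title used below is only an example.
API_URL = "https://en.wikiquote.org/w/api.php"


def build_parse_url(page_title):
    """Return a URL asking the MediaWiki API for a page's raw wikitext as JSON."""
    params = {
        "action": "parse",
        "page": page_title,
        "prop": "wikitext",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def naive_extract_quotes(wikitext):
    """Collect top-level bullet lines, skipping nested (source) bullets.

    ASSUMPTION: quotes are written as '* quote' and sources as '** source'.
    """
    quotes = []
    for line in wikitext.splitlines():
        if line.startswith("*") and not line.startswith("**"):
            quotes.append(line[1:].strip())
    return quotes


# Hypothetical wikitext fragment, hand-written for this sketch.
SAMPLE_WIKITEXT = """== Quotes ==
* Imagination is more important than knowledge.
** Hypothetical source line
* [[Great spirits]] have always encountered violent opposition from mediocre minds.
"""

if __name__ == "__main__":
    print(build_parse_url("Albert Einstein"))
    for quote in naive_extract_quotes(SAMPLE_WIKITEXT):
        print(quote)
```

Note that the second extracted line still carries `[[...]]` link markup, which is exactly the kind of cleanup a real parser would have to take on.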

DBpedia

DBpedia describes itself as …

… a crowd-sourced community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to make sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data.

DBpedia regularly extracts data from Wikipedia and stores it using the Resource Description Framework (RDF) model for data interchange. Those resources can be remotely queried using SPARQL, a query language for RDF. DBpedia’s ontology contains a quotation property. Unfortunately, when I started querying DBpedia for resources that had quotes, very few were returned. Apparently Wikiquote is not one of the Wikimedia sites that DBpedia sources from. So while DBpedia looked promising, it turned out to be a dead end.
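A query of the kind described above might look roughly like the following Python sketch, which only builds the SPARQL text and the request URL and makes no network call. The property IRI is an assumption: the post mentions a quotation property without naming it, so `http://dbpedia.org/ontology/quotation` is a placeholder to replace with the real IRI. The function names are likewise made up for this example.

```python
from urllib.parse import urlencode

# DBpedia's public SPARQL endpoint.
SPARQL_ENDPOINT = "https://dbpedia.org/sparql"

# ASSUMPTION: placeholder IRI based on the post's wording ("a quotation
# property"); check the DBpedia ontology for the actual property name.
QUOTATION_PROPERTY = "http://dbpedia.org/ontology/quotation"


def build_quote_query(limit=10):
    """Return a SPARQL query selecting resources that carry the quotation property."""
    return (
        "SELECT ?resource ?quote WHERE { "
        f"?resource <{QUOTATION_PROPERTY}> ?quote . "
        "} "
        f"LIMIT {int(limit)}"
    )


def build_request_url(query):
    """Encode the query as a GET request, passing it in the 'query' parameter."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query})


if __name__ == "__main__":
    query = build_quote_query(limit=5)
    print(query)
    print(build_request_url(query))
```

The result format (for example JSON) would normally be requested through the HTTP `Accept` header when the URL is actually fetched.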

Freebase

More searching led me to Freebase. Freebase is very similar to DBpedia. It is an open collection of structured data that can be accessed using a remote API. Here too, data is pulled from a variety of sources such as Wikipedia and stored as a graph model composed of nodes (data objects) and relationships between nodes. This model can be queried with Freebase’s proprietary Metaweb Query Language (MQL).

For example, the following query returns quotations for Albert Einstein.

[{
  "type": "/people/person",
  "id": null,
  "name": "Albert Einstein",
  "gender": {
    "type": "/people/gender",
    "id": null,
    "name": null
    },
  "/people/person/quotations": [{
      "type": "/media_common/quotation",
      "id": null,
      "name": null,
      "subjects": [],
      "limit": 5
      }]
  }]

Here are the results …

{
  "code":          "/api/status/ok",
  "result": [{
    "/people/person/quotations": [
      {
        "id":       "/en/imagination_is_more_important_than_knowledge",
        "name":     "Imagination is more important than knowledge.",
        "subjects": [],
        "type":     "/media_common/quotation"
      },
      {
        "id":       "/m/02kpjn_",
        "name":     "Great spirits have always encountered violent opposition from mediocre minds.",
        "subjects": [],
        "type":     "/media_common/quotation"
      },
      {
        "id":       "/m/02nrfj2",
        "name":     "Not everything that counts can be counted, and not everything that can be counted counts.",
        "subjects": [],
        "type":     "/media_common/quotation"
      },
      {
        "id":   "/quotationsbook/quote/21171",
        "name": "If men as individuals surrender to the call of their elementary instincts, avoiding pain and seeking satisfaction only for their own selves, the result for them all taken together must be a state of insecurity, of fear, and of promiscuous misery.",
        "subjects": [
          "Instinct"
        ],
        "type": "/media_common/quotation"
      },
      {
        "id":   "/quotationsbook/quote/23603",
        "name": "The ideals which have always shone before me and filled me with the joy of living are goodness, beauty, and truth.",
        "subjects": [
          "Life and Living"
        ],
        "type": "/media_common/quotation"
      }
    ],
    "gender": {
      "id":   "/en/male",
      "name": "Male",
      "type": "/people/gender"
    },
    "id":   "/en/albert_einstein",
    "name": "Albert Einstein",
    "type": "/people/person"
  }],
  "status":        "200 OK",
  "transaction_id": "cache;cache02.p01.sjc1:8101;2013-02-14T06:45:22Z;0064"
}
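As a rough illustration of how an application could work with this, the Python sketch below builds the same MQL query for an arbitrary name (leaving out the gender clause for brevity) and pulls the quote text and subjects out of a response shaped like the one above. It makes no network call, since the post does not say how the query was submitted. The function names are hypothetical, and the sample is a trimmed copy of the results shown above.

```python
import json


def build_quotation_query(person_name, limit=5):
    """Build the MQL query from the post for a given person name."""
    return [{
        "type": "/people/person",
        "id": None,      # None serializes to JSON null, i.e. "fill this in"
        "name": person_name,
        "/people/person/quotations": [{
            "type": "/media_common/quotation",
            "id": None,
            "name": None,
            "subjects": [],
            "limit": limit,
        }],
    }]


def extract_quotes(response):
    """Return (quote text, subjects) pairs from a response shaped like the one above."""
    if response.get("code") != "/api/status/ok":
        raise ValueError("unexpected status: %r" % response.get("code"))
    pairs = []
    for person in response.get("result", []):
        for quotation in person.get("/people/person/quotations", []):
            pairs.append((quotation["name"], quotation.get("subjects", [])))
    return pairs


# Trimmed copy of the response shown above (two of the five quotations).
SAMPLE_RESPONSE = json.loads("""
{
  "code": "/api/status/ok",
  "result": [{
    "/people/person/quotations": [
      {
        "id": "/en/imagination_is_more_important_than_knowledge",
        "name": "Imagination is more important than knowledge.",
        "subjects": [],
        "type": "/media_common/quotation"
      },
      {
        "id": "/quotationsbook/quote/23603",
        "name": "The ideals which have always shone before me and filled me with the joy of living are goodness, beauty, and truth.",
        "subjects": ["Life and Living"],
        "type": "/media_common/quotation"
      }
    ],
    "id": "/en/albert_einstein",
    "name": "Albert Einstein",
    "type": "/people/person"
  }],
  "status": "200 OK"
}
""")

if __name__ == "__main__":
    print(json.dumps(build_quotation_query("Albert Einstein"), indent=2))
    for text, subjects in extract_quotes(SAMPLE_RESPONSE):
        print(text, subjects)
```

Keeping the subjects alongside each quote would let the application filter or group quotes by topic later on.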

With Freebase I now had an online source for thousands of interesting quotes. The next step was how best to use them.

Categories: Bruce's Posts, Java
  1. bz33kn
    April 27, 2013 at 2:26 pm

    Could it be that in the query the comma behind the “5” of “limit” should be omitted, but in the line before a comma behind the “[]” is missing?

    • April 28, 2013 at 1:16 am

      Yup, the MQL had a typo. I fixed it.
