Google Freebase Client Library – Topic API

The following comment was recently posted on the Freebase discussion mailing list …

Yes, we are actually deprecating it [Text] in favor of getting a description from the Topic API. The good part is that the new solution will return you entity descriptions across 40 languages from Wikipedia (on top of course of any user entered descriptions).
We haven’t announced it yet since we just finished that feature, but we will do so soon.

That prompted me to look again at Google’s Freebase client library. The Topic API returns all the known facts for a given topic, including images and text blurbs. At first blush it appeared to be a bit of overkill for what I needed, but a second look showed that it was pretty straightforward to deal with.

It takes a bit of time to become comfortable with Topic’s response. It’s a hierarchy of maps and lists represented by a JSON object. Here’s how I wrapped my brain around it.

At the root level of the response there is an object ({}) named “property”. It’s a map of domain properties. If you didn’t filter the request then you’re getting back all the domains associated with the topic. If you did apply a filter then you’ll only get back those domains you specified.

This is a response fragment containing a domain object.

"/type/object/name": {
"valuetype": "string",
"values": [
{
"text": "William Shakespeare",
"lang": "en",
"value": "William Shakespeare",
"creator": "/user/santiago_aguiar",
"timestamp": "2011-03-15T12:51:40.000Z"
}
],
"count": 39.0
}

Each domain is an object ({}) with the following properties:

Name       Type     Description
valuetype  string   The type of the values held by this domain (for example “string”).
values     array    An array ([]) of values, where each value is an object ({}).
count      integer  The total number of values that exist in Freebase.
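
To make that shape concrete, here is a minimal sketch that rebuilds the fragment above out of plain Java maps and lists and reads the three members back. The class and method names (PropertyEntrySketch, sampleEntry, describe) are made up for illustration and are not part of the Freebase client library; a real response would come from a JSON parser rather than hand-built maps.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Freebase client library.
class PropertyEntrySketch {

    // Rebuild the "/type/object/name" fragment shown above as plain maps and lists.
    static Map<String, Object> sampleEntry() {
        Map<String, Object> value = new LinkedHashMap<>();
        value.put("text", "William Shakespeare");
        value.put("lang", "en");
        value.put("value", "William Shakespeare");
        value.put("creator", "/user/santiago_aguiar");
        value.put("timestamp", "2011-03-15T12:51:40.000Z");

        Map<String, Object> entry = new LinkedHashMap<>();
        entry.put("valuetype", "string");
        entry.put("values", List.of(value));
        entry.put("count", 39.0); // JSON numbers may arrive as doubles
        return entry;
    }

    // Summarize one entry: its value type, how many values were returned,
    // and how many the response says exist in total.
    @SuppressWarnings("unchecked")
    static String describe(String name, Map<String, Object> entry) {
        String valuetype = (String) entry.get("valuetype");
        List<Map<String, Object>> values = (List<Map<String, Object>>) entry.get("values");
        long count = ((Number) entry.get("count")).longValue();
        return String.format("%s: valuetype=%s, returned=%d, total=%d",
                name, valuetype, values.size(), count);
    }

    public static void main(String[] args) {
        System.out.println(describe("/type/object/name", sampleEntry()));
    }
}
```

Note that in the fragment “count” is 39 while only one value is present, so it is safer to loop over “values” than to trust “count” as the array length.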

The values array contains objects with a consistent form.

If the value type is “float”, “int”, “bool”, “datetime”, “uri”, “string”, “key”, or “object”, each object in the array will be of the following form.

Name       Type    Description
text       string
lang       string  The language of the value (for example “en”).
value      many    This can be one of the following:
                     • literal – a primitive such as a string, integer, or boolean
                     • foreign key – the id (a string) of a topic contained in another dataset
                     • topic reference – the id (a string) of another topic
creator    string  Id of the value’s creator.
timestamp  string  When the value was created.

There can be additional properties. For example, the domain /common/topic/description will add a “citation” object.

"/common/topic/description": {
"valuetype": "string",
"values": [
{
...
"citation": {
"provider": "Wikipédia",
"statement": "Description licensed under the Creative Commons Attribution-ShareAlike License (http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License).",
"uri": "http://fr.wikipedia.org/wiki/William_Shakespeare"
}
}
],
"count": 39.0
}

Things are straightforward when the value type is a literal, foreign key, or topic reference. In those cases “value” holds something that can be dealt with simply.
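
As a rough illustration of the simple case, the sketch below renders one value object by reading “value” and then tacking on the optional “lang” and citation “uri” members when they are present. SimpleValueSketch and render are hypothetical names, the value objects are plain maps built by hand, and the description text is a placeholder because the real text is elided in the fragment above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Freebase client library.
class SimpleValueSketch {

    // Render a simple value object: its "value", plus "lang" and the citation
    // "uri" when those optional members are present.
    static String render(Map<String, Object> valueObject) {
        StringBuilder out = new StringBuilder(String.valueOf(valueObject.get("value")));
        Object lang = valueObject.get("lang");
        if (lang != null) {
            out.append(" [").append(lang).append("]");
        }
        Object citation = valueObject.get("citation");
        if (citation instanceof Map) {
            out.append(" (source: ").append(((Map<?, ?>) citation).get("uri")).append(")");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> name = new LinkedHashMap<>();
        name.put("value", "William Shakespeare");
        name.put("lang", "en");
        System.out.println(render(name));

        Map<String, Object> citation = new LinkedHashMap<>();
        citation.put("uri", "http://fr.wikipedia.org/wiki/William_Shakespeare");
        Map<String, Object> description = new LinkedHashMap<>();
        description.put("value", "(description text)"); // placeholder, not real data
        description.put("citation", citation);
        System.out.println(render(description));
    }
}
```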

When the value type is “compound” things get a bit more interesting.

Name       Type    Description
text       string
lang       string  The language of the value (for example “en”).
id         string  The id of the topic contained in “property”.
creator    string  Id of the value’s creator.
timestamp  string  When the value was created.
property   object  A map of the properties of the topic identified by “id”.

A “compound” object nests another topic as a property value. The “value” property is gone and replaced by “id” and “property”. “id” holds the id of the topic described in “property”. This parallels the response that Topic returns when queried.
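
Because a compound value repeats the shape of the top-level response, a small recursive walk can handle both. The following is only a sketch under assumptions: the response is modelled as plain Java maps and lists, and the names CompoundWalkSketch, walk, entry and sample, the “/example/...” property names and the “/m/hypothetical” id are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Freebase client library.
class CompoundWalkSketch {

    // Collect "path = value" lines for every simple value under a property map,
    // descending into compound values through their nested "property" map.
    @SuppressWarnings("unchecked")
    static void walk(String prefix, Map<String, Object> propertyMap, List<String> out) {
        for (Map.Entry<String, Object> e : propertyMap.entrySet()) {
            Map<String, Object> entry = (Map<String, Object>) e.getValue();
            String path = prefix + e.getKey();
            boolean compound = "compound".equals(entry.get("valuetype"));
            for (Map<String, Object> v : (List<Map<String, Object>>) entry.get("values")) {
                if (compound) {
                    Object nested = v.get("property");
                    if (nested instanceof Map) {
                        walk(path + " -> ", (Map<String, Object>) nested, out);
                    }
                } else {
                    out.add(path + " = " + v.get("value"));
                }
            }
        }
    }

    // Build one domain entry with the given value type and values.
    static Map<String, Object> entry(String valuetype, List<Map<String, Object>> values) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("valuetype", valuetype);
        m.put("values", values);
        m.put("count", (double) values.size());
        return m;
    }

    // Invented sample: one compound value that nests one string value.
    static Map<String, Object> sample() {
        Map<String, Object> simple = new LinkedHashMap<>();
        simple.put("value", "inner value");

        Map<String, Object> nestedProps = new LinkedHashMap<>();
        nestedProps.put("/example/inner", entry("string", List.of(simple)));

        Map<String, Object> compound = new LinkedHashMap<>();
        compound.put("id", "/m/hypothetical");
        compound.put("property", nestedProps);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("/example/outer", entry("compound", List.of(compound)));
        return root;
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        walk("", sample(), out);
        out.forEach(System.out::println);
    }
}
```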

With that understanding in place I tried out the Topic API found in the Google Freebase client library. Here’s some sample code.


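import java.io.IOException;
import java.util.Arrays;

import com.google.api.client.http.HttpRequest;
import com.google.api.client.http.HttpRequestInitializer;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
// Freebase, TopicLookup and Property come from the Freebase client library;
// JacksonFactory comes from the google-http-client Jackson extension.

HttpTransport httpTransport = new NetHttpTransport();
JsonFactory jsonFactory = new JacksonFactory();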
HttpRequestInitializer httpRequestInitializer = new HttpRequestInitializer() {
  @Override
  public void initialize(HttpRequest request) throws IOException {}
};

Freebase.Builder fbb = new Freebase.Builder(httpTransport, jsonFactory, httpRequestInitializer);
fbb.setApplicationName("freebase-test");
Freebase freebase = fbb.build();

try {
 // mid is the machine id (MID) of the topic to look up, defined elsewhere
 Freebase.Topic.Lookup lookup = freebase.topic().lookup(Arrays.asList(mid));
 TopicLookup topic = lookup.execute();
 if (topic != null) {
  System.out.printf("topic: %s\n", topic.getId());

  Property property = topic.getProperty();
  if (property != null) {
   System.out.println(property);
  }
 }
} catch (IOException e) {
 e.printStackTrace();
}

So far so good. The client code executes correctly and I get back a valid response. But at this point I hit a wall. The Property class doesn’t do anything specific with the response. It’s a subclass of GenericJson, which means all the properties are held in a very generic way. I was a bit surprised by this given that there are classes (TopicPropertyvalue and TopicValue) in place to hold topic properties and values. Since this library is a work in progress, maybe the work hasn’t progressed that far.

I didn’t want to go slogging through maps and arrays, so I modified Property to use those classes. I changed its superclass from GenericJson to ArrayMap<String, TopicPropertyvalue>. With that in place I could now output the first value of all the topic’s properties quite simply …

// property is the modified Property returned by topic.getProperty()
// for each domain in the property map
for (String name : property.keySet()) {
 TopicPropertyvalue tpv = property.get(name); // get the domain's property value by name
 String valuetype = tpv.getValuetype();       // get the value type
 List<TopicValue> values = tpv.getValues();   // get the list of values
 if (values == null || values.isEmpty()) {
  continue;                                   // nothing to print for this domain
 }
 // translate the value type to the key needed to get the "property value"
 // this will translate to value, id, or property
 // (ValueType is my own helper enum, not part of the client library)
 String key = ValueType.valueOf(valuetype.toUpperCase()).getKey();
 Object value = values.get(0).get(key);       // from the first value object get the "primary value"

 System.out.printf("\t domain: %s primary value (%s): %s\n", name, valuetype, value);
}

Not too bad.
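
The ValueType enum used in the loop above isn’t shown in this post. A minimal sketch of what such an enum could look like follows; the mapping is a guess based on the tables earlier (simple types are read through “value”, compound values through “property”), and you could just as well map COMPOUND to “id”. The keyFor helper is hypothetical too.

```java
import java.util.Locale;

// Hypothetical reconstruction for illustration; the mapping is an assumption.
enum ValueType {
    FLOAT("value"), INT("value"), BOOL("value"), DATETIME("value"),
    URI("value"), STRING("value"), KEY("value"), OBJECT("value"),
    COMPOUND("property");

    private final String key;

    ValueType(String key) {
        this.key = key;
    }

    // The member name that holds the "primary value" for this value type.
    String getKey() {
        return key;
    }

    // Convenience lookup from the lower-case valuetype string in the response.
    // Throws IllegalArgumentException for a value type not listed above.
    static String keyFor(String valuetype) {
        return valueOf(valuetype.toUpperCase(Locale.ROOT)).getKey();
    }
}
```

Passing Locale.ROOT to toUpperCase keeps the lookup stable on machines whose default locale changes how letters are upper-cased (for example “int” under a Turkish locale).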

Categories: Bruce's Posts, Freebase