Friday, March 11, 2016

A simple plan of action

I think I've pretty much figured out what I'm going to do. An important observation from the example questions provided in Free917 is that most of them simply ask for a property of an entity. For example, one of the questions asks "What was the cover price of X-Men issue 1?" All the computer needs to do here is find the entity that represents "X-Men issue 1" and access the property "cover price." Of course, translating the actual utterance "X-Men issue 1" into an entity is a whole other task, but once that's done the rest is trivial. More than half of the questions in Free917 are similar to this. This means that getting the answer to these questions is really simple: extract the entity from the KDG and then extract whatever property is being asked for. Here's an example:

Here is the KDG (in list form) of the question "What type of tea is gunpowder tea?" (I actually have no idea what gunpowder tea is, but this question illustrates the example well.)

root: E<is-5>
  agent: type-2
    product_of: tea-4
      instance_of: <tea>
        is_subclass_of: <food>
    trait: ?-1
      instance_of: <?>
        is_subclass_of: <object>
    instance_of: <type>
      is_subclass_of: <cognition>
  recipient: tea-7
    complement_word: gunpowder-6
      instance_of: <gunpowder>
    instance_of: <tea>
      is_subclass_of: <food>
  instance_of: <be>
    is_subclass_of: <stative>

From here, I can easily extract {tea, gunpowder} as the words describing the entity in question and {type, tea} as the property being requested of the entity. The same approach works for every question that asks for a property of an entity.
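The extraction described above can be sketched in a few lines of Python. This is only a rough illustration, not the actual implementation: the parser, the function names, and especially the rule "the branch containing the unknown `?` node is the property, the other branch is the entity" are my assumptions about how to read this particular KDG.

```python
import re

# The KDG from above, in list form (indentation encodes the tree).
KDG_TEXT = """\
root: E<is-5>
  agent: type-2
    product_of: tea-4
      instance_of: <tea>
        is_subclass_of: <food>
    trait: ?-1
      instance_of: <?>
        is_subclass_of: <object>
    instance_of: <type>
      is_subclass_of: <cognition>
  recipient: tea-7
    complement_word: gunpowder-6
      instance_of: <gunpowder>
    instance_of: <tea>
      is_subclass_of: <food>
  instance_of: <be>
    is_subclass_of: <stative>
"""

def parse_kdg(text):
    """Parse the indented list form into nested dicts: edge, value, children."""
    root = None
    stack = []  # (indent, node) pairs along the current path from the root
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        edge, value = (part.strip() for part in line.split(":", 1))
        node = {"edge": edge, "value": value, "children": []}
        # Pop back up until the top of the stack is this node's parent.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stack:
            stack[-1][1]["children"].append(node)
        else:
            root = node
        stack.append((indent, node))
    return root

WORD = re.compile(r"^(.+)-(\d+)$")  # word nodes look like "tea-4"

def words(node):
    """Collect words under a node, skipping <class> nodes and the unknown '?'."""
    found = set()
    match = WORD.match(node["value"])
    if match and match.group(1) != "?":
        found.add(match.group(1))
    for child in node["children"]:
        found |= words(child)
    return found

def has_unknown(node):
    """True if this subtree contains the unknown '?' word node."""
    return node["value"].startswith("?") or any(
        has_unknown(child) for child in node["children"]
    )

def split_question(root):
    """Assumed heuristic: the branch holding '?' is the property, the rest the entity."""
    entity, prop = set(), set()
    for child in root["children"]:
        if child["edge"] == "instance_of":
            continue  # class information about the verb itself
        if has_unknown(child):
            prop |= words(child)
        else:
            entity |= words(child)
    return entity, prop

entity_words, property_words = split_question(parse_kdg(KDG_TEXT))
print(sorted(entity_words))    # ['gunpowder', 'tea']
print(sorted(property_words))  # ['tea', 'type']
```

The heuristic is deliberately naive; a KDG where the unknown sits somewhere other than directly under a child of the root would need a smarter rule.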

Of course, there are different kinds of questions, and some of them are too difficult to deal with. One other kind that I can deal with, however, is "How many ...?" questions, which simply means querying for an entity and counting the number of results. There are still some questions, like "What is the average temperature in Sydney in August?", that cannot be answered in either of these ways.
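To make the two answerable kinds of questions concrete, here is a small sketch of how they could be handled once the entity words are known. Everything in it is hypothetical: `TOY_KB` is made-up placeholder data (not real Freebase content), and `find_entities` and `answer` are names I invented for illustration.

```python
# Made-up placeholder records standing in for a knowledge base.
TOY_KB = [
    {"name": "example tea a", "kind": "tea", "tea type": "placeholder-1"},
    {"name": "example tea b", "kind": "tea", "tea type": "placeholder-2"},
    {"name": "example comic 1", "kind": "comic", "cover price": "placeholder-3"},
]

def find_entities(kb, query_words):
    """Return the records whose name or kind contains every query word."""
    hits = []
    for record in kb:
        tokens = set(record["name"].split()) | {record["kind"]}
        if set(query_words) <= tokens:
            hits.append(record)
    return hits

def answer(kb, question_kind, entity_words, property_name=None):
    """Answer a 'property' or 'count' question; anything else is unsupported."""
    hits = find_entities(kb, entity_words)
    if question_kind == "count":
        # "How many ...?" is just the number of matching entities.
        return len(hits)
    if question_kind == "property":
        # Look up the requested property on each matching entity.
        return [r[property_name] for r in hits if property_name in r]
    raise ValueError(f"unsupported question kind: {question_kind}")

print(answer(TOY_KB, "count", {"tea"}))                            # 2
print(answer(TOY_KB, "property", {"comic", "1"}, "cover price"))   # ['placeholder-3']
```

A question like the average-temperature one fits neither branch, so in this sketch it would end up in the `ValueError` case.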

Update to the previous post about named-entity recognition:

I overlooked the actual Freebase website: http://www.freebase.com/. It provides a robust search feature that returns what you'd expect it to, and it also lists the properties of each entity nicely. The only question is whether it has a nice API to use. (API stands for Application Programming Interface, and it's one way for computer programs to use services provided online.)


4 comments:

  1. This comment has been removed by the author.

  2. Hey Sid, Can you not grow a dictionary of words or actions in the parser to act upon the type of questions being asked?

    Replies
    1. Words that determine the type of question (such as what, when, or who) are easy because there is a very small number of them. One paper I read (don't remember which, though) said that they had only 17 kinds of questions. Manually creating a lexicon for entities, however, is very tedious because there are thousands upon thousands of them.
