Sunday, March 20, 2016

Reading papers

I've pretty much run dry on ideas for improving the model I described in the previous post, so on the advice of the Ph.D. student I'm working with, I've been reading papers that address the same problem. Here are the main ones:

Large-scale Semantic Parsing via Schema Matching and Lexicon Extension: http://cis-linux1.temple.edu/~yates/papers/textual-schema-matching.pdf

Semantic Parsing via Paraphrasing: http://cs.stanford.edu/~pliang/papers/paraphrasing-acl2014.pdf

Semantic Parsing on Freebase from Question-Answer Pairs: http://cs.stanford.edu/~pliang/papers/freebase-emnlp2013.pdf

Enhancing Freebase Question Answering Using Textual Evidence (very recent): http://arxiv.org/abs/1603.00957

Other than that, there's really not much to say about the past week. I did spend a lot of time learning about machine learning (partly out of curiosity, but also because I may need it later in this project), specifically artificial neural networks (ANNs). On the surface the idea sounds simple, but I found the details difficult to wrap my mind around, even after reading quite a few introductory articles, including the Wikipedia page. The one that made it actually "click" was this one.

Friday, March 11, 2016

A simple plan of action

I think I've pretty much figured out what I'm going to do. An important observation from the example questions provided in Free917 is that most of them simply ask for a property of an entity. For example, one of the questions asks "What was the cover price of X-Men issue 1?" All the computer needs to do here is find the entity that represents "X-Men issue 1" and access the property "cover price." Of course, translating the actual utterance "X-Men issue 1" into an entity is a whole other task, but once that's done the rest is trivial. More than half of the questions in Free917 are similar to this. This means that getting the answer to these questions is really simple: extract the entity from the KDG and then extract whatever property is being asked for. Here's an example:

Here is the KDG (in list form) of the question "What type of tea is gunpowder tea?" (I actually have no idea what gunpowder tea is, but this question illustrates the idea well.)

root: E<is-5>
  agent: type-2
    product_of: tea-4
      instance_of: <tea>
        is_subclass_of: <food>
    trait: ?-1
      instance_of: <?>
        is_subclass_of: <object>
    instance_of: <type>
      is_subclass_of: <cognition>
  recipient: tea-7
    complement_word: gunpowder-6
      instance_of: <gunpowder>
    instance_of: <tea>
      is_subclass_of: <food>
  instance_of: <be>
    is_subclass_of: <stative>

From here, I can easily extract {tea, gunpowder} as the words describing the entity in question and {type, tea} as the property being requested of the entity. The same approach works for every question of this kind.
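To make the extraction step concrete, here is a minimal Python sketch. It assumes the list form shown above (two spaces of indentation per level, `label: value` on each line) and one rule of my own for illustration: the entity words are the `word-index` tokens under `recipient`, and the property words are those under `agent`, skipping the `?` placeholder. The function names `parse_kdg`, `child`, and `words_under` are hypothetical, not part of any existing tool, and the rule has only been checked against this one example.

```python
import re

KDG_TEXT = """\
root: E<is-5>
  agent: type-2
    product_of: tea-4
      instance_of: <tea>
        is_subclass_of: <food>
    trait: ?-1
      instance_of: <?>
        is_subclass_of: <object>
    instance_of: <type>
      is_subclass_of: <cognition>
  recipient: tea-7
    complement_word: gunpowder-6
      instance_of: <gunpowder>
    instance_of: <tea>
      is_subclass_of: <food>
  instance_of: <be>
    is_subclass_of: <stative>
"""

# Matches sentence tokens such as "tea-7"; class nodes such as "<tea>" do not match.
TOKEN = re.compile(r"^(?P<word>.+)-(?P<index>\d+)$")


def parse_kdg(text):
    """Parse the indented list form into nested dicts with label, value, children."""
    root = None
    stack = []  # (indent, node) pairs for the current path from the root
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        label, _, value = line.strip().partition(": ")
        node = {"label": label, "value": value, "children": []}
        # Pop back up until the top of the stack is this node's parent.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stack:
            stack[-1][1]["children"].append(node)
        else:
            root = node
        stack.append((indent, node))
    return root


def child(node, label):
    """Return the first direct child with the given edge label, or None."""
    return next((c for c in node["children"] if c["label"] == label), None)


def words_under(node):
    """Collect word-index tokens in the subtree, in sentence order, skipping '?'."""
    found = []

    def visit(n):
        match = TOKEN.match(n["value"])
        if match and match.group("word") != "?":
            found.append((int(match.group("index")), match.group("word")))
        for c in n["children"]:
            visit(c)

    visit(node)
    return [word for _, word in sorted(found)]


root = parse_kdg(KDG_TEXT)
entity_words = words_under(child(root, "recipient"))
property_words = words_under(child(root, "agent"))
```

Sorting by the token index keeps the words in the order they appear in the question, which is convenient if they are later joined into a search string.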

Of course, there are different kinds of questions, and some of them are too difficult to deal with. One other kind that I can handle is the "How many ...?" question, which amounts to querying for an entity and counting the number of results. Some questions, however, like "What is the average temperature in Sydney in August?", cannot be answered in this way.
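The two question kinds I can handle could be dispatched roughly as in the sketch below. Everything in it is an assumption for illustration: `TOY_KB` is a made-up in-memory list with placeholder names and values (not real Freebase data), matching is a naive word-subset test on the entity name, and `classify`, `find_entities`, and `answer` are hypothetical helper names.

```python
# Hypothetical toy data: names and values are placeholders, not Freebase content.
TOY_KB = [
    {"name": "alpha comic issue 1", "cover price": "price-A"},
    {"name": "alpha comic issue 2", "cover price": "price-B"},
    {"name": "beta tea", "tea type": "type-C"},
]


def classify(question):
    """Label a question as a 'count' query or a plain 'property' lookup."""
    return "count" if question.lower().startswith("how many") else "property"


def find_entities(kb, entity_words):
    """Return entities whose name contains every one of the given words."""
    wanted = {w.lower() for w in entity_words}
    return [e for e in kb if wanted <= set(e["name"].lower().split())]


def answer(kb, question, entity_words, property_name=None):
    """Count the matches for 'How many' questions, else read off the property."""
    matches = find_entities(kb, entity_words)
    if classify(question) == "count":
        return len(matches)
    return [e[property_name] for e in matches if property_name in e]
```

A question like the Sydney one fits neither branch, since it needs an aggregate over values rather than a lookup or a count.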

Update to the previous post about named-entity recognition:

I overlooked the actual Freebase website: http://www.freebase.com/. It provides a robust search feature that returns what you'd expect it to, and it also lists the properties of each entity nicely. The only question is whether it has a nice API to use. (API stands for Application Programming Interface, and it's one way for computer programs to use services provided online.)


Friday, March 4, 2016

The challenge of named-entity recognition

One of the biggest challenges that I face is named-entity recognition. I'll illustrate the meaning of the term with an example.

Imagine you have a list of directions written in a language you do not know. One of them reads:
Gok bok nok
You haven't the slightest clue as to what this means, until someone tells you "gok bok" roughly translates to "get." Now you just have to figure out what "nok" means and retrieve it, so you go to the library and ask the person working there for "nok." They reply with the question "Brok nok ork grok nok?" Your limited capabilities with this language tell you that the clerk is asking you to specify which kind of nok you want, as there are two different meanings of the word, one relating to "brok" and one to "grok."

You have no idea which one you need, so you go back to the person who gave you the instructions to get some context. However, he just gives you more gibberish that you have to translate and somehow relate to the words "grok" and "brok," the two different kinds of nok, to find out which nok you need.

It should be clear where this is going; you are the computer trying to figure out what an English sentence means and you've been thrown an ambiguous word. In the question "Who is the pitcher for the Chicago Cubs?" the meaning of the word "pitcher" might seem obvious to someone who knows what the Chicago Cubs are, and it's possible for computers to deal with this kind of ambiguity (is it a tool for holding liquids or a baseball player?) but not without significant work.

However, in my case I face another challenge on top of this. In the above example I assumed that you had access to a library that knew the meanings of the word "nok." I've been looking for such a library, and I've come across two serious candidates. I need two things from it: complete search results (I should find what I'm looking for), and a link from each result to its counterpart in Freebase.

The first is Google's Knowledge Graph (wiki) which Google claims is the successor of Freebase (which I find dubious, for reasons outside the scope of this post). Put mildly, its searching capability is miserable. Here are the results obtained from searching for "First Amendment" (an entity I know for a fact exists in Freebase):
First amendment, Book by Ashley McConnell (score 142.049927)
First Amendment, Song by Silent Civilian (score 141.980042)
First Amendment, Musical Group (score 141.068527)
First Amendment, TV Episode (score 137.776077)
First Amendment, Song by Silent Civilian (score 126.532188)
Where's the actual amendment?! Unless there's something I'm completely missing, Google's Knowledge Graph does not contain the information I need. This is saddening because it is the closest knowledge base to Freebase that is easy to use: each result that I get is directly linked to the corresponding entity in Freebase. The other option I tried has a much better search yet lacks this.

It's called "WordNet" and I am content with the amount of information it has. The search for "First Amendment" returns the result I'm looking for:
  • S: (n) First Amendment (an amendment to the Constitution of the United States guaranteeing the right of free expression; includes freedom of assembly and freedom of the press and freedom of religion and freedom of speech)
However, since I need to go from "First Amendment" to the correct entity in Freebase (something Google's product allows but this one does not), it will be difficult to use this service. Simply knowing the definition will not help.

There is a third option: perform a direct search on Freebase itself. However, getting a copy of Freebase up and running is not something I can do with this laptop. Here is an excerpt from the instructions:
make sure you have at least 60GB of memory
Doesn't look like the 8GB I have here is going to do. I'm currently working to get access to a computing cluster at ASU that will be able to put up a copy of Freebase.