Friday, April 22, 2016

Results

Though preliminary, my experiments show promising results. I trained the model on the training split of Free917, then tested it against the test split:

Total:                276
Total not broken:     206
Total simple:         158
Total found:          97 (61% of simple)
Total not found:      61

I only count simple questions (easy object -> property questions) because it's too tedious to generate training data for non-simple sentences. It's important to note, however, that my method isn't bound to simple questions. The real takeaway is that this method for extracting the main entity from the sentence works 61% of the time on simple questions, which is higher than I was expecting.
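As a quick check on the arithmetic, here is a small Python sketch that recomputes the rates from the counts above. The variable names and the `rate` helper are my own, made up for illustration; they are not part of the training or test program.

```python
# Counts copied from the results listed above.
total = 276
not_broken = 206
simple = 158
found = 97
not_found = 61


def rate(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(part / whole * 100, 1)


# Found and not-found questions should add up to the simple questions.
assert found + not_found == simple

print("found / simple:     ", rate(found, simple))      # 61.4
print("not found / simple: ", rate(not_found, simple))  # 38.6
print("found / total:      ", rate(found, total))       # 35.1
```

The 61% figure is relative to the 158 simple questions only; measured against all 276 test questions, the same 97 hits come to about 35%.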

Looking at errors:
  • Question pattern lookup fail: The question pattern simply wasn't recorded in the training set. This can be solved by adding a few more questions to the training set to cover the missed cases.
  • Freebase search fail: The words extracted from the KDG are insufficient to find the correct entity on Freebase. This problem is much more difficult to solve.
  • Further K-parser errors: This problem is out of my control.
I have two options now: find out how to improve on the 61% as much as possible, or start working on the other half of the problem, which is getting the "property" out of the KDG. The latter is more interesting, but I'm not sure if it's even possible.
