Comp Ling Glossary


  • context-free grammar (CFG) - in formal language theory, a formal grammar in which every production rule has the form A -> b, where A is a single nonterminal and b is a string of zero or more terminals and/or nonterminals.
    • linguistics sometimes calls them phrase structure grammars
    • comp sci often uses Backus-Naur Form (BNF) for CFGs.
      <symbol> ::= __expression__
      <US-address> ::= <name> <street> <zip> "USA"
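    • A minimal sketch of how a derivation works, assuming a toy grammar stored as a Python dict. The right-hand sides for name, street, and zip are invented for illustration, and GRAMMAR, expand, and pick are hypothetical names, not part of any library.

```python
# Toy grammar: each nonterminal maps to a list of alternative productions.
# Any symbol that is not a key in the dict is treated as a terminal.
# Only the US-address rule comes from the notes; the rest is made up.
GRAMMAR = {
    "US-address": [["name", "street", "zip", "USA"]],
    "name": [["Ada", "Lovelace"]],
    "street": [["1", "Main", "St"]],
    "zip": [["12345"]],
}


def expand(symbol, grammar, pick=lambda options: options[0]):
    """Rewrite a symbol into a list of terminals by applying production rules."""
    if symbol not in grammar:
        # Terminal: no rule rewrites it, so it appears in the output as-is.
        return [symbol]
    production = pick(grammar[symbol])  # choose one alternative for this nonterminal
    terminals = []
    for sym in production:
        terminals.extend(expand(sym, grammar, pick))
    return terminals


sentence = " ".join(expand("US-address", GRAMMAR))
print(sentence)
```

      Each rule rewrites exactly one nonterminal regardless of what surrounds it, which is what makes the grammar context free.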
  • perplexity - a measure of how well a probability model predicts test data; lower perplexity means better prediction.
    • I read that the lowest published perplexity for a trigram model on the Brown Corpus (as of 1992) is about 247 per word.
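    • A perfect model, one that assigns probability 1 to every test item, would have a perplexity of 1 (a cross-entropy of 0 bits). That is the lower bound; perplexity is never 0.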
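    • A minimal sketch of the calculation, assuming the usual definition of perplexity as 2 raised to the average negative log2 probability the model assigns to each test item. The function name and the sample probabilities are made up for illustration.

```python
import math


def perplexity(probs):
    """Perplexity from the probabilities a model assigned to each test item."""
    n = len(probs)
    # Cross-entropy in bits per item: average negative log2 probability.
    cross_entropy = -sum(math.log2(p) for p in probs) / n
    return 2 ** cross_entropy


# A model that spreads probability evenly over 4 choices at every step.
uniform = perplexity([0.25] * 8)
# A model that predicts every test item with certainty.
certain = perplexity([1.0] * 8)
print(uniform, certain)
```

      The uniform model comes out at 4, matching the intuition that perplexity is the effective number of equally likely choices per step, and the certain model reaches the floor of 1.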
  • precision & recall - in information retrieval (IR), precision is the fraction of retrieved instances that are relevant, and recall is the fraction of relevant instances that are retrieved.
    • Precision is at its maximum (1.0) when there are no false positives, which favors being conservative about declaring a match.
    • Recall is at its maximum (1.0) when there are no false negatives, which favors being exhaustive and thorough.
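    • A minimal sketch of both measures computed from sets of document IDs. The IDs and the function name are hypothetical, and returning 0.0 for an empty set is just one possible convention.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and a set of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)  # retrieved AND relevant
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# 4 documents retrieved, 3 actually relevant, 2 in common.
p, r = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
print(p, r)
```

      Here d3 and d4 are false positives (they lower precision to 2/4) and d5 is a false negative (it lowers recall to 2/3).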