Final project ideas

Here's a list of possible final project ideas, all of which would be fruitful to pursue.

You needn't limit yourself to items from this list. Feel free to modify or blend these questions, or to develop your own question, deriving from material we covered or work you're doing independently.

I tend to imagine the final project taking roughly the form of one of the section pages at this site, with code and description interwoven: a less structured form of literate programming. However, I am open to other forms. Whatever gets your ideas across. I do, though, urge you to find ways to be open about your data and methods, aiming to make it straightforward for others to reproduce your analyses, visualizations, and results.

SCALES We discussed three methods for obtaining lexical scales based (loosely speaking) on sentiment strength:

Compare these methods. A sample of questions you might address: What are their strengths and weaknesses? Are there tasks for which we might favor (or disfavor) one of them over the others? Could they be combined effectively (say, via clustering, or in a MaxEnt model)?

COMPOSITION One of the shortcomings of the IQAP experiments is that they do not do proper compositional analysis. Rather, they turn the terminals inside the -CONTRAST nodes into bags of words and then do ad hoc comparisons. Define functions that implement some aspects of compositional analysis, and evaluate them in the context of the deterministic model or the MaxEnt model (or some variation thereof).

(For one approach to compositional analysis, see my talk with Alex Djalali at the Semantics for Textual Inference Workshop.)
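
As a small illustration of the difference between a bag-of-words treatment and a compositional one, the sketch below scores a phrase encoded as a nested tuple and lets a negator flip the score of its sister constituents. The tree encoding, the toy lexicon, and its scores are all assumptions made up for this example; none of it comes from the IQAP code or data.

```python
# Toy lexicon and negator list: invented for illustration only.
TOY_SCORES = {"good": 1.0, "great": 2.0, "bad": -1.0}
NEGATORS = {"not", "never"}

def leaves(tree):
    """Flatten a nested-tuple tree into its list of terminal strings."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def bag_of_words_score(tree):
    """Ignore structure: sum the lexicon scores of all terminals."""
    return sum(TOY_SCORES.get(word, 0.0) for word in leaves(tree))

def compositional_score(tree):
    """Score bottom-up; a negator among a node's daughters flips that node's total."""
    if isinstance(tree, str):
        return TOY_SCORES.get(tree, 0.0)
    negated = any(isinstance(child, str) and child in NEGATORS for child in tree)
    total = sum(compositional_score(child) for child in tree)
    return -total if negated else total

phrase = ("not", ("very", "good"))
bow = bag_of_words_score(phrase)
comp = compositional_score(phrase)
```

Here the bag-of-words score treats "not very good" as positive, while the compositional score comes out negative. A real project would need a principled treatment of scope, intensifiers, and the like.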

QMAXENT There are a host of question-type dialog tags in the SwDA. To what extent are they distinguishable and predictable?

  1. Select a coherent subset of the q-type tags, taking into consideration their relationships as well as more practical issues like the number of tokens representing each. Feel free to pool them into groups (e.g. rhetorical vs. information-seeking). Keep in mind that large imbalances in class sizes can be problematic for classifiers.
  2. Build a function that maps swda.Utterance instances to feature dictionaries. (The function will resemble this one but be defined for swda.Utterance instances rather than iqap.Item instances.) Motivate the subfunctions that make up your function.
  3. Modify IqapClassifier so that it works with swda.CorpusReader instances rather than iqap.IqapCorpusReader instances. (This is a thorough overhaul but the changes are pretty straightforward. An ambitious but useful extension: create a class that works with a wide range of corpus types.)
  4. Run an experiment with your classifier, report the results, and do some error analysis, searching in particular for places where the model seems to fail.

This project has roughly the scope of an ACL paper, which is a lot of work but also makes it a candidate for submission.
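
The feature function in step 2 might look something like the following sketch. It is a minimal illustration, not the course code: FakeUtterance is a hypothetical stand-in for swda.Utterance that carries only a tag and a token list, and the subfunction names and word lists are assumptions chosen for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for swda.Utterance: only a tag and a list of tokens.
FakeUtterance = namedtuple("FakeUtterance", ["act_tag", "words"])

# Small illustrative word lists (assumptions, not exhaustive).
WH_WORDS = {"who", "what", "when", "where", "why", "how", "which"}
AUXILIARIES = {"do", "does", "did", "is", "are", "was", "were",
               "can", "could", "will", "would", "have", "has"}

def initial_word_features(words):
    """Does the utterance open with a wh-word or an auxiliary?"""
    first = words[0].lower() if words else ""
    return {"first_is_wh": first in WH_WORDS,
            "first_is_aux": first in AUXILIARIES}

def unigram_features(words):
    """One boolean feature per lowercased token."""
    return {"word=" + w.lower(): True for w in words}

def utterance_features(utt):
    """Combine the subfunctions into a single feature dictionary."""
    feats = {}
    feats.update(initial_word_features(utt.words))
    feats.update(unigram_features(utt.words))
    feats["length"] = len(utt.words)
    return feats

example = FakeUtterance(act_tag="qy", words=["Do", "you", "like", "jazz", "?"])
features = utterance_features(example)
```

Paired with their labels, dictionaries of this shape are the kind of input that NLTK-style classifier trainers accept. Keeping each subfunction separate makes it easier to motivate it and to test its contribution by leaving it out.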

CLAUSETYPING The SwDA clause-typing experiment looked at the extent to which polar interrogative forms match pragmatic function (dialog act). Conduct and describe a similar experiment for another (clause-type, dialog-act) pair, and explain how the results are relevant for research concerning the relationship between syntactic structures and pragmatic functions.

CLAUSETYPES2ACTS This exploratory section of the clause-typing page defines speaker and hearer perspectives on the associations between clause-types (syntax) and dialog acts (pragmatics). It seems there should be some deep points of alignment between these two distributions. After all, we are all both speakers and hearers, and part of playing those roles is anticipating what the person in the other role will do and then adjusting one's behavior (in production or generation) accordingly. What is the precise nature of this alignment? (This is a very open-ended and vague question; part of the project would involve making it more precise and manageable.)
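
One way to start making the question concrete is to derive both conditional distributions from a single table of joint counts. The sketch below assumes that the speaker perspective is P(clause type | dialog act) and the hearer perspective is P(dialog act | clause type). That reading, the function name, the labels, and the toy counts are assumptions for illustration; the counts are not SwDA figures.

```python
from collections import defaultdict

def conditional(joint_counts, given_index):
    """Return P(other | given) from counts keyed by (clause_type, act) pairs.

    given_index=0 conditions on clause type; given_index=1 conditions on act.
    """
    totals = defaultdict(float)
    for pair, count in joint_counts.items():
        totals[pair[given_index]] += count
    dist = defaultdict(dict)
    for pair, count in joint_counts.items():
        given, other = pair[given_index], pair[1 - given_index]
        dist[given][other] = count / totals[given]
    return dict(dist)

# Toy counts, invented for illustration only.
toy_counts = {
    ("polar-interrogative", "yes-no-question"): 80,
    ("polar-interrogative", "statement"): 20,
    ("declarative", "yes-no-question"): 20,
    ("declarative", "statement"): 180,
}

hearer = conditional(toy_counts, given_index=0)   # P(act | clause type)
speaker = conditional(toy_counts, given_index=1)  # P(clause type | act)
```

Both tables come from the same counts, so any systematic alignment between them has to show up in how the joint distribution is shaped. Comparing the two row by row is one place to begin.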

CLUSTER The SwDA clustering section used the k-means algorithm to analyze what dialog act tags can tell us about word meanings. As noted in the analysis section, the hard-clustering that k-means performs is not ideal, since it cannot handle ambiguity. Pantel and Lin (2002) describe a non-hierarchical soft-clustering algorithm that might be better for this purpose. Implement their algorithm and informally assess how it does. (If your implementation extends SwdaWordTagClusterer in swda_wordtag_clusterer.py, then you should be able to use its matrix-building functionality.)
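
To illustrate what soft clustering offers over k-means, here is a minimal sketch of a soft assignment step in which a word joins every cluster whose centroid is similar enough to its vector. This is not Pantel and Lin's algorithm, which also discovers the clusters themselves and removes from a word's vector the features it shares with each cluster it is assigned to. The sketch only shows the multiple-membership idea. The threshold, vectors, and names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def soft_assign(vector, centroids, threshold=0.5):
    """Return the sorted names of all centroids at least `threshold` similar to vector."""
    return sorted(name for name, centroid in centroids.items()
                  if cosine(vector, centroid) >= threshold)

# Toy centroids and word vectors, invented for illustration only.
centroids = {"question": [1.0, 0.0, 0.0], "statement": [0.0, 1.0, 0.0]}
ambiguous_word = [1.0, 1.0, 0.0]
clear_word = [1.0, 0.1, 0.0]

ambiguous_clusters = soft_assign(ambiguous_word, centroids)
clear_clusters = soft_assign(clear_word, centroids)
```

With these toy vectors the ambiguous word lands in both clusters and the clear one in just one, which is the behavior that hard clustering rules out.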

NONCANON What is the relationship between non-canonical clause types and discourse coherence?

  1. Write a function that identifies the clause type or types you are interested in, and evaluate the function for precision and recall.
  2. Select an annotation from the PDTB 2.0 that you want to predict. The likely contenders are, I think, Relation, ConnHeadSemClass1, or some clustering of conn_str(), but others might be worth looking at too.
  3. Explore the correlation between the values of your clause-typing function and the PDTB variable you chose. To what extent are they related? (Or, to what extent is their relationship evident from your analysis?) Are there areas (combinations of elements) for which the association is very strong (weak)?
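
Steps 1 and 3 can be sketched with two small helpers: one computing precision and recall for a boolean clause-type detector against hand labels, and one computing observed-over-expected ratios for pairs of (clause-type value, PDTB value). The function names, the clause-type labels, and the toy data are assumptions for illustration. Observed-over-expected is just one simple association measure among many that you might choose.

```python
from collections import Counter

def precision_recall(predicted, gold):
    """Precision and recall for parallel lists of booleans."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def observed_over_expected(pairs):
    """Map each attested (x, y) pair to observed count / count expected under independence."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    return {(x, y): joint[(x, y)] / (left[x] * right[y] / n) for (x, y) in joint}

# Toy data, invented for illustration only.
gold = [True, True, True, False, False, False]
predicted = [True, True, False, True, False, False]
precision, recall = precision_recall(predicted, gold)

toy_pairs = [("fronted", "Explicit"), ("fronted", "Explicit"),
             ("canonical", "Implicit"), ("canonical", "Explicit")]
ratios = observed_over_expected(toy_pairs)
```

Ratios well above 1 mark combinations that occur more often than independence would predict, and ratios well below 1 mark combinations that are avoided. With real data you would also want to check whether the counts are large enough for the ratios to be trustworthy.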