Natural Language Processing – Project – Spring 2007
Initial Project Guidelines
The goal is to apply NLP techniques to a corpus.
The objective of the exercise must be clearly specified.
The text to be analyzed can be obtained from any (legal) source.
IM, web pages, and on-line text are all fair game.
All projects will be individual projects.
You can choose whatever technology you are comfortable with: .NET, Java, Perl,
Lisp, Prolog, Python. However, you need to get my approval
for whatever you plan to use.
Analyze on-line news stories to determine the top topic of the day.
Build a newspaper based on user interests.
Extract clinical concepts from medical dictation.
Find the appropriate technical report that matches a user's interests.
Analyze IM to identify the major topics discussed.
Analyze press releases to determine what each press release is about.
Detect spam email and identify it as spam.
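To give a sense of the intended scale, the first idea above (the top topic of the day) could start from something as simple as a word count. The sketch below is in Python and is only a starting point, not a required approach. It assumes the news stories are already available as plain strings; the function name `top_topics` and the small stop-word list are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real project would use a fuller one.
STOP_WORDS = {
    "the", "a", "an", "of", "to", "in", "on", "and", "is", "are",
    "for", "at", "by", "with", "as", "after", "ahead",
}


def top_topics(stories, n=3):
    """Return the n most frequent non-stop-word terms across all stories."""
    counts = Counter()
    for story in stories:
        # Lowercase the text and keep only alphabetic tokens.
        tokens = re.findall(r"[a-z]+", story.lower())
        # Skip stop words and very short tokens.
        counts.update(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return [term for term, _ in counts.most_common(n)]


if __name__ == "__main__":
    sample = [
        "Election results are due tonight as the election count continues.",
        "Markets rally ahead of the election.",
        "Local team wins after the election day parade.",
    ]
    print(top_topics(sample, n=1))
```

Raw frequency is a crude proxy for "topic"; a project would be expected to go further, for example with stemming, phrase detection, or weighting against a background corpus.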
You need to identify your project area by Feb 15, 2007. Submit a half-page proposal on
what you plan to do.
Prepare a plan of action and get my concurrence by Feb 28, 2007. At a minimum, the plan
should include bi-weekly progress milestones.
Every two weeks, submit a half-page progress report.
Give a demo of the system right after spring break.
Give a demo of the system at the end of the semester.