2ndLectureNotes

Page history last edited by mike@mbowles.com 12 years, 11 months ago

We'll talk some more about LSI (latent semantic indexing) using the Deerwester paper. Here's a link:

DeerwesterJASIS90.pdf

 

We'll also talk about the Porter stemming algorithms:

defPorter.txt
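To give a feel for what a stemming rule looks like, here is a minimal sketch of only the first step (step 1a) of the Porter stemmer, which handles plural suffixes. It is written in Python rather than the R used in the class files, the function name `porter_step_1a` is made up for this illustration, and the full algorithm has several more steps that depend on a measure of the stem that this sketch does not implement.

```python
def porter_step_1a(word):
    """Apply step 1a of the Porter stemmer: strip plural suffixes.

    Rules are tried in order and only the first (longest) match fires:
        SSES -> SS
        IES  -> I
        SS   -> SS
        S    -> (removed)
    """
    word = word.lower()
    if word.endswith("sses"):
        return word[:-2]      # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]      # ponies -> poni
    if word.endswith("ss"):
        return word           # caress -> caress (unchanged)
    if word.endswith("s"):
        return word[:-1]      # cats -> cat
    return word


for w in ["caresses", "ponies", "ties", "caress", "cats"]:
    print(w, "->", porter_step_1a(w))
```

Note that the output is a stem, not necessarily a dictionary word ("ponies" becomes "poni"). That is fine for retrieval, since the goal is only that related word forms map to the same string.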

 

There's a complete book on information retrieval that's available online. It covers many of the preparatory steps that we'll discuss in detail, and it gives another angle on LSI and on using the SVD to regularize text searching.

http://nlp.stanford.edu/IR-book/
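As a rough illustration of the LSI idea, the sketch below builds a term-document count matrix from five made-up documents, keeps only the top two singular directions, and compares documents in that reduced space. It is in Python rather than the R used in the class files, uses only the standard library, and gets the singular values and right singular vectors from a simple power iteration on A^T A instead of a real SVD routine. The documents, function names, and the choice of k = 2 are all assumptions for illustration, not taken from the paper or the book.

```python
import math

# Toy corpus (made up): three documents about cars, two about baking.
DOCS = [
    "car engine repair",
    "automobile engine repair",
    "car automobile dealer",
    "bread flour recipe",
    "bread oven recipe",
]

def term_document_matrix(docs):
    """Rows are terms, columns are documents, entries are raw counts."""
    tokens = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    return vocab, [[doc.count(term) for doc in tokens] for term in vocab]

def gram(matrix):
    """Return A^T A: the document-by-document matrix of term overlaps."""
    n_docs = len(matrix[0])
    return [[sum(row[i] * row[j] for row in matrix) for j in range(n_docs)]
            for i in range(n_docs)]

def top_eigenpairs(sym, k, iters=500):
    """Top-k eigenpairs of a symmetric PSD matrix via power iteration
    with deflation. Assumes the all-ones start vector is not orthogonal
    to the eigenvectors being sought, which holds for this toy corpus."""
    n = len(sym)
    m = [[float(x) for x in row] for row in sym]
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _step in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next pass.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return pairs

def lsi_doc_vectors(docs, k):
    """Coordinates of each document in the k-dimensional latent space."""
    _, a = term_document_matrix(docs)
    pairs = top_eigenpairs(gram(a), k)
    # Singular value = sqrt(eigenvalue of A^T A); max() guards rounding.
    return [[math.sqrt(max(lam, 0.0)) * v[j] for lam, v in pairs]
            for j in range(len(docs))]

def cosine(u, v):
    """Cosine similarity, defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = lsi_doc_vectors(DOCS, k=2)
print("car/repair vs car/dealer:", cosine(vecs[0], vecs[2]))
print("car/repair vs bread/flour:", cosine(vecs[0], vecs[3]))
```

On raw term counts, "car engine repair" and "car automobile dealer" share only one word, so their cosine similarity is 1/3. In the rank-2 space they end up pointing in nearly the same direction, while the baking documents stay nearly orthogonal to them. That smoothing of word-level mismatches is the sense in which the truncated SVD regularizes the search.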

 

We'll go through some code in class. Here are the .R files:

porter_Rstem.R

porter_snow.R

tmExamp.R

oNLP.R

 

Here's a homework exercise for practicing the tools and techniques that we've talked about so far:

MLText-HW1.txt

 

Here's the recording of the second class:

https://datamining.webex.com/datamining/ldr.php?AT=pb&SP=MC&rID=97962907&rKey=5e7e3307ca626d85
