2ndLectureNotes

Page history last edited by mike@mbowles.com 13 years, 3 months ago

We'll talk some more about latent semantic indexing (LSI), using the Deerwester paper.  Here's a link:

DeerwesterJASIS90.pdf
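
Here's a small sketch of the core LSI idea, in case it helps to see it run before class. It is not taken from the paper or from the class files: the five-term corpus is invented, and the helper names (truncated_svd, fold_in, cosine) are hypothetical. To stay dependency-free it uses power iteration with deflation in place of a library SVD routine (in R you would simply call svd()). Documents are represented by the rows of V_k, and a query is folded into the same space as q^T U_k S_k^{-1}.

```python
import math

# Invented toy corpus: "car" and "auto" never co-occur, but both occur with "engine".
terms = ["car", "auto", "engine", "flower", "petal"]
docs = ["car engine", "auto engine", "flower petal", "flower"]
# Term-document matrix A: rows are terms, columns are documents (0/1 entries).
A = [[1.0 if t in d.split() else 0.0 for d in docs] for t in terms]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(M, v):
    return [dot(row, v) for row in M]


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def truncated_svd(A, k, iters=500):
    """Top-k (sigma, u, v) triplets via power iteration on A^T A with deflation.

    A stand-in for a library SVD; fine for a toy matrix, not for real corpora.
    """
    cols = [list(c) for c in zip(*A)]           # columns of A = documents
    n = len(cols)
    G = [[dot(ci, cj) for cj in cols] for ci in cols]  # Gram matrix A^T A
    triplets = []
    for _ in range(k):
        v = normalize([1.0 + i for i in range(n)])     # arbitrary start vector
        for _ in range(iters):
            v = normalize(matvec(G, v))
        lam = dot(v, matvec(G, v))               # eigenvalue of A^T A
        sigma = math.sqrt(lam)                   # singular value of A
        u = [x / sigma for x in matvec(A, v)]    # left singular vector
        triplets.append((sigma, u, v))
        # Deflate: remove the component just found so the next pass finds the next one.
        G = [[G[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return triplets


def fold_in(q, triplets):
    """Map a term-space vector into the latent space: q^T U_k S_k^{-1}."""
    return [dot(u, q) / sigma for sigma, u, _ in triplets]


def cosine(x, y):
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))


triplets = truncated_svd(A, k=2)
# Row j of V_k is the latent representation of document j.
doc_vecs = [[v[j] for _, _, v in triplets] for j in range(len(docs))]
query = [1.0 if t == "car" else 0.0 for t in terms]
q_hat = fold_in(query, triplets)
sims = [cosine(q_hat, d) for d in doc_vecs]
```

With k=2 the query "car" ends up essentially as close to "auto engine" as to "car engine", even though the query shares no term with the second document, and it stays nearly orthogonal to the two flower documents. That is the effect the paper is after: terms that occur in similar contexts get pulled together in the reduced space.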

 

We'll also talk about the Porter stemming algorithm:

defPorter.txt
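
To give a flavor of how the rules work, here's a sketch of just Step 1a of the Porter algorithm (the plural-suffix rules), written in Python rather than the R used in class. The function name porter_step_1a is hypothetical, input is assumed to be lowercase, and this is only the first of the algorithm's steps, not a full stemmer.

```python
def porter_step_1a(word):
    """Apply only Step 1a of the Porter stemmer to a lowercase word.

    Rules are tried longest suffix first; the first match wins.
    """
    if word.endswith("sses"):
        return word[:-2]   # SSES -> SS   (caresses -> caress)
    if word.endswith("ies"):
        return word[:-2]   # IES  -> I    (ponies -> poni)
    if word.endswith("ss"):
        return word        # SS   -> SS   (caress -> caress)
    if word.endswith("s"):
        return word[:-1]   # S    -> ""   (cats -> cat)
    return word


examples = ["caresses", "ponies", "ties", "caress", "cats"]
stems = [porter_step_1a(w) for w in examples]
```

Note that the stems need not be dictionary words ("ponies" becomes "poni"). The goal is only that related word forms collapse to the same string.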

 

A complete book on information retrieval is available online.  It covers many of the preparatory steps we'll discuss in detail, and it gives another angle on LSI and on using the SVD to regularize text searching.

http://nlp.stanford.edu/IR-book/
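
As an illustration of the kind of preparatory steps involved, here's a minimal Python sketch that lowercases text, tokenizes it, drops stop words, and builds a term-document count matrix. It is an assumption about which steps matter, not code from the book or from class: the stop-word list is a tiny made-up one, and the function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in"}


def tokenize(text):
    """Lowercase, keep alphabetic runs, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts vocab[i] in docs[j]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[t] for c in counts] for t in vocab]
    return vocab, matrix


vocab, matrix = term_document_matrix(["The cat sat.", "A cat and a dog."])
```

The rows of the resulting matrix are terms and the columns are documents, which is the layout the SVD step in LSI starts from.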

 

We'll go through some code in class.  Here are the .R files:

porter_Rstem.R

porter_snow.R

tmExamp.R

oNLP.R

 

Here's an exercise you can work on to practice the tools and techniques we've talked about so far:

MLText-HW1.txt

 

Here's the recording of the second class:

https://datamining.webex.com/datamining/ldr.php?AT=pb&SP=MC&rID=97962907&rKey=5e7e3307ca626d85
