In the last homework you combined the "acquisition" articles and the "crude" articles from the Reuters document set and then tried to cluster them based on the tf-idf matrix. Now use the tf-idf matrix to do two things.
1. Put together an LDA model with 10 topics. Have a look at the topics and see if you can identify which ones correspond to the original classifications of the documents.
2. Generate a supervised LDA model with 10 topics, using +/- 1 labels corresponding to whether the article is from "crude" or "acquisition". How do the topics change compared with the unsupervised model in part 1?
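As a starting point for part 1, here is a minimal sketch of unsupervised LDA fitted with a collapsed Gibbs sampler, using only the Python standard library. Everything in it is an assumption made for illustration: the toy documents and labels are made up, the function names (fit_lda, top_words, topic_label_means) and the hyperparameters alpha and beta are hypothetical, and it uses 2 topics so the toy output stays readable (use n_topics=10 on the real corpus). It also samples over raw tokens rather than tf-idf weights, because a Gibbs sampler needs integer counts, so it is not the tf-idf variant asked for above.

```python
import random


def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # times word v is in topic k
    topic_total = [0] * n_topics                           # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequent words of each topic."""
    result = []
    for counts in topic_word:
        order = sorted(range(len(vocab)), key=lambda v: -counts[v])
        result.append([vocab[v] for v in order[:n]])
    return result


def topic_label_means(doc_topic, labels):
    """Mean +/-1 label of the documents whose dominant topic is k (None if no such document)."""
    n_topics = len(doc_topic[0])
    sums = [0.0] * n_topics
    counts = [0] * n_topics
    for row, y in zip(doc_topic, labels):
        k = max(range(n_topics), key=row.__getitem__)
        sums[k] += y
        counts[k] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Made-up toy corpus: +1 stands for "crude", -1 for "acquisition".
docs = [
    "oil barrel opec crude price oil".split(),
    "crude oil output opec barrel".split(),
    "company shares merger acquire stake".split(),
    "merger stake company acquire shares offer".split(),
]
labels = [1, 1, -1, -1]

vocab, doc_topic, topic_word = fit_lda(docs, n_topics=2)
print(top_words(vocab, topic_word))
print(topic_label_means(doc_topic, labels))
```

A topic whose mean label is close to +1 is dominated by "crude" documents, and one close to -1 by "acquisition" documents; values near 0 indicate a topic shared by both classes. The labels are only used after fitting here, to interpret the topics. A supervised LDA model, as in part 2, would also use them during fitting, which this sketch does not do.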