enlp.understanding.distributions.important_words_per_doc

enlp.understanding.distributions.important_words_per_doc(scores, doc_id=None, n=5)

Return the most important words in each document, ranked by TF-IDF score.

Parameters

scores (pandas.DataFrame): DataFrame in which every word is a feature (column) and every document is an observation (row), as computed by the compute_tfidf method
doc_id (list): list of document ids used to index the results; the default (None) computes results for all documents
n (int): not described in the original docstring; judging by the signature, likely the number of top-scoring words returned per document, default is 5

Returns

imp_words (list): list with one inner list per document; each inner list contains (word, score) tuples for the important words in that document
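
Example

The following is a minimal sketch of the behaviour described above, not the enlp implementation. The helper name top_words_per_doc, the sample scores, and the assumption that n is the number of words kept per document are all illustrative; only the parameter names mirror the signature.

```python
import pandas as pd


def top_words_per_doc(scores, doc_id=None, n=5):
    """Hypothetical stand-in for important_words_per_doc (not the enlp code).

    Assumes rows are documents, columns are words, and n is the number of
    highest-scoring words to keep per document.
    """
    # Default to every document in the frame, as the docstring describes.
    if doc_id is None:
        doc_id = list(scores.index)

    imp_words = []
    for doc in doc_id:
        # Pick the n largest scores in this document's row.
        top = scores.loc[doc].nlargest(n)
        # Store (word, score) tuples, highest score first.
        imp_words.append(list(top.items()))
    return imp_words


# Made-up TF-IDF-like scores for two documents.
scores = pd.DataFrame(
    {"cat": [0.9, 0.0], "dog": [0.1, 0.8], "fish": [0.4, 0.3]},
    index=["doc_a", "doc_b"],
)

result = top_words_per_doc(scores, n=2)
for doc, words in zip(scores.index, result):
    print(doc, words)
```

With this layout, the outer list follows the order of the document ids and each inner list is sorted by descending score, which matches the list-of-lists-of-tuples shape described under Returns.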