enlp.understanding.distributions.important_words_per_doc

enlp.understanding.distributions.important_words_per_doc(scores, doc_id=None, n=5)

Return the most important words per document, based on TF-IDF scores.

Parameters
scores : pandas.DataFrame

pandas DataFrame in which every word is a feature (column) and every document is an observation (row), as computed by the compute_tfidf method.

doc_ids : list

List of document ids used to index the results. By default, results are computed for all documents.

n : int

Number of most important words to return per document. The default is 5.

Returns
imp_words : list

List of per-document lists, where each inner list contains tuples of an important word and its score in that document.
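
Examples

The following is a minimal sketch of the behaviour described above, not the enlp implementation. It assumes that documents are rows and words are columns of the scores DataFrame, and that "most important" means the n highest TF-IDF scores in a row. The function name top_words_per_doc and the toy data are hypothetical and exist only for illustration.

```python
import pandas as pd


def top_words_per_doc(scores, doc_ids=None, n=5):
    """Hypothetical stand-in: return the n highest-scoring words per document.

    Assumes rows of `scores` are documents and columns are words.
    """
    if doc_ids is None:
        # Default: compute for every document in the DataFrame.
        doc_ids = list(scores.index)

    imp_words = []
    for doc_id in doc_ids:
        # Select the document's row and keep its n largest scores.
        top = scores.loc[doc_id].nlargest(n)
        # Store (word, score) tuples, highest score first.
        imp_words.append([(word, float(score)) for word, score in top.items()])
    return imp_words


# Toy TF-IDF-like scores for two documents and three words (made-up values).
scores = pd.DataFrame(
    {"cat": [0.9, 0.0], "dog": [0.1, 0.8], "fish": [0.3, 0.5]},
    index=["doc_a", "doc_b"],
)

result = top_words_per_doc(scores, n=2)
subset = top_words_per_doc(scores, doc_ids=["doc_b"], n=1)
```

Here result holds one inner list per document, in index order, and subset holds a single inner list for doc_b only.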