Become a Key-Point Pro: A Beginner's Guide to Text Summarization in Machine Learning | Resources (Part 5)

# Tokenizing the sentences
sentences = sent_tokenize(article)

# Algorithm for scoring a sentence by its words
sentence_scores = _calculate_sentence_scores(sentences, frequency_table)

# Getting the threshold
threshold = _calculate_average_score(sentence_scores)

# Producing the summary
article_summary = _get_article_summary(sentences, sentence_scores, 1.5 * threshold)

print(article_summary)

Step 1: Prepare the data

This step uses the Beautiful Soup library to parse the page's HTML.
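As a rough illustration of this step, the sketch below extracts paragraph text from an HTML document with Beautiful Soup. The inline HTML string is a made-up stand-in for a downloaded page, and keeping only the `<p>` elements is an assumption about which parts of the page matter; a real page may need different selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded article page.
html = """
<html><body>
  <h1>Sample article</h1>
  <p>First paragraph of the article.</p>
  <p>Second paragraph, with a <a href="#">link</a> inside.</p>
  <div class="footer">Site footer</div>
</body></html>
"""

# Parse with the parser bundled with Python, so no extra dependency is needed.
soup = BeautifulSoup(html, "html.parser")

# Keep only the text inside <p> tags and join it into one string.
paragraphs = soup.find_all("p")
article = " ".join(p.get_text() for p in paragraphs)
print(article)
```

The resulting `article` string is the kind of input the sentence tokenizer and scoring steps expect.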
