Text Summarization – In short

Summarization is the task of producing a shorter version of a document while preserving its important information. Some models can extract text from the original input, while other models can generate entirely new text.

Extractive Summarization

Extractive summarization methods work by identifying the most salient sentences or phrases in a text and extracting them verbatim, so the summary consists only of material taken from the source.
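
To make this concrete, here is a minimal sketch of frequency-based extractive summarization in plain Python (not part of Haystack; the extractive_summary helper and its scoring scheme are illustrative assumptions): sentences are ranked by the average document-wide frequency of their words, and the top-ranked sentences are returned verbatim.

Python
import re
from collections import Counter

def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences of `text`, copied verbatim."""
    # Naive sentence split on end-of-sentence punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Word frequencies over the whole document.
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        # Average word frequency, so longer sentences are not favored unfairly.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    top = set(sorted(sentences, key=score, reverse=True)[:num_sentences])
    # Emit the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)

text = ("Transfer learning has emerged as a powerful technique in NLP. "
        "Models are first pre-trained on a data-rich task. "
        "They are then fine-tuned on a downstream task.")
print(extractive_summary(text, num_sentences=1))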

Abstractive Summarization

Abstractive methods seek to generate a novel summary that captures the key information in a text; unlike extractive approaches, the output may use words and phrasings that never appear in the source.
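
As a quick illustration, an abstractive model can also be run directly through the Hugging Face transformers pipeline (a minimal sketch, assuming transformers is installed and the google/pegasus-xsum checkpoint can be downloaded):

Python
from transformers import pipeline

# Load an abstractive model; pegasus-xsum generates one-sentence summaries
# whose wording need not appear anywhere in the source text.
summarizer = pipeline("summarization", model="google/pegasus-xsum")

text = ("Transfer learning, where a model is first pre-trained on a data-rich "
        "task before being fine-tuned on a downstream task, has emerged as a "
        "powerful technique in natural language processing (NLP).")

result = summarizer(text, max_length=40, min_length=5, do_sample=False)
print(result[0]["summary_text"])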

Code

Python
from haystack.nodes import TransformersSummarizer
from haystack import Document

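# Wrap the text to summarize in a Haystack Document.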
docs = [Document("Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pretraining objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.")]

# t5-small tends to reuse the source wording almost verbatim,
# so its output reads like an extractive summary.
summarizer = TransformersSummarizer(model_name_or_path="google-t5/t5-small")
summary = summarizer.predict(documents=docs)
print("t5 Extractive Summary: ", summary[0].meta["summary"], "\n\n")

# pegasus-xsum is trained to generate a single, freshly worded sentence.
summarizer = TransformersSummarizer(model_name_or_path="google/pegasus-xsum")
summary = summarizer.predict(documents=docs)
print("Pegasus Abstractive Summary: ", summary[0].meta["summary"])

Output

Plaintext
t5 Extractive Summary:  the effectiveness of transfer learning has given rise to a diversity of approaches, methodologies, and practice. the systematic study compares pretraining objectives, architectures, unlabeled datasets, transfer approaches, and other factors on language understanding tasks.

Pegasus Abstractive Summary:  In this paper, we explore the landscape of introducing transfer learning techniques for NLP by a unified framework that converts every language problem into a text-to-text format.