Live Blog Summarization
Abstract. Live blogs are an increasingly popular news format to cover breaking news and live events in online journalism. Online news websites around the world are using this medium to give their readers a minute by minute update on an event. Good summaries enhance the value of the live blogs for a reader, but are often not available. In this article, (a) we first define the task of summarizing a live blog, (b) study ways of automatically collecting corpora for live blog summarization, and (c) understand the complexity of the task by empirically evaluating well-known state-of-the-art unsupervised and supervised summarization systems on our new corpus. We show that live blog summarization poses new challenges in the field of news summarization, since frequency and positional signals cannot be used. We make our tools publicly available to reconstruct the corpus and to conduct our empirical experiments. This encourages the research community to build upon and replicate our results.