Article Abstracts

I am looking for a way to automatically create an abstract, basically the first few sentences or paragraphs of a blog entry, to show in a list of articles (which are written in markdown). I am currently doing something like this:

def abstract(article, paras=3):
    return '\n'.join(article.split('\n')[0:paras])

just to grab the first few lines of text, but I'm not quite happy with the results.

What I'm really looking for is to end up with about 1/3 of a screen of formatted text for each entry in the list, but with the algorithm above the amount pulled varies wildly: sometimes as little as a line or two, mixed in with abstracts of a more ideal size.

Is there a library that is good at this? If not, do you have suggestions for improving the output?

+5
2 answers

EDIT:

You can do something like this:

from textwrap import wrap

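def getAbstract(text, lines=5, screenwidth=100):
    # Wrap every block to the screen width, giving a flat list of display lines.
    wrapped = [line for block in text.splitlines()
               for line in wrap(block, width=screenwidth)]
    # The joined length of the first `lines` display lines approximates the cut point.
    width = len(' '.join(wrapped[:lines]))
    # Only add the ellipsis when something was actually cut off.
    return text[:width] + '...' if len(text) > width else text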

This uses the textwrap wrapping algorithm to estimate the text length. It breaks the text into lines of screen width and uses them to calculate the character length of the desired number of lines.

For example, applying this algorithm to the Wikipedia page for Python:

print(getAbstract(text, lines=7))

will give you this result:

Python - . 2 . [3] Python , "[] ", [4] . .

Python ( -, ) , Perl, Ruby, Scheme Tcl. , Python ...
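A variant that may suit markdown better is to build the abstract from the wrapped lines themselves, so the cut always falls on a word boundary and paragraph breaks survive. The following is only a sketch: the name abstract_by_lines is hypothetical, and it assumes paragraphs are separated by blank lines.

```python
from textwrap import wrap

def abstract_by_lines(text, lines=5, screenwidth=100):
    """Return about `lines` display lines of `text`, keeping paragraph breaks."""
    paragraphs = []
    remaining = lines      # display lines still available
    truncated = False
    for block in text.split('\n\n'):   # assumption: blank line = paragraph break
        wrapped = wrap(block, width=screenwidth)
        if not wrapped:
            continue                   # skip empty blocks
        if remaining <= 0:
            truncated = True           # more text exists, but the budget is used up
            break
        if len(wrapped) > remaining:
            truncated = True           # this paragraph is cut short
        kept = wrapped[:remaining]
        paragraphs.append(' '.join(kept))
        remaining -= len(kept)
    result = '\n\n'.join(paragraphs)
    return result + '...' if truncated else result
```

Because the budget is counted in wrapped lines rather than source lines, one long paragraph and several short ones should come out at a similar height on screen.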


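Alternatively, if you are working with plain text, you can use textwrap.

For example, to take the first line of the text wrapped at a width of 100 characters: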

import textwrap

abstract = textwrap.wrap(text, 100)[0]

, .
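On Python 3.4 and later, textwrap.wrap also accepts max_lines and placeholder arguments, which allow taking several wrapped lines instead of only the first. A minimal sketch, where the helper name first_lines and its default values are illustrative assumptions:

```python
import textwrap

def first_lines(text, width=100, max_lines=5):
    """Wrap `text` and keep at most `max_lines` lines, marking any truncation."""
    # textwrap appends the placeholder to the last kept line when text is dropped
    # (requires Python 3.4+ for max_lines and placeholder).
    wrapped = textwrap.wrap(text, width=width, max_lines=max_lines,
                            placeholder='...')
    return '\n'.join(wrapped)
```

Note that textwrap treats the whole input as a single paragraph and collapses newlines into spaces, so paragraph breaks in the source are lost with this approach.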

+7

, .

X "...". "" ( ).

0
