NYPHP.org

[nycphp-talk] Blog Posts with Embedded Content

csnyder chsnyder at gmail.com
Wed Oct 8 14:15:38 EDT 2008


On Tue, Oct 7, 2008 at 9:19 PM, John Campbell <jcampbell1 at gmail.com> wrote:

> The safest approach is probably to pass the html through tidy, and
> then into DOM, and traverse and count the length of text nodes, but
> that would be quite slow if you ran it on every request.

Right, +1 for Tidy and DOM; it's the "real" way to do it. You won't
need to do it on every request -- you can either store the summary
itself as a separate text field, or store the length of the summary as
an integer.

This is crying out for a web service: The Excerpter. POST markup, get
the first X display characters back as a response, with embedded HTML
intact.
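
To make the idea concrete, here is a rough sketch of what the core of
such a service might do. It is not Tidy plus DOM: it swaps in Python's
standard-library html.parser so the example is self-contained, and the
names Excerpter, excerpt and VOID_TAGS are made up for illustration.
It counts every character of text (whitespace included, no special
handling for script or style content), so treat it as a starting point
under those assumptions rather than a finished implementation.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag (hypothetical helper set).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Excerpter(HTMLParser):
    """Copies markup through until `limit` text characters have been emitted."""

    def __init__(self, limit):
        # convert_charrefs=True means "&amp;" arrives as "&" and counts as one character.
        super().__init__(convert_charrefs=True)
        self.remaining = limit
        self.out = []
        self.open_tags = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        # Re-emit the tag exactly as written so attributes stay intact.
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit it, nothing to close later.
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close any unclosed inner elements before the one that was named.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        if len(data) >= self.remaining:
            data = data[:self.remaining]
            self.done = True
        self.remaining -= len(data)
        # Text was decoded by the parser, so re-escape it on the way out.
        self.out.append(escape(data, quote=False))

    def result(self):
        # Close whatever is still open at the cut point.
        closing = [f"</{t}>" for t in reversed(self.open_tags)]
        return "".join(self.out + closing)


def excerpt(markup, limit):
    """Return the first `limit` text characters of `markup`, tags kept balanced."""
    parser = Excerpter(limit)
    parser.feed(markup)
    parser.close()
    return parser.result()
```

For example, excerpt('<p>Hello <b>big</b> world</p>', 8) gives
'<p>Hello <b>bi</b></p>': eight characters of text, with the open b and
p elements closed at the cut. The result is what you would store in the
separate summary field, so the parsing only happens when a post is saved.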


Chris Snyder
http://chxor.chxo.com/


