Lila 0.2 – Compute word counts for WordPress posts and corpus

I promised code. After Reading is an assembly of many content types, including essays, memoir, book reviews, images, and … code. Lila is the concept that I am coding in steps, and it will take on an increasing role as this series progresses. I have just coded Lila 0.2, as shown in the figure. The code is written in PHP and JavaScript and uses the lovely Google Charts library. It is available at my GitHub Gist site.

In the chart,

  • The blue line shows the count of words in each After Reading post, ordered in sequence. For example, post 21 has 1370 words.
  • The grey line shows a linear trend — the word count per post is increasing as the series progresses.
  • The red line is a constant, the average number of words per post. The word count for post 21 is larger than both the trend and the average.
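
For readers who want to see how the three lines relate, here is a minimal JavaScript sketch. It is not the Lila 0.2 source. The function names (countWords, mean, linearTrend) and the sample posts are made up for illustration, and it assumes each post is already plain text with any HTML stripped. The trend is an ordinary least-squares fit against post order, which may differ from what the charting library computes.

```javascript
// Hypothetical helper names; not taken from the Lila 0.2 source.

// Count words in a plain-text string by splitting on whitespace.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Arithmetic mean of an array of numbers (the red line).
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Ordinary least-squares fit of y = slope * x + intercept (the grey line),
// where x is the post's position in the series (1, 2, 3, ...).
function linearTrend(counts) {
  const xs = counts.map((_, i) => i + 1);
  const xMean = mean(xs);
  const yMean = mean(counts);
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < counts.length; i++) {
    sxy += (xs[i] - xMean) * (counts[i] - yMean);
    sxx += (xs[i] - xMean) ** 2;
  }
  const slope = sxx === 0 ? 0 : sxy / sxx;
  return { slope, intercept: yMean - slope * xMean };
}

// Made-up sample posts, for illustration only.
const posts = ["one two three", "one two three four five", "a b c d e f g"];
const counts = posts.map(countWords); // the blue line: one count per post
const average = mean(counts);
const trend = linearTrend(counts);

console.log(counts, average, trend);
```

A positive slope corresponds to the rising grey line in the chart: posts are getting longer as the series progresses.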

The code is just a beginning. Many more metrics will be added to analyze the text of After Reading. I want to analyze the style of the posts, and several word measures can be calculated: frequency, feeling, concreteness, complexity, etc. Together they profile the style of a post and can be used to compare it to the corpus. Even more interesting, these measures build a platform for computational understanding of a text. More to come.
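
As a taste of the first of those measures, here is a rough JavaScript sketch of word frequency and how a post might be compared to the corpus. None of this is in Lila 0.2. The names tokenize, relativeFrequencies and compareToCorpus are hypothetical, and the tokenizer is a deliberate simplification that keeps only the letters a to z and apostrophes.

```javascript
// Hypothetical helpers; a sketch of one possible frequency measure.

// Lowercase the text and pull out runs of letters and apostrophes.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Map each word to its share of all words in the text.
function relativeFrequencies(text) {
  const words = tokenize(text);
  const tallies = new Map();
  for (const word of words) {
    tallies.set(word, (tallies.get(word) || 0) + 1);
  }
  const frequencies = new Map();
  for (const [word, tally] of tallies) {
    frequencies.set(word, tally / words.length);
  }
  return frequencies;
}

// For each word in the post, pair its frequency in the post
// with its frequency in the corpus (0 if the corpus lacks it).
function compareToCorpus(postText, corpusText) {
  const postFreq = relativeFrequencies(postText);
  const corpusFreq = relativeFrequencies(corpusText);
  const rows = [];
  for (const [word, frequency] of postFreq) {
    rows.push({ word, post: frequency, corpus: corpusFreq.get(word) || 0 });
  }
  return rows;
}

// Made-up text, for illustration only.
const profile = compareToCorpus(
  "the cat sat",
  "the cat sat on the mat and the dog sat too"
);
console.log(profile);
```

Words whose share in a post is well above their share in the corpus are candidates for what makes that post stylistically distinctive.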
