How We Talk

last modified: September 19, 2011

Ever wonder what we talk about here? I just finished a word frequency study of this site. For each word, I counted the number of pages that contain it. These are the ten most frequent words of at least six letters.

10844 category
 7225 people
 6411 because
 6300 should
 6200 software
 5815 things
 5803 programming
 5796 something
 5460 really
 5349 system

The count is the number of pages upon which the words can be found. Here are all 354 words that appear on at least a thousand pages.
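
For anyone who wants to try this kind of count at home, here is a minimal sketch of counting pages per word, not the script actually used for the study. The function name page_counts, the sample pages, and the letters-only tokenizer are all assumptions made for illustration; WikiNames are not split in this sketch.

```python
import re
from collections import Counter

def page_counts(pages, min_letters=6):
    """Count, for each word, the number of pages that contain it."""
    counts = Counter()
    for text in pages:
        # A set makes each word count at most once per page.
        words = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}
        counts.update(w for w in words if len(w) >= min_letters)
    return counts

# Hypothetical sample pages, invented for this example.
pages = [
    "People should write software because software matters.",
    "Some people like programming.",
    "The system should work.",
]
counts = page_counts(pages)
for word, n in counts.most_common(3):
    print(n, word)
```

Collecting each page's words into a set first is what makes this a page count rather than a count of total occurrences, so "software" appearing twice on one page still counts once.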

Interesting, eh? -- WardCunningham

Have WikiNames been filtered from this list? -- StijnSanders

No, their parts are treated as separate words so ProgrammingLanguage counts as one for Programming and one for Language.
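
As a rough illustration of that splitting, assuming a WikiName is simply a run of capitalized parts, something like the following would do. The name split_wiki_name and the regular expression are hypothetical, not taken from the study, and runs of consecutive capitals are dropped in this sketch.

```python
import re

def split_wiki_name(token):
    """Split a CamelCase token into its parts, lowercased."""
    # Each part is an optional capital followed by lowercase letters.
    return [part.lower() for part in re.findall(r"[A-Z]?[a-z]+", token)]

print(split_wiki_name("ProgrammingLanguage"))
print(split_wiki_name("people"))
```

An ordinary word comes back as a single part, so the same function can be applied to every token on a page before counting.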


These counts seem to confirm the topic of the Wiki as PeopleProjectsAndPatterns. Structures, systems, patterns, and problems all score highly.


Following the example of WikiWordStatistics, we find in this list some of the ExtremeProgramming practices:

some AntiPatterns:

an observation:

finally, for the SmugSmalltalkWeenies:


This was just crying out for a bit of PoemWiki. http://downlode.org/wiki/wikiwordspoetry.cgi -- EarleMartin


I looked at the BNC (British National Corpus, one of the largest English corpora in the world) word frequency list at http://www.itri.brighton.ac.uk/~Adam.Kilgarriff/bnc-readme.html (BrokenLink; try http://www.natcorp.ox.ac.uk/), chose the words with at least six letters, and listed the top ten:

128393 should
125430 people
103003 because
 91141 between
 75588 through
 67219 become
 66894 government
 61912 system
 60607 number
 60498 however

Compare this list with the one above. (One interesting thing to notice is that "system" is a very common word in general English, written or spoken.)

-- JuneKim
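
The comparison can be made mechanical with a small sketch that uses only the two top-ten lists given on this page. The variable names are just for illustration.

```python
# The two top-ten lists from this page, in rank order.
wiki_top = ["category", "people", "because", "should", "software",
            "things", "programming", "something", "really", "system"]
bnc_top = ["should", "people", "because", "between", "through",
           "become", "government", "system", "number", "however"]

# List comprehensions keep the original rank order, unlike set operations.
shared = [w for w in wiki_top if w in bnc_top]
wiki_only = [w for w in wiki_top if w not in bnc_top]
bnc_only = [w for w in bnc_top if w not in wiki_top]

print("shared:   ", shared)
print("wiki only:", wiki_only)
print("BNC only: ", bnc_only)
```

Four words are on both lists (people, because, should, system). The six found only on the wiki list are category, software, things, programming, something, and really.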


Search Corpus (BNC)


See WikiStatistics, WikiWordStatistics


CategoryWikiStructure

