Wiki Word Statistics

last modified: January 9, 2007

A search for each of these words against the search database available on 7 March 2000 gave these results:

The database contained 7546 pages.

It's interesting to compare those last two separately, rather than combine them.

Please append to these figures rather than modifying them, so that we can compare against them at some later date. My guess would be that in, say, a year's time, the XP pages will be a lower proportion of the total, since the WikiMind will have drifted elsewhere. --KeithBraithwaite

May 12th, 2001

The database contained 15,289 pages. Searching for word hits didn't work too well, as the size of the resulting pages caused network dropouts.

The database grew by 102%. ExtremeProgramming grew by 75%. XP grew by 104%, Wiki grew by 161%, and patterns by 27%.
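The database growth figure can be checked from the two page counts given on this page. The sketch below shows only that arithmetic; the function name growth_percent is made up for illustration, and the per-word figures cannot be rechecked the same way because their underlying counts are not reproduced here.

```python
def growth_percent(old, new):
    """Relative growth from old to new, as a percentage of old."""
    return (new - old) / old * 100.0

pages_2000 = 7546    # page count reported for 7 March 2000
pages_2001 = 15289   # page count reported for May 12th, 2001

database_growth = growth_percent(pages_2000, pages_2001)
print(f"{database_growth:.1f}%")  # prints 102.6%
```

Truncated to a whole number this is the 102% quoted above.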


It is a part of usual LanguageOrientedProgramming practice to look at the words that are used in a system. I had postponed this for a while (being new to the Wiki), but now I have done it.

The list at the end of this page is the first part of the output of processing wikiList.

If you do this on a software API you usually find something interesting. Special words, redundant words, wrong words ... but at first sight, I didn't find anything of significance.

Of course, if you look at the first few lines (strip some simple words) you find what this Wiki is about: Wiki Programming Patterns Extreme.

But when I read on, I felt like a shaman priest having thrown a bag of bones to read the present and the future:

to name just a few. Just try it! Perhaps some expert can read and interpret this. I'm unable to. -- HelmutLeitner
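For anyone who wants to try it, here is a minimal sketch of how such a count might be produced. It is not the tool used for the list on this page. It assumes that each page title is a WikiWord that splits into component words at every capital letter, and the sample titles are hypothetical stand-ins for the real wikiList.

```python
import re
from collections import Counter

# Assumption: a component word is one capital letter followed by any
# lowercase letters or digits, so "ExtremeProgramming" -> Extreme, Programming.
COMPONENT = re.compile(r"[A-Z][a-z0-9]*")

def split_wiki_word(title):
    """Split a WikiWord page title into its component words."""
    return COMPONENT.findall(title)

def count_components(titles):
    """Count how often each component word occurs across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(split_wiki_word(title))
    return counts

# Hypothetical sample; a real run would read every title from wikiList.
sample_titles = ["WikiWikiWeb", "ExtremeProgramming", "DesignPatterns",
                 "WikiWord", "XpFaq"]

counts = count_components(sample_titles)

# Print in the same "count word" layout as the list at the end of this page,
# most frequent first, ties in alphabetical order.
for word, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{n:3d} {word}")
```

Note that a title such as WikiWikiWeb contributes "Wiki" twice under this scheme; whether repeats within one title should count once or twice is a choice the sketch leaves open.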

If you play a little loose, the very first few words sum up Wiki pretty well:


See also WikiMines.


On a similar note, I am trying to devise a way to determine the "centres" of a wiki (or things similar to wikis). My best attempt so far has been http://usemod.com/cgi-bin/mb.pl?ShortestPathPages. -- SunirShah


As one might expect, the number of occurrences of a given WikiWord per WikiPage obeys a PowerLaw. This hypothesis was tested in March 2003 with 724 pages containing the WikiWord "UnitTest". A LogLog plot of the count of the pages with a given number of occurrences of "UnitTest" was created. The values are linear for the first two orders of magnitude, though they diverge from the ideal value as the number of occurrences of "UnitTest" per page increases:

Linear regression yields r-squared = 0.936.
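A sketch of how such a fit might be computed, using ordinary least squares on the logarithms of both axes. The input here is synthetic data built to follow an exact power law, not the "UnitTest" counts, and the function name loglog_fit is invented for this example.

```python
import math
from collections import Counter

def loglog_fit(occurrences_per_page):
    """Least-squares fit of log10(pages) against log10(occurrences).

    occurrences_per_page holds one positive integer per page: how many
    times the word occurs on that page. Returns (slope, intercept, r_squared).
    """
    histogram = Counter(occurrences_per_page)   # occurrences -> number of pages
    xs = [math.log10(k) for k in histogram]
    ys = [math.log10(histogram[k]) for k in histogram]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)       # squared correlation coefficient
    return slope, intercept, r_squared

# Synthetic example: 64 pages with 1 occurrence, 16 with 2, 4 with 4, 1 with 8,
# i.e. pages = 64 / occurrences**2, an exact power law with exponent -2.
synthetic = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1

slope, intercept, r_squared = loglog_fit(synthetic)
print(f"slope = {slope:.3f}, r-squared = {r_squared:.3f}")
```

On real data the sparse tail, where only one or two pages share a given high count, is what pulls r-squared below 1.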

A second test with "ExtremeProgramming" and 1,189 BackLinked pages gave a similar result, with r-squared = 0.950:

Binning the data increases r-squared to 0.99+.
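The kind of binning is not stated above. One common choice for power-law data is logarithmic binning, sketched below under that assumption: counts are grouped into bins whose widths double (1, 2-3, 4-7, 8-15, ...), and the number of pages in each bin is divided by the bin width so that wide bins in the tail are not overweighted. The function name log_bin and the sample data are hypothetical.

```python
from collections import Counter

def log_bin(occurrences_per_page):
    """Group occurrence counts into power-of-two bins.

    Returns a list of (lower_edge, pages_per_integer_value) pairs, where the
    bin with lower edge 2**i covers the integers 2**i .. 2**(i+1) - 1.
    """
    histogram = Counter(occurrences_per_page)    # occurrences -> number of pages
    pages_in_bin = Counter()
    for k, pages in histogram.items():
        # For a positive integer k, bit_length() - 1 is floor(log2(k)), exactly.
        pages_in_bin[k.bit_length() - 1] += pages
    binned = []
    for i in sorted(pages_in_bin):
        lower_edge = 2 ** i
        width = lower_edge                       # the bin holds 2**i integers
        binned.append((lower_edge, pages_in_bin[i] / width))
    return binned

# Hypothetical per-page occurrence counts with a ragged tail.
sample = [1, 1, 1, 1, 2, 3, 3, 5, 6, 7, 12]

binned = log_bin(sample)
for lower_edge, density in binned:
    print(lower_edge, density)
```

The binned points can then be fitted on log-log axes in place of the raw counts; averaging over each bin smooths the noisy tail, which is the usual reason binning raises r-squared.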


The original version of this list counted each entry twice. This has been corrected.

Files: 1 Found: 80843

Count Statistic:

582 The
541 Wiki
463 Of
293 And
272 Programming
247 Patterns
234 To
232 Extreme
225 Is
210 Xp
208 In
202 Pattern
189 Software
172 For
166 Java
163 Language
161 Design
146 Object
126 Test
111 Code
110 Page
108 On
 93 Web
 90 What
 85 As
 84 Category
 83 With
 82 John
 80 Are
 79 Not
 78 Smalltalk
 76 Unit
 75 It
 73 Discussion
 73 You
 72 Two
 70 Refactoring
 69 One
 68 Use
 67 Mc
 66 Do
 63 Group
 62 Project
 61 David
 60 How
 59 New
 58 From
 57 By
 56 About
 55 This
 54 Component
 54 Topic
 54 Work
 53 At
 52 Testing
 52 Vs
 50 Be
 50 Ejb
 50 Objects
 50 System
 50 Systems
 49 Development
 49 Dont
 49 Meeting
 48 User
 48 Visual
 47 Name
 47 Why
 47 Your
 46 Model
 46 Tcpg
 45 Management
 45 Michael
 45 Time
 44 First
 44 Good
 44 Process
 42 Class
 42 People
 41 Method
 40 Refactor
 39 All
 39 Com
 39 Data
 39 Free
 39 Just
 39 More
 39 Net
 39 Server
 39 That
 39 Three
 38 Architecture
 38 Link
 38 Mark
 38 Problem
 38 Value
 37 Big
 37 Book
 37 Interface
 36 Changes
 36 No
 36 Peter
 35 An
 35 Bill
 35 Cpp
 35 Jim
 35 Meta
 35 Source
 34 Challenge
 34 Programmer
 33 Books
 33 Case
 33 Dot
 33 Exceptions
 33 List
 33 Open
 33 Pages
 32 Change
 32 Engineering
 32 Mike
 32 My
 32 Robert
 31 Computer
 31 Dave
 31 Plus
 31 Principle
 30 Game
 30 Links
 30 Microsoft
 30 Oriented
 29 Pair
 29 Tom
 28 De
 28 Eric
 28 Go
 28 Methodology
 28 Story
 27 James
 27 Knowledge
 27 Mode
 27 Richard
 27 Steve
 27 Thing
 27 Way
 26 Bob
 26 Me
 26 Mind
 26 Space
 26 Up
 26 World
 25 Art
 25 Business
 25 Chris
 25 Example
 25 Form
 25 Function
 25 Law
 25 Real
 25 Stories
 25 Technology
 25 Vb
 25 Ytwok
 24 Information
 24 Martin
 24 Nine
 24 Or
 24 Paul
 24 Python
 24 Things
 24 Tim
 24 Too
 23 Alan
 23 Anti
 23 Ats
 23 Community
 23 Framework
 23 History
 23 Recent
 23 State
 23 Team
 23 Tests
 23 Text
 23 Thomas
 23 When
 23 Write
 22 Analysis
 22 Bad
 22 Delete
 22 Great
 22 Isa
 22 Life
 22 Make
 22 Metaphor
 22 Perl
 22 Thread
 22 Twenty
 22 Words
 22 Works
 21 Basic
 21 Beans
 21 Box
 21 Can
 21 Music
 21 Need
 21 Public
 21 Thousand
 21 Uml
 20 Based
 20 Exception
 20 Home
 20 Idea
 20 Languages
 20 Quality
 20 Science
 20 Talk
 20 Who
 20 Word
 19 Brian
 19 Coding
 19 Does
 19 Four
 19 Functional
 19 Green
 19 Jeff
 19 Once
 19 Review
 19 Rule
 19 Rules
 19 Self
 19 Should
 19 Smith
 19 Stone
 19 Users
 18 Abstract
 18 Before
 18 Common
 18 Interfaces
 18 Like
 18 Non
 18 Oo
 18 Out
 18 Scott
 18 Script
 18 Seven
 18 Together
 18 Tool
 18 We
 18 Writing
 17 Browser
 17 Classes
 17 Document
 17 Factory
 17 Implementation
 17 Little
 17 Ninety
 17 Plan
 17 Programmers
 17 Reuse
 17 Right
 17 Solution
 17 View
 16 Anonymous
 16 Bug
 16 Comments
 16 Components
 16 Considered
 16 Dead
 16 Distributed
 16 Hard
 16 Its
 16 Joe
 16 Know
 16 Leadership
 16 Mac
 16 Machine
 16 Multi
 16 Order
 16 Other
 16 Post
 16 Problems
 16 Program
 16 Question
 16 Questions
 16 Style
 16 Types
 16 Visitors
 15 Andrew
 15 Bean
 15 Card
 15 Content
 15 Could
 15 Dan
 15 Database
 15 Documentation
 15 Edit
 15 Enterprise
 15 Faq
 15 Fic
 15 Frank
 15 Games
 15 Gof
 15 Greg
 15 Grok
 15 Int
 15 Love
 15 Man
 15 Message
 15 Only
 15 Over
 15 Paper
 15 Please
 15 Point
 15 Power
 15 Reviews
 15 Side
 15 Simple
 15 Six
 15 Soft
 15 Solutions
 15 Stephen
 15 Success
 15 Think
 15 Tools
 15 Unix
 15 Using
 15 Ward
 15 Will
 14 Agent
 14 Application
 14 Bruce
 14 Computing
 14 Daniel
 14 Definition
 14 Effect
 14 Entity
 14 Flow
 14 Immersion
 14 Kent
 14 Kevin
 14 Line
 14 Methods
 14 Null
 14 Person
 14 Reading
 14 Requirements
 14 Roger
 14 Ron
 14 Search
 14 Star
 14 Thinking
 14 Tips
 14 Well
 14 Workshop
 13 Another
 13 Cant
 13 Cards
 13 Cee
 13 Clear
 13 Culture
 13 Developer
 13 Domain
 13 Don
 13 Doug
 13 Editing
 13 End
 13 Evil
 13 Examples
 13 Full
 13 Future
 13 Get
 13 Harmful
 13 Has
 13 Have
 13 Here
 13 Junit
 13 Lazy
 13 Learning
 13 Library
 13 Map
 13 Modeling
 13 Old
 13 Oopsla
 13 Planning
 13 Plop
 13 Principles
 13 Pro
 13 Resource
 13 Second
 13 Simplest
 13 So
 13 Task
 13 Type
 13 Van
 13 Wall
 13 Win
 12 Active
 12 Analogy
 12 Back
 12 Bell
 12 Best
 12 Binary
 12 But
 12 Client
 12 Control
 12 Corporation
 12 Cplus
 12 Editor
 12 Emacs
 12 Five
 12 Fix
 12 George
 12 God
 12 Human
 12 Ideal
 12 Inheritance
 12 Long
 12 Most
 12 News
 12 Quote
 12 Reference
 12 Research
 12 Sand
 12 Session
 12 Single
 12 Society
 12 Stuff
 12 Theory
 12 Tri
 12 Variables
 12 William
 12 Writers
 11 Age
 11 Better
 11 Between
 11 Blue
 11 Bugs
 11 Builder
 11 Charles
 11 Command
 11 Complex
 11 Context
 11 Continuous
 11 Cool
 11 Copy
 11 Cost
 11 Death
 11 Driven
 11 Ed
 11 Edward
 11 Factor
 11 File
 11 Frameworks
 11 Guide
 11 He
 11 Hot
 11 Hyper
 11 Integration
 11 Keith
 11 Ken
 11 Keyboard
 11 Lisp
 11 Memory
 11 Multiple
 11 Names
 11 Nature
 11 Org
 11 Play
 11 Plug
 11 Processing
 11 Small
 11 Spaces
 11 Standard
 11 Structure
 11 There
 11 Trial
 11 University
 11 Values
 11 Ware
 11 Where
 11 Zen
 10 Applications
 10 Architect
 10 Architectural
 10 Around
 10 Author
 10 Black
 10 Blocks
 10 Build
 10 Call
 10 Composite
 10 Crc
 10 Cultural
 10 Douglas
 10 Down
 10 Environment
 10 Evolutionary
 10 Evolving
 10 Experiment
 10 External
 10 Failure
 10 Fast
 10 Forth
 10 Groups
 10 Ian
 10 Institute
 10 Inter
 10 Issues
 10 Jean
 10 Larry
 10 Linux
 10 Load
 10 Never
 10 Nick
 10 Os
 10 Own
 10 Possibly
 10 Practice
 10 Product
 10 Projects
 10 Proof
 10 Quotes
 10 Ralph
 10 Read
 10 Really
 10 Replace
 10 Risk
 10 Rob
 10 Role
 10 Room
 10 Sam
 10 Servlet
 10 Short
 10 Silicon
 10 Study
 10 Thirty
 10 Threads
 10 Tree
 10 Very
 10 Visitor
 10 Without
 10 Xml

See also HowWeTalk, WikiStatistics


CategoryWikiStructure CategoryStatistics

