DeleteMe after 2006-10-08 -- Moved to CategorizationModels
I couldn't find any wiki pages on general classification techniques, so I created one. (There is a topic called "CategoryTheory", but it's name overlaps with the wiki category framework system, so it gets lost in the shuffle.) [Oops: found CategorizationModels. Guess I gotta work on a merge]
Classification is a way to identify items so that they are easier to find and easier to relate to each other. Here are some of the common techniques:
-
Hierarchies
- Pro: Intuitive
- Con: Don't deal with orthogonal traits well. See LimitsOfHierarchies
-
See-also lists
- Pro: Intuitive
- Con: Require too much effort to update
-
Sets
- Pro: Flexible
- Con: Not inherently intuitive, difficult to inspect
-
Weighted Sets
- Pro:
- Con:
The Cadillac[1] of classification systems in my opinion would be a combination of weighted sets and weighted see-also's. For example, for any item to be classified, one could create a list of weighted categories and weighted see-also's. Example:
Item: "Perkins Lab Coat #6"
Categories
-----------
Medical Supplies: 1.0
Scientific Supplies: 1.0
Clothing: 0.8
Safety Equipment: 0.4
Halloween Supplies: 0.3
See Also:
--------------
JVM medical washing machine: 0.9
Perkins all-in-one surgery kit: 0.6
Weights are assumed to be zero to one. In practice, this could be implemented with a RDBMS using two many-to-many tables with weights.
[1] AmericanCulturalAssumption: "Cadillac" used to be the most expensive car. Imports generally have taken over this niche, but the phrase has remained.
--top