Search and root pairs are based on the Morfix dictionary.
It is a pretty good dictionary of modern Hebrew.

There are a few non-roots (particles, prepositions), and some of the roots contain non-root letters (nun, tav, mem, and so on); some roots are new, and some come from other languages. Nevertheless, you can quickly identify such items when you see them. The share of all such "irregularities" is small and does not affect the statistics much.

The decision not to restrict the set of roots to old Hebrew comes from the idea that these tendencies, even though not as strong as in the past, are still alive in the modern language.

For full-text analysis in Hebrew and Aramaic, the following resource is used:
http://www1.snunit.k12.il/snunit/kodesh/bible/

For the analysis of Indo-European languages, the following sites were used:

  • http://www.wordgumbo.com/ie/cmp/index.htm for roots
  • gopher://ftp.std.com/11/obi/book/Religion/Vulgate for the full text in Latin