Posted by: Alexandre Borovik | January 9, 2011

Google Ngram Viewer

Google released a powerful tool for the analysis of long-term cultural trends: the Google Ngram Viewer, a database of 500 billion words – mainly in English – and their frequency of occurrence in books over the last two centuries.

Here is a graph for “logarithm, square root, exponent, cosine” from 1880 to 2008. From the 1960s, all four mathematical terms appear to show a steady and significant decline in the frequency of their occurrence in books. What could this mean?

[Graph: frequency of the words “logarithm”, “square root”, “exponent”, “cosine” in books, 1880–2008]

If you wish to run a proper statistical analysis, Google kindly provides files with the raw data.
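
For the curious, here is a minimal sketch, in Python, of how one might turn those raw files into the relative frequencies that the Viewer plots: sum the yearly match counts for a term and divide by the total number of words printed that year. The file names and the assumed tab-separated column order (ngram, year, match_count, …) are placeholders only; the exact layout differs between corpus versions, so check the documentation of the files you actually download.

    import csv
    from collections import defaultdict

    # Hypothetical local copies of the downloaded data files.
    NGRAM_FILE = "eng-all-1gram.tsv"
    TOTALS_FILE = "eng-all-total_counts.tsv"
    TERMS = {"logarithm", "cosine", "exponent"}

    def yearly_counts(path, terms):
        """Sum match_count per (term, year), assuming lines of the form
        ngram <TAB> year <TAB> match_count <TAB> ..."""
        counts = defaultdict(int)
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.reader(f, delimiter="\t"):
                ngram = row[0].lower()
                if ngram in terms:
                    counts[(ngram, int(row[1]))] += int(row[2])
        return counts

    def yearly_totals(path):
        """Total word count per year, assuming lines of the form
        year <TAB> total_match_count <TAB> ..."""
        with open(path, newline="", encoding="utf-8") as f:
            return {int(row[0]): int(row[1])
                    for row in csv.reader(f, delimiter="\t")}

    counts = yearly_counts(NGRAM_FILE, TERMS)
    totals = yearly_totals(TOTALS_FILE)
    for (term, year), n in sorted(counts.items()):
        if totals.get(year):
            # Relative frequency: a term can decline on this measure even if
            # its absolute count stays flat, simply because the corpus grows.
            print(term, year, n / totals[year])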


Responses

  1. If you add the word “calculator”, you may get a partial answer to your question ;-) I hesitate whether to suggest also including the word “challenged” or “challenge”, as it may be considered offensive.

  2. Indeed, it is as simple as that. Adding “slide rule” after that is even more illuminating.

  3. Alas, it is not that simple at all. When looking at terms from the domain of economic (or financial) numeracy, such as marginal rate and percentage change (which are lay synonyms of the logarithmic derivative) or compound interest (which is, of course, a manifestation of the exponent), you get a picture that raises even more questions.

    • I did not experiment with the economic terms, but the general tendency should persist across the spectrum. Except for a handful who actually understand the links and relationships between different numeric indicators, the “silent” (earning) majority learns a few tidbits and sound bites that substitute for understanding. Once you learn that “four legs good, two legs bad”, it is very difficult to address questions about the stability of kinematic mechanisms with different numbers of joints and support points…

  4. I also recommend the Science article “Quantitative Analysis of Culture Using Millions of Digitized Books” by Jean-Baptiste Michel et al., http://www.sciencemag.org/content/early/2010/12/15/science.1199644, and its online supplementary material, http://www.sciencemag.org/content/suppl/2010/12/16/science.1199644.DC1/Michel.SOM.pdf, for various caveats and disclaimers, but also for methodological advice and some examples.

    For a serious analysis, if I ever need one, I would perhaps go down to the level of the raw data.

  5. [...] This post was mentioned on Twitter by CW, Google News US. Google News US said: [wikio.com] Google Ngram Viewer (Mathematics under the Microscope): Google released a powerful tool for … http://bit.ly/ea46B5 #google [...]

  6. I wonder how much it simply reflects the increase in publication, with many new areas opening up (both within and outside mathematics), so that the frequency of existing terms should be expected to decline.

    • My thoughts exactly! And the effect would probably be dominated by areas outside mathematics. Dips in many areas should be correlated with increases in the variety of what’s published.

      A pure quantity-based approach would clear up some of these problems, but since the Google Ngram Viewer is based on a sample of about 4% of books each year, that would be impossible with these data.

  7. @Tom Franklin: I believe you are right. Look at this graph:

    http://ngrams.googlelabs.com/graph?content=logarithm,cohomology,category+theory,topological+space,Lie+group,finite+difference,+dynamical+system&year_start=1880&year_end=2000&corpus=0&smoothing=3

    There are new mathematical terms which compete with classical ones like “logarithm”.

  8. It looks like people have stopped loving mathematics recently, especially from the 1970s.

  9. That last idea seems pretty good: relative frequencies are just diminishing. I don’t really know cosmology, but I’ve heard information is never lost in the universe, except maybe in black holes, so maybe the elementary functions are just moving to outer space as evolutionary mathematical succession occurs. ETs might be catching up on trig now.
    I also think, given some forms of math Platonism (e.g. Tegmark’s ‘shut up and calculate’: all there is, is physics, and the idea that math exists is just a form of ‘false consciousness’ used to justify textbook sales), and following the US Supreme Court’s re-affirmation that corporations are people (as well as current discussions about ‘grammar’), that, given this is natural law (US Constitution -> Newton -> etc.), maybe math terms (like others) are as real as quarks and jaguars, and hence (following S. Jay Gould) may have periods of existence as ‘species of thought’ (to use a term of the biologist D. S. Wilson). So maybe they are just going into the fossil record, so that future archaeologists (who of course will actually be Google algorithms implemented on Dell computers) will have some data to mine.
    Perhaps a Jurassic Park can be created for entertainment, in which people visit to see ancient operations.
    (More likely, though, due to strict finitism (e.g. Edward Nelson of Princeton), there won’t be enough time.)

  10. 4% of books can make a VERY representative sample — but it depends on the criteria for their selection.


