Make science boring again

I’ve read the book “Science Fictions” by Stuart Ritchie (a psychologist at King’s College London), as well as the book “Make it clear” by the late MIT professor Patrick H. Winston. I think there is some tension between the two books.

One of the many problems listed by the author of “Science Fictions” is the huge hype problem in science. Scientists are fighting for eyeballs (and in turn prestige and grant money), so they need to hype. Some preliminary and perhaps also translational findings are hyped as the next big thing. The example cited by the author is NASA’s arsenic-life study. NASA hyped the preliminary finding that a bacterium can allegedly use arsenic instead of phosphorus to build DNA, and said “that will impact the search for evidence of extraterrestrial life”. The paper has been (and still is) published in the prestigious journal Science. But the study was later refuted, and the evidence for arsenic life turned out to be nothing but chemical contamination.

A portion of the book “Make it clear”, however, is about selling your research findings. There is a very, very thin line between “selling your findings” and “hyping your findings”, and some aspects of “Make it clear” are borderline hyping to me. My criticism is that Winston thought an incremental research finding was not good. He discouraged the use of the word “improve”. One should always emphasize the novelty. Well, that could be hyping.

This is related to the field Winston was in: he was an AI researcher. (I mean real AI, oriented more toward AGI, not logistic regression or dictionary-based sentiment analysis.) A research finding to him was a product of engineering, e.g. an algorithm, a model, or a dataset. There is no such thing as “preliminary” in AI research. One must present a usable end product. The applicability of AI research is almost always immediate; there is neither a translational phase (from animal research to human research) nor an “observational-to-causal” phase as in most social science research. To me, the harm of using the same sales pitch for an AI algorithm is not as severe as using it for, let’s say, the efficacy of a COVID-19 vaccine. If one frames it this way, whether “selling” (not hyping) a research finding is acceptable depends on the potential impact of errors. It seems that I am a philosopher practicing negative consequentialism.

Ritchie thinks that almost all research findings are incremental. In his book, there is the anti-hype motto “Make science boring again!”. The motto is not new; fast.ai uses the motto “Making neural nets uncool again” and they have an explanation of why we need to make neural nets uncool. I have stolen this motto and use “Making automated content analysis uncool again” as the motto of the book I am working on. Being uncool (there is a slight difference between uncool and boring, although a boring guy like me is usually also uncool) is about being inclusive. Hyping implies that you are the only one who is fast enough, knowledgeable enough, and resource-rich enough to get such a finding. Being boring/uncool means you think your finding is incremental and invite further refinement (improvement, replication, etc.), probably from other scientists. It also reduces the risk of premature application of preliminary research findings (e.g. the public health disaster associated with the Lancet MMR autism fraud). That’s how science should work.

After this long introduction, it’s time for “my take on this”.

Make science attractive

First things first: this is all about methodological research, which is fairly new to me. Applying the above logic of negative consequentialism, the risk of bringing about bad outcomes is relatively low. Of course, a wrong method can still be a bad thing. But one should always practice humility (you are not the center of the universe).

Here, you can practice a little bit of Patrick Winston to sell your ideas, but not hype them. In Winston’s book (and his annual MIT lecture), he introduced the concept of Winston’s star: Symbol, Slogan, Surprise, Salient, and Story. All of them start with an S, thus they represent the five vertices of a *S*tar. I’ll use two recent examples of mine, namely oolong and rectr, to explain them.

Disclaimer: I am not successful academically, so follow this at your own risk. Your mileage may vary. If your aim is to be successful academically, then the attractiveness of your idea is most of the time not as important as your reputation, collegiality, and sociability.

Symbol



some logos I created

A graphical symbol is always a good thing to have, although researchers are rarely good artists or designers. These are the symbols of rectr and oolong. Both of them were drawn by me with a primitive program called LibreOffice Impress, the open-source equivalent of Microsoft PowerPoint.

A good symbol tells a story. Rectr’s logo is a fictional skyline of Mannheim, because I like the city. Oolong’s logo is a homage to the design of a bottled oolong tea from Taiwan, the best-selling bottled oolong tea in Hong Kong. That design, however, is itself a homage, or more appropriately, a “pakuri” (ぱくり) 1 of a Japanese bottled oolong tea brand.

Actually, I think names such as rectr and oolong are symbols too. But let’s talk about it in the next point.

Slogan



A slogan as the title of my paper

Germans love acronyms, probably because their language is so well known for its compound words. They like to use abbreviations, or Abkürzungen. They have initialisms such as AGB (Allgemeine Geschäftsbedingungen), FKK (Freikörperkultur), or DSGVO (Datenschutz-Grundverordnung), but also pronounceable acronyms such as Azubi (Auszubildender) and Kita (Kindertagesstätte).

I used to spend a pathological amount of time coming up with a good acronym for each of my projects. I really hoped that I could come up with great acronyms such as SAINT (Symbolic Automatic INTegrator) or GloVe (Global Vectors for Word Representation). I must also admit that my acronyms are not particularly good. For example, rectr is a play on the Latin word Rector, which means ruler. In Germany the head of a university can be called either “Präsident(in)” or “Rektor(in)”. It incidentally also expands into “Reproducible Extraction of Cross-lingual Topic with R”. But I am not very fond of this acronym, because I always feel that someday people will joke about it (maybe about its performance) and call it “rectum”. Despite this bad acronym, the REPRODUCIBLE part is very salient. If I can make you think about the reproducibility of cross-lingual analysis, I think the name rectr has fulfilled its purpose.

I have given up trying to come up with a good acronym and switched to using metaphoric names such as oolong instead. It is not an acronym, but a text symbol that tells a story (see the section on Symbol above).

Internally, a slogan gives me a handle to talk about my research project. For example, I can just say I am now working on oolong, rather than “that R package for creating validation tests”. Externally, it tells a story. Oolong is so named because it represents two things. First, it references the paper by Chang et al. on “Reading Tea Leaves”. Second, in my native language (as well as Taiwanese Mandarin) oolong (烏龍) also means “being confused”, e.g. 擺烏龍 or even 老烏, making careless mistakes. In football terminology, it means “own goal”. The name captures the spirit of human-in-the-loop validation tests.

The R community would argue that googleability is important. I used to feel that way too and came up with weird names such as ngramrr. But I don’t feel that way anymore. What if I pick a generic name such as oolong and can still make people want to google it? Remember, in the open source world there are ungoogleable names such as Bash, Parallel, Octave, Bison, Screen, and the king and queen of all: Make, and all programming languages with a single-character name such as R, C, D, and F. If I could make a stopword popular, I would probably try to do that too.

Salient



Bring a salient idea. What makes your idea better than other ideas out there?

For rectr, that would be the reproducibility. For oolong, that would be the reusability. I think the actual salient idea for oolong is that it serves as the ambassador of semantic validation: oolong is there so that you always remember to validate your text analysis.

Surprise



The Monty Hall problem (Source: Wikipedia)

We expect a research project to produce papers. But how about an R package as well?

Also, highlight the part of your idea that is counterintuitive. Do you know the Monty Hall problem? It is timeless because of its counterintuitiveness.
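If you have never met the Monty Hall problem: there are three doors, one hiding a car and two hiding goats. You pick a door, the host (who knows where the car is) opens another door to reveal a goat, and you may switch. Most people feel that switching should not matter. A quick simulation can test that intuition. This is only a minimal sketch in Python under the standard rules; the function names play and win_rate are made up for illustration.

```python
import random


def play(switch, rng):
    """Play one round; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumption: the host always opens a goat door that is not the player's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=20_000, seed=42):
    """Estimate the share of rounds won with a fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Staying should win about one third of the time and switching about two thirds, which is exactly the part people find hard to believe.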

However, a big warning: searching for surprises creates incentives to hype. See the arsenic DNA example above. Also, doing something nonscientific, or wrapping non-science as science, is indeed a surprise. But it is surprisingly bad for your reputation too.

Story


Ytterby (Source: Wikipedia)

What makes us different from other primates is our ability to understand stories. We love stories. Stories add color to your monochrome idea.

After reading this blog post, you might not remember all the salient ideas. However, you will probably remember the anecdotes: I love the city I am living in, so I created a logo resembling the city’s skyline. I worry that people will joke about the name of rectr.

A very common technique is to use examples. For example, in the paper by Lucas et al. (2015), they use an example of Google Translation of Edward Snowden to illustrate the limit of Google Translate and how that can influence the results of their approach.

Let me demonstrate the power of a story: In the periodic table, there are 4 elements: Yttrium (39), Terbium (65), Erbium (68), and Ytterbium (70).

Thanks for the information, you might say. But what if the information is like this: In the periodic table, there are 4 elements: Yttrium (39), Terbium (65), Erbium (68), and Ytterbium (70). Interestingly, all four were first identified in minerals from the same quarry near Ytterby, a village in Sweden, and all four are named after that village. So now you know why these elements are called Yttrium, Terbium, Erbium, and Ytterbium.

Writing papers is good. Tossing in a story or two makes your findings memorable.

Contribution: Make science boring but attractive

This blog post illustrates how to package your idea using Winston’s Star without hyping. I hope it helps to make science attractive but still boring.

Footnotes

  1. This Japanese word is like the unofficial name for knock-off products. It can be written in both Katakana (パクリ) and Hiragana (ぱくり). But I would say this term as written in Katakana has a bit of racist history because it was probably from “Park” and “Lee”, two common Korean surnames. Japanese people use this term to describe the knock-off products (food and toy) mostly from Korea in the 80s-90s. Back then, it was not that rare to see Korean knock-off products in Japan and all over Asia.