# Thesis writing: tell a story, not a statistic

Self-portrait of Joseph Ducreux yawning and stretching. Photo: wikipedia.org

Worrying about how to analyze your data can get you so focussed on statistics that by the time you come to write up your results you might forget that writing a good thesis is all about telling a good story. No journalist has ever been persuaded to pay attention to a new discovery because the scientist started shouting about the size of their F statistic or the smallness of their p values. Rather, it is by telling a good story that you will capture the curiosity and imagination of your intended audience.

So when you come to write up your results, remember to start with your ideas and only thereafter turn your attention to your analyses. A good approach is something like the following. First, the basic idea or hypothesis phrased in an informal but interesting way (e.g., I think that A is probably the underlying cause of B because of C). Second, the formal test or measure of the idea (e.g., I don’t have a powerful way to test the idea, so I am going to use the weaker measure of correlation to see whether there is at least some sort of association between A and B). Third, the results of the formal measure (e.g., I found a correlation between A and B of 0.94). And finally, the conclusion about the idea based on the results of the formal measure (e.g., the surprisingly high correlation shows that there is a strong linear relationship between A and B, but the possibility of a causal link will need further investigation).

That particular formula is neither the only one you could use nor necessarily the best one … but it is far from the worst approach, and many times better than simply reporting, in succession, that you calculated the correlation coefficient between A and B, did a regression analysis of C, D and E against F, and so forth. Remember, a weak study that is told in an interesting way will almost always get a better examination result than a stronger study that bores your reader to death.

# Focus groups and missing species

Focus groups are frequently conducted in a roundtable format. Photo: United States Department of Veterans Affairs.

Focus groups have a venerable history. They appear to have been first mentioned in a letter of 2 March 1938 from the English diplomat, Sir Harold George Nicolson. He wrote “I went to such an odd luncheon yesterday. It is called ‘The Focus Group’, and is one of Winston’s things.” Whether Winston Churchill invented focus groups, I don’t know, but since his time they’ve become ubiquitous amongst hucksters of all kinds—marketers, politicians, sociologists, psychologists, and the like.

If you’re planning focus group research, you’ll probably want to know how many groups you should run. Sometimes the question will be answered simply and quickly by your available budget; if you can’t afford to run more than two groups, then that is all you’re going to plan for. But what if you can afford more? Is there a rational basis for settling on any particular number of groups?

Most of the suggestions that I have seen are variants on a single theme: sample until you stop hearing anything new. Lunt and Livingstone [1], for example, say “…one should continue to run new groups until the last group has nothing new to add, but merely repeats previous contributions.” That sounds simple enough, but what does it really mean? If you run two groups which proffer essentially the same three opinions, does that mean you should stop? Or should you assume that a third group might come up with some as yet unstated opinions, and continue running more groups? Would it make any difference if each group gave exactly the same 20 different opinions, as compared with the possibility that each group offered only two opinions?

One way of looking at the decision problem is to try to create a model of the process that is assumed to underlie the formation of opinions and their subsequent revelation in the focus group. The closest that I have found to an explicit statement of the model that underlies the previously described rule of “sampling to saturation” is in another remark by Lunt and Livingstone, namely, “A useful rule of thumb holds that for any given category of people discussing a particular topic there are only so many stories to be told.” That might not sound like a description of a model, but here is my attempt at enlarging upon their statement.

Regarding any particular topic, there are a finite number of stories (or beliefs, or opinions) floating around in the ether. Each person acts as a “story trap”; stories that approach too closely to the trap are caught, and when the person is subsequently interviewed, in a focus group for example, the contents of the trap are revealed.

The reason that I have phrased the model in these unusual terms is that it then becomes obvious how the activity of running a focus group to discover opinions is similar to the activity of a biologist who is trying to discover how many species of animal there are in a particular environment. The biologist sets special traps which capture, mark and release the animals that are ensnared, and at the end of the day the biologist has information on how many species were captured only once, and how many of them were captured multiple times. Statisticians have developed various methods [for example, 3] for estimating, from the capture data, the number of undiscovered or untrapped species.

Similar approaches could be taken, first, to determining how many opinions one has not managed to tap by the focus groups one has run so far, and second, to estimating how many more focus groups one should run to capture those opinions with some chosen probability. I know of only one paper [4] that touches on the first problem and I know of no research that has attempted to tackle the second problem. Given the very large sums of money that are devoted to market research, both problems seem to me to be worthy of more attention than they have so far been given.
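To make the first problem concrete, here is a minimal sketch of how a species-richness estimator might be applied to focus group output. It uses the bias-corrected Chao2 incidence estimator from the ecology literature, which is my own substitution for illustration: it is not the method of Efron and Thisted [3], nor that of Griffin and Hauser [4]. The function name, the coding of each group as a set of distinct opinions, and the example opinions are all hypothetical.

```python
from collections import Counter


def estimate_unseen_opinions(groups):
    """Return (observed, estimated_unseen) opinion counts.

    groups: a list of sets, one per focus group, each holding the
    distinct opinions voiced in that group (a hypothetical coding).
    Uses the bias-corrected Chao2 estimator, treating each focus
    group as one sampling unit.
    """
    m = len(groups)
    if m == 0:
        raise ValueError("need at least one focus group")
    # Count the number of groups in which each opinion was voiced.
    incidence = Counter(opinion for g in groups for opinion in set(g))
    observed = len(incidence)
    q1 = sum(1 for n in incidence.values() if n == 1)  # heard in exactly one group
    q2 = sum(1 for n in incidence.values() if n == 2)  # heard in exactly two groups
    unseen = ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))
    return observed, unseen


# Hypothetical data: three groups discussing a product.
groups = [
    {"price", "taste", "packaging"},
    {"price", "taste", "availability"},
    {"price", "brand", "taste"},
]
observed, unseen = estimate_unseen_opinions(groups)
print(f"heard {observed} opinions, estimated {unseen:.1f} still unheard")
```

The estimate is driven by opinions heard in only one group. If every group repeats exactly the same opinions, there are no such singletons and the estimate of unheard opinions is zero, however many or few opinions were voiced. Real use would need to check the estimator's assumptions, for instance that groups are independent samples from the same population.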

[1] Lunt, P., & Livingstone, S. (1996). Rethinking the focus group in media and communications research. Journal of Communication, 46(2), 79–98.

[2] Karanth, K. U. (1995). Estimating tiger Panthera tigris populations from camera-trap data using capture–recapture models. Biological Conservation, 71(3), 333–338.

[3] Efron, B., & Thisted, R. (1976). Estimating the Number of Unseen Species: How Many Words Did Shakespeare Know? Biometrika, 63(3), 435–447.

[4] Griffin, A., & Hauser, J.R. (1993). The voice of the customer. Marketing Science, 12(1), 1–27.

Contributors: Mark R. Diamond

# A test of fundamental economics

Photo: en.wikipedia.org

No, not something that will win you the Bank of Sweden Prize but rather a simple honours research project crossing the domains of economics and human behaviour. The subject is toilet paper.

You might have noticed that the quality of toilet paper in large office blocks, universities, schools and sporting complexes usually isn’t a patch on what you might have at home. A quick check of warehouse prices for bulk buys of toilet paper suggests that you could spend anything from AUD\$0.40 to AUD\$1.60 per roll, so my guess is that purchasers believe that buying a lower quality product will save money overall. But does it, or does usage increase as quality decreases, more than compensating for any of the original cost-per-item saving? I’m assuming that price and quality are highly correlated, but they might not be.

It couldn’t be too hard to create nicely controlled experiments to answer both the question about the relationship between price and quality, and the question about quality and usage. Using my own mythical numbers, I estimate that a building of 1000 people could save around AUD\$20,000 annually by answering the questions.
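The arithmetic behind the question can be sketched in a few lines. The two prices are the ends of the range quoted above and the building size is the one used in the text; the usage figures are purely hypothetical placeholders, since measuring them is the whole point of the proposed experiment.

```python
def annual_cost(price_per_roll, rolls_per_person_per_year, people):
    """Total yearly spend on toilet paper for a building, in AUD."""
    return price_per_roll * rolls_per_person_per_year * people


def break_even_usage_ratio(cheap_price, premium_price):
    """How many times more cheap rolls can be used before the saving vanishes."""
    return premium_price / cheap_price


CHEAP, PREMIUM = 0.40, 1.60  # AUD per roll: the price range quoted in the text
PEOPLE = 1000                # building size used in the text
PREMIUM_ROLLS = 25           # hypothetical rolls per person per year
USAGE_MULTIPLIER = 1.5       # hypothetical extra usage of the cheaper paper

ratio = break_even_usage_ratio(CHEAP, PREMIUM)
cheap_total = annual_cost(CHEAP, PREMIUM_ROLLS * USAGE_MULTIPLIER, PEOPLE)
premium_total = annual_cost(PREMIUM, PREMIUM_ROLLS, PEOPLE)

print(f"break-even usage ratio: {ratio:.1f}")
print(f"cheap paper:   AUD {cheap_total:,.0f} per year")
print(f"premium paper: AUD {premium_total:,.0f} per year")
```

At the extremes of the quoted price range, the cheaper roll only loses if people use more than four times as many of them. For products closer in price, the break-even ratio is much smaller, which is where measured usage data would matter most.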

Contributors: Mark R. Diamond

# Blogging, thesis writing, and limerence

I hadn’t realised until I sat down to write a posting today, just how much writing a blog is like writing a thesis. The beginning of either is not unlike the early stages of a love affair; a stage that marital therapists sometimes refer to as “limerence”. Just as in a relationship, you begin your dissertation (or blog) with an enormous sense of enthusiasm and desire, and want to spend as much time as possible with it. Getting on with your research, planning it, and thinking about when you’ll next be able to spend time with it, are constant preoccupations. You have thoughts about how important a PhD is to you, what sorts of goals and dreams and plans the PhD will help you achieve, and you allow it to overflow into other areas of your life, maybe even pushing other important commitments into the background. In the same way, I discovered that planning the layout of the blog, the content, designing the style sheets, and trying to work on an appropriate writing style occupied a great deal of my own thinking. And despite the amount of time that it occupies (or occupied), the process seemed quite effortless at first since the intensity of one’s enthusiasm makes all the associated demands seem minor.

Later on, just as with a relationship, a thesis (or blog) enters a so-called “mature phase”, although, in the case of this particular blog, I think it is still a long way off! But in between the early phase of limerence and the mature stage, there is a long intermediate period which lacks some of the early shine of the beginning. The change away from a phase of limerence is obvious. Working on your thesis becomes harder. The demands of daily life, like doing the grocery shopping, going out, seeing friends, and earning money, all compete to take your attention away from work on your thesis. Consequently, the research and writing process becomes ever harder.

You might also begin to have doubts that never crossed your mind in the early stages. When you started, it was “obvious” that a dissertation, PhD, blog, or whatever, was the right thing to do. You might not even have thought much about why it was so important. In the middle period, you begin seriously to doubt your sanity in having taken on the project, and might avoid thinking about it as deliberately as you once dwelt on it.

So, what to do?

If you are reading this blog two years into your dissertation research, you might think that you have missed the boat as far as the early planning stage is concerned … but I don’t think that’s entirely true. Even if you did not consider the questions that I have posed when you began your thesis, you can still ask and answer them now.

Be willing to adjust your expectations as your research proceeds. Most research projects at PhD level, as opposed to Honours or Masters, do not go according to plan. Most of the reason for this is that a plan for a PhD is almost an oxymoron. If you knew how your research was going to go, it would imply that you already knew the outcome of the initial stages of the research, in which case they would be unnecessary. What usually happens is that after your initial framing of a topic, and early explorations or experiments in the area, new and unforeseen experiments or avenues of exploration present themselves, and your research tacks in a direction that is different from the one you initially anticipated. So you need to be open to new ideas, and be prepared to be flexible.

And on that note, I shall end. If you have read this far in one sitting, thank you. But I am about to take my own, next piece of advice. Namely, be prepared to take breaks. I shall write more next time.

Contributors: Mark R. Diamond

# More unblocking

Unlike writer’s block, unblocking a drain can be done easily. Photo: Mark Diamond

Last time, I commented that there is a multitude of ways to overcome writer’s block, and having described one method in the previous blog, I propose to focus on a few more over the coming weeks.

Today, I have decided to focus on what might be termed “investment free” writing. Far too many people, and especially, though not exclusively, those who are early in their writing career, confuse evaluations of their writing with evaluations of themselves. For example, you struggle hard to produce a piece of writing with which you are not entirely satisfied, and then reach conclusions like “I’m hopeless”, “I knew I was no good”, and so on. Or your supervisor criticizes your punctuation and you think, “I’m not smart enough to get a PhD”. Not only are conclusions like this self-defeating, in that they lead you to feel miserable and demotivated (instead of more energized as you would want), but they are focussed on you as a person, rather than on your writing.

One way that some people have found of disengaging themselves from their writing product is to practice writing something that is deliberately “wrong”. It might be wrong in any number of ways, and the particular way is entirely up to you, but you might consider the following:

Write as if English (if that is the language of your thesis) is not your first language. Of course, it might actually be true that English isn’t your first language, but that is not the point. The purpose is to write in a way that makes it show in an exaggerated way that writing in English doesn’t come easily to you. That much will be true, otherwise you wouldn’t have writer’s block. Here is example. I copy from Tom Lehrer song called “Lobachevsky”. If you not hear this song before, you must listen. Tom Lehrer he is very good. I am remember the time I first hear him. He make big impression to me. I am thinking, “If this man, he can make money with voice like that, then I can be millionaire.” But first, I am giving supervisor new song draft!

Another approach is to take a viewpoint with which you disagree entirely, and write a spoof or a parody. Put in jokes. Sure, whatever you write might not end up in the final version of your thesis, but (a) it might just succeed in unblocking the writing flow, (b) I’d be surprised if writing a spoof did not help you clarify what you really do think, and (c) there might be parts that you can use just as you have written them.