Technical Writing Tips for Graduate Students
Effective technical writing is one of the most important skills that a graduate student can develop and hone during their academic career. Your research - no matter how enlightened and groundbreaking it may be - will not have an impact if you cannot communicate it effectively. Publication in conference proceedings and archival journals is the engine that drives science. If you do not publish your work, it will not influence other scientists, clinicians, industry, and society.
More practically, the value of your academic degree is directly dependent upon the quality of your technical writing. That’s right. If you can’t write, your degree loses value. The Ph.D. (or M.S.) degree that you hope to obtain through your work in the BiRG lab is a research degree. The faculty search committee at Prestigious State University will not be impressed by your 4.0 GPA. Rather, it is the number, quality and research area of the conference presentations and journal papers that you have authored that will attract the attention of your soon-to-be peers.
So, what does it take to write a good technical document? Here I’ll outline a few key points to help you with the process. Before I begin, it should be noted that effective technical writing is a skill, and as with any skill it is not sufficient to only study about writing. You will not become an effective writer by reading this web page. To become a solid technical writer, you must (you guessed it) write. Then you must edit, re-write, and edit and re-write again. Along the way, it doesn’t hurt to read, read, and read some more. A good target for a graduate student is to read three journal articles in your research area every week. This will not only keep you up to date in your field (essential for any scientist) but also will steep you in the linguistic constructs and techniques of precise technical writing. If you take only one message away from this document, let that be it. To write well, begin by reading often and critically. You will find many examples of poorly written manuscripts in the literature, and a few gems. Keep those gems at hand. Highlight the most important sections and re-read them before you sit down to work on your dissertation or your latest submission to Science or PNAS.
Writing a Journal Article
A well-written journal article serves several purposes. First and foremost, it reports your novel and interesting research results to the rest of the scientific community. Second, your article serves as an archival record of your work. This is of critical importance. It is through publication that your research becomes a brick in the ongoing structure of science.
To ensure that your work gains lasting impact, you need to accomplish two things:
- Your paper must report your results in a clear, compelling, accessible and accurate manner. Yes, you are writing for other scientists. Yes, many of them are experts in your specific field of research. It is a mistake, however, to assume that readers of your paper will understand your results and their importance unless these things are clearly articulated in your manuscript. Moreover, scientists are trained to read the literature with a critical eye. It is incumbent upon you, the writer, to convince your peers of the accuracy and import of your results. The best way to do this is through effective selection and presentation of figures and tables, which I will discuss more thoroughly below.
- Your paper must describe your methods in such a way that your work is reproducible. As any experienced scientist will tell you, this is more difficult than it sounds. If your methods are well written and well organized, another scientist should be able to reproduce your result, starting from scratch, without communicating with you or any of the other authors of your paper. You must include every detail necessary to re-create your experiment, while omitting implementation-specific details that needlessly complicate the description.
Presenting results through figures and tables
When you are ready to prepare your paper, you should have in mind the story you want to tell. Imagine what you would say if your grandmother asked you about your research. What would you tell her? If a reporter called and asked you to sum up your results in a five minute phone conversation, what would you say? Before you type a single word of the title of your paper, you should formulate in your mind the story you intend to tell. Usually that story will read something like this:
- Why is your research area important?
- What is already known in your area?
- What important facts remain unknown? What “holes” are there in the scientific body of knowledge in this area?
- What were your results? How do your results close one or more of those holes?
- What is the practical importance of your results?
Once you have this story in mind, you should consider how you can convey the story through your figures and tables. As you do this, keep in mind that many scientists will decide whether your paper is worth reading by first scanning your figures and tables. Most research articles contain five to seven figures and tables. Ideally, these figures and tables, along with their legends, should tell the story you would tell to the reporter. Why is your research important? What did you find? Why does it matter? Choosing your figures and tables is the key step that determines the content of the rest of your paper. Choose carefully, and run your ideas past your advisor, your lab, and your collaborators before you settle on a set of figures to include.
Once you have decided what figures and tables to include, consider how they should be presented. In general, a figure is more effective than a table in conveying results and their importance. Use simple, clearly labeled figures wherever possible. Read through these guides for more advice on creating your figures:
Take a look at 15 Stunning Data Visualizations to see how an effective visualization can convey a result in a clear and compelling manner. When creating your graphs and charts, remember the following:
- Label your X and Y axes clearly.
- Include a complete description of the X and Y axes and the data being represented in the legend.
- If you have more than one graph (or subgraph) with similar or related results, make sure the X and Y scales match unless there is a good reason not to. If they don’t match, explicitly say so in the legend(s).
- You do not have to do all of your graphing in Excel and Matlab. There are many, many specialized tools available for scientific data visualization. Here are just a few.
Writing for reproducibility
To explore the idea of writing for reproducibility, let us consider a specific example. Bob the bioinformatics graduate student has spent the last two years analyzing homologous genes among a variety of eukaryotic organisms to determine whether eukaryotes in general are becoming more AT-rich over time. If another researcher wanted to replicate Bob’s work, she would need to know such information as:
- What data sets were used? Either make the original research data available and provide a URL/URI, or state the database and the build used. Bioinformatics data sets change rapidly, so the specific build and the date on which you downloaded the data are important.
- How was the data culled, normalized, pre-processed, or otherwise prepared for analysis?
- What algorithms and statistical methods were used for the analysis? Cite original papers for existing methods. Provide clear and complete algorithms for new methods.
- How were the results tested and validated?
It is often particularly challenging to describe novel algorithms in an intuitive and detailed manner. Where possible, provide source code links as supplemental information to ensure that your colleagues can reproduce your results precisely.
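A minimal sketch of how the data-set details in the list above might be captured at download time, so that they are on hand when you write your methods section. The function name `provenance_record` and the record fields are illustrative assumptions, not a standard. The URL, build, and date are the hypothetical values from the sample manuscript later on this page.

```python
import hashlib
import json
import os
import tempfile
from datetime import date


def provenance_record(path, source_url, build, downloaded_on):
    """Describe where a data file came from, plus a checksum of its contents.

    Hypothetical helper: the field names are illustrative, not a standard.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "source_url": source_url,
        "build": build,
        "downloaded_on": downloaded_on.isoformat(),
        "sha256": digest.hexdigest(),
    }


if __name__ == "__main__":
    # Stand-in for a downloaded data file (hypothetical contents).
    with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
        tmp.write(b">seq1\nACGTACGT\n")
    try:
        record = provenance_record(
            tmp.name,
            source_url="ftp://niftydatasets.com/genomes",
            build="1207",
            downloaded_on=date(2011, 12, 18),
        )
        print(json.dumps(record, indent=2))
    finally:
        os.remove(tmp.name)
```

The build and download date belong in the manuscript itself. The checksum is for your own records and supplemental material, where it lets a colleague confirm that they are analyzing the same bytes you did.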
While many manuscripts are guilty of not providing sufficient detail to allow reproduction of the results, it is also possible to provide too much information. Any detail that is not necessary for reproducing your work should be omitted. For example, if Bob provided a clear and complete description of his methods, then another researcher should be able to replicate his results without knowing:
- File names, variable names, and other implementation details,
- What language(s) the scripts and analysis programs are written in,
- Why Bob chose to store the data in CSV instead of XML format, etc.
Things to remember while writing
Cite primary sources
Wherever possible, you should cite the primary source (that is, the original paper) for existing results and methods. Avoid citing textbooks except in very rare cases for commonly known facts. Use the textbook’s bibliography to find the primary source and cite that instead. If you are reading your three papers a week, you should have a wealth of material to cite. Use a citation manager such as EndNote or Mendeley to ease the process of creating and maintaining your citation library.
Avoid making unsubstantiated claims. Many graduate students feel compelled to make judgements like this: “Any good scientific study should include control data. For this study we…” When you are a well known full professor you can publish position papers. For now, stick to the facts. Be sure to cite appropriate sources when discussing results and/or methods that are not your own.
Grammar and style
- Read the following links immediately:
- How to use an apostrophe
- How to use a semicolon
- What it means when you say literally (not so important for scientific writing, but read it anyway)
- When to use i.e. in a sentence
- Ten words you need to stop misspelling
- Use the active voice wherever possible. See this link for more details: BioMedical Editor - Active and Passive Voice.
- Avoid the use of quotation marks for novel terms. Suppose you introduce a new technique called “fractal atomic density” for the measurement of local atomic density in macromolecules. It is acceptable (though I don’t recommend it) to use quotes when first defining the term. Thereafter, leave them out. If you include quotes too often, it soon appears to the reader as if your new technique is not “real science”, and maybe the author doesn’t “know what he’s talking about”.
- When using acronyms, make sure your indefinite article agrees with the way the acronym is read out loud. In other words, if I’m using STR for “short tandem repeat”, then I should refer to an STR in the manuscript, not a STR. If you don’t see why, try reading both of them out loud.
- Avoid the urge to add extra quantifiers and adverbs. The word “very” can almost always be removed or changed. Example: “The classifier achieved very high accuracy for the third dataset.” How high is “very high”? Remove “very”, or use a more precise term like “substantially higher”, or better yet, report the accuracy as a number and avoid “very high” altogether. In particular, watch out for the word “significantly”. Almost all scientists associate this term with statistical significance. If that is not what you mean, find another word.
- Words in the title (as well as in chapter and section titles) should be capitalized only if they are proper nouns or acronyms.
- References to specific chapters and figures are proper nouns and should be capitalized in the text (e.g., “as shown in Figure 7”).
- Avoid over-using the comma. Often a sentence can be reorganized to avoid using multiple commas.
- Keep your sentences short and simple, where possible. A scientific paper is difficult enough for a reader to digest without adding run-on sentences into the mix. Each sentence should convey one clear idea.
- Be careful with word usage. For example, a common error is the use of the word “since” when “as” or “because” is more appropriate (“since” is a temporal word and should ONLY be used when referring to time). Another common error is the use of “between” when “among” is more appropriate (“between” implies two choices; “among” implies more than two). A list of these sorts of common mistakes can be found in any good technical writing text.
- Avoid colloquial/informal terms. Example: “Classifier accuracy for the third run broke 95%” becomes “Classifier accuracy for the third run exceeded 95%”.
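To make the advice above about replacing “very high” with a number concrete, here is a minimal sketch that reports an accuracy together with an interval around it. The choice of the Wilson score interval, the function name `accuracy_with_interval`, and the counts (95 correct out of 100) are assumptions made for illustration, not something the guidance above prescribes.

```python
import math


def accuracy_with_interval(correct, total, z=1.96):
    """Return (accuracy, lower, upper) using the Wilson score interval.

    z=1.96 corresponds to an approximate 95% interval.
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1 - p) / total + z**2 / (4 * total**2)
    )
    return p, centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical counts: 95 correct predictions out of 100 test cases.
    acc, low, high = accuracy_with_interval(95, 100)
    print(f"Accuracy: {acc:.1%} (95% interval {low:.1%} to {high:.1%})")
```

A sentence built from these numbers tells the reader exactly how high the accuracy was and how much it might vary, which “very high” never can.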
Paragraphs and flow
The first sentence of each section should provide a natural transition from the previous section. A common mistake in thesis/dissertation/paper writing is to move to a new topic with no transition, leaving the reader asking, “Why is the author telling me this?” For example, consider the following section of a hypothetical manuscript:
DATA COLLECTION
Genome sequences for 12 prokaryotic and 14 eukaryotic species were downloaded from the NCBI full genomes dataset (build 1207, downloaded from ftp://niftydatasets.com/genomes on 12-18-2011). …
PRINCIPAL COMPONENTS ANALYSIS
Principal components analysis uses the eigenvectors of the sample covariance matrix to produce a reduced-dimensionality projection of multivariate data sets for visualization and data exploration. …
Huh? I was reading about data collection, and suddenly I’m getting a tutorial on PCA. Why is the author telling me this? How does it relate to what I was just reading? The literary whiplash caused by this abrupt section transition can easily be corrected by relating the content of the new section to the overall flow of the paper. A sentence like this will do the trick: “After data collection and preprocessing, the resulting codon counts were visualized using principal component analysis (PCA).” Now I can transition into the existing section without causing stress and confusion for the reader.
Citations
All direct quotations must be marked as such. Brief quotations should be enclosed in quotation marks. Long quotations should be set off by indentation on both margins. Paraphrased material does not need quotation marks, but the reader must be able to tell where the paraphrased work begins and ends. If the paraphrase extends over multiple paragraphs, make this clear in the text. Make certain to reference EVERY advanced concept that you use. For example, if you use a Genetic Algorithm, you should provide a reference that a reader who is unfamiliar with GAs can follow up with. Ideally, this reference should be the source of your own understanding of GAs.
Writing a thesis or dissertation
All of the material above applies to theses and dissertations as well. Your dissertation does not have the strict space limitations of a journal article, so you have the opportunity to provide a more complete and detailed treatment of your methods, results, and conclusions.
A thesis/dissertation generally has the following Chapter format:
- Abstract (1-3 pages)
- Chapter One: Introduction
- Include brief background necessary to understand the research at high level
- Include a list of your novel contributions
- Include a brief discussion of why the field in general and your contribution in particular are important
- The final section is an outline of the remainder of the work
- Chapter Two: Background and Related Literature
- Include a section on all major research areas involved in your work and that are necessary to understand its significance
- It is important here to perform a complete and detailed literature survey, discussing other researchers who are doing similar/related work. Describe how your work relates to the previous work in the field.
- Chapters Three through X-1: One chapter per major contribution
- Ideally each chapter poses a problem, presents a solution, and provides results which demonstrate the application of that solution.
- Chapter X: Conclusions and future work
- Final remarks, a repeated list of all contributions, “lessons learned”, and recommendations on future directions for related work
- Appendices A through N (if necessary)
- Appendices are often used to include documented code, tables of data, or other material that is too long to be included in the work proper
- Bibliography