Research Methodology — A Practical Guide for B2–C1 Academic Writing
10 Principles of Strong Methodology
- Fit the question. Design follows the research question.
- Replicable. A careful reader could run your study from your description.
- Justified. Every choice (design, sample, measures, analysis) has a reason.
- Transparent limits. Name weaknesses + mitigations.
- Bias control. Randomization, blinding, triangulation, inter-rater checks.
- Ethical. Consent, confidentiality, data protection, risk management.
- Operationalized. Abstract ideas → measurable variables or analyzable artifacts.
- Validity & reliability. Show how you checked them.
- Appropriate analysis. Tools match the data & question.
- Professional tone. Precise verbs, hedging, clean structure.
Typical Methodology Structure
- Design (approach & rationale)
- Participants / Sampling (who, how many, how selected)
- Materials / Instruments (constructs & measures)
- Procedure / Data Collection (what happened, when, where)
- Data Preparation (coding, cleaning, transcription)
- Data Analysis (statistics / qual strategy)
- Validity/Reliability or Trustworthiness
- Ethical Considerations
- Limitations & Delimitations
Sentence Frames
Design: This study adopts a(n) [experimental / survey / case study / ethnography / mixed-methods] design because …
Sampling: Participants were recruited via [source] using [random / stratified / purposive / snowball] sampling; inclusion criteria were …
Operationalization: [Construct] was operationalized as [observable measure], captured by [instrument].
Procedure: After consent, participants completed [task] for [duration] under [conditions].
Analysis (quant): We tested H1 with [t-test/ANOVA/OLS/logistic regression] after confirming assumptions.
Analysis (qual): Transcripts were analyzed via reflexive thematic analysis: familiarization → coding → themes → review → naming.
Quality checks: Inter-rater reliability was κ = … / We used member checking and triangulation. (A minimal κ computation is sketched after these frames.)
Ethics: The protocol received approval from [board] (Ref. [ID]); data were stored on encrypted drives.
Limitations: Results may not generalize beyond [context] due to [limit]; however, [mitigation].
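To make the quality-checks frame concrete, here is a minimal sketch of an inter-rater agreement computation with scikit-learn. The codes ("praise", "criticism", "advice") and ratings are hypothetical, not from any real study:

```python
# Minimal sketch: Cohen's kappa between two coders on the same excerpts.
# The code labels and ratings below are hypothetical examples.
from sklearn.metrics import cohen_kappa_score

coder_a = ["praise", "criticism", "praise", "advice", "advice", "praise"]
coder_b = ["praise", "criticism", "advice", "advice", "advice", "praise"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")  # fills the "κ = …" slot in the frame above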
What a Skeptical Examiner Will Ask
- Why this design? → Link to logic of the question.
- Why this sample size? → Power analysis (quant) / saturation (qual); see the power sketch after this list.
- Why these measures? → Validity & reliability/citations; pilot checks.
- What could bias results? → Confounds & controls; reflexivity log.
- Is analysis appropriate? → State assumptions/tests or epistemology & audit trail.
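For the sample-size question, the standard quantitative justification is an a priori power analysis. A minimal sketch with statsmodels, assuming a medium expected effect (d = 0.5) purely for illustration:

```python
# A priori power analysis for an independent-samples t-test.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,   # assumed medium effect (Cohen's d), for illustration only
    alpha=0.05,        # two-tailed significance level
    power=0.80,        # desired probability of detecting the effect
    alternative="two-sided",
)
print(f"required n per group: {n_per_group:.0f}")  # about 64 for these inputs
```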
Sample Methodology — Quantitative (≈550 words)
Design and Rationale. We used a randomized controlled trial because the question concerns the causal impact of an instructional technique (retrieval practice) on retention. Two parallel groups (Retrieval vs. Restudy) were taught the same 80 academic words over four weeks.
Participants and Sampling. Sixty Terminale students (M_age = 17.5, 34F) from two French high schools participated. After obtaining parental and student consent, volunteers were block-randomized within class by prior English grade (high/low) to balance baseline proficiency. Inclusion criteria: regular attendance; no diagnosed learning disorder affecting memory.
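As an illustration of the block randomization described above, a minimal Python sketch for a single stratum; the IDs, block size, and seed are hypothetical simplifications:

```python
# Illustrative sketch of block randomization within one stratum
# (e.g., the "high prior grade" students); names and sizes are hypothetical.
import random

def block_randomize(participant_ids, block_size=4, seed=2024):
    """Assign conditions in shuffled blocks (half Retrieval, half Restudy)."""
    rng = random.Random(seed)
    block = ["Retrieval", "Restudy"] * (block_size // 2)
    assignments = {}
    for start in range(0, len(participant_ids), block_size):
        rng.shuffle(block)
        # An incomplete final block can introduce a small imbalance.
        for pid, condition in zip(participant_ids[start:start + block_size], block):
            assignments[pid] = condition
    return assignments

high_prior = [f"H{i:02d}" for i in range(1, 31)]  # hypothetical stratum of 30
print(block_randomize(high_prior))
```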
Materials and Measures. Vocabulary items were selected from B2–C1 academic lists aligned to the curriculum. Immediate learning was checked each week with a 10-item multiple-choice quiz (formative only). Primary outcome was a delayed cued-recall test at Week 5 (80 items; L1 cues → L2 response) scored as correct/incorrect. A transfer task (80 items embedded in sentences) assessed application. Parallel forms minimized test–retest effects; items were counterbalanced across participants.
Procedure. Both groups received identical 20-minute lessons (presentation + short exercises).
- Retrieval group: finished with a low-stakes quiz (free recall, then feedback).
- Restudy group: used the same time to reread examples and complete fill-in-the-blank worksheets.
No grades were assigned to avoid motivational confounds. At Week 5, all students sat the delayed tests under exam conditions (25 minutes).
Data Preparation. We preregistered the analysis plan and exclusion rules (absent for ≥2 sessions; <50% completion). Missing values (<3%) were listwise deleted after confirming that missingness was unrelated to condition. Assumptions (normality, homogeneity of variance) were checked; violations triggered robust tests.
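A minimal sketch of what these assumption checks and robust fallbacks could look like; the score arrays are simulated placeholders, not study data:

```python
# Sketch: Shapiro-Wilk for normality, Levene for equal variances,
# with Welch's t-test or Mann-Whitney U as robust fallbacks.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
retrieval = rng.normal(52, 10, 30)  # placeholder delayed-recall scores
restudy = rng.normal(45, 12, 30)

normal = all(stats.shapiro(g).pvalue > .05 for g in (retrieval, restudy))
equal_var = stats.levene(retrieval, restudy).pvalue > .05

if normal:
    # Welch's t-test (equal_var=False) is the robust choice if variances differ.
    result = stats.ttest_ind(retrieval, restudy, equal_var=equal_var)
else:
    # A common nonparametric fallback when normality fails (an assumption here).
    result = stats.mannwhitneyu(retrieval, restudy)
print(f"statistic = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```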
Analysis. Primary analysis: independent-samples t-test on delayed cued-recall scores. Secondary analyses: two-way ANOVA (Condition × Baseline proficiency), and logistic mixed-effects models for item-level accuracy (random intercepts for participant and item). Effect sizes: Cohen’s d and odds ratios with 95% CIs. α = .05 (two-tailed); Holm–Bonferroni controls multiplicity.
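To illustrate two pieces of this plan, a short sketch of Cohen's d with a pooled standard deviation and a Holm correction via statsmodels; the secondary p-values are placeholders:

```python
# Sketch: Cohen's d (pooled SD) and Holm correction for a family of
# secondary p-values; the p-values below are hypothetical placeholders.
import numpy as np
from statsmodels.stats.multitest import multipletests

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * np.var(a, ddof=1)
                  + (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var)

secondary_p = [0.012, 0.034, 0.21]  # hypothetical secondary-analysis p-values
reject, p_holm, _, _ = multipletests(secondary_p, alpha=0.05, method="holm")
print(p_holm, reject)
```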
Validity and Reliability. Content validity derives from curriculum alignment and expert review (two teachers independently vetted items). Internal consistency of the delayed test was estimated via KR-20. To reduce expectancy effects, proctors were blind to condition during Week-5 testing.
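KR-20 is straightforward to compute from a binary item matrix. A minimal sketch, with responses simulated only so the function is runnable:

```python
# Minimal sketch of KR-20 for dichotomous (0/1) item scores.
# The response matrix below is simulated, not study data.
import numpy as np

def kr20(items):
    """Kuder-Richardson Formula 20 for a participants x items binary matrix."""
    k = items.shape[1]                         # number of items
    p = items.mean(axis=0)                     # proportion correct per item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - (p * (1 - p)).sum() / total_var)

rng = np.random.default_rng(1)
ability = rng.normal(0, 1, (60, 1))     # placeholder person abilities
difficulty = rng.normal(0, 1, (1, 80))  # placeholder item difficulties
prob = 1 / (1 + np.exp(-(ability - difficulty)))
items = (rng.random((60, 80)) < prob).astype(int)
print(f"KR-20 = {kr20(items):.2f}")
```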
Ethics. Approval was granted by the school research committee. Students could withdraw at any time; data were anonymized with numeric IDs and stored on encrypted drives.
Limitations. Classes, not individuals, received instruction together; spillover between groups cannot be ruled out, though scheduling minimized contact. Results may not generalize beyond academically tracked streams or beyond short-term retention.
Sample Methodology — Qualitative (≈520 words)
Design and Rationale. We adopted an interpretivist, phenomenological approach to explore lived experiences, aiming for rich descriptions rather than causal inference.
Participants and Sampling. We used purposive sampling to recruit a sample diverse in gender, grades, and language backgrounds. Fifteen students (8F) from two schools volunteered after class announcements. Inclusion: enrollment in LVA; recent receipt of at least two marked essays. Recruitment ended at thematic sufficiency (no new codes emerging in consecutive interviews).
Data Collection. We conducted semi-structured interviews (30–45 minutes) in a quiet room at school. The guide probed: moments of helpful vs. unhelpful feedback; emotions; how students used comments; perceived fairness. Interviews were audio-recorded and transcribed verbatim. Field notes captured nonverbal cues immediately after each session.
Researcher Reflexivity. The interviewer, a teacher with assessment experience, maintained a reflexive journal and discussed emerging biases in fortnightly peer-debriefs to manage role conflict.
Analysis. We used reflexive thematic analysis (Braun & Clarke): familiarization → initial codes → candidate themes → review → naming → reporting. Coding was conducted in NVivo. For credibility, a second coder independently coded 20% of transcripts; disagreements were discussed to refine the codebook (we report example disagreements rather than a target κ). Member checks were performed via one-page summaries of preliminary themes sent to participants.
Trustworthiness. Credibility: triangulation of interviews and field notes; member checks. Dependability: audit trail (guide versions; codebook iterations). Confirmability: reflexive memos; peer-debrief minutes. Transferability: thick description of context.
Ethics. Informed consent/assent; pseudonyms in all reports. Students could skip questions or withdraw without penalty. Sensitive comments about teachers were handled carefully and removed from school-level reporting to prevent identification.
Limitations. Findings are context-bound; participants were volunteers who may value feedback more than peers. Interviews rely on retrospective accounts; classroom observation could complement self-report in future work.
Style & Language Micro-Advice
- Prefer concrete nouns (block randomization, cued-recall test) over vague ones.
- Replace value words with criteria: not “reliable,” but “KR-20 = .82”.
- Keep paragraphs modular: one decision per paragraph; end with the rationale.
- Use signposting (first/then/finally; in contrast; on balance) and hedging (appears, suggests, may).
- Tables and flow diagrams help: a CONSORT-style flow for experiments; a sampling/recruitment flow for qualitative studies.
Common Pitfalls (and quick fixes)
- Over-long summary. → Cap to 2–3 main decisions per subsection.
- Vague measures. → Define constructs; cite validity/reliability or pilot.
- Unstated assumptions. → Name tests (quant) or epistemology (qual).
- No bias control. → Add pre-specification, blinding, triangulation, inter-rater checks, reflexivity.
- Ethics as an afterthought. → Consent, anonymity, storage plan, withdrawal rights—explicit.
