Research Data Management
Resources, tools, and best practices to help you manage your research data at Lakehead University.
Need Help? Contact Our Specialist
For personalized assistance with data management plans (DMPs), security protocols, storage options, or any other RDM questions, please reach out.
Andrew Austin
Research Security & Data Management Specialist
What Is Research Data Management?
Research data management is how you organize, store, protect, and share your research data throughout your project and beyond. Think of it as the difference between a lab bench covered in unlabeled samples versus one where you can find exactly what you need, when you need it.
Why it matters to you:
- Find your own data later. Six months from now, will you remember what "final_version_v3_FINAL.xlsx" actually contains?
- Meet funding requirements. Tri-Agency grants now require data management plans. No plan = no funding.
- Protect sensitive information. Health data, Indigenous community data, and personal information need specific security measures. Get this wrong and you risk real harm to participants and your research ethics approval.
- Collaborate effectively. Your grad students, co-investigators, and future researchers need to understand your data structure.
- Preserve your work. When you leave Lakehead or when hard drives fail, your data needs to survive.
What it actually involves:
- Choosing appropriate storage based on your data sensitivity (not everything belongs on Google Drive)
- Naming files and folders systematically so others can navigate them
- Documenting what your data means (variable names, units, collection methods)
- Planning for Canadian data residency requirements when your funder or ethics board demands it
- Knowing where your data lives and who controls it
RDM isn't optional paperwork—it's basic research infrastructure that saves you time, meets compliance requirements, and protects your participants.
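For example, systematic file naming can be as simple as a fixed pattern that every team member follows. The sketch below shows one possible convention (project, experiment, collection date, version); the pattern and field names are illustrative, not a Lakehead requirement.

```python
import re
from datetime import date

# One possible convention (illustrative, not a Lakehead standard):
#   <project>_<experiment>_<YYYY-MM-DD>_v<NN>.<ext>
NAME_PATTERN = re.compile(r"^[a-z0-9]+_[a-z0-9-]+_\d{4}-\d{2}-\d{2}_v\d{2}\.[a-z0-9]+$")

def build_filename(project: str, experiment: str, version: int, ext: str,
                   collected: date) -> str:
    """Build a predictable, sortable file name and check it against the pattern."""
    name = f"{project}_{experiment}_{collected.isoformat()}_v{version:02d}.{ext}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"Name does not follow the convention: {name}")
    return name

# 'lakeice_sensor-array_2025-03-14_v02.csv' instead of 'final_version_v3_FINAL.xlsx'
print(build_filename("lakeice", "sensor-array", 2, "csv", date(2025, 3, 14)))
```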
About Data Management Plans (DMPs)
What is a DMP?
A Data Management Plan (DMP) is a structured document that outlines how research data will be collected, organized, stored, protected, and shared over the course of a project. It acts as a roadmap for the data lifecycle, from creation to long-term preservation or disposal.
In practical terms, a DMP describes:
- What data you plan to collect or generate (e.g., interviews, lab results, code, images).
- How you will manage it (e.g., file naming, version control, metadata standards).
- Where you will store it (e.g., secure drives, approved cloud services, repositories).
- Who can access it (e.g., research team only, or broader sharing).
- How long it will be kept and how it will eventually be archived or disposed of.
It is a living document, meaning it can evolve as your project changes.
Why You Need a DMP
A DMP is more than a formality: it is a tool that strengthens your research and protects the people, data, and institutions involved.
- Research quality and integrity: By planning ahead, you ensure that data are collected consistently, remain accurate, and can be verified or reproduced later.
- Ethics and compliance: Many funders, including the Tri-Agencies, now require DMPs. A strong plan also helps you align with Lakehead's data classification standards, privacy laws, and commitments to research participants.
- Efficiency and risk reduction: A DMP saves time by preventing data loss, duplication, or confusion within research teams. It also helps reduce risks of unauthorized access or breaches.
- Future value of data: Well-managed data can be reused, cited, and built upon by others, increasing the visibility and impact of your work.
- Community responsibility: For projects involving Indigenous Peoples, a DMP provides a way to respect Indigenous data sovereignty by embedding OCAP® and CARE principles into your research practices.
General DMP Tips
- Start planning early and involve your team.
- Avoid discipline-specific jargon and define acronyms.
- Describe your data in detail (types, volume, formats).
- Ensure clear documentation and file-naming conventions.
- Describe your storage strategy and security measures.
- Describe all ethical and legal considerations.
- Have a plan for long-term preservation and sharing.
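To make the documentation and data-description tips above concrete, the sketch below writes a small data dictionary alongside a dataset so variable names, units, and collection methods are recorded once and travel with the data. The variables shown are hypothetical examples.

```python
import csv

# Hypothetical variables for an example dataset; replace with your own.
DATA_DICTIONARY = [
    {"variable": "participant_id", "type": "string", "units": "",
     "description": "Anonymized participant code assigned at recruitment"},
    {"variable": "water_temp", "type": "float", "units": "degrees Celsius",
     "description": "Surface water temperature measured with a handheld probe"},
    {"variable": "collected_on", "type": "date (YYYY-MM-DD)", "units": "",
     "description": "Date the sample was collected"},
]

# Write the dictionary next to the data so future users can interpret it.
with open("data_dictionary.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["variable", "type", "units", "description"])
    writer.writeheader()
    writer.writerows(DATA_DICTIONARY)
```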
RDM Best Practices
The FAIR Principles
A best practice in RDM is to make your data FAIR: Findable, Accessible, Interoperable, and Reusable. In practice, this means giving datasets persistent identifiers and rich metadata so they can be found, keeping them retrievable through standard protocols, using open and well-documented formats, and attaching clear licences and provenance so others can reuse them.
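One practical step toward FAIR data is keeping a descriptive metadata record with every dataset you plan to deposit or share. The sketch below shows the kinds of fields commonly captured; the field names and values are illustrative rather than a required schema.

```python
import json

# Illustrative metadata record; field names follow common repository practice,
# not a mandated Lakehead or repository schema.
metadata = {
    "title": "Lake Superior nearshore temperature observations, 2024",
    "creators": ["Researcher, Example (Lakehead University)"],
    "identifier": "DOI to be assigned by the repository on deposit",
    "description": "Hourly nearshore water temperature from three logger sites.",
    "keywords": ["limnology", "water temperature", "Lake Superior"],
    "formats": ["CSV"],
    "license": "CC BY 4.0",
    "related_documentation": ["README.txt", "data_dictionary.csv"],
}

with open("dataset_metadata.json", "w", encoding="utf-8") as f:
    json.dump(metadata, f, indent=2)
```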
Data Storage Tips (The 3-2-1 Rule)
A common recommended practice for backing up and storing your data is the 3-2-1 Rule, which says you should keep:
- 3 copies of your data,
- on 2 different types of storage media,
- with 1 copy stored off-site.
Having 1 copy off-site protects your data from local risks like theft, lab fires, flooding, or natural disasters.
Using 2 storage media improves the likelihood that at least one version will be readable in the future should one media type become obsolete or degrade unexpectedly.
Having 3 copies helps ensure that your data will exist somewhere without being overly redundant.
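Here is a minimal sketch of the 3-2-1 idea in practice, assuming a working copy on your computer, a second copy on an external drive, and a third in an approved off-site location (all paths below are placeholders to replace with your own).

```python
import hashlib
import shutil
from pathlib import Path

# Placeholder locations -- substitute your real working folder, external drive,
# and approved off-site (e.g., institutional cloud sync) folder.
WORKING_COPY = Path("~/research/project_data").expanduser()
EXTERNAL_DRIVE = Path("/Volumes/BackupDrive/project_data")
OFFSITE_SYNC = Path("~/ApprovedCloudFolder/project_data").expanduser()

def checksum(path: Path) -> str:
    """Return a SHA-256 digest so copies can be verified, not just assumed."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def back_up(src: Path, destinations: list[Path]) -> None:
    """Copy every file under src into each destination and verify the copy."""
    for dest_root in destinations:
        dest_root.mkdir(parents=True, exist_ok=True)
        for file in src.rglob("*"):
            if file.is_file():
                target = dest_root / file.relative_to(src)
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(file, target)
                assert checksum(file) == checksum(target), f"Copy mismatch: {file}"

back_up(WORKING_COPY, [EXTERNAL_DRIVE, OFFSITE_SYNC])
```

In practice, many researchers rely on institutional sync services or scheduled backup software rather than a script like this, but the verification step (comparing checksums of source and copy) is worth keeping in any workflow.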
Survey Tips: How to Avoid Bots
Layer 1: Proactive Planning & Active Monitoring
This is your foundation. Strong planning and oversight can prevent most problems before they start.
- Start with a "Soft Launch": Before a full-scale launch, release the survey to a small, controlled group. Set a response limit (e.g., 50 responses) and then pause to check the data quality. You can increase the limit gradually as you gain confidence.
- Assign a Daily Monitor: During the first week of a major launch, have a team member review incoming responses daily. Catching suspicious activity early is crucial.
- State Your Intentions: Include a clear warning at the beginning of your survey, such as: "To ensure data quality, all responses are monitored for automated and fraudulent submissions. Only valid, good-faith responses will be eligible for compensation" (if you are compensating your respondents).
- Use Unique Survey Links: This is the single most effective tactic. If your platform allows, avoid posting one public link. Instead, send unique, single-use links directly to your target participants.
Layer 2: Smart Survey Design & Technical Barriers
This layer involves building clever traps and checks directly into your survey to filter out bots and inattentive humans.
- Implement CAPTCHA: This is a non-negotiable first line of defense. Use a "CAPTCHA" or "I'm not a robot" feature at the beginning of your survey.
- Set a "Honeypot" Trap: Create a text field that is hidden from human view using the survey tool's settings. Bots will read the code and fill it in, while humans will never see it. Any response with data in this hidden field can be instantly discarded.
- Use Attention Checks: Include simple questions to ensure the respondent is paying attention. Examples:
- "To show you are paying attention, please select 'Strongly Agree' for this item."
- "What is 4+2?"
- Check for Internal Consistency: Ask for the same information in different ways. For example, ask for their year of birth on page 2 and their age on page 10. Any responses that don't logically align are a major red flag.
- Require Thoughtful, Open-Ended Responses: Bots are notoriously bad at providing coherent, context-specific answers to "why" or "how" questions. Making one or two of these mandatory can be a very effective filter.
Layer 3: Rigorous Data Screening & Secure Payouts
This final layer serves as your quality control check before accepting the data and distributing incentives.
- The Golden Rule: NEVER Automate Incentives: This is the most important takeaway. Always have a human review the quality of a response before sending any payment or gift card.
- Look for Red Flags in the Data:
- Impossible Speed: Flag responses completed in a time that is inhumanly fast (e.g., under 30% of the time it took you to test it).
- Suspicious Origins: Look for a large number of responses coming from the same IP address or geographic location (especially if it's outside your recruitment area).
- Odd Timing: Be wary of large clusters of responses submitted at unusual hours, like a wave of submissions at 3 AM.
- Gibberish & Patterns: Instantly disqualify responses with nonsensical open-text answers (e.g., "asdfgh") or clear patterns in scaled questions (e.g., "straight-lining" by choosing '3' for every single item).
- Delay Your Payouts: State that incentives will be processed and distributed a few days after the survey period closes. This delay is a powerful deterrent for fraudsters seeking a quick reward.
By systematically applying these three layers of defense, you can significantly reduce the risk of compromised data and ensure that your research is built on a foundation of valid, high-quality responses.
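As an illustration of how several of the checks above can be combined before any payout, the sketch below flags suspicious rows in a survey export. All column names (duration_seconds, honeypot, ip_address, birth_year, age, and the q-prefixed scale items) are hypothetical; adapt them to whatever your survey platform actually exports.

```python
import pandas as pd

# Hypothetical export; adjust column names to match your survey platform.
responses = pd.read_csv("survey_export.csv")

SURVEY_YEAR = 2025                # year the survey ran (example value)
MIN_DURATION = 0.3 * 600          # e.g., 30% of a 10-minute test run, in seconds
likert_cols = [c for c in responses.columns if c.startswith("q")]  # scaled items

flags = pd.DataFrame(index=responses.index)
flags["too_fast"] = responses["duration_seconds"] < MIN_DURATION
flags["honeypot_filled"] = responses["honeypot"].fillna("").astype(str).str.strip() != ""
flags["duplicate_ip"] = responses["ip_address"].duplicated(keep=False)
flags["straight_lining"] = responses[likert_cols].nunique(axis=1) == 1
# Internal consistency: reported age should match year of birth within a year.
flags["age_mismatch"] = ((SURVEY_YEAR - responses["birth_year"]) - responses["age"]).abs() > 1

responses["review_needed"] = flags.any(axis=1)
print(responses["review_needed"].value_counts())
# A human still reviews every flagged response (and ideally all responses)
# before any incentive is paid out.
```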
Research Websites
If your research project includes a dedicated website, it is your responsibility as the researcher to ensure its content is accurate, up-to-date, and properly maintained.
This responsibility extends to the end of the project lifecycle. Once your research is complete and the website is no longer active or necessary, you must ensure it is taken offline to prevent outdated information from persisting and to mitigate potential security risks.
Artificial Intelligence in Research
Large Language Models (LLMs) like ChatGPT, Gemini, and Claude are powerful tools, but they come with security risks. Also, many software tools you already use (e.g., Grammarly, Microsoft Office) now include AI features that may collect your data.
Recommended AI LLM: Google Gemini
Ensure you access it through your Lakehead University Google account.
- Enhanced Privacy: Your data (prompts, outputs) is NOT used for AI training.
- Data Ownership: You own your data, not Google.
- Confidentiality: Google never sells customer data to third parties or uses it for advertising.
- Cost & Support: It is free with your institutional account and supported by Lakehead TSC.
Note: You must be logged in with your Lakehead account to receive these protections.
In addition to Gemini for general tasks, NotebookLM is an amazing research tool specifically designed to help you analyze and synthesize your own documents. You can upload your research papers, notes, and other sources, and NotebookLM will act as an expert assistant grounded in your content.
Security Alert: Avoid DeepSeek
Do NOT use DeepSeek. Canadian security agencies have identified significant privacy and security risks for Canadian users. This includes data collection practices that may violate Canadian privacy standards and the potential for foreign government access to your data.
AI Usage Guidelines
✅ Safe to Use with AI
- Published research summaries
- Writing assistance (grammar, structure)
- Understanding complex concepts
- Programming help (generic examples)
- Grant proposal drafts (with disclosure)
❌ Never Use Any AI For
- Confidential research data
- Participant information (PII)
- Export-controlled or sensitive data
- Proprietary collaboration data
- Personal info (students, colleagues)
Disclosure Requirements
Always disclose when AI has made a significant contribution to your work. This includes when:
- AI helped write publications, grants, or reports.
- AI assisted with data analysis.
- AI contributed to student work beyond basic grammar.
Sample Disclosure:
"This work used AI assistance (Google Gemini) for [literature review / writing enhancement / data interpretation]. All outputs were verified by the authors."
Lakehead Resources
Data Classification
DMP Tools
Research Tools
Artificial Intelligence
External & National Resources
Events & Workshops
Upcoming Events
Past Events
October 22, 2025
October 9, 2025
March 21, 2024
Asynchronous Training
Self-Register on MyCourseLink
