Research Data Management
Resources, tools, and best practices to help you manage your research data at Lakehead University.
The Research Data Lifecycle
Good data management spans your entire research project. This guide follows you through each stage.
Plan → Collect → Analyze → Share → Preserve
The sections below explain what to do at each stage.
What Is Research Data Management?
Research data management is how you organize, store, protect, and share your research data throughout your project and beyond. Think of it as the difference between a lab bench covered in unlabeled samples versus one where you can find exactly what you need, when you need it.
Why it matters:
- Find your own data later. Six months from now, will you remember what "final_version_v3_FINAL.xlsx" contains?
- Meet funding requirements. Tri-Agency grants now require data management plans. No plan = no funding.
- Protect sensitive information. Health data, Indigenous community data, and personal information need specific security measures.
- Collaborate effectively. Your grad students and co-investigators need to understand your data structure.
- Preserve your work. When you leave Lakehead or when hard drives fail, your data needs to survive.
Contact & Support
Contact us for personalized assistance with data management plans, security protocols, storage options, or any other RDM questions.
Andrew Austin
Research Security & Data Management Specialist
Additional Support
For Graduate Students
Data management habits you build now will serve you throughout your career. Here's what you need to know as a graduate student.
Your Data vs. Your Supervisor's Data
Key principle: Data ownership is typically determined by who funded the research, not who collected it.
- Grant-funded research: Data usually belongs to the institution and/or PI, governed by the grant terms
- Your thesis work: You retain rights to your thesis, but underlying data may have shared ownership
- Collaborative projects: Clarify data ownership and access rights at the start
Action: Have an explicit conversation with your supervisor about data ownership, access after graduation, and publication rights before you start collecting data.
What Happens When You Graduate?
- Your Lakehead accounts will be deactivated — You'll lose access to Google Drive, email, and institutional systems
- Plan your data transition: Identify what you need to take, what stays with your supervisor, and what gets deposited in a repository
- Export before you leave: Download personal copies of files you're entitled to keep
- Document everything: Your successor needs to understand your data — create clear README files and documentation
Building Good Habits Early
✓ Do This
- Use consistent file naming from day one
- Back up regularly (3-2-1 rule)
- Document as you go, not at the end
- Use version control for code
- Keep raw data separate and untouched
✗ Avoid This
- Storing data only on your laptop
- Using personal cloud accounts for research
- Waiting until thesis writing to organize
- "final_v2_FINAL_revised.xlsx" naming
- Assuming you'll remember what files contain
💡 Thesis Data Checklist
- ☐ Data ownership discussed with supervisor
- ☐ Storage location agreed upon
- ☐ Backup strategy in place
- ☐ File naming convention established
- ☐ README file started
- ☐ Exit plan for graduation documented
Data Management Plans (DMPs)
What is a DMP?
A Data Management Plan is a structured document outlining how research data will be collected, organized, stored, protected, and shared. It describes:
- What data you'll collect or generate
- How you'll manage it (file naming, version control, metadata)
- Where you'll store it
- Who can access it
- How long you'll keep it and how it will be archived or disposed of
Why You Need a DMP
- Research quality: Ensures data are collected consistently and can be verified
- Funding compliance: Tri-Agency RDM Policy requires DMPs for many grants
- Efficiency: Prevents data loss and confusion within teams
- Future value: Well-managed data can be reused and cited
Common DMP Mistakes
- ✗ Saying data "cannot be shared" without explaining why
- ✗ Not specifying who is responsible for data management
- ✗ Listing storage without explaining security measures
- ✗ Naming a repository that doesn't fit your discipline
- ✗ Forgetting to budget for data management costs
DMP Resources
Funder-Specific Guidance
Data Classification
All research data at Lakehead must be classified into one of three levels. Your classification determines storage requirements, access controls, and handling procedures.
Confidential
Severe harm if disclosed
- Participant names, health info
- Medical or income data
- Unpublished research
Internal
Minor harm if disclosed
- Meeting minutes
- Partner contracts
- Internal correspondence
Public
No harm if disclosed
- Published data
- Contact information
- Aggregated anonymous data
When in doubt, classify higher.
Classification Resources
Data Collection Best Practices
How you collect data determines its quality and usability. Plan your collection methods carefully before you begin.
Before You Collect
- Define your variables: What exactly will you measure or record? Be precise.
- Choose your tools: Survey platform, lab instruments, interview recording — select before starting
- Create a data dictionary: Document what each variable means, units, valid ranges, codes for missing data
- Design for analysis: How will you analyze this data? Structure collection accordingly
- Pilot test: Test your collection process with a small sample first
During Collection
Quality Control
- Use validation rules where possible
- Check data regularly during collection
- Document any deviations from protocol
- Note environmental conditions if relevant
- Back up immediately after collection sessions
Common Pitfalls
- Inconsistent data entry formats
- Missing metadata about collection context
- No backup until collection is "complete"
- Changing collection methods mid-study
- Not documenting instrument settings
Survey-Specific Guidance
Online surveys face unique challenges including bot responses and data quality issues.
- Use CAPTCHA: Essential first line of defense against bots
- Include attention checks: "Please select 'Strongly Agree' for this question"
- Add consistency checks: Age and birth year should match
- Use unique links: Single-use survey links prevent multiple submissions
- Soft launch: Test with 50 responses before full deployment
- Never auto-pay incentives: Review responses before compensation
Approved Collection Tools
Costs & Budgeting for RDM
Many RDM resources are free, but some projects require budget allocation. Include data management costs in your grant applications.
What's Free
Lakehead Resources
- Google Drive (100GB)
- RDM consultation and support
- Data classification training
- DMP review assistance
National Resources
- DRAC Nextcloud (100GB Canadian storage)
- Borealis data repository
- FRDR (Federated Research Data Repository)
- DMP Assistant tool
- DRAC compute resources (basic allocation)
What May Have Costs
- Specialized software: Statistical packages, qualitative analysis tools (check TSC for institutional licenses first)
- Large storage needs: Beyond free allocations, additional storage may require RAC applications or fees
- Data curation services: Professional data cleaning, formatting, or migration
- Transcription: Audio/video transcription services
- Long-term preservation: Some discipline-specific repositories charge fees
- Personnel time: Data management as part of RA responsibilities
Including RDM in Grant Budgets
Tri-Agency grants allow data management as an eligible expense. Consider including:
- Personnel: RA time for data organization, documentation, curation
- Software: Licenses for data collection or analysis tools
- Storage: Cloud storage or backup solutions beyond free allocations
- Services: Transcription, data entry, format conversion
- Training: Team training on data management practices
- Preservation: Repository fees or data migration costs
💡 Budget Tip
Reviewers want to see realistic data management plans. Saying "data will be stored on Google Drive" is fine for small projects, but larger grants should demonstrate you've thought about long-term preservation and sharing costs.
Where Should I Store My Data?
All Lakehead staff and students receive 100GB of Google Drive storage through the Google Education Tenant. Data is encrypted at rest and in transit, but Google Drive stores data in US data centres (not Canadian), which may not meet all research requirements.
For research requiring Canadian data residency, the Digital Research Alliance of Canada offers Nextcloud — a Dropbox-like service with 100GB storage hosted in Canadian data centres (British Columbia).
🗺️ Storage Decision Guide
Answer these questions to find the right storage for your data.
Does your data contain personal or sensitive information?
Health records, personal identifiers, Indigenous community data, etc.
| Data Type | Recommended Storage | Notes |
|---|---|---|
| Public / Internal | Lakehead Google Drive | 100GB per user, US data centres, encrypted at rest/transit |
| Internal (Canadian residency) | DRAC Nextcloud | 100GB, Canadian data centres (BC), requires CCDB account |
| Confidential | TSC-approved solutions | Contact TSC for current options |
| Long-term archival | Borealis, FRDR | After project completion |
| Large datasets / HPC | Digital Research Alliance | Compute-intensive research |
Common Questions
"Can I use personal Dropbox or OneDrive?"
Not approved for research data with personal information. May store data outside Canada.
"What about US-based cloud services?"
May violate FIPPA. Even Canadian companies may route through US servers. Note: Google Drive data is stored in the US.
"I need Canadian data residency."
Use DRAC Nextcloud (100GB, BC data centres). Requires a free CCDB account. Data syncs between devices and is backed up nightly.
"I need more than 100GB."
Contact TSC for additional allocation. For very large datasets, consider DRAC.
File Naming & Organization
✓ Do This
- 2024-03-15_interview_P01.mp3
- survey_cleaned_v02.csv
- ISO dates (YYYY-MM-DD)
- Underscores or hyphens, not spaces
- Version numbers
✗ Don't Do This
- final_FINAL_v2.xlsx
- my file (1).docx
- data.csv
- Spaces or special characters
- Vague names
Recommended Folder Structure
ProjectName/
├── 01_RawData/         # Original (READ-ONLY)
├── 02_ProcessedData/   # Cleaned, transformed
├── 03_Analysis/        # Scripts, outputs
├── 04_Documentation/   # README, codebooks
├── 05_Outputs/         # Final reports
└── README.txt
Key principle: Never modify raw data—always work on copies.
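As one way to start every project consistently, the recommended layout could be scaffolded with a short script. A sketch only — the folder names come from the structure above, and the README contents are placeholders to fill in:

```python
from pathlib import Path

# Top-level folders from the recommended structure above.
FOLDERS = [
    "01_RawData",        # original data, treat as read-only
    "02_ProcessedData",  # cleaned, transformed copies
    "03_Analysis",       # scripts and outputs
    "04_Documentation",  # README, codebooks
    "05_Outputs",        # final reports
]

def scaffold_project(root: str) -> Path:
    """Create the standard folder layout plus a starter README."""
    base = Path(root)
    for name in FOLDERS:
        (base / name).mkdir(parents=True, exist_ok=True)
    readme = base / "README.txt"
    if not readme.exists():  # never overwrite an existing README
        readme.write_text(
            "Project: <name>\n"
            "Contact: <email>\n"
            "See 04_Documentation/ for the data dictionary.\n"
        )
    return base
```

Running `scaffold_project("MyProject")` creates the folders and a README stub in one step, which makes it easy to keep the same layout across projects.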
RDM Best Practices
The FAIR Principles
Make your data Findable, Accessible, Interoperable, and Reusable:
- Findable: Rich metadata and persistent identifiers (DOIs)
- Accessible: Retrievable with clear access conditions
- Interoperable: Standardized formats and vocabularies
- Reusable: Clear licenses and provenance
Note: FAIR ≠ open. Data can be FAIR with controlled access.
🌟 Learn more at GO-FAIR.org
The 3-2-1 Backup Rule
The gold standard for protecting your research data from loss.
- 3 copies of your data (original + 2 backups)
- 2 different storage types (don't put all your eggs in one basket)
- 1 off-site copy (protects against local disasters)
Example: 3-2-1 Using Lakehead Resources
Here's how a researcher could implement 3-2-1 for a typical project:
| Copy | Location | Storage Type | Purpose |
|---|---|---|---|
| Copy 1 | Lakehead Google Drive (working copy) | Cloud storage (US servers) | Day-to-day work, collaboration, automatic sync |
| Copy 2 | External hard drive (office or home) | Local physical storage | Weekly backup, fast recovery if cloud fails |
| Copy 3 | DRAC Nextcloud (Canadian data centre, BC) | Cloud storage (different provider) | Off-site backup, Canadian data residency |
Why this works: If Google has an outage, you have local backup. If your office floods, you have two cloud copies. If one cloud provider fails, you have another. Different failure modes are covered.
Alternative Configurations
- For Canadian data residency: DRAC Nextcloud (primary) + External drive + Borealis (archive)
- For large datasets: DRAC project storage (primary) + Tape/nearline + Google Drive (docs only)
- For sensitive health data: TSC-approved storage + Encrypted external drive + Encrypted off-site
Common Mistakes
- Two copies on same physical drive ≠ 2 copies
- Synced folders aren't backups (deletions sync too)
- External drive kept next to computer isn't "off-site"
- Never testing if backups actually restore
- Backing up only at project end
💡 Backup Schedule Suggestion
- Daily: Working files auto-sync to cloud (Google Drive/Nextcloud)
- Weekly: Manual backup to external drive + verify sync is working
- Monthly: Test restore a random file from each backup location
- At milestones: Create dated archive copy (e.g., "2025-01-15_data_collection_complete")
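The monthly "test restore" step can be partly automated by comparing checksums between a working copy and a backup. A minimal sketch using only the standard library (the directory paths are placeholders):

```python
import hashlib
from pathlib import Path

def file_checksum(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files are handled safely."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def compare_trees(working: str, backup: str) -> list[str]:
    """Return relative paths that are missing from or differ in the backup."""
    problems = []
    w, b = Path(working), Path(backup)
    for src in w.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(w)
        dst = b / rel
        if not dst.exists():
            problems.append(f"MISSING  {rel}")
        elif file_checksum(src) != file_checksum(dst):
            problems.append(f"DIFFERS  {rel}")
    return problems
```

An empty result means every working file exists, byte-for-byte, in the backup — which is exactly what "synced" does not guarantee on its own.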
README Files
Every dataset needs a README explaining:
- Project description and collection methods
- File inventory and variable definitions
- Units, formats, missing data codes
- Access conditions and contact info
Metadata & Documentation
Metadata is "data about data" — the information that makes your data findable, understandable, and reusable. Without good metadata, even well-organized files become unusable.
Data Dictionaries / Codebooks
Essential for any dataset with variables. Document each variable with:
| Variable Name | Description | Type | Valid Values | Missing Code |
|---|---|---|---|---|
| participant_id | Unique participant identifier | String | P001-P999 | N/A |
| age_years | Age at enrollment | Integer | 18-99 | -99 |
| consent_date | Date consent signed | Date | YYYY-MM-DD | blank |
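A data dictionary like this can double as a validation spec. A hedged sketch, assuming a CSV export with the three example variables above — the rules mirror the sample table, not any official Lakehead specification:

```python
import csv
import io

# Validation rules taken from the example data dictionary above.
RULES = {
    "participant_id": lambda v: v.startswith("P") and v[1:].isdigit(),   # P001-P999
    "age_years": lambda v: v == "-99" or (v.isdigit() and 18 <= int(v) <= 99),
    "consent_date": lambda v: v == "" or len(v.split("-")) == 3,         # YYYY-MM-DD or blank
}

def validate(csv_text: str) -> list[str]:
    """Return human-readable rule violations; an empty list means all rows passed."""
    errors = []
    # Data rows start at line 2 (line 1 is the header).
    for row_num, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        for field, is_ok in RULES.items():
            if not is_ok(row.get(field, "")):
                errors.append(f"row {row_num}: bad {field}={row.get(field)!r}")
    return errors
```

Running checks like this during collection (not at the end) catches entry errors while they are still cheap to fix.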
Persistent Identifiers
Permanent links that ensure your work remains findable even if websites change.
DOIs (Digital Object Identifiers)
Permanent links for datasets, publications, and other research outputs.
Example: 10.5683/SP3/ABC123
Borealis and FRDR automatically assign DOIs to deposited datasets.
ORCIDs (Researcher IDs)
Your unique researcher identifier that links all your work.
Example: 0000-0002-1234-5678
Discipline-Specific Metadata Standards
Many fields have established standards. Using them makes your data interoperable.
💡 Documentation Tip
Write documentation as if you're explaining your data to a stranger who will use it five years from now — because that stranger might be you.
Collaboration & Sharing
Lakehead provides collaboration tools through Google Workspace.
Google Workspace
Best Practices
- Use institutional accounts — Personal Gmail lacks protections
- Set appropriate permissions — Viewer vs Editor
- Avoid "Anyone with link" for sensitive data
- Review access periodically
Privacy & Legal Compliance
Key Legislation
FIPPA
Freedom of Information and Protection of Privacy Act — governs personal information at Ontario public institutions.
PHIPA
Personal Health Information Protection Act — additional requirements for health information.
PIPEDA
Federal private-sector privacy law — applies to partnerships with private companies.
TCPS 2
Tri-Council Policy Statement — ethical guidelines for research involving humans.
Canadian Data Residency
- FIPPA generally requires Ontario personal information to stay in Canada
- Some REB approvals mandate Canadian-only storage
- US-based cloud services may be subject to US government access under the US CLOUD Act
- Note: Lakehead Google Drive stores data in US data centres
Warning: "Canadian company" doesn't guarantee Canadian data residency. Many route through US servers.
Canadian Storage Option: DRAC Nextcloud provides 100GB cloud storage hosted in Canadian data centres (British Columbia). Free with a CCDB account.
Security Resources
🔐 Lakehead Cybersecurity & Encryption Guide
Handling Sensitive Data
Sensitive data requires extra precautions throughout its lifecycle. This section covers practical techniques for protecting confidential information.
De-identification vs. Anonymization
These terms are often confused, but they have different meanings with significant legal and ethical implications.
| Aspect | De-identified Data | Anonymous Data |
|---|---|---|
| Definition | Direct identifiers removed, but re-identification may be possible with additional information | No reasonable possibility of re-identification, even with additional data |
| Key linking | Often maintains a key linking codes to identities (held separately) | No key exists — link permanently broken |
| Privacy law status | Still considered personal information under FIPPA/PHIPA | May fall outside privacy legislation scope |
| REB oversight | Usually still requires REB approval and oversight | May not require ongoing REB oversight (but verify) |
| Data sharing | Typically requires DSAs and restricted access | Can often be shared more freely |
| Reversibility | Can be re-identified if needed (e.g., for follow-up) | Cannot be reversed — participants cannot be contacted again |
De-identification Example
A health study replaces patient names with codes (P001, P002) and stores the linking key in a separate secure file. The researcher can re-contact participants if needed.
Risk: If someone obtains both the data and the key, participants can be identified.
Anonymization Example
A survey dataset has all identifiers permanently removed, dates generalized to year only, and geographic data aggregated to regional level. No key exists.
Trade-off: Cannot go back to participants for clarification or follow-up studies.
Types of Identifiers
Direct Identifiers (Always Remove)
- Names (including initials)
- Social Insurance Numbers
- Health card numbers
- Email addresses
- Phone numbers
- Full addresses
- Photos/videos showing faces
- Biometric data
- IP addresses
Indirect/Quasi-Identifiers (Assess Risk)
- Dates (birth, admission, death)
- Geographic data (postal codes, cities)
- Occupation + employer combination
- Rare diseases or conditions
- Ethnicity in small populations
- Unique event dates
- Institutional affiliations
- Detailed age (use ranges instead)
Common De-identification Techniques
| Technique | Description | Example |
|---|---|---|
| Suppression | Remove the value entirely | Delete name column |
| Generalization | Make values less specific | Age 47 → "45-49" range |
| Pseudonymization | Replace with artificial identifiers | "Jane Smith" → "P0042" |
| Date shifting | Shift all dates by random interval | All dates +/- 30 days |
| Top/bottom coding | Cap extreme values | Age 95 → "90+" |
| Data swapping | Exchange values between records | Swap postal codes between similar records |
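Three of the techniques above (pseudonymization, generalization, date shifting) in a minimal Python sketch. The field names, five-year age bins, and ±30-day window are illustrative assumptions; a real project would store the linking key separately and securely, and shift all of one participant's dates by the same interval to preserve intervals between events:

```python
import random
from datetime import date, timedelta

def pseudonymize(names: list[str]) -> tuple[list[str], dict]:
    """Replace names with stable codes. The returned key re-links codes to
    identities and must be stored separately from the data."""
    key = {name: f"P{i:04d}" for i, name in enumerate(sorted(set(names)), start=1)}
    return [key[n] for n in names], key

def generalize_age(age: int, width: int = 5) -> str:
    """Generalization: age 47 -> '45-49' with 5-year bins."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def shift_date(d: date, max_days: int = 30, rng=random) -> date:
    """Date shifting: move a date by a random interval within +/- max_days."""
    return d + timedelta(days=rng.randint(-max_days, max_days))
```

Note that pseudonymized output is still de-identified data, not anonymous data — as long as the key exists, re-identification is possible.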
⚠️ The "Mosaic Effect"
Even when individual data elements seem harmless, combining multiple quasi-identifiers can uniquely identify someone. Example: "Female + Age 34 + Profession: Pilot + City: Thunder Bay" may identify only one person. Always assess re-identification risk across the entire dataset, not just individual fields.
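Re-identification risk from combined quasi-identifiers can be estimated by counting how many records share each combination — the "k" in k-anonymity. A sketch, assuming records are dicts and the quasi-identifier list is chosen by the researcher; the threshold of 5 is a common rule of thumb, not a regulatory standard:

```python
from collections import Counter

def k_anonymity_report(records: list[dict], quasi_ids: list[str], threshold: int = 5):
    """Return (k, risky_combos): the smallest group size across all
    quasi-identifier combinations, and any combos shared by fewer
    than `threshold` records (high re-identification risk)."""
    combos = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    k = min(combos.values())
    risky = [combo for combo, n in combos.items() if n < threshold]
    return k, risky
```

If k is small, generalize further (wider age ranges, coarser geography) and re-check the whole dataset, not individual fields.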
De-identification Resources
Encryption
Encryption scrambles data so only authorized users can read it.
When Encryption is Required
- Confidential data on portable devices
- Data transfers outside secure networks
- PHIPA-regulated health information
- When specified by REB or funder
Types of Encryption
- At rest: Files on disk (BitLocker, FileVault)
- In transit: Data being transmitted (HTTPS, SFTP)
- End-to-end: Only sender/receiver can decrypt
Secure File Transfer
Never send confidential data via regular email.
- SFTP: Secure File Transfer Protocol — encrypted transfers to servers
- Google Drive (Lakehead): Share links with specific people, not "anyone with link"
- Encrypted email: Use institutional tools for sensitive attachments
- Globus: For large research dataset transfers between institutions
⚠️ If You Suspect a Data Breach
- Don't panic, but act quickly.
- Document: What data, how many records, when discovered, how it may have occurred
- Report immediately: Contact TSC and the REB
- Preserve evidence: Don't delete files or emails related to the incident
- Follow institutional procedures: Lakehead has breach notification requirements
TSC Security: Report security incidents to the Technology Services Centre immediately.
Data Sharing Agreements
Formal agreements that govern how data can be shared, used, and protected between parties. Essential for collaborative research and data transfers.
When You Need a DSA
- Multi-institutional research: Sharing data with collaborators at other universities
- Industry partnerships: Any data exchange with private sector partners
- International transfers: Sending data outside Canada (additional requirements apply)
- Secondary use: Using data collected for a different purpose
- Receiving data: When another organization provides data to you
What DSAs Typically Cover
- What data will be shared
- Permitted uses and restrictions
- Security and storage requirements
- Who can access the data
- Duration of agreement
- Data destruction requirements
- Publication rights
- Intellectual property
- Liability and indemnification
- Breach notification procedures
International Data Transfers
GDPR (EU): If you're working with European collaborators or EU citizen data, the General Data Protection Regulation applies. This requires specific contractual clauses and may restrict where data can be stored.
- Consult with Research Services before transferring data internationally
- Ensure your REB approval covers international collaboration
- Some countries have data localization requirements
💡 Getting a DSA
Contact the Office of Research Services to initiate a data sharing agreement. Allow 4-6 weeks for negotiation and signing. Start early — don't wait until you need the data.
Indigenous Data Sovereignty
Indigenous Peoples have inherent rights over data about their communities, lands, and knowledge. For research with First Nations, Inuit, or Métis communities, standard RDM practices must be adapted.
OCAP® Principles (First Nations)
O — Ownership
Communities collectively own their data and knowledge.
C — Control
Communities control research affecting them.
A — Access
Communities must have access to their data.
P — Possession
Physical control remains with the community.
CARE Principles (International)
Collective Benefit
Data should benefit Indigenous communities.
Authority to Control
Rights to govern their data must be recognized.
Responsibility
Support self-determination and build relationships.
Ethics
Indigenous wellbeing is the primary concern.
At Lakehead
Conflicts between Lakehead guidelines and OCAP® or community protocols must be resolved with the Office of Research Services before the project begins.
Data Retention & Disposal
Retention Requirements
Per the LUFA Collective Agreement: minimum 7 years after project completion. Contracts or funders may extend this.
The Tri-Agency RDM Policy also requires data preservation for validation and reuse purposes.
Disposal Methods
| Classification | Electronic | Paper | Devices |
|---|---|---|---|
| Confidential | Secure wipe | Certified shred | Return to TSC |
| Internal | Delete + backups | Shred | Return to TSC |
| Public | Delete | Any method | Return to TSC |
Third-party contracts: Providers must return or destroy data with written certification within 30 days.
Data Deposit & Publication
Depositing your data in a repository preserves it for the long term and makes it findable and citable. This is often required by funders and journals.
When to Deposit
- At publication: Many journals require data availability statements and DOIs
- At project completion: Before grant closes and team disperses
- After embargo: Some data can be embargoed during patent applications or ongoing analysis
- Before you leave: If you're graduating or leaving Lakehead, deposit before losing access
Choosing a Repository
| Repository | Best For | Key Features |
|---|---|---|
| Borealis (Lakehead) | Most research data | Canadian, free, DOIs, access controls |
| FRDR | Large datasets (100GB+) | Curated, discovery platform |
| Zenodo | Code, supplementary materials | GitHub integration, free |
| Discipline-specific | Field standards | ICPSR (social), GenBank (genomics), etc. |
Use re3data.org to find discipline-specific repositories.
Preparing Data for Deposit
File Preparation
- Use open, non-proprietary formats (CSV, TXT, PDF/A)
- Remove or de-identify personal information
- Include README and data dictionary
- Organize files logically
- Use clear, descriptive file names
Metadata to Include
- Title, authors, description
- Keywords and subject terms
- Collection methods
- Geographic and temporal coverage
- Related publications
Choosing a License
Licenses tell others how they can use your data.
CC0 (Public Domain)
No restrictions. Maximum reusability. Recommended for data.
CC-BY (Attribution)
Users must cite you. Good for most research data.
CC-BY-NC (Non-Commercial)
No commercial use. Limits some research applications.
Restricted Access
Users must request access. For sensitive data.
DOIs and Data Citation
When you deposit data, repositories assign a DOI (Digital Object Identifier) — a permanent link that makes your data citable.
Example citation:
Smith, J., & Jones, M. (2024). Survey data on Northern Ontario housing [Data set]. Borealis. https://doi.org/10.5683/SP3/EXAMPLE
Include your data DOI in publications and link your dataset to your ORCID profile.
✓ Deposit Checklist
- ☐ Data cleaned and de-identified (if needed)
- ☐ Files in open formats
- ☐ README file included
- ☐ Data dictionary/codebook included
- ☐ License selected
- ☐ Metadata complete
- ☐ Embargo period set (if needed)
- ☐ DOI obtained and recorded
Discipline-Specific Guidance
Different research fields have unique data management considerations. Find guidance for your discipline below.
🏥 Health Research
Key requirements:
- PHIPA compliance: Ontario health information must be protected under the Personal Health Information Protection Act
- De-identification required: Remove all direct identifiers before sharing or publishing
- Secure storage: Confidential classification — contact TSC for approved solutions
- Data sharing: Often requires DSAs and REB approval for secondary use
- Retention: Typically 10+ years for clinical research
Repositories: Restricted-access deposits on Borealis, ICPSR for survey data, dbGaP for genomic data
📊 Social Sciences
Key considerations:
- Qualitative data: Interview transcripts and field notes require careful de-identification
- Consent for sharing: Include data sharing in consent forms from the start
- Codebooks essential: Survey data needs comprehensive variable documentation
- Longitudinal considerations: Plan for linking data across time points securely
Repositories: Borealis, ICPSR, Qualitative Data Repository (QDR), UK Data Archive
🔬 Lab Sciences
Key considerations:
- Instrument data: Document equipment settings, calibration, and software versions
- Lab notebooks: Electronic lab notebooks provide version control and timestamps
- Raw vs. processed: Preserve raw data separately; document all processing steps
- Reproducibility: Include analysis scripts and computational environment details
- Large file sizes: May require DRAC storage or discipline-specific repositories
Repositories: Zenodo, Figshare, discipline-specific (GenBank, PDB, PANGAEA)
💻 Computational Research
Key considerations:
- Version control: Use Git for code; tag releases corresponding to publications
- Environment documentation: requirements.txt, conda environments, Docker containers
- Code citation: Get DOIs for software through Zenodo-GitHub integration
- Licensing: Choose appropriate open-source license (MIT, GPL, Apache)
- README files: Include installation, usage instructions, and examples
Repositories: GitHub + Zenodo, Software Heritage, CodeOcean
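Environment documentation can be captured automatically at analysis time. A minimal standard-library sketch (the output filename is a placeholder; real projects would also record package versions, e.g. via `pip freeze` or a conda environment file):

```python
import json
import platform
import sys
from datetime import datetime, timezone

def environment_snapshot() -> dict:
    """Collect basic provenance details worth archiving with analysis outputs."""
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "python_version": sys.version,
        "platform": platform.platform(),
        "machine": platform.machine(),
    }

def write_snapshot(path: str = "environment.json") -> dict:
    """Write the snapshot next to your outputs so results stay reproducible."""
    snap = environment_snapshot()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(snap, f, indent=2)
    return snap
```

Calling `write_snapshot()` at the top of an analysis script costs one line and answers "which Python, on which machine, and when?" years later.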
🌲 Environmental & Field Research
Key considerations:
- Geospatial data: Include coordinate reference systems, precision, and collection methods
- Temporal data: Document time zones, sampling frequency, and any gaps
- Field conditions: Record weather, equipment issues, and deviations from protocol
- Indigenous territories: Follow OCAP® principles for research on traditional lands
- Sensor data: Document calibration and any post-processing applied
Repositories: PANGAEA, Dryad, Environmental Data Initiative (EDI), GBIF
AI in Research
LLMs like ChatGPT, Gemini, and Claude are powerful but come with security risks. Many tools (Grammarly, Microsoft Office) now include AI features that may collect your data.
Recommended: Google Gemini
Access through your Lakehead account for enhanced privacy protections:
- Data NOT used for AI training
- You own your data
- Free with institutional account
⚠️ Avoid DeepSeek
Canadian security agencies have identified significant privacy and security risks with this Chinese AI service.
Government of Canada: Guide on the Use of Generative AI
✅ Safe to Use
- Published research summaries
- Writing assistance
- Understanding concepts
- Generic code examples
❌ Never Use For
- Confidential research data
- Participant information
- Sensitive or controlled data
- Personal info (students, colleagues)
Disclosure
Disclose significant AI contributions to publications, grants, or analysis.
"This work used AI assistance (Google Gemini) for [task]. All outputs were verified by the authors."
Survey Tips: Avoiding Bots
Layer 1: Planning
- Soft launch: Test with 50 responses, check quality, scale up
- Daily monitoring: Review responses in the first week
- Unique links: Single-use links, not one public URL
Layer 2: Technical Barriers
- CAPTCHA: Non-negotiable first defense
- Honeypots: Hidden fields only bots fill
- Attention checks: "Select 'Strongly Agree'"
- Consistency checks: Year of birth + age should match
Layer 3: Screening
- Never automate incentives: Human review first
- Flag red flags: Impossible speed, IP clusters, gibberish
- Delay payouts: Process after survey closes
Resources
🛡️ Qualtrics Fraud Detection
Research Websites
If your project includes a dedicated website, you're responsible for keeping it accurate and up-to-date.
When your research is complete, take the website offline to prevent outdated information and security risks.
National Research Infrastructure
The Digital Research Alliance of Canada (DRAC) provides national infrastructure including storage, compute resources, and data management tools.
Data Management
Computing & Storage
Lakehead University TSC
The Technology Services Centre (TSC) provides IT support, software, and infrastructure for Lakehead researchers. Contact TSC for questions about storage, software licensing, and technical support.
Key Resources
When to Contact TSC
- Need additional storage beyond 100GB
- Questions about approved software
- Setting up shared drives for research teams
- Device returns for secure data disposal
- VPN or remote access issues
- Security concerns or incidents
💡 Tip: Storage Requests
If your research requires more than 100GB of Google Drive storage, contact TSC with details about your project and estimated storage needs. For very large datasets (500GB+), consider Digital Research Alliance storage options.
Training & Events
Upcoming Events
Past Events (Resources)
Templates & Downloads
Ready-to-use templates and checklists to support your research data management.
Documentation Templates
| README Template (Cornell University) | Download → |
| Data Dictionary Template (OSF) | Download → |
| Codebook Guide (ICPSR) | View Guide → |
| DMP Assistant (DRAC Tool) | Open Tool → |
Checklists
Project Start Checklist
- ☐ DMP created or updated
- ☐ Storage location selected
- ☐ File naming convention established
- ☐ Backup strategy in place
- ☐ Data classification determined
- ☐ Team roles and access defined
- ☐ REB requirements confirmed
Project End Checklist
- ☐ Data cleaned and organized
- ☐ Documentation complete
- ☐ Data deposited in repository
- ☐ DOI obtained and recorded
- ☐ Access permissions updated
- ☐ Retention schedule confirmed
- ☐ Secure destruction of copies
Quick Reference Guides
| Data Classification Quick Reference (Lakehead, PDF) | Download PDF → |
| File Naming Best Practices (DRAC, PDF) | Download PDF → |
Need a Custom Template?
Contact RDM support if you need help adapting templates for your specific research needs or discipline.
What If Things Go Wrong?
Data emergencies happen. Here's what to do when things don't go as planned.
🗑️ Accidentally Deleted Files
Google Drive:
- Check Trash — files stay for 30 days
- For files deleted from Trash, contact TSC immediately — recovery may be possible within 25 days
Local files:
- Stop using the drive immediately to prevent overwriting
- Check backups (external drives, cloud sync)
- Contact TSC — they may have backup options
🔒 Lost Access to Storage
- Lakehead account issues: Contact TSC Help Desk
- Shared drive access: Contact the drive owner or your supervisor
- DRAC/Alliance resources: Contact Alliance support or renew your CCDB account
- Left/graduated: Your supervisor or department can request access to institutional data
⚠️ Suspected Data Breach
- Report immediately to TSC and your supervisor
- Document what happened, when, and what data may be affected
- Don't delete anything — preserve evidence
- Notify REB if human participant data is involved
- Follow institutional procedures — Lakehead has breach notification requirements
TSC Security Contact: TSC Help Desk
💾 Hardware Failure
- Laptop/computer died: If drive is intact, data may be recoverable — contact TSC
- External drive failed: Professional recovery is expensive ($500-$2000+) and not guaranteed
- Prevention: Follow the 3-2-1 rule — keep 3 copies of your data, on 2 different types of media, with 1 copy offsite
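As an illustration of the 3-2-1 rule, here is a minimal Python sketch that copies a project folder to two additional locations. The paths are placeholders for your own external drive and cloud-synced folder; the synced folder only becomes a true offsite copy once it finishes uploading, and in practice you would run something like this on a schedule rather than by hand.

```python
import shutil
from datetime import date
from pathlib import Path

def backup_321(source: Path, external: Path, cloud_sync: Path) -> None:
    """Sketch of the 3-2-1 rule.

    The working copy on your computer is copy 1; this function creates
    copy 2 on an external drive and copy 3 in a cloud-synced folder
    (the offsite copy once it uploads). All paths are placeholders.
    """
    stamp = date.today().isoformat()  # date-stamped snapshots, e.g. project_2025-01-15
    for dest in (external, cloud_sync):
        target = dest / f"{source.name}_{stamp}"
        # Copy the whole project tree; allow re-running on the same day
        shutil.copytree(source, target, dirs_exist_ok=True)
```

Date-stamping each snapshot means a corrupted file doesn't silently overwrite your only good backup; older snapshots can be pruned once you've verified newer ones.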
👥 Collaborator Conflict Over Data
- Review agreements: Check your DMP, DSA, or any written agreements about data ownership
- Consult your supervisor or department head
- Contact Research Services: They can advise on institutional policies and help mediate
- Document everything: Keep records of contributions and communications
Prevention: Clarify data ownership and access rights in writing before starting collaborative projects.
📁 Corrupted Files
- Check version history: Google Drive keeps versions for 30 days (or 100 versions)
- Restore from backup: This is why regular backups matter
- Try file repair: Some software can recover partially corrupted files
- Raw data priority: If you have raw data, you can regenerate processed files
🛡️ Prevention is Better Than Recovery
Most data emergencies are preventable with good practices:
- Regular automated backups
- Version control for important files
- Clear documentation so others can help
- Data ownership discussions before projects start
Glossary of Terms
Quick reference for common research data management terms and acronyms.
CCDB
Compute Canada Database — account system for accessing Digital Research Alliance resources
De-identification
Removing direct identifiers from data; re-identification may still be possible with additional information
DMP
Data Management Plan — document outlining how research data will be handled throughout a project
DOI
Digital Object Identifier — permanent link for datasets, publications, and other research outputs
DRAC
Digital Research Alliance of Canada — national organization providing research computing and data management infrastructure
DSA
Data Sharing Agreement — formal contract governing how data can be shared between parties
FAIR
Findable, Accessible, Interoperable, Reusable — principles for scientific data management
FIPPA
Freedom of Information and Protection of Privacy Act — Ontario legislation governing public sector privacy
FRDR
Federated Research Data Repository — Canadian national repository for large research datasets
GDPR
General Data Protection Regulation — European Union privacy law affecting research with EU data
Metadata
"Data about data" — information describing the content, context, and structure of research data
OCAP®
Ownership, Control, Access, and Possession — First Nations principles for data governance
ORCID
Open Researcher and Contributor ID — unique identifier linking researchers to their work
PHIPA
Personal Health Information Protection Act — Ontario law governing health information privacy
PI
Principal Investigator — lead researcher responsible for a research project
PIPEDA
Personal Information Protection and Electronic Documents Act — federal Canadian privacy law
RAC
Resource Allocation Competition — process for requesting large allocations of DRAC computing resources
REB
Research Ethics Board — committee that reviews research involving human participants
TCPS 2
Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans, 2nd edition — Canadian ethical guidelines for research involving humans
TSC
Technology Services Centre — Lakehead University's IT department
