Research Data Management

Research data management is how you organize, store, protect, and share your research data throughout your project and beyond. Think of it as the difference between a lab bench covered in unlabeled samples versus one where you can find exactly what you need, when you need it.

Why it matters:

  • Find your own data later. Six months from now, will you remember what "final_version_v3_FINAL.xlsx" contains?
  • Meet funding requirements. Tri-Agency grants now require data management plans. No plan = no funding.
  • Protect sensitive information. Health data, Indigenous community data, and personal information need specific security measures.
  • Collaborate effectively. Your grad students and co-investigators need to understand your data structure.
  • Preserve your work. When you leave Lakehead or when hard drives fail, your data needs to survive.

Read Lakehead's RDM Institutional Strategy


Contact & Support

Contact us for personalized assistance with data management plans, security protocols, storage options, or any other RDM questions.

Andrew Austin
Research Security & Data Management Specialist
rdm.research@lakeheadu.ca
+1 (807) 343-8010 ext. 8190
CASES Building - FB 2004J

Planning Your Research

Data Management Plans (DMPs)

What is a DMP?

A Data Management Plan is a structured document outlining how research data will be collected, organized, stored, protected, and shared. It describes:

  • What data you'll collect or generate
  • How you'll manage it (file naming, version control, metadata)
  • Where you'll store it
  • Who can access it
  • How long you'll keep it and how it will be archived or disposed of

Why You Need a DMP

  • Research quality: Ensures data are collected consistently and can be verified
  • Funding compliance: The Tri-Agency RDM Policy requires DMPs for many grants
  • Efficiency: Prevents data loss and confusion within teams
  • Future value: Well-managed data can be reused and cited

Common DMP Mistakes

  • Saying data "cannot be shared" without explaining why
  • Not specifying who is responsible for data management
  • Listing storage without explaining security measures
  • Naming a repository that doesn't fit your discipline
  • Forgetting to budget for data management costs

Data Classification

All research data at Lakehead must be classified into one of three levels. Your classification determines storage requirements, access controls, and handling procedures.

When in doubt, classify higher.

Classification Resources

Data Collection Best Practices

How you collect data determines its quality and usability. Plan your collection methods carefully before you begin.

Before You Collect

  • Define your variables: What exactly will you measure or record? Be precise.
  • Choose your tools: Survey platform, lab instruments, interview recording — select before starting
  • Create a data dictionary: Document what each variable means, units, valid ranges, codes for missing data
  • Design for analysis: How will you analyze this data? Structure collection accordingly
  • Pilot test: Test your collection process with a small sample first

During Collection

Quality Control

  • Use validation rules where possible
  • Check data regularly during collection
  • Document any deviations from protocol
  • Note environmental conditions if relevant
  • Back up immediately after collection sessions

Common Pitfalls

  • Inconsistent data entry formats
  • Missing metadata about collection context
  • No backup until collection is "complete"
  • Changing collection methods mid-study
  • Not documenting instrument settings

Survey-Specific Guidance

Online surveys face unique challenges including bot responses and data quality issues.

  • Use CAPTCHA: Essential first line of defense against bots
  • Include attention checks: "Please select 'Strongly Agree' for this question"
  • Add consistency checks: Age and birth year should match
  • Use unique links: Single-use survey links prevent multiple submissions
  • Soft launch: Test with 50 responses before full deployment
  • Never auto-pay incentives: Review responses before compensation

Qualtrics Fraud Detection Guide


Approved Collection Tools

Costs & Budgeting for RDM

Many RDM resources are free, but some projects require budget allocation. Include data management costs in your grant applications.

What's Free

Lakehead Resources

  • Google Drive (100GB)
  • RDM consultation and support
  • Data classification training
  • DMP review assistance

National Resources

  • DRAC Nextcloud (100GB Canadian storage)
  • Borealis data repository
  • FRDR (Federated Research Data Repository)
  • DMP Assistant tool
  • DRAC compute resources (basic allocation)

What May Have Costs

  • Specialized software: Statistical packages, qualitative analysis tools (check TSC for institutional licenses first)
  • Large storage needs: Beyond free allocations, additional storage may require RAC applications or fees
  • Data curation services: Professional data cleaning, formatting, or migration
  • Transcription: Audio/video transcription services
  • Long-term preservation: Some discipline-specific repositories charge fees
  • Personnel time: Data management as part of RA responsibilities

Including RDM in Grant Budgets

Tri-Agency grants allow data management as an eligible expense. Consider including:

  • Personnel: RA time for data organization, documentation, curation
  • Software: Licenses for data collection or analysis tools
  • Storage: Cloud storage or backup solutions beyond free allocations
  • Services: Transcription, data entry, format conversion
  • Training: Team training on data management practices
  • Preservation: Repository fees or data migration costs

Budget Tip
Reviewers want to see realistic data management plans. Saying "data will be stored on Google Drive" is fine for small projects, but larger grants should demonstrate you've thought about long-term preservation and sharing costs.

For Graduate Students

Data management habits you build now will serve you throughout your career. Here's what you need to know as a graduate student.

Your Data vs. Your Supervisor's Data

Key principle: Data ownership is typically determined by who funded the research, not who collected it.

  • Grant-funded research: Data usually belongs to the institution and/or PI, governed by the grant terms
  • Your thesis work: You retain rights to your thesis, but underlying data may have shared ownership
  • Collaborative projects: Clarify data ownership and access rights at the start

Action: Have an explicit conversation with your supervisor about data ownership, access after graduation, and publication rights before you start collecting data.


What Happens When You Graduate?

  • Your Lakehead accounts will be deactivated — You'll lose access to Google Drive, email, and institutional systems
  • Plan your data transition: Identify what you need to take, what stays with your supervisor, and what gets deposited in a repository
  • Export before you leave: Download personal copies of files you're entitled to keep
  • Document everything: Your successor needs to understand your data — create clear README files and documentation

Building Good Habits Early

Do This

  • Use consistent file naming from day one
  • Back up regularly (3-2-1 rule)
  • Document as you go, not at the end
  • Use version control for code
  • Keep raw data separate and untouched

Avoid This

  • Storing data only on your laptop
  • Using personal cloud accounts for research
  • Waiting until thesis writing to organize
  • "final_v2_FINAL_revised.xlsx" naming
  • Assuming you'll remember what files contain

Thesis Data Checklist

  • Data ownership discussed with supervisor
  • Storage location agreed upon
  • Backup strategy in place
  • File naming convention established
  • README file started
  • Exit plan for graduation documented

Managing Your Data

Where Should I Store My Data?

All Lakehead staff and students receive 100GB of Google Drive storage through the Google Education Tenant. Data is encrypted at rest and in transit, but Google Drive stores data in US data centres (not Canadian), which may not meet all research requirements.

For research requiring Canadian data residency, the Digital Research Alliance of Canada offers Nextcloud — a Dropbox-like service with 100GB storage hosted in Canadian data centres (British Columbia).

Data Type                     | Recommended Storage       | Notes
Public / Internal             | Lakehead Google Drive     | 100GB per user, US data centres, encrypted at rest/transit
Internal (Canadian residency) | DRAC Nextcloud            | 100GB, Canadian data centres (BC), requires CCDB account
Confidential                  | TSC-approved solutions    | Contact TSC for current options
Long-term archival            | Borealis, FRDR            | After project completion
Large datasets / HPC          | Digital Research Alliance | Compute-intensive research

Common Questions

"Can I use personal Dropbox or OneDrive?"

Not approved for research data with personal information. May store data outside Canada.

"What about US-based cloud services?"

May violate FIPPA. Even Canadian companies may route through US servers. Note: Google Drive data is stored in the US.

"I need Canadian data residency."

Use DRAC Nextcloud (100GB, BC data centres). Requires a free CCDB account. Data syncs between devices and is backed up nightly.

"I need more than 100GB."

Contact TSC for additional allocation. For very large datasets, consider DRAC.

Storage Resources

File Naming & Organization

Do This

  • 2024-03-15_interview_P01.mp3
  • survey_cleaned_v02.csv
  • ISO dates (YYYY-MM-DD)
  • Underscores or hyphens, not spaces
  • Version numbers

Don't Do This

  • final_FINAL_v2.xlsx
  • my file (1).docx
  • data.csv
  • Spaces or special characters
  • Vague names

Recommended Folder Structure

ProjectName/
├── 01_RawData/ # Original (READ-ONLY)
├── 02_ProcessedData/ # Cleaned, transformed
├── 03_Analysis/ # Scripts, outputs
├── 04_Documentation/ # README, codebooks
├── 05_Outputs/ # Final reports
└── README.txt

Key principle: Never modify raw data—always work on copies.
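The folder structure above can also be scaffolded programmatically so every project starts the same way. A minimal Python sketch, assuming you want to create the layout from a script (`scaffold_project` is an illustrative helper, not an institutional tool):

```python
from pathlib import Path

# Folder names mirror the recommended structure above.
SUBFOLDERS = [
    "01_RawData",
    "02_ProcessedData",
    "03_Analysis",
    "04_Documentation",
    "05_Outputs",
]

def scaffold_project(root: str) -> Path:
    """Create the recommended RDM folder structure under `root`."""
    project = Path(root)
    for name in SUBFOLDERS:
        (project / name).mkdir(parents=True, exist_ok=True)
    # Start the README immediately so documentation begins on day one.
    readme = project / "README.txt"
    if not readme.exists():
        readme.write_text("Project: \nDescription: \nContact: \n")
    return project
```

Run it once at project start; `exist_ok=True` makes it safe to re-run without clobbering existing folders.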

RDM Best Practices

The FAIR Principles

Make your data Findable, Accessible, Interoperable, and Reusable:

  • Findable: Rich metadata and persistent identifiers (DOIs)
  • Accessible: Retrievable with clear access conditions
  • Interoperable: Standardized formats and vocabularies
  • Reusable: Clear licenses and provenance

Note: FAIR ≠ open. Data can be FAIR with controlled access.

Learn more at GO-FAIR.org


The 3-2-1 Backup Rule

The gold standard for protecting your research data from loss: keep 3 copies of your data, on 2 different types of storage media, with 1 copy off-site.

Example: 3-2-1 Using Lakehead Resources

Here's how a researcher could implement 3-2-1 for a typical project:

Copy   | Location                                  | Storage Type                       | Purpose
Copy 1 | Lakehead Google Drive (working copy)      | Cloud storage (US servers)         | Day-to-day work, collaboration, automatic sync
Copy 2 | External hard drive (office or home)      | Local physical storage             | Weekly backup, fast recovery if cloud fails
Copy 3 | DRAC Nextcloud (Canadian data centre, BC) | Cloud storage (different provider) | Off-site backup, Canadian data residency

Why this works: If Google has an outage, you have local backup. If your office floods, you have two cloud copies. If one cloud provider fails, you have another. Different failure modes are covered.

Alternative Configurations

  • For Canadian data residency:
    DRAC Nextcloud (primary) + External drive + Borealis (archive)
  • For large datasets:
    DRAC project storage (primary) + Tape/nearline + Google Drive (docs only)
  • For sensitive health data:
    TSC-approved storage + Encrypted external drive + Encrypted off-site

Common Mistakes

  • Two copies on same physical drive ≠ 2 copies
  • Synced folders aren't backups (deletions sync too)
  • External drive kept next to computer isn't "off-site"
  • Never testing if backups actually restore
  • Backing up only at project end

Backup Schedule Suggestion

  • Daily: Working files auto-sync to cloud (Google Drive/Nextcloud)
  • Weekly: Manual backup to external drive + verify sync is working
  • Monthly: Test restore a random file from each backup location
  • At milestones: Create dated archive copy (e.g., "2025-01-15_data_collection_complete")
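The monthly "test restore" step can be made objective with checksums: a restored file should be byte-identical to the original. A minimal sketch using Python's standard `hashlib` (function names are illustrative):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Hash a file in chunks so large datasets don't exhaust memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_restore(original: str, restored: str) -> bool:
    """True only if the restored copy is byte-identical to the original."""
    return sha256_of(original) == sha256_of(restored)
```

Recording the hash alongside each milestone archive also lets you detect silent corruption later.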

README Files

Every dataset needs a README explaining:

  • Project description and collection methods
  • File inventory and variable definitions
  • Units, formats, missing data codes
  • Access conditions and contact info
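One way to make "document as you go" automatic is to generate a README stub with these headings at project start. A small sketch, assuming plain-text output and using the field names from the list above (the helper is illustrative):

```python
# Fields mirror the README checklist above; fill in values as the
# project evolves rather than at the end.
README_FIELDS = [
    "Project description",
    "Collection methods",
    "File inventory",
    "Variable definitions",
    "Units, formats, missing data codes",
    "Access conditions",
    "Contact",
]

def readme_stub(title: str) -> str:
    """Return a plain-text README skeleton with one line per field."""
    lines = [title, "=" * len(title), ""]
    lines += [f"{field}: " for field in README_FIELDS]
    return "\n".join(lines)
```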

Cornell Readme Template

Metadata & Documentation

Metadata is "data about data" — the information that makes your data findable, understandable, and reusable. Without good metadata, even well-organized files become unusable.

Data Dictionaries / Codebooks

Essential for any dataset with variables. Document each variable with:

Variable Name  | Description                   | Type    | Valid Values | Missing Code
participant_id | Unique participant identifier | String  | P001-P999    | N/A
age_years      | Age at enrollment             | Integer | 18-99        | -99
consent_date   | Date consent signed           | Date    | YYYY-MM-DD   | blank
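A data dictionary becomes even more useful when you check incoming records against it. A minimal Python sketch, assuming the example codebook above (the rule set and `invalid_fields` helper are illustrative; adapt the patterns and ranges to your own codebook):

```python
import re

# Rules mirror the example data dictionary: ID pattern, age range with
# a -99 missing code, and ISO dates with blank allowed for missing.
RULES = {
    "participant_id": lambda v: re.fullmatch(r"P\d{3}", v) is not None,
    "age_years": lambda v: v == -99 or 18 <= v <= 99,
    "consent_date": lambda v: v == "" or re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) is not None,
}

def invalid_fields(record: dict) -> list:
    """Return the names of fields that violate the codebook rules."""
    return [field for field, ok in RULES.items() if not ok(record[field])]

record = {"participant_id": "P001", "age_years": 47, "consent_date": "2024-03-15"}
bad = {"participant_id": "patient-1", "age_years": 147, "consent_date": "15/03/2024"}
```

Running checks like this during collection, rather than at analysis time, is exactly the "validation rules" practice recommended earlier.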

Persistent Identifiers

Permanent links that ensure your work remains findable even if websites change.

DOIs (Digital Object Identifiers)

Permanent links for datasets, publications, and other research outputs.
Example: 10.5683/SP3/ABC123
Borealis and FRDR automatically assign DOIs to deposited datasets.

ORCIDs (Researcher IDs)

Your unique researcher identifier that links all your work.
Example: 0000-0002-1234-5678
Register for free at orcid.org


Discipline-Specific Metadata Standards

Many fields have established standards. Using them makes your data interoperable.

Collaboration & Sharing

Lakehead provides collaboration tools through Google Workspace.

Google Workspace

Best Practices

  • Use institutional accounts — Personal Gmail lacks protections
  • Set appropriate permissions — Viewer vs Editor
  • Avoid "Anyone with link" for sensitive data
  • Review access periodically

Compliance & Ethics

Privacy & Legal Compliance

Key Legislation

FIPPA
Freedom of Information and Protection of Privacy Act — governs personal information at Ontario public institutions.

PHIPA
Personal Health Information Protection Act — additional requirements for health information.

PIPEDA
Federal private-sector privacy law — applies to partnerships with private companies.

TCPS 2
Tri-Council Policy Statement — ethical guidelines for research involving humans.



Canadian Data Residency

  • FIPPA generally requires Ontario personal information to stay in Canada
  • Some REB approvals mandate Canadian-only storage
  • US-based cloud services may be subject to US government access under the US CLOUD Act
  • Note: Lakehead Google Drive stores data in US data centres

Warning: "Canadian company" doesn't guarantee Canadian data residency. Many route through US servers.

Canadian Storage Option: DRAC Nextcloud provides 100GB cloud storage hosted in Canadian data centres (British Columbia). Free with a CCDB account.

Security Resources


Handling Sensitive Data

Sensitive data requires extra precautions throughout its lifecycle. This section covers practical techniques for protecting confidential information.

De-identification vs. Anonymization

These terms are often confused, but they have different meanings with significant legal and ethical implications.

Definition
  De-identified: Direct identifiers removed, but re-identification may be possible with additional information.
  Anonymous: No reasonable possibility of re-identification, even with additional data.

Key linking
  De-identified: Often maintains a key linking codes to identities (held separately).
  Anonymous: No key exists; the link is permanently broken.

Privacy law status
  De-identified: Still considered personal information under FIPPA/PHIPA.
  Anonymous: May fall outside privacy legislation scope.

REB oversight
  De-identified: Usually still requires REB approval and oversight.
  Anonymous: May not require ongoing REB oversight (but verify).

Data sharing
  De-identified: Typically requires DSAs and restricted access.
  Anonymous: Can often be shared more freely.

Reversibility
  De-identified: Can be re-identified if needed (e.g., for follow-up).
  Anonymous: Cannot be reversed; participants cannot be contacted again.

De-identification Example

A health study replaces patient names with codes (P001, P002) and stores the linking key in a separate secure file. The researcher can re-contact participants if needed.

Risk: If someone obtains both the data and the key, participants can be identified.

Anonymization Example

A survey dataset has all identifiers permanently removed, dates generalized to year only, and geographic data aggregated to regional level. No key exists.

Trade-off: Cannot go back to participants for clarification or follow-up studies.

Types of Identifiers

Direct Identifiers (Always Remove)

  • Names (including initials)
  • Social Insurance Numbers
  • Health card numbers
  • Email addresses
  • Phone numbers
  • Full addresses
  • Photos/videos showing faces
  • Biometric data
  • IP addresses

Indirect/Quasi-Identifiers (Assess Risk)

  • Dates (birth, admission, death)
  • Geographic data (postal codes, cities)
  • Occupation + employer combination
  • Rare diseases or conditions
  • Ethnicity in small populations
  • Unique event dates
  • Institutional affiliations
  • Detailed age (use ranges instead)

Common De-identification Techniques

Technique         | Description                         | Example
Suppression       | Remove the value entirely           | Delete name column
Generalization    | Make values less specific           | Age 47 → "45-49" range
Pseudonymization  | Replace with artificial identifiers | "Jane Smith" → "P0042"
Date shifting     | Shift all dates by random interval  | All dates +/- 30 days
Top/bottom coding | Cap extreme values                  | Age 95 → "90+"
Data swapping     | Exchange values between records     | Swap postal codes between similar records
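Two of these techniques, pseudonymization and generalization with top-coding, can be sketched in a few lines of Python. These helpers are illustrative only, not a complete de-identification workflow; remember the linking key must live in a separate, secured location:

```python
def pseudonymize(name: str, key: dict, prefix: str = "P") -> str:
    """Replace a name with a stable code; store `key` separately and securely."""
    if name not in key:
        key[name] = f"{prefix}{len(key) + 1:04d}"
    return key[name]

def generalize_age(age: int, width: int = 5, cap: int = 90) -> str:
    """Bin age into ranges and top-code extremes (e.g. 95 -> '90+')."""
    if age >= cap:
        return f"{cap}+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

key = {}  # the linking key: codes back to identities
code = pseudonymize("Jane Smith", key)  # "P0001"
band = generalize_age(47)               # "45-49"
```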

The "Mosaic Effect"
Even when individual data elements seem harmless, combining multiple quasi-identifiers can uniquely identify someone. Example: "Female + Age 34 + Profession: Pilot + City: Thunder Bay" may identify only one person. Always assess re-identification risk across the entire dataset, not just individual fields.
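One simple way to assess mosaic-effect risk is a k-anonymity count: group records by their quasi-identifier combination and look for combinations that occur only once. A sketch under assumed field names (`k_anonymity_risk` is an illustrative helper):

```python
from collections import Counter

def k_anonymity_risk(records, quasi_ids):
    """Return (k, unique_rows): the smallest group size across all
    quasi-identifier combinations, and the combinations seen only once."""
    combos = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    unique = [dict(zip(quasi_ids, c)) for c, n in combos.items() if n == 1]
    return min(combos.values()), unique

records = [
    {"sex": "F", "age_band": "30-34", "city": "Thunder Bay"},
    {"sex": "F", "age_band": "30-34", "city": "Thunder Bay"},
    {"sex": "M", "age_band": "30-34", "city": "Thunder Bay"},
]
```

Here k = 1: the single male record is uniquely identifiable, so further generalization or suppression would be needed before sharing.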

De-identification Resources



Encryption

Encryption scrambles data so only authorized users can read it.

When Encryption is Required

  • Confidential data on portable devices
  • Data transfers outside secure networks
  • PHIPA-regulated health information
  • When specified by REB or funder

Types of Encryption

  • At rest: Files on disk (BitLocker, FileVault)
  • In transit: Data being transmitted (HTTPS, SFTP)
  • End-to-end: Only sender/receiver can decrypt

Lakehead Encryption Guide


Secure File Transfer

Never send confidential data via regular email.

  • SFTP: Secure File Transfer Protocol — encrypted transfers to servers
  • Google Drive (Lakehead): Share links with specific people, not "anyone with link"
  • Encrypted email: Use institutional tools for sensitive attachments
  • Globus: For large research dataset transfers between institutions

If You Suspect a Data Breach

Don't panic, but act quickly.

  • Document: What data, how many records, when discovered, how it may have occurred
  • Report immediately: Contact TSC and the REB
  • Preserve evidence: Don't delete files or emails related to the incident
  • Follow institutional procedures: Lakehead has breach notification requirements

TSC Security: Report security incidents to the Technology Services Centre immediately.


Data Sharing Agreements

Formal agreements that govern how data can be shared, used, and protected between parties. Essential for collaborative research and data transfers.

When You Need a DSA

  • Multi-institutional research: Sharing data with collaborators at other universities
  • Industry partnerships: Any data exchange with private sector partners
  • International transfers: Sending data outside Canada (additional requirements apply)
  • Secondary use: Using data collected for a different purpose
  • Receiving data: When another organization provides data to you


What DSAs Typically Cover

  • What data will be shared
  • Permitted uses and restrictions
  • Security and storage requirements
  • Who can access the data
  • Duration of agreement
  • Data destruction requirements
  • Publication rights
  • Intellectual property
  • Liability and indemnification
  • Breach notification procedures


International Data Transfers

GDPR (EU): If you're working with European collaborators or EU citizen data, the General Data Protection Regulation applies. This requires specific contractual clauses and may restrict where data can be stored.

  • Consult with Research Services before transferring data internationally
  • Ensure your REB approval covers international collaboration
  • Some countries have data localization requirements

Getting a DSA

Contact the Office of Research Services to initiate a data sharing agreement. Allow 4-6 weeks for negotiation and signing. Start early — don't wait until you need the data.

Indigenous Data Sovereignty

Indigenous Peoples have inherent rights over data about their communities, lands, and knowledge. For research with First Nations, Inuit, or Métis communities, standard RDM practices must be adapted.

OCAP® Principles (First Nations)

O — Ownership
Communities collectively own their data and knowledge.

C — Control
Communities control research affecting them.

A — Access
Communities must have access to their data.

P — Possession
Physical control remains with the community.



CARE Principles (International)

Collective Benefit
Data should benefit Indigenous communities.

Authority to Control
Rights to govern their data must be recognized.

Responsibility
Support self-determination and build relationships.

Ethics
Indigenous wellbeing is the primary concern.

At Lakehead

Conflicts between Lakehead guidelines and OCAP® or community protocols must be resolved with the Office of Research Services before the project begins.

Resources

Data Retention & Disposal

Retention Requirements

Per the LUFA Collective Agreement: minimum 7 years after project completion. Contracts or funders may extend this.

The Tri-Agency RDM Policy also requires data preservation for validation and reuse purposes.



Disposal Methods

Classification | Electronic       | Paper           | Devices
Confidential   | Secure wipe      | Certified shred | Return to TSC
Internal       | Delete + backups | Shred           | Return to TSC
Public         | Delete           | Any method      | Return to TSC

Third-party contracts: Providers must return or destroy data with written certification within 30 days.

Sharing & Preservation

Data Deposit & Publication

Depositing your data in a repository preserves it for the long term and makes it findable and citable. This is often required by funders and journals.

When to Deposit

  • At publication: Many journals require data availability statements and DOIs
  • At project completion: Before grant closes and team disperses
  • After embargo: Some data can be embargoed during patent applications or ongoing analysis
  • Before you leave: If you're graduating or leaving Lakehead, deposit before losing access

Choosing a Repository

Repository          | Best For                      | Key Features
Borealis (Lakehead) | Most research data            | Canadian, free, DOIs, access controls
FRDR                | Large datasets (100GB+)       | Curated, discovery platform
Zenodo              | Code, supplementary materials | GitHub integration, free
Discipline-specific | Field standards               | ICPSR (social), GenBank (genomics), etc.

Use re3data.org to find discipline-specific repositories.


Preparing Data for Deposit

File Preparation

  • Use open, non-proprietary formats (CSV, TXT, PDF/A)
  • Remove or de-identify personal information
  • Include README and data dictionary
  • Organize files logically
  • Use clear, descriptive file names

Metadata to Include

  • Title, authors, description
  • Keywords and subject terms
  • Collection methods
  • Geographic and temporal coverage
  • Related publications

Choosing a License

Licenses tell others how they can use your data.

CC0 (Public Domain)
No restrictions. Maximum reusability. Recommended for data.

CC-BY (Attribution)
Users must cite you. Good for most research data.

CC-BY-NC (Non-Commercial)
No commercial use. Limits some research applications.

Restricted Access
Users must request access. For sensitive data.


DOIs and Data Citation

When you deposit data, repositories assign a DOI (Digital Object Identifier) — a permanent link that makes your data citable.

Example citation:

Smith, J., & Jones, M. (2024). Survey data on Northern Ontario housing [Data set]. Borealis. https://doi.org/10.5683/SP3/EXAMPLE

Include your data DOI in publications and link your dataset to your ORCID profile.

Deposit Checklist

  • Data cleaned and de-identified (if needed)
  • Files in open formats
  • README file included
  • Data dictionary/codebook included
  • License selected
  • Metadata complete
  • Embargo period set (if needed)
  • DOI obtained and recorded

Discipline-Specific Guidance

Different research fields have unique data management considerations. Find guidance for your discipline below.

Health Research

Key requirements:

  • PHIPA compliance: Ontario health information must be protected under the Personal Health Information Protection Act
  • De-identification required: Remove all direct identifiers before sharing or publishing
  • Secure storage: Confidential classification — contact TSC for approved solutions
  • Data sharing: Often requires DSAs and REB approval for secondary use
  • Retention: Typically 10+ years for clinical research

Repositories: Restricted-access deposits on Borealis, ICPSR for survey data, dbGaP for genomic data


Social Sciences

Key considerations:

  • Qualitative data: Interview transcripts and field notes require careful de-identification
  • Consent for sharing: Include data sharing in consent forms from the start
  • Codebooks essential: Survey data needs comprehensive variable documentation
  • Longitudinal considerations: Plan for linking data across time points securely
  • Repositories: Borealis, ICPSR, Qualitative Data Repository (QDR), UK Data Archive

Lab Sciences

Key considerations:

  • Instrument data: Document equipment settings, calibration, and software versions
  • Lab notebooks: Electronic lab notebooks provide version control and timestamps
  • Raw vs. processed: Preserve raw data separately; document all processing steps
  • Reproducibility: Include analysis scripts and computational environment details
  • Large file sizes: May require DRAC storage or discipline-specific repositories

Repositories: Zenodo, Figshare, discipline-specific (GenBank, PDB, PANGAEA)



Computational Research

Key considerations:

  • Version control: Use Git for code; tag releases corresponding to publications
  • Environment documentation: requirements.txt, conda environments, Docker containers
  • Code citation: Get DOIs for software through Zenodo-GitHub integration
  • Licensing: Choose appropriate open-source license (MIT, GPL, Apache)
  • README files: Include installation, usage instructions, and examples

Repositories: GitHub + Zenodo, Software Heritage, CodeOcean


Environmental & Field Research

Key considerations:

  • Geospatial data: Include coordinate reference systems, precision, and collection methods
  • Temporal data: Document time zones, sampling frequency, and any gaps
  • Field conditions: Record weather, equipment issues, and deviations from protocol
  • Indigenous territories: Follow OCAP® principles for research on traditional lands
  • Sensor data: Document calibration and any post-processing applied

Repositories: PANGAEA, Dryad, Environmental Data Initiative (EDI), GBIF

Special Considerations

AI in Research

LLMs like ChatGPT, Gemini, and Claude are powerful but come with security risks. Many tools (Grammarly, Microsoft Office) now include AI features that may collect your data.

Recommended: Google Gemini

Access through your Lakehead account for enhanced privacy protections:

  • Data NOT used for AI training
  • You own your data
  • Free with institutional account

Google Gemini

NotebookLM


Avoid DeepSeek

Canadian security agencies have identified significant privacy and security risks with this Chinese AI service.

Government of Canada: Guide on the Use of Generative AI


Safe to Use

  • Published research summaries
  • Writing assistance
  • Understanding concepts
  • Generic code examples

Never Use For

  • Confidential research data
  • Participant information
  • Sensitive or controlled data
  • Personal info (students, colleagues)

Disclosure

Disclose significant AI contributions to publications, grants, or analysis.

"This work used AI assistance (Google Gemini) for [task]. All outputs were verified by the authors."


Survey Tips: Avoiding Bots

Layer 1: Planning

  • Soft launch: Test with 50 responses, check quality, scale up
  • Daily monitoring: Review responses in the first week
  • Unique links: Single-use links, not one public URL

Layer 2: Technical Barriers

  • CAPTCHA: Non-negotiable first defense
  • Honeypots: Hidden fields only bots fill
  • Attention checks: "Select 'Strongly Agree'"
  • Consistency checks: Year of birth + age should match

Layer 3: Screening

  • Never automate incentives: Human review first
  • Flag red flags: Impossible speed, IP clusters, gibberish
  • Delay payouts: Process after survey closes
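The Layer 2 and Layer 3 checks can be combined into a simple screening pass over exported responses. A hedged sketch: the field names, survey year, and the 120-second minimum are assumptions to adapt to your own instrument, not platform defaults:

```python
SURVEY_YEAR = 2025   # assumed year the survey ran
MIN_SECONDS = 120    # assumed plausible minimum completion time

def flags(resp: dict) -> list:
    """Return the screening rules this response fails (empty = looks OK)."""
    out = []
    if resp["duration_s"] < MIN_SECONDS:
        out.append("too_fast")
    # Attention check: respondents were told to select "Strongly Agree".
    if resp["attention_check"] != "Strongly Agree":
        out.append("failed_attention_check")
    # Consistency check: reported age should match birth year (within a year).
    if abs((SURVEY_YEAR - resp["birth_year"]) - resp["age"]) > 1:
        out.append("age_birth_year_mismatch")
    return out

good = {"duration_s": 300, "attention_check": "Strongly Agree", "birth_year": 1990, "age": 35}
bot = {"duration_s": 20, "attention_check": "Agree", "birth_year": 1990, "age": 52}
```

Flagged responses still need human review before any incentive is paid, per Layer 3.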

Resources

Research Websites

If your project includes a dedicated website, you're responsible for keeping it accurate and up-to-date.

When your research is complete, take the website offline to prevent outdated information and security risks.

Resources

National Research Infrastructure

The Digital Research Alliance of Canada (DRAC) provides national infrastructure including storage, compute resources, and data management tools.

Lakehead University TSC

The Technology Services Centre (TSC) provides IT support, software, and infrastructure for Lakehead researchers. Contact TSC for questions about storage, software licensing, and technical support.

Key Resources

When to Contact TSC

  • Need additional storage beyond 100GB
  • Questions about approved software
  • Setting up shared drives for research teams
  • Device returns for secure data disposal
  • VPN or remote access issues
  • Security concerns or incidents

Tip: Storage Requests
If your research requires more than 100GB of Google Drive storage, contact TSC with details about your project and estimated storage needs. For very large datasets (500GB+), consider Digital Research Alliance storage options.

Training & Events

Upcoming Events

Check back for local Lakehead events!


Past Events (Resources)


Self-Paced Training

Templates & Downloads

Ready-to-use templates and checklists to support your research data management.

Documentation Templates

  • README Template (Cornell University)
  • Data Dictionary Template (OSF)
  • Codebook Guide (ICPSR)
  • DMP Assistant (DRAC)


Checklists

Project Start Checklist

  • DMP created or updated
  • Storage location selected
  • File naming convention established
  • Backup strategy in place
  • Data classification determined
  • Team roles and access defined
  • REB requirements confirmed

Project End Checklist

  • Data cleaned and organized
  • Documentation complete
  • Data deposited in repository
  • DOI obtained and recorded
  • Access permissions updated
  • Retention schedule confirmed
  • Secure destruction of copies

Quick Reference Guides

Need a Custom Template?
Contact RDM support if you need help adapting templates for your specific research needs or discipline.

What If Things Go Wrong?

Data emergencies happen. Here's what to do when things don't go as planned.

Accidentally Deleted Files

Google Drive:

  • Check Trash — files stay for 30 days
  • For files deleted from Trash, contact TSC immediately — recovery may be possible within 25 days

Local files:

  • Stop using the drive immediately to prevent overwriting
  • Check backups (external drives, cloud sync)
  • Contact TSC — they may have backup options

Lost Access to Storage

  • Lakehead account issues: Contact TSC Help Desk
  • Shared drive access: Contact the drive owner or your supervisor
  • DRAC/Alliance resources: Contact Alliance support or renew your CCDB account
  • Left/graduated: Your supervisor or department can request access to institutional data

Suspected Data Breach

  • Report immediately to TSC and your supervisor
  • Document what happened, when, and what data may be affected
  • Don't delete anything — preserve evidence
  • Notify REB if human participant data is involved
  • Follow institutional procedures — Lakehead has breach notification requirements

TSC Security Contact: TSC Help Desk


Hardware Failure

Laptop/computer died: If drive is intact, data may be recoverable — contact TSC
External drive failed: Professional recovery is expensive ($500-$2000+) and not guaranteed
Prevention: Follow the 3-2-1 rule (3 copies, 2 different media, 1 offsite)
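The "3 copies" part of the 3-2-1 rule can be automated with a short script that refreshes copies of a project folder on each backup location. This is a minimal Python sketch, not a Lakehead-provided tool; the function name and destination folders are illustrative, and the destinations should sit on different media (e.g., an external drive and a cloud-synced folder) to satisfy the "2 different media, 1 offsite" parts.

```python
import shutil
from pathlib import Path

def backup_copies(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy the `source` folder into each destination, keeping its name.

    Together with the original, this yields the "3 copies" of the
    3-2-1 rule when two destinations are given. Paths are examples.
    """
    made = []
    for dest in destinations:
        target = dest / source.name
        # dirs_exist_ok=True lets repeated runs refresh an existing copy
        shutil.copytree(source, target, dirs_exist_ok=True)
        made.append(target)
    return made
```

Run on a schedule (e.g., a daily scheduled task), this gives the "regular automated backups" recommended below; for large datasets, a sync tool that copies only changed files is more efficient than a full copy.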


Collaborator Conflict Over Data

  • Review agreements: Check your DMP, DSA, or any written agreements about data ownership
  • Consult your supervisor or department head
  • Contact Research Services: They can advise on institutional policies and help mediate
  • Document everything: Keep records of contributions and communications

Prevention: Clarify data ownership and access rights in writing before starting collaborative projects.


Corrupted Files

  • Check version history: Google Drive keeps versions for 30 days (or 100 versions)
  • Restore from backup: This is why regular backups matter
  • Try file repair: Some software can recover partially corrupted files
  • Raw data priority: If you have raw data, you can regenerate processed files
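One way to catch corruption early, before backups of the damaged file overwrite good copies, is to record a checksum for each file when the data are created and re-verify the checksums before analysis. A minimal Python sketch (the function names are illustrative):

```python
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to `folder`) to its checksum."""
    return {str(p.relative_to(folder)): sha256_file(p)
            for p in sorted(folder.rglob("*")) if p.is_file()}

def find_corrupted(folder: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths whose current checksum no longer matches."""
    return [rel for rel, digest in manifest.items()
            if sha256_file(folder / rel) != digest]
```

Store the manifest alongside the data (and with each backup copy); a mismatch on re-verification tells you which file changed, so you know which backup to restore.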

Prevention is Better Than Recovery

Most data emergencies are preventable with good practices:

  • Regular automated backups
  • Version control for important files
  • Clear documentation so others can help
  • Data ownership discussions before projects start

Glossary of Terms

Quick reference for common research data management terms and acronyms.

  • CCDB
    Compute Canada Database — account system for accessing Digital Research Alliance resources
  • De-identification
    Removing direct identifiers from data; re-identification may still be possible with additional information
  • DMP
    Data Management Plan — document outlining how research data will be handled throughout a project
  • DOI
    Digital Object Identifier — permanent link for datasets, publications, and other research outputs
  • DRAC
    Digital Research Alliance of Canada — national organization providing research computing and data management infrastructure
  • DSA
    Data Sharing Agreement — formal contract governing how data can be shared between parties
  • FAIR
    Findable, Accessible, Interoperable, Reusable — principles for scientific data management
  • FIPPA
    Freedom of Information and Protection of Privacy Act — Ontario legislation governing public sector privacy
  • FRDR
    Federated Research Data Repository — Canadian national repository for large research datasets
  • GDPR
    General Data Protection Regulation — European Union privacy law affecting research with EU data
  • Metadata
    "Data about data" — information describing the content, context, and structure of research data
  • OCAP®
    Ownership, Control, Access, Possession — First Nations principles for data governance
  • ORCID
    Open Researcher and Contributor ID — unique identifier linking researchers to their work
  • PHIPA
    Personal Health Information Protection Act — Ontario law governing health information privacy
  • PI
    Principal Investigator — lead researcher responsible for a research project
  • PIPEDA
    Personal Information Protection and Electronic Documents Act — federal Canadian privacy law
  • RAC
    Resource Allocation Competition — process for requesting large allocations of DRAC computing resources
  • REB
    Research Ethics Board — committee that reviews research involving human participants
  • TCPS 2
    Tri-Council Policy Statement — Canadian ethical guidelines for research involving humans
  • TSC
    Technology Services Centre — Lakehead University's IT department