Contingent Summer Research Analyst Intern
About the Project
The chronic kidney disease (CKD) study uses 5 years of structured EHR data from a large private nephrology practice with over 50,000 patients. The study aims to:
Build standardized, analysis-ready analytic files (SAFs) spanning 2021–2025
Assess feasibility of longitudinal data elements (labs, prescriptions, disease history)
Characterize CKD patients using contemporary clinical and treatment data
Evaluate the availability of specific variables (imaging, genetics, family history) in unstructured clinical records
The intern will be embedded in an active project team that includes biostatisticians, epidemiologists, data scientists, and clinical nephrologists, and will contribute to analytic work from day one.
Key Responsibilities
Contribute to construction and QC of longitudinal electronic health record (EHR) analytic files using structured data
Conduct descriptive analyses of patient demographics, lab values, medication use, and clinical characteristics
Summarize data availability, follow-up patterns, and measurement frequency across CKD subgroups
Support feasibility assessments by generating counts, proportions, and distributional summaries
Produce clean, well-documented analytic code and contribute to draft tables and figures
Participate in biweekly internal team meetings and client meetings, and contribute to written deliverables
Qualifications
Required:
Currently enrolled in a graduate program (MPH, MS, PhD, or equivalent) in biostatistics, epidemiology, data science, health informatics, or a related field
Proficiency in Python, R, or SAS for data manipulation and descriptive analysis
Comfort working with big data (large, messy, real-world datasets)
Strong attention to detail and ability to write clean, reproducible, well-commented code
Ability to work independently with remote supervision
Comfort using AI-assisted coding tools (e.g., Claude, GitHub Copilot)
Preferred:
Familiarity with EHR data or claims-based data
Experience with longitudinal data structures (e.g., repeated lab measurements, time-to-event)
Experience with version control (Git)
Position Details
Duration: approximately July 1 – August 29, 2026 (flexible start; contingent on contract execution)
Hours: full-time (~40 hrs/week) or near full-time
Location: fully remote; no travel required
Compensation: paid internship (rate commensurate with experience)
Supervisor: Brian Bieber, MS, Research Scientist, Data Science
How to Apply
Submit a CV and a brief cover letter (1 page max) describing your relevant experience and availability. Applications will be reviewed on a rolling basis; early submission is strongly encouraged given the July start date.
Pay
$27 USD per hour