Isochoric Nucleation Detection Analytics Pipeline

A fully automated cryobiology analytics workflow that gathers pressure-temperature data, cleans and transforms it, generates analytics, and outputs graphs in real time, saving 30 hours per week on data treatment.

About the project

The INDe Analytics Pipeline project was developed for a laboratory at the University of California, Berkeley, as part of PhD research in cryobiology investigating ice nucleation behavior in polymeric solutions. The project aimed to transform raw experimental data from isochoric nucleation detection experiments into publication-ready statistical analyses and visualizations. The result was a modular, three-phase workflow that processed hundreds of freeze-thaw cycles in real time, identified nucleation events while performing signal-to-noise processing, calculated survival functions, fitted theoretical models to the experimental data, and produced the relevant publication-ready graphs in seconds.

Date:

Jan 16, 2025

Client:

UC Berkeley

Services:

Project Details

The pipeline consisted of three stages. Stage 1 handled raw data inputs from a Raspberry Pi that pinpointed when a nucleation event occurred, based on strain and temperature measurements. This stage filtered the raw data for outliers using standard deviation criteria and exported the processed cycle data. Stage 2 transformed individual cycle data into population statistics: it calculated unfrozen fractions, generated survival curves, and produced interactive violin plots and scatter visualizations comparing different sample conditions. Stage 3 implemented Poisson nucleation modeling with curve-fitting optimization, calculating nucleation kinetics parameters through least-squares regression and orthogonal distance regression, complete with error analysis and R² metrics. The workflow was complemented by a monitoring dashboard and made the data processing fully reproducible, with no need for a human in the loop.
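As a rough illustration of the first two stages, the sketch below filters a set of nucleation temperatures with a standard deviation criterion and then computes an empirical unfrozen fraction. It is not the project's code: the function names, the 3-sigma default, and the choice to filter on nucleation temperature are assumptions made for the example.

```python
import numpy as np


def filter_outliers(nucleation_temps, n_sigma=3.0):
    """Drop values more than n_sigma sample standard deviations from the mean.

    Hypothetical helper: the real pipeline's thresholds and the quantity
    it filters on are not described in detail here.
    """
    temps = np.asarray(nucleation_temps, dtype=float)
    std = temps.std(ddof=1)
    if std == 0:
        return temps  # all values identical, nothing to reject
    return temps[np.abs(temps - temps.mean()) <= n_sigma * std]


def unfrozen_fraction(nucleation_temps):
    """Empirical survival curve over freeze-thaw cycles.

    Returns the nucleation temperatures sorted from warmest to coldest and,
    for each one, the fraction of cycles still unfrozen once the sample has
    been cooled to that temperature.
    """
    temps = np.sort(np.asarray(nucleation_temps, dtype=float))[::-1]
    frozen_count = np.arange(1, temps.size + 1)
    return temps, 1.0 - frozen_count / temps.size


if __name__ == "__main__":
    raw = [-10.2, -11.0, -9.8, -10.5, -10.9] * 4 + [-50.0]  # one spurious reading
    clean = filter_outliers(raw)
    temps, fraction = unfrozen_fraction(clean)
    for t, f in zip(temps[:3], fraction[:3]):
        print(f"{t:6.1f} °C  unfrozen fraction {f:.2f}")
```

Plotting `fraction` against `temps` for each sample condition gives the kind of survival curve that Stage 2 compared across groups.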

"The analytical framework that Bruno created transformed weeks of manual data processing into an automated, reproducible workflow that directly supported our published research. We would have an idea in the morning, and instead of waiting for the next day to discuss, we discussed in front of the dashboard like watching a show."

Boris Rubinsky

Principal Investigator, UC-Berkeley

Things I Did

I designed and implemented the complete three-stage data pipeline architecture, translating cryobiology research requirements into computational workflows. I built the data import and quality control system, with configurable statistical thresholds for nucleation event detection. I developed interactive data visualizations for multi-condition comparisons across experimental groups. I implemented the Poisson statistical modeling framework with SciPy optimization routines, including RMSE calculation and curve fitting with error propagation. I created the data persistence system, using common data storage to enable seamless, reproducible data flow between pipeline stages.
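To give a flavor of the modeling step, here is a minimal sketch of fitting a Poisson-type survival model with SciPy and reporting RMSE, R², and parameter uncertainties. The model form is an assumption for illustration: a nucleation rate that grows exponentially with supercooling, `J = k * exp(b * ΔT)`, integrated at a unit cooling rate. The names `survival_model` and `fit_survival` are hypothetical. Only the least-squares fit is shown; orthogonal distance regression is left out.

```python
import numpy as np
from scipy.optimize import curve_fit


def survival_model(supercooling, k, b):
    """Unfrozen fraction for an assumed rate J = k * exp(b * supercooling).

    Integrating J from 0 to the current supercooling (unit cooling rate)
    gives the expected number of events; the Poisson probability of zero
    events is the exponential of its negative.
    """
    return np.exp(-(k / b) * (np.exp(b * supercooling) - 1.0))


def fit_survival(supercooling, unfrozen, p0=(0.01, 0.5)):
    """Least-squares fit of the model, with goodness-of-fit metrics."""
    x = np.asarray(supercooling, dtype=float)
    y = np.asarray(unfrozen, dtype=float)
    popt, pcov = curve_fit(
        survival_model, x, y, p0=p0, bounds=(1e-12, np.inf)
    )
    residuals = y - survival_model(x, *popt)
    ss_res = np.sum(residuals**2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return {
        "k": popt[0],
        "b": popt[1],
        # One-sigma parameter uncertainties from the covariance diagonal.
        "k_err": np.sqrt(pcov[0, 0]),
        "b_err": np.sqrt(pcov[1, 1]),
        "rmse": np.sqrt(np.mean(residuals**2)),
        "r2": 1.0 - ss_res / ss_tot,
    }


if __name__ == "__main__":
    # Synthetic data generated from the assumed model, plus a little noise.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 12.0, 40)
    y = survival_model(x, 0.01, 0.6) + rng.normal(0.0, 0.01, x.size)
    print(fit_survival(x, y))
```

The standard errors taken from the covariance matrix are one simple way to carry fit uncertainty forward into derived kinetics parameters.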
