Healthcare and life science organizations are increasingly working with large-scale, multimodal datasets that include structured records, clinical notes, diagnostic images, and PDF documents.

Sharing this data for research and AI development requires rigorous de-identification that protects patient privacy without compromising the ability to extract insights across time and modalities.

In this webinar, experts from John Snow Labs and Databricks will demonstrate an end-to-end solution for automating the de-identification and tokenization of medical data with regulatory-grade accuracy. You’ll learn how to:

  • Automatically de-identify structured data, unstructured text, DICOM & JPEG images, whole-slide pathology images (SVS), and PDFs using John Snow Labs’ industry-leading software and AI models
  • Apply patient tokenization to enable linking of de-identified data across modalities and time points
  • Use Databricks to process and scale these capabilities across large, real-world datasets
  • Support HIPAA, GDPR, and other regulatory requirements for privacy-preserving research
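
To make the tokenization idea above concrete, here is a minimal sketch of one common approach: a keyed hash (HMAC-SHA256) that maps the same patient identifier to the same token wherever it appears. This is an illustration only, not the John Snow Labs or Databricks implementation covered in the webinar; the key, the record fields, and the helper names `patient_token` and `deidentify` are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be kept in a secrets manager,
# never alongside the de-identified data.
SECRET_KEY = b"replace-with-a-secret-key"


def patient_token(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable token for a patient identifier using a keyed hash."""
    # Normalize so formatting differences do not produce different tokens.
    normalized = patient_id.strip().upper()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict) -> dict:
    """Drop the direct identifier and attach the linking token instead."""
    out = {k: v for k, v in record.items() if k != "patient_id"}
    out["patient_token"] = patient_token(record["patient_id"])
    return out


# Hypothetical records from two modalities that refer to the same patient.
clinical_note = {"patient_id": "MRN-001234", "modality": "text"}
dicom_study = {"patient_id": " mrn-001234 ", "modality": "DICOM"}

note_out = deidentify(clinical_note)
study_out = deidentify(dicom_study)

# Both outputs carry the same token, so they can still be joined.
print(note_out["patient_token"] == study_out["patient_token"])
```

Because the token depends on a secret key, someone holding only the de-identified records cannot recompute it from a guessed identifier, while records for the same patient remain linkable across modalities and time points.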

This session is ideal for data scientists, clinical researchers, compliance teams, and healthcare IT leaders working with multimodal patient data who want to enable longitudinal, privacy-compliant research at scale.

 

PRESENTED BY:

Srikanth Kumar Rana
Solutions Architect
Databricks

Srikanth Kumar Rana is a seasoned Field Engineer at Databricks, bringing extensive experience in helping organizations unlock the full potential of data and AI. With a strong focus on empowering customers, Srikanth has consistently demonstrated expertise in complex deployments, driving adoption, and enabling businesses to achieve tangible outcomes on the Databricks Lakehouse platform.

 

Alberto Andreotti
Data Scientist
John Snow Labs

Alberto Andreotti is a data scientist at John Snow Labs, specializing in Machine Learning, Natural Language Processing, and Distributed Computing. With a background in Computer Engineering, he has expertise in developing software for both Embedded Systems and Distributed Applications. Alberto is skilled in Java and C++ programming, particularly for mobile platforms. His focus includes Machine Learning, High-Performance Computing (HPC), and Distributed Systems, making him a pivotal member of the John Snow Labs team.

