IDC type Breast Cancer Classification

Course project of ECE1512 - Digital Image Processing

H&E stained pathology tissue

Breast cancer classification on Keras (based on the implementaion of CancerNet algorithm by Adrian Rosebrock [1]).

  • Breast cancer is the most common form of cancer in women, and Invasive Ductal Carcinoma (IDC) is the most common form of breast cancer. Accurately identifying and categorizing breast cancer subtypes is an important clinical task, and automated methods can be used to save time and reduce error.

  • The goal of this script is to identify IDC when it is present in otherwise unlabeled histopathology images. The dataset consists of 277,524 50x50 pixel RGB digital image patches that were derived from 162 H&E-stained breast histopathology samples. These images are small patches that were extracted from digital images of breast tissue samples.

  • The breast tissue contains many cells but only some of them are cancerous. Patches that are labeled “1” contain cells that are characteristic of invasive ductal carcinoma.

  • The dataset was originally curated by Janowczyk and Madabhushi and Roa et al. but is available in public domain on Kaggle site.

  • For more information about the data used, see here and here.


[1]. Adrian Rosebrock, Breast cancer classification with Keras and Deep Learning, PyImageSearch, , accessed on 22 February 2019.

Mahesh Sudhakar
Mahesh Sudhakar
Computer Vision Research Engineer

Computer Vision | Robotics | Machine Learning