Wiki for material and resources, Deep Learning for COVID XRay detection

- [Description of available models, code, and data](https://gitlab.version.fz-juelich.de/MLDL_FZJ/juhaicu/jsc_public/sharedspace/playground/covid_xray_deeplearning/wiki/-/blob/master/Description.md)

- The code is maintained [here](https://gitlab.version.fz-juelich.de/MLDL_FZJ/juhaicu/jsc_public/sharedspace/playground/covid_xray_deeplearning/covid19_detection)

- Next steps are available in the issues of the code repository: [Next steps](https://gitlab.version.fz-juelich.de/MLDL_FZJ/juhaicu/jsc_public/sharedspace/playground/covid_xray_deeplearning/covid19_detection/-/issues)

- Compute budget: `dlmpdxi_cov2`, **3.2 Mcore-h (i.e., 25 kGPU-h) for two months**

- JJ: the budget is currently limited until 31.10.2020; see JuDOOR.

We will apply for a proper full compute-time project, building on the results obtained until then.

## Next Steps

- JJ: The long-term vision is a system that digests different image modalities (not only X-ray) and continually improves a generic model of image understanding (with some focus on medical diagnostics and analysis), allowing fast transfer to a domain of interest. If a new domain X appears, triggered by an unknown novel pathogen causing a disease that can be diagnosed via medical imaging, the generic model, pretrained on millions of images from distinct domains, can be used to quickly derive an expert model for domain X.

- JJ: Directions to go:

* Uncertainty estimation: the current output contains no information about how uncertain the network is about its prediction

- making an uncertainty estimate available would show how confident the network is in a prediction, so that, for example, outputs signalling high uncertainty can be treated with extra care

- a recent example from medical imaging: https://www.nature.com/articles/s41598-019-50587-1

- in general, look at Bayesian neural network (BNN) methods

- good code overview: https://github.com/JavierAntoran/Bayesian-Neural-Networks/

- short review: https://engineering.papercup.com/posts/bayesian-neural-nets/

- classical books and papers: MacKay, Bishop, Neal

- Monte Carlo Dropout (MCD, Yarin Gal): outdated, but usable as a baseline

- original paper: https://arxiv.org/abs/1506.02142 , http://proceedings.mlr.press/v48/gal16.pdf

- somewhat more recent code: https://github.com/aredier/monte_carlo_dropout

- Monte Carlo Dropout (MCD) is an approximate variational inference method based on dropout. The approximating distribution q(w) takes the form of the product between Bernoulli random variables and the corresponding weights. Hence, sampling from q(w) reduces to sampling Bernoulli variables, and is thus very efficient.

- in practice, this amounts to training with dropout and keeping dropout active during inference, running inference several times to obtain uncertainty estimates

- related material: http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html

- related technique: MC-DropConnect, https://arxiv.org/abs/1906.04569
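The MCD inference step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: the tiny MLP, its random weights, the helper names (`stochastic_forward`, `mc_dropout_predict`) and the review threshold are all hypothetical. It only shows the mechanics: dropout stays active at inference, several stochastic passes are averaged, and the entropy of the averaged prediction serves as an uncertainty score.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_forward(x, w_hidden, w_out, p_drop, rng):
    """One forward pass of a tiny MLP with dropout kept active."""
    hidden = []
    for row in w_hidden:
        activation = max(0.0, sum(w * xi for w, xi in zip(row, x)))  # ReLU
        keep = rng.random() >= p_drop  # Bernoulli(1 - p_drop) mask
        # Dropping a unit zeroes its outgoing weights: one sample from q(w).
        hidden.append(activation / (1.0 - p_drop) if keep else 0.0)
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return softmax(logits)


def mc_dropout_predict(x, w_hidden, w_out, p_drop=0.5, n_samples=50, seed=0):
    """Average several stochastic passes; return (mean probs, entropy)."""
    rng = random.Random(seed)
    samples = [stochastic_forward(x, w_hidden, w_out, p_drop, rng)
               for _ in range(n_samples)]
    n_classes = len(samples[0])
    mean = [sum(s[c] for s in samples) / n_samples for c in range(n_classes)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return mean, entropy


# Hypothetical toy network: 4 inputs, 16 hidden units, 3 classes.
init_rng = random.Random(42)
w_hidden = [[init_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(16)]
w_out = [[init_rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(3)]
x = [0.2, -1.0, 0.5, 0.3]

mean_probs, entropy = mc_dropout_predict(x, w_hidden, w_out)
needs_review = entropy > 0.8  # assumed threshold, would need calibration
print(mean_probs, entropy, needs_review)
```

In a real setup the weights would come from a network trained with dropout, and the threshold would be chosen on held-out data.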

- Hamiltonian Monte Carlo (HMC), scalable versions

- recent work: https://papers.nips.cc/paper/6117-bayesian-optimization-with-robust-bayesian-neural-networks.pdf

- also applied to hyperparameter estimation in ResNets

- an overview of uncertainty in neural nets: <https://arxiv.org/abs/1909.09884>

- another good overview : <https://papers.nips.cc/paper/7141-what-uncertainties-do-we-need-in-bayesian-deep-learning-for-computer-vision.pdf>

- "Our model based on DenseNet can process a 640×480 resolution image in 150ms on a NVIDIA Titan X GPU. The aleatoric uncertainty models add negligible compute. However, epistemic models require expensive Monte Carlo dropout sampling. For models such as ResNet, this is possible to achieve economically **because only the last few layers contain dropout**. Other models, like **DenseNet**, require the entire architecture to be sampled. This is difficult to parallelize due to GPU memory constraints, and often results in a **50x slow-down** for 50 Monte Carlo samples."

- Ensemble methods (train an ensemble of networks and use the variance of their outputs; the disadvantage is the cost of multiple training runs, the advantages are scalability and simplicity)

- good baseline: "Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles"

- "We propose an alternative to Bayesian NNs that is **simple to implement, readily parallelizable, requires very little hyperparameter tuning**, and yields high quality predictive uncertainty estimates. Through a series of experiments on classification and regression benchmarks, we demonstrate that our method produces well-calibrated uncertainty estimates which are **as good or better than approximate Bayesian NNs**."

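The ensemble idea above (several independently trained members, mean prediction plus variance across members) can be sketched as follows. This is a toy illustration under assumptions, not the method of the cited paper in full: the members are logistic-regression models instead of deep networks, the two-cluster data set is synthetic, and the helper names `train_member` and `ensemble_predict` are hypothetical. Only the ensembling step is shown; the other training choices of the deep ensembles paper are left out.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_member(data, seed, epochs=20, lr=0.5):
    """Train one member from its own random init and its own data order."""
    rng = random.Random(seed)
    n_features = len(data[0][0])
    w = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
    b = rng.gauss(0.0, 1.0)
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
            grad = p - y  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * grad * xj for wj, xj in zip(w, x)]
            b -= lr * grad
    return w, b


def ensemble_predict(members, x):
    """Return mean probability and variance across ensemble members."""
    probs = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
             for w, b in members]
    mean = sum(probs) / len(probs)
    variance = sum((p - mean) ** 2 for p in probs) / len(probs)
    return mean, variance


# Synthetic two-class data: clusters around (-1, -1) and (1, 1).
data_rng = random.Random(0)
data = [([data_rng.gauss(c, 0.5), data_rng.gauss(c, 0.5)], y)
        for y, c in ((0, -1.0), (1, 1.0)) for _ in range(20)]

# Members are independent, so training is trivially parallelizable.
members = [train_member(data, seed) for seed in range(5)]
for point in ([1.0, 1.0], [0.0, 0.0], [-1.0, -1.0]):
    print(point, ensemble_predict(members, point))
```
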
* Interpretability via the iNNvestigate or Captum packages (heat maps of the image regions responsible for the output):

- a heat map may reveal whether the classifier relies on artificial "turked" information in the image (e.g., repetitive text signatures)

- an example of COVID X-ray deep learning work that uses these packages in a simple way (from FH Aachen; they are nearby, and we could contact them to get in touch with Uniklinik Aachen):

- DeepCOVIDExplainer: Explainable COVID-19 Predictions Based on Chest X-ray Images, https://arxiv.org/abs/2004.04582

- https://github.com/rezacsedu/DeepCOVIDExplainer

- https://github.com/BioXAI/DeepCOVIDExplainer

- example of package usage : https://github.com/BioXAI/DeepCOVIDExplainer/blob/master/noteboks/Decision_Visualization_GradCAM_LRP_ResNet18.ipynb
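To make the heat-map idea concrete without depending on either package, here is a small sketch of occlusion sensitivity, a simpler attribution technique than the Grad-CAM and LRP methods named in the linked notebook. Everything in it is an assumption for illustration: `toy_score` is a hypothetical stand-in classifier that deliberately looks only at the top-left image corner, mimicking a model that relies on an artifact such as burned-in text.

```python
def occlusion_map(image, score_fn, patch=2, baseline=0.0):
    """Heat map of score drops when each patch is replaced by `baseline`."""
    height, width = len(image), len(image[0])
    reference = score_fn(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy, keep input intact
            for r in rows:
                for c in cols:
                    occluded[r][c] = baseline
            drop = reference - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop  # large drop = region matters
    return heat


def toy_score(image):
    """Hypothetical classifier score using only the top-left 2x2 corner."""
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0


image = [[1.0] * 6 for _ in range(6)]
heat = occlusion_map(image, toy_score)
hottest = max(((r, c) for r in range(6) for c in range(6)),
              key=lambda rc: heat[rc[0]][rc[1]])
for row in heat:
    print(row)
```

If the hot region sits on a corner label rather than on the lungs, that is a hint that the classifier has learned a shortcut.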

# Past Meetings

- [Monday, 04 May 2020](https://gitlab.version.fz-juelich.de/codiMD/GzaHZYfTSJmHnGZfSuugFw)

- Further relevant info (to be digested into the wiki): [Notes Meeting 08.05, COVID Project Discussion and First Tests](https://gitlab.version.fz-juelich.de/codiMD/60YzfPDYR9-RwWVVtw0HUQ#COVID-X-Ray-Deep-Learning)

# Relevant links

- Radiology assistant from MILA (the same people who provide the dataset): <https://josephpcohen.com/w/chester-the-ai-radiology-assistant/>, <https://mlmed.org/tools/xray/>

- one feature worth adopting from it is out-of-distribution detection, where the system signals that an incoming image is "too far" from the data the model was trained on
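One simple way to express "too far from the training data" is a distance score in feature space with a threshold fitted on the training set. The sketch below is an assumption for illustration and does not describe how the MILA tool implements its check: the feature vectors are synthetic, and `fit_reference`, `ood_score` and `fit_threshold` are hypothetical helper names. In practice the features would come from an intermediate layer of the trained network.

```python
import math
import random


def fit_reference(train_features):
    """Per-dimension mean and standard deviation of training features."""
    n = len(train_features)
    dims = len(train_features[0])
    mean = [sum(f[d] for f in train_features) / n for d in range(dims)]
    std = []
    for d in range(dims):
        var = sum((f[d] - mean[d]) ** 2 for f in train_features) / n
        std.append(math.sqrt(var) or 1.0)  # avoid division by zero
    return mean, std


def ood_score(features, mean, std):
    """Euclidean norm of the per-dimension z-scores."""
    return math.sqrt(sum(((x - m) / s) ** 2
                         for x, m, s in zip(features, mean, std)))


def fit_threshold(train_features, mean, std, quantile=0.95):
    """Score below which `quantile` of the training samples fall."""
    scores = sorted(ood_score(f, mean, std) for f in train_features)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


# Synthetic stand-in for features extracted from training images.
rng = random.Random(0)
train_features = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]

mean, std = fit_reference(train_features)
threshold = fit_threshold(train_features, mean, std)

typical = [0.0, 0.0, 0.0]
unusual = [10.0, 10.0, 10.0]
print(ood_score(typical, mean, std) > threshold)  # False
print(ood_score(unusual, mean, std) > threshold)  # True
```

An image whose score exceeds the threshold would be reported as out-of-distribution instead of being given a diagnosis.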