The work investigates the use of two glottal-flow-derivative-based image representations of the input signal with a deep learning model built on n-dilated (nD) inception layers to assign optimal labels. The authors propose an nD inception-layer-based adversarial pathological response (APR) net. The model is trained separately on the two image databases in an adversarial manner, and a common test image is then applied to both networks. The results show mean accuracies of 96.82%, 96.36%, and 99.35% for the glottal inverse filtering with extended Kalman filter-Morse scalogram (GIFEKF-MS) APRNet, the glottal inverse filtering with extended Kalman filter-spectrogram (GIFEKF-S) APRNet, and the proposed APR fusion net, respectively, on the VOice ICar fEDerico II (VOICED) dataset, and mean accuracies of 95.67%, 93.27%, and 99.04% for the GIFEKF-MS APRNet, the GIFEKF-S APRNet, and the proposed APR fusion net, respectively, on the Saarbrucken voice database (SVD) dataset.
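The summary above does not state how the outputs of the two networks are combined in the fusion net. Purely as an illustration of what combining two branch outputs can look like, the sketch below uses a generic late-fusion rule that takes a weighted average of the per-class probabilities. The averaging rule, the function names `softmax` and `fuse_predictions`, the `weight_a` parameter, and the example scores are all assumptions made for this sketch and are not taken from the paper.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(logits_a, logits_b, weight_a=0.5):
    """Hypothetical late fusion: weighted average of two branches' class probabilities.

    This is an assumed rule for illustration, not the fusion used in the paper.
    Returns the index of the winning class and the fused probability vector.
    """
    if len(logits_a) != len(logits_b):
        raise ValueError("both branches must score the same number of classes")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    probs_a = softmax(logits_a)
    probs_b = softmax(logits_b)
    fused = [weight_a * a + (1.0 - weight_a) * b for a, b in zip(probs_a, probs_b)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Made-up scores for one test image from a scalogram branch and a spectrogram branch.
scalogram_logits = [2.0, 0.5]
spectrogram_logits = [0.2, 0.4]
label, fused = fuse_predictions(scalogram_logits, spectrogram_logits)
print(label, fused)
```

In this made-up example the two branches disagree, and the fused decision follows the more confident branch. Whether the reported gain of the fusion net comes from a rule of this kind is something the manuscript itself would need to specify.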