## Summary

Voxel-wise gray matter (GM) probability maps obtained from T1 weighted magnetic resonance imaging (MRI) were used to train a multi-class support vector machine (SVM) in order to discriminate individuals with prabable mild cognitive impairment (MCI) due to Alzheimer's disease (AD), patients with probable AD in the manifest stage, and elderly controls. Confounds of age and total intracranial volume were reduced by kernel linear regression.

## Methods

### Overview

The proposed and similar methods previously worked well with T1 weighted images such as MPRAGE, MP2RAGE, and others. The origin of the T1 weightet images was initially set to the center of the image in order to get a decent initial guess for the rigid body registration, which was performed by SPM8. The VBM8 toolbox was used to perform the segmentation into GM, white matter, and cerebro-spinal fluid. Only the GM segment was used for the classification. Structural images were registered to the IXI550 template provided with the VBM8 toolbox and GM density maps were modulated to account for non-linear deformations. This procedure was applied to all training and test data, taking approximately one hour per image on a single core. Note, that the computation of all subsequent steps, including computation of the kernel matrix, feature selection, and training of the classifier was much faster than the image pre-processing.

For this wiki we provide scripts to pre-process the images, to obtain adjusted input features, and perform the classification. The segmentation step requires Matlab, SPM8, and VBM8. The remaining processing, such as the correction for covariates and classification should be straight forward and can be performed on many software systems, such as Python, Java, C/C++, and Matlab. We provide general guidence and easy-to-use Matlab functions for all critical steps.

#### Software Requirements

In order to perform the pre-processing, following software is required:

• Matlab 2013a (other versions could work as well)
• SPM8
• VBM8

Covariate correction and classification was performed in matlab using external libraries, but can also be performed in other programming languages. In general, following functionality is needed.

• libriary to load Nifti images (e.g. niftilib for C/C++)\footnote{SPM8 has the required tools on-board}
• function to perform matrix-matrix multiplications (part of Matlab core)
• SVM solver (e.g. libsvm 3.14) for classification

Because of the high dimensionality of the data, it is possible that not all the data fits into the computer's memory. Typically, one imaeg occupies 16MB, which means that a computer that is equipped with 16GB RAM, around 1000 images can be load. Effectively it is even less, beacause other programs consume additional memory. However, this is not a limiting factor, as the regression and classification can be performed in kernel space, which is more efficient in terms of memory cosumption (for the given problem).

For the remainder of this document, we describe the pre-processing, adjustment for covariates, and classification using Matlab and appropriate libraries. We give sufficient details in order to perform regression and cleassification in othe programming environments as well.

### Detailed depiction of computation steps

Execution of the code was tested with Matlab 2013a (The MathWorks Inc.), SPM8\ (revision number 5236) , VBM8 (r435), and libsvm 3.17. Here, we provide code for the registration and segmentation using SPM8/VBM8. The remaining steps essentially involve loading of imaging data and performing matrix-matrix operations which can be performed in many programming languages. The popular libsvm solver provides many interfaces, such as for Matlab, Python, Java, or C++.

#### Segmentation of tissue types (GM and WM)

Depending on the data set, the initial position of the head might be non-overlapping in which case the registration algorithm fails. Therefore, the origin must be set in a suitable way. A simple method is to set it to the center of the image, as this is an easy and relatively reliable initial guess. The for coregistration and segmentation step described below include also the option to set the origin to the image center.

The segmentation must be performed for every input image. However, this step does not always produce satisfactory results. Usually, faulty segmentations is caused by bad registration, which in turn might be caused due to inapropriate initial guess of the registration. Users should manually check the segmentation and registration to the template fullfile(spm('dir'),'canonical','single_subj_T1.nii,1');. The output GM data of the subjects is stored as m0wrp1IMAGE.nii, where IMAGE is the base name of the input image.

SPM8 provides the functions spm_read_vols and spm_vol to load parts or the entirety of an image series. Suppose that images is a cell array of images, then we obtain the data matrix of voxel values as detailed below.


% images: cell array of images
images = { '/path/to/images/subject_01.nii'
'/path/to/images/subject_01.nii'
... and so on
'/path/to/images/subject_99.nii' };

% hdr:   create handles to images on file system
%        this is particularely useful for display purposes
hdr    = spm_vol(images);

% dim: image dimensions (in voxel)
dim    = hdr{1}.dim;

% X:     data matrix
%        load data, permute and reshape to obtain a data matrix
%        of dimensions N times prod(dim)
X      = reshape(permute(spm_read_vols(cell2mat(hdr)),[4 1 2 3]),[N prod(dim)]);

% K:     kernel matrix of linear dot-products
K      = X*X';

Note, that this step can be done for multiple structural modalidies (e.g. GM and WM maps), but also for instance using parameter maps from functional activations. The only requirement ist that the images are in the same space and have the same dimensions and voxel space (i.e. images must be coregistered and sampled in the same space).

Feature selection Given data matrix $\mathbf{X}\in\mathbb{R}^{N\times D}$ and a label vector $\mathbf{Y}\in \{-1,+1\}^N$, a {\tt model} and the appropriate weight vector $\mathbf{w}$ is obtained according to Listing \ref{lst:libsvmmodel}.


% Veryfy that the correct 'svmtrain' function is used. It must be the one
%  provided by libsvm, not the one provided by the Bioinformatics Toolbox.
which('svmtrain');
% model:  trained libsvm model
model   = svmtrain(Y,[(1:N)' X*X'],sprintf('-s 0 -t 4 -q -c \%f',1.0));
% w:      weight image
w       = X(model.SVs,:)'*model.sv_coef;

The magnitude of the elements of the weight vector $\mathbf{w}$ are sometimes used as proxy to rank importance of input dimensions. However, this does not take into account the variance of the data. It is better to compare the observed weights with the respective null-distribution obtained by randomly permuting the labels. This procedure can be approximated analytically, as shown by Gaonkar et al. (2013). The paper provides implementation details and their code can be downloaded here.

Regression and classificationis based on the matrix of the pair-wise dot-products (linear kernel). Given the data matrix $X\in\mathbb{R}^{N\times D}$, where N is the number of examples and D the number of input features, the linear kernel is $\left(k_{ij}\right)=\mathbf{K}=\mathbf{X}\mathbf{X}^T\in\mathbb{R}^{N\times N}$. The squared exponential radial basis function (RBF) kernel function is defined as $k_{\mathrm{RBF}}(\mathbf{x}_i,\mathbf{x}_j)=\exp(-\gamma||\mathbf{x}_i-\mathbf{x}_j||^2)$. Note, that $||\mathbf{x}_i-\mathbf{x}_j||=\mathbf{x}_i^T\mathbf{x}_i-2\cdot\mathbf{x}_i^T\mathbf{x}_j+\mathbf{x}_j^T\mathbf{x}_j=k_{ii}-2\cdot k_{ij}+k_{jj}$ and thus can expressed in terms of linear dot-products without requiring the original data matrix $\mathbf{X}$.

Adjusting for effects of age is done by estimating the effect of age in feature space $\Phi: \mathbb{R}^D\mapsto\mathbb{R}^P$ and subtracting the estimate from the observation, leaving a residual that is explained by the effect of interest (the disease) and some unkown and independent error. The idea behind the processing stages is depicted in Figure 1.

Listing kernelLinearDetrend.m implements robust linear kernel regression. That is, that the function can deal with rank deficient designs. It also provides a means to specify a sub-set of the data as training set to estimate the parameters on. We recommend to use healthy controls only. When doing cross-validation tests, it is important not to include the left out subject(s) in the estimation of parameters.

## Discussion

This (and similar) methods are relatively easy to apply and performed well on a wide range of data sets involving neuro-degenerative diseases. The method performed well in the CADDementia competition terms of average AUC, but only moderate in terms of accuracy. Improving the prior weights for the classes is adviced, in order to improve classification performance. With very little effort, data typically used for voxel-based morphometry can be used to perform regression and classification, rather than mass-univariate tests. This can be useful to gain a better understanding of spatially distributed group differences. In addition, classification accuracy or AUC is a useful marker of how distinct two diseases states are. The application to statistical maps of fMRI experiments is streight forward as well, although the inter-subject variance is significantly larger.

...

## Contact

For questions regarding the usage of the code, or for bug reports, please contactt Ahmed Abdulkadir. University of Freiburg, Department of Computer Science, ahmed [dot] abdulkadir [at] cs [dot] uni-freiburg [dot] de

## Code

### kernelLinearDetrend.m


% Kd = kernelLinearDetrend(K,X,IDX,LMBD) linearly detrends kernel K using the
%  regressors in X. The logical array IDX indicates training set used to
%  estimate the parameters for the detrend. The parameter LMBD is the ridge
%  parameter that biases the solution towards zero in order to reduce
%  overfitting.
%
%  Usage Example:
%    X = [ones(100,1) [linspace(0,20,40)'; 20*rand(20,1) ;linspace(0,20,40)']];
%    Y = [X*[1 2]' [ones(50,8); -ones(50,8)] 0.6*randn(100,10)];
%    K = Y*Y';
%    idx = true(100,1);
%    idx(41:60) = false;
%    lmbd = 0.01; % ridge parameter
%    Kd = kernelLinearDetrend(K,X,idx,lmbd);
%    figure(1);
%    subplot(1,2,1); imagesc(K);axis image;
%    subplot(1,2,2); imagesc(Kd); axis image
%
%

function [Kd,R] = kernelLinearDetrend(K,X,idx,lmbd)

% lmbd: ridge parameter
if ~exist('lmbd','var') || isempty(lmbd); lmbd = 0; end

% X: detrending matrix
% !! the 'gather' function is required because 'pinv.m' is not compatible
% with gpuArrays (as of R2013a)
X = gather(X);

% N: number of exapmles
N = size(K, 1);

% check if idx is defined and take all examples as training otherwise
if ~exist('idx','var') || isempty(idx),
idx = true(N,1);
else
% idx: assure logical array
idx = logical(idx);
end

% assert compatible dimensions
assert(size(K, 2) == N, 'Kernel matrix K must be square.');
assert(size(X, 1) == N, 'size(X,1) must be equal to size(K,1).');
assert(numel(idx) == N, 'numel(idx) must be equal to size(K,1).');

% training kernel including all examples
if nargout > 1
[Kd,R] = doKernelDetrending(K,X,idx,lmbd);
else
Kd = doKernelDetrending(K,X,idx,lmbd);
end

end

function [Kd,R] = doKernelDetrending(K,X,idx,lmbd)

% N: number of elements
N = numel(idx);

% p: number of columns in X
p = size(X,2);

% R:  residual-forming matrix
pinvX     = zeros(p,N);
if lmbd>0,
pinvX(:,idx) = (X(idx,:)'*X(idx,:) + lmbd*eye(p))^-1*X(idx,:)';
else
pinvX(:,idx) = pinv(gather(X(idx,:)));
end
R = (speye(N) - X*pinvX);

% compute detrended kernel
Kd = R*K*R';

end

### CoregSegmentVBM8.m


%% Coregister and Segment Structural MRI Images
%
%  This function requires SPM8 (r5236) and VBM8 (r435).
%
%  $Id: CoregSegmentVBM8.m 281 2014-07-18 08:23:39Z ahmed.abdulkadir@cs.uni-freiburg.de$
%
%  CoregSegment(T1imagefile,doCoreg,doSetOrigin)
%  Inputs:
%   T1imagefile : file name of structural T1 image to be processed
%   doCoreg     : logical variable that determines whether coregistration
%                 is done or not
%   doSetOrigin : logical variable that deteremines wheter the origin is set
%                 to the center of the image or not
%
%  Outputs:
%   (realigned and segmented images output to file system)

function CoregSegmentVBM8(T1,doCoreg,doSetOrigin)

% set default to no coregistration and no reset of the origin
if ~exist('doCoreg',    'var'),     doCoreg = false; end
if ~exist('doSetOrigin','var'), doSetOrigin = false; end
% make sure that the variables are logical
assert(islogical(doCoreg),    'variable doCoreg must be logical');
assert(islogical(doSetOrigin),'variable doSetOrigin must be logical');

% check if SPM is available
%spm('defaults', 'FMRI'); spm_jobman('initcfg');
% check if desired image file exists
assert(exist(T1,'file') > 0, ['ERROR: desired T1 image file does '         ...
'not exist %s'], T1);
% set reference (fixed) image
reference = fullfile(spm('dir'),'canonical','single_subj_T1.nii,1');
assert(exist(reference(1:end-2),'file') > 0, ['ERROR: desired reference'   ...
' image does'...
' not exist']);

% input image
[T1pth,T1fname,~] = fileparts(T1);

% reset origin if requested
if doSetOrigin,
setOriginToImageCenter(T1);
end

jobs = repmat({ defineSegmentJobVBM8(T1,doCoreg) },1,1);
% run SPM batch
spm_jobman('serial', jobs,'',{});

% move resulting files to subfolder
VBM8Path        = fullfile(T1pth,'VBM8');
T1_seg8          = fullfile(T1pth,[T1fname '_seg8.mat']);

mkdir(VBM8Path);
movefile(T1_seg8,VBM8Path);
movefile(fullfile(T1pth,['m0wrp1' T1fname '.nii']),VBM8Path);
movefile(fullfile(T1pth,['m0wrp2' T1fname '.nii']),VBM8Path);
movefile(fullfile(T1pth,['m0wrp3' T1fname '.nii']),VBM8Path);
movefile(fullfile(T1pth,['y_r' T1fname '.nii']),VBM8Path);
movefile(fullfile(T1pth,['iy_r' T1fname '.nii']),VBM8Path);
movefile(fullfile(T1pth,['jac_wrp1' T1fname '.nii']),VBM8Path);
movefile(fullfile(T1pth,['wmr' T1fname '.nii']),VBM8Path);
movefile(fullfile(T1pth,['m' T1fname '.nii']),VBM8Path);
movefile(fullfile(T1pth,['p' T1fname '_seg8.txt']),VBM8Path);

% here, we clear some files, that are not of interest
delete(fullfile(VBM8Path,['y_r' T1fname '.nii']));
delete(fullfile(VBM8Path,['iy_r' T1fname '.nii']));
delete(fullfile(VBM8Path,['jac_wrp1' T1fname '.nii']));
delete(fullfile(VBM8Path,['wmr' T1fname '.nii']));
% _______________________________________________________________________ %

% _______________________________________________________________________ %
function matlabbatch = defineSegmentJobVBM8(T1,doCoreg)
%-----------------------------------------------------------------------
% Job configuration created by cfg_util (rev $Rev: 4252$)
%-----------------------------------------------------------------------
matlabbatch{1}.cfg_basicio.cfg_named_file.name = 'sMRI Images (T1)';
matlabbatch{1}.cfg_basicio.cfg_named_file.files = { {T1} }';
% Coregistration T1->MNI
if doCoreg,
matlabbatch{end+1}.spm.spatial.coreg.estimate.ref = {                  ...
fullfile(spm('dir'),'canonical','single_subj_T1.nii,1') };
matlabbatch{end}.spm.spatial.coreg.estimate.source(1) = cfg_dep;
matlabbatch{end}.spm.spatial.coreg.estimate.source(1).tname = 'Source Image';
matlabbatch{end}.spm.spatial.coreg.estimate.source(1).tgt_spec{1}(1).name = 'class';
matlabbatch{end}.spm.spatial.coreg.estimate.source(1).tgt_spec{1}(1).value = 'cfg_files';
matlabbatch{end}.spm.spatial.coreg.estimate.source(1).tgt_spec{1}(2).name = 'strtype';
matlabbatch{end}.spm.spatial.coreg.estimate.source(1).tgt_spec{1}(2).value = 'e';
matlabbatch{end}.spm.spatial.coreg.estimate.source(1).sname = 'Named File Selector: sMRI Images (T1)(1) - Files';
matlabbatch{end}.spm.spatial.coreg.estimate.source(1).src_exbranch = substruct('.','val', '{}',{1}, '.','val', '{}',{1});
matlabbatch{end}.spm.spatial.coreg.estimate.source(1).src_output = substruct('.','files', '{}',{1});
matlabbatch{end}.spm.spatial.coreg.estimate.other = {''};
matlabbatch{end}.spm.spatial.coreg.estimate.eoptions.cost_fun = 'nmi';
matlabbatch{end}.spm.spatial.coreg.estimate.eoptions.sep = [4 2];
matlabbatch{end}.spm.spatial.coreg.estimate.eoptions.tol = [0.02 0.02  ...
0.02 0.001 0.001 0.001 0.01 0.01 0.01 0.001 0.001 0.001];
matlabbatch{end}.spm.spatial.coreg.estimate.eoptions.fwhm = [7 7];
end
% Segmentation of T1
matlabbatch{end+1}.spm.tools.vbm8.estwrite.data(1) = cfg_dep;
matlabbatch{end}.spm.tools.vbm8.estwrite.data(1).tname = 'Volumes';
matlabbatch{end}.spm.tools.vbm8.estwrite.data(1).tgt_spec{1}(1).name = 'class';
matlabbatch{end}.spm.tools.vbm8.estwrite.data(1).tgt_spec{1}(1).value = 'cfg_files';
matlabbatch{end}.spm.tools.vbm8.estwrite.data(1).tgt_spec{1}(2).name = 'strtype';
matlabbatch{end}.spm.tools.vbm8.estwrite.data(1).tgt_spec{1}(2).value = 'e';
matlabbatch{end}.spm.tools.vbm8.estwrite.data(1).sname = ['Named File '    ...
'Selector: sMRI Images (T1,FLAIR)(1) - Files'];
matlabbatch{end}.spm.tools.vbm8.estwrite.data(1).src_exbranch =            ...
substruct('.','val', '{}',{1}, '.','val', '{}',{1});
matlabbatch{end}.spm.tools.vbm8.estwrite.data(1).src_output =              ...
substruct('.','files', '{}',{1});
spmversion = spm('ver');
switch spmversion
case 'SPM8',
matlabbatch{end}.spm.tools.vbm8.estwrite.opts.tpm = {
fullfile(spm('dir'),'toolbox','Seg','TPM.nii') };
case 'SPM12b',
spmversion);
otherwise,
error('Unknown or incompatible SPM version: %s\n',spmversion);
end
matlabbatch{end}.spm.tools.vbm8.estwrite.opts.ngaus = [2 2 2 3 4 2];
matlabbatch{end}.spm.tools.vbm8.estwrite.opts.biasreg = 0.0001;
matlabbatch{end}.spm.tools.vbm8.estwrite.opts.biasfwhm = 60;
matlabbatch{end}.spm.tools.vbm8.estwrite.opts.affreg = 'mni';
matlabbatch{end}.spm.tools.vbm8.estwrite.opts.warpreg = 4;
matlabbatch{end}.spm.tools.vbm8.estwrite.opts.samp = 3;
matlabbatch{end}.spm.tools.vbm8.estwrite.extopts.dartelwarp = 1;
matlabbatch{end}.spm.tools.vbm8.estwrite.extopts.ornlm = 0.7;
matlabbatch{end}.spm.tools.vbm8.estwrite.extopts.mrf = 0.15;
matlabbatch{end}.spm.tools.vbm8.estwrite.extopts.cleanup = 1;
matlabbatch{end}.spm.tools.vbm8.estwrite.extopts.print = 1;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.GM.native = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.GM.warped = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.GM.modulated = 2;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.GM.dartel = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.WM.native = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.WM.warped = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.WM.modulated = 2;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.WM.dartel = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.CSF.native = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.CSF.warped = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.CSF.modulated = 2;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.CSF.dartel = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.bias.native = 1;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.bias.warped = 1;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.bias.affine = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.label.native = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.label.warped = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.label.dartel = 0;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.jacobian.warped = 1;
matlabbatch{end}.spm.tools.vbm8.estwrite.output.warps = [1 1];

% _______________________________________________________________________ %
function setOriginToImageCenter(fnames)
% check if argumant has been given
if ~exist('fnames','var');
fnames = spm_select([1 Inf],'image','Select 3D Images ..','');
end
% transform char array of images into cell array of char
if ischar(fnames),
fnames = cellstr(fnames);
end
% N: number of images
N = size(fnames,1);
% apply translation
for i=1:N, % for all images
% V: memory-mapped volumes
V = spm_vol(fnames{i});
% center.vx: image center in voxel coordinates
center.vx = [V.dim(1)/2 V.dim(2)/2 V.dim(3)/2 1];
% center.vx: image center in world coordinates
center.mm = (V.mat*center.vx')';
% apply translation
mat = spm_matrix(-center.mm(1:3))*V.mat;
% _______________________________________________________________________ %