matlab - What is the correct order of the prior vector in fitensemble?
When using MATLAB's fitensemble to learn a classifier, I can specify the parameter prior and the parameter classnames.
Does the order of the elements in both vectors have to be the same? And what is the standard order for true/false classes?
To be more specific: assume the true class has prior probability 0.6 and the false class 0.4. Should I use:
ens = fitensemble(...,'prior',[0.6 0.4]) or
ens = fitensemble(...,'prior',[0.4 0.6]) or
ens = fitensemble(...,'prior',[0.4 0.6],'classnames',[true false]) or
ens = fitensemble(...,'prior',[0.4 0.6],'classnames',[false,true]) ?
I cannot find the answer in the documentation.
The documentation of perfcurve is more specific:
prior: Either a string or an array with 2 elements. It represents the prior probabilities for the positive and negative class, respectively. The default is 'empirical', that is, perfcurve derives the prior probabilities from the class frequencies. If set to 'uniform', perfcurve sets all prior probabilities equal.
ens = fitensemble(x,y,method,nlearn,learners) creates an ensemble model that predicts responses to data. The ensemble consists of models listed in learners.
First part
You have to give prior in alphabetical (sorted) order of the class labels.
So if the labels are ['a','b'], use 'prior',[p(a) p(b)],
or if the labels are ['true','false'], use 'prior',[p(false) p(true)],
or if the labels are [-1 10], use 'prior',[p(-1) p(10)].
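The ordering rule above can be applied mechanically instead of by eye. The following is a minimal sketch, assuming the labels are a cell array of character vectors; the variable names (names, probs, prior) are hypothetical and only serve the illustration.

```matlab
% Class labels in the order you happen to think of them (hypothetical).
names = {'true','false'};
% Prior probabilities in that same order: p(true) = 0.6, p(false) = 0.4.
probs = [0.6 0.4];

% Sort the labels alphabetically and remember the permutation.
[sortedNames, order] = sort(names);

% Reorder the priors the same way, so they line up with the sorted labels.
prior = probs(order);

disp(sortedNames)   % labels in sorted order
disp(prior)         % priors in the matching order
```

The resulting prior vector is what would be passed as 'prior' when no 'classnames' option is given.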
Second part
About classnames: this option is used so that you can call fitensemble with fewer classes than there are in the data.
Imagine you have 4 classes a, b, c, d, and y is like:
y = [a;a;b;d;b;a;c;a;a;a;d, ... ]; Now you may write 'classnames',['a';'b'] if you want fitensemble to use only 2 classes, which selects the same two classes as 'classnames',['b';'a'].
I know this is a late answer, but I hope it helps.
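To see which observations such a class selection leaves for training, the same filtering can be done by hand. This is only a sketch of the effect, not of what fitensemble does internally; the cell array y below holds just the listed part of the label vector, and keep and ySub are hypothetical names.

```matlab
% The listed part of the label vector, written as a cell array of labels.
y = {'a';'a';'b';'d';'b';'a';'c';'a';'a';'a';'d'};

% Logical mask of the observations whose label is one of the chosen classes.
keep = ismember(y, {'a','b'});

% Labels that remain after dropping classes c and d.
ySub = y(keep);

fprintf('%d of %d observations kept\n', sum(keep), numel(y));
disp(unique(ySub)')   % remaining class labels
```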
Example
I have used the 'fisheriris' data set, which has 3 classes (setosa, versicolor, virginica).
Because it has 150 cases with 50 of each class, I randomized the data and selected 100 samples.

    load fisheriris
    rng(12);                          % fix the seed so the shuffle is reproducible
    idx = randperm(size(meas,1));     % random permutation of the 150 rows
    meas = meas(idx,:);
    species = species(idx,:);
    meas = meas(1 : 100,:);           % keep the first 100 shuffled samples
    species = species(1 : 100,:);
    trueprior = [ sum(strcmp(species,'setosa')),...
                  sum(strcmp(species,'versicolor')),...
                  sum(strcmp(species,'virginica'))] / 100;

Here trueprior = [0.32,0.30,0.38] shows the true prior probabilities.
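The class frequencies can also be computed without typing each class name, which keeps them in sorted label order automatically. The following is a minimal sketch on a small hypothetical label vector (labels) standing in for species; classes, counts and empPrior are illustrative names.

```matlab
% Hypothetical labels standing in for the sampled species vector.
labels = {'virginica';'setosa';'versicolor';'setosa';'virginica'};

% unique returns the class names sorted alphabetically, plus an index
% vector that maps every observation to its class.
[classes, ~, idx] = unique(labels);

% Count the observations per class, in the same (sorted) order.
counts = accumarray(idx, 1);

% Relative frequencies as a row vector, usable as an empirical prior.
empPrior = (counts / numel(labels))';

disp(classes')
disp(empPrior)
```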
In the following code I have trained 3 ensembles with fitensemble. The first one uses the default options, so its prior probability is empirical (which is the same as trueprior). The second one is trained with prior set to trueprior and has the same results as the first (because trueprior is in alphabetical order of the class labels). The third one is trained with a non-alphabetical order and shows different results from the first two.

    % 1) default prior (empirical)
    ada1 = fitensemble(meas,species,'AdaBoostM2',20,'Tree');
    subplot(311)
    plot(resubLoss(ada1,'mode','individual'));
    title('Resubstitution error, default prior (empirical)');
    % 2) prior given in alphabetical order of the class labels
    ada2 = fitensemble(meas,species,'AdaBoostM2',20,'Tree','prior',trueprior);
    subplot(312)
    plot(resubLoss(ada2,'mode','individual'));
    title('Resubstitution error, prior in alphabetical order of class labels');
    % 3) prior given in a different (here: reversed) order
    ada3 = fitensemble(meas,species,'AdaBoostM2',20,'Tree','prior',trueprior(end:-1:1));
    subplot(313)
    plot(resubLoss(ada3,'mode','individual'));
    title('Resubstitution error, prior in random order');
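Besides comparing error curves, the order can be checked directly on a trained ensemble by reading back its ClassNames and Prior properties. The sketch below assumes the Statistics and Machine Learning Toolbox is installed; the prior values [0.2 0.3 0.5] are arbitrary and chosen only so that each entry is easy to tell apart.

```matlab
load fisheriris

% Train a small ensemble with a deliberately asymmetric prior (arbitrary values).
ens = fitensemble(meas, species, 'AdaBoostM2', 20, 'Tree', ...
                  'prior', [0.2 0.3 0.5]);

% ClassNames lists the classes in the order the model uses internally;
% Prior holds the prior probabilities in that same order.
disp(ens.ClassNames')
disp(ens.Prior)
```

Reading the two properties side by side shows which class each prior entry was assigned to.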
I also trained fitensemble with 2 classes using the classnames option:

    ada4 = fitensemble(meas,species,'AdaBoostM1',20,'Tree','classnames',...
        {'versicolor','virginica'});

As proof, AdaBoostM1 doesn't support more than 2 classes, but it works fine here with 2 classes.