## How to determine the optimal number of factors in a PLS model

As shown in the Y explained variance from cross validation that I provided, The Unscrambler gave 8 as the optimal number of factors. I would like to know how to determine the optimal number of factors with a calculation algorithm rather than by eye.

Thank you very much.

Xudong

No algorithm can automatically find the correct rank (or underlying structures) for every
data set. Cross validation and test set validation have been shown to be conservative methods in
this context. The Unscrambler® adds a small penalty constant to the validated residual
variance for each factor added to the model, because the number of factors at the absolute
minimum of the validated variance may not be the optimal number of factors to interpret as
systematic variation. A penalty of 1 percent of the residual variance after 0 factors,
i.e. after centering and scaling, is added for each factor to avoid being too optimistic. The
logic is (see attachment for clarity):
Aopt = 0
FOR factors 1:AMax
    Difference = (ResidualVariance(a) + a*0.01*ResidualVariance(0))
               - (ResidualVariance(a+1) + (a+1)*0.01*ResidualVariance(0))
    IF Difference > 0
        Aopt = Aopt + 1
    ELSE
        BREAK
    END
END
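
For anyone who wants to try the rule outside the software, here is a minimal Python sketch of the logic quoted above. It is not confirmed to reproduce The Unscrambler's recommendation. The function name `optimal_factors`, the `penalty` parameter, and the list layout (index `a` holds the validated residual variance after `a` factors, so index 0 is Factor-0) are assumptions made for illustration. The pseudocode does not define where `a` starts; the sketch assumes the first comparison is between 0 and 1 factors. The numbers in the example are made up.

```python
from typing import Sequence


def optimal_factors(residual_variance: Sequence[float], penalty: float = 0.01) -> int:
    """Pick the number of factors with the penalized residual variance rule.

    residual_variance[a] is the validated residual variance after a factors,
    so residual_variance[0] is the value after 0 factors (assumed layout).
    """
    base = residual_variance[0]
    # Each added factor costs `penalty` times the 0-factor residual variance.
    penalized = [rv + a * penalty * base for a, rv in enumerate(residual_variance)]

    a_opt = 0
    # Assumption: comparisons start at a = 0 (0 factors versus 1 factor).
    # Stop at the first step that does not lower the penalized variance.
    for a in range(len(penalized) - 1):
        if penalized[a] - penalized[a + 1] > 0:
            a_opt += 1
        else:
            break
    return a_opt


if __name__ == "__main__":
    # Made-up validated residual variances for factors 0..5.
    example = [1.0, 0.6, 0.4, 0.35, 0.345, 0.35]
    print(optimal_factors(example))
```

In this made-up example the raw minimum is at 4 factors, but the gain from the fourth factor (0.005) is smaller than the 1 percent penalty (0.01), so the rule stops at 3. Because the loop breaks at the first non-improving step, it can also stop before a later, deeper minimum.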

I have tested the Unscrambler method you provided. However, the results do not match The Unscrambler's recommendation. I have uploaded the Y residual variance for Brix, dry matter and flesh color. Could you help me check how The Unscrambler determines the optimal number of factors for these data?

Thanks

How does The Unscrambler calculate the Y total residual variance when the number of factors is zero (Factor-0)?

Thanks.

Xudong