F-statistic and degree of class separation

Hello,

Is it possible to perform variable selection with the Unscrambler X program? I am working with ICP-MS data and want to calculate F-ratios for each element, then use the degree of class separation to refine the set of elements I'm using in my PCA plots, with the aim of maximising separation between classes.

Any information would be helpful,

Thank you,
Josh

Thread Reply

  1. 28/02/2020 at 12:35 pm

    Variable selection can be performed in Unscrambler by means of an uncertainty test, which is available when the selected validation method is cross validation. When setting up your PCA model, go to the ‘Validation’ tab and tick the ‘Uncertainty test’ option, as shown in the attached figure. Note that ‘Cross validation’ must be selected as the validation method.

    The results will then include a table with a p-value for each variable (in your case, each element) under the PCA model node that appears in the Project Navigator (on the left of the Unscrambler interface): PCA->Validation->p-values for Loadings. Elements with a p-value below 0.05 can be considered significant for the PCA model.

    Note that in your case, since you want to find the elements that best separate the classes by which you color the samples in the PCA scores plot (using Sample Grouping), it might be a better idea to perform the uncertainty test on a supervised classification model.

    For this, build a partial least squares (PLS) regression model using the ICP-MS element data as your X and dummy variables as your Y, where 1 indicates class membership and 0 indicates no class membership. (For example, if your samples belong to one of two classes, ‘compliant’ or ‘not compliant’, your Y variable for PLS would be a single column with a ‘1’ in the rows corresponding to compliant samples and a ‘0’ in the rows corresponding to non-compliant samples. For 3 or more classes, you would need one column for each class.)

    Tick the ‘Uncertainty test’ option in the validation tab and build the model. From the resulting PLS model node, go to PLS->Validation->p-values for Beta Coefficients and select the elements with p<0.05 as significant. Then build a PCA model using only the selected elements.

    For PLS, significant variables are automatically marked in the PLS overview. From the loadings plot, it is possible to create a new column set for the marked variables (elements) by right-clicking and selecting ‘Create Range’.
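If the Y columns are prepared outside Unscrambler before import, the dummy coding described above could be sketched roughly as follows. This is only an illustration in plain Python: the function names `dummy_code` and `single_column` and the sample labels are hypothetical, not part of Unscrambler.

```python
def dummy_code(labels):
    """Return (classes, Y): one 0/1 column per class, columns in sorted class order."""
    classes = sorted(set(labels))
    Y = [[1 if label == c else 0 for c in classes] for label in labels]
    return classes, Y


def single_column(labels, positive):
    """Two-class case: a single column, 1 for the 'positive' class and 0 otherwise."""
    return [1 if label == positive else 0 for label in labels]


# Hypothetical class labels, one per sample (row).
labels = ["compliant", "not compliant", "compliant", "not compliant"]

classes, Y = dummy_code(labels)          # general case: one column per class
y = single_column(labels, "compliant")   # two classes: a single column is enough

print(classes)
print(Y)
print(y)
```

With two classes the second column of `Y` is just the complement of the first, which is why a single column is sufficient in that case.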

    Details on the uncertainty test can be found in the Unscrambler help menu: Help->Contents->Validation->Theory->Uncertainty testing with cross validation.
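For comparison with the uncertainty-test results, the per-element F-ratio mentioned in the question can also be computed outside Unscrambler. The sketch below assumes that "F-ratio" means the usual one-way ANOVA statistic (between-class mean square divided by within-class mean square); the function name `f_ratio`, the element names and the numbers are made up purely for illustration.

```python
def f_ratio(values, labels):
    """One-way ANOVA F-ratio for one variable: MS_between / MS_within."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n = len(values)      # total number of samples
    k = len(groups)      # number of classes
    grand_mean = sum(values) / n

    ss_between = 0.0
    ss_within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_between += len(members) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in members)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Made-up example: two elements measured on six samples from two classes.
labels = ["A", "A", "A", "B", "B", "B"]
data = {
    "Pb": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],  # classes well separated
    "Zn": [1.0, 5.0, 3.0, 2.0, 4.0, 6.0],  # classes overlap
}

f_values = {element: f_ratio(values, labels) for element, values in data.items()}

# Rank elements from most to least discriminating.
ranking = sorted(f_values, key=f_values.get, reverse=True)
print(f_values)
print(ranking)
```

A larger F-ratio means the class means differ more relative to the spread within each class, so elements at the top of the ranking are the more promising candidates to keep.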


    Attachment