pca
atscale.eda.pca.pca
Performs principal component analysis (PCA) on the numeric features specified. This is only supported for Snowflake at this time.
- Parameters:
- dbconn (Snowflake) – The database connection that pca will interact with
- data_model (DataModel) – The data model corresponding to the features provided
- pc_num (int) – The number of principal components to be returned from the analysis. Must be between 1 and the number of numeric features to be analyzed, inclusive.
- numeric_features (List[str]) – The query names of the numeric features to be analyzed via pca
- granularity_levels (List[str]) – The query names of the categorical features corresponding to the level of granularity desired in numeric_features
- if_exists (enums.TableExistsAction, optional) – The default action that pca takes when creating a table with a preexisting name. Does not accept APPEND or IGNORE. Defaults to ERROR.
- write_database (str) – The database that pca will write tables to. Defaults to the database associated with the given dbconn.
- write_schema (str) – The schema that pca will write tables to. Defaults to the schema associated with the given dbconn.
- Returns: A pair of Dicts, the first containing the PCs and the second containing their percent weights
- Return type: Tuple[DataFrame, DataFrame]
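
To make the two return values concrete, the following is a minimal local sketch of what "principal components" and "percent weights" usually mean, written with NumPy rather than Snowflake. It is not the atscale implementation: the function name `pca_sketch`, the covariance-based method, and the lack of feature scaling are all assumptions made for illustration. It also mirrors the documented range check on pc_num.

```python
import numpy as np


def pca_sketch(data, pc_num):
    """Hypothetical helper: return (components, percent_weights).

    data: 2-D array-like, one row per observation, one column per numeric feature.
    pc_num: number of principal components to keep.
    """
    x = np.asarray(data, dtype=float)
    n_features = x.shape[1]

    # Mirror the documented constraint: 1 <= pc_num <= number of numeric features.
    if not 1 <= pc_num <= n_features:
        raise ValueError(f"pc_num must be in [1, {n_features}], got {pc_num}")

    # Center each feature, then eigendecompose the covariance matrix.
    # (Assumption: no rescaling to unit variance is applied here.)
    centered = x - x.mean(axis=0)
    cov = np.atleast_2d(np.cov(centered, rowvar=False))
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back ascending

    # Reorder so the component explaining the most variance comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # One row per principal component; weights are percentages of total variance.
    components = eigvecs[:, :pc_num].T
    percent_weights = 100.0 * eigvals[:pc_num] / eigvals.sum()
    return components, percent_weights


if __name__ == "__main__":
    sample = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]]
    pcs, weights = pca_sketch(sample, pc_num=2)
    print(pcs)
    print(weights)
```

In this sketch the weights sum to 100 only when pc_num equals the number of features; with a smaller pc_num they show how much of the total variance the retained components explain.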