
2.5.0 release notes

This release of AI-Link continues the trend of optimizing operations for programmatic interaction with AtScale and expands the available methods of authentication:

  • Expanded CRUD operation support for dimensions and hierarchies, giving AI-Link full coverage for Create operations on AtScale objects
  • Enabled OAuth-based authentication for Azure AD users
  • Made the Connection object private, simplifying the customer experience by removing an object users must manage when using AI-Link
  • Beta features for semantic-layer-backed Data Science and Data Analytics operations via the new Stats module

Please refer to our API documentation for the latest syntax to use with AI-Link. See below for updates associated with this release.

Privatization of the Connection object

  • Connection object is now private, reducing the number of objects a user must manage in a notebook
  • User-facing functionality that previously existed in the Connection is now accessible via the Client, Project, or DataModel
  • Alternate authentication flow added to support users on Azure OAuth

New Python Helper Functions for Programmatic Interaction

  • CRUD operation support: added functions to create and update various objects in the semantic layer
  • Renaming DataModel functions: various methods in the DataModel have been renamed to reduce ambiguity
  • Renaming function parameters: reviewed all customer facing functions and standardized parameter names for ease of use

Non-Functional Updates

  • The AI-Link package is now hosted on PyPI, making it easier to install and manage
  • Renamed all instances of the phrase 'unpublished' to 'draft' to better align with broader AtScale terminology
  • Bug fixes addressing roleplaying and features with different key/value columns when joining objects to the semantic layer

Changelog for Syntax Updates

enums.py

NEW CLASS:

  • CheckFeaturesErrMsg
    • an enum for specifying the sort of error message to be displayed in standard feature checks.
    • this does not have direct customer use cases but is publicly visible.

connection.py

UPDATED CLASS:

  • Connection class has been made private and is now _Connection
    • this change will better align AI-Link with Client/Server design and reduce the
      number of objects a user must import

    • all customer facing functionality has been moved to the Client

client.py::Client

NEW FUNCTIONS:

  • get_connected_warehouses

    • gets metadata on all warehouses visible to the connected client
  • get_connected_databases

    • gets a list of databases the organization can access in the provided warehouse
  • get_connected_schemas

    • gets a list of schemas the organization can access in the provided warehouse and database
  • get_connected_tables

    • gets a list of tables the organization can access in the provided warehouse, database, and schema
  • get_table_columns

    • gets metadata on all columns in a given table
  • get_query_columns

    • gets all columns of a direct query as they are represented by AtScale
  • clone_project

    • clones the provided project
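
Taken together, these helpers form a drill-down from warehouse to table. The sketch below uses an invented stand-in client (FakeClient, with assumed return shapes) purely to illustrate how the calls chain; consult the API documentation for the real signatures:

```python
# Hypothetical stand-in for the AI-Link Client, used only to
# illustrate the drill-down pattern of the new metadata helpers.
# Return shapes here are assumptions, not the real API.
class FakeClient:
    def get_connected_warehouses(self):
        # Assumed shape: one metadata dict per visible warehouse.
        return [{"id": "wh1", "name": "Snowflake WH"}]

    def get_connected_databases(self, warehouse_id):
        return ["ANALYTICS"]

    def get_connected_schemas(self, warehouse_id, database):
        return ["PUBLIC"]

    def get_connected_tables(self, warehouse_id, database, schema):
        return ["SALES_FACT"]

client = FakeClient()
warehouse = client.get_connected_warehouses()[0]
database = client.get_connected_databases(warehouse["id"])[0]
schema = client.get_connected_schemas(warehouse["id"], database)[0]
tables = client.get_connected_tables(warehouse["id"], database, schema)
```

Each call narrows the scope of the previous one, so a notebook can discover writable tables without leaving Python.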

UPDATED FUNCTIONS:

  • __init__

    • removed optional parameter engine_port.

      • This is no longer needed as the information is available via an internal endpoint
    • If Azure OAuth is identified as the authentication method, the user will be prompted to retrieve a token from the browser

    • Added verify parameter to enable or disable certificate verification in the requests session

    • adjusted default values of jdbc_driver_class and jdbc_driver_path to None

  • select_project

    • Renamed optional parameter unpublished_project_id to draft_project_id

    • added optional parameter published_project_id

      • allows users to use select_project to get a Project object without a user prompt
    • added optional parameter include_soft_publish
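
As a rough illustration of why the new parameters matter, the invented stand-in below mimics the prompt-free path: supplying both IDs up front lets select_project resolve a Project without interaction. The internals shown are assumptions, not the real implementation:

```python
# Invented stand-ins sketching the prompt-free select_project path.
class FakeProject:
    def __init__(self, draft_id, published_id):
        self.draft_project_id = draft_id
        self.published_project_id = published_id

class FakeClient:
    def select_project(self, draft_project_id=None,
                       published_project_id=None,
                       include_soft_publish=False):
        if draft_project_id is None or published_project_id is None:
            # The real method would fall back to prompting the user.
            raise RuntimeError("would prompt the user interactively")
        return FakeProject(draft_project_id, published_project_id)

project = FakeClient().select_project(
    draft_project_id="draft-123",
    published_project_id="pub-456",
)
```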

REMOVED FUNCTIONS:

  • getters and setters for the atconn parameter
    • the connection is now private and is not meant for user interaction

data_model.py::DataModel

NEW FUNCTIONS:

  • get_dataset

    • returns the metadata for a dataset
  • update_query_dataset

    • allows users to update metadata of a query dataset
  • create_dimension

    • creates a new dimension in the data model
  • create_hierarchy

    • adds a new hierarchy to a dimension
  • add_level_to_hierarchy

    • adds a new level to a hierarchy
  • update_dimension

    • update the dimension's metadata
  • update_hierarchy

    • update the hierarchy's metadata
  • submit_atscale_query

    • submits custom query to AtScale data model
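
The new dimension helpers are designed to compose: a dimension holds hierarchies, and a hierarchy holds ordered levels. The stand-in below is invented to sketch that call pattern; parameter names are assumptions, so consult the API documentation for the real signatures:

```python
# Invented stand-in for DataModel, illustrating how the new CRUD
# helpers compose.  It only records the structure being built.
class FakeDataModel:
    def __init__(self):
        self.dimensions = {}

    def create_dimension(self, name):
        self.dimensions[name] = {}

    def create_hierarchy(self, name, dimension):
        self.dimensions[dimension][name] = []

    def add_level_to_hierarchy(self, name, hierarchy, dimension):
        self.dimensions[dimension][hierarchy].append(name)

model = FakeDataModel()
model.create_dimension("Date")
model.create_hierarchy("Calendar", dimension="Date")
model.add_level_to_hierarchy("Year", hierarchy="Calendar", dimension="Date")
model.add_level_to_hierarchy("Month", hierarchy="Calendar", dimension="Date")
```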

UPDATED FUNCTIONS:

  • get_dataset_names

    • added optional parameter include_unused to include datasets not used in the data model
  • The raise_multikey_warning optional parameter was added to the following functions. It controls whether a warning is raised when a multikey is found

    • DataModel.get_data
    • DataModel.get_data_direct
    • DataModel.get_data_jdbc
    • DataModel.get_data_spark_jdbc
    • DataModel.get_data_spark
    • DataModel.get_database_query
  • create_secondary_attribute

    • renamed parameter new_attribute_name -> new_feature_name
  • create_denormalized_categorical_feature

    • renamed parameter name -> new_feature_name
  • create_aggregate_feature

    • renamed parameter name -> new_feature_name
  • create_perspective

    • renamed parameter name -> new_perspective_name
  • bulk_operator

    • additional optional parameter error_limit added
      • Defaults to 5, the maximum number of similar errors to collect before abbreviating
  • autogen_semantic_layer

    • additional optional parameter default_aggregation_type added
      • The default aggregation type for numeric columns. Defaults to SUM
  • function get_data_spark renamed to get_data_spark_jdbc to be more accurate to function operation

    • Renamed parameter sparkSession -> spark_session to better fit AI-Link code standards
  • function get_data_spark_from_spark renamed to get_data_spark to be more accurate to function operation

    • Removed parameter dbconn
  • function writeback_spark renamed to writeback_spark_jdbc to be more accurate to function operation

  • function writeback_spark_to_spark renamed to writeback_spark to be more accurate to function operation

    • parameter table_name is now required
    • Added parameter schema
    • Added optional parameter database
    • Removed parameter alt_database_path
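
As a sketch of the reworked writeback_spark parameters (table_name now required, schema added, database optional), the invented stand-in below shows how a target table path could be assembled from them; the real method writes a Spark DataFrame back to the warehouse behind the semantic layer:

```python
# Invented stand-in for the 2.5.0 writeback_spark parameter shape.
# It only assembles the destination path; no Spark involved here.
def writeback_spark(dataframe, table_name, schema, database=None):
    # database is optional; skip it in the path when not provided.
    target = ".".join(p for p in (database, schema, table_name) if p)
    return f"writing {len(dataframe)} rows to {target}"

result = writeback_spark([{"x": 1}], table_name="preds",
                         schema="PUBLIC", database="ANALYTICS")
```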

bigquery.py::BigQuery

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

databricks.py::Databricks

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

iris.py::Iris

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

mssql.py::MSSQL

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

postgres.py::Postgres

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

redshift.py::Redshift

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

snowflake.py::Snowflake

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

synapse.py::Synapse

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

sql_connection.py::SQLConnection

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list
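
This rename on the SQLConnection base class is mirrored by every dialect class listed above. A minimal invented stand-in showing the new keyword in a call:

```python
# Invented stand-in for SQLConnection.execute_statements: it only
# records what it receives, whereas the real method runs each
# statement against the warehouse.
class FakeSQLConnection:
    def __init__(self):
        self.executed = []

    def execute_statements(self, statement_list):
        # 2.5.0: the keyword is statement_list (previously statements)
        self.executed.extend(statement_list)

conn = FakeSQLConnection()
conn.execute_statements(statement_list=[
    "CREATE TABLE t (x INT)",
    "INSERT INTO t VALUES (1)",
])
```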

sqlalchemy_connection.py::SQLAlchemyConnection

UPDATED FUNCTIONS:

  • execute_statements
    • renamed parameter statements->statement_list

feature_engineering.py

NEW FUNCTIONS:

  • create_covariance_feature

    • creates a new feature off of the published project showing the covariance of two features
  • create_correlation_feature

    • creates a new feature off of the published project showing the correlation of two features

linear_regression.py

UPDATED FUNCTIONS:

  • linear_regression
    • parameter dbconn now accepts only Snowflake connection objects
    • added optional parameter write_database specifying the database tables will be written to
    • added optional parameter write_schema specifying the schema tables will be written to

pca.py

UPDATED FUNCTIONS:

  • pca
    • parameter dbconn now accepts only Snowflake connection objects
    • added optional parameter write_database specifying the database tables will be written to
    • added optional parameter write_schema specifying the schema tables will be written to

stats.py

NEW FUNCTIONS:

  • variance

    • calculates the variance of a given feature
  • covariance

    • calculates the covariance of 2 given features
  • std

    • calculates the standard deviation of a given feature
  • corrcoef

    • calculates the correlation of 2 given features
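
For reference, plain-Python equivalents of the quantities these helpers compute (the real functions evaluate against the semantic layer; the population normalization, dividing by n, is an assumption here):

```python
import math

# Plain-Python equivalents of the new Stats-module quantities.
# Population normalization (divide by n) is assumed.
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def covariance(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def std(xs):
    return math.sqrt(variance(xs))

def corrcoef(xs, ys):
    # Pearson correlation: covariance scaled by both standard deviations.
    return covariance(xs, ys) / (std(xs) * std(ys))

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # perfectly correlated with xs
```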

project.py::Project

NEW FUNCTIONS:

  • update_snapshot
    • updates metadata of a snapshot

UPDATED CLASS:

  • Project
    • renamed class variable project_id to draft_project_id

UPDATED FUNCTIONS:

  • __init__

    • renamed parameter project_id to draft_project_id

    • removed parameter atconn

    • added parameter client

    • added optional parameter include_soft_publish to specify if soft published should be included
      when looking for publishes

  • get_published_projects

    • added optional parameter include_soft_publish to specify if soft published should be included
      when looking for publishes
  • select_published_projects

    • added optional parameter include_soft_publish to specify if soft published should be included
      when looking for publishes
  • setter project_id renamed to draft_project_id to match class variable renaming

REMOVED FUNCTIONS:

  • getters and setters for the atconn parameter
    • the connection is now private and is not meant for user interaction

  • clone removed in favor of the Client's clone_project operation

prediction_utils.py

UPDATED FUNCTIONS:

  • join_udf
    • additional optional parameter roleplay_features added.