abacusai.pipeline
Classes
A config class for python function arguments |
|
Code source for python-based custom feature groups and models |
|
A reference to a pipeline to the objects it is run on. |
|
A step in a pipeline. |
|
A version of a pipeline. |
|
A Pipeline For Steps. |
Module Contents
- class abacusai.pipeline.PythonFunctionArgument
Bases:
abacusai.api_class.abstract.ApiClassA config class for python function arguments
- Parameters:
variable_type (PythonFunctionArgumentType) – The type of the python function argument
name (str) – The name of the python function variable
is_required (bool) – Whether the argument is required
value (Any) – The value of the argument
pipeline_variable (str) – The name of the pipeline variable to use as the value
- variable_type: abacusai.api_class.enums.PythonFunctionArgumentType
- value: Any
- class abacusai.pipeline.CodeSource(client, sourceType=None, sourceCode=None, applicationConnectorId=None, applicationConnectorInfo=None, packageRequirements=None, status=None, error=None, publishingMsg=None, moduleDependencies=None)
Bases:
abacusai.return_class.AbstractApiClassCode source for python-based custom feature groups and models
- Parameters:
client (ApiClient) – An authenticated API Client instance
sourceType (str) – The type of the source, one of TEXT, PYTHON, FILE_UPLOAD, or APPLICATION_CONNECTOR
sourceCode (str) – If the type of the source is TEXT, the raw text of the function
applicationConnectorId (str) – The Application Connector to fetch the code from
applicationConnectorInfo (str) – Args passed to the application connector to fetch the code
packageRequirements (list) – The pip package dependencies required to run the code
status (str) – The status of the code and validations
error (str) – If the status is failed, an error message describing what went wrong
publishingMsg (dict) – Warnings in the source code
moduleDependencies (list) – The list of internal modules dependencies required to run the code
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
- import_as_cell()
Adds the source code as an unexecuted cell in the notebook.
- class abacusai.pipeline.PipelineReference(client, pipelineReferenceId=None, pipelineId=None, objectType=None, datasetId=None, modelId=None, deploymentId=None, batchPredictionDescriptionId=None, modelMonitorId=None, notebookId=None, featureGroupId=None)
Bases:
abacusai.return_class.AbstractApiClassA reference to a pipeline to the objects it is run on.
- Parameters:
client (ApiClient) – An authenticated API Client instance
pipelineReferenceId (str) – The id of the reference.
pipelineId (str) – The id of the pipeline for the reference.
objectType (str) – The object type of the reference.
datasetId (str) – The dataset id of the reference.
modelId (str) – The model id of the reference.
deploymentId (str) – The deployment id of the reference.
batchPredictionDescriptionId (str) – The batch prediction description id of the reference.
modelMonitorId (str) – The model monitor id of the reference.
notebookId (str) – The notebook id of the reference.
featureGroupId (str) – The feature group id of the reference.
- __repr__()
Return repr(self).
- class abacusai.pipeline.PipelineStep(client, pipelineStepId=None, pipelineId=None, stepName=None, pipelineName=None, createdAt=None, updatedAt=None, pythonFunctionId=None, stepDependencies=None, cpuSize=None, memory=None, timeout=None, pythonFunction={}, codeSource={})
Bases:
abacusai.return_class.AbstractApiClassA step in a pipeline.
- Parameters:
client (ApiClient) – An authenticated API Client instance
pipelineStepId (str) – The reference to this step.
pipelineId (str) – The reference to the pipeline this step belongs to.
stepName (str) – The name of the step.
pipelineName (str) – The name of the pipeline this step is a part of.
createdAt (str) – The date and time which this step was created.
updatedAt (str) – The date and time when this step was last updated.
pythonFunctionId (str) – The python function_id.
stepDependencies (list[str]) – List of steps this step depends on.
cpuSize (str) – CPU size specified for the step function.
memory (int) – Memory in GB specified for the step function.
timeout (int) – Timeout for the step in minutes, default is 300 minutes.
pythonFunction (PythonFunction) – Information about the python function for the step.
codeSource (CodeSource) – Information about the source code of the step function.
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
- delete()
Deletes a step from a pipeline.
- Parameters:
pipeline_step_id (str) – The ID of the pipeline step.
- update(function_name=None, source_code=None, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, timeout=None)
Creates a step in a given pipeline.
- Parameters:
function_name (str) – The name of the Python function.
source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.
step_input_mappings (List) – List of Python function arguments.
output_variable_mappings (List) – List of Python function outputs.
step_dependencies (list) – List of step names this step depends on.
package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].
cpu_size (str) – Size of the CPU for the step function.
memory (int) – Memory (in GB) for the step function.
timeout (int) – Timeout for the pipeline step, default is 300 minutes.
- Returns:
Object describing the pipeline.
- Return type:
- rename(step_name)
Renames a step in a given pipeline.
- Parameters:
step_name (str) – The name of the step.
- Returns:
Object describing the pipeline.
- Return type:
- refresh()
Calls describe and refreshes the current object’s fields
- Returns:
The current object
- Return type:
- class abacusai.pipeline.PipelineVersion(client, pipelineName=None, pipelineId=None, pipelineVersion=None, createdAt=None, updatedAt=None, completedAt=None, status=None, error=None, stepVersions={}, codeSource={}, pipelineVariableMappings={})
Bases:
abacusai.return_class.AbstractApiClassA version of a pipeline.
- Parameters:
client (ApiClient) – An authenticated API Client instance
pipelineName (str) – The name of the pipeline this step is a part of.
pipelineId (str) – The reference to the pipeline this step belongs to.
pipelineVersion (str) – The reference to this pipeline version.
createdAt (str) – The date and time which this pipeline version was created.
updatedAt (str) – The date and time which this pipeline version was updated.
completedAt (str) – The date and time which this pipeline version was updated.
status (str) – The status of the pipeline version.
error (str) – The relevant error, if the status is FAILED.
stepVersions (PipelineStepVersion) – A list of the pipeline step versions.
codeSource (CodeSource) – information on the source code
pipelineVariableMappings (PythonFunctionArgument) – A description of the function variables into the pipeline.
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
- refresh()
Calls describe and refreshes the current object’s fields
- Returns:
The current object
- Return type:
- describe()
Describes a specified pipeline version
- Parameters:
pipeline_version (str) – Unique string identifier for the pipeline version
- Returns:
Object describing the pipeline version
- Return type:
- reset(steps=None, include_downstream_steps=True)
Reruns a pipeline version for the given steps and downstream steps if specified.
- Parameters:
- Returns:
Object describing the pipeline version
- Return type:
- list_logs()
Gets the logs for the steps in a given pipeline version.
- Parameters:
pipeline_version (str) – The id of the pipeline version.
- Returns:
Object describing the logs for the steps in the pipeline.
- Return type:
- skip_pending_steps()
Skips pending steps in a pipeline version.
- Parameters:
pipeline_version (str) – The id of the pipeline version.
- Returns:
Object describing the pipeline version
- Return type:
- class abacusai.pipeline.AbstractApiClass(client, id)
- __eq__(other)
Return self==value.
- _get_attribute_as_dict(attribute)
- class abacusai.pipeline.Pipeline(client, pipelineName=None, pipelineId=None, createdAt=None, notebookId=None, cron=None, nextRunTime=None, isProd=None, warning=None, createdBy=None, steps={}, pipelineReferences={}, latestPipelineVersion={}, codeSource={}, pipelineVariableMappings={})
Bases:
abacusai.return_class.AbstractApiClassA Pipeline For Steps.
- Parameters:
client (ApiClient) – An authenticated API Client instance
pipelineName (str) – The name of the pipeline this step is a part of.
pipelineId (str) – The reference to the pipeline this step belongs to.
createdAt (str) – The date and time which the pipeline was created.
notebookId (str) – The reference to the notebook this pipeline belongs to.
cron (str) – A cron-style string that describes when this refresh policy is to be executed in UTC
nextRunTime (str) – The next time this pipeline will be run.
isProd (bool) – Whether this pipeline is a production pipeline.
warning (str) – Warning message for possible errors that might occur if the pipeline is run.
createdBy (str) – The email of the user who created the pipeline
steps (PipelineStep) – A list of the pipeline steps attached to the pipeline.
pipelineReferences (PipelineReference) – A list of references from the pipeline to other objects
latestPipelineVersion (PipelineVersion) – The latest version of the pipeline.
codeSource (CodeSource) – information on the source code
pipelineVariableMappings (PythonFunctionArgument) – A description of the function variables into the pipeline.
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
- refresh()
Calls describe and refreshes the current object’s fields
- Returns:
The current object
- Return type:
- describe()
Describes a given pipeline.
- update(project_id=None, pipeline_variable_mappings=None, cron=None, is_prod=None)
Updates a pipeline for executing multiple steps.
- Parameters:
project_id (str) – A unique string identifier for the pipeline.
pipeline_variable_mappings (List) – List of Python function arguments for the pipeline.
cron (str) – A cron-like string specifying the frequency of the scheduled pipeline runs.
is_prod (bool) – Whether the pipeline is a production pipeline or not.
- Returns:
An object that describes a Pipeline.
- Return type:
- rename(pipeline_name)
Renames a pipeline.
- list_versions(limit=200)
Lists the pipeline versions for a specified pipeline
- Parameters:
limit (int) – The maximum number of pipeline versions to return.
- Returns:
A list of pipeline versions.
- Return type:
- run(pipeline_variable_mappings=None)
Runs a specified pipeline with the arguments provided.
- Parameters:
pipeline_variable_mappings (List) – List of Python function arguments for the pipeline.
- Returns:
The object describing the pipeline
- Return type:
- create_step(step_name, function_name=None, source_code=None, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, timeout=None)
Creates a step in a given pipeline.
- Parameters:
step_name (str) – The name of the step.
function_name (str) – The name of the Python function.
source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.
step_input_mappings (List) – List of Python function arguments.
output_variable_mappings (List) – List of Python function outputs.
step_dependencies (list) – List of step names this step depends on.
package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].
cpu_size (str) – Size of the CPU for the step function.
memory (int) – Memory (in GB) for the step function.
timeout (int) – Timeout for the step in minutes, default is 300 minutes.
- Returns:
Object describing the pipeline.
- Return type:
- describe_step_by_name(step_name)
Describes a pipeline step by the step name.
- Parameters:
step_name (str) – The name of the step.
- Returns:
An object describing the pipeline step.
- Return type:
- unset_refresh_schedule()
Deletes the refresh schedule for a given pipeline.
- pause_refresh_schedule()
Pauses the refresh schedule for a given pipeline.
- resume_refresh_schedule()
Resumes the refresh schedule for a given pipeline.
- create_step_from_function(step_name, function, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None)
Creates a step in the pipeline from a python function.
- Parameters:
step_name (str) – The name of the step.
function (callable) – The python function.
step_input_mappings (List[PythonFunctionArguments]) – List of Python function arguments.
output_variable_mappings (List[OutputVariableMapping]) – List of Python function ouputs.
step_dependencies (List[str]) – List of step names this step depends on.
package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].
cpu_size (str) – Size of the CPU for the step function.
memory (int) – Memory (in GB) for the step function.