Packages

  • package root

    snowpark-extensions

    Snowpark by itself is a powerful library, but some utility functions can always help.

    The source code for this library is available here.

    Installation

    With Maven you can add something like this to your POM:

    <dependency>
        <groupId>net.mobilize.snowpark-extensions</groupId>
        <artifactId>snowparkextensions</artifactId>
        <version>0.0.9</version>
    </dependency>
    

    or with sbt use

    libraryDependencies += "net.mobilize.snowpark-extensions" % "snowparkextensions" % "0.0.16"
    

    Usage

    Just import it at the top of your file and it will automatically extend your Snowpark package.

    For example:

    import com.snowflake.snowpark_extensions.Extensions._
    import com.snowflake.snowpark.Session
    
    val new_session = Session.builder.from_snowsql().appName("app1").create()
    

    Extensions

    See Session Extensions
    See Session Builder Extensions
    See DataFrame Extensions
    See Column Extensions
    See Function Extensions

    Definition Classes
    root
  • package com
    Definition Classes
    root
  • package snowflake
    Definition Classes
    com
  • package snowpark_extensions
    Definition Classes
    snowflake
  • package implicits
    Definition Classes
    snowpark_extensions
  • object DataFrameExtensions

    DataFrame Extensions object containing implicit functions for the Snowpark DataFrame object.

    Definition Classes
    implicits
  • ExtendedDataFrame

class ExtendedDataFrame extends AnyRef

DataFrame extension class.

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new ExtendedDataFrame(df: DataFrame)

    df

    DataFrame to extend functionality.

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def cache(): DataFrame

    Caches the result of the DataFrame and creates a new DataFrame whose operations won't affect the original DataFrame.

    returns

    New cached DataFrame.
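
    For illustration, a minimal sketch assuming a session built as in the Usage example above; the table name sales is hypothetical:

    // `session` and the table name are assumptions for this sketch.
    val df = session.table("sales")
    val cached = df.cache()        // materializes the current result
    // Operations on `cached` do not affect the original `df`.
    val big = cached.filter("AMOUNT > 100")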

  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  7. def collectAsList(): List[Row]

    Implementation of Spark's collectAsList function. Collects the DataFrame and converts it to a java.util.List[Row] object.

    returns

    A java.util.List[Row] representation of the DataFrame.
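
    A minimal sketch, assuming a session is in scope; the inline VALUES clause stands in for real data:

    import com.snowflake.snowpark.Row
    // Collect the rows client-side as a java.util.List[Row].
    val rows: java.util.List[Row] =
      session.sql("select * from values (1), (2) as t(id)").collectAsList()
    println(rows.size())  // 2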

  8. def columns: Seq[String]

    Function that returns a Seq of strings with the DataFrame's column names.

    returns

    Seq of column names in the DataFrame.

  9. def dropDuplicates(columns: Seq[String]): DataFrame

    Overload of dropDuplicates to comply with Spark's implementation of the dropDuplicates function. Unspecified columns from the DataFrame are preserved but are not considered when detecting duplicates. For rows that differ only in unspecified columns, the first row is returned.

    columns

    List of columns to group by to detect the duplicates.

    returns

    DataFrame without duplicates on the specified columns.
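
    A sketch of the semantics, assuming a session is in scope (identifiers are uppercase because Snowflake uppercases unquoted names):

    val df = session.sql(
      "select * from values (1, 'a'), (1, 'b'), (2, 'c') as t(id, name)")
    // Duplicates are detected on ID only; NAME is kept from the first row per ID.
    df.dropDuplicates(Seq("ID")).show()  // two rows, one per distinct ID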

  10. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  12. def filter(conditionExpr: String): DataFrame

    Overload of filter to comply with Spark's implementation of the filter function when receiving a SQL expression.

    conditionExpr

    SQL conditional expression to filter the DataFrame on.

    returns

    DataFrame filtered on the specified SQL expression.
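
    A minimal sketch, assuming a session and a hypothetical people table:

    // Filter with a SQL conditional expression, as in Spark's filter(String).
    val adults = session.table("people").filter("AGE >= 18 AND COUNTRY = 'CR'")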

  13. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  14. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  15. def head(n: Int): Array[Row]

    Equivalent to Spark's head. Returns the first n rows. Spark's default behavior with empty DataFrames is to throw an error when executing this function; that behavior isn't replicated exactly here.

    n

    Number of rows to return.

    returns

    Array with the number of rows specified in the parameter.

  16. def head(): Option[Row]

    Equivalent to Spark's head. Returns the first row. Since the result is an Option, a .get is required to obtain the actual Row. Spark's default behavior with empty DataFrames is to throw an error when executing this function; that behavior isn't replicated exactly here.

    returns

    The first row of the DataFrame.
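
    A sketch covering both overloads, assuming a session is in scope:

    import com.snowflake.snowpark.Row
    val df = session.sql("select * from values (1), (2), (3) as t(id)")
    val first: Option[Row] = df.head()        // Option, so no crash on empty data
    first.foreach(r => println(r.getInt(0)))  // 1
    val firstTwo: Array[Row] = df.head(2)     // first two rows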

  17. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  18. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  19. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  20. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  21. def orderBy(sortCol: String, sortCols: String*): DataFrame

    Alias for Spark's orderBy function. Receives column names. In testing, Spark's orderBy does not accept SQL expressions, only column names.

    sortCol

    First column name to sort by.

    sortCols

    Additional column names to sort by (varargs).

    returns

    DataFrame ordered by the specified columns.

  22. def orderBy(sortExprs: Column*): DataFrame

    Alias for Spark's orderBy function. Receives columns or column expressions.

    sortExprs

    Column expressions to order the dataset by.

    returns

    DataFrame ordered by the specified expressions.
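
    A sketch of both overloads, assuming a session and a hypothetical sales table:

    import com.snowflake.snowpark.functions.col
    val df = session.table("sales")
    df.orderBy("REGION", "AMOUNT").show()   // by column names
    df.orderBy(col("AMOUNT").desc).show()   // by column expressions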

  23. def printSchema(): Unit

    Alias for Spark's printSchema. This is a shortcut for schema.printTreeString(). Prints the schema of the DataFrame in a tree format, including column names, data types, and nullability. The output is not identical to Spark's implementation, but it is very similar.
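
    A minimal sketch, assuming a session is in scope:

    val df = session.sql("select 1 as id, 'a' as name")
    df.printSchema()  // prints column names, types, and nullability as a tree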

  24. def selectExpr(exprs: String*): DataFrame

    Equivalent to Spark's selectExpr. Selects columns based on the specified expressions, which can be column names or calls to other functions such as conversions and case expressions.

    exprs

    Expressions to apply to select from the DataFrame.

    returns

    DataFrame with the selected expressions as columns. Unspecified columns are not included.
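
    A sketch, assuming a session and a hypothetical people table:

    session.table("people").selectExpr(
      "NAME",
      "cast(AGE as string) as AGE_TEXT",
      "case when AGE >= 18 then 'adult' else 'minor' end as AGE_GROUP"
    ).show()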

  25. def startCols: ColumnsSimplifier

    Column simplifier object to increase performance of withColumns functionality.

    returns

    ColumnsSimplifier instance.
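
    A sketch under assumptions: the withColumn/endCols chaining shown below is inferred from the description, not confirmed by it, so verify against the ColumnsSimplifier documentation:

    import com.snowflake.snowpark.functions.lit
    // Assumed API: queue several columns, then apply them in one projection.
    val widened = session.table("sales")   // hypothetical table
      .startCols
      .withColumn("ONE", lit(1))
      .withColumn("TWO", lit(2))
      .endCols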

  26. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  27. def take(n: Int): Array[Row]

    Equivalent to Spark's take. Returns the first n rows. Spark's implementation of take differs from head's: on paper they have the same functionality, but in practice their implementations differ, since head is mostly used for small numbers of rows whereas take can be used for larger amounts. This extension uses the same implementation for both.

    n

    Number of rows to return.

    returns

    Array with the number of rows specified in the parameter.

  28. def toJSON: DataFrame

    Implementation of Spark's toJSON function. Converts each row into a JSON object and returns a DataFrame with a single column.

    returns

    DataFrame with a single column whose value is the JSON representation of each row.
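
    A minimal sketch, assuming a session is in scope; the exact JSON text shown is illustrative:

    val df = session.sql("select 1 as id, 'a' as name")
    df.toJSON.show()  // one column, e.g. {"ID": 1, "NAME": "a"} per row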

  29. def toString(): String
    Definition Classes
    AnyRef → Any
  30. def transform(func: (DataFrame) ⇒ DataFrame): DataFrame

    Transforms the DataFrame by applying the provided function.

    func

    Function to apply to the DataFrame.

    returns

    DataFrame with the transformation applied.
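
    A sketch showing how reusable transformations compose, assuming a session and a hypothetical sales table with an AMOUNT column:

    import com.snowflake.snowpark.DataFrame
    import com.snowflake.snowpark.functions.{col, lit}
    // Each step is an ordinary function from DataFrame to DataFrame.
    def withTax(df: DataFrame): DataFrame = df.withColumn("TAX", col("AMOUNT") * lit(0.13))
    def onlyLarge(df: DataFrame): DataFrame = df.filter(col("AMOUNT") > 100)
    val result = session.table("sales").transform(withTax).transform(onlyLarge)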

  31. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  33. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. def withColumnRenamed(existingName: String, newName: String): DataFrame

    Function that returns the DataFrame with a column renamed.

    existingName

    Name of the column to rename.

    newName

    New name to give to the column.

    returns

    DataFrame with the column renamed.
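
    A minimal sketch, assuming a session is in scope:

    val df = session.sql("select 1 as id, 'a' as name")
    val renamed = df.withColumnRenamed("NAME", "FULL_NAME")
    renamed.show()  // columns: ID, FULL_NAME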

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated @deprecated
    Deprecated

    (Since version ) see corresponding Javadoc for more information.

Inherited from AnyRef

Inherited from Any

Ungrouped