class ExtendedDataFrame extends AnyRef
DataFrame extension class.
Instance Constructors
- new ExtendedDataFrame(df: DataFrame)
  - df
    DataFrame to extend functionality.
Value Members
- def cache(): DataFrame
  Caches the result of the DataFrame and creates a new DataFrame whose operations won't affect the original DataFrame.
  - returns
    New cached DataFrame.
- def collectAsList(): List[Row]
  Implementation of Spark's collectAsList function. Collects the DataFrame and converts it to a java.util.List[Row] object.
  - returns
    A java.util.List[Row] representation of the DataFrame.
- def columns: Seq[String]
  Returns a Seq of strings with the DataFrame's column names.
  - returns
    List of columns in the DataFrame.
- def dropDuplicates(columns: Seq[String]): DataFrame
  Overload of dropDuplicates to comply with Spark's implementation of the dropDuplicates function. Unspecified columns from the DataFrame are preserved but are not considered when detecting duplicates; when rows differ only on unspecified columns, the first row is kept.
  - columns
    List of columns to group by to detect the duplicates.
  - returns
    DataFrame without duplicates on the specified columns.
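The "first row is kept" semantics described above can be sketched with plain Scala collections; the Row model and helper below are hypothetical stand-ins, not the library's implementation:

```scala
// Hypothetical sketch of the described dropDuplicates semantics, using
// Maps as rows instead of a real DataFrame.
type Row = Map[String, Any]

def dropDuplicates(rows: Seq[Row], columns: Seq[String]): Seq[Row] =
  rows
    .groupBy(row => columns.map(row.get)) // key = values of the specified columns
    .values
    .map(_.head)                          // keep the first row of each group
    .toSeq

val rows = Seq(
  Map[String, Any]("id" -> 1, "name" -> "a", "note" -> "x"),
  Map[String, Any]("id" -> 1, "name" -> "a", "note" -> "y"), // duplicate on (id, name)
  Map[String, Any]("id" -> 2, "name" -> "b", "note" -> "z")
)
val deduped = dropDuplicates(rows, Seq("id", "name"))
// Unspecified columns ("note") are preserved; the first duplicate wins.
```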
- def filter(conditionExpr: String): DataFrame
  Overload of filter to comply with Spark's implementation of the filter function when receiving a SQL expression.
  - conditionExpr
    SQL conditional expression to filter the DataFrame on.
  - returns
    DataFrame filtered on the specified SQL expression.
- def head(n: Int): Array[Row]
  Equivalent to Spark's head. Returns the first N rows. Spark's default behavior is to throw an error when executing this function on an empty DataFrame; that behavior is not replicated exactly here.
  - n
    Number of rows to return.
  - returns
    Array with the number of rows specified in the parameter.
- def head(): Option[Row]
  Equivalent to Spark's head. Returns the first row. Since the result is an Option, a .get is required to obtain the actual row. Spark's default behavior is to throw an error when executing this function on an empty DataFrame; that behavior is not replicated exactly here.
  - returns
    The first row of the DataFrame.
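The empty-safe behavior described above mirrors Scala's headOption; a minimal sketch with a Seq standing in for the DataFrame's rows:

```scala
// Sketch of the Option-based head() behavior: None on an empty input
// instead of an exception, Some(first row) otherwise.
def safeHead[A](rows: Seq[A]): Option[A] = rows.headOption

val first = safeHead(Seq("row1", "row2"))
val none  = safeHead(Seq.empty[String])
// first.get yields "row1"; none is None rather than throwing.
```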
- def orderBy(sortCol: String, sortCols: String*): DataFrame
  Alias for Spark's orderBy function. Receives column names; unlike Spark's orderBy, this overload accepts only column names, not SQL expressions.
  - sortCol
    First column name.
  - sortCols
    Additional column names.
  - returns
    DataFrame ordered by the specified columns.
- def orderBy(sortExprs: Column*): DataFrame
  Alias for Spark's orderBy function. Receives columns or column expressions.
  - sortExprs
    Column expressions to order the dataset by.
  - returns
    The dataset ordered by the specified expressions.
- def printSchema(): Unit
  Alias for Spark's printSchema; a shortcut for schema.printTreeString(). Prints the schema of the DataFrame in a tree format, including column names, data types, and whether they are nullable. The output is not identical to Spark's implementation, but it is very similar.
- def selectExpr(exprs: String*): DataFrame
  Equivalent to Spark's selectExpr. Selects columns based on the specified expressions, which can be column names or calls to other functions such as conversions and case expressions.
  - exprs
    Expressions to select from the DataFrame.
  - returns
    DataFrame with the selected expressions as columns. Unspecified columns are not included.
- def startCols: ColumnsSimplifier
  Column simplifier object to increase performance of the withColumns functionality.
  - returns
    Column simplifier class.
- def take(n: Int): Array[Row]
  Equivalent to Spark's take. Returns the first N rows. In Spark, take and head have the same functionality on paper but different implementations: head is mostly used for small numbers of rows, whereas take can be used for larger amounts. This implementation makes no such distinction from head.
  - n
    Number of rows to return.
  - returns
    Array with the number of rows specified in the parameter.
- def toJSON: DataFrame
  Implementation of Spark's toJSON function. Converts each row into a JSON object and returns a DataFrame with a single column.
  - returns
    DataFrame with one column whose value corresponds to a JSON object of the row.
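The per-row conversion described above can be sketched as follows; this hand-rolled JSON builder is purely illustrative (no escaping or nested values), not the library's implementation:

```scala
// Illustrative sketch: each row (an ordered list of column/value pairs)
// becomes one JSON object string, i.e. one value of the single-column result.
def rowToJson(row: Seq[(String, Any)]): String =
  row.map {
    case (k, s: String) => s""""$k":"$s""""
    case (k, v)         => s""""$k":$v"""
  }.mkString("{", ",", "}")

val jsonColumn = Seq(
  Seq("id" -> 1, "name" -> "a"),
  Seq("id" -> 2, "name" -> "b")
).map(rowToJson)
```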
- def transform(func: (DataFrame) ⇒ DataFrame): DataFrame
  Transforms the DataFrame according to the function passed as the parameter.
  - func
    Function to apply to the DataFrame.
  - returns
    DataFrame with the transformation applied.
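The value of transform is composable pipelines; a minimal sketch under the assumption that a Seq of row Maps stands in for the DataFrame:

```scala
// Sketch of the transform pattern: apply a table => table function,
// allowing reusable pipeline steps to be chained.
type Table = Seq[Map[String, Any]]

def transform(t: Table)(func: Table => Table): Table = func(t)

val addFlag: Table => Table = _.map(_ + ("flag" -> true))
val keepEven: Table => Table = _.filter(_("id").asInstanceOf[Int] % 2 == 0)

val data: Table = Seq(Map("id" -> 1), Map("id" -> 2))
val result = transform(transform(data)(addFlag))(keepEven)
```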
- def withColumnRenamed(existingName: String, newName: String): DataFrame
  Returns the DataFrame with a column renamed.
  - existingName
    Name of the column to rename.
  - newName
    New name to give to the column.
  - returns
    DataFrame with the column renamed.
snowpark-extensions
Snowpark by itself is a powerful library, but some utility functions can always help.
The source code for this library is available here
Installation
With Maven you can add something like this to your POM:
or with sbt use
Usage
Just import it at the top of your file and it will automatically extend your Snowpark package.
For example:
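The library's exact import line is not reproduced here; as an illustration of the mechanism such extension libraries rely on, a Scala implicit class adds methods to an existing type when it is in scope (all names below are made up):

```scala
// Hypothetical illustration of the implicit-class extension pattern:
// importing Extensions._ makes secondOption available on any Seq[Int],
// analogous to how importing this library extends Snowpark's DataFrame.
object Extensions {
  implicit class ExtendedSeq(xs: Seq[Int]) {
    def secondOption: Option[Int] = xs.drop(1).headOption
  }
}

import Extensions._
val second = Seq(10, 20, 30).secondOption
```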
Extensions
See Session Extensions
See Session Builder Extensions
See DataFrame Extensions
See Column Extensions
See Function Extensions