DataFusion Functions
This page is generated from the Apache DataFusion project's documents:
- DataFusion Scalar Functions
- DataFusion Scalar Functions (NEW)
- DataFusion Aggregate Functions
- DataFusion Window Functions
Scalar Functions
Scalar functions operate on a single row at a time and return a single value.
Math Functions
- abs
- acos
- acosh
- asin
- asinh
- atan
- atan2
- atanh
- cbrt
- ceil
- cos
- cosh
- cot
- degrees
- exp
- factorial
- floor
- gcd
- isnan
- iszero
- lcm
- ln
- log
- log10
- log2
- nanvl
- pi
- pow
- power
- radians
- random
- round
- signum
- sin
- sinh
- sqrt
- tan
- tanh
- trunc
abs
Returns the absolute value of a number.
abs(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
acos
Returns the arc cosine or inverse cosine of a number.
acos(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
acosh
Returns the area hyperbolic cosine or inverse hyperbolic cosine of a number.
acosh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
asin
Returns the arc sine or inverse sine of a number.
asin(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
asinh
Returns the area hyperbolic sine or inverse hyperbolic sine of a number.
asinh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
atan
Returns the arc tangent or inverse tangent of a number.
atan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
atan2
Returns the arc tangent or inverse tangent of expression_y / expression_x.
atan2(expression_y, expression_x)
Arguments
- expression_y: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression_x: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
atanh
Returns the area hyperbolic tangent or inverse hyperbolic tangent of a number.
atanh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
cbrt
Returns the cube root of a number.
cbrt(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
ceil
Returns the nearest integer greater than or equal to a number.
ceil(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
cos
Returns the cosine of a number.
cos(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
cosh
Returns the hyperbolic cosine of a number.
cosh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
cot
Returns the cotangent of a number.
cot(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
degrees
Converts radians to degrees.
degrees(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
exp
Returns the base-e exponential of a number.
exp(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
factorial
Factorial. Returns 1 if value is less than 2.
factorial(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
floor
Returns the nearest integer less than or equal to a number.
floor(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
gcd
Returns the greatest common divisor of expression_x and expression_y. Returns 0 if both inputs are zero.
gcd(expression_x, expression_y)
Arguments
- expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
isnan
Returns true if a given number is +NaN or -NaN otherwise returns false.
isnan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
iszero
Returns true if a given number is +0.0 or -0.0 otherwise returns false.
iszero(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
lcm
Returns the least common multiple of expression_x and expression_y. Returns 0 if either input is zero.
lcm(expression_x, expression_y)
Arguments
- expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
ln
Returns the natural logarithm of a number.
ln(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
log
Returns the base-x logarithm of a number. Can either provide a specified base, or if omitted then takes the base-10 of a number.
log(base, numeric_expression)
log(numeric_expression)
Arguments
- base: Base numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
log10
Returns the base-10 logarithm of a number.
log10(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
log2
Returns the base-2 logarithm of a number.
log2(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
nanvl
Returns the first argument if it's not NaN. Returns the second argument otherwise.
nanvl(expression_x, expression_y)
Arguments
- expression_x: Numeric expression to return if it's not NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression_y: Numeric expression to return if the first expression is NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
pi
Returns an approximate value of π.
pi()
pow
Alias of power.
power
Returns a base expression raised to the power of an exponent.
power(base, exponent)
Arguments
- base: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- exponent: Exponent numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
Aliases
- pow
radians
Converts degrees to radians.
radians(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
random
Returns a random float value in the range [0, 1). The random seed is unique to each row.
random()
round
Rounds a number to the nearest integer.
round(numeric_expression[, decimal_places])
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- decimal_places: Optional. The number of decimal places to round to. Defaults to 0.
signum
Returns the sign of a number.
Negative numbers return -1.
Zero and positive numbers return 1.
signum(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
sin
Returns the sine of a number.
sin(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
sinh
Returns the hyperbolic sine of a number.
sinh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
sqrt
Returns the square root of a number.
sqrt(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
tan
Returns the tangent of a number.
tan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
tanh
Returns the hyperbolic tangent of a number.
tanh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
trunc
Truncates a number to a whole number or truncated to the specified decimal places.
trunc(numeric_expression[, decimal_places])
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- decimal_places: Optional. The number of decimal places to
truncate to. Defaults to 0 (truncate to a whole number). If
decimal_placesis a positive integer, truncates digits to the right of the decimal point. Ifdecimal_placesis a negative integer, replaces digits to the left of the decimal point with0.
Conditional Functions
coalesce
Returns the first of its arguments that is not null. Returns null if all arguments are null. This function is often used to substitute a default value for null values.
coalesce(expression1[, ..., expression_n])
Arguments
- expression1, expression_n: Expression to use if previous expressions are null. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.
Example
> select coalesce(null, null, 'datafusion');
+----------------------------------------+
| coalesce(NULL,NULL,Utf8("datafusion")) |
+----------------------------------------+
| datafusion |
+----------------------------------------+
greatest
Returns the greatest value in a list of expressions. Returns null if all expressions are null.
greatest(expression1[, ..., expression_n])
Arguments
- expression1, expression_n: Expressions to compare and return the greatest value.. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.
Example
> select greatest(4, 7, 5);
+---------------------------+
| greatest(4,7,5) |
+---------------------------+
| 7 |
+---------------------------+
ifnull
Alias of nvl.
least
Returns the smallest value in a list of expressions. Returns null if all expressions are null.
least(expression1[, ..., expression_n])
Arguments
- expression1, expression_n: Expressions to compare and return the smallest value. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.
Example
> select least(4, 7, 5);
+---------------------------+
| least(4,7,5) |
+---------------------------+
| 4 |
+---------------------------+
nullif
Returns null if expression1 equals expression2; otherwise it returns expression1.
This can be used to perform the inverse operation of coalesce.
nullif(expression1, expression2)
Arguments
- expression1: Expression to compare and return if equal to expression2. Can be a constant, column, or function, and any combination of operators.
- expression2: Expression to compare to expression1. Can be a constant, column, or function, and any combination of operators.
Example
> select nullif('datafusion', 'data');
+-----------------------------------------+
| nullif(Utf8("datafusion"),Utf8("data")) |
+-----------------------------------------+
| datafusion |
+-----------------------------------------+
> select nullif('datafusion', 'datafusion');
+-----------------------------------------------+
| nullif(Utf8("datafusion"),Utf8("datafusion")) |
+-----------------------------------------------+
| |
+-----------------------------------------------+
nvl
Returns expression2 if expression1 is NULL otherwise it returns expression1.
nvl(expression1, expression2)
Arguments
- expression1: Expression to return if not null. Can be a constant, column, or function, and any combination of operators.
- expression2: Expression to return if expr1 is null. Can be a constant, column, or function, and any combination of operators.
Example
> select nvl(null, 'a');
+---------------------+
| nvl(NULL,Utf8("a")) |
+---------------------+
| a |
+---------------------+\
> select nvl('b', 'a');
+--------------------------+
| nvl(Utf8("b"),Utf8("a")) |
+--------------------------+
| b |
+--------------------------+
Aliases
- ifnull
nvl2
Returns expression2 if expression1 is not NULL; otherwise it returns expression3.
nvl2(expression1, expression2, expression3)
Arguments
- expression1: Expression to test for null. Can be a constant, column, or function, and any combination of operators.
- expression2: Expression to return if expr1 is not null. Can be a constant, column, or function, and any combination of operators.
- expression3: Expression to return if expr1 is null. Can be a constant, column, or function, and any combination of operators.
Example
> select nvl2(null, 'a', 'b');
+--------------------------------+
| nvl2(NULL,Utf8("a"),Utf8("b")) |
+--------------------------------+
| b |
+--------------------------------+
> select nvl2('data', 'a', 'b');
+----------------------------------------+
| nvl2(Utf8("data"),Utf8("a"),Utf8("b")) |
+----------------------------------------+
| a |
+----------------------------------------+
String Functions
- ascii
- bit_length
- btrim
- char_length
- character_length
- chr
- concat
- concat_ws
- contains
- ends_with
- find_in_set
- initcap
- instr
- left
- length
- levenshtein
- lower
- lpad
- ltrim
- octet_length
- overlay
- position
- repeat
- replace
- reverse
- right
- rpad
- rtrim
- split_part
- starts_with
- strpos
- substr
- substr_index
- substring
- substring_index
- to_hex
- translate
- trim
- upper
- uuid
ascii
Returns the Unicode character code of the first character in a string.
ascii(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select ascii('abc');
+--------------------+
| ascii(Utf8("abc")) |
+--------------------+
| 97 |
+--------------------+
> select ascii('🚀');
+-------------------+
| ascii(Utf8("🚀")) |
+-------------------+
| 128640 |
+-------------------+
Related functions:
bit_length
Returns the bit length of a string.
bit_length(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select bit_length('datafusion');
+--------------------------------+
| bit_length(Utf8("datafusion")) |
+--------------------------------+
| 80 |
+--------------------------------+
Related functions:
btrim
Trims the specified trim string from the start and end of a string. If no trim string is provided, all whitespace is removed from the start and end of the input string.
btrim(str[, trim_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- trim_str: String expression to operate on. Can be a constant, column, or function, and any combination of operators. Default is whitespace characters.
Example
> select btrim('__datafusion____', '_');
+-------------------------------------------+
| btrim(Utf8("__datafusion____"),Utf8("_")) |
+-------------------------------------------+
| datafusion |
+-------------------------------------------+
Alternative Syntax
trim(BOTH trim_str FROM str)
trim(trim_str FROM str)
Aliases
- trim
Related functions:
char_length
Alias of character_length.
character_length
Returns the number of characters in a string.
character_length(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select character_length('Ångström');
+------------------------------------+
| character_length(Utf8("Ångström")) |
+------------------------------------+
| 8 |
+------------------------------------+
Aliases
- length
- char_length
Related functions:
chr
Returns the character with the specified ASCII or Unicode code value.
chr(expression)
Arguments
- expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select chr(128640);
+--------------------+
| chr(Int64(128640)) |
+--------------------+
| 🚀 |
+--------------------+
Related functions:
concat
Concatenates multiple strings together.
concat(str[, ..., str_n])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- str_n: Subsequent string expressions to concatenate.
Example
> select concat('data', 'f', 'us', 'ion');
+-------------------------------------------------------+
| concat(Utf8("data"),Utf8("f"),Utf8("us"),Utf8("ion")) |
+-------------------------------------------------------+
| datafusion |
+-------------------------------------------------------+
Related functions:
concat_ws
Concatenates multiple strings together with a specified separator.
concat_ws(separator, str[, ..., str_n])
Arguments
- separator: Separator to insert between concatenated strings.
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- str_n: Subsequent string expressions to concatenate.
Example
> select concat_ws('_', 'data', 'fusion');
+--------------------------------------------------+
| concat_ws(Utf8("_"),Utf8("data"),Utf8("fusion")) |
+--------------------------------------------------+
| data_fusion |
+--------------------------------------------------+