
Eval Functions

Load/Store Functions

Math Functions

String Functions

Tuple, Bag, Map Functions

User Defined Functions (UDFs)
Pig provides extensive support for user defined functions (UDFs) as a way to specify custom processing. Pig UDFs can currently be implemented in four languages: Java, Python, JavaScript, and Ruby.

Registering UDFs

Registering Java UDFs:

    -- register_java_udf.pig
    register 'your_path_to_piggybank/piggybank.jar';
    divs = load 'NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividends:float);

Registering Python UDFs (the Python script must be in your current directory):

    -- register_python_udf.pig
    register '' using jython as bballudfs;
    players = load 'baseball' as (name:chararray, team:chararray, pos:bag{t:(p:chararray)}, bat:map[]);

Writing UDFs

Java UDFs:

    package myudfs;

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Converts the first field of the input tuple to upper case.
    public class UPPER extends EvalFunc<String> {
        public String exec(Tuple input) throws IOException {
            // Return null for a missing or empty input tuple.
            if (input == null || input.size() == 0)
                return null;
            try {
                String str = (String) input.get(0);
                return str.toUpperCase();
            } catch (Exception e) {
                throw new IOException("Caught exception processing input row ", e);
            }
        }
    }

Python UDFs:

    #!/usr/bin/python

    # Square - Square of a number of any data type.
    # @outputSchemaFunction names a script delegate function that defines
    # the schema for this function depending on the input type.
    @outputSchemaFunction("squareSchema")
    def square(num):
        return ((num)*(num))

    # @schemaFunction marks the delegate function; it is not registered to Pig.
    @schemaFunction("squareSchema")
    def squareSchema(input):
        return input

    # Percent - Percentage.
    # @outputSchema defines the schema for a script UDF in a format that
    # Pig understands and is able to parse.
    @outputSchema("percent:double")
    def percent(num, total):
        return num * 100 / total
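It can help to run Python UDFs like the ones above on their own before registering them with Pig. The following is a minimal sketch of one way to do that, not part of the original script. It assumes that Pig's Jython engine supplies the decorator names when the script is registered, so the no-op stand-ins are defined only when those names are missing. The helper `_noop_decorator`, the null checks, and returning null for a zero total are illustrative choices. Multiplying by `100.0` is there because Jython follows Python 2 division rules, where dividing two integers truncates, which would not match the declared `percent:double` schema.

```python
def _noop_decorator(_schema_text):
    """Hypothetical stand-in: returns a decorator that leaves the function unchanged."""
    def wrap(func):
        return func
    return wrap

# Assumption: under Pig these names already exist, so only fill in the
# missing ones. This lets the same file run locally and inside Pig.
for _name in ("outputSchema", "outputSchemaFunction", "schemaFunction"):
    if _name not in globals():
        globals()[_name] = _noop_decorator


# Square of a number; the output schema is delegated to squareSchema.
@outputSchemaFunction("squareSchema")
def square(num):
    if num is None:  # pass nulls through instead of raising
        return None
    return num * num


# Delegate schema function: the output type mirrors the input type.
@schemaFunction("squareSchema")
def squareSchema(input_schema):
    return input_schema


# Percentage of num relative to total, always as a float.
@outputSchema("percent:double")
def percent(num, total):
    # Illustrative choice: nulls and a zero total yield null.
    if num is None or total is None or total == 0:
        return None
    # 100.0 forces floating-point division under Python 2 / Jython.
    return num * 100.0 / total


if __name__ == "__main__":
    print(square(4))
    print(percent(1, 4))
```

Run directly with a Python interpreter, the file prints the results of the two sample calls. Registered in Pig, the same functions would be called through whatever namespace the register statement assigns.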
