rica.core

The primary data-frame API offered by Rica. Includes functions for creating
and manipulating data-frames.

-main

(-main & args)

append-row

(append-row df row)
Appends a row (given as either a map or a vector) to the given data-frame.

col-map->DataFrame

(col-map->DataFrame col-map)
Creates a DataFrame from a hash-map where each key is the column name and
each value is an indexed collection containing elements of a single type and
nils.
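
To illustrate the input shape described above, here is a minimal sketch in plain Clojure. It does not call Rica. `single-typed?` is a hypothetical helper, and reading "a single type" as "one class among the non-nil elements" is an assumption.

```clojure
;; Example input shape: keys are column names, values are vectors.
(def col-map
  {:name ["Ada" "Grace" nil]
   :age  [36 nil 45]})

;; Hypothetical helper (not part of Rica): true when all non-nil
;; elements of a column share one class.
(defn single-typed? [col]
  (<= (count (distinct (map class (remove nil? col)))) 1))

(every? single-typed? (vals col-map)) ;; => true
(single-typed? [1 "a" nil])           ;; => false
```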

col-vec->DataFrame

(col-vec->DataFrame col-vecs column-names)
Creates a DataFrame from a vector of vectors, each corresponding to a
column in the resulting DataFrame. The function also requires a collection
of keywords to be used as column names.

column-names

(column-names df)
Returns the names of all the columns in the given data-frame.

create-data-frame

(create-data-frame col-name col-data & args)
Creates a dataframe.

DataFrame->matrix

(DataFrame->matrix df)
(DataFrame->matrix df implemenation)
Creates a vectorz matrix out of a dataframe. All values must be numeric.

drop-cols

(drop-cols df col-name & args)
Returns a data-frame with the given columns removed.

get-col

(get-col df col-name)
Returns an entire column of a DataFrame based on the given column name.

get-row

(get-row df row-ndx)
Returns an entire row of a DataFrame based on the given index.

head

(head df n)
Returns a DataFrame containing the first n rows of the given DataFrame.

horizontal-stack

(horizontal-stack df1 df2)

n-col

(n-col df)
Returns the number of columns in the given DataFrame.

n-row

(n-row df)
Returns the number of rows in the given DataFrame.

order-by

(order-by df col1 & args)
Returns the given data-frame with the rows ordered by one or more columns.
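
Multi-column ordering can be modelled in plain Clojure as below. This is only a sketch: rows are represented as maps, `order-rows` is a hypothetical stand-in, and ascending order with later columns breaking ties is an assumption about how Rica behaves.

```clojure
(def rows
  [{:name "Ada" :age 36}
   {:name "Bob" :age 29}
   {:name "Cy"  :age 36}])

;; Hypothetical stand-in (not part of Rica): sort rows by the given
;; columns, using later columns to break ties in earlier ones.
(defn order-rows [rows & cols]
  (sort-by (apply juxt cols) rows))

(mapv :name (order-rows rows :age :name)) ;; => ["Bob" "Ada" "Cy"]
```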

print-schema

(print-schema df)
Prints the schema of a DataFrame to stdout. This includes column names
and their types.

row-maps->DataFrame

(row-maps->DataFrame row-maps)
Creates a DataFrame from a collection of hash maps each representing a row.
The set of keys found in each hash-map (row) is used to determine the column
names of the resulting DataFrame.
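
One way to picture how column names could be derived from row maps is sketched below in plain Clojure. It assumes the column names are the union of keys across all rows and that a key missing from a row yields nil. Neither assumption is confirmed by the description above.

```clojure
(def row-maps
  [{:name "Ada" :age 36}
   {:name "Grace" :city "NYC"}])

;; Assumed rule: column names are the union of all keys seen in the rows.
(def cols (into (sorted-set) (mapcat keys) row-maps))

;; Build one vector per column; absent keys become nil.
(def col-map
  (into {} (map (fn [c] [c (mapv #(get % c) row-maps)])) cols))

cols    ;; => #{:age :city :name}
col-map ;; => {:age [36 nil], :city [nil "NYC"], :name ["Ada" "Grace"]}
```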

row-range

(row-range df start end)
Returns a DataFrame of a subset of rows based on a range of indexes.
Start is inclusive, and end is exclusive.
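
The inclusive-start, exclusive-end convention matches `subvec` in clojure.core, as this small sketch shows. Rows are modelled as a plain vector of maps, and `rows-in-range` is a hypothetical stand-in, not a Rica function.

```clojure
(def rows [{:id 0} {:id 1} {:id 2} {:id 3} {:id 4}])

;; Hypothetical stand-in (not part of Rica): start is inclusive,
;; end is exclusive.
(defn rows-in-range [rows start end]
  (subvec rows start end))

(mapv :id (rows-in-range rows 1 3)) ;; => [1 2]
```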

row-vecs->DataFrame

(row-vecs->DataFrame row-vecs column-names)
Creates a DataFrame from a vector of vectors, each corresponding to a
row in the resulting DataFrame. The function also requires a collection
of keywords to be used as column names.
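
The row-oriented and column-oriented inputs carry the same data in transposed form. The plain-Clojure sketch below, which does not call Rica, shows the relationship. The sample data is made up for illustration.

```clojure
(def column-names [:name :age])

;; Row-oriented: one inner vector per row.
(def row-vecs [["Ada" 36] ["Grace" 45]])

;; Column-oriented: one inner vector per column (the transpose).
(def col-vecs (apply mapv vector row-vecs))

;; Each row paired with the column names.
(def rows-as-maps (mapv #(zipmap column-names %) row-vecs))

col-vecs     ;; => [["Ada" "Grace"] [36 45]]
rows-as-maps ;; => [{:name "Ada", :age 36} {:name "Grace", :age 45}]
```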

sample

(sample df frac)
Returns a random sample of the given data-frame containing the given fraction
of its rows.

select

(select df col-name & args)
Returns a data-frame with only the given columns present.
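
Column selection can be pictured with a map of columns in plain Clojure. This is a model of the behaviour, not Rica's internal representation. The inverse operation, removing columns, corresponds to `dissoc` in the same model.

```clojure
(def col-map
  {:name ["Ada" "Grace"]
   :age  [36 45]
   :city ["London" "NYC"]})

;; Keep only the named columns.
(def selected (select-keys col-map [:name :age]))

;; Remove the named columns instead.
(def dropped (dissoc col-map :city))

selected ;; => {:name ["Ada" "Grace"], :age [36 45]}
```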

shape

(shape df)
Returns the shape of the given data-frame as a list where the first element
is the number of rows and the second element is the number of columns.
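
As a sketch of the returned value, the following plain-Clojure model computes the same (rows columns) list from a map of columns. `shape-of` is a hypothetical stand-in, and the map-of-columns representation is an assumption made for illustration.

```clojure
(def col-map
  {:name ["Ada" "Grace" "Alan"]
   :age  [36 45 41]})

;; Hypothetical stand-in (not part of Rica): row count first,
;; column count second.
(defn shape-of [col-map]
  (list (count (first (vals col-map)))
        (count col-map)))

(shape-of col-map) ;; => (3 2)
```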

show

(show df)
(show df n)
Prints the first n (default 50) rows of a DataFrame to stdout. Formats the
DataFrame as a table.

tail

(tail df n)
Returns a DataFrame containing the last n rows of the given DataFrame.

union

(union df1 df2)

unique

(unique df)
Returns the given df with only the unique rows.
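
Row de-duplication can be modelled with `distinct` from clojure.core, treating rows as maps. That the first occurrence is kept and the original order preserved is a property of `distinct`, and only an assumption about Rica.

```clojure
(def rows
  [{:name "Ada" :age 36}
   {:name "Bob" :age 29}
   {:name "Ada" :age 36}])

;; Rows are equal when all their column values are equal.
(def unique-rows (vec (distinct rows)))

unique-rows ;; => [{:name "Ada", :age 36} {:name "Bob", :age 29}]
```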

vertical-stack

(vertical-stack df1 df2)

where

(where df pred)
Returns a dataframe containing only the rows that match the given predicate.
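
Row filtering can be sketched in plain Clojure as below. The form of `pred` that Rica expects is not specified here. This model assumes a function that receives one row as a map, and `rows-where` is a hypothetical stand-in.

```clojure
(def rows
  [{:name "Ada" :age 36}
   {:name "Bob" :age 29}
   {:name "Cy"  :age 41}])

;; Hypothetical stand-in (not part of Rica): keep the rows for which
;; pred returns a truthy value.
(defn rows-where [rows pred]
  (filterv pred rows))

(mapv :name (rows-where rows #(> (:age %) 30))) ;; => ["Ada" "Cy"]
```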

with-column

(with-column df col-name col-expr)
Adds a new column to the given dataframe by associating the given column
name with the new collection.
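
In a map-of-columns model, adding a column amounts to an `assoc`, as sketched below in plain Clojure. It is an assumption that `col-expr` is a collection with one element per row. The column name `:age-next` is made up for illustration.

```clojure
(def col-map
  {:name ["Ada" "Grace"]
   :age  [36 45]})

;; Derive a new column from an existing one and associate it with a name.
(def with-age-next
  (assoc col-map :age-next (mapv inc (:age col-map))))

(:age-next with-age-next) ;; => [37 46]
```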

with-column-renamed

(with-column-renamed df old-col-name new-col-name)
Renames a column in the DataFrame.