Reading and Writing DataFrames
DataFrames store relational/tabular data with has many standard forms of serialization. Rica targets the most common, non-distributed file types and data stores to read and write from.
CSV
One of the most ubiquitous file types for storing data is the comma seperated values files. The rica.io
namespace provides the from-csv
and to-csv
functions for reading and writing DataFrames to and from CSV files.
As shown in the example below, any non-string column must be cast to the correct type.
(require '[rica.core :refer :all])
(require '[rica.io :refer :all])
(def apps
(-> (from-csv "resources/apps_data.csv" true)
(with-column :major_version #(Long/parseLong (:major_version %)))
(with-column :minor_version #(Long/parseLong (:minor_version %)))))
Rica wraps org.clojure/data.csv
for serialization and thus passes the arguments typically used as options to read-csv
and write-csv
through to the underlying function.
For example, the below Rica example sets the :seperator
option in the same way as org.clojure/data.csv
.
(def pre-releases
(-> apps
(where #(zero? (:major_version %)))
(order-by :minor_version)
(with-column :version
#(str (:major_version %) "." (:minor_version %)))
(select :app :version)))
(show pre-releases)
; | :app | :version |
; |-----------+----------|
; | Zoolab | 0.2 |
; | Lotstring | 0.5 |
; | Wrapsafe | 0.6 |
(to-csv pre-releases "resources/prerelease_apps_data.csv" true :separator \|)
; Contents of "resources/prerelease_apps_data.csv"
; app|version
; Zoolab|0.2
; Lotstring|0.5
; Wrapsafe|0.6