Getting Started
Scautable: one-line CSV import and dataframe utilities based on Scala's NamedTuple.
Scala CLI
//> using dep io.github.quafadas::scautable::0.0.28
Here's a screencap of a tiny, self-contained example.
Source: Kaggle
//> using scala 3.7.2
//> using dep io.github.quafadas::scautable::0.0.28
//> using resourceDir resources

import io.github.quafadas.table.*

@main def run(): Unit =
  // Column names and types are inferred from the CSV at compile time
  val df = CSV.resource("cereals.csv", TypeInferrer.FromAllRows)

  val data = LazyList.from(
    df
      .addColumn["double_the_sugar", Double](_.sugars * 2)
      .dropColumn["fiber"] // no one cares about the healthy bit
      .mapColumn["name", String](_.toUpperCase)
      .renameColumn["mfr", "manufacturer"]
  )

  data.take(20).ptbln

  println("Hot cereals: ")
  data.collect {
    case row if row.`type` == "H" =>
      (name = row.name, made_by = row.manufacturer, sugar = row.sugars, salt = row.sodium)
  }.ptbln
Mill
mvn"io.github.quafadas::scautable::0.0.28"
Then run the same code as above in src/Example.scala.
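For orientation, a minimal build.mill might look like the sketch below. It assumes Mill 1.x (the mvn"..." interpolator belongs to that line of releases) and a hypothetical module named example; neither the module name nor the Mill version comes from the text above. With this particular layout, Mill would look for sources under example/src/ rather than src/.

```scala
// build.mill: sketch only. The module name `example` is hypothetical.
package build

import mill.*, scalalib.*

object example extends ScalaModule {
  // Same Scala version as the Scala CLI example
  def scalaVersion = "3.7.2"

  // The dependency coordinate given above
  def mvnDeps = Seq(
    mvn"io.github.quafadas::scautable::0.0.28"
  )
}
```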
Goals
- Strongly typed compile-time CSV import
- Pretty printing to console for Product types
- Auto-magically generate HTML tables from case classes
- Searchable, sortable browser GUI for your tables
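To illustrate what "strongly typed" buys you, here is a small sketch that uses only the Scala standard library, not scautable itself. The Cereal row type and the sample rows are hypothetical, written by hand; in scautable a comparable named-tuple type would be derived from the CSV header.

```scala
//> using scala 3.7.2

// Hypothetical row type, written by hand for illustration.
type Cereal = (name: String, mfr: String, sugars: Int)

// Made-up sample rows standing in for the contents of a CSV file.
val cereals: List[Cereal] = List(
  (name = "Oat Crunch", mfr = "A", sugars = 6),
  (name = "Bran Bites", mfr = "B", sugars = 3),
  (name = "Choco Puffs", mfr = "A", sugars = 13)
)

// Fields are accessed by name and checked by the compiler:
// a typo such as `row.sugar` would fail to compile rather than fail at runtime.
val sweet = cereals.collect {
  case row if row.sugars > 5 =>
    (name = row.name.toUpperCase, sugars = row.sugars)
}

@main def showSweet(): Unit =
  sweet.foreach(row => println(s"${row.name}: ${row.sugars}"))
```

The result of collect is itself a list of named tuples, so later steps keep the same by-name, compile-time-checked access.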
5 second CSV quickstart
import io.github.quafadas.table.*

val data = CSV.resource("titanic.csv", TypeInferrer.FromAllRows)

// This doesn't display well on a website because of the ANSI...
data.toSeq.describe
// But these lines should be all you need to get an overview of the data.

// In order to make it look nice on a website
val (numerics, categoricals) = LazyList.from(
  CSV.resource("titanic.csv", TypeInferrer.FromAllRows)
).summary
println(
  numerics
    .mapColumn["mean", String](s => "%.2f".format(s))
    .mapColumn["0.25", String](s => "%.2f".format(s))
    .mapColumn["0.75", String](s => "%.2f".format(s))
    .consoleFormatNt(fansi = false)
)
// | |       name|   typ|  mean| min|  0.25|            median|  0.75|     max|
// +-+-----------+------+------+----+------+------------------+------+--------+
// |0|PassengerId|   Int|446.00| 1.0|223.25|             446.0|668.75|   891.0|
// |1|     Pclass|   Int|  2.31| 1.0|  1.77|               3.0|  3.00|     3.0|
// |2|        Age|Double| 29.70|0.42| 20.37|           28.2952| 38.35|    80.0|
// |3|      SibSp|   Int|  0.52| 0.0|  0.00|               0.0|  1.00|     8.0|
// |4|      Parch|   Int|  0.38| 0.0|  0.00|               0.0|  0.09|     6.0|
// |5|       Fare|Double| 32.20| 0.0|  7.91|14.302549127640036| 31.04|512.3292|
// +-+-----------+------+------+----+------+------------------+------+--------+
println(
  categoricals
    .mapColumn["sample", String](_.take(20))
    .consoleFormatNt(fansi = false)
)
// | |    name|uniqueEntries|            mostFrequent|frequency|              sample|
// +-+--------+-------------+------------------------+---------+--------------------+
// |0|Survived|            2|                   false|      549|         false, true|
// |1|    Name|          891|Young, Miss. Marie Grice|        1|Wick, Miss. Mary Nat|
// |2|     Sex|            2|                    male|      577|        male, female|
// |3|  Ticket|          681|                  347082|        7|11967, 372622, 13568|
// |4|   Cabin|          147|                 B96 B98|        4|C126, C54, D28, A23,|
// |5|Embarked|            3|                       S|      644|             S, Q, C|
// +-+--------+-------------+------------------------+---------+--------------------+
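The two formatting steps above are ordinary string functions passed to mapColumn. The sketch below isolates them using only the standard library; the helper names twoDecimals and truncate are hypothetical and not part of scautable. It uses formatLocal with Locale.ROOT so the decimal separator does not depend on the machine's locale, which plain format would.

```scala
import java.util.Locale

// Round to two decimal places, as the "%.2f" formatter does for the mean and quartile columns.
// Locale.ROOT keeps the decimal separator as '.' on every machine.
def twoDecimals(x: Double): String = "%.2f".formatLocal(Locale.ROOT, x)

// Keep at most `width` characters, as `_.take(20)` does for the sample column.
def truncate(s: String, width: Int = 20): String = s.take(width)

@main def showFormatting(): Unit =
  println(twoDecimals(446.0))
  println(truncate("abcdefghijklmnopqrstuvwxyz"))
```

Either helper could be passed wherever the lambdas appear above, assuming the column holds a Double or a String respectively.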