Cheatsheet
scala-cli repl --dep io.github.quafadas::scautable::0.0.34 \
--scalac-option -Xmax-inlines --scalac-option 2048 \
--java-opt -Xss4m \
--repl-init-script 'import io.github.quafadas.table.{*, given}'
Reading CSV Files
| Csv Available As | CSV Size | Columns | Hints |
|---|---|---|---|
| Inlined in code | < 1KB | < 100 | `inline val s = "aHeader\nr1c1\nr2c1"`, then `CSV.fromString(s)` |
| File - Compile Time | < 20MB | < 100 | `CSV.pwd("a.csv")`, `CSV.resource("a.csv")`, `CSV.absolutePath("/abs/path/to/a.csv")` |
| File - Compile Time | > 20MB and < 250MB | < 100 | Don't attempt to infer types at compile time. `val inferOpts = CsvOpts(TypeInferrer.FirstN(20000))`, `val knownOpts = CsvOpts(TypeInferrer.FromTuple[(Int, String, Double)])`, then `CSV.pwd("/path/a.csv", inferOpts)`, `CSV.absolutePath("/path/a.csv", knownOpts)` or `CSV.resource("a.csv", knownOpts)` |
| File - Run Time | < 250MB | < 100 | `val reader = CSV.fromTyped[("col1", "col2", "col3"), (String, Int, Double)]`, then `val data = reader(os.pwd / "simple.csv")` |
| File | > 250MB | or > 100 | Scautable is not the right tool to analyse this file. Consider a query-optimised dataframe library such as Spark or Polars. |
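
As a rough sketch of the "Inlined in code" row, the Scala CLI script below reads a small inlined CSV and prints its rows. It reuses the dependency version and import from the REPL command at the top of this cheatsheet. The object and method names are illustrative, and it assumes a Scala version with named tuples and that the result of `CSV.fromString` can be traversed like an ordinary iterator.

```scala
//> using dep io.github.quafadas::scautable::0.0.34

import io.github.quafadas.table.{*, given}

object InlineCsvSketch:
  // As in the table above, the CSV text is declared as an `inline val`.
  inline val s = "aHeader\nr1c1\nr2c1"

  @main def readInlineCsv(): Unit =
    // Assumption: the result iterates row by row, one element per CSV line after the header.
    val rows = CSV.fromString(s).toSeq
    rows.foreach(println)
```

The file-based entry points in the table (`CSV.pwd`, `CSV.resource`, `CSV.absolutePath`) would slot into the same place as `CSV.fromString`.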
Displaying Tables
Assuming you have an `Iterator` or `Iterable` of named tuples,
e.g. `val data: Seq[(col1: String, col2: Int, col3: Double)] = ???`
| Target | Hints |
|---|---|
| repl print / markdown | `data.ptbln` |
| String | `data.consoleFormatNt(fansi = false)` |
| html | `HtmlRenderer.nt(data)` |
| Almond | `Html(HtmlRenderer.nt(data))` |
| browser window | `HtmlRenderer.desktopShowNt(data)` |
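
The first two rows of the table might be used as in the sketch below. The sample rows are made up for illustration, and the script assumes the same dependency and import as the REPL command at the top of this cheatsheet, plus a Scala version with named tuples.

```scala
//> using dep io.github.quafadas::scautable::0.0.34

import io.github.quafadas.table.{*, given}

@main def showTable(): Unit =
  // Hypothetical sample rows matching the shape described above.
  val data: Seq[(col1: String, col2: Int, col3: Double)] = Seq(
    (col1 = "a", col2 = 1, col3 = 1.5),
    (col1 = "b", col2 = 2, col3 = 2.5)
  )

  // Print the table directly to the console.
  data.ptbln

  // Or build the text form first (fansi = false, as in the table above) and print it yourself.
  val asText = data.consoleFormatNt(fansi = false)
  println(asText)
```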
Excel Operations (JVM only)
TBD
Type Inference
| Inferrer | Use Case |
|---|---|
| `FromAllRows` | Most accurate type detection (default) |
| `FirstRow` | Fast, but less accurate type detection |
| `StringType` | Safe mode |
| `FromFirstNRows(n)` | Balance between accuracy and speed |
| `FromTuple[T]` | You have control of type inference. Can decode complex types via implicit evidence |
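
To make the accuracy versus speed trade-off concrete, here is a toy illustration in plain Scala. It is not scautable's implementation: `ColType`, `cellType`, `widen`, `inferAll` and `inferFirstN` are hypothetical names invented for this sketch, which only shows why sampling fewer rows can yield a narrower (and possibly wrong) column type.

```scala
// Toy model of column type inference. NOT scautable's code; all names are hypothetical.
enum ColType:
  case IntCol, DoubleCol, StringCol

// Narrowest type that can hold a single cell.
def cellType(cell: String): ColType =
  if cell.toIntOption.isDefined then ColType.IntCol
  else if cell.toDoubleOption.isDefined then ColType.DoubleCol
  else ColType.StringCol

// Widening order: IntCol < DoubleCol < StringCol.
def widen(a: ColType, b: ColType): ColType =
  if a.ordinal >= b.ordinal then a else b

// Inspect every cell: slowest, but never too narrow.
def inferAll(cells: Seq[String]): ColType =
  cells.map(cellType).foldLeft(ColType.IntCol)(widen)

// Inspect only the first n cells: cheaper, but later rows may not fit the result.
def inferFirstN(cells: Seq[String], n: Int): ColType =
  inferAll(cells.take(n))

@main def inferenceDemo(): Unit =
  val column = Seq("1", "2", "3", "4.5", "n/a")
  println(inferFirstN(column, 1)) // IntCol
  println(inferFirstN(column, 4)) // DoubleCol
  println(inferAll(column))       // StringCol
```

In this model, looking only at the first row picks a type that later cells ("4.5", "n/a") cannot satisfy, which is the kind of risk the "less accurate" entries in the table refer to.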