Ben Chuanlong Du's Blog

Filter a Polars LazyFrame in Rust

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Tips and Traps

  1. LazyFrame.filter filters rows using an Expr while DataFrame.filter filters rows using a mask of the type ChunkedArray<BooleanType>.
In [2]:
:timing
:sccache 1
:dep polars = { version = "0.21.1", features = ["lazy", "parquet"] }
Out[2]:
sccache: true
In [3]:
use polars::prelude::*;
use polars::df;
In [8]:
let frame = df![
    "names" => ["a", "b", "c"],
    "values" => [1, 2, 3],
    "values_nulls" => [Some(1), None, Some(3)]
].unwrap();
frame
Out[8]:
shape: (3, 3)
┌───────┬────────┬──────────────┐
│ names ┆ values ┆ values_nulls │
│ ---   ┆ ---    ┆ ---          │
│ str   ┆ i32    ┆ i32          │
╞═══════╪════════╪══════════════╡
│ a     ┆ 1      ┆ 1            │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ b     ┆ 2      ┆ null         │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ c     ┆ 3      ┆ 3            │
└───────┴────────┴──────────────┘

Convert the DataFrame to a LazyFrame with lazy(), filter rows using an Expr, and call collect() to run the query.

In [9]:
frame.lazy().filter(
    col("values").gt(lit::<i32>(1))
).collect()
Out[9]:
Ok(shape: (2, 3)
┌───────┬────────┬──────────────┐
│ names ┆ values ┆ values_nulls │
│ ---   ┆ ---    ┆ ---          │
│ str   ┆ i32    ┆ i32          │
╞═══════╪════════╪══════════════╡
│ b     ┆ 2      ┆ null         │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ c     ┆ 3      ┆ 3            │
└───────┴────────┴──────────────┘)
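Methods on a LazyFrame such as fetch take self by value, so the LazyFrame is moved by the first call and cannot be used again.

In [15]: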
let frame = df![
    "names" => ["a", "b", "c"],
    "values" => [1, 2, 3],
    "values_nulls" => [Some(1), None, Some(3)]
].unwrap();
let lframe = frame.lazy();
[lframe.fetch(2), lframe.fetch(2)]
[lframe.fetch(2), lframe.fetch(2)]
        ^^^^^^^^ `lframe` moved due to this method call
[lframe.fetch(2), lframe.fetch(2)]
                  ^^^^^^ value used here after move
let lframe = frame.lazy();
    ^^^^^^ move occurs because `lframe` has type `LazyFrame`, which does not implement the `Copy` trait
use of moved value: `lframe`