Ben Chuanlong Du's Blog

It is never too late to learn.

Useful Rust Crates for Database

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

https://github.com/neondatabase/neon

Neon is a serverless open-source alternative to AWS Aurora Postgres. It separates storage and compute and substitutes the PostgreSQL storage layer by redistributing data across a …

Hands on the Polars Library in Python

Tips and Traps

  1. polars.DataFrame.unique and polars.Series.unique do not maintain the original order by default. To maintain the original order, pass the option maintain_order=True.

Polars

Polars is a blazingly fast DataFrame library implemented in Rust that uses Apache Arrow as its memory model.

Improve the Performance of Spark

Plan Your Work

  1. Having a clear idea of what you want to do is very important, especially when you are working on an exploratory project. It often saves you time to …

Spark Issue: UriSyntaxException

Symptoms

java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hdfs::/cluster-name/user/dclong/feature_example/features/train/2022-03-11

Possible Causes

As the error message points out, there's a syntax …
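The URI in the error message starts with `hdfs::/` rather than the usual `hdfs://`. One way to catch this kind of typo before a path reaches Spark is a small pre-check. The sketch below uses only the Python standard library; the helper name `looks_like_valid_hdfs_uri` is hypothetical, and the check is deliberately simple (it only verifies the scheme prefix and that an absolute path follows).

```python
from urllib.parse import urlsplit


def looks_like_valid_hdfs_uri(uri: str) -> bool:
    """Return True if uri has the form hdfs://[authority]/absolute/path.

    This is a heuristic sanity check, not a full URI validator.
    """
    parts = urlsplit(uri)
    return (
        parts.scheme == "hdfs"
        and uri.startswith("hdfs://")
        and parts.path.startswith("/")
    )


bad = "hdfs::/cluster-name/user/dclong/feature_example/features/train/2022-03-11"
good = "hdfs://cluster-name/user/dclong/feature_example/features/train/2022-03-11"

print(looks_like_valid_hdfs_uri(bad))   # False
print(looks_like_valid_hdfs_uri(good))  # True
```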

Spark Issue: SIGBUS

Symptoms

CalledProcessError: Command './pine' died with <Signals.SIGBUS: 7>.
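A "died with" message of this form is what Python's subprocess module reports when a child process is killed by a signal. In that case `CalledProcessError.returncode` is the negated signal number, so the signal name can be recovered programmatically. The following is a minimal sketch for POSIX systems; the helper name `signal_name_of_failure` is hypothetical, and the child process here simply sends SIGBUS to itself to stand in for the real command.

```python
import signal
import subprocess
import sys


def signal_name_of_failure(cmd):
    """Run cmd and return the name of the signal that killed it, if any."""
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as err:
        # A negative return code means the child was terminated by a signal.
        if err.returncode < 0:
            return signal.Signals(-err.returncode).name
    return None


# Stand-in for the failing command: a child that raises SIGBUS on itself.
child = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGBUS)",
]
print(signal_name_of_failure(child))
```

Logging the signal name this way makes it easier to tell a bus error apart from, say, an out-of-memory kill (SIGKILL) when a job fails on a cluster.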

Possible Causes

SIGBUS (bus error) is a signal that happens when you try to access memory that has not been physically mapped. There are several …