Show HN: The Canada census data in a SQLite file; advice appreciated

This is niche, I'll admit. I needed to look through the latest census data, but it was exported as multiple multi-gigabyte, bespoke, latin1-encoded CSV files. Pandas, Polars, and SQLite's CSV import tool weren't much help, so I shelved the project until recently, when I started taking a SQLite course online. I picked it up again and normalized the data, and now there's a database that can be queried through a SQL view that matches the headings in the original CSVs.

I'm proud of the script I created to export the data, which also automatically compresses the artifact, generates the diagrams and checksums, etc.

This is my first time building a big database. Does my schema seem sane? I've been considering switching the counts from REAL to TEXT, since SQLite's decimal extension could then do exact calculations, but given that there are only one or two digits after the decimal point in the data, I'm not sure it's worth it space-wise.

https://ift.tt/vQycDfN

December 5, 2024 at 02:20AM
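
For anyone weighing the REAL-versus-TEXT question, here is a minimal sketch of the trade-off. It is not taken from the project: the table `counts`, its columns, and the helper `compare_sums` are hypothetical, and Python's `decimal.Decimal` stands in for SQLite's decimal extension, which is a loadable extension and is typically not available from Python's bundled `sqlite3` module.

```python
import sqlite3
from decimal import Decimal


def compare_sums(values):
    """Store the same decimal strings as REAL and as TEXT, then sum both.

    `values` is a list of decimal strings such as "0.1".
    The table and column names here are hypothetical, not the project's schema.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE counts (as_real REAL, as_text TEXT)")
    con.executemany(
        "INSERT INTO counts VALUES (?, ?)",
        [(float(v), v) for v in values],
    )

    # Summed inside SQLite as binary floating point; may carry rounding error.
    real_sum = con.execute("SELECT sum(as_real) FROM counts").fetchone()[0]

    # Summed exactly from the stored text. Decimal is a stand-in for what
    # SQLite's decimal extension would do on the TEXT column.
    rows = con.execute("SELECT as_text FROM counts").fetchall()
    exact_sum = sum((Decimal(text) for (text,) in rows), Decimal(0))

    con.close()
    return real_sum, exact_sum


real_sum, exact_sum = compare_sums(["0.1", "0.2", "0.3"])
print("REAL sum:", repr(real_sum))
print("TEXT sum:", exact_sum)
```

The TEXT column keeps every digit as written, so the exact sum matches pencil-and-paper arithmetic, while the REAL sum is only guaranteed to be close to it. Whether that difference matters for values with one or two decimal places depends on how the results are rounded and compared downstream.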