sqlite can be used to provide an SQL interface to CSV files.
sqlite
provides several ways to interact with CSV files.
CSV virtual tables is a separate loadable library that provides efficient read-only access to
CSV files. At the time of writing, however, the library is compiled separately from sqlite
itself, making it a little cumbersome to use.
Fortunately, sqlite
has a very straightforward built-in way to interact with CSV files: .mode csv
.
.mode csv
allows CSV files to be imported into existing table schema using .import csv_filename table_name
and for the
output of SQL queries to be written to CSV files using .output csv_filename
.
Below three shell functions are defined to easily allow the creation, querying and updating of CSV files using SQL.
Before defining the functions, let’s see what kind of interaction becomes possible.
Create a new CSV table called people
. The table will have three fields, a text field called name
, an integer field called age
,
and another text field postcode
.
Also create a table called address
with postcode
and street
name.
$ fields='name text, age integer, postcode text'
$ csv_create people $fields
$ afields='postcode text, street text'
$ csv_create address $afields
Next insert some records into these tables to play with (any SQL expression can be used):
$ csv_update people "$fields" "insert into people(name, age, postcode) values('alice', 12, 'a1')"
$ csv_update people "$fields" "insert into people(name, age, postcode) values('bob', 21, 'b2')"
$ csv_update people "$fields" "insert into people(name, age, postcode) values('tom', 37, 'a1')"
$ csv_update people "$fields" "insert into people(name, age, postcode) values('dick', 45, 'c3')"
$ csv_update people "$fields" "insert into people(name, age, postcode) values('harry', 59, 'b2')"
$ csv_update address "$afields" "insert into address(postcode, street) values('a1', 'the road')"
$ csv_update address "$afields" "insert into address(postcode, street) values('b2', 'the street')"
$ csv_update address "$afields" "insert into address(postcode, street) values('c3', 'the lane')"
Basic queries against the CSV file:
$ csv_query people "$fields" 'select * from people where age < 30'
alice,12,a1
bob,21,b2
Somewhat ineligantly two CSV files can be joined together:
$ sqlite3 :memory: '.mode csv' \
"create table people($fields)" ".import people.csv people" \
"create table address($afields)" ".import address.csv address" \
'select name,street from people inner join address on people.postcode=address.postcode'
alice,"the road"
bob,"the street"
tom,"the road"
dick,"the lane"
harry,"the street"
CSV SQL shell functions
Here are some simple shell functions that use sqlite for single CSV file access.
function csv_create {
local name=$1
touch $name.csv
}
function csv_query {
local name=$1
shift
local fields="$1"
shift
sqlite3 :memory: '.mode csv' "create table $name($fields)" ".import $name.csv $name" "$@"
}
function csv_update {
local name=$1
shift
local fields="$1"
shift
csv_query "$name" "$fields" "$@" ".output $name.csv" "select * from $name"
}
More fiddly: primary keys
When primary keys are involved, explicit casts are required. Table creation is the same as above.
$ sqlite3 :memory: '.mode csv' \
'create table test(id integer primary key, name text, age integer)' \
'insert into test(name, age) values ("alice", 60)' \
'insert into test(name, age) values ("bob", 70)' \
'insert into test(name, age) values ("charlie", 65)' \
'insert into test(name, age) values ("dave", 55)' \
'insert into test(name, age) values ("edith", 70)' \
'.headers on' \
'.output test.csv' \
'select * from test'
Loading data for querying or updating is more complicated, as explicit casts are required:
$ sqlite3 :memory: '.mode csv' \
'create table test(id integer primary key, name text, age integer)' \
'.import test.csv test_tmp' \
'insert into test select cast(id as integer), cast(age as integer), name from test_tmp' \
'select * from test'