@@ -14,18 +14,23 @@ You can download the TSV from [IMDb Non-Commercial Datasets](https://developer.i
 ### Command-line arguments
 
 ``` shell
-usage: load_imdb.py [-h] [--output_format {csv,sqlite}] [--db_file DB_FILE] [--output_dir OUTPUT_DIR] input_dir
+❯ poetry run python load_imdb.py -h
+usage: load_imdb [-h] [-f {csv,sqlite}] [-d DB_FILE] [-o OUTPUT_DIR] [-c CHUNK_SIZE] input_dir
 
 Process TSV files from IMDb and save to CSV or SQLite.
 
 positional arguments:
-  input_dir             Directory that contains the input TSV files
+  input_dir             Directory that contains the input TSV files (default: input)
 
 options:
   -h, --help            show this help message and exit
-  --output_format {csv,sqlite}
-                        Output format: csv or sqlite
-  --db_file DB_FILE     Path to the SQLite database file (required if output is sqlite)
-  --output_dir OUTPUT_DIR
+  -f {csv,sqlite}, --output_format {csv,sqlite}
+                        Output format: csv or sqlite (default: csv)
+  -d DB_FILE, --db_file DB_FILE
+                        Path to SQLite database file - required if output is sqlite (default:imdb.sqlite)
+  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
+                        Path to the output CSV file - required if output is csv (default:data)
+  -c CHUNK_SIZE, --chunk_size CHUNK_SIZE
+                        Block size in bytes when processing files (default: 1048576)
                         Path to the output CSV file (required if output is csv)
 ```
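
For reference, the new help output could come from an `argparse` setup along the following lines. This is a minimal sketch reconstructed only from the help text in the hunk above, not the actual contents of `load_imdb.py`: the function name `build_parser` is hypothetical, and the defaults are the values printed in the help strings. The usage line shows `input_dir` without brackets, so the sketch treats it as a required positional; whether the script really falls back to `input` when it is omitted cannot be seen from this diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the help output shown in the diff."""
    parser = argparse.ArgumentParser(
        prog="load_imdb",
        description="Process TSV files from IMDb and save to CSV or SQLite.",
    )
    # Required positional (assumption: the usage line shows it unbracketed).
    parser.add_argument(
        "input_dir",
        help="Directory that contains the input TSV files",
    )
    parser.add_argument(
        "-f", "--output_format",
        choices=["csv", "sqlite"],
        default="csv",
        help="Output format: csv or sqlite (default: csv)",
    )
    parser.add_argument(
        "-d", "--db_file",
        default="imdb.sqlite",
        help="Path to SQLite database file (default: imdb.sqlite)",
    )
    parser.add_argument(
        "-o", "--output_dir",
        default="data",
        help="Path to the CSV output directory (default: data)",
    )
    parser.add_argument(
        "-c", "--chunk_size",
        type=int,  # assumption: the size is parsed as an integer byte count
        default=1048576,
        help="Block size in bytes when processing files (default: 1048576)",
    )
    return parser


# Example: select SQLite output with a smaller chunk size.
example_args = build_parser().parse_args(["-f", "sqlite", "-c", "65536", "input"])
print(example_args.output_format, example_args.db_file, example_args.chunk_size)
```

Passing an explicit argument list to `parse_args` keeps the example independent of the real command line; in a script, `parse_args()` with no arguments reads `sys.argv` instead.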