wireservice / csvkit · Issues · #1114
Closed
Issue created Mar 30, 2021 by Csaba Szilveszter (@csabaszilveszter)

csvsql: ability to pass arguments to create_engine

On macOS and using:

pyodbc==4.0.30
SQLAlchemy==1.4.3
csvkit==1.0.5

...and the Microsoft ODBC Driver 17 for SQL Server (x64).

Inserts are really, really slow.

For example:

$ time csvsql --db 'mssql+pyodbc:///?odbc_connect=...' --db-schema foo --table bar --insert 1000_rows.csv

real	0m56.999s
user	0m0.662s
sys	0m0.177s

Or:

$ time csvsql --db 'mssql+pyodbc:///?odbc_connect=...' --db-schema foo --table bar --insert 10000_rows.csv

real	9m17.993s
user	0m2.520s
sys	0m0.771s

Whereas with mysql-connector-python==8.0.23, inserts run as expected:

$ time csvsql --db 'mysql+mysqlconnector://...' --db-schema foo --table bar --insert 284224_rows.csv

real	0m9.189s
user	0m7.366s
sys	0m0.301s

Obviously I haven't tried to time an insert of 284224 rows into SQL Server; I don't have that much time 😜. But extrapolating from the data above, it would take roughly 280 minutes. 😢
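For reference, scaling the 10000-row timing linearly gives a figure in the same ballpark (a quick sanity check, added for illustration):

```python
# Linear extrapolation of insert time from the measured 10000-row pyodbc run.
seconds_per_row = (9 * 60 + 17.993) / 10000   # 9m17.993s for 10000 rows
estimated_minutes = seconds_per_row * 284224 / 60
print(round(estimated_minutes))  # 264
```

Scaling from the 1000-row run instead gives about 270 minutes, so the 280-minute estimate is in the right range either way.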

Fortunately, all of this was quite easy to fix by adding fast_executemany=True to the create_engine call here.

However, arguments to the create_engine call can't currently be passed to csvsql from the command line.

So I'd suggest adding another argument to csvsql, perhaps called --engine-kwargs, that takes a delimiter-separated list of key=value pairs to be passed along as create_engine(self.args.connection_string, **self.args.engine_kwargs). I'm sure others would also appreciate the power this would give the user, though the documentation would have to note that each engine only accepts certain arguments.
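A minimal sketch of how such parsing might work (the helper name parse_engine_kwargs and the --engine-kwargs flag are hypothetical, not part of csvkit):

```python
def parse_engine_kwargs(spec):
    """Parse a comma-separated list of key=value pairs into a kwargs dict.

    Values are coerced to bool/int where they look like one, since
    create_engine options such as fast_executemany expect real booleans,
    not the string "True".
    """
    kwargs = {}
    if not spec:
        return kwargs
    for pair in spec.split(","):
        key, _, value = pair.partition("=")
        value = value.strip()
        if value.lower() in ("true", "false"):
            coerced = value.lower() == "true"
        elif value.isdigit():
            coerced = int(value)
        else:
            coerced = value
        kwargs[key.strip()] = coerced
    return kwargs

# e.g. csvsql --engine-kwargs 'fast_executemany=True,pool_size=5' would yield:
print(parse_engine_kwargs("fast_executemany=True,pool_size=5"))
# {'fast_executemany': True, 'pool_size': 5}
# ...which csvsql could then splat into:
#   engine = create_engine(connection_string, **kwargs)
```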

Or am I missing something and there's actually a way to pass arguments to the create_engine call? Or is there another way to speed things up?
