Worker Resource Access Collisions
During ingestion, multiple workers sometimes open the temporary SQLite database concurrently. When that happens the OS refuses to hand out a file descriptor, and the affected worker crashes:
```
  0%|          | 0/287080 [00:02<?, ? jobs/s]Traceback (most recent call last):
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: unable to open database file

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/sw/python-versions/python-3.9.2/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/lightcurvedb/core/ingestors/consumer.py", line 112, in run
    self._load_contexts()
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/lightcurvedb/core/ingestors/lightcurve_arrays.py", line 60, in _load_contexts
    self.corrector = LightcurveCorrector(self.cache_path)
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/lightcurvedb/core/ingestors/correction.py", line 42, in __init__
    self.tjd_map = contexts.get_tjd_mapping(sqlite_path)
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/lightcurvedb/core/ingestors/contexts.py", line 116, in wrapper
    return function(session, *args, **kwargs)
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/lightcurvedb/core/ingestors/contexts.py", line 565, in get_tjd_mapping
    df = pd.read_sql(
  File "/sw/python-versions/python-3.9.2/lib/python3.9/site-packages/pandas/io/sql.py", line 510, in read_sql
    return pandas_sql.read_query(
  File "/sw/python-versions/python-3.9.2/lib/python3.9/site-packages/pandas/io/sql.py", line 1294, in read_query
    result = self.execute(*args)
  File "/sw/python-versions/python-3.9.2/lib/python3.9/site-packages/pandas/io/sql.py", line 1162, in execute
    return self.connectable.execution_options().execute(*args, **kwargs)
  File "<string>", line 2, in execute
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/util/deprecations.py", line 401, in warned
    return fn(*args, **kwargs)
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 3146, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1289, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1481, in _execute_clauseelement
    ret = self._execute_context(
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1845, in _execute_context
    self._handle_dbapi_exception(
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2026, in _handle_dbapi_exception
    util.raise_(
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/pdo/users/mkuni/.local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
[SQL: SELECT tjdmappings.cadence, tjdmappings.camera, tjdmappings.tjd
FROM tjdmappings ORDER BY tjdmappings.camera, tjdmappings.cadence]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
```
It should be simple enough to wrap the initial database access in a retry loop with back-off, so workers recover from a transient open failure instead of crashing.
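A minimal sketch of what that back-off could look like. The helper name `execute_with_backoff` and its parameters are hypothetical, not part of the lightcurvedb codebase; in practice it would wrap the `LightcurveCorrector` / `get_tjd_mapping` calls in the traceback above, and would likely also need to catch `sqlalchemy.exc.OperationalError`:

```python
import random
import sqlite3
import time


def execute_with_backoff(fn, max_retries=5, base_delay=0.1):
    """Call fn(), retrying on sqlite3.OperationalError.

    Sleeps base_delay * 2**attempt plus a small random jitter between
    attempts, so colliding workers desynchronize instead of retrying
    in lock-step. Re-raises the error once max_retries is exhausted.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except sqlite3.OperationalError:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)


# Hypothetical usage at the failing call site:
#   self.tjd_map = execute_with_backoff(
#       lambda: contexts.get_tjd_mapping(sqlite_path)
#   )
```

The jitter matters here: without it, workers that collided once would all sleep the same interval and collide again on the next attempt.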