used for blackbox testing, data-ingestion procedures
- Installed CLARIN-DSpace7.*. with running database, solr, tomcat
- Clone python-api: https://github.com/dataquest-dev/dspace-python-api (branch
main) and dpace://https://github.com/dataquest-dev/DSpace (branchdtq-dev)
- Get database dump (old CLARIN-DSpace) and unzip it into the
<PSQL_PATH>/bin(or wherever you want)
- Create CLARIN-DSpace5.* databases (dspace, utilities) from dump.
// clarin-dspace database
createdb --username=postgres --owner=dspace --encoding=UNICODE clarin-dspace// create a clarin database with owner
// It run on second try:
psql -U postgres clarin-dspace < <CLARIN_DUMP_FILE_PATH>
// clarin-utilities database
createdb --username=postgres --owner=dspace --encoding=UNICODE clarin-utilities// create a utilities database with owner
// It run on second try:
psql -U postgres clarin-utilities < <UTILITIES_DUMP_FILE_PATH>
- Recreate your local CLARIN-DSpace7.* database NOTE: all data will be deleted
- Install again the database following the official tutorial steps: https://wiki.lyrasis.org/display/DSDOC7x/Installing+DSpace#InstallingDSpace-PostgreSQL11.x,12.xor13.x(withpgcryptoinstalled)
- Or try to run these commands in the <PSQL_PATH>/bin:
createdb --username=postgres --owner=dspace --encoding=UNICODE dspace// create databasepsql --username=postgres dspace -c "CREATE EXTENSION pgcrypto;"// Add pgcrypto extensionIf it throws warning that
-cparameter was ignored, just write aCREATE EXTENSION pgcrypto;command in the database cmd. CREATE EXTENSION pgcrypto;
// Now the clarin database for DSpace7 should be created
- Run the database by the command:
pg_ctl start -D "<PSQL_PATH>\data\"
- (Your DSpace project must be installed) Go to the
dspace/binand run the commanddspace database migrate force// force because of local types NOTE:dspace database migrate forcecreates default database data that may be not in database dump, so after migration, some tables may have more data than the database dump. Data from database dump that already exists in database is not migrated.
- Create an admin by running the command
dspace create-administratorin thedspace/bin
- Prepare
dspace-python-apiproject for migration IMPORTANT: Ifdatafolder doesn't exist in the project, create it
Update const.py
-
user = "<ADMIN_NAME>" -
password = "<ADMIN_PASSWORD>" -
# http or https -
use_ssl = False -
host = "<YOUR_SERVER>" e.g., localhost -
# host = "dev-5.pc" -
fe_port = "<YOUR_FE_PORT>" -
# fe_port = ":4000" -
be_port = "<YOUR_BE_PORT>" -
# be_port = ":8080" -
be_location = "/server/"
Update migration_const.py
REPOSITORY_PATH = "<PROJECT_PATH>"DATA_PATH = REPOSITORY_PATH + "data/"
- Create JSON files from the database tables. NOTE: You must do it for both databases
clarin-dspaceandclarin-utilities(JSON files are stored in thedatafolder)
- Go to
dspace-python-apiin the cmd - Run
pip install -r requirements.txt - Run
python data_migration.py <DATABSE NAME> <HOST> postgres <PASSWORD FOR POSTGRES>e.g.,python data_migration.py clarin-dspace localhost postgres pass(arguments for database connection - database, host, user, password) for the BOTH databases // NOTE there must exist data folder in the project structure
- Copy
assetstorefrom dspace5 to dspace7 (for bitstream import).assetstoreis in the folder where you have installed DSpacedspace/assetstore.
- Import data from the json files (python-api/data/) into dspace database (CLARIN-DSpace7.)
- NOTE: database must be up to date (
dspace database migrate forcemust be called in thedspace/bin) - NOTE: dspace server must be running
- From the
dspace-python-apirun commandpython dspace_import.py
- The values of table attributes that describe the last modification time of dspace object (for example attribute
last_modifiedin tableItem) have a value that represents the time when that object was migrated and not the value from migrated database dump.
Check test.example package. Everything necessary should be there.
Test data are in test/data folder.
If your test data contains special characters like čřšáý and so on, it is recommended
to make .stripped variation of the file.
E.g. my_format.json and my_format.stripped.json for loading data
and my_format.test.xml and my_format.test.stripped.xml for testing.
If not on dev-5 (e.g. when run on localhost), .stripped version of files will be loaded.
The reason for this is, that when dspace runs on windows, it has trouble with special characters.
See const.py for constants used at testing.
To set up logs, navigate to support.logs.py and modify method set_up_logging.
In order to run tests, use command
python -m unittest
Recommended variation is
python -m unittest -v 2> output.txt
which leaves result in output.txt
Before running for the first time, requirements must be installed with following command
pip install -r requirements.txt
It is possible to run in Pycharm with configuration like so:

