OpenRIMS - Database Cleaning Part 1: Difference between revisions
Khoppenworth (talk | contribs) |
Khoppenworth (talk | contribs) No edit summary |
||
Line 36: | Line 36: | ||
FROM tree_userdefined c | FROM tree_userdefined c | ||
'''This script may run a long time!''' | '''This script may run for a long time!''' | ||
The results table created is the actual cleaning script. Now "Select All" from the table, copy and then paste into a New Query. | |||
The easiest way is to go to the | Click Run/Execute. | ||
'''This script will also run for a long time!''' | |||
After the script has executed please login to OpenRIMS as administrator and verify that the script has cleaned the workflows. | |||
The easiest way is to go to the Application Tracker of Monitor as verify that table is empty. | |||
== SQL Description of the cleaning process == | == SQL Description of the cleaning process == |
Latest revision as of 17:02, 23 July 2024
Purpose
The OpenRIMS-RP database contains configuration and user-defined data. The configuration data is defined by Supervisor. This data intends to manage electronic forms and workflows of applications. The user-defined data is data collected using electronic forms of applications and workflows.
Before the database will become production it is necessary to define the primary configuration data. Because the configuration data is particular for a country, this data should be created and tested on the sandbox or demo installation of OpenRIMS-RP.
While testing, the test data will be created. Thus, to prepare a production database it will be necessary to clean up all user-defined data that was created while testing, presenting, and, possible, learning.
For the current release, there is no special defined software to clean up the database, because it is an on-time operation.
Prerequisites
- MySQL Workbench is installed
- Connection to the database is established
- Connection rights to the pdx2 database allow at least select, delete, insert, execute, update, show view
- The below cleaning script
The conceptual model[1] of the OpenRIMS-RP database is a forest of data trees built upon a relation logical model using the MySQL database engine.
The stored procedure “print_tree” allows selecting all branches of a data tree from the root or any node.
Any tree in this forest is a configuration or user-defined data, but not both. The clean database assumed the absence of user-defined data. To remove the user-defined data all data trees that contain user-defined data should be removed.
To remove a data tree, it will be enough to remove the data tree root using the “remove_branch” stored procedure.
Thus, it will be necessary to recognize all roots of user-defined data and remove them by the “remove_branch” stored procedure.
Currently, there are no SQL queries to select roots of the user-defined-data roots, however, SQL queries to select roots of the configurations have been developed for Administrative features implementations.
The approach is to select all roots of configurations and, then, subtract them from the whole roots. This is described in detail in the sections just below the "Generate cleaning script" here:.
Generate cleaning script
Login to MySQL CLI or Workbench.
Select you OpenRIMS database and open a new query.
SELECT concat('call remove_branch(',c.ID, ');', '-- ', c.Identifier) as 'sql'
FROM tree_userdefined c
This script may run for a long time!
The results table created is the actual cleaning script. Now "Select All" from the table, copy and then paste into a New Query.
Click Run/Execute.
This script will also run for a long time!
After the script has executed please login to OpenRIMS as administrator and verify that the script has cleaned the workflows.
The easiest way is to go to the Application Tracker of Monitor as verify that table is empty.
SQL Description of the cleaning process
Do NOT run the below scripts individually!
They are for educational purpose only but might assist in troubleshooting.
Roots of configurations
The data tree roots as well as all data tree nodes are recorded in the “concept” table. All roots can be selected using the “tree_roots” view.
This view is a result of SQL query
SELECT
`concept`.`ID` AS `ID`,
`concept`.`Identifier` AS `Identifier`
FROM
`concept`
LEFT JOIN `closure` `clos` ON `clos`.`childID` = `concept`.`ID` AND (`clos`.`Level` <> 0)
WHERE
ISNULL(`clos`.`ID`) and `concept`.`Active` -- only roots
Each data tree root is defined by unique field “Identifier”.
The data configuration trees are:
1. Authorities and users
2. Workflows
3. Dictionaries
4. Resources
5. Files uploaded to resources
6. Configurations of data
7. Configurations of reports
8. Tiles
9. Legacy data
These configurations will not be extended until the next version of OpenRIMS-RP software.
Authorities and users
All users are branches of a tree. The root of this tree is “user.data”.
All authorities are branches of a tree “organization.authority”.
The SQL query to select roots:
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in ('organization.authority', 'user.data')
Workflow definitions
Workflow definitions allow control user’s application processing.
The definition of a workflow is stored in a tree. The root URL is “configuration.”URL of a workflow”, e.g., “configuration.application.retail.site.owned.pvt”. All root URLs can be gotten from lifecycle dictionaries:
· dictionary.guest.deregistration
· dictionary.guest.amendments
· dictionary.guest.applications
· dictionary.guest.renewal
· dictionary.shutdown.applications
· dictionary.host.applications
The SQL query to select all root URL’s from the life cycle dictionaries is:
SELECT distinct concat('configuration.',val.Label)
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=3
join concept var on var.ID=clo1.childID and var.Identifier='applicationurl'
join closure clo2 on clo2.parentID=var.ID and clo2.Level=1
join concept val on val.ID=clo2.childID and val.Identifier='EN_US'
left join closure clo on clo.childID=c.ID and clo.Level<>0
where
clo.ID is null and
c.Identifier in (
'dictionary.guest.deregistration',
'dictionary.guest.amendments',
'dictionary.guest.applications',
'dictionary.guest.renewal',
'dictionary.shutdown.applications',
'dictionary.host.applications'
)
The SQL to select all workflow definition roots is
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in (
-- query for workflow definition URLs from the lifecycle dictionaries
SELECT distinct concat('configuration.',val.Label)
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=3
join concept var on var.ID=clo1.childID and var.Identifier='applicationurl'
join closure clo2 on clo2.parentID=var.ID and clo2.Level=1
join concept val on val.ID=clo2.childID and val.Identifier='EN_US'
left join closure clo on clo.childID=c.ID and clo.Level<>0
where
clo.ID is null and
c.Identifier in (
'dictionary.guest.deregistration',
'dictionary.guest.amendments',
'dictionary.guest.applications',
'dictionary.guest.renewal',
'dictionary.shutdown.applications',
'dictionary.host.applications'
)
)
Dictionaries
Dictionaries contains pre-defined classifiers used by data input forms. Each dictionary is a tree.
By the agreement, all URLs of dictionaries should start with “dictionary.”. Thus, SQL is
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier like 'dictionary.%'
Resources
The definition of a resource consists of:
1. Resource definition
2. Data configuration
3. Dictionary
4. Stored files
The data configuration and dictionary are common configuration trees, that can be gotten by the appropriate SQLs (see above and below)
Resource definitions are branches of the root URL “configuration.resources”.
The SQL is
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier='configuration.resources'
Files, uploaded to resources
The resource files are stored in a tree for each resource. The root URL is defined in the resource data configurations. A data configuration of a resource is a uniform data configuration. Thus, it is necessary:
1. Select URLs of data configurations for all resources
2. Select data configurations for all resources, using data configuration URLs
3. Select URLs of file trees from the data configurations
4. Select file trees, using the URLs
URLs of data configurations for all resources
SELECT item.Label
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'
join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1
join concept item on item.ID=clo2.childID
left join closure clo on clo.childID=c.ID and clo.`Level`<>0
where
clo.ID is null and
item.Active and
c.Active and
c.Identifier='configuration.resources'
Data configurations for all resources:
SELECT conf.ID, conf.Identifier
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept conf on conf.ID=clo1.childID
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null
and c.Active
and c.Identifier in ('configuration.data')
and conf.Identifier in (
-- URLs of data configurations for all resources
SELECT item.Label
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'
join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1
join concept item on item.ID=clo2.childID
left join closure clo on clo.childID=c.ID and clo.`Level`<>0
where
clo.ID is null and
item.Active and
c.Active and
c.Identifier='configuration.resources'
)
URLs of file trees from the data configurations
SELECT asm.Url
FROM concept c -- roots of configurations (right table on the screen)
-- variables for configurations (left table on the screen)
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept conf on conf.ID=clo1.childID
join closure clo2 on clo2.parentID=conf.ID and clo2.Level='1'
join concept var on var.ID=clo2.childID and var.Identifier <> '_LITERALS_' and var.Active
-- get only files (documents) from the configuration variables
join assembly asm on asm.conceptID=var.ID and asm.Clazz='documents'
-- roots
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null
and c.Active
and c.Identifier in ('configuration.data')
and conf.Identifier in (
-- URLs of data configurations for all resources
SELECT item.Label
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'
join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1
join concept item on item.ID=clo2.childID
left join closure clo on clo.childID=c.ID and clo.`Level`<>0
where
clo.ID is null and
item.Active and
c.Active and
c.Identifier='configuration.resources'
)
File trees:
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in
(
-- URLs of file trees from the data configurations
SELECT asm.Url
FROM concept c -- roots of configurations (right table on the screen)
-- variables for configurations (left table on the screen)
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept conf on conf.ID=clo1.childID
join closure clo2 on clo2.parentID=conf.ID and clo2.Level='1'
join concept var on var.ID=clo2.childID and var.Identifier <> '_LITERALS_' and var.Active
-- get only files (documents) from the configuration variables
join assembly asm on asm.conceptID=var.ID and asm.Clazz='documents'
-- roots
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null
and c.Active
and c.Identifier in ('configuration.data')
and conf.Identifier in (
-- URLs of data configurations for all resources
SELECT item.Label
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'
join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1
join concept item on item.ID=clo2.childID
left join closure clo on clo.childID=c.ID and clo.`Level`<>0
where
clo.ID is null and
item.Active and
c.Active and
c.Identifier='configuration.resources'
)
)
Configurations of data
Configurations of data intends to manage on-screen forms for configurations and user-defined data For each form the configuration allows to assign:
· Form layout on a screen
· Data input fields, i.e., text, numeric, dates, logical
· Data input components, i.e., dictionaries, registers, resource downloads, etc.
· Data validation rules
Details may be found here (MSH, 2022)
For each data form the configuration is a branch in the tree with URL “configuration.data”
The SQL is
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in ('configuration.data')
Configurations of reports
Configurations of internal reports intend to define parameters to execute internal reports. For each report configuration is a branch in a tree with root URL “report.configuration”
The SQL is
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in ('report.configuration')
Tiles
Tiles is a system dictionary “dictionary.system.tiles”
Legacy Data
The legacy data may be placed to one or more trees. The URLs of the roots should be clarified for the installation. For Nepal database URLs are “system.import.legacy.data”, “legacy.pharmacies” and “legacy.ws”.
The SQL is
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in ('legacy.pharmacies', 'legacy.ws', 'system.import.legacy.data')
Put it all together
Create two database views:
tree_configurations:
-- 1. Authorities and users
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in ('organization.authority', 'user.data')
union
-- 2. Workflows
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in (
-- query for workflow definition URLs from the lifecycle dictionaries
SELECT distinct concat('configuration.',val.Label)
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=3
join concept var on var.ID=clo1.childID and var.Identifier='applicationurl'
join closure clo2 on clo2.parentID=var.ID and clo2.Level=1
join concept val on val.ID=clo2.childID and val.Identifier='EN_US'
left join closure clo on clo.childID=c.ID and clo.Level<>0
where
clo.ID is null and
c.Identifier in (
'dictionary.guest.deregistration',
'dictionary.guest.amendments',
'dictionary.guest.applications',
'dictionary.guest.renewal',
'dictionary.shutdown.applications',
'dictionary.host.applications'
)
)
union
-- 3. Dictionaries
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier like 'dictionary.%'
union
-- 4. Resources
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier='configuration.resources'
union
-- 5. Files uploaded to resources
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in
(
-- URLs of file trees from the data configurations
SELECT asm.Url
FROM concept c -- roots of configurations (right table on the screen)
-- variables for configurations (left table on the screen)
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept conf on conf.ID=clo1.childID
join closure clo2 on clo2.parentID=conf.ID and clo2.Level='1'
join concept var on var.ID=clo2.childID and var.Identifier <> '_LITERALS_' and var.Active
-- get only files (documents) from the configuration variables
join assembly asm on asm.conceptID=var.ID and asm.Clazz='documents'
-- roots
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null
and c.Active
and c.Identifier in ('configuration.data')
and conf.Identifier in (
-- URLs of data configurations for all resources
SELECT item.Label
FROM concept c
join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1
join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'
join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1
join concept item on item.ID=clo2.childID
left join closure clo on clo.childID=c.ID and clo.`Level`<>0
where
clo.ID is null and
item.Active and
c.Active and
c.Identifier='configuration.resources'
)
)
union
-- 6. Configurations of data
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in ('configuration.data')
union
-- 7. Configurations of reports
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in ('report.configuration')
union
-- 8. Tiles are a dictionary
-- 9. Legacy data
SELECT c.ID, c.Identifier
FROM concept c
left join closure clo on clo.childID=c.ID and clo.Level<>0
where clo.ID is null and c.Active and c.Identifier in ('legacy.pharmacies', 'legacy.ws', 'system.import.legacy.data')
tree_userdefined:
SELECT c.*
FROM
`concept` `c`
LEFT JOIN `closure` `clo` ON `clo`.`childID` = `c`.`ID`
AND (`clo`.`Level` <> 0)
WHERE
ISNULL(`clo`.`ID`) AND `c`.`Active`
AND NOT `c`.`ID` IN (SELECT
`tree_configurations`.`ID`
FROM
`tree_configurations`)
[1] sciencedirect.com/topics/computer-science/conceptual-data-model