OpenRIMS - Database Cleaning Part 1: Difference between revisions

From OpenRIMS Wiki
Jump to navigation Jump to search
No edit summary
 
Line 36: Line 36:


  FROM tree_userdefined c
  FROM tree_userdefined c
'''This script may run a long time!'''
'''This script may run for a long time!'''


After the script run concludes please login as administrator and verify that the script has cleaned the workflows.
The results table created is the actual cleaning script. Now "Select All" from the table, copy and then paste into a New Query.


The easiest way is to go to the   
Click Run/Execute.
 
'''This script will also run for a long time!'''
 
After the script has executed please login to OpenRIMS as administrator and verify that the script has cleaned the workflows.
 
The easiest way is to go to the Application Tracker of Monitor as verify that table is empty.  


== SQL Description of the cleaning process ==
== SQL Description of the cleaning process ==

Latest revision as of 17:02, 23 July 2024

Purpose

The OpenRIMS-RP database contains configuration and user-defined data. The configuration data is defined by Supervisor. This data intends to manage electronic forms and workflows of applications. The user-defined data is data collected using electronic forms of applications and workflows.

Before the database will become production it is necessary to define the primary configuration data. Because the configuration data is particular for a country, this data should be created and tested on the sandbox or demo installation of OpenRIMS-RP.

While testing, the test data will be created. Thus, to prepare a production database it will be necessary to clean up all user-defined data that was created while testing, presenting, and, possible, learning.

For the current release, there is no special defined software to clean up the database, because it is an on-time operation.

Prerequisites

  • MySQL Workbench is installed
  • Connection to the database is established
  • Connection rights to the pdx2 database allow at least select, delete, insert, execute, update, show view
  • The below cleaning script

The conceptual model[1] of the OpenRIMS-RP database is a forest of data trees built upon a relation logical model using the MySQL database engine.

The stored procedure “print_tree” allows selecting all branches of a data tree from the root or any node.

Any tree in this forest is a configuration or user-defined data, but not both. The clean database assumed the absence of user-defined data. To remove the user-defined data all data trees that contain user-defined data should be removed.

To remove a data tree, it will be enough to remove the data tree root using the “remove_branch” stored procedure.

Thus, it will be necessary to recognize all roots of user-defined data and remove them by the “remove_branch” stored procedure.

Currently, there are no SQL queries to select roots of the user-defined-data roots, however, SQL queries to select roots of the configurations have been developed for Administrative features implementations.

The approach is to select all roots of configurations and, then, subtract them from the whole roots. This is described in detail in the sections just below the "Generate cleaning script" here:.

Generate cleaning script

Login to MySQL CLI or Workbench.

Select you OpenRIMS database and open a new query.

SELECT concat('call remove_branch(',c.ID, ');', '-- ', c.Identifier) as 'sql'
FROM tree_userdefined c

This script may run for a long time!

The results table created is the actual cleaning script. Now "Select All" from the table, copy and then paste into a New Query.

Click Run/Execute.

This script will also run for a long time!

After the script has executed please login to OpenRIMS as administrator and verify that the script has cleaned the workflows.

The easiest way is to go to the Application Tracker of Monitor as verify that table is empty.

SQL Description of the cleaning process

Do NOT run the below scripts individually!

They are for educational purpose only but might assist in troubleshooting.

Roots of configurations

The data tree roots as well as all data tree nodes are recorded in the “concept” table. All roots can be selected using the “tree_roots” view.

This view is a result of SQL query

SELECT

      `concept`.`ID` AS `ID`,

      `concept`.`Identifier` AS `Identifier`

   FROM

      `concept`

       LEFT JOIN `closure` `clos` ON `clos`.`childID` = `concept`.`ID`  AND (`clos`.`Level` <> 0)

   WHERE

      ISNULL(`clos`.`ID`) and `concept`.`Active` -- only roots

Each data tree root is defined by unique field “Identifier”.

The data configuration trees are:

1.     Authorities and users

2.     Workflows

3.     Dictionaries

4.     Resources

5.     Files uploaded to resources

6.     Configurations of data

7.     Configurations of reports

8.     Tiles

9.     Legacy data

These configurations will not be extended until the next version of OpenRIMS-RP software.

Authorities and users

              All users are branches of a tree. The root of this tree is “user.data”.

All authorities are branches of a tree “organization.authority”.

              The SQL query to select roots:

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in ('organization.authority', 'user.data')

Workflow definitions

              Workflow definitions allow control user’s application processing.

              The definition of a workflow is stored in a tree. The root URL is “configuration.”URL of a workflow”, e.g., “configuration.application.retail.site.owned.pvt”. All root URLs can be gotten from lifecycle dictionaries:

·       dictionary.guest.deregistration

·       dictionary.guest.amendments

·       dictionary.guest.applications

·       dictionary.guest.renewal

·       dictionary.shutdown.applications

·       dictionary.host.applications

The SQL query to select all root URL’s from the life cycle dictionaries is:

SELECT distinct concat('configuration.',val.Label)

FROM concept c

join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=3

join concept var on var.ID=clo1.childID and var.Identifier='applicationurl'

join closure clo2 on clo2.parentID=var.ID and clo2.Level=1

join concept val on val.ID=clo2.childID and val.Identifier='EN_US'

left join closure clo on clo.childID=c.ID and clo.Level<>0

where

clo.ID is null and

c.Identifier in (

'dictionary.guest.deregistration',

'dictionary.guest.amendments',

'dictionary.guest.applications',

'dictionary.guest.renewal',

'dictionary.shutdown.applications',

'dictionary.host.applications'

)


              The SQL to select all workflow definition roots is

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in (

                                             -- query for workflow definition URLs from the lifecycle dictionaries

                                             SELECT distinct concat('configuration.',val.Label)

                                             FROM concept c

                                             join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=3

                                             join concept var on var.ID=clo1.childID and var.Identifier='applicationurl'

                                             join closure clo2 on clo2.parentID=var.ID and clo2.Level=1

                                             join concept val on val.ID=clo2.childID and val.Identifier='EN_US'

                                             left join closure clo on clo.childID=c.ID and clo.Level<>0

                                             where

                                             clo.ID is null and

                                             c.Identifier in (

                                             'dictionary.guest.deregistration',

                                             'dictionary.guest.amendments',

                                             'dictionary.guest.applications',

                                             'dictionary.guest.renewal',

                                             'dictionary.shutdown.applications',

                                             'dictionary.host.applications'

                                             )

)

Dictionaries

              Dictionaries contains pre-defined classifiers used by data input forms. Each dictionary is a tree.

              By the agreement, all URLs of dictionaries should start with “dictionary.”. Thus, SQL is

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier like 'dictionary.%'

Resources

              The definition of a resource consists of:

1.     Resource definition

2.     Data configuration

3.     Dictionary

4.     Stored files

The data configuration and dictionary are common configuration trees,  that can be gotten by the appropriate SQLs (see above and below)

Resource definitions are branches of the root URL “configuration.resources”.

The SQL is

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier='configuration.resources'

Files, uploaded to resources

The resource files are stored in a tree for each resource. The root URL is defined in the resource data configurations. A data configuration of a resource is a uniform data configuration.  Thus, it is necessary:

1.     Select URLs of data configurations for all resources

2.     Select data configurations for all resources, using data configuration URLs

3.     Select URLs of file trees from the data configurations

4.     Select file trees, using the URLs

URLs of data configurations for all resources

SELECT item.Label

FROM concept c

join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'

join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1

join concept item on item.ID=clo2.childID

left join closure clo on clo.childID=c.ID and clo.`Level`<>0

where

clo.ID is null and

item.Active and

c.Active and

c.Identifier='configuration.resources'


Data configurations for all resources:

SELECT conf.ID, conf.Identifier

FROM concept c

join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

join concept conf on conf.ID=clo1.childID

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null

and c.Active

and c.Identifier in ('configuration.data')

and conf.Identifier in (

                              -- URLs of data configurations for all resources

                              SELECT item.Label

                              FROM concept c

                              join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

                              join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'

                              join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1

                              join concept item on item.ID=clo2.childID

                              left join closure clo on clo.childID=c.ID and clo.`Level`<>0

                              where

                              clo.ID is null and

                              item.Active and

                               c.Active and

                               c.Identifier='configuration.resources'

)            

URLs of file trees from the data configurations

SELECT asm.Url

              FROM concept c  -- roots of configurations (right table on the screen)

   -- variables for configurations (left table on the screen)

              join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

              join concept conf on conf.ID=clo1.childID

              join closure clo2 on clo2.parentID=conf.ID and clo2.Level='1'

              join concept var on var.ID=clo2.childID and var.Identifier <> '_LITERALS_' and var.Active

   -- get only files (documents) from the configuration variables

              join assembly asm on asm.conceptID=var.ID and asm.Clazz='documents'

   -- roots

              left join closure clo on clo.childID=c.ID and clo.Level<>0

              where clo.ID is null

               and c.Active

               and c.Identifier in ('configuration.data')

               and conf.Identifier in (

                                             -- URLs of data configurations for all resources

                                             SELECT item.Label

                                             FROM concept c

                                             join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

                                             join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'

                                             join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1

                                             join concept item on item.ID=clo2.childID

                                             left join closure clo on clo.childID=c.ID and clo.`Level`<>0

                                             where

                                             clo.ID is null and

                                             item.Active and

                                              c.Active and

                                              c.Identifier='configuration.resources'

               )

File trees:

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in

(

       -- URLs of file trees from the data configurations

       SELECT asm.Url

       FROM concept c  -- roots of configurations (right table on the screen)

  -- variables for configurations (left table on the screen)

       join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

       join concept conf on conf.ID=clo1.childID

       join closure clo2 on clo2.parentID=conf.ID and clo2.Level='1'

       join concept var on var.ID=clo2.childID and var.Identifier <> '_LITERALS_' and var.Active

  -- get only files (documents) from the configuration variables

       join assembly asm on asm.conceptID=var.ID and asm.Clazz='documents'

  -- roots

       left join closure clo on clo.childID=c.ID and clo.Level<>0

       where clo.ID is null

        and c.Active

        and c.Identifier in ('configuration.data')

        and conf.Identifier in (

                                      -- URLs of data configurations for all resources

                                      SELECT item.Label

                                      FROM concept c

                                      join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

                                      join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'

                                      join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1

                                      join concept item on item.ID=clo2.childID

                                      left join closure clo on clo.childID=c.ID and clo.`Level`<>0

                                      where

                                      clo.ID is null and

                                      item.Active and

                                       c.Active and

                                       c.Identifier='configuration.resources'

        )

)

Configurations of data

              Configurations of data intends to manage on-screen forms for configurations and user-defined data  For each form the configuration allows to assign:

·       Form layout on a screen

·       Data input fields, i.e., text, numeric, dates, logical

·       Data input components, i.e., dictionaries, registers, resource downloads, etc.

·       Data validation rules

              Details may be found here (MSH, 2022)

              For each data form the configuration is a branch in the tree with URL “configuration.data”

The SQL is

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in ('configuration.data')

Configurations of reports

              Configurations of internal reports intend to define parameters to execute internal reports. For each report configuration is a branch in a tree with root URL “report.configuration”

The SQL is

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in ('report.configuration')

Tiles

              Tiles is a system dictionary “dictionary.system.tiles”

Legacy Data

              The legacy data may be placed to one or more trees. The URLs of the roots should be clarified for the installation. For Nepal database URLs are “system.import.legacy.data”, “legacy.pharmacies” and “legacy.ws”.

The SQL is

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in ('legacy.pharmacies', 'legacy.ws', 'system.import.legacy.data')

Put it all together

Create two database views:

tree_configurations:

-- 1.        Authorities and users

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in ('organization.authority', 'user.data')

union

-- 2.        Workflows

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in (

                                             -- query for workflow definition URLs from the lifecycle dictionaries

                                             SELECT distinct concat('configuration.',val.Label)

                                             FROM concept c

                                             join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=3

                                             join concept var on var.ID=clo1.childID and var.Identifier='applicationurl'

                                             join closure clo2 on clo2.parentID=var.ID and clo2.Level=1

                                             join concept val on val.ID=clo2.childID and val.Identifier='EN_US'

                                             left join closure clo on clo.childID=c.ID and clo.Level<>0

                                             where

                                             clo.ID is null and

                                             c.Identifier in (

                                             'dictionary.guest.deregistration',

                                             'dictionary.guest.amendments',

                                             'dictionary.guest.applications',

                                             'dictionary.guest.renewal',

                                             'dictionary.shutdown.applications',

                                             'dictionary.host.applications'

                                             )

)

union

-- 3.        Dictionaries

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier like 'dictionary.%'


union

-- 4.        Resources

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier='configuration.resources'


union

-- 5.       Files uploaded to resources

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in

(

              -- URLs of file trees from the data configurations

              SELECT asm.Url

              FROM concept c  -- roots of configurations (right table on the screen)

   -- variables for configurations (left table on the screen)

              join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

              join concept conf on conf.ID=clo1.childID

              join closure clo2 on clo2.parentID=conf.ID and clo2.Level='1'

              join concept var on var.ID=clo2.childID and var.Identifier <> '_LITERALS_' and var.Active

   -- get only files (documents) from the configuration variables

              join assembly asm on asm.conceptID=var.ID and asm.Clazz='documents'

   -- roots

              left join closure clo on clo.childID=c.ID and clo.Level<>0

              where clo.ID is null

               and c.Active

               and c.Identifier in ('configuration.data')

               and conf.Identifier in (

                                             -- URLs of data configurations for all resources

                                             SELECT item.Label

                                             FROM concept c

                                             join closure clo1 on clo1.parentID=c.ID and clo1.`Level`=1

                                             join concept lang on lang.ID=clo1.childID and lang.Identifier='EN_US'

                                             join closure clo2 on clo2.parentID=lang.ID and clo2.`Level`=1

                                             join concept item on item.ID=clo2.childID

                                             left join closure clo on clo.childID=c.ID and clo.`Level`<>0

                                             where

                                             clo.ID is null and

                                             item.Active and

                                              c.Active and

                                              c.Identifier='configuration.resources'

               )

)

union

-- 6.        Configurations of data

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in ('configuration.data')


union

-- 7.        Configurations of reports

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in ('report.configuration')


union

-- 8.        Tiles are a dictionary


-- 9.        Legacy data

SELECT c.ID, c.Identifier

FROM concept c

left join closure clo on clo.childID=c.ID and clo.Level<>0

where clo.ID is null and c.Active and c.Identifier in ('legacy.pharmacies', 'legacy.ws', 'system.import.legacy.data')


tree_userdefined:

SELECT c.*

   FROM

              `concept` `c`

       LEFT JOIN `closure` `clo` ON `clo`.`childID` = `c`.`ID`

           AND (`clo`.`Level` <> 0)

   WHERE

      ISNULL(`clo`.`ID`) AND `c`.`Active`

           AND NOT  `c`.`ID` IN (SELECT

              `tree_configurations`.`ID`

           FROM

               `tree_configurations`)


[1] sciencedirect.com/topics/computer-science/conceptual-data-model