Bug #7215
closedWrong usage of DB array in mentat_main/settings_reporting/emails
100%
Description
As was discussed in VC, the emails
column of table settings_reporting
in the database mentat_main
is declared as a character varying
array, however the upper layer implementation based on Python/SQLAlchemy stores a single string value inside the array, containing comma separated emails. This was identified as a result of an oversight and should be fixed.
In addition, there are both NULL values and empty arrays present in this column, while logically equivalent, they are still distinct values; and although this might be abstracted in the upper layer, it could be improved for consistency in the DB.
Related issues
Updated by Radko Krkoš over 3 years ago
- Related to Feature #7221: Referential integrity for database arrays added
Updated by Pavel Kácha over 3 years ago
- Blocked by Bug #7220: Erroneous values in mentat_main/reports_events/mail_to added
Updated by Pavel Kácha over 3 years ago
- Blocked by deleted (Bug #7220: Erroneous values in mentat_main/reports_events/mail_to)
Updated by Pavel Kácha over 3 years ago
- Blocks Bug #7220: Erroneous values in mentat_main/reports_events/mail_to added
Updated by Rajmund Hruška over 3 years ago
- Assignee set to Rajmund Hruška
- Target version changed from Backlog to 2.9
Updated by Rajmund Hruška over 3 years ago
- Status changed from New to In Progress
- Priority changed from Low to Normal
In the code the validity of emails
field in the form is already checked, so the only necessary thing to change in the code is changing attribute emails
of SettingsReportingModel
to not null. So the real work to do is writing a migration to fix existing data.
Updated by Rajmund Hruška over 3 years ago
- Status changed from In Progress to Feedback
- % Done changed from 0 to 80
- To be discussed changed from No to Yes
Similarly to #7220, I came up with the following code. Radko Krkoš could you please review this code?
CREATE OR REPLACE FUNCTION alembic_33cf3078976b_trim_array(character varying[])
RETURNS character varying[]
AS
$$
DECLARE
arr ALIAS FOR $1;
element character varying;
retVal character varying[];
BEGIN
FOREACH element IN ARRAY arr
LOOP
retVal := array_append(retVal, trim(element)::varchar);
END LOOP;
RETURN retVal;
END;
$$
LANGUAGE plpgsql
STABLE
RETURNS NULL ON NULL INPUT;
CREATE OR REPLACE FUNCTION alembic_33cf3078976b_split_strings(character varying[])
RETURNS character varying[]
AS
$$
DECLARE
arr ALIAS FOR $1;
element character varying;
retVal character varying[];
BEGIN
FOREACH element IN ARRAY arr
LOOP
retVal := retVal || alembic_33cf3078976b_trim_array(string_to_array(element, ',')::varchar[]);
END LOOP;
RETURN retVal;
END;
$$
LANGUAGE plpgsql
STABLE
RETURNS NULL ON NULL INPUT;
UPDATE settings_reporting SET emails = alembic_33cf3078976b_split_strings(emails);
UPDATE settings_reporting SET emails = ARRAY[]::varchar[] WHERE emails IS NULL;
ALTER TABLE settings_reporting ALTER COLUMN emails SET NOT NULL;
DROP FUNCTION alembic_33cf3078976b_split_strings(character varying[]);
DROP FUNCTION alembic_33cf3078976b_trim_array(character varying[]);
CLUSTER settings_reporting;
Updated by Radko Krkoš over 3 years ago
This is just a variation on the fix for #7220, it looks good to me.
As these are closely related and there is even one central migration function that both of these migrations share, I would go for one migration to fix both issues. Fixing only one does not make sense as for each it would only be a partial fix. The same goes for reverting.
Updated by Rajmund Hruška over 3 years ago
- Status changed from Feedback to In Progress
- To be discussed changed from Yes to No
I agree, I put the code in the same migration file, code can be found in the branch hruska-bugfix-#7220-mail_to.
There is some issue with unit tests, it seems like NULL value is being inserted into column with NOT NULL constraint, so I am still looking into that.
Updated by Rajmund Hruška over 3 years ago
- Status changed from In Progress to Feedback
- % Done changed from 80 to 90
- To be discussed changed from No to Yes
Rajmund Hruska wrote in #note-11:
I agree, I put the code in the same migration file, code can be found in the branch hruska-bugfix-#7220-mail_to.
There is some issue with unit tests, it seems like NULL value is being inserted into column with NOT NULL constraint, so I am still looking into that.
I fixed the unit tests so the only thing left is to discuss whether or not to put CLUSTER
in the migration.
Updated by Pavel Kácha over 3 years ago
In the Upgrading docs we have "step 4", where we want admin to run VACUUM/CLUSTER/ANALYZE unconditionally on any upgrade - which we think solves it and we do not need CLUSTER in the migration.
However, we might make "step 4" conditional - "run this only if Alembic runs any migrations".
Radko, could you please confirm?
Updated by Radko Krkoš over 3 years ago
Pavel Kácha wrote in #note-13:
In the Upgrading docs we have "step 4", where we want admin to run VACUUM/CLUSTER/ANALYZE unconditionally on any upgrade - which we think solves it and we do not need CLUSTER in the migration.
However, we might make "step 4" conditional - "run this only if Alembic runs any migrations".
Radko, could you please confirm?
It is more complicated. The first part of step 4 of the "Upgrading" section indeed runs alembic
migrations, first for mentat_main
through flask
's CLI, then on mentat_events
through a helper script, then sets CLUSTER
-ing indices and autovacuum parameters. All of that is idempotent and effectively no-op if there are no changes.
The second part instructs the user to run VACUUM FREEZE
, CLUSTER
and ANALYZE
by hand. At least VACUUM FREEZE
is either useful or quick, if it was not or was run recently, respectively. I would keep that one. The last one, ANALYZE
is also useful and takes a short time, so skipping it adds less value than it takes.
The only non-straightforward one is the CLUSTER
. That is never no-op, it can and will take a lot of time. Also, one can CLUSTER
a freshly CLUSTER
-ed table, running the whole process again with the same run time but no benefits. The trouble is that it is hard to decide whether a CLUSTER
might be useful, based on information that the administrator has available. First, it is not obvious whether any alembic
migrations will be run until the above mentioned commands are run. Next, even if there are any migrations, they might not affect the tables in a way that CLUSTER
would be beneficial. Last, it is a good idea to run CLUSTER
from time to time on workloads such as Mentat, where data are piped (as in running window) through the database. The need for CLUSTER
should be determined based on table/index bloat.
I would go with extending the sqldb-optimize.sh
script and adding code for conditionally CLUSTER
-ing only tables with excess bloat. A bloat level detection code does exist, it is reported by the system-status
script.
Updated by Pavel Kácha over 3 years ago
Mmmkay. After discussion - I guess we might revisit tuning section of Database doc - akin to "add recommended CLUSTER run based on high bloat". Or, as you mentioned, we could consider adding this into sqldb-optimize.sh (however I'm unsure about that, this script only sets recommended values for arbitrary runs of VACUUM/CLUSTER/ANALYZE).
Current bloat reporting script is 11k of dense Perl code from Nagios tools, Radko, isn't there something ligher on stomach?
Updated by Pavel Kácha about 3 years ago
- Related to Task #7538: Streamline documentation concerning VACUUM/CLUSTER/ANALYZE added
Updated by Pavel Kácha about 3 years ago
Regarding #7215#note-12, we have agreed that CLUSTER (or other big operations) does not belong to migration, the rest is moved into #7538.
Updated by Rajmund Hruška about 3 years ago
- Status changed from Feedback to In Review
- % Done changed from 90 to 100
Merged into devel.
Updated by Pavel Kácha over 2 years ago
- Status changed from In Review to Closed