Bug #7220
closedErroneous values in mentat_main/reports_events/mail_to
100%
Description
As was discussed in VC, the mail_to
column of table reports_events
in the database mentat_main
contains strange or straight erroneous values. Some of these arrays are empty, some e-mails have prepended spaces and there is at least one "None" which seems to be an alternative realisation of `empty`.
From discussion: It might be the case that the mail_to
column is not really accessed after store and could be removed entirely.
Related issues
Updated by Radko Krkoš over 3 years ago
- Related to Feature #7221: Referential integrity for database arrays added
Updated by Pavel Kácha over 3 years ago
- Related to Feature #7257: Store email addresses to which the report was sent added
Updated by Radko Krkoš over 3 years ago
Based on discussion, the empty array case can be detected and enforced in the DB as follows:
ALTER TABLE reports_events ADD CONSTRAINT mail_to_non_empty_array CHECK (mail_to <> '{}');
Updated by Rajmund Hruška over 3 years ago
- Status changed from New to In Progress
- Assignee set to Rajmund Hruška
- Target version changed from Backlog to 2.9
- To be discussed changed from No to Yes
Some values are Null because the report was never sent. This happens when abuse group wants to receive only "extra" reports. To create an "extra" a "summary" report must be created before that and those "summary" reports aren't sent. I haven't seen any empty arrays ({}). Should empty value be stored as empty array or Null?
Reports with values {None} contain in the mail_res
column this:
message transmission error (552 5.3.4 Error: message file too big) Server said: 552 5.3.4 Error: message file too big
The emails with prepended spaces is caused by the same issue as #7215, so that issue should be solved first.
I think we should keep the mail_to
column. After adding mail_cc
column, this would solve #7257.
Updated by Rajmund Hruška over 3 years ago
- Status changed from In Progress to Feedback
Updated by Pavel Kácha over 3 years ago
The first question is - it this information actually accessed/used somewhere after write?
In relation to #7257, do I understand it right that this column bears third type of information - what emails the report actually went to, and that this may diverge from what was in the mail headers?
Updated by Rajmund Hruška over 3 years ago
Pavel Kácha wrote in #note-6:
The first question is - it this information actually accessed/used somewhere after write?
Yes. The values of this this attribute/array are shown in the detail of a report, in the tab 'Metadata', row 'Target email'.
In relation to #7257, do I understand it right that this column bears third type of information - what emails the report actually went to, and that this may diverge from what was in the mail headers?
Correct, it stores actual email addresses that the report went to. So for example, on mentat-alt there is always 'root'. Mail headers change either because of the configuration setting or because the abuse group has option 'report redirection' enabled.
Updated by Pavel Kácha over 3 years ago
So - correct solution after discussion would be:
- we should not have two values representing one state, so let's settle on empty array for generated but unsent reports and migrate column to not null (and update null -> [])
- repair known wrong values (strip leading/trailing whitespace, 'none', or maybe check nonconforming emails), where we are pretty sure we won't break good data
Of course, depends on what worms are going to crawl from the can, we can stil reconsider.
Updated by Rajmund Hruška over 3 years ago
- % Done changed from 0 to 90
Pavel Kácha wrote in #note-8:
So - correct solution after discussion would be:
- we should not have two values representing one state, so let's settle on empty array for generated but unsent reports and migrate column to not null (and update null -> [])
- repair known wrong values (strip leading/trailing whitespace, 'none', or maybe check nonconforming emails), where we are pretty sure we won't break good data
Of course, depends on what worms are going to crawl from the can, we can stil reconsider.
I tried solving this issue and I came up with following code. I don't feel confident in writing efficient SQL queries, so can someone check this code?
CREATE OR REPLACE FUNCTION trim_array(character varying[])
RETURNS character varying[]
AS
$$
DECLARE
arr ALIAS FOR $1;
retVal character varying[];
BEGIN
FOR I IN array_lower(arr, 1)..array_upper(arr, 1) LOOP
retVal[I] := trim(arr[I]);
END LOOP;
RETURN retVal;
END;
$$
LANGUAGE plpgsql
STABLE
RETURNS NULL ON NULL INPUT;
update reports_events set mail_to = ARRAY[]::varchar[] where mail_to is null;
update reports_events set mail_to = trim_array(mail_to);
delete from reports_events where mail_to = '{None}';
alter table reports_events alter column mail_to set not null;
If the code is OK, I will create a migration.
Updated by Pavel Kácha over 3 years ago
- Blocks Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emails added
Updated by Pavel Kácha over 3 years ago
- Blocks deleted (Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emails)
Updated by Pavel Kácha over 3 years ago
- Blocked by Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emails added
Updated by Pavel Kácha over 3 years ago
Added #7215 as blocker, as it's the culprit of the wrong data here. We have to solve that first to prevent new wrong data creeping in here.
Updated by Radko Krkoš over 3 years ago
A review was requested, here are my remarks (in no particular order):
- I would use
FOREACH
for the loop, as follows:FOREACH element IN ARRAY arr LOOP array_append(retVal, trim(element)); END LOOP;
Of course, there is just a syntactic difference. Both do exactly the same thing, so pick by personal taste. (Warning: I did not run that code). - The case of
mail_to = '{None}'
represents a situation when the report was generated and was intended to be sent, but that failed (as it happens, always due to the message being too big, but possibly due to other causes). I am against flatly deleting such unsuccessfully sent reports. Setting them to empty array (as the NULL ones) further hides the distinction (from those not intended to be sent), but that should be marked in logs if the case occurred recently and I guess is of diminishing usefulness over time. If it is important, then this might need an additional attribute or logic (I am unsure whether themail_res
is displayed in the UI somewhere). - The created function is left lingering. I would add something like this somewhere towards the end:
DROP FUNCTION trim_array(character varying[]);
- The code as it stands alters all rows of the table. There is little that can be done to prevent this. Some sort of
{VACUUM FULL, CLUSTER, pg_repack}
should be performed at the end, otherwise the table and especially the indices would remain horribly bloated. The table is small (<750MB on production), so it would take a short time. Of course, this could be just documented and left for the DB admin. - I would advise on prefixing the names of functions used only during migrations (or any other such temporary entities) with some obvious identifier ('alembic_' + alembic migration id?). It is a rare case (this is only the second time we needed any sort of custom function for migration), but that lowers the chance of a naming conflict and better shows the purpose if a forgotten entity is later found in the DB.
I should be available this week for any follow-up.
Updated by Rajmund Hruška over 3 years ago
Thank you so much for your review. The updated version can be found at the end of this note.
Radko Krkoš wrote in #note-14:
A review was requested, here are my remarks (in no particular order):
- I would use
FOREACH
for the loop, as follows:
[...]
Of course, there is just a syntactic difference. Both do exactly the same thing, so pick by personal taste. (Warning: I did not run that code).
I tried running the function with this change and I had to change it a bit. array_append
returns an array, so I had to reassign the variable. Also, the trim
function is returning type text, so I had to cast it to varchar.
- The case of
mail_to = '{None}'
represents a situation when the report was generated and was intended to be sent, but that failed (as it happens, always due to the message being too big, but possibly due to other causes). I am against flatly deleting such unsuccessfully sent reports. Setting them to empty array (as the NULL ones) further hides the distinction (from those not intended to be sent), but that should be marked in logs if the case occurred recently and I guess is of diminishing usefulness over time. If it is important, then this might need an additional attribute or logic (I am unsure whether themail_res
is displayed in the UI somewhere).
Instead of deleting those reports, I changed the mail_to
to empty array. Those reports are from 2017, so I don't think they are useful anymore.
- The created function is left lingering. I would add something like this somewhere towards the end:
[...]- I would advise on prefixing the names of functions used only during migrations (or any other such temporary entities) with some obvious identifier ('alembic_' + alembic migration id?). It is a rare case (this is only the second time we needed any sort of custom function for migration), but that lowers the chance of a naming conflict and better shows the purpose if a forgotten entity is later found in the DB.
I renamed the function and added the drop statement at the end.
- The code as it stands alters all rows of the table. There is little that can be done to prevent this. Some sort of
{VACUUM FULL, CLUSTER, pg_repack}
should be performed at the end, otherwise the table and especially the indices would remain horribly bloated. The table is small (<750MB on production), so it would take a short time. Of course, this could be just documented and left for the DB admin.
I tried running CLUSTER VERBOSE reports_events
on my local machine with 1010 reports and it seems like not much has happened. I will add it to the migration though (without the verbose option). Is it also important in downgrade? The downgrade will only remove the NOT NULL constraint.
Here is the updated version:
CREATE OR REPLACE FUNCTION alembic_4a172cd00ef0_trim_array(character varying[])
RETURNS character varying[]
AS
$$
DECLARE
arr ALIAS FOR $1;
element character varying;
retVal character varying[];
BEGIN
FOREACH element IN ARRAY arr
LOOP
retVal := array_append(retVal, trim(element)::varchar);
END LOOP;
RETURN retVal;
END;
$$
LANGUAGE plpgsql
STABLE
RETURNS NULL ON NULL INPUT;
update reports_events set mail_to = ARRAY[]::varchar[] where mail_to is null or mail_to = '{None}';
update reports_events set mail_to = alembic_4a172cd00ef0_trim_array(mail_to);
alter table reports_events alter column mail_to set not null;
DROP FUNCTION alembic_4a172cd00ef0_trim_array(character varying[]);
CLUSTER reports_events;
It is also worth mentioning that because this task will be merged before #6227 which has some migrations, Alembic merge will be needed at some point.
Updated by Radko Krkoš over 3 years ago
Rajmund Hruska wrote in #note-15:
Thank you so much for your review. The updated version can be found at the end of this note.
No problem.
Instead of deleting those reports, I changed the
mail_to
to empty array. Those reports are from 2017, so I don't think they are useful anymore.
Yes, they are probably not, but we keep full history there yet and punching holes is against that.
I tried running
CLUSTER VERBOSE reports_events
on my local machine with 1010 reports and it seems like not much has happened.
The whole table and its indices was rewritten from scratch. On such a small table, it is a quick operation. There are definitely differences in table and index sizes, albeit small. It is also much more efficient to do any operation on the table.
I will add it to the migration though (without the verbose option).
We did not do that before. This should better be discussed before taking action.
Is it also important in downgrade? The downgrade will only remove the NOT NULL constraint.
No, as no modification to table data is done in that case, only a change in table metadata. The removal of NOT NULL constraint is trivial, does not even need a scan (the data is logically guaranteed to be a subset of the new 'constraints').
Here is the updated version:
[...]
Looks good to me.
It is also worth mentioning that because this task will be merged before #6227 which has some migrations, Alembic merge will be needed at some point.
Well, if I am not missing something, the migration from #6227 can still be rebased on this one by changing the down_revision attribute (plus the comments). It is not like that one was deployed anywhere. OK, except for your development environment, but that should be fixable locally and not blocking.
Even this one might need such a 'rebase' as it is blocked by #7215, which might need a migration of its own.
Updated by Radko Krkoš over 3 years ago
Instead of deleting those reports, I changed the
mail_to
to empty array. Those reports are from 2017, so I don't think they are useful anymore.
Is the mail_res
displayed somewhere in the report view? If not, maybe it should be added.
Updated by Rajmund Hruška over 3 years ago
Radko Krkoš wrote in #note-17:
Instead of deleting those reports, I changed the
mail_to
to empty array. Those reports are from 2017, so I don't think they are useful anymore.Is the
mail_res
displayed somewhere in the report view? If not, maybe it should be added.
mail_res
contains basically the same information as mail_to
. Moreover, the reports in the last 3 years have this column empty. Even in the current code this attribute is never assigned nor used, so it seems like it was deleted from the code but not from the database.
Updated by Radko Krkoš over 3 years ago
Rajmund Hruska wrote in #note-18:
Radko Krkoš wrote in #note-17:
Is the
mail_res
displayed somewhere in the report view? If not, maybe it should be added.
mail_res
contains basically the same information asmail_to
. Moreover, the reports in the last 3 years have this column empty. Even in the current code this attribute is never assigned nor used, so it seems like it was deleted from the code but not from the database.
OK, then I retract that.
Updated by Rajmund Hruška over 3 years ago
I made the Alembic migration in 096d69c4. As adding CLUSTER reports_events
into the migration file is still to be discussed, I will keep the 'To be discussed' flag set to 'Yes'.
Updated by Pavel Kácha over 3 years ago
- To be discussed deleted (
Yes)
Updated by Rajmund Hruška about 3 years ago
- Status changed from Feedback to In Review
- % Done changed from 90 to 100
Merged into devel.
Updated by Pavel Kácha about 3 years ago
Radko Krkoš wrote in #note-16:
Rajmund Hruska wrote in #note-15:
I tried running
CLUSTER VERBOSE reports_events
on my local machine with 1010 reports and it seems like not much has happened.The whole table and its indices was rewritten from scratch. On such a small table, it is a quick operation. There are definitely differences in table and index sizes, albeit small. It is also much more efficient to do any operation on the table.
I will add it to the migration though (without the verbose option).
We did not do that before. This should better be discussed before taking action.
I think we have already discussed this and settled against adding VACUUM/CLUSTER/ANALYZE into migrations (see #7215), also #7538 started from that. We don't have the documentation cleaned up, however if we do not do CLUSTER here - well reporting table is not small, but also not critical, and nothing much happens without optimizing it. So I'd put the notice into the next release notes and call it a day.
Updated by Pavel Kácha almost 3 years ago
- Status changed from In Review to Closed