Bug #7220: Erroneous values in mentat_main/reports_events/mail_to - Mentat - Homeproj: Redmine for CESNET

Actions

Copy link

Bug #7220

closed

Erroneous values in mentat_main/reports_events/mail_to

Added by Radko Krkoš over 3 years ago. Updated almost 3 years ago.

Status:

Closed

Priority:

Low

Assignee:

Rajmund Hruška

Category:

Development - Database

Target version:

2.9

Start date:

04/16/2021

Due date:

% Done:

100%

Estimated time:

To be discussed:

Description

As was discussed in VC, the mail_to column of table reports_events in the database mentat_main contains strange or straight erroneous values. Some of these arrays are empty, some e-mails have prepended spaces and there is at least one "None" which seems to be an alternative realisation of `empty`.
From discussion: It might be the case that the mail_to column is not really accessed after store and could be removed entirely.

Related issues

Actions

Copy link

Updated by Radko Krkoš over 3 years ago

Related to Feature #7221: Referential integrity for database arrays added

Actions

Copy link

Updated by Pavel Kácha over 3 years ago

Related to Feature #7257: Store email addresses to which the report was sent added

Actions

Copy link

Updated by Radko Krkoš over 3 years ago

Based on discussion, the empty array case can be detected and enforced in the DB as follows:

ALTER TABLE reports_events ADD CONSTRAINT mail_to_non_empty_array CHECK (mail_to <> '{}');

Actions

Copy link

Updated by Rajmund Hruška over 3 years ago

Status changed from New to In Progress
Assignee set to Rajmund Hruška
Target version changed from Backlog to 2.9
To be discussed changed from No to Yes

Some values are Null because the report was never sent. This happens when abuse group wants to receive only "extra" reports. To create an "extra" a "summary" report must be created before that and those "summary" reports aren't sent. I haven't seen any empty arrays ({}). Should empty value be stored as empty array or Null?

Reports with values {None} contain in the mail_res column this:

message transmission error (552 5.3.4 Error: message file too big)
Server said: 552 5.3.4 Error: message file too big

The emails with prepended spaces is caused by the same issue as #7215, so that issue should be solved first.

I think we should keep the mail_to column. After adding mail_cc column, this would solve #7257.

Actions

Copy link

Updated by Rajmund Hruška over 3 years ago

Status changed from In Progress to Feedback

Actions

Copy link

Updated by Pavel Kácha over 3 years ago

The first question is - it this information actually accessed/used somewhere after write?

In relation to #7257, do I understand it right that this column bears third type of information - what emails the report actually went to, and that this may diverge from what was in the mail headers?

Actions

Copy link

Updated by Rajmund Hruška over 3 years ago

Pavel Kácha wrote in #note-6:

The first question is - it this information actually accessed/used somewhere after write?

Yes. The values of this this attribute/array are shown in the detail of a report, in the tab 'Metadata', row 'Target email'.

In relation to #7257, do I understand it right that this column bears third type of information - what emails the report actually went to, and that this may diverge from what was in the mail headers?

Correct, it stores actual email addresses that the report went to. So for example, on mentat-alt there is always 'root'. Mail headers change either because of the configuration setting or because the abuse group has option 'report redirection' enabled.

Actions

Copy link

Updated by Pavel Kácha over 3 years ago

So - correct solution after discussion would be:

we should not have two values representing one state, so let's settle on empty array for generated but unsent reports and migrate column to not null (and update null -> [])
repair known wrong values (strip leading/trailing whitespace, 'none', or maybe check nonconforming emails), where we are pretty sure we won't break good data

Of course, depends on what worms are going to crawl from the can, we can stil reconsider.

Actions

Copy link

Updated by Rajmund Hruška over 3 years ago

% Done changed from 0 to 90

Pavel Kácha wrote in #note-8:

So - correct solution after discussion would be:

we should not have two values representing one state, so let's settle on empty array for generated but unsent reports and migrate column to not null (and update null -> [])

repair known wrong values (strip leading/trailing whitespace, 'none', or maybe check nonconforming emails), where we are pretty sure we won't break good data

Of course, depends on what worms are going to crawl from the can, we can stil reconsider.

I tried solving this issue and I came up with following code. I don't feel confident in writing efficient SQL queries, so can someone check this code?

CREATE OR REPLACE FUNCTION trim_array(character varying[])
RETURNS character varying[]
AS
$$
DECLARE
   arr ALIAS FOR $1;
   retVal character varying[];
BEGIN
   FOR I IN array_lower(arr, 1)..array_upper(arr, 1) LOOP
    retVal[I] := trim(arr[I]);
   END LOOP;
RETURN retVal;
END;
$$
LANGUAGE plpgsql 
   STABLE 
RETURNS NULL ON NULL INPUT;

update reports_events set mail_to = ARRAY[]::varchar[] where mail_to is null;
update reports_events set mail_to = trim_array(mail_to);
delete from reports_events where mail_to = '{None}';
alter table reports_events alter column mail_to set not null;

If the code is OK, I will create a migration.

Actions

Copy link

#10

Updated by Pavel Kácha over 3 years ago

Blocks Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emails added

Actions

Copy link

#11

Updated by Pavel Kácha over 3 years ago

Blocks deleted (Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emails)

Actions

Copy link

#12

Updated by Pavel Kácha over 3 years ago

Blocked by Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emails added

Actions

Copy link

#13

Updated by Pavel Kácha over 3 years ago

Added #7215 as blocker, as it's the culprit of the wrong data here. We have to solve that first to prevent new wrong data creeping in here.

Actions

Copy link

#14

Updated by Radko Krkoš over 3 years ago

A review was requested, here are my remarks (in no particular order):

I would use FOREACH for the loop, as follows:
```
   FOREACH element IN ARRAY arr LOOP
    array_append(retVal, trim(element));
   END LOOP;
```
Of course, there is just a syntactic difference. Both do exactly the same thing, so pick by personal taste. (Warning: I did not run that code).
The case of mail_to = '{None}' represents a situation when the report was generated and was intended to be sent, but that failed (as it happens, always due to the message being too big, but possibly due to other causes). I am against flatly deleting such unsuccessfully sent reports. Setting them to empty array (as the NULL ones) further hides the distinction (from those not intended to be sent), but that should be marked in logs if the case occurred recently and I guess is of diminishing usefulness over time. If it is important, then this might need an additional attribute or logic (I am unsure whether the mail_res is displayed in the UI somewhere).
The created function is left lingering. I would add something like this somewhere towards the end:
```
DROP FUNCTION trim_array(character varying[]);
```
The code as it stands alters all rows of the table. There is little that can be done to prevent this. Some sort of {VACUUM FULL, CLUSTER, pg_repack} should be performed at the end, otherwise the table and especially the indices would remain horribly bloated. The table is small (<750MB on production), so it would take a short time. Of course, this could be just documented and left for the DB admin.
I would advise on prefixing the names of functions used only during migrations (or any other such temporary entities) with some obvious identifier ('alembic_' + alembic migration id?). It is a rare case (this is only the second time we needed any sort of custom function for migration), but that lowers the chance of a naming conflict and better shows the purpose if a forgotten entity is later found in the DB.

I should be available this week for any follow-up.

Actions

Copy link

#15

Updated by Rajmund Hruška over 3 years ago

Thank you so much for your review. The updated version can be found at the end of this note.

Radko Krkoš wrote in #note-14:

A review was requested, here are my remarks (in no particular order):

I would use FOREACH for the loop, as follows:
[...]
Of course, there is just a syntactic difference. Both do exactly the same thing, so pick by personal taste. (Warning: I did not run that code).

I tried running the function with this change and I had to change it a bit. array_append returns an array, so I had to reassign the variable. Also, the trim function is returning type text, so I had to cast it to varchar.

The case of mail_to = '{None}' represents a situation when the report was generated and was intended to be sent, but that failed (as it happens, always due to the message being too big, but possibly due to other causes). I am against flatly deleting such unsuccessfully sent reports. Setting them to empty array (as the NULL ones) further hides the distinction (from those not intended to be sent), but that should be marked in logs if the case occurred recently and I guess is of diminishing usefulness over time. If it is important, then this might need an additional attribute or logic (I am unsure whether the mail_res is displayed in the UI somewhere).

Instead of deleting those reports, I changed the mail_to to empty array. Those reports are from 2017, so I don't think they are useful anymore.

The created function is left lingering. I would add something like this somewhere towards the end:
[...]

I would advise on prefixing the names of functions used only during migrations (or any other such temporary entities) with some obvious identifier ('alembic_' + alembic migration id?). It is a rare case (this is only the second time we needed any sort of custom function for migration), but that lowers the chance of a naming conflict and better shows the purpose if a forgotten entity is later found in the DB.

I renamed the function and added the drop statement at the end.

The code as it stands alters all rows of the table. There is little that can be done to prevent this. Some sort of {VACUUM FULL, CLUSTER, pg_repack} should be performed at the end, otherwise the table and especially the indices would remain horribly bloated. The table is small (<750MB on production), so it would take a short time. Of course, this could be just documented and left for the DB admin.

I tried running CLUSTER VERBOSE reports_events on my local machine with 1010 reports and it seems like not much has happened. I will add it to the migration though (without the verbose option). Is it also important in downgrade? The downgrade will only remove the NOT NULL constraint.

Here is the updated version:

CREATE OR REPLACE FUNCTION alembic_4a172cd00ef0_trim_array(character varying[])
RETURNS character varying[]
AS
$$
DECLARE
   arr ALIAS FOR $1;
   element character varying;
   retVal character varying[];
BEGIN
   FOREACH element IN ARRAY arr
   LOOP
    retVal := array_append(retVal, trim(element)::varchar);
   END LOOP;
RETURN retVal;
END;    
$$
LANGUAGE plpgsql 
   STABLE 
RETURNS NULL ON NULL INPUT;

update reports_events set mail_to = ARRAY[]::varchar[] where mail_to is null or mail_to = '{None}';
update reports_events set mail_to = alembic_4a172cd00ef0_trim_array(mail_to);
alter table reports_events alter column mail_to set not null;
DROP FUNCTION alembic_4a172cd00ef0_trim_array(character varying[]);
CLUSTER reports_events;

It is also worth mentioning that because this task will be merged before #6227 which has some migrations, Alembic merge will be needed at some point.

Actions

Copy link

#16

Updated by Radko Krkoš over 3 years ago

Rajmund Hruska wrote in #note-15:

Thank you so much for your review. The updated version can be found at the end of this note.

No problem.

Instead of deleting those reports, I changed the mail_to to empty array. Those reports are from 2017, so I don't think they are useful anymore.

Yes, they are probably not, but we keep full history there yet and punching holes is against that.

I tried running CLUSTER VERBOSE reports_events on my local machine with 1010 reports and it seems like not much has happened.

The whole table and its indices was rewritten from scratch. On such a small table, it is a quick operation. There are definitely differences in table and index sizes, albeit small. It is also much more efficient to do any operation on the table.

I will add it to the migration though (without the verbose option).

We did not do that before. This should better be discussed before taking action.

Is it also important in downgrade? The downgrade will only remove the NOT NULL constraint.

No, as no modification to table data is done in that case, only a change in table metadata. The removal of NOT NULL constraint is trivial, does not even need a scan (the data is logically guaranteed to be a subset of the new 'constraints').

Here is the updated version:
[...]

Looks good to me.

It is also worth mentioning that because this task will be merged before #6227 which has some migrations, Alembic merge will be needed at some point.

Well, if I am not missing something, the migration from #6227 can still be rebased on this one by changing the down_revision attribute (plus the comments). It is not like that one was deployed anywhere. OK, except for your development environment, but that should be fixable locally and not blocking.
Even this one might need such a 'rebase' as it is blocked by #7215, which might need a migration of its own.

Actions

Copy link

#17

Updated by Radko Krkoš over 3 years ago

Instead of deleting those reports, I changed the mail_to to empty array. Those reports are from 2017, so I don't think they are useful anymore.

Is the mail_res displayed somewhere in the report view? If not, maybe it should be added.

Actions

Copy link

#18

Updated by Rajmund Hruška over 3 years ago

Radko Krkoš wrote in #note-17:

Instead of deleting those reports, I changed the mail_to to empty array. Those reports are from 2017, so I don't think they are useful anymore.

Is the mail_res displayed somewhere in the report view? If not, maybe it should be added.

mail_res contains basically the same information as mail_to. Moreover, the reports in the last 3 years have this column empty. Even in the current code this attribute is never assigned nor used, so it seems like it was deleted from the code but not from the database.

Actions

Copy link

#19

Updated by Radko Krkoš over 3 years ago

Rajmund Hruska wrote in #note-18:

Radko Krkoš wrote in #note-17:

Is the mail_res displayed somewhere in the report view? If not, maybe it should be added.

mail_res contains basically the same information as mail_to. Moreover, the reports in the last 3 years have this column empty. Even in the current code this attribute is never assigned nor used, so it seems like it was deleted from the code but not from the database.

OK, then I retract that.

Actions

Copy link

#20

Updated by Rajmund Hruška over 3 years ago

I made the Alembic migration in 096d69c4. As adding CLUSTER reports_events into the migration file is still to be discussed, I will keep the 'To be discussed' flag set to 'Yes'.

Actions

Copy link

#21

Updated by Pavel Kácha over 3 years ago

To be discussed deleted (~~Yes~~)

Rajmund Hruska wrote in #note-20:

I made the Alembic migration in 096d69c4. As adding CLUSTER reports_events into the migration file is still to be discussed, I will keep the 'To be discussed' flag set to 'Yes'.

Depends on decision on #7215.

Actions

Copy link

#22

Updated by Rajmund Hruška about 3 years ago

Status changed from Feedback to In Review
% Done changed from 90 to 100

Merged into devel.

Actions

Copy link

#23

Updated by Pavel Kácha about 3 years ago

Radko Krkoš wrote in #note-16:

Rajmund Hruska wrote in #note-15:

I tried running CLUSTER VERBOSE reports_events on my local machine with 1010 reports and it seems like not much has happened.

The whole table and its indices was rewritten from scratch. On such a small table, it is a quick operation. There are definitely differences in table and index sizes, albeit small. It is also much more efficient to do any operation on the table.

I will add it to the migration though (without the verbose option).

We did not do that before. This should better be discussed before taking action.

I think we have already discussed this and settled against adding VACUUM/CLUSTER/ANALYZE into migrations (see #7215), also #7538 started from that. We don't have the documentation cleaned up, however if we do not do CLUSTER here - well reporting table is not small, but also not critical, and nothing much happens without optimizing it. So I'd put the notice into the next release notes and call it a day.

Actions

Copy link

#24

Updated by Pavel Kácha almost 3 years ago

Status changed from In Review to Closed

Actions

Copy link

Also available in: Atom PDF

Project

General

Profile

Mentat

Custom queries

Bug #7220

Erroneous values in mentat_main/reports_events/mail_to

Updated by Radko Krkoš over 3 years ago

Updated by Pavel Kácha over 3 years ago

Updated by Radko Krkoš over 3 years ago

Updated by Rajmund Hruška over 3 years ago

Updated by Rajmund Hruška over 3 years ago

Updated by Pavel Kácha over 3 years ago

Updated by Rajmund Hruška over 3 years ago

Updated by Pavel Kácha over 3 years ago

Updated by Rajmund Hruška over 3 years ago

Updated by Pavel Kácha over 3 years ago

Updated by Pavel Kácha over 3 years ago

Updated by Pavel Kácha over 3 years ago

Updated by Pavel Kácha over 3 years ago

Updated by Radko Krkoš over 3 years ago

Updated by Rajmund Hruška over 3 years ago

Updated by Radko Krkoš over 3 years ago

Updated by Radko Krkoš over 3 years ago

Updated by Rajmund Hruška over 3 years ago

Updated by Radko Krkoš over 3 years ago

Updated by Rajmund Hruška over 3 years ago

Updated by Pavel Kácha over 3 years ago

Updated by Rajmund Hruška about 3 years ago

Updated by Pavel Kácha about 3 years ago

Updated by Pavel Kácha almost 3 years ago