Project

General

Profile

Actions

Feature #7221

open

Referential integrity for database arrays

Added by Radko Krkoš almost 3 years ago. Updated over 2 years ago.

Status:
New
Priority:
Low
Assignee:
Category:
Research and analysis
Target version:
Start date:
04/16/2021
Due date:
% Done:

0%

Estimated time:
To be discussed:
No

Description

A classic referential integrity model in relational databases is based on FOREIGN KEYs, implemented as REFERENCES in the PostgreSQL DB. This feature has several advantages, preventing irrecoverable errors in data and allowing CASCADE of data modification procedures.
The database system used currently does not support references for array values, which are used commonly in the Mentat database model and missing this feature caused several inconsistencies (albeit with minimum impact so far).
A workaround at the database level might be a using a TRIGGER, checking the data during all modifications. There are two paths that must be analyzed: INSERT/UPDATE to the referring array and INSERT/UPDATE/DELETE to the KEY column/table. Both should be doable using the RIGGER approach.
Things to consider:
- implementation feasibility,
- performance impact in all scenarios (read should not be affected, others must be analyzed),
- parametrisability/reusability (there are several such reference relations, with more to come).


Related issues

Related to Mentat - Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emailsClosedRajmund Hruška04/15/2021

Actions
Related to Mentat - Bug #7220: Erroneous values in mentat_main/reports_events/mail_toClosedRajmund Hruška04/16/2021

Actions
Actions #1

Updated by Radko Krkoš almost 3 years ago

  • Related to Feature #7052: Link report to each group which owns it added
Actions #2

Updated by Radko Krkoš almost 3 years ago

  • Related to Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emails added
Actions #3

Updated by Radko Krkoš almost 3 years ago

  • Related to Bug #7220: Erroneous values in mentat_main/reports_events/mail_to added
Actions #4

Updated by Radko Krkoš almost 3 years ago

  • Related to deleted (Feature #7052: Link report to each group which owns it)
Actions #5

Updated by Radko Krkoš almost 3 years ago

Removing relation to #7052, as it was dealt with differently, based on relational model.

Actions #6

Updated by Rajmund Hruška over 2 years ago

Is this really related to #7215 and #7220? In both issues the column in question has type varchar[] and the data are unique.

Actions #7

Updated by Radko Krkoš over 2 years ago

Rajmund Hruska wrote in #note-6:

Is this really related to #7215 and #7220? In both issues the column in question has type varchar[] and the data are unique.

This issue is about referential integrity of values in an array, effectively that an array can only contain values from some set.
In case of #7215 and #7220, the values in mail_to must exist in some set of all abuse contacts.
There is a complication here as the set of all abuse contacts is not represented very well in the database. It can be currently obtained by:


SELECT DISTINCT trim(unnest(string_to_array(unnest(emails),','))) FROM settings_reporting;

Which is horrible. It will be a bit easier after #7215 is fixed, fewer nested functions, but still not very usable.

To sum it up, this issue is related to #7215 and #7220 in the sense, that it would have prevented them. Also, it is not related in the sense that it does not need to be solved, for those issues to be fixed. Take your pick.

Does that answer your question?

Actions #8

Updated by Rajmund Hruška over 2 years ago

Data in column mail_to can be a subset of all emails from column emails. But that's not always true. An abuse group can change its emails and it that case the previous reports of that group will have original values, which is correct. That's the reason why mail_to exists. And in that case I don't see how referential integrity can be enforced.

Actions #9

Updated by Radko Krkoš over 2 years ago

Rajmund Hruska wrote in #note-8:

Data in column mail_to can be a subset of all emails from column emails. But that's not always true. An abuse group can change its emails and it that case the previous reports of that group will have original values, which is correct. That's the reason why mail_to exists. And in that case I don't see how referential integrity can be enforced.

Yes, in that case it cannot. An argument can be made that that's a bad design. While a relational database should not be used as simple object storage, it certainly can be used that way. In other words, if we don't care about referential integrity, it might not be possible to and definitely does not make sense to try to enforce it.

To go back to your original question, this issue stemmed from #7215 and #7220 as a possible solution, the one most compatible with the general idea of relational databases. It is however certainly not the only solution and it might not be a viable solution, if the requirements are not met (there is no set of possible values defined). That is why this issue is categorized as "Research and analysis" and prioritized as "low". As for the relation status, I do not have a strong opinion against removing it. Please, feel free to do so. Of course then this might be a void issue, not worth spending effort on. I do not think we have other cases in Mentat, where an array referential integrity might be helpful.

Actions

Also available in: Atom PDF