Bug #7593
closedCase sensitive identifiers for users and groups cause problems
100%
Description
Email-like identifiers apparently can come in inconsistent case from source. We had one cause of user identifier (those are generated from EPPN) causing creation of two user entries (with varying case), and at least one cause of group identifier (those are generated from email) of inconsistent case in hostname part.
As for EPPN case sensitivity, we're not sure whether it should be enforced, however data seems to support this.
As for email - hostname part is insensitive, username part (according to spec) is not, however in reality is treated as such.
We have decided to go for insensitivity within Mentat identifiers, or more precisely, for enforced lower case.
This would need lowercasing on insert and migration to treat existent data accordingly.
Updated by Rajmund Hruška over 2 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Instead of validating login and email from every form, I decided to convert the values to lowercase in the database model using validates from sqlalchemy.orm
. I also wrote a migration for existing data so it should be solved.
Updated by Rajmund Hruška over 2 years ago
- Status changed from Resolved to In Progress
- % Done changed from 100 to 70
So, as we discussed during the meeting on June 22, I didn't consider the case when there are two users who have the same login when in lowercase (e.g. login123 and LOGIN123). The proposed way of solving this was to add some suffix in order to maintain UNIQUE constraint and disable the user.
I am trying to come up with an SQL query, which adds the suffix to users who don't have unique login when in lowercase. So far I can only identify users who don't have login in lowercase and add random suffix. Here is my code (I feel like I will need to join the tables for the final query, even though this query can be constructed without JOIN):
SELECT concat(u1.login, floor(random() * 10000 + 1)) FROM users AS u1 JOIN users AS u2 ON u1.id = u2.id WHERE lower(u1.login) <> u1.login;
Updated by Rajmund Hruška over 2 years ago
Rajmund Hruska wrote in #note-2:
So, as we discussed during the meeting on June 22, I didn't consider the case when there are two users who have the same login when in lowercase (e.g. login123 and LOGIN123). The proposed way of solving this was to add some suffix in order to maintain UNIQUE constraint and disable the user.
I am trying to come up with an SQL query, which adds the suffix to users who don't have unique login when in lowercase. So far I can only identify users who don't have login in lowercase and add random suffix. Here is my code (I feel like I will need to join the tables for the final query, even though this query can be constructed without JOIN):
[...]
I have the query now:
SELECT concat(u1.login, floor(random() * 10000 + 1)) FROM users u1 WHERE lower(u1.login) <> u1.login AND EXISTS (SELECT u2.login FROM users u2 WHERE lower(u1.login) = u2.login);
Turns out we have the case which was described earlier in our database: https://mentat-hub.cesnet.cz/mentat/users/45/show and https://mentat-hub.cesnet.cz/mentat/users/89/show.
Updated by Rajmund Hruška over 2 years ago
- Status changed from In Progress to Feedback
- % Done changed from 70 to 100
Should be fixed now in 92e95adf. Is my solution OK?
Updated by Radko Krkoš over 2 years ago
- To be discussed changed from No to Yes
Updated by Rajmund Hruška over 2 years ago
Based on today's meeting, I changed the migration so that all users with conflicting logins will get disabled and their login will change. Radko Krkoš could you look at this query, please? Is it OK? I tried the query on our data and it seems fine.
UPDATE users u1 SET login = concat(u1.login, '_', floor(random() * 10000 + 1), '_CONFLICT_CASE'), enabled = FALSE FROM (SELECT lower(login) login FROM users GROUP BY lower(login) HAVING count(*) > 1) u2 where lower(u1.login) = u2.login;
Updated by Radko Krkoš over 2 years ago
- To be discussed changed from Yes to No
Rajmund Hruska wrote in #note-6:
Based on today's meeting, I changed the migration so that all users with conflicting logins will get disabled and their login will change. Radko Krkoš could you look at this query, please? Is it OK? I tried the query on our data and it seems fine.
[...]
If it was tested, it is fine by me. I see no obvious problems. I would make the last WHERE
uppercase to be consistent. Also I like AS
, but that is purely preferential, so feel free to ignore that.
Functionally, the uppercase '_CONFLICT_CASE' does not match today's conclusion of enforcing lowercase as soon as possible; and would definitely conflict with the discussed constraint `CHECK (login = lower(login))`.
Updated by Rajmund Hruška over 2 years ago
- Status changed from Feedback to In Review
Merged into devel and deployed on mentat-alt.
Updated by Rajmund Hruška over 2 years ago
- Status changed from In Review to In Progress
- % Done changed from 100 to 90
- To be discussed changed from No to Yes
I forgot to add CHECK constraint.
Updated by Rajmund Hruška over 2 years ago
- Status changed from In Progress to Resolved
- % Done changed from 90 to 100
Updated by Radko Krkoš over 2 years ago
Rajmund Hruska wrote in #note-10:
Rajmund Hruska wrote in #note-9:
I forgot to add CHECK constraint.
Fixed in b4dc04de and tested locally. If no one is opposed I will merge it to devel.
I suggest altering the naming. A name such as chk_login
is not descriptive of what the CHECK
actually does. I suggest login_lowercase
or chk_login_lowercase
if you prefer. Also, I suggest to think of it as a CONSTRAINT
, of which CHECK
is just one variant out of six (For details: https://www.postgresql.org/docs/14/ddl-constraints.html).
Updated by Rajmund Hruška over 2 years ago
- To be discussed deleted (
Yes)
Radko Krkoš wrote in #note-11:
Rajmund Hruska wrote in #note-10:
Rajmund Hruska wrote in #note-9:
I forgot to add CHECK constraint.
Fixed in b4dc04de and tested locally. If no one is opposed I will merge it to devel.
I suggest altering the naming. A name such as
chk_login
is not descriptive of what theCHECK
actually does. I suggestlogin_lowercase
orchk_login_lowercase
if you prefer. Also, I suggest to think of it as aCONSTRAINT
, of whichCHECK
is just one variant out of six (For details: https://www.postgresql.org/docs/14/ddl-constraints.html).
Thank you, I renamed the constraints.
Updated by Rajmund Hruška over 2 years ago
- Status changed from Resolved to In Review
Deployed on mentat-alt. We still have all data and the constraints exist so hopefully it's all good.
Updated by Pavel Kácha over 2 years ago
- Status changed from In Review to Closed