feat: postmortem on why users were not able to signup yesterday.
This commit is contained in:
parent
36a0108a2c
commit
53d5da9a56
8 changed files with 402 additions and 3 deletions
86
src/postmortem/2023-09-27_Signup-failures.md
Normal file
@@ -0,0 +1,86 @@
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

title = '2023-09-27 Signup Failures Postmortem'
date = 2023-09-28

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

# 2023-09-27 Signup Failures Postmortem

Key people: Brendan (silver).

## What happened
During the signup event for Skynet, new users hit a snag: they weren't receiving emails.
These emails are used to verify that they own the email address that is on Wolves.
This allows us to link the accounts together.

## What was done during the outage

The first action was to try to remote into the impacted server (``kitt.skynet.ie``) and check the status of the data update command.
This was hampered by ``ssh: could not resolve hostname kitt.skynet.ie: Name or service not known``.
I tried to remote into ``vendetta.skynet.ie`` (ns1) and got the same error.

The next action was to go into the server room and reboot vendetta to see if that would fix the hostname issue.
I restarted it, it came up, and I went back to Room 3.

Same issue, though now I was certain the DNS was working.
My laptop had decided to disconnect from eduroam and refused to reconnect.
I rebooted it, which took quite a while since it was trying to mount a network drive that was not accessible,
not a great way to spend 1:30 under time pressure.
When it booted up it connected to eduroam without issues.

I was able to sftp into the server and pull the database file.
I confirmed that the CSV import was failing, even on a freshly generated database (SQLite databases can be regenerated easily).
I tried to rebuild it locally, but the dev environment on my laptop was not set up correctly, which hampered my efforts.

Soon enough I cut my losses and continued with the remainder of the presentation,
although without the interactive elements that were planned.

## What was the root cause

At home, with my normal dev env, I was able to investigate properly.
The issue was a database schema I had planned to use as an interim one between the current method of using the CSV export and the future method of using the Wolves API.

The plan was to have fields for ``id_member``, which changes yearly (CSV), and ``id_wolves``, which would identify the user (does not change).
Combining the two in the primary key led to insertion errors, so no rows were added or updated.

Table schema as it was during the signup event:
```sql
CREATE TABLE IF NOT EXISTS accounts_wolves (
    id_wolves text DEFAULT '',
    id_member text DEFAULT '',
    id_student text,
    email text not null,
    expiry text not null,
    name_first text,
    name_second text,
    PRIMARY KEY (id_wolves, id_member)
)
```

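The exact statement that failed is not recorded in this postmortem, so the following is only a sketch of one way a composite key like this can break an import. It assumes a hypothetical upsert that targets ``id_member`` on its own; the upsert text, the sample row and the helper name ``try_upsert`` are illustrative and not taken from the backend.

```python
import sqlite3

# Schema copied from the section above (the version live during the event).
OLD_SCHEMA = """
CREATE TABLE IF NOT EXISTS accounts_wolves (
    id_wolves text DEFAULT '',
    id_member text DEFAULT '',
    id_student text,
    email text not null,
    expiry text not null,
    name_first text,
    name_second text,
    PRIMARY KEY (id_wolves, id_member)
)
"""

# Hypothetical upsert, NOT the real backend query: it names id_member alone
# as the conflict target, but only the (id_wolves, id_member) pair is unique.
HYPOTHETICAL_UPSERT = """
INSERT INTO accounts_wolves (id_member, email, expiry)
VALUES (?, ?, ?)
ON CONFLICT (id_member) DO UPDATE SET
    email = excluded.email,
    expiry = excluded.expiry
"""


def try_upsert(schema, upsert, row):
    """Return None on success, or the SQLite error message on failure."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(schema)
        conn.execute(upsert, row)
        conn.commit()
        return None
    except sqlite3.Error as err:
        return str(err)
    finally:
        conn.close()


# Made-up sample row, for illustration only.
error = try_upsert(OLD_SCHEMA, HYPOTHETICAL_UPSERT,
                   ("12345", "user@example.com", "2024-09-01"))
print(error)
```

Under these assumptions SQLite rejects the statement before any row is written, because the conflict target does not match a primary key or unique constraint. That is at least consistent with an import that neither adds nor updates rows.
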
## What was the solution
The main solution was to simplify the schema so that there is only one primary key column.
Like so:

```sql
CREATE TABLE IF NOT EXISTS accounts_wolves (
    id_wolves text PRIMARY KEY,
    id_student text,
    email text NOT NULL,
    expiry text NOT NULL,
    name_first text,
    name_second text
)
```
This also makes it easier to do the migration when the Wolves API is released.

[Patch that fixed it](https://gitlab.skynet.ie/compsoc1/skynet/ldap/backend/-/commit/9db8a238d2bf7be8bcfa86012b26180c041c13d1)
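
As a rough illustration of why a single primary key is easier to work with, here is a minimal sketch that loads a small CSV into the simplified schema twice and checks that the second import updates rows instead of duplicating them. The CSV column names, the sample rows and the upsert statement are assumptions made for this example; the real export format and backend query may differ.

```python
import csv
import io
import sqlite3

# Simplified schema from the section above.
NEW_SCHEMA = """
CREATE TABLE IF NOT EXISTS accounts_wolves (
    id_wolves text PRIMARY KEY,
    id_student text,
    email text NOT NULL,
    expiry text NOT NULL,
    name_first text,
    name_second text
)
"""

# Assumed upsert: with one primary key column the conflict target is unambiguous.
UPSERT = """
INSERT INTO accounts_wolves
    (id_wolves, id_student, email, expiry, name_first, name_second)
VALUES
    (:id_wolves, :id_student, :email, :expiry, :name_first, :name_second)
ON CONFLICT (id_wolves) DO UPDATE SET
    id_student = excluded.id_student,
    email = excluded.email,
    expiry = excluded.expiry,
    name_first = excluded.name_first,
    name_second = excluded.name_second
"""

# Made-up sample data; the header names are hypothetical.
SAMPLE_CSV = """id_wolves,id_student,email,expiry,name_first,name_second
1001,20000001,ada@example.com,2024-09-01,Ada,Lovelace
1002,20000002,alan@example.com,2024-09-01,Alan,Turing
"""


def import_csv(conn, csv_text):
    """Upsert every CSV row and return the resulting row count."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, rows)
    return conn.execute("SELECT COUNT(*) FROM accounts_wolves").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(NEW_SCHEMA)
first = import_csv(conn, SAMPLE_CSV)
second = import_csv(conn, SAMPLE_CSV)  # re-import should update, not duplicate
print(first, second)
```

A check along these lines, run against an in-memory database, could double as a quick smoke test before an event.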


## Things to improve for the future

* Verify/test that it works before a big event.
* If it's new enough, have a dev env on hand.
* It's not *always* DNS (probably is though).
* Network connectivity is a good first step to test.
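
For the connectivity point, a small check like the sketch below can tell apart "the name does not resolve" from "the name resolves but the port is unreachable". The function name ``diagnose`` and its return labels are made up for this example, and the default port of 22 is only an assumption based on the ssh attempts described earlier.

```python
import socket


def diagnose(hostname, port=22, timeout=3.0):
    """Return a short label describing where a connection attempt fails.

    "dns-failed"     - the name could not be resolved
    "connect-failed" - the name resolved, but no address accepted a TCP connection
    "ok"             - a TCP connection was established
    """
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failed"

    # Try each resolved address (IPv4 and IPv6) until one connects.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            continue
    return "connect-failed"
```

Calling ``diagnose("kitt.skynet.ie")`` and then the same function against a well-known external host gives a quick comparison: if both come back as ``"dns-failed"``, the local network connection is a more likely suspect than the nameserver.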