+++
title = '2023-09-27 Signup Failures Postmortem'
date = 2023-09-28
+++
2023-09-27 Signup Failures Postmortem
Key people: Brendan (silver).
What happened
During the signup event for Skynet, new users hit a snag: they weren't receiving emails.
These emails are used to verify that they own the address that is on Wolves.
This allows us to link the accounts together.
What was done during outage
The first action was to try to remote into the impacted server (kitt.skynet.ie) and see what the status of the data update command was.
This was hampered by ssh: could not resolve hostname kitt.skynet.ie: Name or service not known.
I tried to remote into vendetta.skynet.ie (ns1) and got the same error.
The next action was to go into the server room and reboot vendetta to see if we could get the hostname issue fixed.
Restarted, it came up, went back into Room 3.
Same issue, though now I was certain the DNS was working.
My laptop had decided to disconnect from eduroam and refused to reconnect.
Rebooted it, which took quite a while since it was trying to mount a network drive that was not accessible.
Not a great way to spend 1:30 under time pressure.
When it booted up it connected to eduroam without issues.
I was able to sftp into the server and pull the database file.
Confirmed that the csv import was failing, even on a freshly generated database (sqlite databases can be regenerated easily).
I tried to rebuild it locally, but the dev environment on my laptop was not set up correctly, which hampered my efforts.
Soon enough I cut my losses and continued with the remainder of the presentation, although without the interactive elements that were planned.
What was the root cause
At home with my normal dev env I was able to properly investigate.
The issue was a database schema I had planned to use as an interim one between the current method of using the CSV export and the future method of using the Wolves API.
The plan was to have fields for id_member, which changes yearly (csv), and id_wolves, which would identify the user (does not change).
Combining the two in one primary key led to insertion errors, so no rows were being added or updated.
Table schema as it was during the signup event:
CREATE TABLE IF NOT EXISTS accounts_wolves (
id_wolves text DEFAULT '',
id_member text DEFAULT '',
id_student text,
email text not null,
expiry text not null,
name_first text,
name_second text,
PRIMARY KEY (id_wolves, id_member)
)
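The post does not record the exact statement the CSV import ran, so the following is only a sketch of one plausible way a composite key like this can make every insert fail. It assumes a hypothetical upsert keyed on id_member alone; SQLite rejects an `ON CONFLICT` target that does not match a PRIMARY KEY or UNIQUE constraint exactly, so nothing is written. The sample row is made up.

```python
import sqlite3

# Schema copied from the post: composite primary key on (id_wolves, id_member).
SCHEMA = """
CREATE TABLE IF NOT EXISTS accounts_wolves (
    id_wolves text DEFAULT '',
    id_member text DEFAULT '',
    id_student text,
    email text not null,
    expiry text not null,
    name_first text,
    name_second text,
    PRIMARY KEY (id_wolves, id_member)
)
"""

# ASSUMPTION: a hypothetical import statement, not the one from the real importer.
# It targets id_member on its own, which has no unique constraint of its own.
UPSERT = """
INSERT INTO accounts_wolves (id_member, email, expiry)
VALUES (?, ?, ?)
ON CONFLICT(id_member) DO UPDATE SET
    email = excluded.email,
    expiry = excluded.expiry
"""


def try_import(rows):
    """Run the upsert on a fresh in-memory database.

    Returns (number of rows in the table afterwards, error message or None).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    error = None
    try:
        conn.executemany(UPSERT, rows)
        conn.commit()
    except sqlite3.Error as exc:
        error = str(exc)
    count = conn.execute("SELECT COUNT(*) FROM accounts_wolves").fetchone()[0]
    conn.close()
    return count, error


# Made-up sample row: (id_member, email, expiry).
count, error = try_import([("12345", "user@example.com", "2024-08-31")])
print(count, error)
```

Whatever the real statement was, the symptom would look the same: the import raises an error and the table stays empty, even on a freshly generated database.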
What was the solution
The main solution was to simplify the schema so there is only one primary key column.
Like so:
CREATE TABLE IF NOT EXISTS accounts_wolves (
id_wolves text PRIMARY KEY,
id_student text,
email text NOT NULL,
expiry text NOT NULL,
name_first text,
name_second text
)
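As a rough check of the simplified schema, the sketch below runs an upsert keyed on id_wolves against an in-memory database, first adding a row and then updating it. The upsert statement, the `import_rows` helper and the sample values are assumptions for illustration, not the project's actual import code.

```python
import sqlite3

# Simplified schema from the post: id_wolves is the only primary key column.
SCHEMA = """
CREATE TABLE IF NOT EXISTS accounts_wolves (
    id_wolves text PRIMARY KEY,
    id_student text,
    email text NOT NULL,
    expiry text NOT NULL,
    name_first text,
    name_second text
)
"""

# ASSUMPTION: hypothetical upsert; the conflict target now matches the primary key.
UPSERT = """
INSERT INTO accounts_wolves
    (id_wolves, id_student, email, expiry, name_first, name_second)
VALUES (?, ?, ?, ?, ?, ?)
ON CONFLICT(id_wolves) DO UPDATE SET
    id_student = excluded.id_student,
    email = excluded.email,
    expiry = excluded.expiry,
    name_first = excluded.name_first,
    name_second = excluded.name_second
"""


def import_rows(conn, rows):
    """Insert new members and update existing ones, keyed on id_wolves."""
    conn.executemany(UPSERT, rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# First import adds the member; the second updates the same member in place.
import_rows(conn, [("w1", "s1", "old@example.com", "2024-08-31", "Ada", "Lovelace")])
import_rows(conn, [("w1", "s1", "new@example.com", "2025-08-31", "Ada", "Lovelace")])

rows = conn.execute(
    "SELECT id_wolves, email, expiry FROM accounts_wolves"
).fetchall()
print(rows)
```

Running something like this against a fresh database before an event would also cover the "verify/test it works" point, since it exercises both the add and the update path.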
This also makes it easier to do the migration when the Wolves API releases.
Things to improve for the future.
- Verify/test it works before a big event.
- If it's new enough, have a dev env on hand.
- It's not //always// DNS (probably is though)
- Network connectivity is a good first step to test.
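The last two points could be turned into a quick check to run before reaching for a reboot. This is a minimal sketch using only Python's standard `socket` module; the function names and the default of port 22 (ssh) are assumptions made for the example.

```python
import socket


def can_resolve(hostname):
    """True if the local resolver returns at least one address for hostname."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        return False


def can_connect(hostname, port=22, timeout=3.0):
    """True if a TCP connection to hostname:port opens within the timeout."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(hostname, port=22):
    """Return "dns", "tcp" or "ok" depending on the first step that fails."""
    if not can_resolve(hostname):
        return "dns"
    if not can_connect(hostname, port):
        return "tcp"
    return "ok"


if __name__ == "__main__":
    # Hosts named in the postmortem.
    for host in ("kitt.skynet.ie", "vendetta.skynet.ie"):
        print(host, diagnose(host))
```

A "dns" result only means the lookup failed from this machine. As in this incident, that can just as easily be the laptop having dropped off the network as the name server being down, so it is worth checking the local connection before touching the server.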