EzProxy Logs : Process

Coming from a digital humanities background, I wholeheartedly agree that we have privacy responsibilities in an era that values people for their surplus data and how it can be on-sold for unethical commercial ends. Wearing my day-job hat, I also agree with taking appropriate action to ensure that library subscriptions do not breach publishers’ conditions. Given the amount of email phishing and social engineering deployed to obtain people’s account details, we need a middle ground that protects users from profiling while retaining enough detail to identify which accounts have been compromised and are being used to data-mine publishers’ offerings.

After seeing some odd behaviour on our EzProxy this year, I worked with our EzProxy specialist and wrote a series of Python scripts (which I’ll package up on GitHub for sharing over the next week). With these scripts, my basic methodology is to download the previous day’s audit and main logs. These are then parsed by a script that orders users by the number of EzProxy sessions they created that day. Generally, I consider around 10 or fewer sessions per day (allowing for multiple devices) to be normal use of EzProxy. Anything above that, or anything that stands out from the crowd (say 35 or 45 sessions), always warrants further investigation.
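The ranking step can be sketched roughly as follows. This is a minimal illustration rather than the actual script: it assumes the audit log has already been parsed into `(username, session_id)` pairs, and the function name `rank_users_by_sessions` and the `normal_max` parameter are hypothetical names chosen for this example.

```python
from collections import defaultdict


def rank_users_by_sessions(events, normal_max=10):
    """Rank users by how many distinct sessions they created.

    events: iterable of (username, session_id) pairs (assumed to be
            extracted from the previous day's audit log).
    normal_max: sessions per day treated as normal use (10 in the text).
    Returns (ranked, flagged), both lists of (username, session_count).
    """
    sessions = defaultdict(set)
    for username, session_id in events:
        # A set ignores repeated lines that belong to the same session.
        sessions[username].add(session_id)

    # Highest session count first; ties broken alphabetically.
    ranked = sorted(
        ((user, len(ids)) for user, ids in sessions.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
    flagged = [(user, count) for user, count in ranked if count > normal_max]
    return ranked, flagged
```

Anyone in `flagged` is only a candidate for a closer look; the threshold is a rule of thumb, not proof of misuse.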

If there is a user like this in the ordered results, I move on to the next script, which parses their behaviour in broad brushstrokes: which top-level platforms they were accessing (e.g., ProQuest, Springer), how much they downloaded in megabytes, whether there were any referring websites (like pubmed007 or 2447, for example), and whether they were accessing from multiple national or international regions. This last point can be a tad tricky given the uptake of VPNs, but it is often a very good indicator of suspicious behaviour if, for example, you have a user with lots of sessions accessing EzProxy from both the US and China.
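A broad-brush summary of this kind might look something like the sketch below. It assumes each of the user's requests has already been reduced to a dictionary with the hypothetical keys `host`, `bytes`, `referrer` and `country`; how those values are pulled out of the main log (and how an IP address is mapped to a country) depends on your own log format and is not shown.

```python
from collections import Counter


def summarise_user(requests):
    """Summarise one user's day of activity.

    requests: list of dicts with the (assumed) keys
              'host', 'bytes', 'referrer' and 'country'.
    """
    # Requests per platform host, most frequent first.
    platforms = Counter(r["host"] for r in requests).most_common()

    # Total download volume in megabytes (10**6 bytes).
    total_mb = sum(r["bytes"] for r in requests) / 1_000_000

    # Distinct referring sites; '' and '-' are treated as "no referrer".
    referrers = sorted(
        {r["referrer"] for r in requests if r["referrer"] not in ("", "-")}
    )

    # Distinct regions the requests came from.
    countries = sorted({r["country"] for r in requests})

    return {
        "platforms": platforms,
        "total_mb": total_mb,
        "referrers": referrers,
        "countries": countries,
    }
```

Keeping the summary at this level (hosts, totals, regions) gives enough to spot a compromised account without building a detailed reading profile of the user.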

If they were referred to EzProxy via a dodgy website, they are reported to our digital security IT team, who reset the password and contact the student, and the website is added to our blacklist in the EzProxy configuration file. If they show unusual downloading behaviour that seems practically impossible unless it was automated by a bot (for example, gigabytes of data across all hours of the day), the student is reported to the digital security team. If the student’s access originates from multiple international time zones, the same is done. The important point is that our digital security IT team cannot assist the student under the university’s acceptable-use conditions for our resources without the student’s username. After we’ve asked IT to contact the student and help them reset their password, we delete the day’s logs and reset the process.
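These three checks could be expressed as a small rule function along the following lines. This is only a sketch under stated assumptions: the summary dictionary and its keys (`referrers`, `total_mb`, `active_hours`, `countries`) are hypothetical, and the default thresholds are placeholders for illustration, not the values we actually use.

```python
def reasons_to_report(summary, blocked_referrers,
                      max_mb=1000.0, max_active_hours=20, max_countries=1):
    """Return a list of reasons to report a user (empty if none apply).

    summary: dict with the (assumed) keys
        'referrers'    - referring sites seen for this user
        'total_mb'     - megabytes downloaded during the day
        'active_hours' - number of distinct hours of the day with activity
        'countries'    - regions the user's requests came from
    blocked_referrers: set of referring sites already known to be dodgy.
    The threshold defaults are illustrative placeholders only.
    """
    reasons = []

    # Rule 1: the user arrived via a known dodgy website.
    bad = sorted(set(summary["referrers"]) & set(blocked_referrers))
    if bad:
        reasons.append("suspicious referrer: " + ", ".join(bad))

    # Rule 2: volume and spread of downloads that look automated.
    if (summary["total_mb"] > max_mb
            and summary["active_hours"] >= max_active_hours):
        reasons.append("bot-like download volume across the day")

    # Rule 3: access from more regions than one person plausibly uses.
    if len(summary["countries"]) > max_countries:
        reasons.append("access from multiple regions: "
                       + ", ".join(sorted(summary["countries"])))

    return reasons
```

A non-empty result is a prompt for a human to look and then pass the username to the security team, not an automatic verdict, since VPN use alone can trigger the region rule.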

Over the course of this year, we have reduced data-mining from an uncomfortably high level to nil, and we no longer receive IEEE notices. We have found that many compromised accounts belong to students who have been phished by spam or fake emails (and whose other details could be compromised too, given the interrelationship of university systems). In this work, we also discovered that some compromised accounts were being used overseas and then proxied out to other users.

We used to do this process three times a day (morning, midday, evening), then daily, and now about once or twice a week. We expect to ramp up activity again at the start of semester two.

So, it is a matter of working in a reasonable, Library-enforced way to protect users’ privacy, not only within the university but also, importantly, when their login details appear to be compromised (which can affect other online university systems if users authenticate to EzProxy via LDAP, AD or SSO credentials), while ensuring compliance with the library’s contractual obligations to publishers.