Bot Protection During Signup and Signin

Bots, bots bots.  They're everywhere apparently.  They are becoming more complex and cause havoc to customer facing identity management systems, IoT devices and more.  Fake accounts.  Dummy accounts.  Redundant accounts. Orphan accounts.  Fraudulent accounts.  Not to mention DDoS (Distributed Denial of Service) attacks.  Mine field.

Well, there are certainly some basic steps that can be taken to help identify and prevent bot usage of the key identity management services many public facing API's and applications expose.  Firstly, let's describe some of the main functional areas bots are likely to attack.

A Design for Modern Authentication

The password is dead. Long live the password! I have lost count of how many articles and blogs I have seen with regards to the weaknesses, the management, the flexibility, security, insecurity and overall usage of passwords when it comes to user authentication. We all use them and they’re not going anywhere any time soon. OK, so next step. What else can and should we be using for our user and device based authentication and login journeys?


Where We Are Now – The Sticking Plaster of MFA


So we accept that the traditional combo of user name and passwords is bad for our (system) health. Step forward multi-factor authentication. Or 2FA. Take your pick. This generally saw the introduction of something you have in the form of a token, phone-as-a-token or some other out of band mechanism that would create a one-time-password. Traditionally the “out of band” mechanism was either an email or SMS to a preregistered address or phone number, that contained a 6 digit pass code. Internal or employee systems would often leverage a hard token – either a USB dongle or a small tag with a tiny display that would show a rotating pin. These concepts were certainly better from a security perspective, but a) were not unbreakable and b) often created a disjointed user login experience with lots on interruptions and user interaction.

Basic MFA factors


Where We Are Moving To – Nano Authentication!


OK, so user name and passwords are not great. MFA is simple, pretty cheap to implement, but means either the end user needs to carry something around (a bit 2006) or has an interrupted login journey by constantly being asked for a one-time-password. What we need is not two-factor-authentication but more nano-factor-authentication! More factors that are smaller.  At least 10 to be precise. Increase the factors and aim to reduce the material impact of a single factor compromise whilst simultaneously reducing the number of user interrupts. These 10 factors (it could be 8, it could be 15, you get the idea) are all about introducing a broad spectrum analysis for the login journey.

Each factor is must more cohesive and modular, analysing a single piece of the login journey. The login journey could still leverage pretty static profile related data such as a user name, but is augmented with much more context – the location, time, device origin of the request and comparison factors that look at previous login requests to determine patterns or abnormalities.

Breaking authentication down? "Nano-authentication"

Some of these factors could “pass” and some could “fail” during the login journey, but the process is much about about accumulating and analysing risk and therefore being able to respond to high risk more accurately. Applying 2FA to every user login does not reduce risk per-se, it simply applies a blanket risk to every actor.

Wouldn’t it be much better to allow login variation for genuine users who do regularly change machine, location, network and time zone? Wouldn’t it be better to give users more choice over their login journeys and provide numerous options if and when high risks scenarios do occur?

 Another key area I think authentication is moving towards, is that of transparency. “Frictionless”, “effortless” or “zero-effort” logins are all the buzz. If, as an end user I enrol, sacrifice the privacy regarding a device fingerprint, maybe download a OTP or push app, why can’t I just “login” without having my experience interrupted? The classic security/convenience paradox. By introducing more factors and “gluing” those factors together with processing logic, the user authentication system can be much more responsive – perhaps mimicking a state machine, designed in a non-deterministic fashion, where any given factor could have multiple outcomes.

Where We Want To Get To – Transparent Pre-Authentication


So I guess the sci-fi end goal is to just turn up at work/coffee shop/door/car/website/application (delete as applicable) and just present one selves. The service would not only “know” who you were, but also trust that it is you. A bit like the Queen. Every time you presented yourself, transparent background checks would continually evaluate every part of the interaction, looking for changes and identifying risk.

Session + Bind + Usage - increasing transparency?


The closest we are to that today in the web world at least, is the exchange of the authentication process for a cookie/session/tokenId/access_token. Whether that is stateless or stateful, it’s something to represent the user when they attempt to gain access to the service again. Couple that token to some kind of binding (either to a PKI key pair, or TLS session) to reduce the impact of token theft and there is some kind of repeatable access use case. However, change is all around and the token presentation, must therefore be coupled with all usage, context, resource and transaction data that the token is attempting to access, to allow the authentication machine to loop through the necessary nano-factors either individually or collectively to identify risk or change.

Authentication is moving on.  A more modern system must accommodate a broad spectrum  when it comes to analysing who is instigating a transaction which must also be coupled with mechanisms that increase transparency and pre-identification of risk without unnecessary and obtrusive interruptions.

The Role of Identity Management in the GDPR

Unless you have been living in a darkened room for a long time, you will know the countdown for the EU's General Data Protection Regulation is dramatically coming to a head.  May 2018 is when the regulation really takes hold, and organisations are fast in the act on putting plans, processes and personnel in place, in order to comply.

Whilst many organisations are looking at employing a Data Privacy Officer (DPO), reading through all the legalese and developing data analytics and tagging processes, many need to embrace and understand the requirements with how their consumer identity and access management platform can and should be used in this new regulatory setting.

My intention in this blog, isn't to list every single article and what they mean - there are plenty of other sites that can help with that.  I want to really highlight, some of the more identity related components of the GDPR and what needs to be done.

Personal Data

On the the personal data front, more and more organisations are collecting more data, more frequently than ever before.  Some data is explicit, like when you enter your first name, last name and date of birth when you register for a service for example, through to the more subtle - location, history and preference details amongst others. The GDPR focuses on making sure personal data is processed legally and data is only kept for as long as necessary - with a full end user interface that has the ability to make sure their data is up to date and accurate.

It goes with out saying, that this personal data needs to have the necessary security, confidentiality, integrity and availability constraints applied to it.  This will require the necessary least privileged administrative controls and data persistence security, such as the necessary hashing or encryption.

Lawful Processing

Ah the word law! That must be the legal team. Or the newly appointed DPO. That can't be a security, identity or technology issue.  Partially correct. But the lawful processing, also has a significant requirement surrounding the capture and management of consent.  So what is this explicit consent? The data owner - that's Joe Blogs whose data has been snaffled - needs to be fully aware of the data that has been captured, why it is captured and who has access.

The service provider, also needs to explicitly capture consent - not an implicit "the end user needs to opt out", but more the end user needs to "opt-in" for their data to be used and processed.  This will require a transparent user driven consent system, with sharing and more importantly, timely revocation of access.  Protocols such as User Managed Access may come in useful here.

Individuals Right to be Informed

The lawful processing aspect, flows neatly into the entire area of the end user being informed.  The end user needs to be in a position to make informed decisions, around data sharing, service registration, data revocation and more.  The use of 10 page terms and conditions thrust down the end user's screen at service startup, are over.

Non-tech language is now a must, with clear explanations of why data has been captured and which 3rd parties - if any - have access to the data.  This again flows into the consent model - with the data owner being able to make consent decisions, they need simple to understand information.  So registration flows will now need to be much more progressive - only collecting data when it is needed, with a clear explanation of why the data is needed and what processing will be done with it.  20 attribute registration forms are dead.

Individuals Right to Rectification, Export and Erasure

Certainly some new requirements here - if you are a service provider, can you allow your end users to clearly see what data you have captured about them, and also provide that data in a simple to use end user dashboard where they can make changes and keep it up to date?  What about the ability for the data owner to export that data in a machine readable and standard format such as CSV or JSON?

Right to erasure is also interesting - do you know where your end user data resides?  Which systems, what attributes, what correlations or translations have taken place?  Could you issue a de-provisioning request to either delete, clean or anonymize that data? If not you may need to investigate why and what can be done to remediate that.


Conclusion

The GDPR is big.  It contains over 90 articles, containing lots of legalese and fine grained print.  Don't just assume the legal team or the newly appointed DPO will cover your company's ass.  Full platform data analytics tagging will be needed, along with a modern consumer identity and access management design pattern.  End user dashboards, registration journeys and consent frameworks will need updating.

The interesting aspect, is that privacy is now becoming a competitive differentiator.  The GDPR should not just be seen as an internal compliance exercise.  It could actually be a launch pad for building closer more trusted relationships with your end user community.

Why Tim Berners-Lee Is Right About Internet Privacy

Last week, the "father" of the Internet, Tim Berners-Lee, did a series of interviews to mark the 28 year anniversary since he submitted his original proposal for the worldwide web.

The interviews were focused on the phenomenal success of the web, along with a macabre warning describing 3 key areas we need to change in order to "save" the Internet as we know it.

The three points were:


  1. We’ve lost control of our personal data
  2. It’s too easy for misinformation to spread on the web
  3. Political advertising online needs transparency and understanding
I want to primarily discuss the first point - personal data, privacy and our lack of control.

As nearly every private, non-profit and public sector organisation on the planet, either has a digital presence, or is in the process of transforming itself to be a digital force, the transfer of personal data to service provider is growing at an unprecedented rate. 

Every time we register for a service - be it for an insurance quote, to submit a tax return, when we download an app on our smart phones, register at the local leisure centre, join a new dentists or buy a fitness wearable, we are sharing an ever growing list of personal information or providing access to our own personal data.

The terms and conditions often associated with such registration flows, are often so full of "legalese", or the app permissions or "scope" so large and complex, that the end user literally has no control or choice over the type, quality and and duration of the information they share.  It is generally an "all or nothing" type of data exchange.  Provide the details the service provider is asking for, or don't sign up to the service. There are no alternatives.

This throws up several important questions surrounding data privacy, ownership and control.
  1. What is the data being used for?
  2. Who has access to the data, including 3rd parties?
  3. Can I revoke access to the data?
  4. How long with the service provider have access to the data for?
  5. Can the end user amend the data?
  6. Can the end user remove the data from the service provider - aka right to erasure?
Many service providers are likely unable to provide an identity framework that can answer those sorts of questions.

The interesting news, is that there are alternatives and things are likely to change pretty soon.  The EU General Data Protection Regulation (GDPR), provides a regulatory framework around how organisations should collect and manage personal data.  The wide ranging regulation, covers things like how consent from the end user is managed and captured, how breach notifications are handled and how information pertaining to the reasons for data capture are explained to the end user.

The GDPR isn't a choice either - it's mandatory for any organisation (irregardless of their location) that handles data of European Union citizens.

Couple with that, new technology standards such as the User Managed Access working group being run by the Kantara Initiative, that look to empower end users to have more control and consent of data exchanges, will open doors for organisations who want to deliver personalised services, but do so in a more privacy preserving and user friendly way.

So, whilst the Internet certainly has some major flaws, and data protection and user privacy is a big one currently, there are some green shoots of recovery from an end user perspective.  It will be interesting to see what the Internet will look like another 28 years from now.