The problem
Our client wanted to improve the authentication and authorisation of users accessing their spatial data. They had some data that the whole company could use and other data that only specific teams could see. Their vector data was stored in a number of PostGIS databases and, along with their raster data, was served to users via GeoServer.
What is single sign-on (SSO)?
Single sign-on (SSO) is a session and user authentication service that permits a user to use one set of login credentials – for example, a username and password – to access multiple applications. SSO can be used by enterprises, small and midsize organisations, and individuals to ease the management of multiple credentials. This is managed by providing a federated identity system.
We set out to build a system that leveraged the client’s existing user management system, a Microsoft Active Directory (AD) service running on an Azure instance, while we planned to run the rest of their machines on AWS EC2 instances. After much research we came up with an LDAP system that would connect the AD service to GeoServer and PostGIS. This looked quite easy to accomplish, as both GeoServer and PostGIS provide LDAP-based authentication and a way to create access restrictions based on the users who are authenticated via LDAP, and in theory Active Directory is just a service that speaks the LDAP protocol.
GeoServer has an LDAP authentication module that takes a user’s name and password and attempts to connect to the LDAP server to check whether that combination is valid and, if it is, to determine which groups the user is in. PostgreSQL (and therefore PostGIS) allows the details of an LDAP server to be set in the pg_hba.conf file; it then checks whether the username and password combination is valid and allows the user access based on that.
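As a rough illustration of the PostgreSQL side, the sketch below builds a single pg_hba.conf entry that uses PostgreSQL's simple-bind LDAP mode. The option names (ldapserver, ldapprefix, ldapsuffix, ldaptls) are standard PostgreSQL settings, but the database name, network range, host name and directory layout are all hypothetical and are not taken from the client's system.

```python
def ldap_hba_line(database, user, cidr, server, prefix, suffix, use_tls=True):
    """Build one pg_hba.conf entry that authenticates via an LDAP simple bind.

    PostgreSQL binds to the directory as prefix + username + suffix, using the
    password supplied by the connecting client.
    """
    options = [
        f"ldapserver={server}",
        f'ldapprefix="{prefix}"',
        f'ldapsuffix="{suffix}"',
    ]
    if use_tls:
        # Ask PostgreSQL to use StartTLS when talking to the LDAP server.
        options.append("ldaptls=1")
    return " ".join(["host", database, user, cidr, "ldap", *options])


# All values below are illustrative assumptions.
line = ldap_hba_line(
    database="gisdata",
    user="all",
    cidr="10.0.0.0/16",
    server="ldap.example.internal",
    prefix="uid=",
    suffix=",ou=users,dc=example,dc=com",
)
print(line)
```

With an entry like this, a user connecting as jsmith would be checked by binding to the directory as uid=jsmith,ou=users,dc=example,dc=com. The matching database role still has to exist in PostgreSQL before the user can log in.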
The change request
At this point in the project the client discovered that there was no way that their IT department was going to give them administrative access to the corporate AD server, which they needed in order to create groups of users and move users from one group to another. So we went back to the drawing board with the customer and decided that the easiest solution would be to run a simple LDAP server (OpenLDAP) for them in place of the Azure AD server. We could export their users from the Azure AD server, import them to our server and then assign them to the necessary groups there. In an ideal world we would set up a job to import users automatically from the corporate service when they join the company and delete them when they leave, but for the time being we have left those tasks as a manual job for the GIS administrators.
The implementation details
For ease of implementation we decided to run all of the SSO stack in a collection of Docker containers, as pre-built images were available for all the components. That left just two other machines, one running Tomcat and GeoServer and the other running PostGIS. Because of some of the extensions required by the client, we had to manage our own database instance rather than use a pre-built RDS instance.
A System Diagram
Within the Docker cluster, there are three containers. The first runs Keycloak which (in this instance) we use to provide a simple and easy to use interface for the administrators to create and delete users and groups, reset user passwords and assign users to groups. This means we are using Keycloak in user federation mode rather than making use of its identity provider capabilities. Some readers may be thinking that GeoServer supports Keycloak as an authentication method. Unfortunately, PostgreSQL doesn’t (yet?) support it. Also, as we originally planned to use Active Directory rather than our own server, we decided to stick with LDAP. The second container in the cluster is a PostgreSQL server that stores Keycloak’s configuration; we decided to keep this separate from the data storage database for security and ease of use. The final container in the cluster runs OpenLDAP, which stores all the users, groups and group memberships of the users.
Keycloak configuration
Once Keycloak is running it provides a web interface that gives administrators access to the system. Connecting Keycloak to an LDAP server is a simple case of adding the LDAP server as a user federation provider. Once the two services have synchronised, all the users defined in the LDAP server can be seen in the Keycloak interface. A user created in Keycloak is automatically written back to the LDAP server, as are any changes in groups (or roles). Keycloak also allows you to specify mapping rules that convert fields in Keycloak to the ones expected in LDAP, or vice versa. For example, Keycloak stores the user’s first name and surname as separate fields, while LDAP expects the whole name in the cn (common name) attribute of the person record. These rules also allow you to set default values or to add all users to a group (e.g. gis_users) automatically.
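The kind of mapping rule described above can be sketched in a few lines of Python. This is not Keycloak's mapper API, only a minimal illustration of the transformation: the input keys (username, first_name, surname, groups) and the default group are illustrative assumptions, and the "groups" key in the result is a convenience for the example rather than a real LDAP attribute.

```python
def to_ldap_attributes(user, default_groups=("gis_users",)):
    """Map a Keycloak-style user record to LDAP-style person attributes.

    The separate first name and surname are joined to form cn, and every user
    is placed in the default groups in addition to any groups already set.
    """
    first = user["first_name"].strip()
    last = user["surname"].strip()
    return {
        "uid": user["username"],
        "cn": f"{first} {last}",  # whole name, as the person record expects
        "givenName": first,
        "sn": last,
        # Not an LDAP attribute: just the group names this user should join.
        "groups": sorted(set(user.get("groups", [])) | set(default_groups)),
    }


# Hypothetical user record.
record = to_ldap_attributes(
    {"username": "jsmith", "first_name": "Jane", "surname": "Smith",
     "groups": ["survey_team"]}
)
print(record["cn"], record["groups"])
```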
LDAP configuration
The OpenLDAP server was configured with two top-level branches: users and groups. User entries were based on the LDAP object class inetOrgPerson (Internet Organizational Person) and group entries on groupOfUniqueNames. This allows group memberships to be constructed from the uniqueMember list in each group and the matching memberOf attribute on each user, so systems connecting to the LDAP service can work out which groups a user is a member of, or which users are in a group, with a single lookup.
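To make the group structure concrete, here is a minimal sketch that models a groupOfUniqueNames entry as a Python dictionary, derives a user's group list from the uniqueMember values, and prints an entry in LDIF form. The base DN, the ou=users and ou=groups branch names, and the user and group names are hypothetical.

```python
BASE_DN = "dc=example,dc=com"  # hypothetical directory suffix


def user_dn(uid):
    return f"uid={uid},ou=users,{BASE_DN}"


def group_dn(cn):
    return f"cn={cn},ou=groups,{BASE_DN}"


def group_entry(cn, member_uids):
    """Model a groupOfUniqueNames entry; members are stored as full user DNs."""
    return {
        "dn": group_dn(cn),
        "objectClass": ["groupOfUniqueNames"],
        "cn": [cn],
        "uniqueMember": [user_dn(uid) for uid in member_uids],
    }


def member_of(groups, uid):
    """Return the DNs of the groups whose uniqueMember list names this user."""
    dn = user_dn(uid)
    return [g["dn"] for g in groups if dn in g["uniqueMember"]]


def to_ldif(entry):
    """Render an entry as LDIF text: the dn first, then one line per value."""
    lines = [f"dn: {entry['dn']}"]
    for attr, values in entry.items():
        if attr != "dn":
            lines.extend(f"{attr}: {value}" for value in values)
    return "\n".join(lines)


groups = [
    group_entry("gis_users", ["jsmith", "akhan"]),
    group_entry("survey_team", ["akhan"]),
]
print(to_ldif(groups[1]))
print(member_of(groups, "akhan"))
```

In a live directory the memberOf values are maintained by the server rather than computed by the client; the member_of function here only shows how the two views of the same membership relate.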
GeoServer configuration
GeoServer allows administrators to define an LDAP authentication provider, which takes the username and password of the web user (or desktop GIS) and checks them against the LDAP directory. If they match, the user can see any data they are authorised for. The provider also looks up the groups that the user is in, which are mapped to GeoServer groups. The GeoServer administrator then assigns GeoServer roles to users and groups to determine what different users are allowed to see and do with the data on the server.
PostGIS configuration
PostgreSQL and PostGIS make use of a host-based authentication (HBA) system to determine which users are allowed to connect to the database, and then use a system of roles and groups to determine which data the user can see and edit. Each HBA rule combines a username, a database and an originating host with an authentication method. In usual usage this authentication method can vary from a simple trust method to a hashed password. It is also possible to specify an LDAP service that PostgreSQL will contact with the username and password combination to check whether the user can log in. PostgreSQL does not make use of the LDAP group memberships to assign role and group permissions, so it is necessary to make sure that roles and groups on the database server match those set using Keycloak and stored in the LDAP service; another open source program, LDAPSync, is used to achieve this. A Unix cron job runs periodically, checks the LDAP server for a list of users and groups, and then adds any that are new or modified to the PostgreSQL server using SQL commands. While there is a slight chance that a user might try to log in before the database is synchronised, this was not felt to be a great risk: the chances are the user would think they had mistyped their password and would try again.
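The synchronisation step can be pictured as a comparison between the directory and the database that yields a list of SQL statements. The sketch below is not how LDAPSync is implemented; it is a simplified illustration under the assumption that the LDAP groups and the existing PostgreSQL roles have already been read into plain Python sets. All user and group names are hypothetical, and the REVOKE step for removed memberships is an addition for illustration.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def plan_sync(ldap_groups, pg_users, pg_groups, pg_memberships):
    """Return SQL statements that bring PostgreSQL roles in line with LDAP.

    ldap_groups:    dict mapping group name -> set of usernames (from LDAP)
    pg_users:       set of login roles that already exist in PostgreSQL
    pg_groups:      set of group roles that already exist in PostgreSQL
    pg_memberships: set of (group, user) pairs already granted in PostgreSQL
    """
    statements = []
    ldap_users = set().union(*ldap_groups.values())
    # Users and groups that exist in LDAP but not yet in the database.
    for user in sorted(ldap_users - pg_users):
        statements.append(f"CREATE ROLE {quote_ident(user)} LOGIN;")
    for group in sorted(set(ldap_groups) - pg_groups):
        statements.append(f"CREATE ROLE {quote_ident(group)} NOLOGIN;")
    # Memberships to add, then memberships that no longer exist in LDAP.
    wanted = {(g, u) for g, members in ldap_groups.items() for u in members}
    for group, user in sorted(wanted - pg_memberships):
        statements.append(f"GRANT {quote_ident(group)} TO {quote_ident(user)};")
    for group, user in sorted(pg_memberships - wanted):
        statements.append(f"REVOKE {quote_ident(group)} FROM {quote_ident(user)};")
    return statements


# Hypothetical state: akhan and survey_team are new, old_team has been retired.
plan = plan_sync(
    ldap_groups={"gis_users": {"jsmith", "akhan"}, "survey_team": {"akhan"}},
    pg_users={"jsmith"},
    pg_groups={"gis_users", "old_team"},
    pg_memberships={("gis_users", "jsmith"), ("old_team", "jsmith")},
)
print("\n".join(plan))
```

No password is set on the created roles because, with LDAP authentication configured in pg_hba.conf, the password check is delegated to the directory.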
Benefits for the client
Our client has experienced several benefits since having the system in place.
Users enjoy the simplicity of being able to use the same credentials for GeoServer and PostGIS. Should they forget their password, they can reset it using Keycloak.
Life is easier for the GIS administrator, who can use Keycloak to control groups and roles for GeoServer and PostgreSQL together. They can also use Keycloak to issue password reset reminders automatically and to enforce rules about password complexity, which improves system security.
As for the company itself, it benefits from reduced risk without the cost and limitations of a proprietary credential and access control system.