

Will You Trust This TLS Certificate? Perceptions of People Working in IT (extended version) [ACM DTRAP 2020]

   Authors: Martin Ukrop, Lydia Kraus and Vashek Matyas

 Primary contact: Martin Ukrop <mukrop@mail.muni.cz>

 Journal: Digital Threats: Research and Practice

   DOI: 10.1145/3419472

Pre-print PDF   Artifacts   BibTeX

@Article{2020-dtrap-ukrop,
  Title         = {Will You Trust This TLS Certificate? Perceptions of People Working in IT (Extended Version)},
  Author        = {Martin Ukrop and Lydia Kraus and Vashek Matyas},
  Journal       = {Digital Threats: Research and Practice},
  Volume        = {1},
  Number        = {4},
  numpages      = {30},
  Publisher     = {Association for Computing Machinery},
  Year          = {2020},
  ISSN          = {2692-1626},
  DOI           = {10.1145/3419472},
}

Extended version of previously published paper

This is an extended version of the paper previously published at the Annual Computer Security Applications Conference (ACSAC) 2019.

Abstract

Flawed TLS certificates are not uncommon on the Internet. While they signal a potential issue, in most cases they have benign causes (e.g., misconfiguration or even deliberate deployment). This adds fuzziness to the decision on whether to trust a connection or not. Little is known about the perceptions of flawed certificates by IT professionals, even though their decisions impact high numbers of end users. Moreover, it is unclear how much the content of error messages and documentation influences these perceptions.

To shed light on these issues, we observed 75 attendees of an industrial IT conference investigating different certificate validation errors. We also analyzed the influence of re-worded error messages and redesigned documentation. We find that people working in IT have very nuanced opinions with trust decisions being far from binary. The self-signed and the name constrained certificates seem to be over-trusted (the latter also being poorly understood). We show that even small changes in existing error messages can positively influence resource use, comprehension, and trust assessment. At the end of the paper, we summarize lessons learned from conducting usable security studies with IT professionals.

  • We investigated perceived trust in five certificate cases: hostname mismatch, self-signed, expired, name constrained and a flawless certificate (as a control case).
  • When validating certificates, the trust decisions are not binary. Even IT professionals do not completely refuse a certificate just because its validation check fails.
    • In the case of expired certificates, the expiry duration plays an important role: Certificates expired yesterday were mostly considered “looking OK”, while a certificate expired two weeks ago “looks suspicious” and one expired a year ago seems “outright untrustworthy”.
    • The certificate subject plays a role: Flaws were less likely to be tolerated for big, established companies (Microsoft was mentioned as an example).
  • We found some certificate cases to be over-trusted.
    • 21% of the participants considered the self-signed certificate as “looking OK” or better, with a trust mean comparable to that of an expired certificate. We find this concerning as the self-signed certificate never had any identity assurances.
    • Similarly, 20% of the participants considered the name constrained certificate as “looking OK” or better, with a trust mean again comparable to that of an expired certificate. We find this concerning as the name constraints violation hints at misconfiguration or even malicious activity at the sub-authority level.
  • We had half of the participants interact with real OpenSSL error messages and the other half with our re-designed error messages and documentation. Here is the comparison:
    • The self-signed case was considered significantly less trustworthy with our error message (which we consider a success).
    • The name constrained case was also perceived as less trustworthy and required less time and less online browsing to understand.
    • The other measured attributes were comparable – thus, we consider our redesigned documentation in these cases to be better than the existing one.
  • In the redesigned error messages, we included a link to the documentation. To our surprise, 71% of the participants clicked this link. This suggests a promising opportunity to direct developers to a usable resource recommended by the library designers.
  • As follow-up work, we started gathering X.509 certificate validation errors and documentation from multiple libraries to consolidate the documentation in a single place.

Visit x509errors.org
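The over-trust in self-signed certificates reported above has a direct counterpart in everyday code: to make a self-signed endpoint "work", developers often disable verification altogether. A minimal sketch using Python's standard ssl module (chosen here purely for illustration; the study itself worked with OpenSSL error messages) contrasts the secure default client context with that permissive workaround:

```python
import ssl

# Secure default: full chain verification plus hostname checking.
# A self-signed, expired, or name-mismatched certificate is rejected
# during the TLS handshake with an SSLCertVerificationError.
strict = ssl.create_default_context()
assert strict.check_hostname is True
assert strict.verify_mode == ssl.CERT_REQUIRED

# The common workaround for self-signed certificates: switching
# verification off entirely. This accepts ANY certificate (including an
# attacker's), which is exactly the kind of over-trust the study observed.
permissive = ssl.create_default_context()
permissive.check_hostname = False   # must be disabled before verify_mode
permissive.verify_mode = ssl.CERT_NONE
```

A safer alternative for a known self-signed certificate is to pin it explicitly, e.g. `strict.load_verify_locations(cafile="my-self-signed.pem")` (the filename here is a placeholder), so that only that one certificate is trusted while hostname checking stays enabled.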

Methodological implications

Based on our experience with usable security experiments involving IT professionals, we summarize several study design suggestions. Firstly, IT conferences seem to be an excellent sampling opportunity for studies involving IT professionals. As such samples tend to be quite heterogeneous, controlling for previous experience is crucial. Secondly, including an educational debriefing after empirical experiments may be beneficial, as IT professionals appear to be interested in learning. Thirdly, observed behaviors should be preferred to self-reported ones, and as much data as possible should be collected automatically to ease processing.

The content of this research was partially covered in a DevConf 2019 talk that can be seen below. Presentation

The artifacts contain the full experimental setup (as described in Section 2.1 of the paper) and the complete anonymized dataset underlying the evaluation presented in Sections 3 and 4.

The experimental setup contains the documents accompanying the task: the informed consent, pre-task questionnaire, task description, trust scales, and the list of questions posed during the post-task interview (all as PDFs). We further include the custom website with certificate validation documentation for the “redesigned” condition (static HTML). While working on the task, participants in the “redesigned” condition could access this website via a link embedded in the redesigned error messages. Furthermore, we provide the software with which the participants interacted; it contains the displayed error messages and the validated certificates. These materials are available both individually and incorporated in a snapshot of the virtual machine used in the experiment (importable directly into VirtualBox).

The collected data is presented in a single dataset (SPSS format; you can use PSPP as a free alternative). It also includes the analysis syntax files used to obtain the numerical results presented in the paper. For each participant, the dataset contains: 1) pre-task questionnaire answers, 2) reported trust ratings, 3) sub-task timing, 4) information on whether they browsed the Internet, and 5) the assigned interview codes. Note that we do not publish the interview transcripts to preserve participant privacy.

Go to artifacts repository (gDrive)