Clarity

Enhancing communication between deaf and hard-of-hearing (DHH) and hearing individuals - UX/UI Design

My Role

• Team Lead (team of 3)
• UX Designer
• User Test Moderator

Methods

• Interview
• Persona (Figma)
• Ideation (design method cards + sketching)
• Use Cases & Storyboards
• Lo-fi Prototype (paper)
• Hi-fi Prototype (Figma)
• User Tests 

Time

August – December, 2019

INTRODUCTION

Clarity is a speech-to-text application designed to enhance real-time communication between deaf and hard-of-hearing (DHH) and hearing people in professional environments. Although the design prompt required that the form factor of our solution be mobile, our deliverable focused on the desktop version of the design in order to maintain a level of professionalism appropriate to its intended use. The design can readily be extended to mobile platforms such as smartphones and tablets.

CHALLENGE

How can speech-to-text technologies be used to enhance real-time communication between deaf and hard-of-hearing (DHH) and hearing people?

There is a plethora of tools that utilize speech. In particular, speech-to-text technologies have the potential to enhance real-time communication between deaf and hard-of-hearing (DHH) and hearing people. In this project, you will design a solution that harnesses automatic speech-to-text, vibro-tactile, and other technical capabilities to enable fluid, real-time communication between DHH and hearing individuals on one or more mobile devices. Effective solutions will enhance communication in professional environments where DHH individuals work with hearing counterparts.

DESIGN PROCESS

The design of Clarity followed the User-Centered Design (UCD) process, in which users and their needs are the primary focus of every stage of the design process.

NEEDS ASSESSMENT

User Interviews


We interviewed four target users, including one deaf, one hard-of-hearing, and two hearing individuals, in the needs assessment step. The interview responses helped us understand hearing people’s habits and preferences when communicating with DHH people, and vice versa. When adopting a new communication tool, our stakeholders consider the following factors important:

  • Efficiency.
  • Accuracy (including the accuracy of verbal English and professional terms).
  • Background noise minimization.
  • Speaker identification in group settings.
  • Comprehensiveness and clarity.
  • Ease of operation.
  • Ease of error correction.

Personas

The interviews with stakeholders, especially our DHH expert users, gave us great insight into their experiences, frustrations, and wishes when communicating with each other. We created three personas based on an analysis of the information gathered from the interviews with four target users. Behavioral variables were extracted from the interviews and organized to represent three types of target users: deaf, hard-of-hearing, and hearing. Behavioral patterns, frustrations, communication preferences, and goals and needs were delineated in the personas.

IDEATION

Sketching: 30 - 3 - 1

30 Sketches

Each of us ideated broadly and created 30 design ideas with as few constraints as possible. I found it quite difficult to brainstorm without constraints, as the goal of this course is to design solutions centered on users.

In terms of my sketching abilities, I became better at quickly sketching ideas without excessive detail. In order to capture 30 ideas and generate sketches in a short period of time, I had to work at a much faster pace and omit many details from the sketches. Based on the feedback I received in class, I believe my sketches are quite easy to understand.

Top 3 Sketches​

As a team, we organized our 90 ideas and converged on the top 3.

Idea 1: AR + Captioning

Displays live captions on mobile devices using augmented reality (AR) and speech-to-text technology. DHH individuals can read captions during live conversations with hearing individuals without having to look down at the device.

This design was chosen because it allows DHH individuals to catch non-verbal cues such as facial expressions and body language of the speaker, helping to make the communication more personal and effective.

Idea 1. AR + Captioning

Idea 2: Input Acceleration with Word Prediction

Provides sentence templates based on context, plus text prediction, to accelerate text entry and improve communication efficiency. The user can choose the context of the communication and select and edit preset sentences appropriate for the conversation. The design predicts words based on the words the user has already typed, and the quality and accuracy of the prediction improve through machine learning as the user keeps using this feature.

This design was chosen because it addresses the inefficiency of communicating with DHH people through typing.

Idea 2. Input Acceleration with Word Prediction
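
To make the mechanism concrete, here is a minimal, illustrative sketch in TypeScript of "predict the next word from previously typed words." It uses simple bigram counts as a stand-in for the machine-learned prediction our design proposes; all names here are hypothetical, not part of the actual design.

```typescript
// Toy next-word predictor based on bigram counts. A production version
// would use a trained language model; this only illustrates the
// "predict from previously typed words" behavior described above.
class BigramPredictor {
  private counts = new Map<string, Map<string, number>>();

  // Learn from a sentence the user has finished typing.
  learn(sentence: string): void {
    const words = sentence.toLowerCase().split(/\s+/).filter(Boolean);
    for (let i = 0; i < words.length - 1; i++) {
      const followers = this.counts.get(words[i]) ?? new Map<string, number>();
      followers.set(words[i + 1], (followers.get(words[i + 1]) ?? 0) + 1);
      this.counts.set(words[i], followers);
    }
  }

  // Suggest the most frequent follower of `word`, if one has been seen.
  suggest(word: string): string | undefined {
    const followers = this.counts.get(word.toLowerCase());
    if (!followers) return undefined;
    return [...followers.entries()].sort((a, b) => b[1] - a[1])[0][0];
  }
}

// The predictor improves as the user keeps typing.
const predictor = new BigramPredictor();
predictor.learn("I feel a sharp pain in my chest");
console.log(predictor.suggest("my")); // -> "chest"
```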

Idea 3: Dynamic Presentation

Looking down at a device to type makes the conversation impersonal because non-verbal cues are not always visible. This design highlights certain information by displaying caption words in various font sizes, colors, and animations. Bigger fonts are easier to read and emphasize important information, while text colors and animation help convey the emotions of the speaker.

Idea 3. Dynamic Presentation

1 Final Sketch

We further converged our design ideas into a final one and sketched it out. Our final sketch consists of two parts.

Part I: Output

Part one focused on the output of the system. In group meetings, the microphone in front of each hearing individual captures and transcribes that person’s speech. Captions with speaker identification are displayed under the presentation slides on the screen. The same screen is also available on the device (laptop / tablet / smartphone) in front of each attendee, who can also remove the slide portion to read only the captions.

Part I. System output

Part II: Input

Part two of our final sketch concerns input into the system. While DHH individuals can use a laptop keyboard to type in meetings, it is more convenient to type on mobile devices (tablet / smartphone) in one-on-one conversations. Whatever input device is used, our design predicts words based on previous sentences as well as the context of communication selected by the user. In addition, text typed by DHH individuals is displayed in different font sizes and colors depending on the importance of the word, such as names, locations, and dates. Words are sometimes animated to highlight certain information and to give the impression of someone speaking, conveying the emotions of the “speaker”.

Part II. System input

User Feedback

We presented the final sketch to an expert user to solicit feedback on our design idea. Our expert user, who self-identifies as deaf, would like to be able to customize the display of presentation materials on her device. She would also like the option to scroll back and review the transcript during the meeting and discussion. Instead of using color and animation for the dynamic presentation, she suggested alternative means to avoid accessibility issues.

I led the feedback session, communicating with our expert user through a Google Doc, while my teammates reminded me of information I needed to add and questions I should ask. Instead of jumping right into describing our design, I started by asking our expert user to describe what she saw in the design sketch. Based on her description, I hoped to gauge the difference between the conceptual model of our design and her mental model. This method also helped us evaluate how well we presented our ideas through sketches.

We solicited feedback on our top design idea from an expert user.

USE CASES & STORYBOARDS

Case 1. Real-Time Transcription in a Meeting

Actors
Ryan, a hard-of-hearing UX researcher, and his hearing colleagues, including a presenter and other attendees.

Purpose
Display a transcript of speech in real time.

Storyboard

  1. The presenter speaks into a microphone connected to his/her laptop.
  2. Real-time transcript displays on every attendee’s screen.
  3. Ryan, a hard-of-hearing user, minimizes the area containing the shared slides on his screen.
  4. The area containing the shared slides is minimized, and the transcript area expands.

Case 2. Input Acceleration and Dynamic Presentation of Text

Actors
Matthew, a hearing cardiologist, and a deaf patient.

Purpose
Communicate with DHH patients efficiently.

Storyboard

  1. The doctor, Matthew, asks the patient where he feels pain.
  2. Real-time transcript displays on both the doctor’s and the patient’s screens.
  3. After the patient responds, Matthew asks a follow-up question.
  4. Matthew edits the transcript to fix transcription errors.
  5. The patient replies.

Case 3. Transcription Review during Discussion

Actors
A team of students, including a deaf student, Emily, and her hearing teammates.

Purpose
Review earlier parts of the transcript.

Storyboard

  1. A team of students are having a group discussion. A hearing student speaks into the microphone connected to his/her laptop.
  2. Real-time transcript and name of the speaker display on every member’s screen.
  3. Emily forgets some earlier information, so she scrolls back and reviews the transcript.

LO-FI PROTOTYPING

Process

Before diving into prototyping with digital tools, we first created a low-fidelity prototype of our design with pen and paper.

Although a paper prototype is supposed to be fast and inexpensive, ours proved more difficult and time-consuming than we expected. Before we cut any paper or drew any user interface elements, we reread our use cases together and sketched individual screens for each use case. As we sketched, we found that our ideas of how to represent the same on-screen element were sometimes quite different, so we discussed all the discrepancies in order to finalize our sketches. Sometimes the discussion took a while, as we could not immediately reach an agreement.

User Feedback

We tested our paper prototype with two deaf participants and one hearing participant. We received useful feedback as well as suggestions for design modifications from our expert users.

Overall, the conceptual model of our design matched the mental models our expert users formed by interacting with the paper prototype. As our participants interacted with the prototype, they indicated that they noticed the key features, for example, minimizing the presentation window and the highlighting of important information.

The paper prototype we used in user feedback sessions.

Modifications

Based on feedback and suggestions from our participants, we changed our design in the following aspects:

1. Position of minimized window
The minimized presentation window will be centered; users should also be able to remove the window from the screen to focus on the main presentation screen in the classroom.

2. Microphone default setting
The microphone is disabled by default, but once the user makes changes to this setting, the system will remember the user preference so that the user does not need to set the microphone every time in the future.

3. Editing icon/interaction
Our participants preferred a different way of displaying the transcript/text editing option, which we planned to address in the next iteration.

Based on user feedback on the paper prototype, the minimized shared screen was moved from the top-right corner to the center, as one of our expert users suggested this would make the shared screen easier to identify.

HI-FI PROTOTYPING

We created a high-fidelity prototype of our design using Figma. It reflected the latest version of our design, incorporating the changes made in response to user feedback.

Features

Primary Feature 1. Virtual Meeting Space

When an in-person meeting or discussion is about to start, a member (a user with an account in the system) can create a virtual meeting space and become its host. The default mode of a meeting space is the discussion mode.

A virtual meeting space in the default discussion mode.
The presentation mode of the system.

Primary Feature 2.1 Real-Time Transcription with Speaker Identification

When a user speaks into the microphone of his/her device, the system transcribes the speech instantly and displays the transcript along with his/her name and/or photo on each attendee’s screen. The transcript appears automatically, without needing confirmation from the speaker. In presentation mode, when the presenter speaks into the microphone of his/her device, the system likewise transcribes the speech instantly and displays the transcript on each attendee’s screen.

Live transcription displayed with the speaker’s name and/or photo on each attendee’s screen.
The presentation mode of the system.
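
As a sketch of how such live captioning could be wired up in a browser, the snippet below uses the Web Speech API to stream interim results and tags each caption with the speaker’s name. displayCaption and currentUser are hypothetical application hooks; this is a sketch assuming a browser that exposes SpeechRecognition, not our actual implementation (the prototype was built in Figma).

```typescript
// Hypothetical application hooks: a caption renderer and the current user.
declare function displayCaption(speaker: string, text: string, isFinal: boolean): void;
declare const currentUser: { name: string };

// The Web Speech API is vendor-prefixed in some browsers.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;
const recognition = new SpeechRecognitionImpl();
recognition.continuous = true;     // keep listening for the whole meeting
recognition.interimResults = true; // stream partial transcripts in real time

recognition.onresult = (event: any) => {
  // Walk the results added since the last event and render each caption
  // tagged with the speaker, so attendees can identify who is talking.
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    displayCaption(currentUser.name, result[0].transcript, result.isFinal);
  }
};

recognition.start();
```
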
Primary Feature 2.2 Editable Transcript

A user can manually edit his/her own transcript or text input during the meeting or discussion by clicking the three dots next to the input and selecting the edit option on the side menu. The user can also copy, quote, and delete his/her own input, but cannot edit, copy, quote, or delete the input of other attendees. When the user finishes editing, the system notifies all other attendees that the transcript or text input has been edited and allows them to jump to the edited text. Edited text is marked as edited.

Editing one’s own transcript via the three-dot side menu; edited text is marked as edited.
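
A minimal sketch of the data model this feature implies, with hypothetical names: each entry carries its author and an edited flag, and every action is restricted to the user’s own input.

```typescript
// Hypothetical shape of a transcript or text-input entry.
interface TranscriptEntry {
  id: string;
  authorId: string;
  text: string;
  edited: boolean; // rendered as an "edited" mark in the UI
}

// Every action (edit, copy, quote, delete) is allowed only on one's own input.
function canModify(entry: TranscriptEntry, userId: string): boolean {
  return entry.authorId === userId;
}

function editEntry(
  entry: TranscriptEntry,
  userId: string,
  newText: string,
  notifyAttendees: (entryId: string) => void,
): void {
  if (!canModify(entry, userId)) throw new Error("Cannot edit another attendee's input");
  entry.text = newText;
  entry.edited = true;        // mark the text as edited
  notifyAttendees(entry.id);  // let other attendees jump to the edited text
}
```
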
Primary Feature 2.3 Meeting Records

The transcript and text input stay in the meeting space during the meeting or discussion, and attendees can scroll up to review them. When the meeting or discussion ends, attendees can review the transcript and text record in the system. The record has two columns: one for the presenter’s transcript and one for the discussion during the meeting, with discussion entries placed to approximately match the timing of the presenter’s transcript.

When the meeting or discussion ends, attendees can review the transcript and text record in the system.
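
The two-column record can be read as a time-alignment problem. Below is a small sketch, with hypothetical types, that groups discussion entries under the presenter entry that was current at approximately the same time.

```typescript
// Hypothetical shape of a timestamped transcript entry.
interface TimedEntry {
  timeMs: number; // milliseconds from the start of the meeting
  text: string;
}

// Group discussion entries under the presenter entry that was current
// at (approximately) the same time. Both arrays are assumed sorted.
function alignRecord(
  presenter: TimedEntry[],
  discussion: TimedEntry[],
): { presenter: TimedEntry; discussion: TimedEntry[] }[] {
  return presenter.map((entry, i) => {
    const nextStart = presenter[i + 1]?.timeMs ?? Infinity;
    return {
      presenter: entry,
      discussion: discussion.filter(
        (d) => d.timeMs >= entry.timeMs && d.timeMs < nextStart,
      ),
    };
  });
}
```
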
Primary Feature 3.1 Text Input

All users, including hearing ones, can participate in the meeting or discussion by typing on their devices instead of speaking. Users can disable speech capture by turning off the microphone next to the input field, or permanently disable it in settings. The system remembers this setting for members, while non-members need to set it every time they interact with the system.

Users can participate by typing instead of speaking, turning off the microphone to disable speech capture.
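
A sketch of the member/non-member preference rule, using localStorage as a stand-in for an account-backed settings store; the key name and default value are assumptions (the default follows our lo-fi modification, where the microphone starts disabled).

```typescript
// Hypothetical storage key; localStorage stands in for a server-side
// per-account settings store.
const MIC_PREF_KEY = "clarity.micEnabled";
const MIC_DEFAULT = false; // microphone disabled by default, per lo-fi feedback

function setMicPreference(isMember: boolean, enabled: boolean): void {
  // Only members' choices are persisted; a non-member's choice lasts
  // for the current session only (held in UI state, not shown here).
  if (isMember) localStorage.setItem(MIC_PREF_KEY, String(enabled));
}

function getMicPreference(isMember: boolean): boolean {
  if (!isMember) return MIC_DEFAULT;
  const stored = localStorage.getItem(MIC_PREF_KEY);
  return stored === null ? MIC_DEFAULT : stored === "true";
}
```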

Primary Feature 3.2 Text Input Acceleration
The system predicts and suggests words based on what the user has already typed. The user can confirm a suggested word by pressing the Tab key, speeding up text entry.

The user can confirm a suggested word by pressing the Tab key.
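
A sketch of the Tab-to-confirm interaction, with a hypothetical element id and predictor service: Tab inserts the suggested word instead of moving focus out of the field.

```typescript
// The predictor service is assumed (e.g., the bigram sketch shown earlier).
declare const predictor: { suggest(word: string): string | undefined };

const input = document.querySelector<HTMLInputElement>("#message-input")!;

input.addEventListener("keydown", (event: KeyboardEvent) => {
  if (event.key !== "Tab") return;
  const words = input.value.trimEnd().split(/\s+/);
  const suggestion = predictor.suggest(words[words.length - 1] ?? "");
  if (suggestion) {
    event.preventDefault(); // keep focus in the field instead of tabbing away
    input.value = `${input.value.trimEnd()} ${suggestion} `;
  }
});
```
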
Primary Feature 4. Dynamic Presentation of Text

Dynamic Presentation emphasizes important information by displaying it in distinct styles. The system detects and highlights the following kinds of information in the transcript and text input.
Content and highlight styles of information detected by the system.
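
For pattern-friendly categories such as dates and times, detection can be rule-based; names and locations would need named-entity recognition in practice. Below is a minimal sketch with hypothetical CSS class names standing in for the highlight styles.

```typescript
// Each rule maps a category of information to a detection pattern and a
// (hypothetical) CSS class controlling its font size, color, or animation.
interface HighlightRule {
  category: string;
  pattern: RegExp;
  cssClass: string;
}

const RULES: HighlightRule[] = [
  {
    category: "date",
    pattern:
      /\b(?:January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{1,2}\b/g,
    cssClass: "hl-date",
  },
  { category: "time", pattern: /\b\d{1,2}:\d{2}(?:\s?[ap]m)?\b/gi, cssClass: "hl-time" },
];

// Wrap every detected span in a styled <span> for emphasis.
function highlight(text: string): string {
  return RULES.reduce(
    (html, rule) =>
      html.replace(rule.pattern, (match) => `<span class="${rule.cssClass}">${match}</span>`),
    text,
  );
}

console.log(highlight("The design review is on December 9 at 2:30 pm."));
```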

User Testing

We conducted user testing of the high-fidelity prototype with two expert users. Considering the time limit of each session, we had our expert users complete three primary tasks, along with a few smaller tasks focusing on the main features of our design. The three primary tasks were joining an online meeting space in our system to view the real-time transcript of a lecture, participating in a small group discussion by typing or speaking and editing text or speech input, and reviewing the earlier transcript in the system. Both experts were able to finish all three tasks despite some difficulty understanding parts of the user interface.

We received positive feedback on the prototype, especially on the dynamic presentation and speaker identification features. Our expert users also provided feedback on user interface components that would make information more readable, and they mentioned that the system could also be a good tool for online meetings and discussions.

Due to the functional limits of the prototyping tool we used, we were not able to reproduce the effect of real-time transcription in the prototype. Instead, we could only display the whole sentence on a separate screen after giving an instruction to our expert user.

We conducted user testing sessions with two expert users.

Deliverable

A design specification for communicating our design to developers was created at the end of the project. The specification provides detailed information about our design idea, decisions, and process, along with key design details and their rationale.

REFLECTION

Design of Clarity

Reviewing the design process of Clarity this semester, I believe our design has come a long way. Our design idea builds on state-of-the-art technology, making it realistic and practical. We also paid attention to the social accessibility of our design by drawing as little attention as possible from people around our users. While the primary features of our design remained largely the same throughout the semester, we kept refining the design based on user testing, whose results showed that the design addresses most of the user needs we identified in the needs assessment.

Accessibility

Regarding my perspective on accessibility, I have felt more comfortable and confident working with people of diverse abilities since our first meeting with our expert user. As in any relationship, it is important to be patient and considerate when interacting with people of diverse abilities.